Bugzilla Version 2.20.1
View Bug Activity | Format For Printing | Clone This Bug
Happens fairly regularly when I have a domU crash - running on a v40z. Jul 3 05:34:04 ubuntu kernel: device vif3.0 left promiscuous mode Jul 3 05:34:04 ubuntu kernel: xen-br0: port 3(vif3.0) entering disabled state Jul 3 05:34:04 ubuntu kernel: xen-br0: port 3(vif3.0) entering disabled state Jul 3 05:34:04 ubuntu kernel: ------------[ cut here ]------------ Jul 3 05:34:04 ubuntu kernel: kernel BUG at include/linux/dcache.h:293! Jul 3 05:34:04 ubuntu kernel: invalid operand: 0000 [#1] Jul 3 05:34:04 ubuntu kernel: SMP Jul 3 05:34:04 ubuntu kernel: Modules linked in: video thermal processor fan button battery ac mptscsih mptbase Jul 3 05:34:04 ubuntu kernel: CPU: 0 Jul 3 05:34:04 ubuntu kernel: EIP: 0061:[<c0193100>] Not tainted VLI Jul 3 05:34:04 ubuntu kernel: EFLAGS: 00010246 (2.6.11.12-xen0) Jul 3 05:34:04 ubuntu kernel: EIP is at sysfs_remove_dir+0x100/0x110 Jul 3 05:34:04 ubuntu kernel: eax: 00000000 ebx: cf4630d4 ecx: dc0b6d50 edx: cf4630d4 Jul 3 05:34:04 ubuntu kernel: esi: cbd1ea20 edi: d24d853c ebp: 00000006 esp: c089de64 Jul 3 05:34:04 ubuntu kernel: ds: 007b es: 007b ss: 0069 Jul 3 05:34:04 ubuntu kernel: Process events/0 (pid: 10, threadinfo=c089c000 task=c075ca40) Jul 3 05:34:04 ubuntu kernel: Stack: c0191f02 c9ef0918 ddd7f800 cf4630d4 cbd1ea20 ddd7f800 00000006 c0211070 Jul 3 05:34:04 ubuntu kernel: cf4630d4 00000002 cf463040 c03f087a cf4630d4 cf463040 cbd1ea20 c03f1948 Jul 3 05:34:04 ubuntu kernel: cbd1ea20 ddd7f800 c04c84a0 ddd7f800 00000006 ddd7f944 c012cd55 c04c84a0 Jul 3 05:34:04 ubuntu kernel: Call Trace: Jul 3 05:34:04 ubuntu kernel: [<c0191f02>] sysfs_hash_and_remove+0x52/0xe9 Jul 3 05:34:04 ubuntu kernel: [<c0211070>] kobject_del+0x20/0x30 Jul 3 05:34:04 ubuntu kernel: [<c03f087a>] br_del_if+0x3a/0x5c Jul 3 05:34:04 ubuntu kernel: [<c03f1948>] br_device_event+0xb8/0x100 Jul 3 05:34:04 ubuntu kernel: [<c012cd55>] notifier_call_chain+0x25/0x40 Jul 3 05:34:04 ubuntu kernel: [<c03a4a2f>] unregister_netdevice+0x14f/0x270 Jul 3 05:34:04 ubuntu kernel: [<c03a4b65>] unregister_netdev+0x15/0x1e Jul 3 05:34:04 ubuntu kernel: [<c02be4f5>] netif_destroy+0x75/0x90 Jul 3 05:34:04 ubuntu kernel: [<c02bdeb4>] netif_ctrlif_rx+0x64/0xb0 Jul 3 05:34:04 ubuntu kernel: [<c0105550>] __ctrl_if_rxmsg_deferred+0x40/0x50 Jul 3 05:34:04 ubuntu kernel: [<c012fbc8>] worker_thread+0x1d8/0x260 Jul 3 05:34:04 ubuntu kernel: [<c0105510>] __ctrl_if_rxmsg_deferred+0x0/0x50 Jul 3 05:34:04 ubuntu kernel: [<c011a930>] default_wake_function+0x0/0x20 Jul 3 05:34:04 ubuntu kernel: [<c011a930>] default_wake_function+0x0/0x20 Jul 3 05:34:04 ubuntu kernel: [<c012f9f0>] worker_thread+0x0/0x260 Jul 3 05:34:04 ubuntu kernel: [<c01341ad>] kthread+0xbd/0x100 Jul 3 05:34:04 ubuntu kernel: [<c01340f0>] kthread+0x0/0x100 Jul 3 05:34:04 ubuntu kernel: [<c0107b15>] kernel_thread_helper+0x5/0x10 Jul 3 05:34:04 ubuntu kernel: Code: 89 44 24 08 8b 00 89 04 24 e8 0d 25 fb ff 8b 54 24 08 8b 42 04 89 04 24 e8 7e e0 07 00 8b 44 24 08 89 04 24 e8 f2 24 fb ff eb 92 <0f> 0b 25 01 53 65 42 c0 e9 13 ff ff ff 8d 76 00 83 ec 20 89 5c
reproducable on a poweredge 1425, the key to triggering it seems to be to running a guest that crashes on an SMP DOM0
*** Bug 120 has been marked as a duplicate of this bug. ***
Debugging this some more, and there are two ways to get into br_del_if() function which calls the sysfs call which ultimately oops when the dentry->d_count is 0. One way is from the bridge IOCTL, the function add_del_if() which presumably is called from brctl in some manner (I've not looked at the code). The other path is from the bridges notifier block which registers for net_dev events, the handler is br_device_event, on NETDEV_UNREGISTER event, it calls br_del_if(). In this bug's case, the IOCTL is triggered from brctl to take out the vif, and br_del_if() is called once, then for some reason, the notifier chain triggers and calls br_del_if() a second time. I've not figured out WHY the chain isnt called every time the vif is removed, but whenever the notify chain is called, it is called on a bridge port that has already been removed. Here is some trace info: *DOMU Created * xen-br0: port 3(vif20.0) entering learning state xen-br0: topology change detected, propagating xen-br0: port 3(vif20.0) entering forwarding state *DOMU Destroyed, brctl down called sending IOCTL * WARK: add_del_if() calling br_del_if(db8e0240,d9b82c00) WARK: calling br_sysfs_removeif(p=3) WARK: calling delete on &p->kobj WARK: del_nbp running on da77adc0 xen-br0: port 3(vif20.0) entering disabled state xen-br0: port 3(vif20.0) entering disabled state WARK: calling unregister_netdev(-642241536) * Everything OK, notifier chain not called * * DOMU Created * xen-br0: port 3(vif21.0) entering learning state xen-br0: topology change detected, propagating xen-br0: port 3(vif21.0) entering forwarding state * DOMU Destroyed, brctl down called sending IOCTL * WARK: add_del_if() calling br_del_if(db8e0240,db105400) WARK: calling br_sysfs_removeif(p=3) WARK: calling delete on &p->kobj WARK: del_nbp running on daa87ec0 xen-br0: port 3(vif21.0) entering disabled state xen-br0: port 3(vif21.0) entering disabled state WARK: calling unregister_netdev(-619686912) * For some reason the notifier is triggered *THIS TIME* * and we call br_del_if() a second time on same br and port WARK: notify chain calling br_del_if(db8e0240,db105400) WARK: calling br_sysfs_removeif(p=3) * ooops, not suppose to call dget on d_count=0 * * DOM0/Xen go BOOM * Next thing is to hack up the br_device_event() to see what events are coming through.
Created an attachment (id=43) [edit] Workaround for double call to br_del_if() There appears to be a race between when the first br_del_if() call sets port state to disabled, and then when the dev->br_port pointer is set to NULL. Every so often when unregister_netdevice() triggers the notify call change, br_device_event() routine gets NETDEV_UNREGISTER event, but device->br_port isn't NULL yet. br_port is set to NULL after br_del_if() call del_npb() which calls destroy_npb(). So, good sequence: br_del_if() del_nbp() // port state set to disabled, but dev->br_port is not NULL br_stp_disable_port() destroy_nbp_rcu() // set dev->br_port = NULL, among other things destroy_nbp() notify_call_chain() br_device_event() dev->br_port == NULL, bail. ----------------------------------------- bad sequence: br_del_if() del_nbp() // port state set to disabled, but dev->br_port is not NULL br_stp_disable_port() // notify call chain gets scheduled notify_call_chain() br_device_event() dev->br_port != NULL br_del_if() BOOM!
The real race is between when a call_rcu() callback runs and when the netif_destroy() calls unregister_netdev(). When we get an oops, the brctl delif IOCTL has run, and br_del_if() has been called setting the port state to DISABLED, and then queues up an rcu callback, destroy_npb(), which will set dev->br_port = NULL. add_del_if() // IOCTL handler from brctl delif // xen-br0 $VIF br_del_if() del_nbp() br_stp_disable_port() // Set port->state = BR_STATE_DISABLED call_rcu(destroy_nbp_rcu) // setup rcu callback to run // destory_nbp() which will set // dev->br_port = NULL After the xen scripts have called the brctl command, xend sends the disconnect and destroy control messages, which trigger: netif_destroy() unregister_netdev() unregister_netdevice() notify_call_chain(NETDEV_UNREGISTER) br_device_event() // The first thing done here is to check // the device's br_port to see if it is // NULL If the destory_nbp_rcu() callback isn't fired before br_device_event() checks dev->br_port, then the NULL check fails and a second call to br_del_if() is invoked and we blow up in sysfs/kobject BUG_ON() for ref counts of dentrys. [2]
This changeset includes the best fix until Linux Bridge code is updated to deal with the race. # HG changeset patch # User kaf24@firebug.cl.cam.ac.uk # Node ID d61f8afd292c146a08b8047f95941bcd9477d504 # Parent 37030498b9782169fccc14b064daf83714557c8d 1. Remove workaround patch 2. Update scripts/vif-bridge to not call 'brctl delif'