Xen 
 
Home Products Support Community News
 
   
First Last Prev Next    No search results available      Search page      Enter new bug
Bug#: 90
Product:  
Component:  
Status: RESOLVED
Resolution: FIXED
Assigned To: Ryan Harper <ryanh@us.ibm.com>
Hardware:  
OS:  
Version:  
Priority:  
Severity:  
Reporter: Kip Macy <kip.macy@gmail.com>
Add CC:
CC:
Remove selected CCs
URL:
Summary:

Attachment Type Created Size Actions
Workaround for double call to br_del_if() patch 2005-08-18 20:44 450 bytes Edit
Create a New Attachment (proposed patch, testcase, etc.) View All

Bug 90 depends on: Show dependency tree
Show dependency graph
Bug 90 blocks:

Additional Comments:







View Bug Activity   |   Format For Printing   |   Clone This Bug


Description:   Opened: 2005-07-03 05:39
Happens fairly regularly when I have a domU crash - running on a v40z. 

Jul  3 05:34:04 ubuntu kernel: device vif3.0 left promiscuous mode
Jul  3 05:34:04 ubuntu kernel: xen-br0: port 3(vif3.0) entering disabled state
Jul  3 05:34:04 ubuntu kernel: xen-br0: port 3(vif3.0) entering disabled state
Jul  3 05:34:04 ubuntu kernel: ------------[ cut here ]------------
Jul  3 05:34:04 ubuntu kernel: kernel BUG at include/linux/dcache.h:293!
Jul  3 05:34:04 ubuntu kernel: invalid operand: 0000 [#1]
Jul  3 05:34:04 ubuntu kernel: SMP 
Jul  3 05:34:04 ubuntu kernel: Modules linked in: video thermal processor fan
button battery ac mptscsih mptbase
Jul  3 05:34:04 ubuntu kernel: CPU:    0
Jul  3 05:34:04 ubuntu kernel: EIP:    0061:[<c0193100>]    Not tainted VLI
Jul  3 05:34:04 ubuntu kernel: EFLAGS: 00010246   (2.6.11.12-xen0) 
Jul  3 05:34:04 ubuntu kernel: EIP is at sysfs_remove_dir+0x100/0x110
Jul  3 05:34:04 ubuntu kernel: eax: 00000000   ebx: cf4630d4   ecx: dc0b6d50  
edx: cf4630d4
Jul  3 05:34:04 ubuntu kernel: esi: cbd1ea20   edi: d24d853c   ebp: 00000006  
esp: c089de64
Jul  3 05:34:04 ubuntu kernel: ds: 007b   es: 007b   ss: 0069
Jul  3 05:34:04 ubuntu kernel: Process events/0 (pid: 10, threadinfo=c089c000
task=c075ca40)
Jul  3 05:34:04 ubuntu kernel: Stack: c0191f02 c9ef0918 ddd7f800 cf4630d4
cbd1ea20 ddd7f800 00000006 c0211070 
Jul  3 05:34:04 ubuntu kernel:        cf4630d4 00000002 cf463040 c03f087a
cf4630d4 cf463040 cbd1ea20 c03f1948 
Jul  3 05:34:04 ubuntu kernel:        cbd1ea20 ddd7f800 c04c84a0 ddd7f800
00000006 ddd7f944 c012cd55 c04c84a0 
Jul  3 05:34:04 ubuntu kernel: Call Trace:
Jul  3 05:34:04 ubuntu kernel:  [<c0191f02>] sysfs_hash_and_remove+0x52/0xe9
Jul  3 05:34:04 ubuntu kernel:  [<c0211070>] kobject_del+0x20/0x30
Jul  3 05:34:04 ubuntu kernel:  [<c03f087a>] br_del_if+0x3a/0x5c
Jul  3 05:34:04 ubuntu kernel:  [<c03f1948>] br_device_event+0xb8/0x100
Jul  3 05:34:04 ubuntu kernel:  [<c012cd55>] notifier_call_chain+0x25/0x40
Jul  3 05:34:04 ubuntu kernel:  [<c03a4a2f>] unregister_netdevice+0x14f/0x270
Jul  3 05:34:04 ubuntu kernel:  [<c03a4b65>] unregister_netdev+0x15/0x1e
Jul  3 05:34:04 ubuntu kernel:  [<c02be4f5>] netif_destroy+0x75/0x90
Jul  3 05:34:04 ubuntu kernel:  [<c02bdeb4>] netif_ctrlif_rx+0x64/0xb0
Jul  3 05:34:04 ubuntu kernel:  [<c0105550>] __ctrl_if_rxmsg_deferred+0x40/0x50
Jul  3 05:34:04 ubuntu kernel:  [<c012fbc8>] worker_thread+0x1d8/0x260
Jul  3 05:34:04 ubuntu kernel:  [<c0105510>] __ctrl_if_rxmsg_deferred+0x0/0x50
Jul  3 05:34:04 ubuntu kernel:  [<c011a930>] default_wake_function+0x0/0x20
Jul  3 05:34:04 ubuntu kernel:  [<c011a930>] default_wake_function+0x0/0x20
Jul  3 05:34:04 ubuntu kernel:  [<c012f9f0>] worker_thread+0x0/0x260
Jul  3 05:34:04 ubuntu kernel:  [<c01341ad>] kthread+0xbd/0x100
Jul  3 05:34:04 ubuntu kernel:  [<c01340f0>] kthread+0x0/0x100
Jul  3 05:34:04 ubuntu kernel:  [<c0107b15>] kernel_thread_helper+0x5/0x10
Jul  3 05:34:04 ubuntu kernel: Code: 89 44 24 08 8b 00 89 04 24 e8 0d 25 fb ff
8b 54 24 08 8b 42 04 89 04 24 e8 7e e0 07 00 8b 44 24 
08 89 04 24 e8 f2 24 fb ff eb 92 <0f> 0b 25 01 53 65 42 c0 e9 13 ff ff ff 8d 76
00 83 ec 20 89 5c

------- Comment #1 From Kip Macy 2005-07-04 22:27 -------
reproducable on a poweredge 1425, the key to triggering it seems to be to
running a guest that crashes on an SMP DOM0

------- Comment #2 From Ryan Harper 2005-08-17 18:34 -------
*** Bug 120 has been marked as a duplicate of this bug. ***

------- Comment #3 From Ryan Harper 2005-08-18 17:36 -------
Debugging this some more, and there are two ways to get into br_del_if()
function which calls the sysfs call which ultimately oops when the
dentry->d_count is 0.  One way is from the bridge IOCTL, the function
add_del_if() which presumably is called from brctl in some manner (I've not
looked at the code).  The other path is from the bridges notifier block which
registers for net_dev events, the handler is br_device_event, on
NETDEV_UNREGISTER event, it calls br_del_if().

In this bug's case, the IOCTL is triggered from brctl to take out the vif, and
br_del_if() is called once, then for some reason, the notifier chain triggers
and calls br_del_if() a second time.  I've not figured out WHY the chain isnt
called every time the vif is removed, but whenever the notify chain is called,
it is called on a bridge port that has already been removed.  

Here is some trace info:

*DOMU Created *
xen-br0: port 3(vif20.0) entering learning state
xen-br0: topology change detected, propagating
xen-br0: port 3(vif20.0) entering forwarding state
*DOMU Destroyed, brctl down called sending IOCTL *
WARK: add_del_if() calling br_del_if(db8e0240,d9b82c00)
WARK: calling br_sysfs_removeif(p=3)
WARK: calling delete on &p->kobj
WARK: del_nbp running on da77adc0
xen-br0: port 3(vif20.0) entering disabled state
xen-br0: port 3(vif20.0) entering disabled state
WARK: calling unregister_netdev(-642241536)
* Everything OK, notifier chain not called *

* DOMU Created *
xen-br0: port 3(vif21.0) entering learning state
xen-br0: topology change detected, propagating
xen-br0: port 3(vif21.0) entering forwarding state
* DOMU Destroyed, brctl down called sending IOCTL *
WARK: add_del_if() calling br_del_if(db8e0240,db105400)
WARK: calling br_sysfs_removeif(p=3)
WARK: calling delete on &p->kobj
WARK: del_nbp running on daa87ec0
xen-br0: port 3(vif21.0) entering disabled state
xen-br0: port 3(vif21.0) entering disabled state
WARK: calling unregister_netdev(-619686912)
* For some reason the notifier is triggered *THIS TIME*
* and we call br_del_if() a second time on same br and port
WARK: notify chain calling br_del_if(db8e0240,db105400)
WARK: calling br_sysfs_removeif(p=3)
* ooops, not suppose to call dget on d_count=0 *
* DOM0/Xen go BOOM *

Next thing is to hack up the br_device_event() to see what events are coming
through.

------- Comment #4 From Ryan Harper 2005-08-18 20:44 -------
Created an attachment (id=43) [edit]
Workaround for double call to br_del_if()

There appears to be a race between when the first br_del_if() call sets port
state to disabled, and then when the dev->br_port pointer is set to NULL. 
Every so often when unregister_netdevice() triggers the notify call change,
br_device_event() routine gets NETDEV_UNREGISTER event, but device->br_port
isn't NULL yet.  br_port is set to NULL after br_del_if() call del_npb() which
calls 
destroy_npb().	

So, good sequence:

br_del_if()
  del_nbp()
     // port state set to disabled, but dev->br_port is not NULL
     br_stp_disable_port()
     destroy_nbp_rcu()
	// set dev->br_port = NULL, among other things
	destroy_nbp()

notify_call_chain()
  br_device_event()
     dev->br_port == NULL, bail.

-----------------------------------------
bad sequence:

br_del_if()
  del_nbp()
     // port state set to disabled, but dev->br_port is not NULL
     br_stp_disable_port()
// notify call chain gets scheduled 
notify_call_chain()
  br_device_event()
     dev->br_port != NULL
     br_del_if()
       BOOM!

------- Comment #5 From Ryan Harper 2005-08-19 18:58 -------
The real race is between when a call_rcu() callback runs and when the
netif_destroy() calls unregister_netdev().

When we get an oops, the brctl delif IOCTL has run, and br_del_if() has
been called setting the port state to DISABLED, and then queues up an
rcu callback, destroy_npb(),  which will set dev->br_port = NULL.

add_del_if()                       // IOCTL handler from brctl delif
                                   // xen-br0 $VIF
  br_del_if()
    del_nbp()
      br_stp_disable_port()        // Set port->state = BR_STATE_DISABLED
      call_rcu(destroy_nbp_rcu)    // setup rcu callback to run
                                   // destory_nbp() which will set
                                   // dev->br_port = NULL

After the xen scripts have called the brctl command, xend sends the
disconnect and destroy control messages, which trigger:

netif_destroy()
   unregister_netdev()
      unregister_netdevice()
         notify_call_chain(NETDEV_UNREGISTER)
            br_device_event()    // The first thing done here is to check
                                 // the device's br_port to see if it is
                                 // NULL

If the destory_nbp_rcu() callback isn't fired before br_device_event()
checks dev->br_port, then the NULL check fails and a second call to
br_del_if() is invoked and we blow up in sysfs/kobject BUG_ON() for ref
counts of dentrys. [2]

------- Comment #6 From Ryan Harper 2005-08-25 20:24 -------
This changeset includes the best fix until Linux Bridge code is updated to deal
with the race.

# HG changeset patch
# User kaf24@firebug.cl.cam.ac.uk
# Node ID d61f8afd292c146a08b8047f95941bcd9477d504
# Parent  37030498b9782169fccc14b064daf83714557c8d
1. Remove workaround patch
2. Update scripts/vif-bridge to not call 'brctl delif'

First Last Prev Next    No search results available      Search page      Enter new bug