Xen 
 
Bug#: 1301
Status: ASSIGNED
Assigned To: Xen Bug List <xen-bugs@lists.xensource.com>
Reporter: adolfo.ferreira@hotmail.com
Attachment: Debug of Windows 2003 R2 SP2 (text/plain, 18.81 KB, created 2008-07-18 09:31)

Description:   Opened: 2008-07-18 08:32
I'm trying to transfer files between Windows 2003 and Samba, and the connection
crashes.
explorer.exe crashes and my Samba session dies, although the Samba server itself
keeps running. Samba just produces a core dump.
I first hit this issue on Windows 2003 with PV driver 0.8.9. I then upgraded to
0.9.10 and it still happened, so I created another Windows 2003 VM with a fresh
install of 0.9.10 and saw the same issue. I also tried Windows 2003 with
0.9.11-pre7.

My setup is described below:

Dom0:
OS: Debian etch 64-bit
Machine: PowerEdge 2950
Xen 3.2.1
eth0: Broadcom gigabit
eth1: Broadcom gigabit

DomU:
OS: Windows 2003 R2 SP2 64-bit
Windows PV 0.9.11-pre7 (from meadowcourt.org)
Type: HVM
Xen Net Driver using bridge eth1

DomU:
OS: Debian etch 64-bit
Type: PVM
eth0 using bridge eth1
Samba 3.0.30

As icarus901@freenode suggested, I generated debug information as described in
the XenWiki, using the WinDbg tool.
I uploaded my configs, logs and debug output to the website. The debug output
from WinDbg is named Debug-16-07-2008-15_49.txt

I'm pretty sure this is a bug, because if I boot Windows 2003 without /gplpv I
can communicate with Samba without problems using the emulated Realtek card.
I also tried the following: I connected eth0 and eth1 of Dom0 to the same
physical switch, then attached Windows 2003 to the eth0 bridge and Samba to the
eth1 bridge. Communication between them was fine.

I also tried the same scenario with Samba 3.0.24 and the problem occurred there
too.

------- Comment #1 From adolfo.ferreira@hotmail.com 2008-07-18 09:31 -------
Created an attachment (id=779)
Debug of Windows 2003 R2 SP2

------- Comment #2 From adolfo.ferreira@hotmail.com 2008-07-18 09:32 -------
(From update of attachment 779)
I used WinDbg as described in the XenWiki.

------- Comment #3 From adolfo.ferreira@hotmail.com 2008-07-18 11:37 -------
I've also tried the following scenarios:

windows <-> bridge1 <-> eth0 <-> SWITCH <-> eth1 <-> bridge2 <-> samba : WORKS with Xen PV driver 0.9.10
windows <-> bridge1 <-> samba : DOESN'T WORK with Xen PV driver
windows <-> bridge1 <-> samba : WORKS without /gplpv

------- Comment #4 From James Harper 2008-07-19 03:53 -------
Given that it works when routing packets via a real switch, I'm wondering if we
have a problem somewhere with GSO (or possibly, but less likely, csum offload).

GSO packets will stay in 'large' format until Linux determines that they need
to be split up into MSS sized chunks, which will happen when the packet
traverses a physical interface (eg in the example where you included a switch).
The realtek emulated adapter definitely doesn't do GSO.
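
To make the splitting step concrete, here is a minimal Python sketch of cutting one large payload into MSS-sized chunks. The function name `segment` and the 4000-byte payload are made up for illustration; 1460 is the usual TCP MSS on a 1500-byte-MTU Ethernet link (1500 minus 20 bytes each for the IP and TCP headers). This is only the idea of the split, not how Linux or the PV drivers implement GSO.

```python
from typing import List


def segment(payload: bytes, mss: int = 1460) -> List[bytes]:
    """Split one large TCP payload into chunks of at most `mss` bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


# One "large" payload, as a GSO-capable sender might hand down in one piece.
big = bytes(4000)
chunks = segment(big)
print([len(c) for c in chunks])  # [1460, 1460, 1080]
```

A packet that skips this step, for example because it never crosses a physical interface, reaches the receiver still in its large form.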

Can you please try turning off GSO (large send offload) in the network settings
in Windows (properties of the network adapter) and see if that makes a
difference. If not, please try disabling GSO on the samba domain too.

If that does make a difference, we then need to determine why, but let's cross
that bridge when we come to it.

------- Comment #5 From adolfo.ferreira@hotmail.com 2008-07-21 05:05 -------
I forgot to mention that I had also tried disabling GSO in Windows before
reporting this issue, but the problem still appeared.
I tried again just now with the same result, so I checked the GSO setting on
the Linux side, but it was already off. TSO (TCP segmentation offload) was ON,
but I couldn't turn it off using ethtool.
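
For anyone repeating this check, the offload state on the Linux side can be read with `ethtool -k <iface>` and changed (where the driver allows it) with `ethtool -K <iface> tso off`. The Python sketch below only parses the text output into a dictionary. The function name `parse_offloads` is hypothetical, and the sample text is illustrative rather than captured from the system in this report; it simply mirrors the state described above (GSO off, TSO on).

```python
def parse_offloads(ethtool_output: str) -> dict:
    """Map feature name -> True/False from `ethtool -k <iface>` style text."""
    features = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "name: value" line
        words = value.split()
        # Only on/off lines are features; newer ethtool versions may append
        # a marker such as "[fixed]" after the state, which is ignored here.
        if words and words[0] in ("on", "off"):
            features[name.strip()] = (words[0] == "on")
    return features


# Illustrative sample only (assumed wording, not from the reporter's machine).
sample = """\
Offload parameters for eth0:
tcp segmentation offload: on
generic segmentation offload: off
"""

state = parse_offloads(sample)
print("TSO on:", state["tcp segmentation offload"])
print("GSO on:", state["generic segmentation offload"])
```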

------- Comment #6 From adolfo.ferreira@hotmail.com 2008-07-21 05:06 -------
After this attempt I also rebooted and tested again. After that I also disabled
checksum offload in Windows, but the issue still occurred.

------- Comment #7 From James Harper 2008-07-21 05:09 -------
I guess it's packet trace time then. Can you fire up wireshark and get a packet
capture of the point when it breaks? Before you send me anything, please check
and see if you can see any packets bigger than 1500 bytes (or 1514 including
the header). This would indicate that GSO isn't being disabled properly.
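
The size check can also be scripted instead of done by eye. The sketch below is a minimal example under some assumptions: it reads a capture in the classic pcap format (as written by `tcpdump -w`; pcapng is not handled) using only the standard library, and reports frames whose original length exceeds 1514 bytes. The function name `oversized_frames` and the synthetic capture are made up for illustration.

```python
import struct


def oversized_frames(pcap: bytes, limit: int = 1514):
    """Return (frame_number, original_length) for frames longer than `limit`."""
    if len(pcap) < 24:
        raise ValueError("too short to be a pcap file")
    magic = struct.unpack("<I", pcap[:4])[0]
    if magic in (0xA1B2C3D4, 0xA1B23C4D):      # file written little-endian
        endian = "<"
    elif magic in (0xD4C3B2A1, 0x4D3CB2A1):    # file written big-endian
        endian = ">"
    else:
        raise ValueError("not a classic pcap file (pcapng is not handled)")
    hits, offset, number = [], 24, 0           # 24-byte global header
    while offset + 16 <= len(pcap):
        # Per-record header: ts_sec, ts_frac, captured length, original length
        _sec, _frac, incl_len, orig_len = struct.unpack(
            endian + "IIII", pcap[offset:offset + 16])
        number += 1
        if orig_len > limit:
            hits.append((number, orig_len))
        offset += 16 + incl_len
    return hits


# Synthetic capture with three frames of 60, 1514 and 4434 bytes.
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)


def record(length: int) -> bytes:
    return struct.pack("<IIII", 0, 0, length, length) + bytes(length)


capture = header + record(60) + record(1514) + record(4434)
print(oversized_frames(capture))  # [(3, 4434)]
```

For a real capture, pass the file contents instead, e.g. `oversized_frames(open("capture.pcap", "rb").read())`, where `capture.pcap` is a placeholder file name.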

------- Comment #8 From adolfo.ferreira@hotmail.com 2008-07-21 05:21 -------
I noticed that after disabling checksum offload Samba no longer crashed, and I
was able to transfer small files of around 60 MB. With larger files of around
600 MB the transfer started, but about halfway through the estimated-time
counter went crazy and the time increased far too quickly. The transfer
eventually finished without data corruption. I then tried another identical
transfer and the file was not transferred.
Conclusion: Samba didn't crash with checksum offload disabled, but the transfer
could not estimate the remaining time correctly.

------- Comment #9 From adolfo.ferreira@hotmail.com 2008-07-21 05:27 -------
I disabled checksum offload in Windows only.
My Windows settings were: checksum offload OFF; GSO OFF.
Linux settings: GSO off; TSO on.


(In reply to comment #8)

------- Comment #10 From adolfo.ferreira@hotmail.com 2008-07-21 05:50 -------
Folks,

After disabling checksum offload it seems to be working fine now! Transfers are
fast, but when I transfer files larger than 700 MB the transfer time is not
estimated correctly.
Can I help you with more information to detect why checksum offload is not
working properly?

------- Comment #11 From James Harper 2008-07-21 06:03 -------
> After disabling checksum offload it seems to be working fine now! Transfers are
> fast, but when I transfer files larger than 700 MB the transfer time is not
> estimated correctly.

Is the estimate correct when you use the qemu drivers?

In doing some testing I have noticed that sometimes the xennet drivers perform
badly, like 1/10th of the normal performance. I think something is happening
that is causing packets to be corrupted or dropped, causing tcp to go into slow
restart. On a 10 second test this could have a big impact if it happens a few
times. Of course the rings filling up would cause this too.

> Can I help you with more information to detect why checksum offload is not
> working properly?

If you could do a packet trace (I recommend wireshark for this) with GSO
disabled and CSUM offload enabled (i.e. your failing configuration) and confirm
to me that no packet is larger than 1500/1514 bytes, it would be useful. I'm
wondering if GSO is staying enabled even though you disabled it. A packet much
larger than 1500 bytes would indicate this. Disabling csum offload should make
sure GSO is never enabled, so you may be masking the problem by disabling that.

Otherwise, I really need to see the packet that is killing samba... preferably
on the Linux side of things... that's going to be tricky to get though.

I might put some ASSERT statements in the code to detect things that shouldn't
happen and see if they get tripped...

------- Comment #12 From adolfo.ferreira@hotmail.com 2008-07-21 06:59 -------
I'm preparing another VM just for debug tests. Later today I will sniff the
network with wireshark and tcpdump.

------- Comment #13 From adolfo.ferreira@hotmail.com 2008-07-22 04:15 -------
Harper,
I did as you said: GSO ON and checksum offload ON on the Windows side. I got TCP
segment data with a size of 1460 bytes.
In wireshark I noticed the checksum highlighted in red, with the message:
"Checksum: 0x18b6 [incorrect, should be 0x8bb4 (maybe caused by "TCP checksum
offload"?)]"
If you need a capture from the Linux side, please let me know.
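
For context on the wireshark message: with checksum offload enabled, the sending stack typically leaves the TCP checksum unfinished for the device to complete, so a capture taken on the sender can show a wrong value even when nothing is broken. The sketch below shows how the expected value is computed: the standard Internet checksum (RFC 1071) over a pseudo-header plus the TCP segment. The function names and the addresses and segment in the example are made up for illustration; the 0x18b6 and 0x8bb4 values above cannot be reproduced without the actual packet.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum over the IPv4 pseudo-header plus TCP header and payload.

    To compute a fresh checksum, the checksum field in `segment`
    (bytes 16-17 of the TCP header) must be zero.
    """
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return internet_checksum(pseudo + segment)


# Hypothetical addresses and a minimal 20-byte TCP header with a zero checksum.
src = bytes([192, 168, 0, 10])
dst = bytes([192, 168, 0, 20])
tcp = struct.pack("!HHIIBBHHH", 1025, 445, 1, 0, 5 << 4, 0x02, 8192, 0, 0)

value = tcp_checksum(src, dst, tcp)
filled = tcp[:16] + struct.pack("!H", value) + tcp[18:]
# Re-running the sum over a segment with a correct checksum yields 0.
print(hex(value), tcp_checksum(src, dst, filled) == 0)
```

A receiver (or wireshark) performs the same verification, which is why a checksum left unfinished by an offloading sender is flagged as incorrect.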

------- Comment #14 From adolfo.ferreira@hotmail.com 2008-07-22 05:37 -------
Sorry, I wrote that wrong. I did as you said: GSO OFF and checksum offload ON on
the Windows side.
