tg3 network cards

Ryan Steele ryans at aweber.com
Thu Dec 4 00:45:05 CET 2008


This may or may not be the proper list/outlet for this, so if it's not, 
feel free to let me know and I'll pursue it elsewhere, but since it came 
up during a FAI installation I'll start here.

I've got a server with a Broadcom network card (a 5721) using the tg3 
driver, and that box just absolutely refuses to get through the initrd.  
The installation hangs in the initramfs in 
/scripts/live on the function do_netmount(), and I'm pretty sure it's 
because the 'ipconfig' binary included in the initrd is killing 
networking.  I've spent a few days now hacking the initrd, the init 
script, and its functions to determine the path it takes to get there, 
which appears to be:


1. init is invoked
2. init sources /scripts/live (since boot=live in the pxelinux.cfg), and 
then calls the function mountroot(), which is defined in /scripts/live
3. mountroot calls several other functions within the /scripts/live 
script, and eventually gets to do_netmount()
4. inside do_netmount, it encounters a line in which the binary 
'ipconfig' (yes, ipconfig, not ifconfig) is called, and this is where it 
hangs.  I've added some debugging code for clarity:


do_netmount ()
{
   rc=1

   modprobe -q af_packet # For DHCP

   udevtrigger
   udevsettle

   echo -e "\nThis is right before we 'ipconfig ${DEVICE}'\n" >/dev/console 2>&1
   ipconfig ${DEVICE} | tee /netboot.config

   echo -e "\nRight before sourcing ipconfig output\n" >/dev/console 2>&1
   # source relevant ipconfig output
   OLDHOSTNAME=${HOSTNAME}
   . /tmp/net-${DEVICE}.conf
 
  <snip>
}
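
One way to keep a stuck 'ipconfig' from blocking the boot indefinitely 
would be to wrap the call in a watchdog.  The following is only a minimal 
sketch: run_with_timeout is a hypothetical helper (it is not part of 
/scripts/live), and it assumes the shell in the initramfs provides 
'sleep', 'kill' and 'wait'.  It also cannot help if the process is stuck 
inside the kernel and never acts on the signal.

```shell
# Hypothetical helper, not part of /scripts/live.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
# Runs COMMAND in the background and sends it SIGTERM if it is still
# running after SECONDS.  Returns COMMAND's exit status (non-zero if killed).
run_with_timeout ()
{
   rwt_secs=$1
   shift

   "$@" &
   rwt_cmd_pid=$!

   # Watchdog: after the delay, terminate the command if it still exists.
   # Its output is discarded so it never holds a pipe open for the caller.
   (
      sleep "${rwt_secs}"
      kill "${rwt_cmd_pid}" 2>/dev/null
   ) >/dev/null 2>&1 &
   rwt_dog_pid=$!

   wait "${rwt_cmd_pid}"
   rwt_status=$?

   # The command has ended one way or another; stop the watchdog.
   kill "${rwt_dog_pid}" 2>/dev/null
   wait "${rwt_dog_pid}" 2>/dev/null

   return "${rwt_status}"
}
```

In do_netmount() the call could then look like 
'run_with_timeout 60 ipconfig ${DEVICE} | tee /netboot.config', where the 
60 second limit is an arbitrary value chosen for illustration.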


With that debugging output in place, the last output to the console is:



This is right before we 'ipconfig eth0'


[  100.068705] tg3: eth0: Link is up at 1000 Mbps, full duplex.
[  100.068767] tg3: eth0: Flow control is off for TX and off for RX.
[  393.930374] Machine check events logged
[  699.732829] Machine check events logged



...and from there it just hangs indefinitely.  I know for a fact that 
the kernel module, tg3.ko, is being loaded by load_modules, so that's 
not the problem - in fact I'm almost 100% positive that 'ipconfig' is 
killing network connectivity.  I initially thought I was (and maybe I 
still am) getting bitten by a really crappy Broadcom card/driver, but 
when I tested ipconfig in a VM (extracted the initrd, ran 'bin/ipconfig 
eth0 | tee /outfile'), the only way I could get the machine to be 
operable over the network again was to pop in to the VM console and issue 
an '/etc/init.d/networking restart'.  That may not be a great litmus 
test since it's using a virtual interface, not a real hardware interface 
- but it does mimic the behavior I see on real hardware exactly.
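
To confirm that a command really alters the interface configuration, the 
state could be captured before and after running it.  This is only a 
sketch: net_state and check_net_change are hypothetical helper names, and 
it assumes the 'ip' tool from iproute2 is available on the test machine 
(it may well not be inside the initrd itself).

```shell
# Hypothetical helpers for illustration only.

# Print the addresses and routes currently bound to one interface.
net_state ()
{
   ip addr show dev "$1"
   ip route show dev "$1"
}

# Usage: check_net_change IFACE COMMAND [ARGS...]
# Runs COMMAND and prints "changed" if net_state for IFACE differs
# afterwards, or "unchanged" if it is identical.
check_net_change ()
{
   cnc_iface=$1
   shift

   cnc_before=$(net_state "${cnc_iface}" 2>&1)
   "$@"
   cnc_after=$(net_state "${cnc_iface}" 2>&1)

   if [ "${cnc_before}" = "${cnc_after}" ]; then
      echo "unchanged"
   else
      echo "changed"
   fi
}
```

From the extracted initrd this would be run from the VM console as 
'check_net_change eth0 bin/ipconfig eth0', so that the result is still 
visible if network access is lost.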

I would love to hear any insights and/or similar experiences others may 
have had with this.  I have several other servers that use different 
network drivers (igb, e1000, etc.) that all seem to work just fine, 
which reinforces my suspicion that this Broadcom card is just poorly 
supported on Linux.  I've tried both the tg3.ko that ships with Ubuntu, 
and compiling the driver myself, both with the same results.


More information about the linux-fai mailing list