Strangeness with ipconfig

Alex Meyer alex at thefind.com
Tue Apr 6 02:03:34 CEST 2010


Let me preface this by saying I'm somewhat experienced with FAI.  I've
used it with sarge and with etch and now I'm starting on the lenny
version.  After my experiences with etch, I was expecting a very smooth
transition to lenny.

Unfortunately, I'm running into difficulty.  The move to Debian Live
seems to have introduced a problem for me: a very long hang while
waiting for a DHCP response.

This is actually the third round of DHCP.  First is PXE from the BIOS,
then it appears that the kernel does DHCP right before 'Freeing unused
kernel memory'.  Finally, after 'Running /scripts/live-premount'
finishes, a userspace DHCP attempt is made, but it hangs for many minutes.
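
The second round is presumably the kernel's own IP autoconfiguration,
which is requested with an ip= parameter on the kernel command line
(ip=dhcp, for instance).  A quick way to confirm that from a shell
inside the initrd is sketched below.  The function name is made up for
illustration, and the sketch assumes tr and sed are available there.

```shell
# Print the value of every ip= parameter on the kernel command line.
# The optional argument names a file to read instead of /proc/cmdline;
# that argument is a convenience added here for testing.
cmdline_ip_param() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each parameter on its own line, then keep only ip=... values.
    tr ' ' '\n' < "$cmdline_file" | sed -n 's/^ip=//p'
}
```

If this prints "dhcp", the kernel was asked to configure the interface
itself before the initrd scripts start.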

I've been fishing around in the initrd trying to debug this.  I think
I've found a workaround, but it pains me to think about the implications.

Basically, to debug what was going on, I copied the 'strace' binary into
the initrd.  Then, I changed scripts/live around line 707 to this:

  strace -f ipconfig ${DEVICE} | tee /netboot.config

I was hoping to see a clue as to what was going wrong.  I already knew
from tcpdump that the DHCP server was receiving the requests and sending
replies.  Instead, I saw everything working perfectly with no hangs.
So, as a hack, I changed the line to:

  strace -o /dev/null ipconfig ${DEVICE} | tee /netboot.config

And now everything works as it should.  Of course, now I need a
custom-mangled initrd.  And now I wonder what this ipconfig binary is
and why it's so unreliable for me.  I assume that running it in strace
changes some aspects of timing and/or memory layout, thereby avoiding a
bug in it.

When I take out the strace, it goes back to not working.  The DHCP
server sends the responses and usually gets an ICMP port unreachable
packet for UDP port 68.  This leads me to believe that ipconfig doesn't
properly set up its networking when run outside of strace.
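
To tell a timing problem from a setup problem, one experiment would be
to bound each ipconfig run and retry it, rather than wrapping it in
strace.  The sketch below is untested on this setup.  The retry helper
is hypothetical, and the commented usage assumes that this ipconfig
accepts -t as a timeout in seconds and exits non-zero when it gets no
lease.

```shell
# Run a command up to max_tries times, stopping at the first success.
# Returns 0 if any attempt succeeds, 1 if all of them fail.
retry() {
    max_tries="$1"
    shift
    try=1
    while [ "$try" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $try of $max_tries failed: $*" >&2
        try=$((try + 1))
    done
    return 1
}

# Hypothetical use in scripts/live (untested; -t and the exit status
# behaviour are assumptions about this ipconfig):
#   retry 5 ipconfig -t 10 "${DEVICE}" | tee /netboot.config
```

If a later attempt succeeds where the first one hangs, that would point
at timing rather than at how ipconfig sets up its socket.  This still
requires a modified initrd.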

My environment is i386.  The target machine is a recent two-core Intel box.

Can anyone help me get things working without this hack?

Thanks.

-- 
-- Alex Meyer --

