FAI 2.4 upgrade problems

Felix Kühling fxkuehl at gmx.de
Thu Feb 27 11:29:27 CET 2003


On Wed, 26 Feb 2003 14:12:19 -0600
Justin Doiel <jdoiel at engr.uark.edu> wrote:

> 
> Heya.
> 
> I'm presently going through the same pain, so here's my two cents. :)
> 
> On Wed, Feb 26, 2003 at 06:39:21PM +0100, Felix Kühling wrote:
> > Hi,
> > 
> > I used FAI 2.3.4 with Woody and DHCP. On the upgrade to 2.4 I'm
> > experiencing 4 serious problems that prevent me from completing an
> > installation successfully.
> > 
> > 1. I had trouble making a boot floppy. The newly ext2-formatted floppy
> > was automagically mounted as vfat and when trying to rmdir lost+found
> > make-fai-bootfloppy aborted. I fixed it by adding "-t ext2" to the mount
> > command line.
> 
> I didn't run into this one at all. My boot floppies work perfectly.

I'm not really sure it's related to the upgrade. Could it just be a
strange floppy? Anyway, the fix was rather simple once I had figured it
out :)

> > 2. I can't get a shell after install or sysinfo (didn't try other
> > actions). No matter what I do (pressing <RETURN> or ctrl-c) it always
> > reboots. Somehow I managed to get a shell after messing around a bit in
> > fai_end, but I didn't really understand how.
> 
> I'm also having this problem; no matter what, it reboots.
> I THINK it may have something to do with me removing tcsh from my setup, but I'm not sure.

I never used tcsh.

> > I added a line "/sbin/sulogin" in fai_end. But the shell behaved quite
> > strangely. There was no prompt and backspace repeated deleted characters
> > on the terminal. After I made some mistake that killed the shell, init
> > entered runlevel 2 and I got a usable shell.
> > 
> > 3. The logfiles cannot be saved on the install server. I get two error
> > messages about rcmd problems. Sorry, I forgot the exact words. Maybe
> > this is related to the next one.
> > 
> 
> What method did you use? I used the SSH method, installed an SSH client on the root
> filesystem, and had to manually copy in a /root/.ssh/known_hosts with my server in it.
> I still get an error message, but it works.

I use rsh. I didn't change anything in the list of packages in nfsroot
during the upgrade to 2.4.

> 
> <snip>
> # extra packages which will be installed into nfsroot
> # add lvm, raidtools2 only if needed
> NFSROOT_PACKAGES="expect pump ssh"
> </snip>

NFSROOT_PACKAGES="ssh expect reiserfsprogs dpkg-dev rsh-client"

> 
> I ALSO had to read through the scripts before finding out there is a LOGSERVER
> variable that needs to be set in fai.conf, e.g.:
> 
> <snip>
> # /boot/fai;chmod g+w /boot/fai. If the variable is undefined, this
> # feature is disabled
> LOGUSER=faimaster
> LOGSERVER=ageruka
> # use ssh or rsh for copying log files to user fai and for changing
> # tftp symbolic link
> #FAI_REMOTESH=rsh
> #FAI_REMOTECP=rcp
> FAI_REMOTESH=ssh
> FAI_REMOTECP=scp
> </snip>

LOGUSER=fai
# use ssh or rsh for copying log files to user fai and for changing
# tftp symbolic link
FAI_REMOTESH=rsh
FAI_REMOTECP=rcp

Hmm, that could be the problem. I never had a LOGSERVER= line in my
fai.conf. Not with 2.3.4 and not in the maintainer version of 2.4. But
it used to work with 2.3.4 anyway.
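
To compare the two configurations quickly, here is a minimal sketch that sources a fai.conf-style file in a subshell and lists which of the log-related variables end up unset. The function name check_fai_logvars is made up for this mail and is not part of FAI; whether 2.4 really requires LOGSERVER is still only a guess.

```shell
# Hypothetical helper, not part of FAI: report which log-related
# variables a fai.conf-style file leaves unset or empty.
check_fai_logvars() {
    conf="$1"
    (
        # Work in a subshell so the caller's environment stays untouched.
        unset LOGUSER LOGSERVER FAI_REMOTESH FAI_REMOTECP
        . "$conf" || exit 2
        missing=""
        for var in LOGUSER LOGSERVER FAI_REMOTESH FAI_REMOTECP; do
            eval "val=\${$var:-}"
            [ -n "$val" ] || missing="$missing $var"
        done
        if [ -n "$missing" ]; then
            echo "unset:$missing"
            exit 1
        fi
        echo "ok"
    )
}
```

Called with the path to the config file (for example ./fai.conf), it prints either "ok" or the names of the variables that are missing.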

> 
> <NOTE>LOGUSER=faimaster is a local change, I don't like naming users after services.</NOTE>
> 
> > 4. The DHCP information doesn't make it into environment variables. I
> > tried the dhclient -lf /dev/null command line as in get-boot-info
> > manually in the shell, but it didn't output anything to stdout. If I
> > understand get-boot-info correctly dhclient is *supposed* to output
> > variable definitions for all DHCP parameters. They are redirected to
> > /tmp/fai/bootlog and sourced later by task_confdir.
> > 
> 
> Don't have that problem here. :P

Do you use DHCP or BOOTP? After reading a few scripts and manpages I
have an idea what the problem could be. I saw a dhclient-perl script in
.../nfsroot/sbin which seems to translate dhcp options into shell
variables. It gets called by dhclient through /sbin/dhclient-script. But
there is this line which makes me worry:

# exit if no data is available
exit 0 unless $ENV{new_option_170};

As suggested in the new documentation I had removed the option_17x
options from my dhcpd.conf and used the new FAI_LOCATION variable in
fai.conf and class/LAST.var for setting FAI_ACTION. But here it looks as
if at least option_170 is still needed. Is this a bug in the script or
the documentation?
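
To illustrate what I suspect, here is a rough shell imitation of that guard. It is only a sketch and not the real dhclient-perl: the function name emit_dhcp_vars and the output format (one shell assignment per line) are my assumptions. Only new_option_170 comes from the script, and new_ip_address is the variable dhclient hands to its script.

```shell
# Sketch only: imitates the guard quoted above, not the real dhclient-perl.
emit_dhcp_vars() {
    # Like the Perl line, give up silently when option 170 is
    # unset, empty or "0" (all of which Perl treats as false).
    case "${new_option_170:-}" in
        ""|0) return 0 ;;
    esac
    # Assumed output format: one shell assignment per line.
    echo "IPADDR=${new_ip_address:-}"
}
```

With option 170 removed from dhcpd.conf the function prints nothing at all, even though the client did get an address. That would match the empty output I saw.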

I haven't tested this theory yet. Maybe later today. I'll let you know
what I find.

> > Thus IPADDR ends up undefined and as a result my 01alias doesn't add
> > most of the classes (NETWORK, LILO, BOOT, ...). Eventually I get an
> > unbootable system without a (simple) way to get a shell after the
> > installation and no logfiles on the install server. That makes debugging
> > real fun! ;-)
> > 
> 
> Hmm. Not that one either.

Well, it's a consequence of the DHCP variable problem and specific to
the way I detect the configuration (by IPADDR).
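
Roughly, the detection works like the sketch below. This is a simplified stand-in and not my real 01alias; the function name classes_for_ip and the address range are invented for illustration, and I'm assuming a class script simply prints the class names it wants added.

```shell
# Hypothetical stand-in for a class script keyed on IPADDR; the
# address range and its mapping to classes are invented.
classes_for_ip() {
    case "${1:-}" in
        192.168.1.*) echo "NETWORK LILO BOOT" ;;
        "")          echo "warning: IPADDR is empty, no classes added" >&2 ;;
    esac
}
```

Called as classes_for_ip "$IPADDR", an empty IPADDR matches none of the address patterns, so none of the classes are printed and the install continues without them.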

> > If you need any more details or have patches for me to test, just let me
> > know. The most important first step is probably to get the shell working
> > after the installation.
> > 
> > Regards,
> >    Felix
> 
> Good luck, hope I helped instead of confusing.
> 
> Justin Doiel <jdoiel at engr.uark.edu>
> 

Felix

               __\|/__    ___     ___     ___
__Tschüß_______\_6 6_/___/__ \___/__ \___/___\___You can do anything,___
_____Felix_______\Ä/\ \_____\ \_____\ \______U___just not everything____
  fxkuehl at gmx.de    >o<__/   \___/   \___/        at the same time!


