[WBEL-users] A story of Authentication, and a few questions

Kirby C. Bohling kbohling@birddog.com
Thu, 30 Sep 2004 18:09:58 -0500


On Thu, Sep 30, 2004 at 03:11:00PM -0700, bishop wrote:
> Dear list,

<snip... rough description>

> 	Question1:  the auth problem seen in RH9 with
> 	an absent LDAP server seems to have been solved.
> 	Agree?

Nope!  In my experience, RH7.1-RH9 and RHEL3 (and probably 2.1, but
I don't have a convenient way to test) all have a problem when LDAP
is down.  You cannot log in to any Linux machine via the console or
via a network connection if LDAP is down but the network is up.  If
the network is down, you can log in.

The fix I've found only works with more recent versions of nss_ldap
(I want to say anything past version 192, but don't quote me on
that).  I know it's a later version than what ships with RH7.1.

I believe the settings you need to add aren't described in the
default config file shipped by RH.  For some reason RH has never
updated their default config to describe new options; however, if
you read the docs you'll find them described.  I've ended up just
downloading the original tarball and perusing its config to see the
new settings.

bind_timelimit 2
bind_policy soft
timelimit 30

That will allow you to deal with the LDAP server being down while
the network is up.  However, if you have a very heavily loaded LDAP
server that takes longer than two seconds to respond, people won't
be able to log in with non-local accounts.
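For reference, those settings go in nss_ldap's config file
(typically /etc/ldap.conf on Red Hat installs; the path and comments
here are just my understanding of what each option does):

```
# /etc/ldap.conf (nss_ldap)
bind_timelimit 2   # give up on contacting a server after 2 seconds
bind_policy soft   # fail the bind immediately instead of retrying forever
timelimit 30       # cap how long any single search may run
```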

The other alternative is to binary edit the login binary (I'm not
kidding; if you read the source code, that's literally what they say
you should do).  You could also rebuild login if you liked, raising
the timeout.  It's set to 60 seconds, which with the default LDAP
settings will expire even when you are using the password in the
local /etc/shadow file.

> The interesting part is, netstat told me a story that didn't include a 
> running LDAP server, a bit surprising.  Of course, nothing in chkconfig, 
> because openldap-servers wasn't installed.  Mystery solved, then.

Not sure what this means.

> 
> 	Incidentally, apt-get bails here because of the
> 	redundant file dependency of the conversion script
> 
> >Resolving dependencies
> >...Segmentation fault
> >[root@golem root]# !!
> >yum install openldap-servers
> >
> >Unable to find pid
> >Gathering header information file(s) from server(s)

I don't understand any of this.  This isn't like what I normally
see.

> 
> So I fumbled with Yum, as it (downloaded the freaking library of 
> congress, segfaulted, downloaded MORE/other new stuff when I reran it, 
> and then, finally) installed openldap-servers.

If your headers are out of date, yes.  You'll download a ton of
headers the first time, one for each available RPM.  If you failed
to get all of the headers when it segfaulted, you'll get more stuff
the second time.  It sure sounds like you have a corrupt RPM
database (I know that after a fresh install of WBEL I've had a
corrupt RPM database).

http://beau.org/pipermail/whitebox-users/2004-May/001470.html

That's the e-mail where Vincent Raffensberger told me the solution
that solved my problems.  Ever since then if I install off of the
original WBEL media, the first thing I do is:

yum -y update rpm
yum -y update yum
yum -y update 

I know lots of people who ended up with a corrupted install when
they didn't do that.  I have no idea why.  I know there are several
bug fixes for RPM that aren't on the original media.  I don't know
whether it's reproducible off of respin 1.
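For the corrupt-database case itself, the usual recovery sequence
(this is the generic RPM advice, not necessarily what the linked
mail says; run it as root, and back up first) looks like:

```
cp -a /var/lib/rpm /var/lib/rpm.bak   # keep a copy in case this goes wrong
rm -f /var/lib/rpm/__db.*             # remove stale Berkeley DB environment files
rpm --rebuilddb                       # rebuild the indexes from the Packages file
rpm -qa > /dev/null                   # sanity check: querying should work again
```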

> 	Newbie Question2:  is that normal, for it to download a
> 	whack of stuff and then, when rerun with !!, download
> 	MORE stuff?  What the heck else could it need now that
> 	it didn't need a moment ago?  And is header list actually
> 	longer than the list of charges at Nuremberg?

If it segfaulted while grabbing them, yes, that's normal.  And yes,
the list of headers is incredibly long.  Even over a high-speed
connection it's very slow, because it's constantly setting up and
tearing down HTTP connections.  I believe it's not doing keep-alive,
which is most of the problem.  The amount of data actually
transferred is very small.

> 
> >Resolving dependencies
> >Dependencies resolved
> >I will do the following:
> >[install: openldap-servers 2.0.27-17.i386]
> 
> Okay.  So it looks like it's done.
> 
> Questions;  answer what ya like:
>  3 anyone seen the random death of yum like that?

No, but I have seen it in rpm, which yum is either invoking directly
or using through the same API.

>  4 hands up who thinks I need to re-run redhat-auth-config again.

Not sure.

>  5 I'm thinking I'll need to set up replication to get any useful
>    reliability in this auth scheme the next time the whim of the
>    backhoe cuts me off.  Agree?

Yes.  We have two LDAP machines inside our server room.  If we had a
remote location, we'd have two LDAP servers set up there.

>  6 How easy/versatile is the CNAME-ish search referral in LDAP?
>    Easy as it should be?

	No, it's not, in my experience.  You might try looking into
"CARP" or "VRRP"; those should help you deal with these problems.
They essentially build a virtual network interface on two machines:
if the first machine doesn't respond to a network request, the
second machine will.  It's pretty slick from what I've seen, but
I've never actually set it up.  I believe you can use it to get a
failover LDAP server using only one IP (we set up a virtual IP for
the LDAP servers, and each of the LDAP machines has another real IP
of its own).
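As a sketch of what the VRRP half looks like with keepalived (a
common Linux VRRP implementation; the interface name, router ID, and
addresses below are made-up examples):

```
# /etc/keepalived/keepalived.conf on the primary box (hypothetical values)
vrrp_instance LDAP_VIP {
    state MASTER            # the standby box says BACKUP here
    interface eth0
    virtual_router_id 51    # must match on both boxes
    priority 100            # standby uses a lower value, e.g. 50
    virtual_ipaddress {
        192.168.1.10        # the one address clients point ldap.conf at
    }
}
```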

	We solved it by just putting LDAP on the server where, if it
stops working for any reason, everyone goes home until it's fixed.
LDAP has never stopped working since then.

	Never discount the possibility that I just don't know all of
the right places to set that up.  If you use a CNAME in DNS, it will
round-robin, so half of all connections will fail if one of the
servers is down.  I know that, at least in HTTP, some of the LDAP
auth modules let you set up round robin that works.  I'm not sure if
you can in nss_ldap (if someone knows how, I'm all ears).  I know
that several of us have tried and failed to get it to work.
However, that might be a matter of having tried on RH7.1 and not
retried since.
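One thing that may be worth testing: newer nss_ldap versions accept
a space-separated list of servers in ldap.conf, which should give a
crude failover on its own (the hostnames here are placeholders;
check your version's documentation before relying on this):

```
# /etc/ldap.conf -- servers are tried in order
host ldap1.example.com ldap2.example.com
bind_timelimit 2    # so falling through to the second server is quick
bind_policy soft
```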

	I personally like the VRRP solution, as it's highly portable
from service to service.  I don't have to be that smart in the
configuration, and if I add additional redundant machines, it just
works without my touching a bunch of other machines.  Even if the
LDAP library's configuration is dumb about failover, it still just
works, and it's completely transparent to everyone else.

	Thanks,
		Kirby