[WBEL-users] Network Connectivity on WBEL 4

Kirby C. Bohling kbohling at birddog.com
Sun Jul 24 03:54:55 CDT 2005



On Sun, Jul 24, 2005 at 03:04:45AM -0500, Dan Herrstrom wrote:
> On 7/24/05, Kirby C. Bohling <kbohling at birddog.com> wrote:
> > On Sun, Jul 24, 2005 at 01:06:25AM -0500, Dan Herrstrom wrote:
> > > Greetings all.
> > >
> > > I just installed WBEL 4 on my "server" box and for some reason I
> > > cannot get it to talk to my network. The machine has two cards, a
> > > Netgear FA311 10/100 and a Netgear GA311 Gigabit. I'm not sure exactly
> > > which one is which eth device but I have assigned the IP 172.16.1.2 to
> > > eth0 and 172.16.1.3 to eth1. If I simply run ping 172.16.1.2 or ping
> > > 172.16.1.3 it succeeds, but if I run ping -I eth0 172.16.1.3 or ping
> > > -I eth1 172.16.1.2 I receive "Destination Host Unreachable". I also
> > > cannot ping this machine from my Windows 2000 workstation nor can I
> > > ping it from the server. It acts as though all the necessary modules
> > > are installed and I know the cards to be in working order as they
> > > worked fine when I had Fedora Core 2 installed on the machine. Any
> > > suggestions would be appreciated.
> > >
> > > Thanks for your help..
> > 
> > You can start by running "ifconfig -a" or "ip addr show" as root.
> > Post that.  Then we can see how it's actually set up, not how you
> > describe it.  A lot of times, the mismatch between what you think
> > and what is, is the problem (that's not an insult, I've been bitten
> > enough times by this that I don't trust how I think something is
> > configured, I double check the tools agree with me).  Next, it'd be
> > helpful if you would describe the rest of the network, and how it's
> > configured TCP/IP and physically cabled.
> [root at mainframe bin]# ip addr show
> eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
>  link/ether 00:09:5b:e0:f4:f8 brd ff:ff:ff:ff:ff:ff
>  inet 172.16.1.2/24 brd 172.16.1.255 scope global eth0
> 	valid_lft forever preferred_lft forever
> eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
>  link/ether 00:09:5b:61:ce:99 brd ff:ff:ff:ff:ff:ff
>  inet 172.16.1.3/24 brd 172.16.1.255 scope global eth1
> 	valid_lft forever preferred_lft forever
> 
> At present the only other connection to the network besides the
> machine in question is from my workstation which is running Windows
> 2000 Professional SP4 and is set to use the IP 172.16.1.250 and Subnet
> mask 255.255.255.0. Attempts to ping either card on 'mainframe' result
> in "Request timed out." All connections are using Cat 5 UTP to a 5
> port NetGear GS605 switch. All link lights are lit (green on the
> switch for the gigabit card, amber for the 100Mbps cards)

Do the lights blink when you run the ping?  I know it's silly, but
that's actually how I figured out my problem with my bad switch
(the lights blinked on the port that the packet came out, but didn't
on the port the packet should have been received on).  That will at
least tell you if the packet is leaving the Linux machine.  If it
is, does the Windows machine's light blink immediately afterwards?

If you can control the network enough to ensure that it's traffic
free, take the workstation and run whatever Windows utility shows
how many packets have been sent and received (one of the "LAN
Connection" windows shows it).  Record the TX and RX counters on
both machines.  Then run ping so it sends only one packet.  Now
examine all of the counters again.  Which ones changed should help
figure out what activity is actually happening.  I'd try doing
that, and running ping in both directions (from Windows to Linux,
and from Linux to Windows).
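
On the Linux side the packet counters are in /proc/net/dev.  A rough
sketch of how I'd grab them is below.  pkt_counters is just a name I
made up, and the column positions assume the usual /proc/net/dev
layout (8 receive columns, then 8 transmit columns), so check them
against the header line on your box before trusting the numbers.

```shell
# Print the RX and TX packet counters for one interface.
# Usage: pkt_counters IFACE [FILE]   (FILE defaults to /proc/net/dev)
pkt_counters() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }   # some kernels print "eth0:1234" with no space
        $1 == ifc { print "rx_packets=" $3, "tx_packets=" $11 }
    ' "${2:-/proc/net/dev}"
}

# Intended use, on a quiet network:
#   pkt_counters eth0            # before
#   ping -c 1 172.16.1.250       # send exactly one echo request
#   pkt_counters eth0            # after
```

If tx_packets goes up but rx_packets doesn't, the request left the
box and nothing came back on that interface.  Run it for eth1 as
well, since with both cards on the same subnet the kernel may pick
either one.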

> > 
> > I'm not shocked the "ping -I" failed.  You are forcing it to go out
> > the wrong interface.  On a machine I have handy with two interfaces:
> > 
> > (Output edited slightly to remove unimportant bits...)
> > [root at harrier ~]# ip addr show eth0
> > 2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
> >     inet 10.10.2.18/16 brd 10.10.255.255 scope global eth0
> > [root at harrier ~]# ip addr show eth1
> > 3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
> >     inet 10.15.0.2/16 brd 10.15.255.255 scope global eth1
> > 
> > [root at harrier ~]# ping -I eth1 10.10.2.18
> > PING 10.10.2.18 (10.10.2.18) from 10.15.0.2 eth1: 56(84) bytes of data.
> > From 10.15.0.2 icmp_seq=1 Destination Host Unreachable
> > ...
> > 
> > So I would ignore the "ping -I" symptoms if I were you.  I get the
> > same thing on a machine that works fine.
> > 
> > Next, the cards should have the MAC addr's printed on them.
> > Normally the output of "ip addr show" will show you both of them.
> > You can check that the physical card is the one you think it is to
> > ensure the interface you think is eth0 is in fact what the Linux
> > kernel thinks is eth0.
> The gigabit card has the MAC 00:09:5b:e0:f4:f8 (same as eth0).
> The 10/100 card has the MAC 00:09:5b:61:ce:99 (same as eth1).

Check.

> > 
> > Next, did you turn any of the firewall security on?  I've seen many
> > a problem resolved by doing "service iptables stop".  When you
> > installed, generally the firewall by default is set fairly secure.
> > If you didn't turn it down, there are a lot of things that won't
> > work.  You will want to go back and re-secure the firewall later,
> > but that will at least help you identify that it is the firewall.
> > Everyone says to use "high" security on the firewall.  That always
> > seems like a good idea.  I always end up having to disable it,
> > troubleshoot my problem, poke the holes I need, and enable the
> > firewall again.
> [root at mainframe bin]# service iptables status
> Firewall is stopped.
> [root at mainframe bin]# service iptables start
> [root at mainframe bin]# service iptables status
> Firewall is stopped.
> [root at mainframe bin]# service iptables stop
> [root at mainframe bin]# service iptables status
> Firewall is stopped.

Curious.  It says stopped, even after you started it...  That's
weird.  What does "iptables -L" say?  That'll print out the actual
rules and, more specifically, the default policy of each chain.
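
If the full listing is long, the part I'd look at first is the
policy on each built-in chain.  Here's a quick sketch for pulling
just those out.  chain_policies is a made-up helper name, and it
assumes the usual "Chain INPUT (policy ACCEPT)" header lines that
"iptables -L" prints.

```shell
# Print "CHAIN POLICY" for each built-in chain, reading the output of
# "iptables -L -n" on stdin.  User-defined chains have no policy and
# are skipped.
chain_policies() {
    awk '$1 == "Chain" && $3 == "(policy" {
        sub(/\)$/, "", $4)      # strip the trailing ")" if present
        print $2, $4
    }'
}

# Intended use (as root):
#   iptables -L -n | chain_policies
```

If INPUT or OUTPUT shows anything other than ACCEPT, that alone
could explain the lost pings, whatever the service script claims.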

> > 
> > Finally, you might be looking in all the wrong places.  I've had
> > several times where "ping" didn't work, and not because the machine
> > I had just set up was configured wrong.  The packets got off the new
> > machine and to the destination machine just fine.  However, the
> > existing machine couldn't route packets back, because it was
> > missing a route or something else was misconfigured.  So be aware of
> > that.  The handiest way to see that is to run a packet sniffer on
> > the destination.  The poor man's way of doing that is to watch the
> > blinky lights on the switches.
> > 
> > When all else fails, blame the wiring and switches...  Just today, I
> > had a networking problem that didn't make any sense.  I just
> > couldn't ping from one machine to the other.  Turns out I had a bum
> > switch from NetGear.  Took me 2 hours to track it down.
> Ok just to rule anything out I plugged in my laptop
> (Windows XP SP2, IP 172.16.1.149/Mask 255.255.255.0). Pings to either
> IP on mainframe fail however when I ping the workstation:
> C:\Documents and Settings\dlh004> ping 172.16.1.250
> 
> Pinging 172.16.1.250 with 32 bytes of data:
> 
> Reply from 172.16.1.250: bytes=32 time<1ms TTL=128
> Reply from 172.16.1.250: bytes=32 time<1ms TTL=128
> Reply from 172.16.1.250: bytes=32 time<1ms TTL=128
> Reply from 172.16.1.250: bytes=32 time<1ms TTL=128
> 
> Ping statistics for 172.16.1.250:
> 	Packets: Sent = 4, Received = 4, Lost = 0 (0% loss)
> Approximate round trip times in milli-seconds:
> 	Minimum = 0ms, Maximum = 0ms, Average= 0ms

Just to be anal retentive, when you used the laptop, did you pull
the wires from one of the mainframe NICs?  In theory you could be
dealing with bad ports on the switch, or bad cabling.  Grabbing
another cable, or using another port on the switch, wouldn't test
that.  I'd try one of the cables from mainframe on the laptop,
leaving it on the same port on the switch.  Then I'd try using the
IP from the mainframe on the Windows machine.

Hmmm, could it be the "SELinux" stuff?  I don't know if that has
anything to do with networking (my co-worker set up all of our
CentOS4/WBEL4 machines).  I know it's broken some things on some
machines we've had around.  Anything interesting in dmesg?  Does the
dmesg output change when you run ping?
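
To answer the dmesg question without eyeballing it, something like
this should do.  It's only a sketch: new_kernel_lines is a name I
invented, and all it does is diff two saved copies of the dmesg
output.  getenforce is the stock SELinux tool for printing the
current mode, assuming it's installed on your box.

```shell
# Print the lines that appear in the second file but not in the first.
# Usage: new_kernel_lines BEFORE AFTER
new_kernel_lines() {
    # diff exits non-zero when the files differ; that's expected here
    { diff "$1" "$2" || true; } | sed -n 's/^> //p'
}

# Intended use (as root):
#   getenforce                     # Enforcing, Permissive or Disabled
#   dmesg > /tmp/dmesg.before
#   ping -c 1 172.16.1.250
#   dmesg > /tmp/dmesg.after
#   new_kernel_lines /tmp/dmesg.before /tmp/dmesg.after
```

No output means the kernel logged nothing new while the ping ran.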

I'm pretty much out of suggestions after that... Good luck.

    Thanks,
        Kirby

