Overview of Class Six

by Torleif Mohling
CU Boulder, dept of Computer Science
(04/17/2000)

1 Setting up a Client Host's Nameservice

The last item on the agenda in terms of setting up the network personality of a simple host is to configure your nameservice files. And, finally, one last look at the afterboot manpage:

    BIND Name Server (DNS)
      If you are using the BIND Name Server, check the /etc/resolv.conf file.
      It may look something like:

          domain nts.umn.edu
          nameserver 128.101.101.101
          nameserver 134.84.84.84
          search nts.umn.edu. umn.edu.
          lookup file bind

If using a caching name server add the line "nameserver 127.0.0.1" first. For a local caching name server to run you will need to set "named_flags" in /etc/rc.conf and create the named.boot file in the appropriate place for named(8). The same holds true if the machine is going to be a name server for your domain. In both these cases, make sure that named(8) is running (otherwise there are long waits for resolver timeouts).

As you know, nameservice is the system by which a client host can query a name server to translate symbolic hostnames (e.g. bfs.cs.colorado.edu) into IP addresses (e.g. 128.138.202.9). We will talk about the Domain Name System (DNS) in more detail in a future class, as well as the basic configuration concepts necessary for a DNS server. Setting up a client host to use DNS is much easier. Software applications make use of DNS through a special C library called the resolver library. The process of converting a hostname into an IP address is known as address resolution.

To configure a client host's nameservice you need to edit the /etc/resolv.conf file. Here is the file from saclass:

    saclass % cat /etc/resolv.conf
    search cs.colorado.edu
    nameserver 128.138.202.19
    lookup file bind
The first line of this file gives a list of domains to be searched to find a given host. The list may contain several space-separated domain names (the resolver imposes a small upper limit on the length of the list). For example, if your home machine was in the domain of your ISP, say indra.net, and you also wanted to reference cs.colorado.edu hosts by only their unqualified hostname (e.g. bfs), then you could use a search line as follows:

    search indra.net cs.colorado.edu
The second line specifies the IP address of a nameserver that will be contacted for nameservice queries. There may be up to three of these lines (and there should be, for robustness). The nameservers are queried in order, with subsequent servers only being used if previous ones do not return answers (i.e. the nameserver machine is down or broken).

The final line is an option that can be specified in the OpenBSD resolv.conf file only. This line specifies that the resolver library should first check the local /etc/hosts file to find IP addresses and only query a nameserver if the hosts file does not contain a match (i.e. the specified hostname is not in the hosts file). You can control this behavior in much the same fashion on a RedHat box (or another SysV-like OS) by editing the /etc/nsswitch.conf file; the relevant line in this file might look like this:

    hosts:    files dns
Again, this tells the resolver library to first check the /etc/hosts file and then contact a nameserver.
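
To make the search list and lookup order a little more concrete, here is a small Python sketch. It is only an illustration, not the resolver library's actual code: the function names parse_resolv_conf and candidate_names are made up for this example, and the expansion rules are a simplification of what a real resolver does.

```python
def parse_resolv_conf(text):
    """Parse resolv.conf-style text into a dict of its main settings."""
    conf = {"search": [], "nameservers": [], "lookup": []}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        keyword, *args = line.split()
        if keyword == "search":
            conf["search"] = args
        elif keyword == "nameserver" and args:
            conf["nameservers"].append(args[0])
        elif keyword == "lookup":
            conf["lookup"] = args             # OpenBSD-only keyword
    return conf


def candidate_names(hostname, search):
    """Return the names a resolver might try, in order (simplified).

    Assumptions: a trailing dot means "already fully qualified"; a name
    with no dots is tried under each search domain before being tried
    as-is; a name that already contains a dot is tried as-is first.
    """
    if hostname.endswith("."):
        return [hostname.rstrip(".")]
    expanded = [f"{hostname}.{domain.rstrip('.')}" for domain in search]
    if "." in hostname:
        return [hostname] + expanded
    return expanded + [hostname]


sample = """\
search indra.net cs.colorado.edu
nameserver 128.138.202.19
lookup file bind
"""
conf = parse_resolv_conf(sample)
print(conf["lookup"])                        # sources in the order they are consulted
print(candidate_names("bfs", conf["search"]))
```

With the search line from the example above, the unqualified name bfs is expanded under indra.net first and cs.colorado.edu second, which is why the order of domains on the search line matters.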

Both the OpenBSD and RedHat resolver libraries may be given a number of different options to control such things as the timeout length between unanswered queries and the number of attempts tried before giving up on a given server. Read the resolv.conf(5) manpage on each system for details.

2 The tcpdump Command

The tcpdump command is arguably the most important tool in the un*x network administrator's arsenal. It is used to capture and print out the header information for packets on a TCP/IP network. The command requires root privileges in order to place the network interface of the host on which tcpdump is run into promiscuous mode, so that all packets seen by the host's interface will be captured and sent on to tcpdump for processing. Normally a network interface uses very fast hardware-level logic to compare a packet's destination link-layer (i.e. ethernet) address with its own, enabling it to swiftly reject packets or pass them on up to the kernel's link-layer device code for processing.

By default, tcpdump will pick the first active network interface it finds to listen on for packets (on a simple host that has only one network interface, tcpdump should always find it). To listen on a specific device use the -i <device> argument. For example:

    coatlicue % sudo tcpdump -i le1
Also by default, tcpdump will output header information about every packet that is captured. On a busy subnet or a busy server host, this traffic can be huge and it will scroll by incredibly quickly. Fortunately, tcpdump offers a fairly simple but extensive set of keywords and logic operators so that you can build arbitrarily complex filters to output only the packets you want to see. For example, to only see ICMP packets:

    coatlicue% sudo tcpdump -i le1 icmp
    Password:
    tcpdump: listening on le1
    17:13:48.748089 xibalba > coatlicue: icmp: echo request
    17:13:48.748757 coatlicue > xibalba: icmp: echo reply
    17:13:49.742626 xibalba > coatlicue: icmp: echo request
    17:13:49.743304 coatlicue > xibalba: icmp: echo reply
To avoid nameservice queries, one specifies the -n argument:

    coatlicue% sudo tcpdump -i le1 -n icmp
    tcpdump: listening on le1
    17:15:35.558706 10.0.0.2 > 10.0.0.1: icmp: echo request
    17:15:35.559403 10.0.0.1 > 10.0.0.2: icmp: echo reply
    17:15:36.553549 10.0.0.2 > 10.0.0.1: icmp: echo request
    17:15:36.554224 10.0.0.1 > 10.0.0.2: icmp: echo reply
You can specify logical conjunctions or disjunctions as well, along with tcp or udp port number modifiers. Look what happens to the output if we restrict on source IP address as well as on ICMP traffic only:

    coatlicue% sudo tcpdump -i le1 -n icmp and ip src 10.0.0.1
    tcpdump: listening on le1
    17:20:45.275391 10.0.0.1 > 10.0.0.2: icmp: echo reply
    17:20:46.267767 10.0.0.1 > 10.0.0.2: icmp: echo reply
    17:20:47.267707 10.0.0.1 > 10.0.0.2: icmp: echo reply
    17:20:48.267881 10.0.0.1 > 10.0.0.2: icmp: echo reply
Now we only see the ICMP echo reply packets, because the echo request packets have a different source IP address.
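
If the filter logic is not obvious, it may help to think of a filter as a boolean test applied to every captured packet. The following Python sketch mimics the filter "icmp and ip src 10.0.0.1" on a few hand-written packet records. The dictionaries and the function name are invented for illustration; tcpdump of course works on real packet headers, not on Python dicts.

```python
def icmp_and_src(packet, src_ip):
    """Mimic the tcpdump filter 'icmp and ip src <src_ip>' on a packet dict."""
    return packet["proto"] == "icmp" and packet["src"] == src_ip


# Hypothetical packets modelled on the ping traffic shown above.
packets = [
    {"proto": "icmp", "src": "10.0.0.2", "dst": "10.0.0.1", "info": "echo request"},
    {"proto": "icmp", "src": "10.0.0.1", "dst": "10.0.0.2", "info": "echo reply"},
    {"proto": "tcp",  "src": "10.0.0.1", "dst": "10.0.0.2", "info": "port 22"},
]

matched = [p for p in packets if icmp_and_src(p, "10.0.0.1")]
for p in matched:
    print(p["src"], ">", p["dst"], p["info"])
```

The echo request fails the source-address test and the TCP packet fails the protocol test, so only the echo reply survives both halves of the "and".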

You can similarly restrict on a tcp or udp port number. For example, to see just SSH packets without any other restrictions, you could run:

    coatlicue% sudo tcpdump -i le0 tcp port 22
To further restrict the output to SSH packets to or from a specific host, you could run:

    coatlicue% sudo tcpdump -i le0 tcp port 22 and ip src or dst 10.0.0.1
     

2.1 A Closer Look at TCP

After IP, the Transmission Control Protocol (TCP) is one of the most important protocols: it conveys the vast majority of traffic on the internet. How does TCP offer reliable service when its underlying carrier, IP, is not reliable? What are port numbers? How is it that a TCP session is considered to be connected or stateful?

2.2 Port Numbers - TCP and UDP both

This section applies equally to TCP and UDP.

You can think of port numbers as if they were post-office boxes. To reach a given service, one sends mail addressed to a specific post-office-box number located in a certain town (i.e. IP address).

Server daemon processes are configured to listen on a specific port number (and to use either TCP or UDP). To contact a given server, one needs to transmit a packet (again, either TCP or UDP) with the correct destination port number set in the protocol's header; i.e. the port number on which the desired daemon process was configured to listen.

When you, for example, initiate an SSH session (which uses TCP by default) your SSH client program is first issued an arbitrary temporary TCP port number to use as its source port number. Then the SSH client sends a TCP packet to the destination host to start the session. You can use the -v flag to the ssh command to get verbose output that is useful for debugging a failing session. Take a look more closely at the output and notice the messages about (TCP) port numbers:

    xibalba % ssh -v tlaloc.cs.colorado.edu
    SSH Version 1.2.26-CSOps-vsnprintf-patched [i586-unknown-linux], protocol version 1.5.
    Standard version.  Does not use RSAREF.
    xibalba: Reading configuration data /etc/ssh/ssh_config
    xibalba: ssh_connect: getuid 500 geteuid 0 anon 0
    xibalba: Connecting to tlaloc.cs.colorado.edu [128.138.243.136] port 22.
    xibalba: Allocated local port 1021.
    xibalba: Connection established.
    ...
The last three lines of output shown above are the most important for this discussion. The first of these tells you the destination IP address and (TCP) port number:

    xibalba: Connecting to tlaloc.cs.colorado.edu [128.138.243.136] port 22.
The next line following that tells you your (arbitrary) source (TCP) port number:

    xibalba: Allocated local port 1021.
And then the last line shown above indicates that the TCP session has been established. We'll look at what that means more closely in the next section.

Port numbers, as you can see from the TCP and UDP header descriptions, are 16-bit quantities, meaning that port numbers can range from 0 to 65535. Port numbers below 1024 are special: they can only be bound by processes owned by root; that's one reason the arbitrary port numbers used by many different TCP-based client programs are usually greater than 1024.
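
You can watch the kernel hand out an arbitrary source port without SSH at all. The following Python sketch (an illustration using the standard socket module, with a made-up function name) opens a TCP connection over the loopback interface and reports the port numbers on each end. Binding to port 0 is a common way to ask the kernel for any free port.

```python
import socket


def loopback_ports():
    """Open a loopback TCP connection.

    Returns (server_port, client_port, port_seen_by_server).
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # port 0: let the kernel pick a free port
    server.listen(1)
    server_port = server.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", server_port))
    client_port = client.getsockname()[1]  # the kernel-assigned source port

    conn, peer = server.accept()           # peer is (client IP, client port)
    conn.close()
    client.close()
    server.close()
    return server_port, client_port, peer[1]


if __name__ == "__main__":
    print(loopback_ports())
```

The client never asks for a particular source port; it simply receives one when it connects, and the server sees that same number as the source port of the incoming connection.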

The port numbers on which many well-known services listen are listed in the file /etc/services; I will often use grep to reference services in this file:

    tlaloc % grep ftp /etc/services
    ftp-data        20/tcp
    ftp             21/tcp
    tftp            69/udp
    sftp            115/tcp
    bftp        152/tcp                 # Background File Transfer Protocol
    ftp-ftam    8868/tcp                 # FTP->FTAM Gateway
The ports for ftp-data and ftp (20 and 21 respectively) are both used during a typical FTP session. The other ports listed are for various FTP-like services; you would have to read manpages to figure those out.
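
The file format is simple enough to parse by hand. As a rough sketch (the function name parse_services is invented here, and only the name and port/protocol columns are handled, not aliases), this Python snippet turns the grep output above into a lookup table:

```python
def parse_services(text):
    """Map (service name, protocol) -> port number from services-file text."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # strip trailing comments
        if not line:
            continue
        fields = line.split()
        name, port_proto = fields[0], fields[1]
        port, proto = port_proto.split("/")
        table[(name, proto)] = int(port)
    return table


sample = """\
ftp-data        20/tcp
ftp             21/tcp
tftp            69/udp
sftp            115/tcp
bftp        152/tcp                 # Background File Transfer Protocol
ftp-ftam    8868/tcp                 # FTP->FTAM Gateway
"""
services = parse_services(sample)
print(services[("ftp-data", "tcp")], services[("ftp", "tcp")])
```

In real programs you would normally not parse the file yourself; Python's socket.getservbyname() and the C library function of the same name consult the system's services database for you.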

Note that the /etc/services file simply lists common services and their associated TCP or UDP port numbers. This file is not related in any way to the services that are actually available on a given host. To know what services are available on a given host you need to look at the system's startup files as well as the file inetd.conf. (I'll talk about the inetd daemon next week.) You could also run a portscanner program to figure out which port numbers on a target host have daemons listening on them.

2.3 The TCP session

First let's look more closely at the packets going back and forth for an existing SSH session using tcpdump. This is for a session running between my home PC xibalba (running RedHat) and tlaloc, a departmental admin host (used for backups and running Solaris). By the way, the -t flag to tcpdump tells it not to print timestamp information at the head of each output line.

    coatlicue% sudo tcpdump -i le1 -t ip src or dst 10.0.0.2 and tcp port 22 and not ip src or dst 10.0.0.1
    tcpdump: listening on le1
    xibalba.1016 > tlaloc.cs.colorado.edu.ssh: P 3662642749:3662642769(20) ack 3232075105 win 32120 <nop,nop,timestamp 78368385 259077902> (DF) [tos 0x10]
    tlaloc.cs.colorado.edu.ssh > xibalba.1016: P 1:21(20) ack 20 win 10136 <nop,nop,timestamp 259100459 78368385> (DF) [tos 0x10]
Here we can see more clearly how, in the first line, a TCP packet travelling from the SSH client (on xibalba) to the SSH server (on tlaloc) has an arbitrary TCP source port while the destination port is the well-known port 22 for SSH:

    .. xibalba.1016 > tlaloc.cs.colorado.edu.ssh ..
And, conversely, how packets sent from the server back to the client have a source port of 22 and the destination port is the "arbitrary" port:

    .. tlaloc.cs.colorado.edu.ssh > xibalba.1016 ..
It is important to note that, once a TCP session has been established, the so-called "arbitrary" port is definitely no longer arbitrary. As we'll see below, it is an important part of the state that makes up a TCP connection. It is only arbitrary when it is first given to the client process. You might also notice that the client-side port number in this case is below 1024. This is possible because the SSH client program has the setuid permission bit set:

    xibalba % ls -l /usr/local/ssh/bin/ssh1
    -rws--x--x   1 root     root       682884 Nov  2  1998 /usr/local/ssh/bin/ssh1*
      ^^^
.. meaning that the SSH client process is also owned by root.

     
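There are a number of important items of information that each side (i.e. client and server) of a TCP connection keeps track of. This is part of what is known as the state of a TCP session. At a minimum these are the source IP address, the source port number, the destination IP address and the destination port number.

These four bits of information define the basic framework of TCP session state. Other things that are kept track of include the sequence and acknowledgement numbers and the window size, all of which appear in the tcpdump output in this section.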
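
One way to picture this state is as a table keyed by source address, source port, destination address and destination port. The Python sketch below is only a mental model (the ConnKey type and the table contents are invented for illustration; real kernels keep far more per-connection data), using the addresses and ports that appear in this handout:

```python
from typing import NamedTuple


class ConnKey(NamedTuple):
    """The four values that identify one TCP connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

    def reply_key(self):
        """The same connection as seen in packets going the other way."""
        return ConnKey(self.dst_ip, self.dst_port, self.src_ip, self.src_port)


# Two SSH sessions from xibalba to tlaloc: same hosts, same server port,
# but different client ports, so they are two distinct connections.
connections = {
    ConnKey("10.0.0.2", 1016, "128.138.243.136", 22): {"state": "ESTABLISHED"},
    ConnKey("10.0.0.2", 1021, "128.138.243.136", 22): {"state": "ESTABLISHED"},
}

first = ConnKey("10.0.0.2", 1016, "128.138.243.136", 22)
print(len(connections), first.reply_key())
```

This is why the client's port stops being arbitrary once the session exists: change it and you are talking about a different connection.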

Now let's look at the initial TCP packets that get transmitted to set up the TCP-based SSH connection. The client initiates the connection by sending a special packet, with the SYN flag turned on in the TCP header, to the destination server and port number. The server responds in turn, except that its packet has both the SYN flag turned on to complete the session negotiation and the ACK flag turned on to acknowledge the client's packet (this is using the same tcpdump command as above):

    xibalba.1021 > tlaloc.cs.colorado.edu.ssh: S 2201118588:2201118588(0) win 32120 <mss 1460,sackOK,timestamp 79021300[|tcp]> (DF)
    tlaloc.cs.colorado.edu.ssh > xibalba.1021: S 4079345241:4079345241(0) ack 2201118589 win 10136 <nop,nop,timestamp 259753321 79021300,nop,[|tcp]> (DF)
The first packet is the session negotiating packet from the client. Note the S flag that follows the destination host.port:

      xibalba.1021 > tlaloc.cs.colorado.edu.ssh: S 
                                                ^^^
The S indicates that the TCP packet has the SYN flag set. Following the SYN flag indicator is the packet's TCP sequence number. In the second (return) packet from the server, the sequence number is followed by the ack keyword, indicating that the ACK flag was set, and then by the acknowledgement number: the client's initial sequence number plus one (2201118589 here), i.e. the next sequence number the server expects to receive.
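
The relationship between the client's sequence number and the server's ack value can be checked with a line of arithmetic. Here is a minimal Python sketch (the function name is made up for this example) that relies only on the fact that TCP sequence numbers are 32-bit values that wrap around:

```python
SEQ_MODULUS = 2 ** 32   # TCP sequence numbers are 32 bits wide


def expected_ack(syn_seq):
    """Acknowledgement number that answers a SYN carrying sequence number syn_seq."""
    return (syn_seq + 1) % SEQ_MODULUS


client_seq = 2201118588   # from the client's SYN packet above
server_ack = 2201118589   # from the server's SYN+ACK packet above
print(expected_ack(client_seq) == server_ack)
```

The SYN consumes one sequence number even though it carries no data, which is where the "plus one" comes from.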

     
Here is the general format of output from tcpdump for actual TCP packets:

    timestamp src-IP.port > dest-IP.port flags seq-num ack window urgent options
For a more detailed description, see the tcpdump(1) manpage under "OUTPUT FORMAT".
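
If you ever need to post-process this output, a regular expression goes a long way. The Python sketch below pulls the main fields out of one TCP line in the format shown above. The pattern and the function name parse_tcp_line are my own guesses fitted to the sample output in this handout, not a complete grammar for tcpdump's output, and other tcpdump versions may print things differently.

```python
import re

LINE_RE = re.compile(
    r"^(?:(?P<time>\d\d:\d\d:\d\d\.\d+) )?"                     # optional timestamp
    r"(?P<src>\S+)\.(?P<sport>[\w-]+) > (?P<dst>\S+)\.(?P<dport>[\w-]+): "
    r"(?P<flags>\S+) (?P<rest>.*)$"
)


def parse_tcp_line(line):
    """Split one tcpdump TCP output line into a dict, or return None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fields = m.groupdict()
    rest = fields.pop("rest")
    seq = re.search(r"(\d+):(\d+)\((\d+)\)", rest)   # first:last(nbytes)
    ack = re.search(r"\back (\d+)", rest)
    win = re.search(r"\bwin (\d+)", rest)
    fields["seq"] = (int(seq.group(1)), int(seq.group(2))) if seq else None
    fields["length"] = int(seq.group(3)) if seq else None
    fields["ack"] = int(ack.group(1)) if ack else None
    fields["win"] = int(win.group(1)) if win else None
    return fields


sample = ("xibalba.1016 > tlaloc.cs.colorado.edu.ssh: P 3662642749:3662642769(20) "
          "ack 3232075105 win 32120 <nop,nop,timestamp 78368385 259077902> (DF) [tos 0x10]")
parsed = parse_tcp_line(sample)
print(parsed["src"], parsed["sport"], parsed["dst"], parsed["dport"], parsed["flags"])
```

Note that the port is whatever follows the last dot of each endpoint, which is how the hostname tlaloc.cs.colorado.edu and the service name ssh are separated.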

     
Finally, when a TCP session is terminated, each side exchanges packets which have the FIN flag turned on.

To get a greater understanding of TCP, I refer the interested reader to W. Richard Stevens, "TCP/IP Illustrated, Volume 1: The Protocols", or one of several other texts available on the subject.

2.4 Generating Test Traffic

3 Private Networks - revisited

Setting up NAT on an OpenBSD box is easy: just edit the file /etc/ipnat.rules. You must also enable ipnat and ipfilter in the /etc/rc.conf file; make sure that the following lines exist in that file:

    ipfilter=YES
    ipnat=YES               # for "YES" ipfilter must also be "YES"
    ipfilter_rules=/etc/ipf.rules   # Rules for IP packet filtering
    ipnat_rules=/etc/ipnat.rules    # Rules for Network Address Translation

The /etc/ipnat.rules file for coatlicue (my home network gateway) looks like this:

    coatlicue% cat /etc/ipnat.rules
    # $Id: class08.txt,v 1.2 2000/04/17 22:57:24 tor Exp tor $
    #
    # See /usr/share/ipf/nat.1 for examples.
    # edit the nat= line in /etc/rc.conf to enable Network Address Translation

    # map internal 10-net addresses onto le0's address
    map le0 10.0.0.0/24 -> le0/32 portmap tcp/udp 10000:20000

    # for icmp inside -> outside
    map le0 10.0.0.0/24 -> le0/32

    # http -> xibalba
    rdr le0 0.0.0.0/0 port http -> 10.0.0.2 port http

    # ssh -> xibalba
    rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port ssh

The first line says to map 10.0.0.0/24 subnet addresses for traffic out the le0 interface. The mapping mechanism makes use of the specified range of port numbers to perform the translations. (more on how it works below.) This mechanism only works for tcp and udp based connections.

The next (second) line allows other kinds of traffic (like icmp) to be mapped.

The last two lines are called redirects and they are used for mapping specific incoming traffic from the outside to be redirected to an internal IP address. The first of these simply redirects the http protocol (TCP port 80) while the second redirects tcp port 114 to an internal address, but changes the port number to be that of SSH (port 22).
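
To see why a range of port numbers is enough to share one external address, here is a toy model of the portmap mechanism in Python. This is a sketch under stated assumptions, not how ipnat is implemented: the class name is invented, and ports are handed out sequentially from the 10000:20000 range, whereas the real NAT code chooses ports in its own way.

```python
class PortMapNat:
    """Toy model of 'map ... portmap tcp/udp 10000:20000'."""

    def __init__(self, external_ip, first_port=10000, last_port=20000):
        self.external_ip = external_ip
        self.next_port = first_port
        self.last_port = last_port
        self.out = {}    # (int_ip, int_port, dst_ip, dst_port) -> external port
        self.back = {}   # (external port, dst_ip, dst_port) -> (int_ip, int_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of a packet leaving the internal network."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.out:
            if self.next_port > self.last_port:
                raise RuntimeError("port range exhausted")
            self.out[key] = self.next_port
            self.back[(self.next_port, dst_ip, dst_port)] = (src_ip, src_port)
            self.next_port += 1
        return (self.external_ip, self.out[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_port):
        """Find the internal host for a reply arriving on external port dst_port."""
        return self.back.get((dst_port, src_ip, src_port))


nat = PortMapNat("198.11.19.5")
packet = nat.outbound("10.0.0.2", 1027, "128.138.129.76", 53)
print(packet)
print(nat.inbound("128.138.129.76", 53, packet[1]))
```

The external port number is the only thing that lets the gateway tell apart replies for different internal hosts, which is also why this mechanism cannot help protocols that have no port numbers.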

There are excellent template files to work with in /usr/share/ipf. Most of the files in this directory have to do with ipf, which is the IP firewall mechanism available with OpenBSD; there are some ipnat examples there as well, though. I simply copied the template files and changed some of the numbers.

The command ipnat is used to manipulate the rules used by NAT. For example, to delete (clear) the current rules and load new ones, you could do the following:

    coatlicue % sudo ipnat -C
    coatlicue % sudo ipnat -f /etc/ipnat.rules
     
You can also look at the current mappings:

    coatlicue% sudo ipnat -l
    List of active MAP/Redirect filters:
    rdr le0 0.0.0.0/0 port 80 -> 10.0.0.2 port 80 tcp
    rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port 22 tcp
    map le0 10.0.0.0/24  -> 198.11.19.5/32  portmap tcp/udp 10000:20000
    map le0 10.0.0.0/24  -> 198.11.19.5/32 
     
Following the list of rules output by ipnat -l you also get a list of active mappings:

    List of active sessions:
    MAP 10.0.0.2        0     <- -> 198.11.19.5     0     [128.138.238.18 0]
    MAP 10.0.0.2        0     <- -> 198.11.19.5     0     [128.138.240.1 0]
    MAP 10.0.0.2        1027  <- -> 198.11.19.5     13955 [128.138.129.76 53]
    MAP 10.0.0.2        2022  <- -> 198.11.19.5     13946 [128.138.129.25 80]
    MAP 10.0.0.2        1022  <- -> 198.11.19.5     11319 [128.138.243.135 22]
    MAP 10.0.0.2        1023  <- -> 198.11.19.5     11318 [128.138.242.212 22]
    MAP 10.0.0.10       2186  <- -> 198.11.19.5     13542 [204.71.200.67 80]
    MAP 10.0.0.10       2155  <- -> 198.11.19.5     13511 [151.193.165.62 80]
    MAP 10.0.0.10       2150  <- -> 198.11.19.5     13506 [151.193.165.62 80]
    MAP 10.0.0.10       1600  <- -> 198.11.19.5     13332 [199.45.146.2 80]
    MAP 10.0.0.10       1599  <- -> 198.11.19.5     13331 [199.45.146.2 80]
    MAP 10.0.0.10       1598  <- -> 198.11.19.5     13330 [199.45.146.2 80]
    MAP 10.0.0.10       1597  <- -> 198.11.19.5     13329 [199.45.146.2 80]
    MAP 10.0.0.10       1595  <- -> 198.11.19.5     13327 [199.45.146.2 80]
    MAP 10.0.0.10       2049  <- -> 198.11.19.5     13252 [192.156.134.160 21]
    MAP 10.0.0.10       2054  <- -> 198.11.19.5     13142 [216.15.42.166 8265]
    MAP 10.0.0.10       1301  <- -> 198.11.19.5     13094 [144.198.225.50 80]
    MAP 10.0.0.10       1298  <- -> 198.11.19.5     13091 [144.198.225.50 80]
    MAP 10.0.0.10       2048  <- -> 198.11.19.5     13014 [204.132.155.200 22]
    MAP 10.0.0.10       2122  <- -> 198.11.19.5     12499 [216.65.3.233 80]
    MAP 10.0.0.36       1086  <- -> 198.11.19.5     13269 [216.15.42.166 80]
    MAP 10.0.0.36       1084  <- -> 198.11.19.5     13267 [216.15.42.166 80]
The format of a mapping is:

    MAP <real-src-IP> <real-src-port> <- -> <nat-src-IP> <nat-src-port>
                                                   [ <dest-IP> <dest-port> ]

The first two mappings in the above table don't have port numbers (which you can tell by the fact that they are listed as 0). These are mappings associated with icmp. The remaining entries are an assortment of connections, mostly TCP, ranging from SSH and HTTP to FTP.

     
Now let's look at some tcpdump output to illustrate ipnat in action! These examples will all be originating from my internal host xibalba which has an IP address of 10.0.0.2.

First, an example that uses portnumbers. I'll use nslookup to demonstrate. Here is the query:

    xibalba % nslookup freshmeat.net
    Server:  otis.Colorado.EDU
    Address:  128.138.129.76

    Non-authoritative answer:
    Name:    freshmeat.net
    Addresses:  209.207.224.211, 209.207.224.212

And here is tcpdump output from each interface on coatlicue:

    coatlicue% sudo tcpdump -n -i le1 port domain
    tcpdump: listening on le1
    10:07:09.350885 10.0.0.2.1027 > 128.138.129.76.53: 58849+ (45)
    10:07:09.382786 128.138.129.76.53 > 10.0.0.2.1027: 58849* 1/4/5 (273) (DF)
    10:07:09.385611 10.0.0.2.1027 > 128.138.129.76.53: 58850+ (31)
    10:07:09.413888 128.138.129.76.53 > 10.0.0.2.1027: 58850 2/4/4 (210) (DF)

    coatlicue% sudo tcpdump -n -i le0 port domain
    tcpdump: listening on le0
    10:07:09.351475 198.11.19.5.13959 > 128.138.129.76.53: 58849+ (45)
    10:07:09.382185 128.138.129.76.53 > 198.11.19.5.13959: 58849* 1/4/5 (273) (DF)
    10:07:09.386189 198.11.19.5.13959 > 128.138.129.76.53: 58850+ (31)
    10:07:09.413280 128.138.129.76.53 > 198.11.19.5.13959: 58850 2/4/4 (210) (DF)

And now an example using ping. With this example, we'll see a problem that you will encounter using NAT. First, here is the ping command itself:

    xibalba % ping boulder.colorado.edu
    PING boulder.colorado.edu (128.138.240.1) from 10.0.0.2 : 56(84) bytes of data.
    64 bytes from 128.138.240.1: icmp_seq=0 ttl=250 time=23.3 ms
    64 bytes from 128.138.240.1: icmp_seq=1 ttl=250 time=34.0 ms
    64 bytes from 128.138.240.1: icmp_seq=2 ttl=250 time=23.8 ms

And then look at the tcpdump output (again, listening on each interface of the gateway box):

    coatlicue% sudo tcpdump -i le1 -n icmp
    tcpdump: listening on le1
    09:29:40.032990 10.0.0.2 > 128.138.238.18: icmp: echo request
    09:29:40.055660 128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
    09:29:41.024305 10.0.0.2 > 128.138.238.18: icmp: echo request
    09:29:41.047365 128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
    09:29:42.024220 10.0.0.2 > 128.138.238.18: icmp: echo request
    09:29:42.048341 128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
    coatlicue% sudo tcpdump -i le0 -n icmp
    tcpdump: listening on le0
    198.11.19.5 > 128.138.238.18: icmp: echo request
    128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
    198.11.19.5 > 128.138.238.18: icmp: echo request
    128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
    198.11.19.5 > 128.138.238.18: icmp: echo request
    128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)

Now let's look at the problem that occurs when we immediately try to run the same ping command, except this time from the gateway itself:

    coatlicue% ping boulder.colorado.edu
    PING boulder.colorado.edu (128.138.240.1): 56 data bytes
    --- boulder.colorado.edu ping statistics ---
    4 packets transmitted, 0 packets received, 100% packet loss

And the tcpdump output:

On the external interface:

    198.11.19.5 > 128.138.238.18: icmp: echo request
    128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
    198.11.19.5 > 128.138.238.18: icmp: echo request
    128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
    198.11.19.5 > 128.138.238.18: icmp: echo request
    128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)

And on the internal interface:
     
    128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
    128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
    128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)

Can you guess what is happening? Because ICMP has no port numbers to remap, NAT cannot tell similar connections apart, so the mapping that was previously made to allow the ICMP traffic to go between xibalba and boulder is still sending that traffic to xibalba! Doh! For small networks, this should not be a problem, but on a larger scale, you can imagine it might be: if two different folks on two different internal hosts try to ping the same external host, one of them will have the problem.
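
The failure is easy to reproduce in a toy model. In the Python sketch below (an invented class that only imitates the behaviour seen above, not the actual ipnat code), ICMP mappings are keyed by the remote address alone, because there is no port number to add to the key:

```python
class NaiveIcmpNat:
    """Toy ICMP mapping table keyed only by the remote host's address."""

    def __init__(self, gateway_ip):
        self.gateway_ip = gateway_ip
        self.sessions = {}   # remote IP -> internal IP

    def outbound(self, internal_ip, remote_ip):
        """Record that an internal host sent ICMP traffic to remote_ip."""
        self.sessions.setdefault(remote_ip, internal_ip)

    def deliver_reply(self, remote_ip):
        """Decide which host receives an ICMP reply coming from remote_ip."""
        return self.sessions.get(remote_ip, self.gateway_ip)


nat = NaiveIcmpNat("198.11.19.5")
nat.outbound("10.0.0.2", "128.138.238.18")   # xibalba pings the remote host

# Later the gateway itself pings the same host; the stale mapping still matches.
print(nat.deliver_reply("128.138.238.18"))
```

The reply that was meant for the gateway is handed to 10.0.0.2, which matches the echo replies seen on the internal interface above.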

The above problem occurs on my crufty old OpenBSD 2.5 gateway (I really need to upgrade :), which runs an equally outdated version of NAT. The problem does not occur under OpenBSD 2.6!

     
Here is another interesting problem. This one occurred when my gateway box received a DHCP address on boot, instead of having a static IP address like it does now.

Here was my primary symptom:

    on my pc running debian linux (10.0.0.2) I can no longer
    get to the outside world unless I ssh first to my gateway
    at (10.0.0.1) running sparc/openbsd

my gateway is forwarding the (icmp) packets just fine:

    coatlicue% sudo tcpdump -i le1 -env icmp      # 10.0.0.1 interface
    tcpdump: listening on le1
    0:a0:24:15:57:b7 8:0:20:1f:46:d9 0800 98: 10.0.0.2 > 128.138.238.18: icmp: echo request (ttl 64, id 39601)

    coatlicue% sudo tcpdump -i le0 -env icmp      # 198.11.19.11 interface
    tcpdump: listening on le0
    8:0:20:1f:46:d9 0:0:c:2b:e0:64 0800 98: 198.11.19.11 > 128.138.238.18: icmp: echo request (ttl 63, id 39601)

but no icmp pkts are being returned to xibalba. :(

however, from the gw it of course works:

    8:0:20:1f:46:d9 0:0:c:2b:e0:64 0800 98: 198.11.19.39 > 128.138.238.18: icmp: echo request (ttl 255, id 4777)
    0:0:c:2b:e0:64 8:0:20:1f:46:d9 0800 98: 128.138.238.18 > 198.11.19.39: icmp: echo reply (DF) (ttl 253, id 35817)

Woah, can you see the problem? In the first case - the routed pkt is sent with a src IP ending in .11 whereas in the latter case the src ip ends in .39

What happened?

The DSL connection dropped. I ifconfig'd the interface down, power-cycled the DSL modem and then ifconfig'd the interface back up. here's what the gw's routing table looks like:

    coatlicue% netstat -rn
    Routing tables

    Internet:
    Destination        Gateway            Flags     Refs     Use    Mtu  Interface
    default            198.11.19.1        UGS         3      229      -  le0
    10.0.0/24          link#2             UC          0        0      -  le1
    10.0.0.1           127.0.0.1          UGHS        0        0      -  lo0
    10.0.0.2           0:a0:24:15:57:b7   UHL         3    15224      -  le1
    10.0.0.32          link#2             UHL         1        1      -  le1
    10.0.0.34          link#2             UHL         1     6434      -  le1
    127/8              127.0.0.1          UGRS        0        0      -  lo0
    127.0.0.1          127.0.0.1          UH          4      355      -  lo0
    198.11.19.0/25     link#1             UC          0        0      -  le0
    198.11.19.1        0:0:c:2b:e0:64     UHL         1        0      -  le0
    198.11.19.39       127.0.0.1          UGHS        0        0      -  lo0
    224/4              127.0.0.1          URS         1       15      -  lo0

hmm. looks ok. I'm still confused maybe. now let's look at the ipnat table:

    coatlicue% sudo !!
    sudo ipnat -l
    List of active MAP/Redirect filters:
    map le0 10.0.0.0/24  -> 198.11.19.11/32  portmap tcp/udp 10000:20000
    rdr le0 128.125.138.150/32 port 20 -> 10.0.0.3 port 20 tcp
    map le0 10.0.0.0/24  -> 198.11.19.11/32 
    rdr le0 0.0.0.0/0 port 80 -> 10.0.0.2 port 80 tcp
    rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port 22 tcp

    List of active sessions:
    MAP 10.0.0.2        0     <- -> 198.11.19.11    0     [128.138.238.18 0]
    MAP 10.0.0.2        1019  <- -> 198.11.19.11    13109 [206.168.118.178 22]
    MAP 10.0.0.2        1020  <- -> 198.11.19.11    13108 [128.138.202.9 22]
    MAP 10.0.0.2        1021  <- -> 198.11.19.11    13107 [128.138.192.220 22]
    MAP 10.0.0.2        1023  <- -> 198.11.19.11    13106 [128.138.242.212 22]
    MAP 10.0.0.2        1418  <- -> 198.11.19.11    13096 [128.138.242.195 80]
    MAP 10.0.0.2        1412  <- -> 198.11.19.11    13089 [128.138.202.19 80]
    MAP 10.0.0.2        1411  <- -> 198.11.19.11    13088 [128.138.202.19 80]
    MAP 10.0.0.2        1409  <- -> 198.11.19.11    13038 [129.128.5.191 80]
    MAP 10.0.0.2        1404  <- -> 198.11.19.11    13031 [128.138.192.84 26846]
    MAP 10.0.0.2        1403  <- -> 198.11.19.11    13030 [128.138.192.84 19002]
    MAP 10.0.0.2        1400  <- -> 198.11.19.11    13023 [128.138.242.195 80]
    MAP 10.0.0.2        1399  <- -> 198.11.19.11    13022 [128.138.242.195 80]
    MAP 10.0.0.2        1397  <- -> 198.11.19.11    13020 [128.138.242.195 80]
    MAP 10.0.0.2        1346  <- -> 198.11.19.11    12947 [206.25.182.132 80]
    MAP 10.0.0.2        1332  <- -> 198.11.19.11    12933 [206.25.182.132 80]
    MAP 10.0.0.2        1022  <- -> 198.11.19.11    12753 [128.138.243.135 22]
    MAP 10.0.0.10       2049  <- -> 198.11.19.11    19753 [128.138.150.15 22]

AHA. here is the problem. I need to do the following:

    coatlicue% sudo ipnat -C
    7 entries flushed from NAT list
    coatlicue% sudo ipnat -l
    List of active MAP/Redirect filters:

    List of active sessions:
    coatlicue% sudo ipnat -f /etc/ipnat.rules
    coatlicue% sudo ipnat -l
    List of active MAP/Redirect filters:
    map le0 10.0.0.0/24  -> 198.11.19.39/32  portmap tcp/udp 10000:20000
    map le0 10.0.0.0/24  -> 198.11.19.39/32
    rdr le0 0.0.0.0/0 port 80 -> 10.0.0.2 port 80 tcp
    rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port 22 tcp

    List of active sessions:
    coatlicue%

We needed to flush the old table and re-read the ipnat.rules file. Now ping should work again. It does. The problem was that the NAT mappings still assumed that the gateway's external address was 198.11.19.11 when it had changed to 198.11.19.39!
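
A check like this is easy to automate. The following Python sketch scans text in the format printed by ipnat -l for map rules whose target address differs from the interface's current address. The function name and the regular expression are my own, fitted to the listings in this handout; how you obtain the current address (for example from ifconfig output) is left out.

```python
import re

# Matches lines such as: map le0 10.0.0.0/24  -> 198.11.19.11/32  portmap ...
MAP_RE = re.compile(r"^map\s+\S+\s+\S+\s+->\s+(\d+\.\d+\.\d+\.\d+)/32")


def stale_map_rules(ipnat_listing, current_ip):
    """Return the map rules that still point at an address other than current_ip."""
    stale = []
    for line in ipnat_listing.splitlines():
        m = MAP_RE.match(line.strip())
        if m and m.group(1) != current_ip:
            stale.append(line.strip())
    return stale


listing = """\
List of active MAP/Redirect filters:
map le0 10.0.0.0/24  -> 198.11.19.11/32  portmap tcp/udp 10000:20000
rdr le0 128.125.138.150/32 port 20 -> 10.0.0.3 port 20 tcp
map le0 10.0.0.0/24  -> 198.11.19.11/32
rdr le0 0.0.0.0/0 port 80 -> 10.0.0.2 port 80 tcp
rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port 22 tcp
"""
for rule in stale_map_rules(listing, "198.11.19.39"):
    print("stale:", rule)
```

If anything is reported, the fix is the one shown above: flush with ipnat -C and reload the rules with ipnat -f.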

     
     
     
If you are setting up a home gateway, chances are that you will also need to set up an IP firewall. We looked briefly at such a firewall for a Cisco router (it was included with the handout for class seven). OpenBSD uses a facility known as IPfilter that is very similar in concept to the techniques used on the Cisco. Again, there are excellent and well-commented examples in the /usr/share/ipf directory on an OpenBSD machine. We will revisit ipf in class 10.

4 Networking Hardware

Hubs are simple network devices that provide basic ethernet connectivity. To understand them, consider the previous notion of several hosts attached to old thick-net style ethernet cable:

          <some chunk of ethernet with five hosts attached to it>
        ===+=============+==============+==============+==============+====
           |             |              |              |              |
           |             | <- drop ---> |              |              |
           |             |    cables    |              |              |
         +-+-+         +-+-+          +-+-+          +-+-+          +-+-+
         | A |         | B |          | C |          | D |          | E |
         +---+         +---+          +---+          +---+          +---+
Modern ethernet installations don't use the bulky co-axial thick-net cable. The function of that cable has been collapsed inside a hub to which drop cables are directly connected:

         +---------------+
         |  HUB          |
         +-+--+--+--+--+-+
           |  |  |  |  |
           |  |  |  |  +----------------------------------------------+
           |  |  |  +----------------------------------+              |
           |  |  +----------------------+              |              |
           |  +----------+              |              |              |
           |             |              |              |              |
           |             | <- drop ---> |              |              |
           |             |    cables    |              |              |
         +-+-+         +-+-+          +-+-+          +-+-+          +-+-+
         | A |         | B |          | C |          | D |          | E |
         +---+         +---+          +---+          +---+          +---+
In all other respects, a hub can be treated the same: all ethernet transmissions made by any one host plugged into the hub are heard by all other hosts plugged into the hub.

Hubs may be cascaded either using a special uplink port on one of the hubs or by using a special 10baseT drop cable called a crossover cable that has the data transmit and receive pins flipped (similar in concept to a null modem serial cable).

There are important length restrictions for various types of ethernet (i.e. for 10baseT over copper, versus 100baseT over copper, versus 100baseT over fiber optic cable). See 'http://www.uwsg.indiana.edu/usail/external/ethernet/ethernet-guide.html' for specific details.

     
Switches can perform the same function as hubs in an ethernet network. One key difference between them, however, is that switches learn the ethernet addresses of hosts that are directly plugged into them. This allows traffic to be partitioned at the link layer such that packets destined for a particular ethernet address are only forwarded out the port on the switch to which that address is attached. This generally prevents the use of programs like tcpdump to snoop on traffic not sourced from or destined for the machine running tcpdump.
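
The learning behaviour can be sketched in a few lines of Python. This is a simplified model with invented names (real switches also handle broadcast frames, age out old entries, and so on): the switch remembers which port each source address was seen on, and floods like a hub only when it does not yet know the destination.

```python
class LearningSwitch:
    """Toy model of a switch that learns which port each address lives on."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.mac_table = {}   # ethernet address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports a frame arriving on in_port is sent out of."""
        self.mac_table[src_mac] = in_port          # learn where the sender is
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every other port, just like a hub.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            return []                              # destination is on the same port
        return [out_port]


switch = LearningSwitch(5)                 # hosts A..E on ports 0..4
print(switch.receive(0, "A", "B"))         # B not yet learned: flooded
print(switch.receive(1, "B", "A"))         # A was learned from the first frame
print(switch.receive(0, "A", "B"))         # now B is known as well
```

Once both addresses have been learned, hosts C, D and E no longer see the traffic between A and B, which is exactly why tcpdump on one of them comes up empty.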

     
Routers were originally defined as network hosts with more than one network interface. They were capable of forwarding IP traffic from the interface on one network out the interface on another network. Modern routers generally are not hosts per se, as they tend to be high-performance, dedicated machines that don't really support a user environment, but only direct traffic, so to speak. The important concept is that routers deal with traffic at the network layer, that is, based on IP addresses.

5 Routing

Up till now we have been talking about very simple routing for hosts with only one network interface and a default route. Setting up routing information for such hosts involves manually editing certain files (as we have seen) and changes are only required when there are changes to the subnet to which such hosts are attached. Routing of this nature is known as static routing because the route information is configured by hand or during system startup through invocations of the route command. Static routing is useful for small, stable environments.

For more complex environments dynamic routing may be required. As its name may imply, this style of routing requires special routing daemons which update a host's routing tables as the network environment changes. A primitive and ubiquitous routing daemon that is shipped with most every version of Un*x is called routed and makes use of a routing protocol called RIP . We will look at routed and a more powerful daemon gated .

It should be re-iterated that routing is a network layer concept. Routers forward packets onto their proper destination based solely on the packets' destination IP addresses and the contents of the machine's routing table.

5.1 Static Routing

My private home network is a good example of where static routing makes sense. Here is a picture of the topology of it:

            198.11.19.5 +-----------+
    =---- DSL ----------+ coatlicue |
    198.11.19.1         | BSD sparc |
                        +-+---------+
                          | 10.0.0.1
                          |          /----- [ laptop drop ]
                       +--+----+    /
             (upstairs)| 10b/T +----   10.0.0.2+----------+
                       | Hub   +---------------+ xibalba  |
                       +--+----+               |  RH pc   |
                          |                    +----------+
                          |
                       +--+----+
           (downstairs)| 10b/T |      10.0.0.10+-----------+
                       | Hub   +---------------+ Macintosh |
                       +-------+               +-----------+
Because the network is very small and the IP information almost never changes, it would be absurd to use anything but static routing. Let's have another look at the routing table on the gateway host coatlicue :

    coatlicue% netstat -rn
    Routing tables

    Internet:
    Destination        Gateway            Flags     Refs     Use    Mtu  Interface
    default            198.11.19.1        UGS         5   187476      -  le0
    10.0.0/24          link#2             UC          0        0      -  le1
    10.0.0.1           127.0.0.1          UGHS        0       13      -  lo0
    10.0.0.2           0:a0:24:15:57:b7   UHL         4   158520      -  le1
    10.0.0.10          0:0:94:ad:97:8f    UHL         0     3121      -  le1
    10.0.0.36          link#2             UHL         2       54      -  le1
    127/8              127.0.0.1          UGRS        0        0      -  lo0
    127.0.0.1          127.0.0.1          UH          2        0      -  lo0
    198.11.19.0/25     link#1             UC          0        0      -  le0
    198.11.19.1        0:0:c:c:2d:d0      UHL         1        2      -  le0
    224/4              127.0.0.1          URS         1       12      -  lo0

For now, don't worry about the fact that I also run NAT which allows many private addresses to be mapped into a single real IP address that can be used on the internet. We'll look at NAT later.

The gateway host has two network interfaces: one for the internal network and one for the connection to the internet. You can see that there are routes in the table for each subnet that coatlicue is attached to:

    10.0.0/24          link#2             UC          0        0      -  le1
    198.11.19.0/25     link#1             UC          0        0      -  le0
You can also see the default route:

    default            198.11.19.1        UGS         5   187476      -  le0
When the machine receives a packet on one interface, it checks the destination IP address of the packet and forwards the packet out its other interface if it knows that the destination IP can be reached through that interface.

Here is the ifconfig information for coatlicue:

    coatlicue% ifconfig -a
    lo0: flags=8009<UP,LOOPBACK,MULTICAST>
            inet 127.0.0.1 netmask 0xff000000 
    lo1: flags=8008<LOOPBACK,MULTICAST>
    le0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST>
            media: Ethernet 10baseT
            inet 198.11.19.5 netmask 0xffffff80 broadcast 198.11.19.127
    le1: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST>
            media: Ethernet 10baseT
            inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255
Now look at the Configuration information for the internal host xibalba :

    xibalba % ifconfig -a
    eth0      Link encap:Ethernet  HWaddr 00:A0:24:15:57:B7  
              inet addr:10.0.0.2  Bcast:10.0.0.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    ...

    xibalba % netstat -rn
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
    10.0.0.2        0.0.0.0         255.255.255.255 UH        0 0          0 eth0
    10.0.0.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
    127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
    0.0.0.0         10.0.0.1        0.0.0.0         UG        0 0          0 eth0

     
To configure an OpenBSD box to be a router, you need to enable IP forwarding in the kernel. This is simple to achieve by editing the /etc/sysctl.conf file and making sure the following line appears in the file:

    net.inet.ip.forwarding=1        # 1=Permit forwarding (routing) of packets
The process for RedHat also necessitates enabling IP forwarding in the kernel; however, this usually requires that the kernel be recompiled to include the configuration change. See: http://www.linuxdoc.org/HOWTO/Kernel-HOWTO.html for details about rebuilding the Linux kernel.

     
Let's re-examine the concept of the default route. First, in general, when a gateway machine receives a packet on one of its interfaces it will consult its routing table and forward the packet based on the best match of the packet's destination IP address with the information in the table. If no match can be made then the packet is sent via the machine's default route, with the hope that a router further on will know what to do with the packet.

     
Normally, when I have a default route, I can simply connect to another host without any problem. For example, I can SSH to bfs.cs.colorado.edu easily enough (I'm using IP addresses instead of hostnames to avoid nameservice issues):

    coatlicue % ssh 128.138.202.9
    Host key not found from the list of known hosts.
    Are you sure you want to continue connecting (yes/no)? yes
    ...
    bfs:tor % 
Now, this of course fails to work if I remove my default route:

    coatlicue% sudo route delete default
    delete net default

    coatlicue% ssh 128.138.202.9
    Secure connection to 128.138.202.9 refused; reverting to insecure method.
    Using rsh.  WARNING: Connection will not be encrypted.
    128.138.202.9: No route to host

SSH tries to connect to bfs on TCP port 22, which fails; then SSH falls back to try the RSH protocol, which also fails. Then SSH tells us: No route to host. We are not at all surprised to see this error message, having just nuked our default route from the routing table...

Instead of adding the default route back into the routing table, let's add a specific route to the 128.138.202 subnet:

    coatlicue% sudo route add -net 128.138.202.0/24 198.11.19.1
    add net 128.138.202.0: gateway 198.11.19.1
Now we can SSH to bfs again just fine:

    coatlicue% ssh 128.138.202.9
    Last login: Mon Feb 28 22:11:25 2000 from coatlicue.colora
    ...
    bfs %
However, we can't get to anywhere else:

    coatlicue% ssh 128.138.192.205
    Secure connection to 128.138.192.205 refused; reverting to insecure method.
    Using rsh.  WARNING: Connection will not be encrypted.
    128.138.192.205: No route to host
I'll restore my default route so I can get some work done...

    coatlicue% sudo route delete -net 128.138.202 198.11.19.1
    delete net 128.138.202: gateway 198.11.19.1
    coatlicue% sudo route add default 198.11.19.1
    add net default: gateway 198.11.19.1
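
The best-match lookup and the little experiment above can be sketched in a few lines of Python using the standard ipaddress module. This is only a model of the decision, assuming a table of (destination, gateway, interface) entries copied from coatlicue's netstat output; the function names lookup and describe are made up for illustration, and a real kernel does considerably more (host routes, link-layer lookups and so on).

```python
import ipaddress

# A few of coatlicue's routes from the table above: (destination, gateway, interface).
# BSD prints "10.0.0/24"; the ipaddress module wants all four octets spelled out.
ROUTES = [
    ("0.0.0.0/0", "198.11.19.1", "le0"),      # the default route
    ("10.0.0.0/24", "link#2", "le1"),
    ("198.11.19.0/25", "link#1", "le0"),
]


def lookup(dest, routes):
    """Return the (network, gateway, interface) with the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dest)
    best = None
    for cidr, gateway, iface in routes:
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, iface)
    return best


def describe(dest, routes):
    """Summarize the routing decision for dest as a short string."""
    hit = lookup(dest, routes)
    return "No route to host" if hit is None else "via %s on %s" % (hit[1], hit[2])


print(describe("10.0.0.2", ROUTES))             # local subnet wins over the default
print(describe("128.138.202.9", ROUTES))        # only the default route matches

# "route delete default"
no_default = [r for r in ROUTES if r[0] != "0.0.0.0/0"]
print(describe("128.138.202.9", no_default))

# "route add -net 128.138.202.0/24 198.11.19.1"
with_specific = no_default + [("128.138.202.0/24", "198.11.19.1", "le0")]
print(describe("128.138.202.9", with_specific))
print(describe("128.138.192.205", with_specific))
```

The default route is simply the entry with a zero-length prefix: it matches every address, but loses to any more specific entry.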
     
If you have a simple network topology, as above, but your gateway's external address is obtained via DHCP, you are still basically using static routing. Certainly for all hosts on the internal subnet, their routing information need not ever change. The DHCP client daemon that runs on the gateway machine receives routing information from the DHCP server when the interface is configured. The DHCP client updates the machine's routing table accordingly. In a way, this is almost like dynamic routing, which we will look at next. The difference is that DHCP is not a routing protocol; it is a dynamic IP address assignment protocol and mechanism only. It is capable of installing a route to a local subnet via the dynamically configured interface, as well as a default route. It usually invokes the route command to do this and that is the extent of its capabilities.

5.2 Dynamic Routing

When we move into the realm of larger and more complex network topologies, we also begin to deal more directly with connecting to the internet. From the point of view of the 'net, the world consists of large entities called autonomous systems. Autonomous Systems (AS) are basically high-level domains that have individually maintained interior routing policies and protocols and that interact with other AS's using different, mutually agreed upon exterior routing policies and protocols.

For example, the University of Colorado is basically an AS. It has an internal routing policy among the campus backbone routers and (basically) a single route to the rest of the internet. The border router between campus and the internet cooperates with the internal routing policy on its internal network connection and cooperates with an external routing policy that is coordinated between the network administrators of neighboring Autonomous Systems and their border routers.

The concept of the Autonomous System can actually be applied recursively within a given AS. (Take the above example, for example :) The CS dept domain, cs.colorado.edu , can be looked at as a mini-AS within the (greater) colorado.edu domain. Internal to the CS department networks we have an internal routing policy and mechanism, while our border router also obeys the agreed upon routing policy and mechanism that is in use on the campus backbone .

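Given the above, you will not be surprised to know that dynamic routing protocols are divided into two broad categories: interior routing protocols (such as RIP and OSPF), which are used within an Autonomous System, and exterior routing protocols (such as EGP and BGP), which exchange reachability information between Autonomous Systems.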

The concept of reachability changes depending on one's perspective with regard to a border router. For a host inside a given domain, the reachability information provided to that host by the border gateway includes the entire rest of the internet (i.e. a default route). For other border routers (i.e. neighboring AS peers) outside the given domain the reachability information provided by the particular border router is only that router's internal network.

I'm not going to discuss any of the exterior routing protocols (e.g. EGP or BGP). If you find yourself in a situation requiring this knowledge, you'll have to get a good book about it anyway! The above mentioned O'Reilly book is a great place to start. Additionally, configuration details will inevitably be specific to the particular router hardware on which you are working.

     
The routing decisions made by both simple network hosts and by large border routers are based solely on the contents of the particular machine's routing table (as I have already noted). For the simple host, there is not much involved: decide whether a packet is destined for the local host, the local subnet, or the default route. For border routers the decisions are much more complex. Have a look at the routing table for the CS department's border router, a cisco 7000 :

    gw#show ip route
    Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP 
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area  
           E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
           i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
           U - per-user static route 

    Gateway of last resort is 128.138.80.141 to network 0.0.0.0

      128.138.0.0/16 is variably subnetted, 242 subnets, 8 masks
    O       128.138.80.88/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O IA    128.138.0.0/19 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.80/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    S       128.138.192.192/26 [1/0] via 128.138.243.194
    O       128.138.1.0/24 [110/13] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.84/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.72/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.76/30 [110/3] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.68/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O IA    128.138.40.0/25 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O IA    128.138.233.192/26 [110/15] via 128.138.80.141, 1d02h, FastEthernet4/1
    O IA    128.138.34.0/25 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    ....

I truncated the list (there are about 320 routes in the table). You will see that most of the routes are actually reachable via the router's "Gateway of last resort" (cisco's name for the default route). The routing protocol by which a given route was added to the table is indicated by the capital letter at the far left.

You can imagine that a border router for one of the large internet carriers like MCI or Sprint would have an absolutely huge routing table!

5.2.1 The traceroute Command

The traceroute command is used to show the routers a packet will travel through to get to a given destination. For example:

    xibalba % traceroute bfs
    traceroute to bfs.cs.colorado.edu (128.138.202.9), 30 hops max, 38 byte packets
     1  coatlicue (10.0.0.1)  1.308 ms  1.194 ms  1.150 ms
     2  its-dsl1.Colorado.EDU (198.11.19.1)  19.669 ms  20.740 ms  28.076 ms
     3  hut-its-7206.Colorado.EDU (128.138.80.33)  21.714 ms  21.315 ms  20.827 ms
     4  engr-hut.Colorado.EDU (128.138.80.202)  21.750 ms  20.623 ms  21.492 ms
     5  cs-gw.Colorado.EDU (128.138.80.142)  23.105 ms  22.042 ms  21.432 ms
     6  bfs.cs.colorado.edu (128.138.202.9)  23.753 ms *  34.424 ms
Each line of output from traceroute (except for the last) is a router in between my home PC xibalba and the CSEL lab server bfs . The first hop is my gateway box coatlicue as we would expect. The second hop is also not too surprising: it is my gateway's default route. From there we hop to two other routers in the campus backbone before reaching the CS department's border router and finally bfs itself.

Traceroute is very useful for diagnosing routing problems, as you can probably imagine.

5.2.2 RIP and the routed Daemon

RIP is probably the most commonly used interior routing protocol. The protocol is implemented as a daemon called routed ("route-dee") that is shipped with virtually every version of Un*x. RIP makes use of the UDP transport protocol to convey messages between RIP-aware servers.

When the routed program starts up it solicits routing information from any neighboring routers. That is, a RIP query packet is broadcast on each subnet to which the host running routed is connected. Any other hosts running routed that hear the query will respond with RIP response packets delineating the routes they know. During normal operation, RIP servers will continue to send update packets to all listening servers. The theory is, if a particular RIP server fails to send updates for X amount of time, then the routes which were previously advertised by that server are assumed to be broken and are removed from the routing tables. (X is usually 180 seconds.)

Simple network hosts with only a single interface and default route can run routed -q . The -q flag to routed tells it to run in quiet mode meaning that routed will never advertise routes, it will only listen to the routes advertised by other RIP servers. It is bad form to run routed on such client hosts without the -q flag.

The -s flag to routed is the opposite of -q and it is also the default (so it doesn't need to be specified) when routed is run on a host with more than one network interface.

A RIP server assigns a metric or cost to each route it advertises. The measurement is also called a hop-count and it basically tells the number of routers that will be traversed between this server and the advertised destination. If two routes are received that have the same destination, routed will only keep the one with the lowest cost. (This is a method used to avoid some kinds of routing loops ).
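
To make the bookkeeping concrete, here is a simplified Python sketch of what a RIP listener does with the updates it hears: keep the cheapest route per destination and forget routes whose gateway has gone quiet. It is a model only, not routed's actual code. The table layout, the function names, the neighbor addresses and the choice to add one hop on receipt are assumptions for illustration; the 180 second timeout is the X mentioned above.

```python
ROUTE_TIMEOUT = 180  # seconds of silence before a learned route is dropped


def merge_update(table, neighbor, advertised, now):
    """Fold one RIP response from `neighbor` into `table`.

    table:      {destination: (hop_count, next_hop, last_heard)}
    advertised: {destination: hop_count} as announced by the neighbor
    """
    for dest, metric in advertised.items():
        cost = metric + 1  # one extra hop to get through the neighbor
        current = table.get(dest)
        # Install the route if it is new, cheaper than what we have, or a
        # fresh word from the gateway we are already using.
        if current is None or cost < current[0] or current[1] == neighbor:
            table[dest] = (cost, neighbor, now)


def expire(table, now):
    """Remove routes not refreshed within ROUTE_TIMEOUT; return what was dropped."""
    stale = [d for d, (_, _, heard) in table.items() if now - heard > ROUTE_TIMEOUT]
    for dest in stale:
        del table[dest]
    return stale


table = {}
merge_update(table, "10.0.0.1", {"128.138.202.0/24": 4}, now=0)
merge_update(table, "10.0.0.9", {"128.138.202.0/24": 2}, now=10)   # cheaper, wins
merge_update(table, "10.0.0.1", {"128.138.202.0/24": 4}, now=30)   # dearer, ignored
chosen = table["128.138.202.0/24"]
dropped = expire(table, now=200)   # 10.0.0.9 has been silent for 190 seconds
print(chosen, dropped)
```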

You can preconfigure the routing information that routed will start with by editing the file /etc/gateways . One reason you may need to do something like this is if a particular route is known to exist, but it does not advertise, so routed can never learn it during normal operation. There are two basic types of entries allowed in this file: routes for hosts and routes for nets . Here are the formats:

    net  Nname[/mask] gateway Gname metric value <passive | active | extern>
    host Hname        gateway Gname metric value <passive | active | extern>

For a network route, one needs to specify the subnet and netmask , the gateway , the cost and a special tag at the end that tells whether the route is passive , active or external .

For example, if I were to set up RIP on my gateway coatlicue , I might add a line as follows to /etc/gateways:

    net 0.0.0.0 gateway 198.11.19.1 metric 1 passive
The 0.0.0.0 address is routed's way of saying default route . The keyword passive is used to inform routed that the indicated gateway will not provide RIP updates about its status. The keyword must be specified in such cases so that routed doesn't delete the route from the routing tables (which it does when it doesn't get updates from a given gateway). Entries which have the active keyword are almost unnecessary, as the assumption is that routed will start receiving updates about such routes anyway.
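
If it helps to see the format picked apart, here is a small Python sketch that splits one such entry into its pieces. It follows only the two formats shown above; the function name and the dictionary keys are made up for illustration, and it makes no attempt to handle anything else a real routed might accept in this file.

```python
VALID_TAGS = ("passive", "active", "extern")


def parse_gateways_line(line):
    """Parse one 'net' or 'host' entry; return a dict, or None for a blank line."""
    fields = line.split()
    if not fields:
        return None
    if len(fields) != 7 or fields[0] not in ("net", "host"):
        raise ValueError("unrecognized entry: %r" % line)
    kind, dest, kw_gateway, gateway, kw_metric, metric, tag = fields
    if kw_gateway != "gateway" or kw_metric != "metric" or tag not in VALID_TAGS:
        raise ValueError("unrecognized entry: %r" % line)
    # A net entry may carry an optional /mask after the network name.
    name, _, mask = dest.partition("/")
    return {
        "kind": kind,
        "dest": name,
        "mask": mask or None,
        "gateway": gateway,
        "metric": int(metric),
        "tag": tag,
    }


entry = parse_gateways_line("net 0.0.0.0 gateway 198.11.19.1 metric 1 passive")
print(entry)
```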

RIP is subject to a number of different problems, most of which have been solved with either subtle implementation changes or a new revision of RIP called RIPv2.

5.2.3 Classless Inter Domain Routing - aka CIDR

These are the CIDR house rules :)

The CIDR specification arose as an attempt to deal with the fact that while there were many unused IP addresses, they were unavailable due to traditional class boundaries. With CIDR, these traditional class A, B and C boundaries are abolished. The point is simply to say that networks can have an arbitrary number of bits in the IP address devoted to the network instead of the traditional 8 bits for class A addresses, 16 bits for class B's and so on. This is where the /N convention arose:

    128.138.202.0/24    designates 24 bits for the network, 8 for the host
    128.138.192.192/26  designates 26 bits for the net, 6 for the host
You already know that we subdivide many of our CS dept nets into 6-bit networks. Take a look at this snippet from the /etc/networks file:

    cu-cs-capp      128.138.242.0   # [6] CS CAPP Lab (lynda.mcginley)
    cu-cs-serl      128.138.242.64  # [6] CS SW Engring Rsrch Lab (lynda.mcginley)
    cu-cs-cappfast  128.138.242.128 # [6] CS CAPP Fast Ethernet (lynda.mcginley)
    cu-cs-fs        128.138.242.192 # [6] CS Fast Servers (lynda.mcginley)
The /etc/networks file gives you the symbolic name of the network and the IP address of the network. Here at CU, we also have a convention of placing the number of host-bits as the first bit of information in the comment following each entry. Anyway, you can see that what could have been a single 24-bit (i.e. 8 host bits) network has been divided into four 26-bit networks (i.e. 6 host bits). The netmask for a 26-bit network is 255.255.255.192 .
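
The arithmetic is easy to check with Python's standard ipaddress module. This is just a sketch that carves the 128.138.242.0/24 block from the snippet above into its four 26-bit networks and prints the netmask, the number of host bits and the number of usable host addresses for each.

```python
import ipaddress

block = ipaddress.ip_network("128.138.242.0/24")

# Lengthening the prefix from 24 to 26 bits yields 2**2 = 4 subnets.
subnets = list(block.subnets(new_prefix=26))

for net in subnets:
    host_bits = 32 - net.prefixlen
    usable = net.num_addresses - 2   # minus the network and broadcast addresses
    print(net, net.netmask, host_bits, usable)
```

Each line of output should correspond to one of the four /etc/networks entries, with 6 host bits apiece.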

Another feature of CIDR is that one can aggregate networks together into a single entry in a routing table (when those networks are contiguous with one another and reachable by the same gateway). The above four subnets are all connected to our internal router named cs-gw3. Look at the single static routing entry on our external router (cs-gw) for these nets:

    ip route 128.138.242.0   255.255.255.0   128.138.243.194 100
The four 26-bit nets have been aggregated together into a single 24-bit net entry in the routing table. This is a 4:1 reduction, helping our routing table to be much smaller.
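
Aggregation can be checked the same way. In this sketch the standard library's ipaddress.collapse_addresses merges the four contiguous 26-bit nets into one entry; the second call is an illustrative counter-example showing that three of the four do not collapse into a single route.

```python
import ipaddress

lab_nets = [ipaddress.ip_network(n) for n in (
    "128.138.242.0/26",
    "128.138.242.64/26",
    "128.138.242.128/26",
    "128.138.242.192/26",
)]

# All four are contiguous and fill the /24 exactly, so they collapse to one entry.
aggregated = list(ipaddress.collapse_addresses(lab_nets))
print(aggregated)

# Leave one out and the best we can do is a /25 plus a leftover /26.
partial = list(ipaddress.collapse_addresses(lab_nets[:3]))
print(partial)
```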

5.2.4 The OSPF Protocol

The Open Shortest Path First protocol was developed to deal with many of the shortcomings of RIP. It is in a class of protocols known by the Shortest Path First algorithm used to choose optimal routes. The protocol is also known as a link state protocol. The link-state concept differs from RIP's distance-vector mechanism in that OSPF will compute a link-state based primarily on whether or not a gateway is actually functioning. Other variables that make up the link-state include the round trip time for a packet to reach a given gateway as well as the MTU of the link; if two routes are known to reach the same destination, then the route which is fatter and faster will be considered optimal. OSPF is also capable of load balancing (aka "equal-cost multi-path" routing) whereby heavy volumes of traffic can be distributed over a number of routes to reduce the load over each route. Don't forget that OSPF is also an interior routing protocol.

An important concept for the operation of OSPF is that of the area . There are two main kinds: stub areas and backbone areas. Stub areas have only a single gateway, and this is usually a border router that also sits in a backbone area.

Each OSPF router figures out the link-state of its immediately connected routes and floods this information to every other OSPF router in the system. Then each router builds a directed graph of the network topology from its own point of view. The graph is then pruned into a shortest-path tree using the SPF algorithm. The fact that each OSPF router maintains this link-state database, together with the way the routers communicate with each other, makes a network using OSPF converge much more quickly (than with RIP) when existing routes fail or new routes appear.
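
The pruning step is Dijkstra's shortest-path algorithm. Here is a minimal Python sketch of it, assuming the link-state database has already been boiled down to a dictionary of per-link costs. The four routers A through D and their costs are invented for illustration, and this is not how any real OSPF implementation stores its database.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's SPF over graph = {router: {neighbor: link_cost}}.

    Returns {router: (total_cost, first_hop)} for every reachable router;
    first_hop is what would end up in the source's routing table.
    """
    best = {}
    heap = [(0, source, None)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if node in best:
            continue  # already settled via a cheaper path
        best[node] = (cost, first_hop)
        for neighbor, link_cost in graph.get(node, {}).items():
            if neighbor not in best:
                # Leaving the source, the neighbor itself is the first hop;
                # further out, we inherit the first hop of the path so far.
                hop = neighbor if node == source else first_hop
                heapq.heappush(heap, (cost + link_cost, neighbor, hop))
    return best


# Hypothetical topology: the direct A-C and B-D links are expensive.
graph = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
tree = shortest_paths(graph, "A")
print(tree)
```

From A's point of view everything is best reached through B, even C, which is directly attached but over a costlier link. When a link fails, each router reruns this computation on the updated database, which is part of why convergence is quick.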

Let's revisit the routing table on the CS department router cs-gw :

     
    Gateway of last resort is 128.138.80.141 to network 0.0.0.0

      128.138.0.0/16 is variably subnetted, 242 subnets, 8 masks
    O       128.138.80.88/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O IA    128.138.0.0/19 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.80/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    S       128.138.192.192/26 [1/0] via 128.138.243.194
    O       128.138.1.0/24 [110/13] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.84/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.72/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.76/30 [110/3] via 128.138.80.141, 1d02h, FastEthernet4/1
    O       128.138.80.68/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O IA    128.138.40.0/25 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    O IA    128.138.233.192/26 [110/15] via 128.138.80.141, 1d02h, FastEthernet4/1
    O IA    128.138.34.0/25 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
    ....

Notice the second line above:

      128.138.0.0/16 is variably subnetted, 242 subnets, 8 masks
This information is derivable from the OSPF link-state database. The other routes in the above table that were installed by OSPF (which you can tell by the O at the far left of each entry in the table) are some of those 242 subnets; the fact that they all are reached via the default route ( 128.138.80.141 ) is basically coincidental. It is a combination of the fact that the CS network is a stub OSPF area and the specific configuration (i.e. the topology ) of the colorado.edu network overall.

Notice that many of the routes shown above are to /30 subnets. Each of these subnets has a netmask of 255.255.255.252 which leaves only two bits for the host. Basically, a /30 subnet has only two usable IP addresses. For example, the default route for cs-gw is to 128.138.80.141 which is also a /30 subnet:

    128.138.80.140             this is the subnet
    128.138.80.141             this is an ITS (CU) backbone router
    128.138.80.142             this is the CS dept router (cs-gw)
    128.138.80.143             this is the IP broadcast address for the subnet
Given that a /30 subnet has only two bits for the host, it is commonly used for point-to-point like connections between routers. One end of the connection gets the first usable address, the other end gets the other usable address. To reiterate, with only two bits for possible host identification (i.e. bits number 30 and 31 of the 32 bit IP address):

      bit-30  bit-31        desc
       ----    ----         ----
        0       0            defines the particular subnet
        0       1            first usable address
        1       0            second usable address
        1       1            IP broadcast address for the subnet
Of course, the IP broadcast address is kind of moot with a link like this, but it comes along for the ride because it's part of the IP-subnet concept.
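
The same four addresses fall out of a quick check with Python's standard ipaddress module. This is only a sketch to confirm the arithmetic for the 128.138.80.140/30 link described above.

```python
import ipaddress

link = ipaddress.ip_network("128.138.80.140/30")

subnet_address = str(link.network_address)      # host bits 00
usable = [str(h) for h in link.hosts()]         # host bits 01 and 10
broadcast = str(link.broadcast_address)         # host bits 11

print(link.netmask, subnet_address, usable, broadcast)

# Both ends of the point-to-point link land in the same /30.
its_router = ipaddress.ip_interface("128.138.80.141/30")
cs_gw = ipaddress.ip_interface("128.138.80.142/30")
same_link = its_router.network == cs_gw.network == link
print(same_link)
```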

     

     

5.2.5 The gated Daemon and some routing theory

These days most complex routing situations are dealt with using dedicated router hardware. If you instead want to use a Un*x box for this purpose then you will most likely have to obtain and configure gated . The gated package is quite complex and very powerful, as it can make use of many different routing protocols. In fact, you can configure gated to listen to information from many different routing protocol servers and compute a preference value for each based on your configuration instructions. Routes with lower preference values are selected first.

Gated can converse with the following routing protocols:

It also listens for ICMP redirects, the IS-IS protocol and others.

You configure gated by editing the file /etc/gated.conf . The file is divided into six major sections which must appear in the following order:

Not all of the above sections must be present in the gated.conf file, but if more than one of the above statement types does appear in the file, then the order must be honored or you will get syntax errors!

The following is a sample gated configuration file in use in the CS department. It is on a machine that we call a backup router. It is important to note that we don't use OSPF on our internal networks, so this example of gated is only dealing with RIP.

    # 
    # GateD configuration file for the CSOps subnet
    #
    # $Header: /home/fcsk/tor/working/saclass/RCS/class07.txt,v 1.2 2000/03/05 22:22:53 tor Exp $
    #
    # THIS FILE IS UNDER RCS!  The master copy resides in 
    #       /csops/private/config/gated/suod
    #
    ##############################################################################
    #
    # Ensure GateD comes up if DNS is hosed
    options noresolv ;
    options syslog upto warning ;

    #
    # Prevent any interfaces from being marked `down'
    interfaces {
        interface all passive ;
        # Prefer the leaf interface
        interface 128.138.192.205 preference 10 ;
        interface 128.138.243.135 preference 15 ;
    };

    #
    # Don't flush the routing table (DOH!)
    kernel {
        options noflushatexit ;
    } ;

    #
    # Configure RIP
    rip yes {
        # Add 3 to all advertisements
        interface all metricout 3 ;
        # Force RIP v.1 advertisements
        interface all version 1 ;
        # List of routers to accept RIP info from
        trustedgateways
            # Backbone Routers, 243 interface
            128.138.243.129 128.138.243.131 128.138.243.135
            128.138.243.137 128.138.243.138 128.138.243.140
            128.138.243.143 128.138.243.144 128.138.243.167
            128.138.243.171
            # CSOps
            128.138.192.193 128.138.192.197 128.138.192.205
            ;

        traceoptions policy request response other ;
    } ;

    #
    # Filter RIP announcements
    export proto rip interface 128.138.243.135 {
        proto direct { } ;
    } ;

    #
    # Logging and tracing and other options
    traceoptions "/var/log/gated.log" replace size 100k files 3
        # choose from below (at least one must be uncommented)
        #parse
        #adv
        #symbols
        #iflist
        #general
        #state
        #normal
        #policy
        #task
        #timer
        #route
        none
        ;

    # end

As you know, most of our departmental subnets connect to a single router (cs-gw3). This router is potentially a single point of failure in that if it failed, communication from any of the subnets to any other or to the outside world would effectively be cut off. To deal with such a problem we have a number of special Un*x hosts that each have a network interface on a shared subnet as well as a second interface on a different subnet. We set up gated on each of these hosts so that if the default route through the cisco router (i.e. cs-gw3) goes down, then they start routing traffic instead. We call these hosts backup routers.

Of course, traffic to the outside world may still not work in this case. Having a redundant link to the rest of the internet is a whole other issue: do you contract with a second ISP? Do you then divide all your traffic between the two routes to the outside world? Cost is a big factor here, of course, not to mention the routing considerations. Some of the questions that can be asked about routing under such circumstances: can traffic simply be split in half for load-balancing? What if the two routes are not actually to the same destination (which is good for robust connectivity)? Local routers will have to understand the external topology somewhat in order to route packets efficiently...

You can use the traceroute command to see a path that your packets will travel to reach a particular destination. Can you find a destination that gives different results for multiple invocations of the same traceroute command? Do you find that packets traveling to relatively local destinations often go all over the country before reaching, e.g., Denver? Sometimes you will see this sort of thing occur, like packets traveling to San Jose before reaching their destination at a computer somewhere else in Boulder! Generally this is due to misconfigured routers either internal to or between the large internet carriers like MCI or Qwest. Next time you notice delays downloading a web document, try using traceroute to see where the delays are occurring. Often it will again be caused by the same types of problems (if it's not just an overloaded web-server :).

     

If you are interested in using gated for OSPF routing, look here in particular:

For gated info in general:

6 Providing (or Denying) Services with inetd

Most important services provided by a unix machine are implemented in the form of server daemons (as we have seen before). Often these daemons are simply started up at boot time and they run continuously until they are manually killed or the machine is shutdown (or they fail because of an error :). Generally, for services like http on a busy machine, this makes perfect sense.

There are many lesser-used services, though, that one would like to provide, but would rather not have running all the time using up system resources unnecessarily. That is the purpose of the inetd daemon. The inetd daemon loads its configuration information from the file /etc/inetd.conf . This file is a list of services for which inetd acts as an agent. Inetd will listen on the well known ports of the services specified (i.e. the symbolic service name is given here and it is referenced in the /etc/services file). When a connection on a particular port (that inetd is listening on) is asked for, inetd will spawn a new process and start up the relevant server daemon, handing off the TCP or UDP connection to that new process.

Let's have a look at the /etc/inetd.conf file from OpenBSD:

     
    #       $OpenBSD: inetd.conf,v 1.31 1999/04/10 05:13:42 deraadt Exp $
    #
    # Internet server configuration database
    #
    ftp             stream  tcp     nowait  root    /usr/libexec/ftpd       ftpd -USlld
    #telnet         stream  tcp     nowait  root    /usr/libexec/telnetd    telnetd -k
    #shell          stream  tcp     nowait  root    /usr/libexec/rshd       rshd -L
    #login          stream  tcp     nowait  root    /usr/libexec/rlogind    rlogind
    #exec           stream  tcp     nowait  root    /usr/libexec/rexecd     rexecd
    #uucpd          stream  tcp     nowait  root    /usr/libexec/uucpd      uucpd
    #finger         stream  tcp     nowait  nobody  /usr/libexec/fingerd    fingerd -lsm
    #ident          stream  tcp     nowait  nobody  /usr/libexec/identd     identd -elo
    #tftp           dgram   udp     wait    root    /usr/libexec/tftpd      tftpd -s

Actually, that is just the first 15 lines or so from the file - enough for our purposes. You will note that most of the entries in this file have been commented out. In fact, the default installation of OpenBSD has very few entries in this file actually enabled. One that I turned on was for ftp.

The format of each entry is:

    service  socket-type  protocol  [no]wait  user  daemon  daemon-arguments
The service is a symbolic service name that can be referenced in the /etc/services file. By default, inetd listens for each specified service on every network interface. Some versions of inetd, however (notably OpenBSD and Linux) let you specify a subset of those interfaces if that is desired - e.g. for security purposes. For example, on my home network, I might want to only allow ftp on the inside net. To do that, the service would look like: 10.0.0.1:ftp instead of just ftp. See the inetd(8) manpage for more details.

The socket-type will make sense if you actually do any network programming in C. TCP connections will always be type stream and UDP connections will always be type dgram.

The protocol is generally tcp or udp although you will also see rpc/tcp and rpc/udp.

The [no]wait field tells inetd whether it should wait until a given server process completes before starting a new one (of the same daemon) or not.

The user is the account that will own the daemon process when inetd starts it.

The daemon is the absolute path to the server daemon itself.

Last come the arguments that are passed to the server daemon. Note well that the first of these arguments is the name of the process itself. If you are familiar with the execve(2) system call (a C language system call) this will make sense to you. You will see that it is also important for implementing tcpwrappers.
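The field layout described above can be sketched as a small Python parser. This is an illustration only, not how inetd itself reads the file: the names InetdEntry and parse_inetd_line are made up for this example, and the sketch ignores extensions such as rpc entries with version numbers.

```python
from typing import List, NamedTuple, Optional


class InetdEntry(NamedTuple):
    address: Optional[str]  # optional bind address, e.g. "10.0.0.1" in "10.0.0.1:ftp"
    service: str            # symbolic name, looked up in /etc/services
    socket_type: str        # "stream" or "dgram"
    protocol: str           # "tcp", "udp", "rpc/tcp", ...
    wait: str               # "wait" or "nowait"
    user: str               # owner of the spawned process
    daemon: str             # absolute path of the program to run
    argv: List[str]         # argument vector; argv[0] is the process name


def parse_inetd_line(line: str) -> Optional[InetdEntry]:
    """Parse one inetd.conf line; return None for blank or commented-out lines."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    fields = text.split()
    if len(fields) < 6:
        raise ValueError(f"too few fields: {line!r}")
    # Split an optional "address:" prefix off the service name.
    address, sep, service = fields[0].rpartition(":")
    return InetdEntry(address if sep else None, service,
                      fields[1], fields[2], fields[3], fields[4],
                      fields[5], fields[6:])
```

For the ftp entry shown earlier, the daemon field is /usr/libexec/ftpd while argv starts with ftpd. The two are separate fields, which is exactly what lets a wrapper program be named in the daemon field while the real server name stays in argv.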

7 Tcpwrappers - Controlling and Logging Service Use

An easy and convenient way to control and log access to services is through the use of tcpd - the "TCP Wrappers" package. The package was written by Wietse Venema of SATAN fame. It is available via FTP; see: ftp://ftp.porcupine.org/pub/security/index.html

Setting up the TCP wrappers is easy enough after you install the package. Simply change your /etc/inetd.conf file so that tcpd is invoked in place of the actual server (the telnet daemon, for example). Have a look at some excerpts from an inetd.conf file that makes use of TCP wrappers:

    ftp     stream  tcp     nowait  root    /usr/local/tcpd/bin/tcpd in.ftpd -l
    telnet  stream  tcp     nowait  root    /usr/local/tcpd/bin/tcpd in.telnetd
    shell   stream  tcp     nowait  root    /usr/local/tcpd/bin/tcpd in.rshd
    login   stream  tcp     nowait  root    /usr/local/tcpd/bin/tcpd in.rlogind
    exec    stream  tcp     nowait  root    /usr/local/tcpd/bin/tcpd in.rexecd
The wrappers allow you to log the connection as well as control access to the services based on source IP addresses. Logging is done with the local7 syslog facility.

To control access you use the /etc/hosts.allow and /etc/hosts.deny files. Each line in these files has the basic format:

    daemon_list : client_list [ : shell_command ]

There are some exceptions. Let's look at some examples. Here is the first line for the CS dept /etc/hosts.allow file:

    ALL : PARANOID : banners /usr/local/tcpd/lib/paranoid : DENY
This first line is one of the exceptions. The first field, ALL, indicates, as you might expect, that this line applies to all daemons. In the next field, PARANOID is a special client pattern that matches any requesting host whose DNS A and PTR records do not agree with each other. Since this rule ends in DENY, connections from such hosts are refused. If a DENY or ALLOW option is specified, it must come last on the line. The banners option transmits a text message found in the specified directory, in a file with the same name as the service (daemon). For an example, take a look in the directory /usr/local/tcpd/lib/paranoid on bfs. See the hosts_options(5) manpage for the whole scoop on it.
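The kind of check behind PARANOID can be sketched in a few lines of Python. This is a rough illustration, not the tcpd implementation: the function name and its reverse/forward parameters are made up for this example, and the parameters exist only so the lookups can be replaced with stand-ins when no nameserver is handy.

```python
import socket


def names_and_addresses_agree(addr, reverse=None, forward=None):
    """Return True if addr maps to a name that maps back to addr."""
    # By default use the system resolver for both directions.
    reverse = reverse or (lambda a: socket.gethostbyaddr(a)[0])
    forward = forward or (lambda n: socket.gethostbyname_ex(n)[2])
    try:
        name = reverse(addr)          # PTR lookup: address -> hostname
        return addr in forward(name)  # A lookup: hostname -> addresses
    except OSError:
        # A failed lookup in either direction counts as a mismatch here.
        return False
```

A host for which this returns False is the kind of client that the PARANOID pattern is meant to catch.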

Other lines in this file look less unusual:

    in.fingerd : .cs.colorado.edu : ALLOW

This one is more straightforward: the service is finger and it is allowed for every host in the cs.colorado.edu domain.
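As a rough sketch of how such a rule is read, the following Python handles only a small subset of the pattern language described in hosts_access(5): ALL, a leading-dot domain suffix, a trailing-dot address prefix, and an exact name or address. The function names are made up for this example, and it deliberately ignores PARANOID, netmasks, wildcards other than ALL, and colons inside options.

```python
def parse_rule(line):
    """Split one access rule into (daemons, clients, options)."""
    parts = [part.strip() for part in line.split(":")]
    if len(parts) < 2:
        raise ValueError("expected 'daemon_list : client_list'")
    # List elements may be separated by blanks and/or commas.
    daemons = parts[0].replace(",", " ").split()
    clients = parts[1].replace(",", " ").split()
    return daemons, clients, parts[2:]


def client_matches(pattern, host, addr):
    """Check one client pattern against a client's hostname and address."""
    if pattern == "ALL":
        return True
    if pattern.startswith("."):   # ".cs.colorado.edu": domain suffix
        return host.lower().endswith(pattern.lower())
    if pattern.endswith("."):     # "128.138.": address prefix
        return addr.startswith(pattern)
    return pattern.lower() in (host.lower(), addr)


def rule_applies(line, daemon, host, addr):
    """True if the rule names this daemon and one client pattern matches."""
    daemons, clients, _options = parse_rule(line)
    daemon_ok = "ALL" in daemons or daemon in daemons
    return daemon_ok and any(client_matches(p, host, addr) for p in clients)
```

With the finger rule above, a request to in.fingerd from bfs.cs.colorado.edu matches, while the same request from a host outside the domain, or a request to a different daemon, does not.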

     

8 Xinetd - The Linux Inetd Replacement

Newer distributions of Linux, like RedHat-7.0, ship with xinetd instead of (or in addition to) inetd. The overall concept is identical, except that xinetd integrates tcpwrapper functionality directly. Overall configuration takes place in the /etc/xinetd.conf file, while server-specific information is placed in individual files (named for the server, of course) in /etc/xinetd.d .

Here is an example /etc/xinetd.conf file:

    xibalba % cat /etc/xinetd.conf
    #
    # Simple configuration file for xinetd
    #
    # Some defaults, and include /etc/xinetd.d/

    defaults
    {
            instances               = 60
            log_type                = SYSLOG authpriv
            log_on_success          = HOST PID
            log_on_failure          = HOST RECORD
    }

    includedir /etc/xinetd.d

In the above config file we are just setting some default behavior. Specifically, instances sets the number of servers that may exist simultaneously for a given service; log_type sets the syslog facility-name to which log messages will be sent; and finally log_on_success and log_on_failure set the information that should be logged for successful and failed attempts to access services.

Individual services place their information in files in the /etc/xinetd.d directory. Here is a listing of that directory on a default RedHat-7.0 install:

    xibalba % ls /etc/xinetd.d
    finger  linuxconf-web  ntalk  rexec  rlogin  rsh  swat  talk  telnet  tftp
And finally, here is a look at the telnet file:

    xibalba % cat !$/telnet
    cat /etc/xinetd.d/telnet
    # default: on
    # description: The telnet server serves telnet sessions; it uses \
    #       unencrypted username/password pairs for authentication.
    service telnet
    {
    		flags           = REUSE
    		socket_type     = stream        
    		wait            = no
    		user            = root
    		server          = /usr/sbin/in.telnetd
    		log_on_failure  += USERID
    }
To implement tcpwrapper-like functionality using the native xinetd syntax you use the only_from keyword. For example:

            only_from       = 128.138.202.0/24
You may specify this keyword several times to allow access from several places.
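The address test that an only_from entry in CIDR form implies can be sketched with Python's ipaddress module. This is only an illustration of the arithmetic, not xinetd's code; the function name is made up for this example, and other forms that only_from accepts (hostnames, for instance) are not handled.

```python
import ipaddress


def allowed_by_only_from(client_addr, only_from):
    """True if client_addr falls inside any of the listed networks."""
    client = ipaddress.ip_address(client_addr)
    return any(client in ipaddress.ip_network(net) for net in only_from)
```

With the example value 128.138.202.0/24, the address 128.138.202.9 is accepted and 128.138.203.9 is not, because only the first 24 bits have to agree with the network.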

In the CS department we have been using inetd for a very long time in conjunction with tcpwrappers , and we have an infrastructure in place to automatically maintain the necessary files on all of our machines. With a little work (and the newest xinetd binaries) we were able to get xinetd to work just fine with tcpwrappers instead of its own only_from syntax. We wrote a Perl script to generate xinetd files from an existing inetd.conf file (there are other utilities available to do this). Here is what the converted file for telnet looks like:

    #
    # Warning, this is a *generated* file.
    # Any changes you make WILL be overwritten.
    # Edit /local/etc/inetd.conf.local and re-run mkinetd.conf instead.
    #

    service telnet
    {
            socket_type     = stream
            protocol        = tcp
            wait            = no
            user            = root
            flags           = NAMEINARGS
            server          = /usr/local/tcpd/bin/tcpd
            server_args     = /usr/sbin/in.telnetd
    }
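The conversion itself is mostly a matter of renaming fields. Here is a minimal Python sketch of the idea. It is not the department's Perl script; the function name is made up for this example, and it assumes the NAMEINARGS flag is always wanted, so that the first word of server_args becomes the process name just as in inetd.conf. Unlike the generated file above, it copies the arguments as they stand rather than expanding them to a full path.

```python
def inetd_to_xinetd(line):
    """Turn one enabled inetd.conf entry into an xinetd service stanza."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"too few fields: {line!r}")
    service, socket_type, protocol, wait, user, server = fields[:6]
    argv = fields[6:]

    attrs = [
        ("socket_type", socket_type),
        ("protocol", protocol),
        # inetd says "wait"/"nowait"; xinetd wants "yes"/"no".
        ("wait", "yes" if wait.startswith("wait") else "no"),
        ("user", user),
        ("flags", "NAMEINARGS"),
        ("server", server),
    ]
    if argv:
        # With NAMEINARGS the whole inetd argument vector is carried over.
        attrs.append(("server_args", " ".join(argv)))

    body = "\n".join(f"        {key:<16}= {value}" for key, value in attrs)
    return f"service {service}\n{{\n{body}\n}}\n"
```

Feeding it the wrapped telnet line from the inetd.conf excerpt in section 7 yields a stanza with server set to the tcpd path and server_args set to in.telnetd.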



Source document: class06.txt