The last item on the agenda in terms of setting up the network personality of a simple host is to configure your nameservice files. And, finally, one last look at the afterboot manpage:
BIND Name Server (DNS)
If you are using the BIND Name Server, check the /etc/resolv.conf file.
It may look something like:
domain nts.umn.edu
nameserver 128.101.101.101
nameserver 134.84.84.84
search nts.umn.edu. umn.edu.
lookup file bind
If using a caching name server add the line "nameserver 127.0.0.1" first.
For a local caching name server to run you will need to set "named_flags"
in /etc/rc.conf and create the named.boot file in the appropriate place
for named(8). The same holds true if the machine is going to be a name
server for your domain. In both these cases, make sure that named(8) is
running (otherwise there are long waits for resolver timeouts).
As you know, nameservice is the system by which a client host can query
a name server to translate symbolic hostnames (e.g. bfs.cs.colorado.edu)
into IP addresses (e.g. 128.138.202.9).
We will talk about the Domain Name System (DNS) in more detail in a future class,
as well as the basic configuration concepts necessary for a DNS server.
Setting up a client host for using DNS is much easier.
Software applications make use of DNS through a special C library
called the resolver library. The process of converting a hostname into
an IP address is known as address resolution .
To configure a client host's nameservice you need to edit the /etc/resolv.conf file. Here is the file from saclass:
saclass % cat /etc/resolv.conf
search cs.colorado.edu
nameserver 128.138.202.19
lookup file bind
The first line of this file gives a list of domains to be searched to find
a given host. The list may contain several space-separated domain names (resolvers have traditionally limited the list to six domains).
For example, if your home machine was in the domain of your ISP, say indra.net and
you wanted to also reference cs.colorado.edu hosts by only their unqualified
hostname (e.g. bfs) then you could use a search line as follows:
search indra.net cs.colorado.edu
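The effect of a search line can be sketched in a few lines of Python. This is a simplified model of what a resolver does with a name, not the actual library code: the function name candidate_names is made up for illustration, and the ordering rules assume the common default in which a name without dots is tried with each search domain before being tried as given.

```python
def candidate_names(name, search_domains):
    """Return the fully qualified names a resolver would try, in order.

    Simplified model (an assumption, not real resolver code):
    - a name ending in "." is already absolute and is tried alone;
    - a name with no dots gets each search domain appended first;
    - a name containing dots is tried as given first, then with the
      search domains appended.
    """
    if name.endswith("."):
        return [name]
    with_search = [f"{name}.{domain}." for domain in search_domains]
    as_is = [name + "."]
    if "." in name:
        return as_is + with_search
    return with_search + as_is


search = ["indra.net", "cs.colorado.edu"]
for candidate in candidate_names("bfs", search):
    print(candidate)
```

With the search line above, the unqualified name bfs is first tried in indra.net and only then in cs.colorado.edu, so the order of the domains matters.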
The second line specifies the IP address of a nameserver that will be
contacted for nameservice queries. There may be up to three of these
lines (and there should be, for robustness). The nameservers are queried
in order, with subsequent servers only being used if previous ones do
not return answers (i.e. the nameserver machine is down or broken).
The final line is an option that can be specified in the OpenBSD resolv.conf file only. This line specifies that the resolver libraries should first check the local /etc/hosts file to find IP addresses and only query a nameserver if the hosts file does not contain a match (i.e. the specified hostname is not in the hosts file). You can control this behavior in the same fashion on a RedHat (or other SysV-like OS) box by editing the /etc/nsswitch.conf file; the relevant line in this file might look like this:
hosts: files dns
Again, this tells the resolver library to first check the /etc/hosts file
and then contact a nameserver.
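To make the three kinds of lines concrete, here is a minimal Python sketch that reads resolv.conf-style text and collects the search list, the nameservers, and the OpenBSD lookup order. The function name parse_resolv_conf is hypothetical and the parser only handles the keywords discussed above; the real resolver library understands more.

```python
def parse_resolv_conf(text):
    """Parse resolv.conf-style text into a dict (illustrative sketch only)."""
    config = {"search": [], "nameservers": [], "lookup": []}
    for raw in text.splitlines():
        # Drop comments, which start with '#' or ';'.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        keyword, *values = line.split()
        if keyword == "search":
            config["search"] = values
        elif keyword == "nameserver" and values:
            config["nameservers"].append(values[0])
        elif keyword == "lookup":
            config["lookup"] = values  # OpenBSD-specific ordering
    return config


SAMPLE = """\
search cs.colorado.edu
nameserver 128.138.202.19
lookup file bind
"""

config = parse_resolv_conf(SAMPLE)
print(config)
```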
Both the OpenBSD and RedHat resolver libraries may be given a number of different options to control such things as the timeout length between unanswered queries and the number of attempts tried before giving up on a given server. Read the resolver manpages for details.
The tcpdump command is arguably the most important tool in the un*x network administrator's arsenal. It is used to capture and print out the header information for packets on a TCP/IP network. The command requires root privileges to place the network interface of the host (on which the tcpdump command is run) into promiscuous mode so that all packets seen by the host's interface will be captured and sent on to tcpdump for processing. Normally a network interface uses very fast hardware-level logic to compare a packet's destination link-layer address (i.e. ethernet) with its own, enabling it to swiftly reject packets or pass them on up to the kernel's link-layer device code for processing.
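The accept-or-reject decision made by the interface can be modelled in Python as follows. This is only a sketch of the logic: real interfaces do this in hardware and also handle multicast addresses, which are left out here, and the function name nic_accepts is made up for the example.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def nic_accepts(dest_mac, own_mac, promiscuous=False):
    """Model whether an interface passes a frame up to the kernel.

    Assumption: only unicast-to-self and broadcast are considered;
    multicast filtering is ignored in this sketch.
    """
    if promiscuous:
        return True  # promiscuous mode: every frame seen is passed up
    dest = dest_mac.lower()
    return dest == own_mac.lower() or dest == BROADCAST


own = "08:00:20:1f:46:d9"
other = "00:a0:24:15:57:b7"
print(nic_accepts(other, own))                    # frame for another host
print(nic_accepts(other, own, promiscuous=True))  # same frame, promiscuous
```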
By default, tcpdump will pick the first active network interface it finds to listen on for packets (with a simple host that has only one network interface tcpdump should always find it). To specify a particular device use the -i <device> argument. For example:
coatlicue % sudo tcpdump -i le1
Also by default, tcpdump will output header information about every packet
that is captured. On a busy subnet or a busy server host, this traffic can be
huge and it will scroll by incredibly quickly. Fortunately, tcpdump offers
a fairly simple but extensive set of keywords and logic operators so
that you can build arbitrarily complex filters to output only the packets
you want to see. For example, to only see ICMP packets:
coatlicue% sudo tcpdump -i le1 icmp
Password:
tcpdump: listening on le1
17:13:48.748089 xibalba > coatlicue: icmp: echo request
17:13:48.748757 coatlicue > xibalba: icmp: echo reply
17:13:49.742626 xibalba > coatlicue: icmp: echo request
17:13:49.743304 coatlicue > xibalba: icmp: echo reply
To avoid nameservice queries, one specifies the -n argument:
coatlicue% sudo tcpdump -i le1 -n icmp
tcpdump: listening on le1
17:15:35.558706 10.0.0.2 > 10.0.0.1: icmp: echo request
17:15:35.559403 10.0.0.1 > 10.0.0.2: icmp: echo reply
17:15:36.553549 10.0.0.2 > 10.0.0.1: icmp: echo request
17:15:36.554224 10.0.0.1 > 10.0.0.2: icmp: echo reply
You can specify logical conjunctions or disjunctions as well,
along with tcp or udp port number modifiers. Look what happens
to the output if we restrict on source IP address as well as ICMP
traffic only:
coatlicue% sudo tcpdump -i le1 -n icmp and ip src 10.0.0.1
tcpdump: listening on le1
17:20:45.275391 10.0.0.1 > 10.0.0.2: icmp: echo reply
17:20:46.267767 10.0.0.1 > 10.0.0.2: icmp: echo reply
17:20:47.267707 10.0.0.1 > 10.0.0.2: icmp: echo reply
17:20:48.267881 10.0.0.1 > 10.0.0.2: icmp: echo reply
Now we only see the ICMP echo reply packets, because the
echo request packets have a different source IP address.
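The way keywords combine into a filter can be illustrated with ordinary Python predicates. This is only an analogy for how "icmp and ip src 10.0.0.1" selects packets: the packet dictionaries and helper names below are invented for the example and have nothing to do with how tcpdump compiles its filters internally.

```python
packets = [
    {"proto": "icmp", "src": "10.0.0.2", "dst": "10.0.0.1", "info": "echo request"},
    {"proto": "icmp", "src": "10.0.0.1", "dst": "10.0.0.2", "info": "echo reply"},
    {"proto": "tcp", "src": "10.0.0.1", "dst": "10.0.0.2", "info": "port 22"},
]


def is_icmp(pkt):
    return pkt["proto"] == "icmp"


def src_is(addr):
    return lambda pkt: pkt["src"] == addr


def both(first, second):
    """Logical conjunction of two packet predicates ('and')."""
    return lambda pkt: first(pkt) and second(pkt)


icmp_from_gateway = both(is_icmp, src_is("10.0.0.1"))
matched = [pkt for pkt in packets if icmp_from_gateway(pkt)]
for pkt in matched:
    print(pkt["src"], ">", pkt["dst"], pkt["info"])
```

Only the echo reply survives: the echo request fails the source test and the TCP packet fails the protocol test.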
You can similarly restrict on a tcp or udp port number. For example, to see just SSH packets without any other restrictions, you could run:
coatlicue% sudo tcpdump -i le0 tcp port 22
To see only SSH packets to or from a specific host, you could run:
coatlicue% sudo tcpdump -i le0 tcp port 22 and ip src or dst 10.0.0.1
The Transmission Control Protocol is one of the most important, after IP, for conveying the vast majority of traffic on the internet. How does TCP offer reliable service when its underlying carrier, IP, is not reliable? What are port numbers? How is it that a TCP session is considered to be connected or stateful?
This section applies equally to the UDP protocol.
You can think of port numbers as if they were post-office boxes. To reach a given service, one sends mail addressed to a specific post-office-box number located in a certain town (i.e. IP address).
Server daemon processes are configured to listen on a specific port number (and to use either TCP or UDP). To contact a given server, one needs to transmit a packet (again, either TCP or UDP) with the correct destination port number set in the protocol's header; i.e. the port number on which the desired daemon process was configured to listen.
When you, for example, initiate an SSH session (which uses TCP by default) your SSH client program is first issued an arbitrary temporary TCP port number to use as its source port number. Then the SSH client sends a TCP packet to the destination host to start the session. You can use the -v flag to the ssh command to get verbose output that is useful for debugging a failing session. Take a look more closely at the output and notice the messages about (TCP) port numbers:
xibalba % ssh -v tlaloc.cs.colorado.edu
SSH Version 1.2.26-CSOps-vsnprintf-patched [i586-unknown-linux], protocol version 1.5.
Standard version. Does not use RSAREF.
xibalba: Reading configuration data /etc/ssh/ssh_config
xibalba: ssh_connect: getuid 500 geteuid 0 anon 0
xibalba: Connecting to tlaloc.cs.colorado.edu [128.138.243.136] port 22.
xibalba: Allocated local port 1021.
xibalba: Connection established.
...
The last three lines of output shown above are the most important for this
discussion. The first of these tells you the destination IP address and
(TCP) port number:
xibalba: Connecting to tlaloc.cs.colorado.edu [128.138.243.136] port 22.
The next line following that tells you your (arbitrary) source (TCP) port
number:
xibalba: Allocated local port 1021.
And then the last line shown above indicates that the TCP session has
been established . We'll look at what that means more closely in
the next section.
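The same pairing of a fixed server port with a temporary client port can be observed without SSH. The following Python sketch opens a single TCP connection over the loopback interface; the function name loopback_session is made up, and letting the kernel choose the server's port (port 0) is a choice made here so the example runs without root privileges.

```python
import socket


def loopback_session():
    """Open one TCP connection over loopback and report both sides' view."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.listen(1)
    server_addr = server.getsockname()

    # The client is handed a temporary source port when it connects.
    client = socket.create_connection(server_addr)
    conn, peer_seen_by_server = server.accept()

    client_local = client.getsockname()
    for sock in (conn, client, server):
        sock.close()
    return server_addr, client_local, peer_seen_by_server


server_addr, client_local, peer = loopback_session()
print("server listening on  ", server_addr)
print("client source address", client_local)
print("server sees peer as  ", peer)
```

The address the server reports for its peer is the client's source address and port, which is the analogue of the "Allocated local port" line in the ssh output.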
Port numbers, as you can see from the TCP and UDP header descriptions, are 16-bit quantities, meaning that they can range from 0 to 65535. Port numbers below 1024 are special: they can only be used by processes owned by root; that's one reason the arbitrary port numbers used by many different TCP-based client programs are often much greater than 1024.
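The arithmetic is easy to check. Here is a small Python sketch; the function name classify_port and the labels it returns are chosen for this example only.

```python
MAX_PORT = 2 ** 16 - 1  # largest value that fits in a 16-bit field


def classify_port(port):
    """Label a port as privileged (root only) or unprivileged."""
    if not 0 <= port <= MAX_PORT:
        raise ValueError(f"{port} does not fit in 16 bits")
    return "privileged" if port < 1024 else "unprivileged"


print(MAX_PORT)
for port in (22, 1021, 1024, 13955):
    print(port, classify_port(port))
```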
The port numbers on which many well-known services listen are listed in the file /etc/services ; I will often use grep to reference services in this file:
tlaloc % grep ftp /etc/services
ftp-data 20/tcp
ftp 21/tcp
tftp 69/udp
sftp 115/tcp
bftp 152/tcp # Background File Transfer Protocol
ftp-ftam 8868/tcp # FTP->FTAM Gateway
The ports for ftp-data and ftp (20 and 21 respectively) are both used
during a typical FTP session. The other ports listed are for various
FTP-like services; you would have to read manpages to figure those out.
Note that the /etc/services file simply lists common services and associated TCP or UDP port numbers. This file is not related in any way to the services that are actually available on a given host. To know what services are available on a given host you need to look at the system's startup files as well as the file inetd.conf. (I'll talk about the inetd daemon next week). You could also run a portscanner program to figure out which port numbers on a target host have daemon(s) listening on them.
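Because /etc/services is a plain lookup table, it is easy to read programmatically. The Python sketch below parses text in that format into a dictionary; the function name parse_services is hypothetical, aliases after the port field are ignored, and the sample text is the grep output shown earlier.

```python
def parse_services(text):
    """Map (service name, protocol) to a port number."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip trailing comments
        if not line:
            continue
        fields = line.split()
        name, port_proto = fields[0], fields[1]
        port, proto = port_proto.split("/")
        table[(name, proto)] = int(port)
    return table


SAMPLE = """\
ftp-data 20/tcp
ftp 21/tcp
tftp 69/udp
sftp 115/tcp
bftp 152/tcp # Background File Transfer Protocol
ftp-ftam 8868/tcp # FTP->FTAM Gateway
"""

services = parse_services(SAMPLE)
print(services[("ftp", "tcp")], services[("tftp", "udp")])
```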
First let's look more closely at the packets going back and forth for an existing SSH session using tcpdump. This is for a session running between my home PC xibalba (running RedHat) and tlaloc, a departmental admin host (used for backups and running *Solaris*). By the way, the -t flag to tcpdump tells it not to print timestamp information at the head of each output line.
coatlicue% sudo tcpdump -i le1 -t ip src or dst 10.0.0.2 and tcp port 22 and not ip src or dst 10.0.0.1
tcpdump: listening on le1
xibalba.1016 > tlaloc.cs.colorado.edu.ssh: P 3662642749:3662642769(20) ack 3232075105 win 32120 <nop,nop,timestamp 78368385 259077902> (DF) [tos 0x10]
tlaloc.cs.colorado.edu.ssh > xibalba.1016: P 1:21(20) ack 20 win 10136 <nop,nop,timestamp 259100459 78368385> (DF) [tos 0x10]
Here we can see more clearly how, in the first line, a TCP packet travelling from
the SSH client (on xibalba ) to the SSH server (on tlaloc ) has an arbitrary
TCP source port while the destination port is the well-known port 22 for SSH:
.. xibalba.1016 > tlaloc.cs.colorado.edu.ssh ..
And, conversely, how packets sent from the server back to the client have
a source port of 22 and the destination port is the "arbitrary" port:
.. tlaloc.cs.colorado.edu.ssh > xibalba.1016 ..
It is important to note that, once a TCP session has been established,
the so-called "arbitrary" port is definitely no longer arbitrary.
As we'll see below, it is an important part of the state that makes
up a TCP connection. It is only arbitrary when it is first given to
the client process. You might also notice that the client-side's port
number in this case is below 1024. The reason is that the SSH client
program has the setuid permissions bit set:
xibalba % ls -l /usr/local/ssh/bin/ssh1
-rws--x--x 1 root root 682884 Nov 2 1998 /usr/local/ssh/bin/ssh1*
^^^
.. meaning that the SSH client process is also owned by root.
There are a number of important items of information that each side
(i.e. client and server) of a TCP connection keeps track of.
This is part of what is known as the state of a TCP session.
xibalba.1021 > tlaloc.cs.colorado.edu.ssh: S 2201118588:2201118588(0) win 32120 <mss 1460,sackOK,timestamp 79021300[|tcp]> (DF)
tlaloc.cs.colorado.edu.ssh > xibalba.1021: S 4079345241:4079345241(0) ack 2201118589 win 10136 <nop,nop,timestamp 259753321 79021300,nop,[|tcp]> (DF)
The first packet is the session negotiating packet from the client.
Note the S flag that follows the destination host.port:
xibalba.1021 > tlaloc.cs.colorado.edu.ssh: S
^^^
The S indicates that the TCP packet has the SYN flag set. Following
the SYN flag indicator is the packet's TCP sequence number.
Following the sequence number in the second (return) packet from the
server is the ack keyword, indicating that the ACK flag was set,
followed by the acknowledgment number: the sequence number of the
packet being acknowledged plus one, since the SYN itself consumes one
sequence number.
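The relationship between the client's sequence number and the server's acknowledgment can be checked with a little arithmetic. In this Python sketch the function name expected_ack is made up; it relies on the fact that sequence numbers are 32-bit values that wrap around and that SYN and FIN flags each consume one sequence number.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def expected_ack(seq, payload_len, syn=False, fin=False):
    """Acknowledgment number a receiver sends back for one segment."""
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_MODULUS


client_isn = 2201118588  # sequence number in the client's SYN above
print(expected_ack(client_isn, 0, syn=True))
```

The result matches the "ack 2201118589" printed in the server's reply.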
Here is the general format of output from tcpdump for actual TCP packets:
timestamp src-IP.port > dest-IP.port flags seq-num ack window urgent options
For a more detailed description, see the tcpdump(1) manpage under "OUTPUT FORMAT".
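As an illustration of that format, the Python sketch below pulls the main fields out of one line of "tcpdump -t" output. The regular expression is an assumption based only on the sample lines in these notes (it expects the seq:end(len) form and ignores options), so treat it as a reading aid rather than a general parser; the names LINE_RE and parse_tcp_line are invented here.

```python
import re

LINE_RE = re.compile(
    r"(?P<src>\S+)\.(?P<sport>[^.\s]+) > "
    r"(?P<dst>\S+)\.(?P<dport>[^.\s:]+): "
    r"(?P<flags>[SFPR.]+) "
    r"(?P<seq>\d+):(?P<end>\d+)\((?P<len>\d+)\)"
    r"(?: ack (?P<ack>\d+))?"
    r" win (?P<win>\d+)"
)


def parse_tcp_line(line):
    """Return the fields of one tcpdump TCP line as a dict, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("seq", "end", "len", "ack", "win"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


SYN_ACK = (
    "tlaloc.cs.colorado.edu.ssh > xibalba.1021: S 4079345241:4079345241(0) "
    "ack 2201118589 win 10136 <nop,nop,timestamp 259753321 79021300,nop,[|tcp]> (DF)"
)
fields = parse_tcp_line(SYN_ACK)
print(fields)
```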
Finally, when a TCP session is terminated, each side exchanges packets
which have the FIN flag turned on.
To get a greater understanding of TCP, I refer the interested reader to W. Richard Stevens, "TCP/IP Illustrated, Volume 1: The Protocols", or one of several other texts available on the subject.
ipfilter=YES
ipnat=YES # for "YES" ipfilter must also be "YES"
ipfilter_rules=/etc/ipf.rules # Rules for IP packet filtering
ipnat_rules=/etc/ipnat.rules # Rules for Network Address Translation
The /etc/ipnat.rules file for coatlicue (my home network gateway)
looks like this:
coatlicue% cat /etc/ipnat.rules
# $Id: class08.txt,v 1.2 2000/04/17 22:57:24 tor Exp tor $
#
# See /usr/share/ipf/nat.1 for examples.
# edit the nat= line in /etc/rc.conf to enable Network Address Translation
# map internal 10-net addresses onto le0's address
map le0 10.0.0.0/24 -> le0/32 portmap tcp/udp 10000:20000
# for icmp inside -> outside
map le0 10.0.0.0/24 -> le0/32
# http -> xibalba
rdr le0 0.0.0.0/0 port http -> 10.0.0.2 port http
# ssh -> xibalba
rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port ssh
The first line says to map 10.0.0.0/24 subnet addresses for traffic out the le0 interface. The mapping mechanism makes use of the specified range of port numbers to perform the translations. (more on how it works below.) This mechanism only works for tcp and udp based connections.
The next (second) line allows other kinds of traffic (like icmp) to be mapped.
The last two lines are called redirects and they are used for mapping specific incoming traffic from the outside to be redirected to an internal IP address. The first of these simply redirects the http protocol (TCP port 80) while the second redirects tcp port 114 to an internal address, but changes the port number to be that of SSH (port 22).
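The two redirects amount to a small lookup table keyed on the incoming destination port. The Python sketch below models just that idea; the table and the function name redirect are invented for illustration and say nothing about how ipnat stores its rules.

```python
# Incoming destination port on le0 -> (internal address, internal port),
# mirroring the two rdr lines above (http is port 80, ssh is port 22).
REDIRECTS = {
    80: ("10.0.0.2", 80),
    114: ("10.0.0.2", 22),
}


def redirect(dst_port):
    """Return where inbound traffic is redirected, or None if no rule matches."""
    return REDIRECTS.get(dst_port)


print(redirect(114))
print(redirect(25))
```

So someone on the outside who connects to TCP port 114 on the gateway ends up talking to the SSH daemon on xibalba.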
There are excellent template files to work with in /usr/share/ipf . Most of the files in this directory have to do with ipf which is the IP firewall mechanism available with OpenBSD; there are some ipnat examples there as well, though. I simply copied the template files and changed some of the numbers.
The command ipnat is used to manipulate the rules used by NAT. For example, to delete (clear) the current rules and load new ones, you could do the following:
coatlicue % sudo ipnat -C
coatlicue % sudo ipnat -f /etc/ipnat.rules
You can also look at the current mappings:
coatlicue% sudo ipnat -l
List of active MAP/Redirect filters:
rdr le0 0.0.0.0/0 port 80 -> 10.0.0.2 port 80 tcp
rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port 22 tcp
map le0 10.0.0.0/24 -> 198.11.19.5/32 portmap tcp/udp 10000:20000
map le0 10.0.0.0/24 -> 198.11.19.5/32
Following the list of rules output by ipnat -l
you also get a list of active mappings:
List of active sessions:
MAP 10.0.0.2 0 <- -> 198.11.19.5 0 [128.138.238.18 0]
MAP 10.0.0.2 0 <- -> 198.11.19.5 0 [128.138.240.1 0]
MAP 10.0.0.2 1027 <- -> 198.11.19.5 13955 [128.138.129.76 53]
MAP 10.0.0.2 2022 <- -> 198.11.19.5 13946 [128.138.129.25 80]
MAP 10.0.0.2 1022 <- -> 198.11.19.5 11319 [128.138.243.135 22]
MAP 10.0.0.2 1023 <- -> 198.11.19.5 11318 [128.138.242.212 22]
MAP 10.0.0.10 2186 <- -> 198.11.19.5 13542 [204.71.200.67 80]
MAP 10.0.0.10 2155 <- -> 198.11.19.5 13511 [151.193.165.62 80]
MAP 10.0.0.10 2150 <- -> 198.11.19.5 13506 [151.193.165.62 80]
MAP 10.0.0.10 1600 <- -> 198.11.19.5 13332 [199.45.146.2 80]
MAP 10.0.0.10 1599 <- -> 198.11.19.5 13331 [199.45.146.2 80]
MAP 10.0.0.10 1598 <- -> 198.11.19.5 13330 [199.45.146.2 80]
MAP 10.0.0.10 1597 <- -> 198.11.19.5 13329 [199.45.146.2 80]
MAP 10.0.0.10 1595 <- -> 198.11.19.5 13327 [199.45.146.2 80]
MAP 10.0.0.10 2049 <- -> 198.11.19.5 13252 [192.156.134.160 21]
MAP 10.0.0.10 2054 <- -> 198.11.19.5 13142 [216.15.42.166 8265]
MAP 10.0.0.10 1301 <- -> 198.11.19.5 13094 [144.198.225.50 80]
MAP 10.0.0.10 1298 <- -> 198.11.19.5 13091 [144.198.225.50 80]
MAP 10.0.0.10 2048 <- -> 198.11.19.5 13014 [204.132.155.200 22]
MAP 10.0.0.10 2122 <- -> 198.11.19.5 12499 [216.65.3.233 80]
MAP 10.0.0.36 1086 <- -> 198.11.19.5 13269 [216.15.42.166 80]
MAP 10.0.0.36 1084 <- -> 198.11.19.5 13267 [216.15.42.166 80]
The format of a mapping is:
MAP <real-src-IP> <real-src-port> <- -> <nat-src-IP> <nat-src-port>
[ <dest-IP> <dest-port> ]
The first two mappings in the above table don't have port numbers
(which you can tell by the fact that they are listed as 0).
These are mappings associated with icmp.
The remaining are an assortment of TCP connections ranging from SSH and
HTTP to FTP.
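A table like this can be modelled with two dictionaries, one for each direction of translation. The Python sketch below is a toy model under stated assumptions: the class name PortMapNat is made up, ports are handed out sequentially from the configured range (the real ipnat clearly uses some other allocation order), and nothing ever expires.

```python
class PortMapNat:
    """Toy model of 'map ... portmap tcp/udp 10000:20000'."""

    def __init__(self, external_ip, first_port=10000, last_port=20000):
        self.external_ip = external_ip
        self.next_port = first_port
        self.last_port = last_port
        self.out = {}   # (src_ip, src_port, dst_ip, dst_port) -> nat port
        self.back = {}  # (nat port, dst_ip, dst_port) -> (src_ip, src_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Translate an internal source into (external IP, nat port)."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key not in self.out:
            if self.next_port > self.last_port:
                raise RuntimeError("port range exhausted")
            self.out[key] = self.next_port
            self.back[(self.next_port, dst_ip, dst_port)] = (src_ip, src_port)
            self.next_port += 1
        return self.external_ip, self.out[key]

    def inbound(self, nat_port, remote_ip, remote_port):
        """Find the internal host a reply belongs to, or None."""
        return self.back.get((nat_port, remote_ip, remote_port))


nat = PortMapNat("198.11.19.5")
mapped = nat.outbound("10.0.0.2", 1027, "128.138.129.76", 53)
print(mapped)
print(nat.inbound(mapped[1], "128.138.129.76", 53))
```

The remapped source port is what lets the gateway tell replies for different internal hosts apart, even when they talk to the same outside server.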
Now let's look at some tcpdump output to illustrate ipnat
in action!
These examples will all be originating from my internal host xibalba
which has an IP address of 10.0.0.2.
First, an example that uses port numbers. I'll use nslookup to demonstrate. Here is the query:
xibalba % nslookup freshmeat.net
Server: otis.Colorado.EDU
Address: 128.138.129.76
Non-authoritative answer:
Name: freshmeat.net
Addresses: 209.207.224.211, 209.207.224.212
And here is tcpdump output from each interface on coatlicue:
coatlicue% sudo tcpdump -n -i le1 port domain
tcpdump: listening on le1
10:07:09.350885 10.0.0.2.1027 > 128.138.129.76.53: 58849+ (45)
10:07:09.382786 128.138.129.76.53 > 10.0.0.2.1027: 58849* 1/4/5 (273) (DF)
10:07:09.385611 10.0.0.2.1027 > 128.138.129.76.53: 58850+ (31)
10:07:09.413888 128.138.129.76.53 > 10.0.0.2.1027: 58850 2/4/4 (210) (DF)
coatlicue% sudo tcpdump -n -i le0 port domain
tcpdump: listening on le0
10:07:09.351475 198.11.19.5.13959 > 128.138.129.76.53: 58849+ (45)
10:07:09.382185 128.138.129.76.53 > 198.11.19.5.13959: 58849* 1/4/5 (273) (DF)
10:07:09.386189 198.11.19.5.13959 > 128.138.129.76.53: 58850+ (31)
10:07:09.413280 128.138.129.76.53 > 198.11.19.5.13959: 58850 2/4/4 (210) (DF)
And now an example using ping. With this example, we'll see
a problem that you will encounter using NAT. First, here is
the ping command itself:
xibalba % ping boulder.colorado.edu
PING boulder.colorado.edu (128.138.240.1) from 10.0.0.2 : 56(84) bytes of data.
64 bytes from 128.138.240.1: icmp_seq=0 ttl=250 time=23.3 ms
64 bytes from 128.138.240.1: icmp_seq=1 ttl=250 time=34.0 ms
64 bytes from 128.138.240.1: icmp_seq=2 ttl=250 time=23.8 ms
And then look at the tcpdump output (again, listening on each
interface of the gateway box):
coatlicue% sudo tcpdump -i le1 -n icmp
tcpdump: listening on le1
09:29:40.032990 10.0.0.2 > 128.138.238.18: icmp: echo request
09:29:40.055660 128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
09:29:41.024305 10.0.0.2 > 128.138.238.18: icmp: echo request
09:29:41.047365 128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
09:29:42.024220 10.0.0.2 > 128.138.238.18: icmp: echo request
09:29:42.048341 128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
coatlicue% sudo tcpdump -i le0 -n icmp
tcpdump: listening on le0
198.11.19.5 > 128.138.238.18: icmp: echo request
128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
198.11.19.5 > 128.138.238.18: icmp: echo request
128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
198.11.19.5 > 128.138.238.18: icmp: echo request
128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
Now let's look at the problem that occurs when we immediately try
to run the same ping command, except this time from the gateway
itself:
coatlicue% ping boulder.colorado.edu
PING boulder.colorado.edu (128.138.240.1): 56 data bytes
--- boulder.colorado.edu ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
And the tcpdump output:
On the external interface:
198.11.19.5 > 128.138.238.18: icmp: echo request
128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
198.11.19.5 > 128.138.238.18: icmp: echo request
128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
198.11.19.5 > 128.138.238.18: icmp: echo request
128.138.238.18 > 198.11.19.5: icmp: echo reply (DF)
And on the internal interface:
128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
128.138.238.18 > 10.0.0.2: icmp: echo reply (DF)
Can you guess what is happening? Because there are no port numbers
to remap and thus keep track of similar connections, the
mapping that was previously made to allow the ICMP traffic to go
between xibalba and boulder is still sending that traffic
to xibalba! Doh! For small networks, this should not be a problem,
but on a larger scale, you can imagine it might be: if two different
folks on two different internal hosts try to ping the same
external host, one of them will have the problem.
The above problem occurs on my crufty old OpenBSD 2.5 gateway (I really need to upgrade :) running an equally outdated version of NAT. The problem does not occur under OpenBSD 2.6!
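The misdelivery can be reproduced with a toy model in Python. Everything here is an assumption made for illustration: the old behaviour is modelled as a table keyed on the remote address alone, and the alternative keys on the identifier field that ICMP echo packets carry. I have not checked how either version of the NAT code actually keys its ICMP mappings.

```python
GATEWAY = "198.11.19.5"


def reply_destination(table, key):
    """Deliver an inbound echo reply: a mapped internal host, else the gateway."""
    return table.get(key, GATEWAY)


# Modelled old behaviour: mapping keyed by remote address only, left over
# from xibalba's earlier ping.
by_address = {"128.138.238.18": "10.0.0.2"}
stolen = reply_destination(by_address, "128.138.238.18")

# Alternative: key on (remote address, echo identifier). The identifiers
# 0x1234 and 0x9999 are made-up values for the two ping processes.
by_identifier = {("128.138.238.18", 0x1234): "10.0.0.2"}
correct = reply_destination(by_identifier, ("128.138.238.18", 0x9999))

print("keyed by address   :", stolen)
print("keyed by identifier:", correct)
```

In the first case the reply to the gateway's own ping is handed to xibalba, which is exactly what the tcpdump output on the internal interface showed.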
Here is another interesting problem. This one occurred when my
gateway box received a DHCP address on boot, instead of having
a static IP address like it does now.
Here was my primary symptom:
on my pc running debian linux (10.0.0.2) I can no longer
get to the outside world unless I ssh first to my gateway
at (10.0.0.1) running sparc/openbsd
my gateway is forwarding the (icmp) packets just fine:
coatlicue% sudo tcpdump -i le1 -env icmp # 10.0.0.1 interface
tcpdump: listening on le1
0:a0:24:15:57:b7 8:0:20:1f:46:d9 0800 98: 10.0.0.2 > 128.138.238.18: icmp: echo request (ttl 64, id 39601)
coatlicue% sudo tcpdump -i le0 -env icmp # 198.11.19.11 interface
tcpdump: listening on le0
8:0:20:1f:46:d9 0:0:c:2b:e0:64 0800 98: 198.11.19.11 > 128.138.238.18: icmp: echo request (ttl 63, id 39601)
but no icmp pkts are being returned to xibalba. :(
however, from the gw it of course works:
8:0:20:1f:46:d9 0:0:c:2b:e0:64 0800 98: 198.11.19.39 > 128.138.238.18: icmp: echo request (ttl 255, id 4777)
0:0:c:2b:e0:64 8:0:20:1f:46:d9 0800 98: 128.138.238.18 > 198.11.19.39: icmp: echo reply (DF) (ttl 253, id 35817)
Woah, can you see the problem? In the first case - the routed pkt is sent with a src IP ending in .11 whereas in the latter case the src ip ends in .39
What happened?
The DSL connection dropped. I ifconfig'd the interface down, power-cycled the DSL modem and then ifconfig'd the interface back up. here's what the gw's routing table looks like:
coatlicue% netstat -rn
Routing tables
Internet:
Destination Gateway Flags Refs Use Mtu Interface
default 198.11.19.1 UGS 3 229 - le0
10.0.0/24 link#2 UC 0 0 - le1
10.0.0.1 127.0.0.1 UGHS 0 0 - lo0
10.0.0.2 0:a0:24:15:57:b7 UHL 3 15224 - le1
10.0.0.32 link#2 UHL 1 1 - le1
10.0.0.34 link#2 UHL 1 6434 - le1
127/8 127.0.0.1 UGRS 0 0 - lo0
127.0.0.1 127.0.0.1 UH 4 355 - lo0
198.11.19.0/25 link#1 UC 0 0 - le0
198.11.19.1 0:0:c:2b:e0:64 UHL 1 0 - le0
198.11.19.39 127.0.0.1 UGHS 0 0 - lo0
224/4 127.0.0.1 URS 1 15 - lo0
hmm. looks ok. I'm still confused maybe. now let's look at the ipnat
table:
coatlicue% sudo !!
sudo ipnat -l
List of active MAP/Redirect filters:
map le0 10.0.0.0/24 -> 198.11.19.11/32 portmap tcp/udp 10000:20000
rdr le0 128.125.138.150/32 port 20 -> 10.0.0.3 port 20 tcp
map le0 10.0.0.0/24 -> 198.11.19.11/32
rdr le0 0.0.0.0/0 port 80 -> 10.0.0.2 port 80 tcp
rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port 22 tcp
List of active sessions:
MAP 10.0.0.2 0 <- -> 198.11.19.11 0 [128.138.238.18 0]
MAP 10.0.0.2 1019 <- -> 198.11.19.11 13109 [206.168.118.178 22]
MAP 10.0.0.2 1020 <- -> 198.11.19.11 13108 [128.138.202.9 22]
MAP 10.0.0.2 1021 <- -> 198.11.19.11 13107 [128.138.192.220 22]
MAP 10.0.0.2 1023 <- -> 198.11.19.11 13106 [128.138.242.212 22]
MAP 10.0.0.2 1418 <- -> 198.11.19.11 13096 [128.138.242.195 80]
MAP 10.0.0.2 1412 <- -> 198.11.19.11 13089 [128.138.202.19 80]
MAP 10.0.0.2 1411 <- -> 198.11.19.11 13088 [128.138.202.19 80]
MAP 10.0.0.2 1409 <- -> 198.11.19.11 13038 [129.128.5.191 80]
MAP 10.0.0.2 1404 <- -> 198.11.19.11 13031 [128.138.192.84 26846]
MAP 10.0.0.2 1403 <- -> 198.11.19.11 13030 [128.138.192.84 19002]
MAP 10.0.0.2 1400 <- -> 198.11.19.11 13023 [128.138.242.195 80]
MAP 10.0.0.2 1399 <- -> 198.11.19.11 13022 [128.138.242.195 80]
MAP 10.0.0.2 1397 <- -> 198.11.19.11 13020 [128.138.242.195 80]
MAP 10.0.0.2 1346 <- -> 198.11.19.11 12947 [206.25.182.132 80]
MAP 10.0.0.2 1332 <- -> 198.11.19.11 12933 [206.25.182.132 80]
MAP 10.0.0.2 1022 <- -> 198.11.19.11 12753 [128.138.243.135 22]
MAP 10.0.0.10 2049 <- -> 198.11.19.11 19753 [128.138.150.15 22]
AHA. here is the problem. I need to do the following:
coatlicue% sudo ipnat -C
7 entries flushed from NAT list
coatlicue% sudo ipnat -l
List of active MAP/Redirect filters:
List of active sessions:
coatlicue% sudo ipnat -f /etc/ipnat.rules
coatlicue% sudo ipnat -l
List of active MAP/Redirect filters:
map le0 10.0.0.0/24 -> 198.11.19.39/32 portmap tcp/udp 10000:20000
map le0 10.0.0.0/24 -> 198.11.19.39/32
rdr le0 0.0.0.0/0 port 80 -> 10.0.0.2 port 80 tcp
rdr le0 0.0.0.0/0 port 114 -> 10.0.0.2 port 22 tcp
List of active sessions:
coatlicue%
We needed to flush the old table and re-read the ipnat.rules file. Now ping should work again, and it does. The problem was that the NAT mappings still assumed that the gateway's external address was 198.11.19.11 when it had changed to 198.11.19.39!
If you are setting up a home gateway, chances are that you will also
need to set up an IP firewall. We looked at such a firewall briefly
(and it was included with the handout for class seven) for a Cisco
router. OpenBSD uses a facility known as IPfilter that is very
similar in concept to the techniques used on the Cisco.
Again, there are excellent and well-commented examples found
in the /usr/share/ipf directory on an OpenBSD machine.
We will revisit ipf in class 10.
<some chunk of ethernet with five hosts attached to it>
===+=============+==============+==============+==============+====
| | | | |
| | <- drop ---> | | |
| | cables | | |
+-+-+ +-+-+ +-+-+ +-+-+ +-+-+
| A | | B | | C | | D | | E |
+---+ +---+ +---+ +---+ +---+
Modern ethernet installations don't use the bulky co-axial thick-net cable.
The function of that cable has been collapsed inside a hub to which
drop cables are directly connected:
+---------------+
| HUB |
+-+--+--+--+--+-+
| | | | |
| | | | +----------------------------------------------+
| | | +----------------------------------+ |
| | +----------------------+ | |
| +----------+ | | |
| | | | |
| | <- drop ---> | | |
| | cables | | |
+-+-+ +-+-+ +-+-+ +-+-+ +-+-+
| A | | B | | C | | D | | E |
+---+ +---+ +---+ +---+ +---+
In all other respects, a hub can be treated the same: all ethernet
transmissions made by any one host plugged into the hub are heard by all
other hosts plugged into the hub.
Hubs may be cascaded either using a special uplink port on one of the hubs or by using a special 10baseT drop cable called a crossover cable that has the data transmit and receive pins flipped (similar in concept to a null modem serial cable).
There are important length restrictions for various types of ethernet (i.e. for 10baseT over copper, versus 100baseT over copper, versus 100baseT over fiber optic cable). See 'http://www.uwsg.indiana.edu/usail/external/ethernet/ethernet-guide.html' for specific details.
Up till now we have been talking about very simple routing for hosts with only one network interface and a default route. Setting up routing information for such hosts involves manually editing certain files (as we have seen) and changes are only required when there are changes to the subnet to which such hosts are attached. Routing of this nature is known as static routing because the route information is configured by hand or during system startup through invocations of the route command. Static routing is useful for small, stable environments.
For more complex environments dynamic routing may be required. As its name may imply, this style of routing requires special routing daemons which update a host's routing tables as the network environment changes. A primitive and ubiquitous routing daemon that is shipped with most every version of Un*x is called routed and makes use of a routing protocol called RIP. We will look at routed and a more powerful daemon, gated.
It should be reiterated that routing is a network layer concept. Routers forward packets on to their proper destination based solely on the packets' destination IP addresses and the contents of the machine's routing table.
My private home network is a good example of where static routing makes sense. Here is a picture of the topology of it:
198.11.19.5 +-----------+
=---- DSL ----------+ coatlicue |
198.11.19.1 | BSD sparc |
+-+---------+
| 10.0.0.1
| /----- [ laptop drop ]
+--+----+ /
(upstairs)| 10b/T +---- 10.0.0.2+----------+
| Hub +---------------+ xibalba |
+--+----+ | RH pc |
| +----------+
|
+--+----+
(downstairs)| 10b/T | 10.0.0.10+-----------+
| Hub +---------------+ Macintosh |
+-------+ +-----------+
Because the network is very small and the IP information almost never
changes, it would be absurd to use anything but static routing.
Let's have another look at the routing table on the gateway host coatlicue :
coatlicue% netstat -rn
Routing tables
Internet:
Destination Gateway Flags Refs Use Mtu Interface
default 198.11.19.1 UGS 5 187476 - le0
10.0.0/24 link#2 UC 0 0 - le1
10.0.0.1 127.0.0.1 UGHS 0 13 - lo0
10.0.0.2 0:a0:24:15:57:b7 UHL 4 158520 - le1
10.0.0.10 0:0:94:ad:97:8f UHL 0 3121 - le1
10.0.0.36 link#2 UHL 2 54 - le1
127/8 127.0.0.1 UGRS 0 0 - lo0
127.0.0.1 127.0.0.1 UH 2 0 - lo0
198.11.19.0/25 link#1 UC 0 0 - le0
198.11.19.1 0:0:c:c:2d:d0 UHL 1 2 - le0
224/4 127.0.0.1 URS 1 12 - lo0
For now, don't worry about the fact that I also run NAT which allows many
private addresses to be mapped into a single real IP address that can be
used on the internet. We'll look at NAT later.
The gateway host has two network interfaces: one for the internal network and one for the connection to the internet. You can see that there are routes in the table for each subnet that coatlicue is attached to:
10.0.0/24 link#2 UC 0 0 - le1
198.11.19.0/25 link#1 UC 0 0 - le0
You can also see the default route:
default 198.11.19.1 UGS 5 187476 - le0
When the machine receives a packet on one interface,
it checks the destination IP address of the packet
and forwards the packet out its other interface
if it knows that the destination IP can be reached
through that interface.
Here is the ifconfig information for coatlicue:
coatlicue% ifconfig -a
lo0: flags=8009<UP,LOOPBACK,MULTICAST>
inet 127.0.0.1 netmask 0xff000000
lo1: flags=8008<LOOPBACK,MULTICAST>
le0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST>
media: Ethernet 10baseT
inet 198.11.19.5 netmask 0xffffff80 broadcast 198.11.19.127
le1: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST>
media: Ethernet 10baseT
inet 10.0.0.1 netmask 0xffffff00 broadcast 10.0.0.255
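The hexadecimal netmasks in this output correspond directly to the prefix lengths in the routing table. Python's standard ipaddress module can do the conversion; the helper name describe is made up for this sketch.

```python
import ipaddress


def describe(addr, hex_mask):
    """Return (network, broadcast address) for an ifconfig-style hex netmask."""
    mask = ipaddress.IPv4Address(int(hex_mask, 16))
    iface = ipaddress.IPv4Interface(f"{addr}/{mask}")
    return iface.network, iface.network.broadcast_address


for addr, mask in (("198.11.19.5", "0xffffff80"), ("10.0.0.1", "0xffffff00")):
    network, broadcast = describe(addr, mask)
    print(addr, "is on", network, "with broadcast", broadcast)
```

The results agree with the broadcast addresses ifconfig reports and with the 198.11.19.0/25 and 10.0.0/24 entries in the routing table.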
Now look at the configuration information for the internal host xibalba:
xibalba % ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:A0:24:15:57:B7
inet addr:10.0.0.2 Bcast:10.0.0.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
...
xibalba % netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
10.0.0.2 0.0.0.0 255.255.255.255 UH 0 0 0 eth0
10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
127.0.0.0 0.0.0.0 255.0.0.0 U 0 0 0 lo
0.0.0.0 10.0.0.1 0.0.0.0 UG 0 0 0 eth0
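A gateway also needs IP forwarding enabled in the kernel. On OpenBSD this is the following sysctl setting (normally placed in /etc/sysctl.conf):
net.inet.ip.forwarding=1 # 1=Permit forwarding (routing) of packets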
The process for RedHat also necessitates enabling IP forwarding
in the kernel; however, this usually requires that the kernel be
recompiled to include the configuration change.
See: http://www.linuxdoc.org/HOWTO/Kernel-HOWTO.html for details
about rebuilding the Linux kernel.
Let's re-examine the concept of the default route.
First, in general, when a gateway machine receives a packet on
one of its interfaces it will consult its routing table and
forward the packet based on the best match of the packet's
destination IP address with the information in the table.
If no match can be made then the packet is sent via the machine's
default route, with the hope that a router further on will know
what to do with the packet. With my default route in place, for
example, I can reach bfs from coatlicue:
coatlicue % ssh 128.138.202.9
Host key not found from the list of known hosts.
Are you sure you want to continue connecting (yes/no)? yes
...
bfs:tor %
Now, this of course fails to work if I remove my default route:
coatlicue% sudo route delete default
delete net default
coatlicue% ssh 128.138.202.9
Secure connection to 128.138.202.9 refused; reverting to insecure method.
Using rsh. WARNING: Connection will not be encrypted.
128.138.202.9: No route to host
SSH tries to connect to bfs on TCP port 22 which fails,
then SSH falls back to try the RSH protocol which also fails.
Then SSH tells us: No route to host. We are not
at all surprised to see this error message, having just nuked
our default route from the routing table...
Instead of adding the default route back into the routing table, let's add a specific route to the 128.138.202 subnet:
coatlicue% sudo route add -net 128.138.202.0/24 198.11.19.1
add net 128.138.202.0: gateway 198.11.19.1
Now we can SSH to bfs again just fine:
coatlicue% ssh 128.138.202.9
Last login: Mon Feb 28 22:11:25 2000 from coatlicue.colora
...
bfs %
However, we can't get anywhere else:
coatlicue% ssh 128.138.192.205
Secure connection to 128.138.192.205 refused; reverting to insecure method.
Using rsh. WARNING: Connection will not be encrypted.
128.138.192.205: No route to host
I'll restore my default route so I can get some work done...
coatlicue% sudo route delete -net 128.138.202 198.11.19.1
delete net 128.138.202: gateway 198.11.19.1
coatlicue% sudo route add default 198.11.19.1
add net default: gateway 198.11.19.1
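The lookup described above (best match first, default route as a last resort) can be sketched in a few lines of Python. This is only an illustration using the standard ipaddress module: the ROUTES list is a trimmed copy of coatlicue's table, and lookup is a made-up helper name, not how the kernel actually implements forwarding.

```python
import ipaddress

# Trimmed copy of coatlicue's routing table: (destination, gateway, interface).
# "0.0.0.0/0" stands in for the "default" entry.
ROUTES = [
    ("0.0.0.0/0", "198.11.19.1", "le0"),
    ("10.0.0.0/24", "link#2", "le1"),
    ("198.11.19.0/25", "link#1", "le0"),
]


def lookup(dest, routes):
    """Return (network, gateway, interface) for the most specific match, or None."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, gateway, iface in routes:
        net = ipaddress.ip_network(prefix)
        # A longer prefix is a more specific route, so it wins over a shorter one.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, iface)
    return best


if __name__ == "__main__":
    print(lookup("10.0.0.2", ROUTES))        # matched by the internal subnet
    print(lookup("128.138.202.9", ROUTES))   # falls through to the default route
    print(lookup("128.138.202.9", ROUTES[1:]))  # no default route: no match
```

With the default entry dropped, lookup returns None for 128.138.202.9, which corresponds to the "No route to host" case above. Adding a 128.138.202.0/24 entry fixes that one destination and nothing else.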
When we move into the realm of larger and more complex network topologies, we also begin to deal more directly with connecting to the internet. From the point of view of the 'net, the world consists of large entities called autonomous systems. Autonomous Systems (AS) are basically high-level domains that have individually maintained interior routing policies and protocols and that interact with other ASes using different, mutually agreed upon exterior routing policies and protocols.
For example, the University of Colorado is basically an AS. It has an internal routing policy among the campus backbone routers and (basically) a single route to the rest of the internet. The border router between campus and the internet cooperates with the internal routing policy on its internal network connection and cooperates with an external routing policy that is coordinated between the network administrators of neighboring Autonomous Systems and their border routers.
The concept of the Autonomous System can actually be applied recursively within a given AS. (Take the above example, for example :) The CS dept domain, cs.colorado.edu , can be looked at as a mini-AS within the (greater) colorado.edu domain. Internal to the CS department networks we have an internal routing policy and mechanism, while our border router also obeys the agreed upon routing policy and mechanism that is in use on the campus backbone .
Given the above, you will not be surprised to know that dynamic routing protocols are divided into two broad categories: interior routing protocols, used within an AS, and exterior routing protocols, used between ASes.
I'm not going to discuss any of the exterior routing protocols (e.g. EGP or BGP). If you find yourself in a situation requiring this knowledge, you'll have to get a good book about it anyway! The above mentioned O'Reilly book is a great place to start. Additionally, configuration details will inevitably be specific to the particular router hardware on which you are working.
The routing decisions made by both simple network hosts and by
large border routers are based solely on the contents of the particular
machine's routing table (as I have already noted).
For the simple host, there is not much involved: decide whether a
packet is destined for the local host, the local subnet, or the
default route. For border routers the decisions are much more
complex. Have a look at the routing table for the CS department's
border router, a cisco 7000 :
gw#show ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
U - per-user static route
Gateway of last resort is 128.138.80.141 to network 0.0.0.0
128.138.0.0/16 is variably subnetted, 242 subnets, 8 masks
O 128.138.80.88/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O IA 128.138.0.0/19 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.80/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
S 128.138.192.192/26 [1/0] via 128.138.243.194
O 128.138.1.0/24 [110/13] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.84/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.72/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.76/30 [110/3] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.68/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O IA 128.138.40.0/25 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O IA 128.138.233.192/26 [110/15] via 128.138.80.141, 1d02h, FastEthernet4/1
O IA 128.138.34.0/25 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
....
I truncated the list (there are about 320 routes in the table).
You will see that most of the routes are actually reachable via the
router's "Gateway of last resort" (cisco's name for the default route).
The routing protocol by which a given route was added to the table
is indicated by the capital letter at the far left.
You can imagine that a border router for one of the large internet carriers like MCI or Sprint would have an absolutely huge routing table!
The traceroute command is used to show the routers a packet will travel through to get to a given destination. For example:
xibalba % traceroute bfs
traceroute to bfs.cs.colorado.edu (128.138.202.9), 30 hops max, 38 byte packets
1 coatlicue (10.0.0.1) 1.308 ms 1.194 ms 1.150 ms
2 its-dsl1.Colorado.EDU (198.11.19.1) 19.669 ms 20.740 ms 28.076 ms
3 hut-its-7206.Colorado.EDU (128.138.80.33) 21.714 ms 21.315 ms 20.827 ms
4 engr-hut.Colorado.EDU (128.138.80.202) 21.750 ms 20.623 ms 21.492 ms
5 cs-gw.Colorado.EDU (128.138.80.142) 23.105 ms 22.042 ms 21.432 ms
6 bfs.cs.colorado.edu (128.138.202.9) 23.753 ms * 34.424 ms
Each line of output from traceroute (except for the last) is a router
in between my home PC xibalba and the CSEL lab server bfs .
The first hop is my gateway box coatlicue as we would expect. The
second hop is also not too surprising: it is my gateway's default
route. From there we hop to two other routers in the campus backbone
before reaching the CS department's border router and finally bfs itself.
Traceroute is very useful for diagnosing routing problems, as you can probably imagine.
RIP is probably the most commonly used interior routing protocol. The protocol is implemented as a daemon called routed ("route-dee") that is shipped with virtually every version of Un*x. RIP makes use of the UDP transport protocol to convey messages between RIP-aware servers.
When the routed program starts up it solicits routing information from any neighboring routers. That is, a RIP query packet is broadcast on each subnet to which the host running routed is connected. Any other hosts running routed that hear the query will respond with RIP response packets listing the routes they know. During normal operation, RIP servers will continue to send update packets to all listening servers. The theory is, if a particular RIP server fails to send updates for X amount of time, then the routes which were previously advertised by that server are assumed to be broken and are removed from the routing tables. (X is usually 180 seconds.)
Simple network hosts with only a single interface and default route can run routed -q . The -q flag to routed tells it to run in quiet mode meaning that routed will never advertise routes, it will only listen to the routes advertised by other RIP servers. It is bad form to run routed on such client hosts without the -q flag.
The -s flag to routed is the opposite of -q and it is also the default (so it doesn't need to be specified) when routed is run on a host with more than one network interface.
A RIP server assigns a metric or cost to each route it advertises. The measurement is also called a hop-count and it basically tells the number of routers that will be traversed between this server and the advertised destination. If two routes are received that have the same destination, routed will only keep the one with the lowest cost. (This is a method used to avoid some kinds of routing loops ).
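Here is a rough sketch of the keep-the-lowest-cost rule, assuming the routing table is held in a Python dict. The function name merge_advertisement and the table layout are hypothetical. Real routed also runs timers and treats updates from the current next hop specially, which this sketch ignores. The value 16 is the metric RIP uses to mean "unreachable".

```python
INFINITY = 16  # in RIP a metric of 16 means the destination is unreachable


def merge_advertisement(table, neighbor, advertised):
    """Merge the routes advertised by one neighbor into our table.

    table maps destination -> (metric, next_hop).
    advertised maps destination -> metric as seen by the neighbor.
    """
    for dest, metric in advertised.items():
        # Going through the neighbor costs one more hop than the neighbor's own cost.
        cost = min(metric + 1, INFINITY)
        if cost >= INFINITY:
            continue  # unreachable via this neighbor, nothing to install
        current = table.get(dest)
        # Keep only the cheapest known route to each destination.
        if current is None or cost < current[0]:
            table[dest] = (cost, neighbor)
    return table


if __name__ == "__main__":
    table = {"10.0.0.0/24": (1, "direct")}
    merge_advertisement(table, "198.11.19.1", {"128.138.202.0/24": 3})
    print(table)
```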
You can preconfigure the routing information that routed will start with by editing the file /etc/gateways . One reason you may need to do something like this is if a particular route is known to exist, but it does not advertise, so routed can never learn it during normal operation. There are two basic types of entries allowed in this file: routes for hosts and routes for nets . Here are the formats:
net Nname[/mask] gateway Gname metric value <passive | active | extern>
host Hname gateway Gname metric value <passive | active | extern>
For a network route, one needs to specify the subnet and netmask ,
the gateway , the cost and a special tag at the end that
tells whether the route is passive , active or external .
For example, if I were to set up RIP on my gateway coatlicue , I might add a line as follows to /etc/gateways:
net 0.0.0.0 gateway 198.11.19.1 metric 1 passive
The 0.0.0.0 address is routed's way of saying default route .
The keyword passive is used to inform routed that the indicated
gateway will not provide RIP updates about its status. The keyword
must be specified in such cases so that routed doesn't delete the
route from the routing tables (which it does when it doesn't
get updates from a given gateway). Entries which have the active
keyword are almost unnecessary, as the assumption is that routed
will start receiving updates about such routes anyway.
RIP is subject to a number of different problems, most of which have been solved with either subtle implementation changes or a new revision of RIP called RIPv2.
These are the CIDR house rules :)
The CIDR specification arose as an attempt to deal with the fact that while there were many unused IP addresses, they were unavailable due to traditional class boundaries. With CIDR, these traditional class A, B and C boundaries are abolished. The point is simply that a network can have an arbitrary number of bits in the IP address devoted to the network portion, instead of the traditional 8 bits for a class A address, 16 bits for a class B, and so on. This is where the /N convention arose:
128.138.202.0/24 designates 24 bits for the network, 8 for the host
128.138.192.192/26 designates 26 bits for the net, 6 for the host
You already know that we subdivide many of our CS dept nets into 6-bit
networks. Take a look at this snippet from the /etc/networks file:
cu-cs-capp 128.138.242.0 # [6] CS CAPP Lab (lynda.mcginley)
cu-cs-serl 128.138.242.64 # [6] CS SW Engring Rsrch Lab (lynda.mcginley)
cu-cs-cappfast 128.138.242.128 # [6] CS CAPP Fast Ethernet (lynda.mcginley)
cu-cs-fs 128.138.242.192 # [6] CS Fast Servers (lynda.mcginley)
The /etc/networks file gives you the symbolic name of the network and the
IP address of the network.
Here at CU, we also have a convention of placing the number of
host-bits as the first bit of information in the comment following each
entry. Anyway, you can see that what could have been a single 24-bit
(i.e. 8 host bits) network has been divided into four 26-bit networks
(i.e. 6 host bits). The netmask for a 26-bit network is 255.255.255.192 .
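If you want to check this arithmetic yourself, Python's standard ipaddress module can split the 24-bit network into its four 26-bit pieces. This is just a sketch to illustrate the numbers above; the variable names are arbitrary.

```python
import ipaddress

block = ipaddress.ip_network("128.138.242.0/24")

# Split the /24 into /26 networks (6 host bits each).
subnets = list(block.subnets(new_prefix=26))

for net in subnets:
    # num_addresses includes the network and broadcast addresses,
    # so subtract 2 to get the number of usable host addresses.
    print(net, net.netmask, net.num_addresses - 2, "usable hosts")
```

The four networks it prints start at .0, .64, .128 and .192, matching the /etc/networks entries above.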
Another feature of CIDR is that one can aggregate networks together into a single entry in a routing table (when those networks are contiguous with one another and reachable by the same gateway). The above four subnets are all connected to our internal router named cs-gw3. Look at the single static routing entry on our external router (cs-gw) for these nets:
ip route 128.138.242.0 255.255.255.0 128.138.243.194 100
The four 26-bit nets have been aggregated together into a single 24-bit net entry
in the routing table. This is a 4:1 reduction, helping our routing table to be much
smaller.
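The aggregation itself can be illustrated with the collapse_addresses function from Python's standard ipaddress module. This is only a sketch of the idea; a router is configured with the aggregate by hand (as in the ip route line above) rather than computing it this way.

```python
import ipaddress

# The four contiguous 26-bit networks behind cs-gw3.
nets = [
    ipaddress.ip_network("128.138.242.0/26"),
    ipaddress.ip_network("128.138.242.64/26"),
    ipaddress.ip_network("128.138.242.128/26"),
    ipaddress.ip_network("128.138.242.192/26"),
]

# collapse_addresses merges adjacent networks into the fewest possible prefixes.
aggregated = list(ipaddress.collapse_addresses(nets))

for net in aggregated:
    print(net, net.netmask)
```

If one of the four /26 networks were missing, the remaining three could not be expressed as a single prefix, which is why the networks must be contiguous for this to work.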
The Open Shortest Path First protocol was developed to deal with many of the shortcomings of RIP. It is in a class of protocols named for the Shortest Path First algorithm used to choose optimal routes. The protocol is also known as a link state protocol. The link-state concept differs from RIP's distance-vector mechanism in that OSPF will compute a link-state based primarily on whether or not a gateway is actually functioning. Other variables that make up the link-state include the round trip time for a packet to reach a given gateway as well as the MTU of the link; if two routes are known to reach the same destination, then the route which is fatter and faster will be considered optimal. OSPF is also capable of load balancing (aka "equal-cost multi-path" routing) whereby heavy volumes of traffic can be distributed over a number of routes to reduce the load over each route. Don't forget that OSPF is also an interior routing protocol.
An important concept for the operation of OSPF is that of the area . There are two main kinds: stub areas and backbone areas. Stub areas have only a single gateway, and this is usually a border router that also sits in a backbone area.
Each OSPF router figures out the link-state of its immediately connected routes and floods this information to every other OSPF router in the system. Then each router builds a directed graph of the network topology from its own point of view. The graph is then pruned into a tree of shortest paths using the SPF algorithm. Because each OSPF router maintains this link-state database, and because of the way the routers communicate with each other, a network using OSPF converges much more quickly than one using RIP when existing routes fail or new routes appear.
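The SPF computation is Dijkstra's shortest-path algorithm. Below is a small sketch of it in Python. The four-router topology (A through D) and its link costs are invented for the example, and a real OSPF implementation works from its link-state database rather than a hand-written dict.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's algorithm.

    graph maps router -> {neighbor: link cost}.
    Returns (dist, prev): the cost to reach each router from source,
    and the previous hop on the best path to it.
    """
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbor, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    return dist, prev


# Hypothetical topology: each entry is a router and its directly connected links.
TOPOLOGY = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}

if __name__ == "__main__":
    dist, prev = shortest_paths(TOPOLOGY, "A")
    print(dist)
    print(prev)
```

The prev mapping is the pruned tree: following it back from any router gives the path that router A would install in its routing table.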
Let's revisit the routing table on the CS department router cs-gw :
Gateway of last resort is 128.138.80.141 to network 0.0.0.0
128.138.0.0/16 is variably subnetted, 242 subnets, 8 masks
O 128.138.80.88/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O IA 128.138.0.0/19 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.80/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
S 128.138.192.192/26 [1/0] via 128.138.243.194
O 128.138.1.0/24 [110/13] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.84/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.72/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.76/30 [110/3] via 128.138.80.141, 1d02h, FastEthernet4/1
O 128.138.80.68/30 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O IA 128.138.40.0/25 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
O IA 128.138.233.192/26 [110/15] via 128.138.80.141, 1d02h, FastEthernet4/1
O IA 128.138.34.0/25 [110/5] via 128.138.80.141, 1d02h, FastEthernet4/1
....
Notice the second line above:
128.138.0.0/16 is variably subnetted, 242 subnets, 8 masks
This information is derivable from the OSPF link-state database.
The other routes in the above table that were installed by OSPF
(which you can tell by the O at the far left of each entry
in the table) are some of those 242 subnets; the fact that they
all are reached via the default route ( 128.138.80.141 )
is basically coincidental. It is a combination of the fact that
the CS network is a stub OSPF area and the specific configuration
(i.e. the topology ) of the colorado.edu network overall.
Notice that many of the routes shown above are to /30 subnets. Each of these subnets has a netmask of 255.255.255.252 which leaves only two bits for the host. Basically, a /30 subnet has only two usable IP addresses. For example, the default route for cs-gw is to 128.138.80.141, which is also on a /30 subnet:
128.138.80.140 this is the subnet
128.138.80.141 this is an ITS (CU) backbone router
128.138.80.142 this is the CS dept router (cs-gw)
128.138.80.143 this is the IP broadcast address for the subnet
Given that a /30 subnet has only two bits for the host, it is
commonly used for point-to-point like connections between routers.
One end of the connection gets the first usable address, the other
end gets the other usable address. To reiterate, with only two bits
for possible host identification (i.e. bits number 30 and 31 of the 32
bit IP address):
bit-30 bit-31 desc
---- ---- ----
0 0 defines the particular subnet
0 1 first usable address
1 0 second usable address
1 1 IP broadcast address for the subnet
Of course, the IP broadcast address is kind of moot with a link
like this, but it comes along for the ride because it's part of the
IP-subnet concept.
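The same breakdown can be reproduced with Python's standard ipaddress module. This is just a sketch to check the table above against the cs-gw link; the variable names are arbitrary.

```python
import ipaddress

link = ipaddress.ip_network("128.138.80.140/30")

for addr in link:
    host_bits = int(addr) & 0b11  # the last two bits of the 32-bit address
    print(addr, format(host_bits, "02b"))

# hosts() leaves out the subnet address and the broadcast address.
usable = [str(a) for a in link.hosts()]
print("usable:", usable)
print("broadcast:", link.broadcast_address)
```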
These days most complex routing situations are dealt with using dedicated router hardware. If you instead want to use a Un*x box for this purpose then you will most likely have to obtain and configure gated . The gated package is quite complex and very powerful, as it can make use of many different routing protocols. In fact, you can configure gated to listen to information from many different routing protocol servers and compute a preference value for each based on your configuration instructions. Routes with lower preference values are selected first.
Gated can converse with the following routing protocols:
You configure gated by editing the file /etc/gated.conf . The file is divided into six major sections which must appear in the following order:
The following is a sample gated configuration file in use in the CS department. It is on a machine that we call a backup router. It is important to note that we don't use OSPF on our internal networks, so this example of gated is only dealing with RIP.
#
# GateD configuration file for the CSOps subnet
#
# $Header: /home/fcsk/tor/working/saclass/RCS/class07.txt,v 1.2 2000/03/05 22:22:53 tor Exp $
#
# THIS FILE IS UNDER RCS! The master copy resides in
# /csops/private/config/gated/suod
#
##############################################################################
#
# Ensure GateD comes up if DNS is hosed
options noresolv ;
options syslog upto warning ;
#
# Prevent any interfaces from being marked `down'
interfaces {
interface all passive ;
# Prefer the leaf interface
interface 128.138.192.205 preference 10 ;
interface 128.138.243.135 preference 15 ;
};
#
# Don't flush the routing table (DOH!)
kernel {
options noflushatexit ;
} ;
#
# Configure RIP
rip yes {
# Add 3 to all advertisements
interface all metricout 3 ;
# Force RIP v.1 advertisements
interface all version 1 ;
# List of routers to accept RIP info from
trustedgateways
# Backbone Routers, 243 interface
128.138.243.129
128.138.243.131
128.138.243.135
128.138.243.137
128.138.243.138
128.138.243.140
128.138.243.143
128.138.243.144
128.138.243.167
128.138.243.171
# CSOps
128.138.192.193
128.138.192.197
128.138.192.205
;
traceoptions
policy
request
response
other
;
} ;
#
# Filter RIP announcements
export proto rip interface 128.138.243.135 {
proto direct {
} ;
} ;
#
# Logging and tracing and other options
traceoptions "/var/log/gated.log" replace size 100k files 3
# choose from below (at least one must be uncommented)
#parse
#adv
#symbols
#iflist
#general
#state
#normal
#policy
#task
#timer
#route
none
;
# end
As you know, most of our departmental subnets connect to a single router (cs-gw3). This router is potentially a single point of failure in that if it failed, communication from any of the subnets to any other or to the outside world is effectively cut off. To deal with such a problem we have a number of special Un*x hosts that each have network interfaces on a shared subnet as well as a second interface on a different subnet. We set up gated on each of these hosts so that if the default route through the cisco router (i.e. cs-gw3) goes down, then they start routing traffic instead. We call these hosts backup routers.
Of course, traffic to the outside world may still not work in this case. Having a redundant link to the rest of the internet is a whole other issue: do you contract with a second ISP? Do you then divide all your traffic between the two routes to the outside world? Cost is a big factor here, of course, not to mention the routing considerations. Some of the questions that come up regarding routing under such circumstances: can traffic simply be split in half for load-balancing? What if the two routes are not actually to the same destination (which is good for robust connectivity)? Local routers will have to understand the external topology somewhat in order to route packets efficiently...
You can use the traceroute command to see a path that your packets will travel to reach a particular destination. Can you find a destination that gives different results for multiple invocations of the same traceroute command? Do you find that packets traveling to relatively local destinations often go all over the country before reaching, e.g. Denver? Sometimes you will see this sort of thing occur, like packets traveling to San Jose before reaching their destination at a computer somewhere else in Boulder! Generally this is due to misconfigured routers either internal to or between the large internet carriers like MCI or Qwest. Next time you notice delays downloading a web document, try using traceroute to see where the delays are occurring. Often it will again be caused by the same types of problems (if it's not just an overloaded web-server :).
If you are interested in using gated for OSPF routing, look here in particular:
Most important services provided by a Unix machine are implemented in the form of server daemons (as we have seen before). Often these daemons are simply started up at boot time and they run continuously until they are manually killed or the machine is shut down (or they fail because of an error :). Generally, for services like http on a busy machine, this makes perfect sense.
There are many lesser-used services, though, that one would like to provide, but would rather not have running all the time using up system resources unnecessarily. That is the purpose of the inetd daemon. The inetd daemon loads its configuration information from the file /etc/inetd.conf . This file is a list of services for which inetd acts as an agent. Inetd will listen on the well known ports of the services specified (i.e. the symbolic service name is given here and it is referenced in the /etc/services file). When a connection on a particular port (that inetd is listening on) is asked for, inetd will spawn a new process and start up the relevant server daemon, handing off the TCP or UDP connection to that new process.
Let's have a look at the /etc/inetd.conf file from OpenBSD:
# $OpenBSD: inetd.conf,v 1.31 1999/04/10 05:13:42 deraadt Exp $
#
# Internet server configuration database
#
ftp stream tcp nowait root /usr/libexec/ftpd ftpd -USlld
#telnet stream tcp nowait root /usr/libexec/telnetd telnetd -k
#shell stream tcp nowait root /usr/libexec/rshd rshd -L
#login stream tcp nowait root /usr/libexec/rlogind rlogind
#exec stream tcp nowait root /usr/libexec/rexecd rexecd
#uucpd stream tcp nowait root /usr/libexec/uucpd uucpd
#finger stream tcp nowait nobody /usr/libexec/fingerd fingerd -lsm
#ident stream tcp nowait nobody /usr/libexec/identd identd -elo
#tftp dgram udp wait root /usr/libexec/tftpd tftpd -s
Actually, that is just the first 15 lines or so from the file - enough
for our purposes.
You will note that most of the entries in this file have
been commented out.
In fact, the default installation of OpenBSD has very few entries
in this file actually enabled. One that I turned on was for ftp.
The format of each entry is:
service socket-type protocol [no]wait user daemon daemon-arguments
The service is a symbolic service name that can be referenced in
the /etc/services file. By default, inetd listens for each
specified service on every network interface. Some versions of
inetd, however (notably OpenBSD and Linux) let you specify
a subset of those interfaces if that is desired - e.g. for security
purposes. For example, on my home network, I might want to
only allow ftp on the inside net. To do that, the service
would look like: 10.0.0.1:ftp instead of just ftp.
See the inetd(8) manpage for more details.
The socket-type will make sense if you actually do any network programming in C. TCP connections will always be type stream and UDP connections will always be type dgram.
The protocol is generally tcp or udp although you will also see rpc/tcp and rpc/udp.
The [no]wait field specifies whether inetd should wait until a given process completes before initiating a new one (of the same daemon) or not.
The user field names the account that will own the daemon process when inetd starts it.
The daemon field is the absolute path to the server daemon itself.
Last come the arguments that are passed to the server daemon. Note well that the first of these arguments is the name of the process itself. If you are familiar with the execve(2) system call (A C language system call) this will make sense to you. You will also see that it is important for implementing tcpwrappers as well.
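To make the field layout concrete, here is a minimal Python sketch that splits one inetd.conf entry into the fields just described. The InetdEntry type and parse_inetd_line function are hypothetical names for illustration, and the sketch ignores extensions that some inetd versions allow in individual fields.

```python
from typing import List, NamedTuple, Optional


class InetdEntry(NamedTuple):
    service: str
    socket_type: str
    protocol: str
    wait: str
    user: str
    daemon: str
    args: List[str]  # args[0] is the process name, like argv[0] in C


def parse_inetd_line(line: str) -> Optional[InetdEntry]:
    """Parse one inetd.conf line; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("too few fields: %r" % line)
    # The first six fields are fixed; everything after them is the argument list.
    return InetdEntry(*fields[:6], fields[6:])


if __name__ == "__main__":
    entry = parse_inetd_line(
        "ftp stream tcp nowait root /usr/libexec/ftpd ftpd -US"
    )
    print(entry.daemon, entry.args)
```

Notice that the daemon path and args[0] are separate fields. That separation is what lets a wrapper program be named as the daemon while the real server's name is carried along as the first argument.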
An easy and convenient way to control and log access to services is through the use of tcpd - the "TCP Wrappers" package. The package was written by Wietse Venema of SATAN fame. It is available via FTP; see: ftp://ftp.porcupine.org/pub/security/index.html
Setting up the TCP wrappers is easy enough after you install the package. Simply change your /etc/inetd.conf file so that tcpd is invoked instead of the actual server (telnetd, for example). Have a look at some excerpts from an inetd.conf file that makes use of TCP wrappers:
ftp stream tcp nowait root /usr/local/tcpd/bin/tcpd in.ftpd -l
telnet stream tcp nowait root /usr/local/tcpd/bin/tcpd in.telnetd
shell stream tcp nowait root /usr/local/tcpd/bin/tcpd in.rshd
login stream tcp nowait root /usr/local/tcpd/bin/tcpd in.rlogind
exec stream tcp nowait root /usr/local/tcpd/bin/tcpd in.rexecd
The wrappers allow you to log the connection as well as control
access to the services based on source IP addresses.
Logging is done with the local7 syslog facility.
To control access you use the /etc/hosts.allow and /etc/hosts.deny files. Each line in these files has a basic format of:
daemon_list : client_list [ : shell_command ]
There are some exceptions. Let's look at some examples. Here is the first line for the CS dept /etc/hosts.allow file:
ALL : PARANOID : banners /usr/local/tcpd/lib/paranoid : DENY
This first line is one of the exceptions.
The first field ALL indicates, as you might expect, that
this line applies to all daemons. In the next field, PARANOID
is a special client type that says to verify that the DNS A and PTR
records for the requesting host match each other. The connection
is denied if the two records do not correspond to each other.
If a DENY or ALLOW option is specified,
then it must occur last on the line.
The banners option will transmit a text message contained in the
specified directory. The text message that is transmitted is
found in a file of the same name as the service (daemon). For example,
take a look in the directory /usr/local/tcpd/lib/paranoid on bfs.
See the hosts_options(5) manpage for the whole scoop on it.
Other lines in this file look less unusual:
in.fingerd : .cs.colorado.edu : ALLOW
This one is more straightforward: the service is finger
and it is allowed
for every host in the cs.colorado.edu domain.
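As a rough illustration of how such a rule is matched, here is a small Python sketch. It is not the tcpd implementation: the function names are hypothetical, it only understands the ALL wildcard, exact names and leading-dot domain patterns, and it does not handle PARANOID, address patterns or the options field.

```python
import re


def parse_rule(line):
    """Split 'daemon_list : client_list [: option ...]' into its parts."""
    daemons, clients, *options = [field.strip() for field in line.split(":")]
    # List items may be separated by blanks and/or commas.
    split_list = lambda text: [item for item in re.split(r"[,\s]+", text) if item]
    return split_list(daemons), split_list(clients), options


def client_matches(pattern, hostname):
    """Match one client pattern against a client host name."""
    if pattern == "ALL":
        return True
    if pattern.startswith("."):
        # A leading dot matches any host whose name ends in that domain.
        return hostname.lower().endswith(pattern.lower())
    return hostname.lower() == pattern.lower()


def rule_applies(line, daemon, hostname):
    """True if the rule on this line covers the given daemon and client."""
    daemons, clients, _options = parse_rule(line)
    daemon_ok = "ALL" in daemons or daemon in daemons
    client_ok = any(client_matches(p, hostname) for p in clients)
    return daemon_ok and client_ok


if __name__ == "__main__":
    rule = "in.fingerd : .cs.colorado.edu : ALLOW"
    print(rule_applies(rule, "in.fingerd", "bfs.cs.colorado.edu"))
```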
Newer distributions of Linux, like RedHat-7.0 ship with xinetd instead of (or in addition to) inetd. The overall concept is identical, except that xinetd integrates tcpwrapper functionality directly. Overall configuration takes place in the /etc/xinetd.conf file, while server-specific information is placed in individual files (named for the server of course) placed in /etc/xinetd.d .
Here is an example /etc/xinetd.conf file:
xibalba % cat /etc/xinetd.conf
#
# Simple configuration file for xinetd
#
# Some defaults, and include /etc/xinetd.d/
defaults
{
instances = 60
log_type = SYSLOG authpriv
log_on_success = HOST PID
log_on_failure = HOST RECORD
}
includedir /etc/xinetd.d
In the above config file we are just setting some default behavior. Specifically,
instances sets the number of servers that may exist simultaneously for
a given service; log_type sets the syslog facility to which log messages will
be sent; and finally log_on_success and log_on_failure set the information that should be logged for successful and failed
attempts to access services.
Individual services place their information in files in the /etc/xinetd.d directory. Here is a listing of that directory on a default RedHat-7.0 install:
xibalba % ls /etc/xinetd.d
finger linuxconf-web ntalk rexec rlogin rsh swat talk telnet tftp
And finally, here is a look at the telnet file:
xibalba % cat !$/telnet
cat /etc/xinetd.d/telnet
# default: on
# description: The telnet server serves telnet sessions; it uses \
# unencrypted username/password pairs for authentication.
service telnet
{
flags = REUSE
socket_type = stream
wait = no
user = root
server = /usr/sbin/in.telnetd
log_on_failure += USERID
}
To implement tcpwrapper-like functionality using the native xinetd syntax
you use the only_from keyword. For example:
only_from = 128.138.202.0/24
You may specify this keyword several times to allow access from several places.
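The check that only_from expresses amounts to asking whether the client's address falls inside one of the listed networks. A minimal Python sketch of that test follows. The allowed function is a hypothetical helper, and xinetd's own matching accepts more forms (host names, for instance) than this handles.

```python
import ipaddress


def allowed(client_ip, only_from):
    """True if client_ip falls inside any network listed in only_from."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in only_from)


if __name__ == "__main__":
    nets = ["128.138.202.0/24"]
    print(allowed("128.138.202.9", nets))    # bfs is on the listed subnet
    print(allowed("128.138.192.205", nets))  # a host on a different subnet
```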
In the CS department we have been using inetd for a very long time in conjunction with tcpwrappers, and we have an infrastructure in place to automatically maintain the necessary files on all of our machines. With a little work (and the newest xinetd binaries) we were able to get xinetd to work just fine with tcpwrappers instead of its own only_from syntax. We wrote a perl script to generate xinetd files from an existing inetd.conf file (there are other utilities available to do this). Here is what the converted file for telnet looks like:
#
# Warning, this is a *generated* file.
# Any changes you make WILL be overwritten.
# Edit /local/etc/inetd.conf.local and re-run mkinetd.conf instead.
#
service telnet
{
socket_type = stream
protocol = tcp
wait = no
user = root
flags = NAMEINARGS
server = /usr/local/tcpd/bin/tcpd
server_args = /usr/sbin/in.telnetd
}