A basic intro to TCP/IP routing

To get a computer to route packets between two or more network interfaces you need two things:

Packet forwarding enabled in the machine's kernel.
A routing table with appropriate routes in it, so the machine knows where to send the packets.

But first a brief intro to how routing works.

When a router receives a packet, it takes the packet's destination address, looks it up in its routing table to find any matching routes, chooses the most specific one, and passes the packet on to the next hop associated with that route (which may be another router, a computer on a subnet that the router is attached to, or the router itself).

There are three things to note from the above:

Firstly, all routing is done on the basis of a packet's destination address. (There is something called policy-based routing which allows you to route on any criteria you like, but that's beyond the scope of this document.)

Secondly, all routing is done on a next-hop basis. A packet can't 'skip over' several intermediate routers; each router in the chain must know where to send the packet. (There is a way around this called source routing, but don't worry about that now.) This is why we need to use IP-over-IP tunnels to connect consume nodes together over the Internet: a tunnel lets us create a pseudo point-to-point link between two routers without worrying about any intermediate routers not knowing where to send the packets.

Thirdly, when several routes match the destination address of a packet, the router will choose the most specific route, that is, the route with the largest number of bits set in its netmask.

Suppose you have the following routes in your router's routing table:

0.0.0.0/0	(i.e. the default route)
10.0.0.0/8
10.0.0.0/24
10.0.0.0/27

If you have a packet destined for 10.0.0.1 it will match all of these routes, but it will be sent via 10.0.0.0/27, because that route has the longest netmask (27 bits). Another packet to 10.0.0.200 falls outside 10.0.0.0/27, so its most specific match is 10.0.0.0/24, and a packet to 42.19.86.23 will only match 0.0.0.0/0, and so on.
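If it helps to see the 'most specific route wins' rule written down, here is a minimal sketch in Python using the standard ipaddress module. The function name best_route is just something I made up for illustration; real routers use much faster data structures, but the answer they arrive at is the same.

```python
import ipaddress

# The four example routes from the text, most general first.
ROUTES = ["0.0.0.0/0", "10.0.0.0/8", "10.0.0.0/24", "10.0.0.0/27"]

def best_route(destination, routes=ROUTES):
    """Return the matching route with the longest netmask, or None."""
    addr = ipaddress.ip_address(destination)
    networks = [ipaddress.ip_network(r) for r in routes]
    matches = [n for n in networks if addr in n]
    if not matches:
        return None  # no route at all: the packet would be discarded
    # The most specific route is the one with the most netmask bits set.
    return str(max(matches, key=lambda n: n.prefixlen))

for dest in ("10.0.0.1", "10.0.0.200", "42.19.86.23"):
    print(dest, "->", best_route(dest))
# 10.0.0.1 -> 10.0.0.0/27
# 10.0.0.200 -> 10.0.0.0/24
# 42.19.86.23 -> 0.0.0.0/0
```

Note that the order of the routes in the list doesn't matter, only the netmask length does.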

One thing to remember here is that a 'host' route is just a network with an all-ones (/32) netmask. The last hop to a packet's final destination will be via a host route.

Enabling IP forwarding

N.B. you only need to do this on routers, you DON'T need to do this on client machines.

By default most operating systems won't forward packets unless you tell them to. The actual method of switching on packet forwarding is operating system dependent. Here is a small list; it's by no means complete:

Linux

This still works with recent kernels (sysctl -w net.ipv4.ip_forward=1 does the same thing), but it only lasts until the next reboot. Making the change permanent across reboots is highly distribution dependent. Please go and check the docs for your distro on how to do that!

# echo 1 > /proc/sys/net/ipv4/ip_forward
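If you'd rather check the current state than blindly set it, the same file can be read back. This is only a sketch: the function name forwarding_enabled is my own, and the path parameter is there so you can point it at a different file when trying it out on a non-Linux box.

```python
from pathlib import Path

def forwarding_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """Return True if the file contains 1, False otherwise, None if unreadable."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        return None  # file missing or unreadable, e.g. not a Linux machine

state = forwarding_enabled()
messages = {
    True: "IPv4 forwarding is on",
    False: "IPv4 forwarding is off",
    None: "can't tell (is this a Linux box?)",
}
print(messages[state])
```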

FreeBSD/NetBSD/OpenBSD

The best way to do this is to recompile your kernel with 'options GATEWAY' in the kernel config file. This will make the change permanent across reboots, and as an added bonus it increases the size of various kernel data structures and buffers, which will increase network performance.

If you haven't got 'options GATEWAY' in your kernel you should be able to enable packet forwarding using sysctl.

On NetBSD you would do the following:

# sysctl -w net.inet.ip.forwarding=1

The other BSDs use the same net.inet.ip.forwarding variable, though the exact sysctl flags may differ slightly. If in doubt, you can find it pretty easily with sysctl -a | grep forward.

Filling in the routing tables

Every machine doing TCP/IP will have a routing table, although most will only have a few simple routes in it.

N.B. All the routing tables shown here have been simplified, yours will probably have loads of other things in them.

I'm going to refer to the diagram below for most of this section, so I'd better do a bit of explaining. There are two networks, Network A and Network B, connected to two routers, Router A and Router B, which are themselves connected by a point-to-point link (e.g. a pair of wireless cards). There is one computer on Network A, called Petunia, and another computer on Network B, called Banana.

Router A is running DHCP and has given Petunia the IP address 10.0.0.2 and told Petunia to point a default route at Router A (10.0.0.1).

Pretty much the same thing has happened with Banana, except that it has been given 10.0.0.34 and its default route points at Router B (10.0.0.33).

Note that Network A and Network B both have 27-bit netmasks (255.255.255.224), so the two networks don't overlap; they are separate subnets.
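If you don't fancy doing the netmask arithmetic in your head, Python's ipaddress module will do it for you. A quick sketch (the variable names are mine) confirming that the two /27 networks are separate and that each machine sits in the one you'd expect:

```python
import ipaddress

net_a = ipaddress.ip_network("10.0.0.0/27")
net_b = ipaddress.ip_network("10.0.0.32/27")
petunia = ipaddress.ip_address("10.0.0.2")
banana = ipaddress.ip_address("10.0.0.34")

print(net_a.netmask)                        # 255.255.255.224
print(net_a.overlaps(net_b))                # False
print(net_a[0], "-", net_a[-1])             # 10.0.0.0 - 10.0.0.31
print(net_b[0], "-", net_b[-1])             # 10.0.0.32 - 10.0.0.63
print(petunia in net_a, petunia in net_b)   # True False
print(banana in net_a, banana in net_b)     # False True
```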

Here are the current routing tables for the network above, simplified a bit:

Petunia

Destination        Gateway
default            10.0.0.1
10.0.0.0/27        (directly connected network)
10.0.0.1           (RouterA's MAC address)

Router A

Destination        Gateway
10.0.0.0/27        (directly connected network)
10.0.0.64/30       (directly connected network)

Router B

Destination        Gateway
10.0.0.32/27       (directly connected network)
10.0.0.64/30       (directly connected network)

Banana

Destination        Gateway
default            10.0.0.33
10.0.0.32/27       (directly connected network)
10.0.0.33          (RouterB's MAC address)

Trying things on our network

If you set up the network in the diagram above and tried to do anything with it, you'd discover that every machine can ping machines on any subnet it is directly connected to, e.g. Router A can ping Router B, Banana can ping Router B, and Petunia can ping Router A. But none of the machines can ping a machine on any other subnet, e.g. Banana cannot ping Petunia or Router A.

Let's follow a ping request from Petunia to Router B step by step and see what happens:

Petunia looks up the address of Router B in its (Petunia's) routing table and finds that it matches the default route, so it sends the packet on to Router A (10.0.0.1).
Router A looks up Router B's address in its (Router A's) routing table and sees that it's on a directly connected subnet, so it sends the packet direct to Router B.
Router B receives the packet, sees that it's a packet for itself and generates a reply packet.
Router B tries to look up Petunia's address (which is the destination address of the reply) in its routing table and finds that it doesn't know where to send it.
Router B throws away the ping reply.

If you tried to ping Banana from Petunia, things would go wrong even earlier. Petunia would send the request to Router A (after matching the default route), but Router A has no route that matches 10.0.0.34, so it would discard the request before Banana ever saw it. Pinging Petunia from Banana fails in the same way at Router B.

So, as you can see, our network isn't very functional at the moment.

The problem with Router A and Router B is that neither knows about the network behind the other, i.e. Router A doesn't know that Network B exists, and Router B doesn't know that Network A exists.

What we need to do to fix this is to add a static route on each router pointing at the other router's network.

So on Router A we would add a route to 10.0.0.32/27 via 10.0.0.66 (Router B), and on Router B we would add a route to 10.0.0.0/27 via 10.0.0.65 (Router A).
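Because routing works on a next-hop basis, the gateway you give in a static route has to be on a network the router is directly connected to, otherwise the router has no way of reaching it. Here's a small sketch of that sanity check; the function name gateway_is_reachable is made up for illustration.

```python
import ipaddress

def gateway_is_reachable(connected_networks, gateway):
    """True if the gateway address lies on a directly connected network."""
    gw = ipaddress.ip_address(gateway)
    return any(gw in ipaddress.ip_network(net) for net in connected_networks)

ROUTER_A_CONNECTED = ["10.0.0.0/27", "10.0.0.64/30"]
ROUTER_B_CONNECTED = ["10.0.0.32/27", "10.0.0.64/30"]

# The two static routes from the text use next hops on the shared /30 link.
print(gateway_is_reachable(ROUTER_A_CONNECTED, "10.0.0.66"))  # True
print(gateway_is_reachable(ROUTER_B_CONNECTED, "10.0.0.65"))  # True

# Router A could not use Router B's Network B address as a next hop.
print(gateway_is_reachable(ROUTER_A_CONNECTED, "10.0.0.33"))  # False
```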

Routing tables after adding static routes and pinging things

Note that neither Banana nor Petunia has changed.

Petunia

Destination        Gateway
default            10.0.0.1
10.0.0.0/27        (directly connected network)
10.0.0.1           (RouterA's MAC address)

Router A

Destination        Gateway
10.0.0.0/27        (directly connected network)
10.0.0.32/27       10.0.0.66                       Static
10.0.0.64/30       (directly connected network)
10.0.0.66          (RouterB's MAC address)

Router B

Destination        Gateway
10.0.0.0/27        10.0.0.65                       Static
10.0.0.32/27       (directly connected network)
10.0.0.64/30       (directly connected network)
10.0.0.65          (RouterA's MAC address)

Banana

Destination        Gateway
default            10.0.0.33
10.0.0.32/27       (directly connected network)
10.0.0.33          (RouterB's MAC address)

Ping again

From Petunia to Router B

Petunia looks up the address of Router B in its (Petunia's) routing table and finds that it matches the default route, so it sends the packet on to Router A (10.0.0.1).
Router A looks up Router B's address in its (Router A's) routing table and sees that it's on a directly connected subnet, so it sends the packet direct to Router B.
Router B receives the packet, sees that it's a packet for itself and generates a reply packet.
Router B looks up Petunia's address (which is the destination address of the reply) in its routing table and matches 10.0.0.0/27.
10.0.0.0/27 has a next hop of 10.0.0.65, so Router B sends the reply to Router A.
Router A looks up 10.0.0.2 (Petunia's address) in its routing table, sees that it's on a subnet that it's directly attached to, and sends the packet to Petunia.
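The walk-through can also be played out as a toy simulation. This is only a sketch under some assumptions: the tables are the simplified ones from this document (without the MAC address entries), Router B is pinged on its point-to-point address 10.0.0.66, and the names deliver, ping and owner are made up for illustration. It tries the ping with and without the static routes.

```python
import ipaddress

# Addresses owned by each machine in the diagram.
ADDRESSES = {
    "Petunia": ["10.0.0.2"],
    "Router A": ["10.0.0.1", "10.0.0.65"],
    "Router B": ["10.0.0.33", "10.0.0.66"],
    "Banana": ["10.0.0.34"],
}

# (destination network, next hop); None means directly connected.
BASE_TABLES = {
    "Petunia": [("0.0.0.0/0", "10.0.0.1"), ("10.0.0.0/27", None)],
    "Router A": [("10.0.0.0/27", None), ("10.0.0.64/30", None)],
    "Router B": [("10.0.0.32/27", None), ("10.0.0.64/30", None)],
    "Banana": [("0.0.0.0/0", "10.0.0.33"), ("10.0.0.32/27", None)],
}
STATIC_ROUTES = {
    "Router A": [("10.0.0.32/27", "10.0.0.66")],
    "Router B": [("10.0.0.0/27", "10.0.0.65")],
}
WITH_STATIC = {name: routes + STATIC_ROUTES.get(name, [])
               for name, routes in BASE_TABLES.items()}

def owner(address):
    """Name of the machine that owns an address, or None."""
    for name, addrs in ADDRESSES.items():
        if address in addrs:
            return name
    return None

def deliver(tables, start, destination, max_hops=8):
    """Follow one packet hop by hop; return (delivered, path taken)."""
    dest = ipaddress.ip_address(destination)
    node, path = start, [start]
    for _ in range(max_hops):
        if destination in ADDRESSES[node]:
            return True, path
        matches = []
        for network, next_hop in tables[node]:
            net = ipaddress.ip_network(network)
            if dest in net:
                matches.append((net.prefixlen, next_hop))
        if not matches:
            return False, path  # no route: the packet is discarded here
        _, next_hop = max(matches, key=lambda m: m[0])  # most specific route
        node = owner(next_hop if next_hop is not None else destination)
        if node is None:
            return False, path
        path.append(node)
    return False, path

def ping(tables, src_name, src_addr, dst_addr):
    """Send a request and, if it arrives, a reply; return (ok, out, back)."""
    ok, out_path = deliver(tables, src_name, dst_addr)
    if not ok:
        return False, out_path, []
    ok, back_path = deliver(tables, out_path[-1], src_addr)
    return ok, out_path, back_path

for label, tables in (("before", BASE_TABLES), ("after", WITH_STATIC)):
    print(label, ping(tables, "Petunia", "10.0.0.2", "10.0.0.66"))
# before (False, ['Petunia', 'Router A', 'Router B'], ['Router B'])
# after (True, ['Petunia', 'Router A', 'Router B'], ['Router B', 'Router A', 'Petunia'])
```

The last path in a failed ping ends at the machine that threw the packet away, which is handy when you're trying to work out which router is missing a route.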

So now our network works...

Dynamic routing

As you can imagine, adding a static route to each router for each new subnet added to the network is going to turn into a major pain if the network gets much larger. With static routing you also run the risk of typing errors, routes hanging around after the networks they pointed to have been decommissioned, and massive troubleshooting problems.

Wouldn't it be nice if the routers could handle all this stuff themselves, without you having to type in routes all the time? Well, this is what dynamic routing protocols do.

Dynamic routing gives you two things:

Route distribution (doing what we just did with the static routes).
Failover.

Failover means that if there is more than one next hop to a particular destination, the routing protocol can change the next hop depending on the state of the network. This allows you to build redundant links (i.e. multiple paths) and have your network carry on working when one of them fails.
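To make the failover idea a bit more concrete, here is a deliberately tiny sketch. It isn't how any particular routing protocol works: the second next hop (10.0.0.70), the costs, and the function name pick_next_hop are all invented for illustration. The point is just that the choice of next hop depends on which links are currently up.

```python
def pick_next_hop(candidates, link_up):
    """Pick the cheapest next hop whose link is currently up.

    candidates: list of (next_hop, cost) pairs for one destination.
    link_up:    dict mapping next_hop to True (up) or False (down).
    """
    usable = [(cost, hop) for hop, cost in candidates if link_up.get(hop, False)]
    if not usable:
        return None  # every path is down, destination unreachable
    return min(usable)[1]

# Two hypothetical paths from Router A to Network B.
paths_to_network_b = [("10.0.0.66", 1), ("10.0.0.70", 5)]
link_state = {"10.0.0.66": True, "10.0.0.70": True}

print(pick_next_hop(paths_to_network_b, link_state))  # 10.0.0.66
link_state["10.0.0.66"] = False                       # the main link fails
print(pick_next_hop(paths_to_network_b, link_state))  # 10.0.0.70
```

A real routing protocol also has to notice that the link has failed and tell the other routers about it, which is where most of the complexity lives.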

How to view your own routing table

N.B. you may well see entries for 127.0.0.0/8 and possibly 224.something in your own routing tables. Don't worry about them: 127.0.0.0/8 is for the loopback interface (packets the machine is sending to itself), and 224.whatever is for multicast.
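If you ever want to filter those entries out in a script, the two ranges are easy to recognise. A small sketch, with describe being a made-up helper name:

```python
import ipaddress

LOOPBACK = ipaddress.ip_network("127.0.0.0/8")
MULTICAST = ipaddress.ip_network("224.0.0.0/4")

def describe(destination):
    """Classify a routing table destination such as '10.1.8.0/24'."""
    # strict=False tolerates entries written with host bits set.
    net = ipaddress.ip_network(destination, strict=False)
    if net.subnet_of(LOOPBACK):
        return "loopback"
    if net.subnet_of(MULTICAST):
        return "multicast"
    return "ordinary route"

for dest in ("127.0.0.0/8", "224.0.0.0/4", "10.1.8.0/24", "0.0.0.0/0"):
    print(dest, "->", describe(dest))
```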

*BSD, Solaris, and most Unixes

# netstat -ran

netstat is used to get network statistics. The r flag asks for the routing table, the a flag asks for all of it, and the n flag tells netstat not to do any DNS lookups.

See the man page for netstat for more details.

By default netstat will display the routing tables for all protocols it knows about, which may include things like Unix domain sockets. If you just want the IPv4 routing table, use netstat -ranf inet. It will make the output much more readable.

Linux

netstat -ran and route -n both display the routing tables:

kim@doormat:~$ netstat -ran
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.1.8.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
0.0.0.0         10.1.8.254      0.0.0.0         UG        0 0          0 eth0
kim@doormat:~$
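The Linux output uses a separate Genmask column rather than /bits notation, and marks routes that go via a gateway with a G in the Flags column (U just means the route is up). If you want to turn that output into something easier to compare with the tables in this document, a rough parser might look like this. The function name parse_routes is mine, and the sample text is the output from kim's machine above.

```python
import ipaddress

SAMPLE = """\
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.1.8.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
0.0.0.0         10.1.8.254      0.0.0.0         UG        0 0          0 eth0
"""

def parse_routes(text):
    """Turn 'netstat -rn' style rows into dicts using /bits notation."""
    routes = []
    for line in text.splitlines()[1:]:      # skip the column header
        fields = line.split()
        if len(fields) < 8:
            continue                        # not a route row
        dest, gateway, genmask, flags = fields[:4]
        network = ipaddress.ip_network(f"{dest}/{genmask}")
        routes.append({
            "network": str(network),
            # Without the G flag the network is directly connected.
            "gateway": gateway if "G" in flags else None,
            "iface": fields[-1],
        })
    return routes

for route in parse_routes(SAMPLE):
    print(route)
# {'network': '10.1.8.0/24', 'gateway': None, 'iface': 'eth0'}
# {'network': '0.0.0.0/0', 'gateway': '10.1.8.254', 'iface': 'eth0'}
```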

Windows

The route print command (which you run from the DOS prompt or command shell) shows you the routing table. netstat -r gives the same output.

How to add a static route

All examples here are for Router A.

FreeBSD

FreeBSD allows you to specify the netmask using /bits notation, which makes life easy:

# route add -net 10.0.0.32/27 10.0.0.66

NetBSD

Bah, the NetBSD route command hasn't had /bits notation added to it, so you have to use netmasks.

# route add -net 10.0.0.32 -netmask 255.255.255.224 10.0.0.66
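If you need to convert between /bits notation and dotted netmasks and don't trust your binary arithmetic, the ipaddress module can do it both ways. A quick sketch (the two helper names are mine):

```python
import ipaddress

def bits_to_netmask(bits):
    """27 -> '255.255.255.224'"""
    return str(ipaddress.ip_network(f"0.0.0.0/{bits}").netmask)

def netmask_to_bits(netmask):
    """'255.255.255.224' -> 27"""
    return ipaddress.ip_network(f"0.0.0.0/{netmask}").prefixlen

print(bits_to_netmask(27))                 # 255.255.255.224
print(bits_to_netmask(30))                 # 255.255.255.252
print(netmask_to_bits("255.255.255.224"))  # 27
```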

OpenBSD

Dunno, but the NetBSD syntax will probably work.

Linux

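route add -net (or -host) is supposed to work everywhere, but please check your local man pages for the right syntax. (There are subtle differences between distros on this issue.) For Router A it would look something like route add -net 10.0.0.32 netmask 255.255.255.224 gw 10.0.0.66. On distributions that ship iproute2, ip route add 10.0.0.32/27 via 10.0.0.66 does the same job.

Some examples, courtesy of kim: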

doormat:~# route
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref Use Iface
 10.1.8.0        *               255.255.255.0   U     0      0 0 eth0
doormat:~# route add default gw 10.1.8.254
doormat:~# route
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref Use Iface
 10.1.8.0        *               255.255.255.0   U     0      0 0 eth0
 default         gw8             0.0.0.0         UG    0      0 0 eth0
doormat:~# route add -net 10.1.28.0 netmask 255.255.255.0 gw 10.1.8.254
doormat:~# route
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref Use Iface
 10.1.28.0       gw8             255.255.255.0   UG    0      0 0 eth0
 10.1.8.0        *               255.255.255.0   U     0      0 0 eth0
 default         gw8             0.0.0.0         UG    0      0 0 eth0
doormat:~# route del -net 10.1.28.0  netmask 255.255.255.0
doormat:~# route
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref Use Iface
 10.1.8.0        *               255.255.255.0   U     0      0 0 eth0
 default         gw8             0.0.0.0         UG    0      0 0 eth0
doormat:~# 

doormat:~# route -n
 Kernel IP routing table
 Destination     Gateway         Genmask         Flags Metric Ref Use Iface
 10.1.8.0        0.0.0.0         255.255.255.0   U     0      0 0 eth0
 0.0.0.0         10.1.8.254      0.0.0.0         UG    0      0 0 eth0
doormat:~#

Windows

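You can do it with the route command. The syntax should be along the lines of route add 10.0.0.32 mask 255.255.255.224 10.0.0.66, but check route /? on your version of Windows.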