textfiles/internet/tcp-ip-a.txt



                             Introduction
                                  to
                            Administration
                                of an
                            Internet-based
                            Local Network


                      C                       R

                              C       S
                  Computer Science Facilities Group
                              C       I

                      L                       S


                               RUTGERS
                  The State University of New Jersey
            Center for Computers and Information Services
               Laboratory for Computer Science Research


                            3 October 1988

This  is an introduction for people who intend to set up or administer
a network based on the Internet networking protocols (TCP/IP).

Copyright (C) 1988, Charles L. Hedrick.   Anyone  may  reproduce  this
document,  in  whole  or  in  part,  provided  that:   (1) any copy or
republication of the entire document must show Rutgers  University  as
the  source,  and  must  include this notice; and (2) any other use of
this material must reference this manual and Rutgers  University,  and
the fact that the material is copyright by Charles Hedrick and is used
by permission.


Unix is a trademark of AT&T Technologies, Inc.


                          Table of Contents


   1. The problem                                                    1
       1.1 Some comments about terminology                           2
   2. Routing and Addressing                                         2
   3. Choosing an addressing structure                               5
       3.1 Should you subdivide your address space?                  6
       3.2 Subnets vs. multiple network numbers                      6
       3.3 How to allocate subnet or network numbers                 8
       3.4 Dealing with multiple "virtual" subnets on one network    9
           3.4.1 Dealing with Multiple  Subnets  by  Turning  off   10
                 Subnetting
           3.4.2 Multiple Subnets: Implications for Broadcasting    11
       3.5 Choosing an address class                                11
       3.6 Dialup  IP  and  Micro  gateways: Dynamically assigned   12
           addresses
           3.6.1 Dialup IP                                          12
           3.6.2 Micro gateways                                     14
   4. Network-wide Services, Naming                                 15
   5. Setting up routing for an individual computer                 19
       5.1 How datagrams are routed                                 21
       5.2 Fixed routes                                             23
       5.3 Routing redirects                                        24
       5.4 Other ways for hosts to find routes                      26
           5.4.1 Spying on Routing                                  26
           5.4.2 Proxy ARP                                          27
           5.4.3 Moving to New Routes After Failures                32
   6. Bridges and Gateways                                          35
       6.1 Alternative Designs                                      36
           6.1.1 A mesh of point to point lines                     36
           6.1.2 Circuit switching technology                       37
           6.1.3 Single-level networks                              37
           6.1.4 Mixed designs                                      38
       6.2 An introduction to alternative switching technologies    39
           6.2.1 Repeaters                                          40
           6.2.2 Bridges and gateways                               41
           6.2.3 More about bridges                                 43
           6.2.4 More about gateways                                44
       6.3 Comparing the switching technologies                     45
           6.3.1 Isolation                                          45
           6.3.2 Performance                                        46
           6.3.3 Routing                                            47
           6.3.4 Network management                                 49
           6.3.5 A final evaluation                                 52
   7. Configuring Gateways                                          52
       7.1 Configuring routing for gateways                         55


                                  i


This document is intended to help people who are planning to set up  a
new  network  based  on  the  Internet  protocols, or to administer an
existing one.    It  assumes  a  basic  familiarity  with  the  TCP/IP
protocols,  particularly  the  structure  of  Internet  addresses.   A
companion paper, "Introduction to the Internet Protocols", may provide
a  convenient introduction.  This document does not attempt to replace
technical  documentation  for  your  specific  TCP/IP  implementation.
Rather, it attempts to give overall background that is not specific to
any  particular  implementation.    It  is  directed  specifically  at
networks  of "medium" complexity.  That is, it is probably appropriate
for a network involving several dozen buildings.   Those  planning  to
manage  larger networks will need more preparation than you can get by
reading this document.

In a number of cases, commands  and  output  from  Berkeley  Unix  are
shown.    Most  computer  systems  have  commands  that are similar in
function to these.  It seemed more useful to give some actual examples
that  to  limit  myself to general talk, even if the actual output you
see is slightly different.


1. The problem


This document will emphasize primarily "logical" network architecture.
There  are many documents and articles in the trade press that discuss
actual network media, such as Ethernet, Token  Ring,  etc.    What  is
generally  not  made  clear  in  these  articles is that the choice of
network media is generally not  all  that  critical  for  the  overall
design  of  a  network.   What can be done by the network is generally
determined more by the network protocols supported, and the quality of
the  implementations.  In practice, media are normally chosen based on
purely pragmatic grounds: what media are supported by  the  particular
types  of  computer that you have to connect, the distance you have to
go, and the logistics of installing various kinds of cable.  Generally
this means that Ethernet is used for medium-scale systems, Ethernet or
a network  based  on  twisted-pair  wiring  for  micro  networks,  and
specialized high-speed networks (typically token ring) for campus-wide
backbones, and for local networks involving super-computer  and  other
very high-performance applications.

Thus  this  document  assumes  that  you  have  chosen  and  installed
individual networks such as Ethernet or token ring,  and  your  vendor
has  helped  you connect your computers to these network.  You are now
faced with the interrelated problems of

   - configuring the software on your computers

   - finding a way to connect individual Ethernets, token rings, etc.,
     to form a single coherent network

   - connecting your networks to the outside world

My  primary  thesis in this document is that these decisions require a
                                  1


bit  of  advance  thought.    In   fact,   most   networks   need   an
"architecture".   This consists of a way of assigning addresses, a way
of doing routing, and various choices about how  hosts  interact  with
the  network.  These decisions need to be made for the entire network,
preferably when it is first being installed.


1.1 Some comments about terminology


I am going to use the term "IP" throughout this document to  refer  to
networks  designed  to carry TCP/IP.  IP is the network-level protocol
from the Internet (TCP/IP) family of protocols.   Thus  it  is  common
practice  to  use  the term "IP" when referring to addresses, routing,
and other network-layer items.  In fact the distinction is not  always
very  clear.    So  in practice the terms Internet, TCP/IP, and IP may
appear to be almost interchangeable.

The terms "packet" and "datagram"  are  also  almost  interchangeable.
Ideally,  "packet" is used for the lowest-level physical unit, whereas
"datagram" refers to a unit of data at the level of IP.  However these
are identical for most media, so people have nearly stopped making the
distinction.  I have tried to use the  terms  correctly,  even  though
these days it may sound a bit pedantic.  The term "packet" seems to be
winning out in  common  speech.    For  example,  gateway  speeds  are
generally  given  in  "packets  per  second."    I  have used the more
technically accurate  "datagrams  per  second,"  since  it  is  really
datagrams that are being counted.

I  use  the  term  "gateway"  where  some  other authors use "router."
"Gateway" is the original  Internet  term.    Unfortunately,  the  ISO
community  has  begun  using  the  same  word  with a rather different
meaning.  People have started using "router" because it  doesn't  have
this  ambiguity.    I am continuing to use "gateway" because, like the
companion Introduction to the Internet  Protocols,  this  document  is
intended to help you make sense of the Internet specifications.  Those
specifications use "gateway."


2. Routing and Addressing


Many of the decisions that you need  to  make  in  setting  up  an  IP
network  depend  upon  routing,  so  it  will be best to give a bit of
background on that topic now.  I will return to  routing  in  a  later
section  when  discussing  gateways  and  bridges.    In  general,  IP
datagrams pass through many networks as they  are  going  between  the
source and destination.  Here's a typical example.  (Addresses used in
the examples are taken from Rutgers University.)


                                  2


              network 1               network 2     network 3
               128.6.4                 128.6.21      128.121
        ============================  ==========  ================
          |              |        |    |      |    |         |
       ___|______   _____|____  __|____|__  __|____|____  ___|________
       128.6.4.2    128.6.4.3   128.6.4.1   128.6.21.1    128.121.50.2
                                128.6.21.2  128.121.50.1
       __________   __________  __________  ____________  ____________
       computer A   computer B   gateway R    gateway S    computer C


This diagram shows three normal computer systems,  two  gateways,  and
three  networks.  The networks might be Ethernets, token rings, or any
other sort.  Network 2 could even be a  single  point  to  point  line
connecting gateways R and S.

Note  that computer A can send datagrams to computer B directly, using
network 1.  However it can't reach computer  C  directly,  since  they
aren't  on  the  same  network.    There  are  several ways to connect
separate networks.  This diagram assumes that gateways are used. (In a
later  section,  we'll look at alternatives.)  In this case, datagrams
going between A and C must be sent through gateway R, network  2,  and
gateway   S.   Every  computer  that  uses  TCP/IP  needs  appropriate
information and algorithms to allow it to know when datagrams must  be
sent through a gateway, and to choose an appropriate gateway.

Routing  is  very  closely tied to the choice of addresses.  Note that
the address of each computer begins with the  number  of  the  network
that  it's  attached  to.    Thus  128.6.4.2 and 128.6.4.3 are both on
network 128.6.4.  Next, notice that gateways, whose job is to  connect
networks,  have  an  address  on each of those networks.  For example,
gateway R connects networks 128.6.4 and 128.6.21.  Its  connection  to
network  128.6.4 has the address 128.6.4.1.  Its connection to network
128.6.21 has the address 128.6.21.2.

Because of this association between addresses  and  networks,  routing
decisions  can  be  based  strictly  on  the  network  number  of  the
destination.  Here's what the routing information for computer A might
look like:

       network    gateway     metric

       128.6.4    none        0
       128.6.21   128.6.4.1   1
       128.121    128.6.4.1   2

From  this  table, computer A can tell that datagrams for computers on
network 128.6.4 can be sent directly, and datagrams for  computers  on
networks  128.6.21  and  128.121  need  to  be  sent  to gateway R for
forwarding.  The "metric" is used by  some  routing  algorithms  as  a
measure  of how far away the destination is.  In this case, the metric
simply indicates how many gateways the datagram  has  to  go  through.
(This is often referred to as a "hop count".)

When  computer  A  is  ready  to  send  a  datagram,  it  examines the
                                  3


destination address.  It gets the network number from the beginning of
the  address,  and  then  looks in the routing table.  The table entry
indicates  whether  the  datagram  should  be  sent  directly  to  the
destination or to a gateway.

Note  that  a  gateway  is  simply a computer that is connected to two
different networks, and is prepared to forward datagrams between them.
In  many  cases  it is most efficient to use special-purpose equipment
that are designed as gateways.  However it is  perfectly  possible  to
use  ordinary  computers,  as  long as they have more than one network
interface, and their software is prepared to forward datagrams.   Most
major TCP/IP implementations (even for microcomputers) are designed to
let you use your computer as a gateway.  However some of this software
has limitations that can cause trouble for your network.

Note that a gateway has several addresses -- one for each network that
it's attached to.  This is a difference  between  IP  and  some  other
network  protocols: each interface from a computer to a network has an
address.  With some  other  protocols,  each  computer  has  only  one
address,  which  applies  to all of its interfaces.  A gateway between
networks 128.6.4 and 128.6.21 will have an address  that  begins  with
128.6.4  (for  example,  128.6.4.1).    This  address  refers  to  its
connection to network 128.6.4.  It will  also  have  an  address  that
begins  with  128.6.21  (for example, 128.6.21.2).  This refers to its
connection to network 128.6.21.

The term "network" probably makes you think of things  like  Ethernet,
which  can  have  many  machines attached.  However it also applies to
point to point lines.  In the diagram above, networks 1 and 3 could be
in different cities.  Then network 2 could be a serial line, satellite
link, or other long-distance point to point connection between the two
locations.    A  point to point line is treated as a network that just
happens to have only two computers on it.  As with any other  network,
the  point to point line has a network number (in this case 128.6.21).
The systems connected by the line (gateways R and S) have addresses on
that network (in this case 128.6.21.1 and 128.6.21.2).

It  is  possible  to  design  routing software that does not require a
separate network number for each point to point line.  In  that  case,
the  interface between the gateway and the point to point line doesn't
have an address.  This can be useful if your network is so large  that
you  are  in  danger  of running out of network numbers.  However such
"anonymous interfaces"  can  make  network  management  somewhat  more
difficult.    If  there is no address, network management software may
have no way to refer to the interface.  Thus you may not  be  able  to
get data on throughput and errors for that interface.


                                  4


3. Choosing an addressing structure


The  first  comment  to  make about addresses is a warning: Before you
start using an IP network, you must get one or more  official  network
numbers.    IP  addresses  look like this: 128.6.4.3.  This address is
used by one computer at Rutgers University.  The  first  part  of  it,
128.6,  is  a  network  number,  allocated  to  Rutgers  by  a central
authority.  Before you start allocating addresses to  your  computers,
you  must  get an official network number.  Unfortunately, some people
set up networks using either a randomly-chosen number,  or  a  generic
number  supplied by the vendor.  While this may work in the short run,
it is a very bad idea for the long run.  Eventually, you will want  to
connect  your  network  to some other organization's network.  Even if
your organization is highly secret and very concerned about  security,
somewhere  in  your  organization  there  is  going  to  be a research
computer that ends up being connected to a nearby  university.    That
university  will  probably  be  connected  to  a  large-scale national
network.  As soon as one of your datagrams escapes your local network,
the  organization you are talking to is going to become very confused,
because the addresses that  appear  in  your  datagrams  are  probably
officially allocated to someone else.

The  solution  to this is simple: get your own network number from the
beginning.  It costs nothing.  If you delay it,  then  sometime  years
from  now  you  are  going  to be faced with the job of changing every
address on a large network.  Network numbers are currently assigned by
the  DDN Network Information Center, SRI International, 333 Ravenswood
Avenue, Menlo Park, California 94025 (telephone: 800-235-3155).    You
can  get  a  network  number no matter what your network is being used
for.  You do not need authorization to connect  to  the  Defense  Data
Network  in order to get a number.  The main piece of information that
will be needed when you apply for a  network  number  is  the  address
class that you want.  See below for a discussion of this.

In  many ways, the most important decision you have to make in setting
up a network is how you will assign IP addresses  to  your  computers.
This  choice  should be made with a view of how your network is likely
to grow.  Otherwise, you will find that you have to change  addresses.
When you have several hundred computers, address changes can be nearly
impossible.

Addresses are critical because IP datagrams are routed on the basis of
their  address.    For example, addresses at Rutgers University have a
2-level structure.  A typical address is 128.6.4.3.  128.6 is assigned
to  Rutgers  University.    As  far as the outside world is concerned,
128.6 is a single network.  Other universities send any datagram whose
address  begins  with  128.6  to the nearest Rutgers gateway.  However
within Rutgers, we divide up our address space into "subnets".  We use
the next 8 bits of address to indicate which subnet a computer belongs
to.    128.6.4.3  belongs  to  subnet  128.6.4.    Generally   subnets
correspond  to physical networks, e.g. separate Ethernets, although we
will see some exceptions later.  Systems inside Rutgers, unlike  those
outside,  contain  information about the Rutgers subnet structure.  So
once a datagram for 128.6.4.3 arrives at Rutgers, the Rutgers  network
                                  5


will  route  it to the departmental Ethernet, token ring, or whatever,
that has been assigned subnet number 128.6.4.

When you start a network, there are several addressing decisions  that
face you:

   - Do you subdivide your address space?

   - If so, do you use subnets or class C addresses?

   - How big an address space do you need?


3.1 Should you subdivide your address space?


It  is  not  necessary  to  use  subnets  at  all.   There are network
technologies that allow an entire campus or company to act as a single
large  logical Ethernet, so that no internal routing is necessary.  If
you use this technology, then  you  do  not  need  to  subdivide  your
address  space.    In that case, the only decision you have to make is
what class of address to apply for.  However we recommend using either
a  subnet  approach  or  some other method of subdividing your address
space in most networks:

   - In section 6.2 we will argue that internal gateways are desirable
     for all networks beyond the very simplest.

   - Even if you do not need gateways now, you may find later that you
     need to  use  them.  Thus  it  probably  makes  sense  to  assign
     addresses  as  if  each Ethernet or token ring were going to be a
     separate subnet.  This will allow for conversion to real  subnets
     later if it proves necessary.

   - For  network  maintenance  purposes,  it  is  convenient  to have
     addresses whose structure corresponds to  the  structure  of  the
     network.   For example, when you see a stray datagram from system
     128.6.4.3, it is nice to know that all addresses  beginning  with
     128.6.4 are in a particular building.


3.2 Subnets vs. multiple network numbers


Suppose  that  you have been convinced that it's a good idea to impose
some structure on your addresses.  The  next  question  is  what  that
structure should be.  There are two basic approaches.  One is subnets.
The other is multiple network numbers.

The Internet  standards  specify  the  format  of  an  address.    For
addresses  beginning  with  128  through  191 (the most common numbers
these days), the first two octets form the network number.    E.g.  in
140.3.50.1, 140.3 is the network number.  Network numbers are assigned
                                  6


to a particular organization.  What you do with the next two octets is
up  to  you.    You  could  choose  to make the next octet be a subnet
number, or you could use some other scheme entirely.  Gateways  within
your  organization  must  be set up to know the subnetting scheme that
you are using.  However outside your organization, no  one  will  know
that 140.3.50 is one subnet and 140.3.51 is another.  They will simply
know that 140.3 is your organization.  Unfortunately, this ability  to
add additional structure to the address via subnets was not present in
the original IP specifications.  Thus some older software is incapable
of being told about subnets.

If  enough of the software that you are using has this problem, it may
be impractical for you to use subnets.  Some organizations have used a
different  approach.   It is possible for an organization to apply for
several network numbers.  Instead of dividing a single network number,
say  140.3,  into  several subnets, e.g. 140.3.1 through 140.3.10, you
could apply for 10 different network  numbers.    Thus  you  might  be
assigned  the  range  140.3 through 140.12.  All IP software will know
that these are different network numbers.

While using separate network numbers will work just fine  within  your
organization,  it  has two very serious disadvantages.  The first, and
less serious, is that it wastes address space.  There are  only  about
16,000  possible  class  B addresses.  We cannot afford to waste 10 of
them on your organization, unless it is very large.  This objection is
less  serious because you would normally ask for class C addresses for
this  purpose,  and  there  are  about  2  million  possible  class  C
addresses.

The  more  serious  problem  with using several network numbers rather
than subnets is that it overloads the routing tables in  the  rest  of
the Internet.  As mentioned above, when you divide your network number
into subnets, this division is known within your organization, but not
outside  it.    Thus  systems  outside your organization need only one
entry in their tables in order to be able to reach you.    E.g.  other
universities  have entries in their routing tables for 128.6, which is
the Rutgers network number.  If you use a  range  of  network  numbers
instead  of  subnets,  that  division  will  be  visible to the entire
Internet.  If we used 128.6  through  128.16  instead  of  subdividing
128.6, other universities would need entries for each of those network
numbers in their routing tables.   As  of  this  writing  the  routing
tables  in many of the national networks are exceeding the size of the
current  routing  technology.    It  would  be  considered   extremely
unfriendly  for  any organization to use more than one network number.
This may not be a problem if your network is going  to  be  completely
self-contained,  or if only one small piece of it will be connected to
the  outside  world.    Nevertheless,  most  TCP/IP  experts  strongly
recommend  that  you  use  subnets rather than multiple networks.  The
only reason for considering multiple networks is to deal with software
that  cannot  handle subnets.  This was a problem a few years ago, but
is currently less serious.   As  long  as  your  gateways  can  handle
subnets,  you  can deal with a few individual computers that cannot by
using "proxy ARP" (see below).

One warning about subnets: Your subnets must all be "adjacent".   That
                                  7


is,  you  can't have a configuration where you get from subnet 128.6.4
to subnet 128.6.5 by going through some other network  entirely,  e.g.
128.121.    For  example,  Rutgers  has  campuses in New Brunswick and
Newark.  It is perfectly OK for the networks  in  both  cities  to  be
subnets  of  128.6.    However  in  that  case,  the  lines beween New
Brunswick and Newark must also be part of 128.6.  Suppose  we  decided
to  use  a  regional  network  such as JvNCnet to talk between our two
campuses, instead of providing  our  own  lines.    Since  JvNCnet  is
128.121,  the  gateways  and serial lines that they provide would have
addresses that begin with 128.121.  This violate the rules.  It is not
allowable  to  have  gateways  or  lines  that  are  part  of  128.121
connecting two parts of 128.6.  So if we wanted to use JvNCnet between
our  two  campuses, we'd have to get different network numbers for the
two campuses.  (This rule  is  a  result  of  limitations  in  routing
technology.    Eventually  gateway software will probably be developed
that can deal with configurations whose networks are not contiguous.)


3.3 How to allocate subnet or network numbers


Now that you have decided to use subnets or multiple network  numbers,
you  have  to  decide  how  to allocate them.  Normally this is fairly
easy.  Each physical network, e.g. Ethernet or token ring, is assigned
a  separate  subnet  or  network  number.    However  you do have some
options.

In some cases it may make sense to assign several subnet numbers to  a
single  physical  network.   At Rutgers we have a single Ethernet that
spans three buildings, using repeaters.  It is very clear to  us  that
as  computers  are  added  to this Ethernet, it is going to have to be
split into several separate Ethernets.  In order to  avoid  having  to
change  addresses when this is done, we have allocated three different
subnet numbers to this Ethernet, one per building.    (This  would  be
handy  even  if  we didn't plan to split the Ethernet, just to help us
keep track of where computers are.)  However before doing  this,  make
very  sure  that  the  software  on all of your computers can handle a
network that has three different network numbers on it.  This issue is
discussed in more detail in section 3.4.

You also have to choose a "subnet mask".  This is used by the software
on your systems to separate the subnet from the rest of  the  address.
So  far  we  have  always  assumed  that  the first two octets are the
network number, and the next octet is the subnet number.  For class  B
addresses,  the  standards  specify  that the first two octets are the
network number.  However we are free to choose  the  boundary  between
the  subnet  number  and the rest of the address.  It's very common to
have a one-octet subnet number,  but  that's  not  the  only  possible
choice.  Let's look again at a class B address, e.g. 128.6.4.3.  It is
easy to see that if the third octet is used for a subnet number, there
are 256 possible subnets and within each subnet there are 256 possible
addresses.  (Actually, the numbers are more  like  254,  since  it  is
generally a bad idea to use 0 or 255 for subnet numbers or addresses.)
Suppose you know that you will never have more than 128 computers on a
                                  8


given subnet, but you are afraid you might need more than 256 subnets.
(For example, you might have a campus with lots of  small  buildings.)
In that case, you could define 9 bits for the subnet number, leaving 7
bits for addresses within each subnet.  This choice is expressed by  a
bit  mask,  using  ones  for  the  bits used by the network and subnet
number, and 0's for the bits  used  for  individual  addresses.    Our
normal  subnet  mask  choice is given as 255.255.255.0.  If we chose 9
bit subnet numbers and 7 bit  addresses,  the  subnet  mask  would  be
255.255.255.128.

Generally  it is possible to specify the subnet mask for each computer
as part of configuring its IP software.  The IP protocols  also  allow
for computers to send a query asking what the subnet mask is.  If your
network supports broadcast queries, and there is at least one computer
or  gateway  on  the  network  that  knows  the subnet mask, it may be
unnecessary to set it on the other computers.  However this capability
brings  with  it a whole new set of possible problems.  One well-known
TCP/IP implementation would answer with the  wrong  subnet  mask  when
queried,  thus  leading causing every other computer on the network to
be misconfigured.  Thus it may  be  safest  to  set  the  subnet  mask
explicitly on each system.


3.4 Dealing with multiple "virtual" subnets on one network


Most  software  is written under the assumption that every computer on
the local network has the same subnet number.  When traffic  is  being
sent  to  a  machine with a different subnet number, the software will
generally expect to find  a  gateway  to  handle  forwarding  to  that
subnet.  Let's look at the implications.  Suppose subnets 128.6.19 and
128.6.20 are on the same Ethernet.  Consider the way things look  from
the point of view of a computer with address 128.6.19.3.  It will have
no problem sending to other machines with addresses 128.6.19.x.   They
are on the same subnet, and so our computer will know to send directly
to them on the local Ethernet.  However suppose it is asked to send  a
datagram  to  128.6.20.2.    Since  this  is  a different subnet, most
software will expect to find a gateway that handles forwarding between
the  two  subnets.    Of  course there isn't a gateway between subnets
128.6.19 and 128.6.20, since they are on the same Ethernet.  Thus  you
have  to find a way to tell your software that 128.6.20 is actually on
the same Ethernet.

Most common TCP/IP implementations can deal with more than one  subnet
on  a  network.    For  example,  Berkeley  Unix lets you use a slight
modification of the command used to define gateways.  Suppose that you
get  from  subnet  128.6.19  to  subnet  128.6.4 using a gateway whose
address is 128.6.19.1.  You would use the command

  route add 128.6.4.0 128.6.19.1 1

This says that to reach subnet 128.6.4, traffic should be sent via the
gateway  at  128.6.19.1, and that the route only has to go through one
gateway.  The "1" is referred to as the "routing metric".  If you  use
                                  9


a  metric  of  0, you are saying that the destination subnet is on the
same network, and no gateway is needed.  In  our  example,  on  system
128.6.19.3, you would use

  route add 128.6.20.0 128.6.19.1 0

The  actual  address  used  in place of 128.6.19.1 is irrelevant.  The
metric of 0 says that no gateway is actually going to be used, so  the
gateway  address  is  not used.  However it must be a legal address on
the local network.

Note that the commands in this section are simply examples. You should
look  in  the  documentation for your particular implementation to see
how to configure your routing.


3.4.1 Dealing with Multiple Subnets by Turning off Subnetting


There is another  way  to  handle  several  subnets  on  one  physical
network.    This  method  involves  intentionally  misconfiguring your
hosts, so it is potentially dangerous if you don't watch what you  are
doing.    However  it may be easier to deal with when you have lots of
subnets on one physical network.  An example of this is  a  site  that
uses  bridges, and uses subnets simply for administrative convenience.
The trick is to configure the software on your hosts as  if  you  were
not  using  subnets at all.  In this case your hosts will not make any
distinction between the subnets, and they'll have no  trouble  dealing
with all of them.  Now the only problem is how to talk to subnets that
are not on this multi-subnet network.  However if your gateways handle
proxy  ARP,  they  will  solve that problem for you.  This approach is
likely to be  convenient  when  the  same  network  is  carrying  many
subnets, particularly if additional ones are likely to be added later.
However it has two problems:

If you have any hosts with multiple interfaces, you will  have  to  be
very careful.  First, only one interface should be on the multi-subnet
network.  For example, suppose you have a "network" that is made up of
several Ethernets connected by bridges.  You can't have a machine with
interfaces on two of those Ethernets.  However you can have  a  system
with  one  interface  on  the multi-subnet network and another on some
totally separate subnet.  Second, any machine with multiple interfaces
will  have  to  know  the  real  subnet mask, and will need to be told
explicitly which subnets are  on  the  multi-subnet  network.    These
restrictions  come about because a system with multiple interfaces has
to know which interface to use in any given case.

You will have to be careful about the ICMP subnet mask facility.  This
is a facility that allows systems to broadcast a query asking what the
subnet mask is.  If most of  your  hosts  think  the  network  is  not
subnetted,  but  your  gateways and multi-interface hosts think it is,
you've got a potential for confusion.  If a gateway or multi-interface
host  happens to send an ICMP subnet mask reply giving the real subnet
mask, some of your other hosts  may  pick  it  up.    The  reverse  is
                                  10


possible as well.  This means that you will either have to

   - disable  ICMP subnet mask replies on all of the systems that know
     the real subnet mask.  (This may be easy if  only  gateways  know
     it.)

   - make sure that your hosts ignore ICMP replies

According  to the most recent documents, as long as you set the subnet
mask explicitly, hosts are supposed to ignore  the  ICMP  subnet  mask
mechanism.   So you should be able to set different masks on different
hosts without causing any  problem,  as  long  as  you  set  the  mask
explicitly  for  all  of  them.   However we have noticed that some IP
implementations will change their subnet mask when they  see  an  ICMP
subnet mask reply.


3.4.2 Multiple Subnets: Implications for Broadcasting


When  you  have more than one subnet on the same physical network, you
need to give some thought to broadcast addresses.   According  to  the
latest  standards,  there  are two different ways for a host on subnet
128.6.20 to send a broadcast on the local network.    One  is  to  use
address  128.6.20.255.    The other is to use address 255.255.255.255.
128.6.20.255  says  explicitly  "all  hosts   on   subnet   128.6.20".
255.255.255.255  says "all hosts on my local network".  Normally these
have the same effect.  However they do  not  when  there  are  several
subnets  on  one  physical network.  If subnet 128.6.19 is on the same
Ethernet,  it  is   also   going   to   receive   messages   sent   to
255.255.255.255.    However  hosts  with  numbers  128.6.19.x will not
listen to broadcasts to 128.6.20.255.  The  result  is  that  the  two
different  forms  of  broadcast  address  will have somewhat different
meanings.  This means that you will have  to  exercise  some  care  in
configuring  software  on  networks  such  as  this, to make sure that
broadcasts go where you intend them to go.


3.5 Choosing an address class


When you apply for an official network number, you will be asked  what
class  of network number you need.  The possible answers are A, B, and
C. This affects how large an address  space  you  will  be  allocated.
Class  A addresses are one octet long, class B addresses are 2 octets,
and class C addresses are 3  octets.    This  represents  a  tradeoff:
there are a lot more class C addresses than class A addresses, but the
class C addresses don't allow as many hosts.  The idea was that  there
would  be  a few very large networks, a moderate number of medium-size
ones, and a lot of mom-and-pop stores with small networks.  Here is  a
table showing the distinction:


                                  11


       class  range of first octet   network  rest  possible addresses
         A       1 - 126               p      q.r.s    16777214
         B       128 - 191             p.q      r.s    65534
         C       192 - 223             p.q.r      s    254

For  example  network  10,  a  class  A network, has addresses between
10.0.0.1 and 10.255.255.254.  So it allows 254**3, or about 16 million
possible  addresses.    (Actually,  network 10 has allocated addresses
where some of the octets are zero, so there are a few  more  addresses
possible.)    Network  192.12.88,  a class C network has hosts between
192.12.88.1 and 192.12.88.254, i.e. 254 possible hosts.

In general, you will be expected to choose the lowest class that  will
provide  you with enough addresses to handle your growth over the next
few years.  Organizations that have computers in many  buildings  will
probably need and be able to get a class B address, assuming that they
are going to use subnetting.  (If you are going to use  many  separate
network  numbers,  you  would  ask for a number of class C addresses.)
Class A addresses are normally used only for large public networks and
for a few very large corporate networks.


3.6 Dialup IP and Micro gateways: Dynamically assigned addresses


In  most  cases, each of your computers will have its own permanent IP
address.  However there are a few situations where it makes more sense
to  allocate  addresses  dynamically.    The most common cases involve
dialup IP, and gateways intended primarily for microcomputers.


3.6.1 Dialup IP


It is possible to run IP over dialup lines.  The protocol for doing so
is  called  SLIP  ("serial  line IP").  SLIP is useful in at least two
different circumstances:

   - As a low-cost alternative to permanent point to point lines,  for
     cases  where  there  isn't  enough  traffic  to justify dedicated
     lines.

   - As a way to connect individual PC's into a network when they  are
     located  in  buildings  that  don't  have  Ethernets or other LAN
     technology.

I am going to use the term "SLIP server" to refer to a computer system
that  has  modems  attached,  which other systems can connect to using
SLIP.  Such a system will provide a gateway into your network  for  PC
users or for other networks that connect using SLIP.

If  you  have  a number of individual PC's dialing up with SLIP, it is
often not practical to assign each PC its own IP  address.    For  one
                                  12


thing,  there  may just not be enough addresses.  In order to keep the
routing straight, the dialup systems have to get addresses on the same
subnet  as  the  SLIP  server.    Generally  there  are only 256 or so
addresses available on each subnet.  If you have more PC's than  that,
you  can't give each one its own address.  If you have SLIP servers on
more than one subnet, this will make  permanent  addresses  even  more
difficult.    If a user wanted to be able to call both servers, his PC
would need two addresses, one for each subnet.

In order to avoid these problems,  many  SLIP  implementations  assign
addresses  dynamically.   When a PC first connects to the SLIP server,
the server finds an unused IP address and assigns it to the PC.    The
simplest  way to manage this is to give each SLIP server a range of IP
addresses that it keeps track of and can assign.

When you use such a scheme, your SLIP software has to include some way
for  the  server to tell the PC what address to use.  If each PC has a
permanent address, you have the reverse problem:  when a  PC  connects
to  a server, there has to be a way for the PC to tell the server what
its address is.  Some care is needed.  Otherwise  someone  could  have
his PC claim to be yours and steal all your files.

Unfortunately,  there  is  no  standard way to manage these addressing
issues with SLIP.  There are several SLIP implementations that  handle
them, but there isn't a single standard yet.  Until such a standard is
developed, you need to check out SLIP software carefully.   Make  sure
that  it assigns addresses the way you want, and that your SLIP server
and your PC's agree on how to figure out the PC's address.

I recommend giving the PC's permanent addresses in cases  where  other
computers  have to be able to tell which PC they are talking to.  This
would be the case if the PC is going to receive private computer mail,
or  engage in other sensitive transactions.  I recommend using dynamic
addresses where you have a lot of PC's,  and  where  the  applications
that they access over the network do their own security checking.

When  you  are  using  SLIP  to  connect  two networks, you have three
choices for handling addressing (although not all  SLIP  software  can
handle all three choices):

   - Treat  SLIP connections like point to point lines that just don't
     happen to be up all  the  time.    If  you  call  more  than  one
     computer,  each  pair  of  computers  that  talks  has a separate
     network number which they use only when they talk to each other.

   - Use routing software that allows anonymous interfaces.   In  that
     case no address is needed at all.

   - Assign  addresses dynamically when the connection is opened, just
     as you would for a PC that is dialing up.

If you make connections only to one or two other systems, it is  quite
reasonable  to  use a network number for each connection.  This method
makes it easy to keep usage and error statistics.

                                  13


If you have many different connections, it is  probably  best  to  use
anonymous   interfaces.    You  would  probably  use  dynamic  address
allocation only if your routing technology did not  support  anonymous
interfaces.


3.6.2 Micro gateways


It  is  perfectly  possible for microcomputers to participate in an IP
network.  However there seems to be  a  tendency  for  micros  to  use
somewhat  different  network  technology than larger systems.  This is
because many micro users start with specialized network software whose
design  is  tailored specifically to the needs of micros, or even some
particular type of micro.  Micro users quite naturally want to be able
to  start  using  TCP/IP  without  having to abandon any special micro
network that they are already using.   For  that  reason  there  is  a
growing number of gateway products that allow PC's to access both some
micro-oriented network product and TCP/IP.

In this section, Apple's AppleTalk is used as an  example.    This  is
because  gateways  for  it  have  existed  for  some  time, and are in
widespread use.  However similar  products  exist  for  several  other
micro  network  technologies.   Note that the term AppleTalk refers to
the Apple network protocols, whereas LocalTalk refers to the  specific
twisted-pair  technology on which AppleTalk was initially implemented.
Thus AppleTalk is analogous to the TCP/IP protocols, whereas LocalTalk
is analogous to the Ethernet medium.

Several  vendors  supply  gateways to connect AppleTalk running over a
LocalTalk network with IP running over Ethernet.  Although  there  are
several  products  of  this  kind,  most  of them supply the following
services:

   - TCP/IP applications on the PC can connect to  TCP/IP  systems  on
     the  Ethernet.    Special  facilities  are  defined  to  allow IP
     datagrams to be carried over LocalTalk between  the  PC  and  the
     gateway.   TCP/IP applications on the PC have to be written using
     a special library that uses a mixture of  AppleTalk  and  TCP/IP.
     The  AppleTalk  facilities are needed to get the datagrams to the
     gateway, where they are transformed into pure TCP/IP before being
     put  out  onto  the  Ethernet.    Thus  the TCP/IP systems on the
     Ethernet don't know they are talking to micros.

   - AppleTalk applications can be written for larger systems, so that
     PC's  can  use  them  as servers.  These applications are written
     using a special library that is more or less the reverse  of  the
     one  just  described.   Again, it uses a mixture of AppleTalk and
     TCP/IP.  But this time TCP/IP facilities are needed  to  get  the
     datagrams  to  the  gateway, where they are transformed into pure
     AppleTalk  before  being  put  onto  the  LocalTalk  network   to
     communicate with the PC's.  Thus the PC's can access applications
     on the larger systems, without  knowing  that  they  are  on  the
     Ethernet rather than an Apple network.
                                  14


   - A campus or corporate IP network can be used to connect AppleTalk
     networks at different locations.  Gateways at each location  wrap
     up  AppleTalk  datagrams  inside IP datagrams, and send them over
     the main IP network.

In addition, some newer gateways will  translate  at  the  application
level.    For  example  one  gateway  will translate between the Apple
filing protocol and Sun's Network File System.  This allows  a  PC  to
access  a  Unix  file  system,  with  the  PC  using  the Apple filing
protocol, and the final access to the Unix  system  being  done  using
Sun's Network File System.

Unfortunately  the  flexibility  of products like this also means that
they are complex.  Addressing  issues  are  particularly  complicated.
For  the  same  reasons  as  SLIP, these gateways often use dynamic IP
address allocation. A range  of  IP  addresses  is  assigned  to  each
gateway.   When a PC attempts to open its first TCP/IP connection, the
gateway picks a free IP address and assigns it  to  the  PC.  As  with
SLIP,  you  will often need to choose whether you want addresses to be
assigned this way, or you want  each  PC  to  have  its  own  address.
Again,  this  depends upon how many PC's you have and whether you have
applications which must be able to use the IP address to identify  the
particular PC that is talking to it.

Addressing  is  further complicated by the fact that AppleTalk has its
own addressing structure.   So  you  must  define  a  mapping  between
AppleTalk  and  IP  network  numbers.    There  must also be a mapping
between individual IP addresses  and  AppleTalk  addresses,  but  this
mapping is maintained dynamically by the gateways.


4. Network-wide Services, Naming


If  you  are  going to have a TCP/IP network, there are certain things
that you are going to have to do centrally.  Some of them  are  simply
administrative.    The  most  important  is  that  you  will a central
registry of names and IP  addresses.    The  DDN  Network  Information
Center performs this role for the Internet network as a whole.  If you
are connected to the international Internet, your  administrator  will
need  to  register  with  the  DDN Network Information Center, so that
queries from other institutions about your hosts are forwarded to your
servers.

You will want to maintain a database containing information about each
system on your network.  At a minimum, you need to have the host  name
and  IP  address  for each system.  Probably the central registry will
assign IP addresses.  If your network is  subnetted,  or  if  you  use
multiple  class  C  network numbers, the registry will probably assign
network numbers to new networks or subnets.  Most commonly, individual
host  administrators  will  be allowed to choose their own host names.
However the registry must at least verify that there are no  duplicate
names.    If you have a very large network, you may choose to delegate
some  of  these  tasks  to  subregistries,  possibly  one   for   each
                                  15


department.

We  suggest that you assign numbers in the simplest way: starting from
1.  Thus if your network is 128.6, you would assign  128.6.1  as  your
first subnet, 128.6.2 as the second, etc.  IP addresses for individual
hosts should probably start at 2.  This allows you  to  reserve  1  on
each  subnet  for  use  by  a  gateway.  Thus the first host on subnet
128.6.4 would be 128.6.4.2, the next  128.6.4.3,  etc.    There  is  a
specific  reason  for  keeping addresses as small as possible.  If you
have a large organization, you may run out of subnet numbers.  If  you
do, and if your host numbers are small, you can assign another bit for
the subnet.  For example, we use the entire third octet  as  a  subnet
number.  As long as all of our host numbers are less than 128, we will
be able to expand to  9-bit  subnet  numbers.    For  example,  subnet
128.6.4  would  be  split  into  two  separate  subnets, 128.6.4.0 and
128.6.4.128.  If we had assigned host numbers above  128,  this  split
would be impossible.

Host  names need not be so systematic.  They can start with almost any
word made up of letters numbers, and hyphens.  It is  safest  for  the
first  character  to  be a letter.  It will be easier for users if the
name is fairly short.  (We have seen software that has trouble dealing
with  names  longer  than  16  characters.)  Many times departments or
projects choose a theme, and pick names that are consistent with them.
For  example,  the machines used by computer science graduate students
at Rutgers are named after rock bands:  STEELEYE,  BAND,  TREX,  DEVO,
etc.    Our math department uses famous mathematicians: GAUSS, FERMAT,
etc.  If your institution  does  not  have  any  connection  with  the
outside world, such one-word names are all you need.

If  you  are  connected  to  with  the  international  Internet,  your
organization will need to get a "domain name."  This  is  assigned  to
you by the DDN Network Information Center, just as your network number
is.  Unlike the network number, you can get along without one if  your
network  is isolated.  If you find later that you need one, it is easy
to add a domain name.  (We recommend that you start with  an  official
network  number  from  the  beginning because changing network numbers
later can be traumatic.)   Domain  names  normally  end  in  .EDU  for
educational  institutions,  .COM  for  companies,  etc.   For example,
Rutgers  University  has  a  domain  name  of  .RUTGERS.EDU   A   full
domain-style  host  name  consists  of  your  one-word  internal  name
followed by  your  organization's  domain  name.    For  example,  the
computer  I normally use is known internally as ATHOS.  It's full name
is ATHOS.RUTGERS.EDU If you have a large organization, it is  possible
to have sub-domains.  For example, you might have a subdomain for each
department.  This adds another period to your names.  For example, the
computer  science department might have decided to create a subdomain.
In   this   case,   my   computer    would    probably    be    called
ATHOS.CS.RUTGERS.EDU Once you get a domain name assigned to you, it is
wise to change all of your configuration files so that the  full  form
of  name  is  used.    However your software can be set up so that the
one-word versions are accepted as nicknames.    That  way  your  users
don't have to type out the long form.

If  you  have more than one or two systems, you are going to need some
                                  16


way to keep host information up  to  date  on  all  of  your  systems.
TCP/IP  software  needs  to  be  able  to translate host names into IP
addresses.  When a user tries to connect to another system,  he  wants
to  be able to refer to it by name.  The software has to translate the
name into the IP address in  order  to  open  the  connection.    Most
software  provides two ways to do this translation:  a static table or
a name server.  The  table  approach  is  probably  easier  for  small
organizations, as long as they are not connected to any other network.
You simply create a file that lists the names  and  addresses  of  all
your hosts.  Here's part of our host table:

    HOST: 128.6.4.2, 128.6.25.2 : ARAMIS.RUTGERS.EDU,ARAMIS : SUN-3-28
    HOST: 128.6.4.3 : GAUSS.RUTGERS.EDU,GAUSS : SUN-3-180 : UNIX ::
    HOST: 128.6.4.4, 128.6.25.4 : ATHOS.RUTGERS.EDU,ATHOS : SUN-4-280

This  format  has  one  line for each system, and lists its addresses,
names, and other information about it.  Note that aramis and athos are
both  on  two  networks,  so  they have two addresses.  They have both
primary names, e.g. ARAMIS.RUTGERS.EDU, and nicknames, e.g.    ARAMIS.
Since  we  are  attached  to  the Internet, our primary name is a full
domain name.  We supply brief nicknames to  make  it  easier  for  our
users.    There  is one other commonly-used format for the host table.
Here's an example of that format:

    128.6.4.2   aramis.rutgers.edu aramis
    128.6.25.2  aramis.rutgers.edu aramis
    128.5.4.3   gauss.rutgers.edu gauss
    128.6.4.4   athos.rutgers.edu gauss
    128.6.25.4  athos.rutgers.edu gauss

In this format, each line represents a single IP address. If a  system
has  two  interfaces,  there  are  two lines in the table for it.  You
should try to put the address first that is likely  to  be  used  more
often.  The documentation for your systems should indicate what format
they want the host information to use.

In the simplest setup, every computer has its own  copy  of  the  host
table.    If  you  choose  to  use  the setup, you will want to set up
procedures to make sure that systems get updated copies  of  the  host
table regularly.

Larger sites, and all sites that are connected to the Internet, should
use name servers instead of individual host tables.  A name server  is
a  program  that  you  run  on  a few of your systems to keep track of
names.  When a program needs to look up a name, instead of looking for
a copy of the host table, it sends a network query to the name server.
This approach has two advantages:

   - For a large site, it is easier to keep tables up to date on a few
     name servers than on every system.

   - If  your site is connected to the Internet, your name server will
     be able to talk to name servers at other organizations, and  look
     up names elsewhere.

                                  17


Using  a  name  server is the only way to have access to complete host
information about the rest of the Internet.

It is important to understand the difference between a name server and
a resolver.  A name server is a program that accesses a host database,
and answers queries from other programs.   A  resolver  is  a  set  of
subroutines  that  can  be  loaded  with  your  program.  It generates
queries to the name server, and processes the responses.  Every system
should  use the resolver.  (Actually, the resolver is generally loaded
with each program that uses the network, since it's simply  a  set  of
subroutines.)   You only need a few name servers.  Many people confuse
these two concepts, and come to believe that every computer  needs  to
run a name server.

In  order  to  use a resolver, each computer will need a configuration
file or other option that specifies the address of a name server where
queries  should  be  sent.   Generally you should specify several name
servers, in case one of them is down.  If your system cannot reach any
name  server,  much of your software is likely to misbehave.  Thus you
should be very careful to have enough name servers around  that  every
system can always reach at least one name server.

Name servers generally have a number of configuration options.  Rather
than giving advice here on setting up a name server,  I  am  going  to
refer  you  to  two  official  Internet standards documents.  Both are
available from the DDN Network Information Center, SRI  International,
333  Ravenswood  Avenue,  Menlo  Park,  California  94025  (telephone:
800-235-3155).  RFC 1032 contains instructions for  getting  a  domain
name  from  the  Network  Information  Center, including the necessary
forms.  RFC 1033 contains instructions on how to set up a name server.
Like  this  document,  these  documents are conceptual.  You will also
need documentation for the specific name server software that you  are
going  to use.  [This paragraph is a cop-out.  Future editions of this
document will contain  some  advice  on  setting  up  a  name  server.
However  RFC  1033  is  almost  unique  in  that  it  is  directed  at
administrators rather than networking experts.  Thus it is  reasonable
to direct people there for the moment.]

In  some cases you may need to use both fixed tables and name servers.
If you have some TCP/IP implementations that do not include resolvers,
then  you  will  have  to have host tables for those systems.  If your
network is connected to the international Internet, you are  going  to
have problems with systems that don't have resolvers.  The Internet is
too big for there to be a host table that  lists  all  of  its  hosts.
Thus you will have to put together a host table that lists those hosts
that your users tend to use.    The  DDN  Network  Information  Center
maintains a host table that will be a good starting point.  However it
is by no means complete.  So you will have to add your users' favorite
hosts  to it.  Systems that use a resolver will not have this problem,
since the name servers are able to translate any legal host name.

Host name and number allocation is the only facility that  has  to  be
done centrally.  However there are other things that you may prefer to
do centrally.  It is very common to have one  or  two  computers  that
handle  all  computer  mail.    If are on the Internet, it is easy for
                                  18


every one of your computers to talk directly to any other computer  on
the  Internet.    However  most  institutions want to communicate with
systems on other networks, such as  Bitnet  and  Usenet.    There  are
gateways  between  the  various  networks.    But  choosing  the right
gateway, and transforming  computer  mail  addresses  correctly  is  a
rather  specialized  business.  Thus many sites set up the appropriate
software only one place and direct all external mail (or all  external
mail to hosts that are not on the Internet) through this system.


5. Setting up routing for an individual computer


All  TCP/IP  implementations require some configuration for each host.
In some cases this is done  during  "system  generation".    In  other
cases,  various  startup and configuration files must be set up on the
system.  Still other systems get configuration information across  the
network  from  a "server".  While the details differ, the same kind of
information needs to be  supplied  for  most  implementations.    This
includes

   - parameters  describing  the  specific  machine,  such  as  its IP
     address.

   - parameters describing the network, such as the  subnet  mask  (if
     any)

   - routing software and the tables that drive it

   - startup of various programs needed to handle network tasks

Before  a  machine  is installed on your network, a coordinator should
assign it a host name and IP address, as described above.    Once  you
have  name  and  address,  you  are  ready  to  start configuring your
computer.  Often  you  have  to  put  the  address  and  name  into  a
configuration   file   on   the  computer.    However  some  computers
(particularly those without permanent  disks  on  which  configuration
information  could  be  stored) get this information over the network.
When such a system starts, it broadcasts a request over  the  network.
In  effect,  this  request says "who am I?"  If you have any computers
like this, you will have to make sure that some system on your network
is  ready  to  answer  these questions.  The obvious issue is: how can
another system tell who you are?  Generally  this  is  done  based  on
Ethernet  address  (or  the  analogous  address  for  other  types  of
network).    Ethernet  addresses  are   assigned   by   the   computer
manufacturer.    It  is guaranteed that only one machine in the entire
world has any particular Ethernet address.  The  address  is  normally
stored  in ROM somewhere in the machine.  The machine may not know its
IP address, but it does know its Ethernet address.  Thus the  "who  am
I"  request includes the Ethernet address.  Systems that are set up to
answer such requests have a table that lists  Ethernet  addresses  and
the  corresponding  IP  address.    This lets them know how to answer.
Unfortunately, you have to set this table up manually.  Generally  you
know the IP address, because your address coordinator has assigned it.
                                  19


The only problem in constructing the table will  be  finding  out  the
Ethernet address for each computer.  Generally, computers are designed
so that they print the Ethernet address on the console  shortly  after
being  turned on.  However in some cases you may have to find a way to
bring  the  computer  up  and  then  type  a  command  that   displays
information about the Ethernet interface.

Generally  the subnet mask should be specified in a configuration file
associated with the computer.    (For  Unix  systems,  the  "ifconfig"
command is used to specify both the Internet address and subnet mask.)
However there are provisions in the IP protocols  for  a  computer  to
broadcast a request asking for the subnet mask.  The subnet mask is an
attribute of the network.  It is the same for all computers on a given
subnet.    Thus there is no separate subnet table corresponding to the
Ethernet/Internet  address  mapping  table  used  to  answer   address
queries.    Ideally,  only  a  few authoritative computers will answer
queries about the subnet mask.  However  many  TCP/IP  implementations
are  set  up so that any machine on the network that believes it knows
the subnet mask will  answer.    If  your  TCP/IP  is  like  this,  an
incorrect  subnet  mask  setting  on  one  machine can cause confusion
throughout the network.

Normally the startup files do roughly the following things:

   - load any special device drivers that may be necessary.  (This  is
     particularly  common with PC's, where network access is likely to
     depend upon add-on controller cards and software that is not part
     of the original operating system.)

   - enable each of the network interfaces (Ethernet interface, serial
     lines, etc.)   Normally  this  involves  specifying  an  Internet
     address  and  subnet mask for each, as well as other options that
     will be described in your vendor's documentation.

   - establish network routing information, either  by  commands  that
     add  fixed  routes,  or  by  starting a program that obtains them
     dynamically.

   - turn on the domain system (used for looking up names and  finding
     the  corresponding  Internet  address  --  see the section on the
     domain system in the Introduction to  TCP/IP).    Note  that  the
     details  of  this  will  depend  upon  how  the  domain system is
     configured.  In most cases only a few hosts actually  run  domain
     name  servers  that  must  be  started.   Other hosts simply need
     configuration files that specify where the nearest name server is
     located.

   - set various other information needed by the system software, such
     as the name of the system itself.

   - start various "daemons".  These are programs that provide network
     services  to  other  systems on the network, and to users on this
     system.  In the case of PC's, which  often  cannot  run  multiple
     processes,  similar  facilities  may  be  provided  by  so-called
     "TSR"'s, or they may be built into the device drivers.
                                  20


It is not practical to document these  steps  in  detail,  since  they
differ for each vendor.  This section will concentrate on a few issues
where your choice will depend upon overall decisions  about  how  your
network  is  to  operate.   These overall network policy decisions are
often not as well documented by the vendors as the details of  how  to
start  specific  programs.    Note that some care will be necessary to
integrate commands that you add for routing, etc.,  into  the  startup
sequence  at  the  right  point.  Some of the most mysterious problems
occur when network routing is not set up before  a  program  needs  to
make  a  network  query,  or when a program attempts to look up a host
name before the name server has finished loading all of the names from
a master name server.


5.1 How datagrams are routed


If your system consists of a single Ethernet or similar medium, you do
not need to give routing much attention.   However  for  more  complex
systems,  each  of  your  machines  needs a routing table that lists a
gateway and interface to use for every possible  destination  network.
A  simple  example of this was given at the beginning of this section.
However it is now necessary to describe the way routing works in a bit
more  detail.  On most systems, the routing table looks something like
the following. (This example was taken from a system running  Berkeley
Unix,  using  the  command  "netstat  -n -r".  Some columns containing
statistical information have been omitted.)

    Destination          Gateway              Flags       Interface

    128.6.5.3            128.6.7.1            UHGD        il0
    128.6.5.21           128.6.7.1            UHGD        il0
    127.0.0.1            127.0.0.1            UH          lo0
    128.6.4              128.6.4.61           U           pe0
    128.6.6              128.6.7.26           U           il0
    128.6.7              128.6.7.26           U           il0
    128.6.2              128.6.7.1            UG          il0
    10                   128.6.4.27           UG          pe0
    128.121              128.6.4.27           UG          pe0
    default              128.6.4.27           UG          pe0

The example system is connected to two Ethernets:

      controller  network   address     other networks
         il0      128.6.7   128.6.7.26    128.6.6
         pe0      128.6.4   128.6.4.61    none

The first column shows the name  for  the  Ethernet  interface.    The
second  column  is  the  network  number for that Ethernet.  The third
column is this computer's Internet address on that network.  The  last
column shows other subnets that share the same physical network.

Now  let's  look  at the routing table.  For the moment, let us ignore
the first 3 lines.  The majority of the table consists  of  a  set  of
                                  21


entries  describing  networks.    For  each  network,  the other three
columns show where to send datagrams destined for that  network.    If
the  "G"  flag  is  present  in  the  third column, datagrams for that
network must be sent through a gateway.  The second column  shows  the
address  of  the  gateway to be used.  If the "G" flag is not present,
the computer is directly connected to the network  in  question.    So
datagrams  for  that network should be sent using the controller shown
in the third column.    The  "U"  flag  in  the  third  column  simply
indicates  that  the route specified by that line is up.  (Generally a
route is assumed to be up  unless  attempts  to  use  it  consistently
result in errors.)

The  first  3  lines  show "host routes", indicated by the "H" flag in
column three.    Routing  tables  normally  have  entries  for  entire
networks or subnets.  For example, the entry

    128.6.2              128.6.7.1            UG          il0

indicates  that  datagrams  for  any computer on network 128.6.2 (i.e.
addresses 128.6.2.1 through 128.6.2.254) should  be  sent  to  gateway
128.6.7.1  for  forwarding.   However sometimes routes apply only to a
specific computer, rather than to a whole network.  In  that  case,  a
host  route  is used.  The first column then shows a complete address,
and the "H" flag is present in column 3.  E.g. the entry

    128.6.5.21           128.6.7.1            UHGD        il0

indicates that datagrams for the specific address 128.6.5.21 should be
sent  to  the gateway 128.6.7.1.  As with network routes, the "G" flag
is used for routes that involve a gateway.   The  "D"  flag  indicates
that  the  route  was  added  dynamically,  based  on an ICMP redirect
message from a gateway.  (See below for details.)

The following route is special:

    127.0.0.1            127.0.0.1            UH          lo0

127.0.0.1 is the address of the "loopback device".  This  is  a  dummy
software  module.  Any datagram sent out through that "device" appears
immediately as input.  It can be  used  for  testing.    The  loopback
address  can  also  handy for talking to applications that are on your
own computer.  (Why bother to use your network to talk  to  a  program
that is on the same machine you are?)

Finally, there are "default" routes, e.g.

    default              128.6.4.27           UG          pe0

This route is used for datagrams that don't match any other entry.  In
this case, they are sent to a gateway with address 128.6.4.27.

In most systems, datagrams are routed by looking  up  the  destination
address  in  a  table  such as the one just described.  If the address
matches a specific host route, then that is used.   Otherwise,  if  it
matches  a  network route, that is used.  If no other route works, the
                                  22


default is used.  If there is no default, the user should get an error
message such as "network is unreachable".

The  following sections will describe several ways of setting up these
routing tables.  Generally, the actual operation of sending  datagrams
doesn't depend upon which method you use to set up the routes.  When a
datagram is to be sent, its destination is looked  up  in  the  table.
The  different  routing methods are simply more and less sophisticated
ways of setting up and maintaining the tables.


5.2 Fixed routes


The simplest way to set up routing is to use  fixed  commands.    Your
startup  files  contain  commands to set up the routing table.  If any
changes are needed, you make them manually, using  commands  that  add
and  delete  entries  in  the  routing  table.   (When you make such a
change, don't forget to update the startup files also.)   This  method
is practical for relatively small networks, particularly if they don't
change very often.

Most computers automatically set up  some  routing  entries  for  you.
Unix  will  add  an  entry  for the networks to which you are directly
connected.  For example, your startup file might contain the commands

      ifconfig ie0 128.6.4.4 netmask 255.255.255.0
      ifconfig ie1 128.6.5.35 netmask 255.255.255.0

These  specify  that  there  are  two  network  interfaces,  and  your
addresses on them.  The system will automatically create routing table
entries

    128.6.4              128.6.4.4            U           ie0
    128.6.5              128.6.5.35           U           ie1

These specify that  datagrams  for  the  local  subnets,  128.6.4  and
128.6.5, should be sent out the corresponding interface.

In  addition  to  these,  your startup files would contain commands to
define routes to whatever other networks you wanted  to  reach.    For
example,

      route add 128.6.2.0 128.6.4.1  1
      route add 128.6.6.0 128.6.5.35 0

These  commands  specify  that  in  order  to reach network 128.6.2, a
gateway at address 128.6.4.1 should be used, and that network  128.6.6
is  actually  an  additional  network  number for the physical network
connected to interface 128.6.5.35.   Some  other  software  might  use
different  commands  for these cases.  Unix differentiates them by the
"metric", which is the number at the end of the command.   The  metric
indicates  how  many  gateways the datagram will have to go through to
get to the destination.  Routes with metrics of 1 or  greater  specify
                                  23


the  address of the first gateway on the path.  Routes with metrics of
0 indicate that no gateway  is  involved  --  this  is  an  additional
network number for the local network.

Finally, you might define a default route, to be used for destinations
not listed explicitly.  This would normally  show  the  address  of  a
gateway   that   has   enough   information  to  handle  all  possible
destinations.

If your network has only one gateway attached to it,  then  of  course
all  you  need is a single entry pointing to it as a default.  In that
case, you need not worry further about  setting  up  routing  on  your
hosts.    (The  gateway  itself needs more attention, as we will see.)
The following sections are intended to provide  help  for  setting  up
networks where there are several different gateways.


5.3 Routing redirects


Most  TCP/IP  experts  recommend  leaving  routing  decisions  to  the
gateways.  That is, it is probably a bad  idea  to  have  large  fixed
routing  tables  on each computer.  The problem is that when something
on the network changes, you have to go around to  many  computers  and
update  the  tables.    If  changes  happen  because a line goes down,
service may not be restored until someone has a chance to  notice  the
problem and change all the routing tables.

The  simplest way to keep routes up to date is to depend upon a single
gateway to update your routing tables.  This gateway should be set  as
your  default.  (On Unix, this would mean a command such as "route add
default  128.6.4.27  1",  where  128.6.4.27  is  the  address  of  the
gateway.)   As described above, your system will send all datagrams to
the default when it doesn't have any better route.    At  first,  this
strategy  does  not sound very good if you have more than one gateway.
After all, if all you have is a single default  entry,  how  will  you
ever  use  the other gateways in the cases where they are better?  The
answer is that most gateways are able to send  "redirects"  when  they
get  datagrams  for  which  there  is a better route.  A redirect is a
specific kind of message using  the  ICMP  (Internet  Control  Message
Protocol).    It contains information that generally translates to "In
the future, to get to address XXXXX, please use gateway YYYYY  instead
of  me".    Correct  TCP/IP implementations use these redirects to add
entries to their routing table.  Suppose your routing table starts out
as follows:

    Destination          Gateway              Flags       Interface

    127.0.0.1            127.0.0.1            UH          lo0
    128.6.4              128.6.4.61           U           pe0
    default              128.6.4.27           UG          pe0

This  contains  an entry for the local network, 128.6.4, and a default
pointing to the gateway 128.6.4.27.  Suppose there is also  a  gateway
                                  24


128.6.4.30,  which  is the best way to get to network 128.6.7.  How do
you find it?  Suppose you have datagrams to send to 128.6.7.23.    The
first  datagram  will go to the default gateway, since that's the only
thing in the routing table.  However the default gateway,  128.6.4.27,
will  notice  that 128.6.4.30 would really be a better route.  (How it
does that is up to the gateway.  However there are some fairly  simple
methods  for a gateway to determine that you would be better off using
a  different  one.)    Thus  128.6.4.27  will  send  back  a  redirect
specifying   that   datagrams   for  128.6.7.23  should  be  sent  via
128.6.4.30.  Your TCP/IP software will add a routing entry

    128.6.7.23           128.6.4.30           UDHG         pe0

Any future datagrams for 128.6.7.23  will  be  sent  directly  to  the
appropriate gateway.

This  strategy  would  be a complete solution, if it weren't for three
problems:

   - It requires each computer to have  the  address  of  one  gateway
     "hardwired" into its startup files, as the initial default.

   - If a gateway goes down, routing table entries using it may not be
     removed.

   - If your network uses subnets, and your TCP/IP implementation does
     not handle them, this strategy will not work.

How  serious  the  first  problem is depends upon your situation.  For
small networks, there is no problem modifying startup  files  whenever
something  changes.   But some organizations can find it very painful.
If network topology changes, and a gateway  is  removed,  any  systems
that  have  that  gateway  as their default must be adjusted.  This is
particularly serious if the people who maintain the  network  are  not
the  same  as  those  maintaining  the individual systems.  One simple
appoach is to make sure that the default address never changes.    For
example,  you might adopt the convention that address 1 on each subnet
is the default gateway for  that  subnet.    For  example,  on  subnet
128.6.7,  the  default  gateway  would  always  be 128.6.7.1.  If that
gateway is ever removed, some other gateway  is  given  that  address.
(There  must  always  be  at least one gateway left to give it to.  If
there isn't, you are completely cut off anyway.)

The biggest problem with the description given so far is that it tells
you how to add routes but not how to get rid of them.  What happens if
a gateway goes down?  You want traffic to  be  redirected  back  to  a
gateway  that is up.  Unfortunately, a gateway that has crashed is not
going to issue Redirects.  One solution is  to  choose  very  reliable
gateways.  If they crash very seldom, this may not be a problem.  Note
that Redirects can be used to handle some kinds  of  network  failure.
If  something  fails  in  a  distant part of the network, your current
route may no longer be a good one.  As long as the  gateway  to  which
you  are talking is still up and talking to you, it can simply issue a
Redirect to the gateway that is now the best one.  However  you  still
need  a  way  to  detect  failure  of one of the gateways that you are
                                  25


talking to directly.

The best approach for handling failed  gateways  is  for  your  TCP/IP
implementation  to  detect  routes  that  have  failed.  TCP maintains
various timers that allow the software to detect when a connection has
broken.    When  this  happens, one good approach is to mark the route
down, and go back to the default gateway.  A similar approach can also
be used to handle failures in the default gateway.  If you have marked
two gateways as default,  then  the  software  should  be  capable  of
switching   when   connections   using  one  of  them  start  failing.
Unfortunately, some common TCP/IP implementations do not  mark  routes
as down and change to new ones.  In particular, Berkeley 4.2 Unix does
not.  However Berkeley 4.3 Unix does do this,  and  as  other  vendors
begin  to  base  products  on  4.3  rather  than  4.2, this ability is
expected to become more common.


5.4 Other ways for hosts to find routes


As long as your  TCP/IP  implementations  handle  failing  connections
properly, establishing one or more default routes in the configuration
file is likely to be the simplest way  to  handle  routing.    However
there  are two other routing approaches that are worth considering for
special situations:

   - spying on the routing protocol

   - using proxy ARP


5.4.1 Spying on Routing


Gateways generally  have  a  special  protocol  that  they  use  among
themselves.     Note  that  redirects  cannot  be  used  by  gateways.
Redirects are simply ways for gateways to tell "dumb" hosts to  use  a
different  gateway.    The  gateways  themselves  must have a complete
picture of the network, and a way to compute the optimal route to each
subnet.      Generally   they  maintain  this  picture  by  exchanging
information among themselves.  There  are  several  different  routing
protocols  in  use  for  this purpose.  One way for a computer to keep
track of gateways is for it to listen to the gateways' messages  among
themselves.   There is software available for this purpose for most of
the common routing protocols.    When  you  run  this  software,  your
computer  will maintain a complete picture of the network, just as the
gateways do.  The software is  generally  designed  to  maintain  your
computer's  routing  tables  dynamically, so that datagrams are always
sent to the proper gateway.  In effect, the  routing  software  issues
the  equivalent of the Unix "route add" and "route delete" commands as
the network topology changes.  Generally this results  in  a  complete
routing  table,  rather  than  one  that  depends upon default routes.
(This assumes that the gateways themselves maintain a complete  table.
                                  26


Sometimes  gateways  keep track of your campus network completely, but
use a default route for all off-campus networks, etc.)

Running routing software on each host does in some sense  "solve"  the
routing  problem.    However there are several reasons why this is not
normally recommended except as  a  last  resort.    The  most  serious
problem  is  that this reintroduces configuration options that must be
kept up to date on each host.  Any computer that wants to  participate
in the protocol among the gateways will need to configure its software
compatibly  with  the  gateways.      Modern   gateways   often   have
configuration  options  that  are  complex  compared  with those of an
individual host.  It is undesirable to spread these to every host.

There is a somewhat more specialized  problem  that  applies  only  to
diskless  computers.   By its very nature, a diskless computer depends
upon the network and file servers to load programs and to do swapping.
It  is  dangerous  for  diskless  computers  to  run any software that
listens to network broadcasts.   Routing  software  generally  depends
upon  broadcasts.    For  example,  each  gateway on the network might
broadcast its routing tables every  30  seconds.    The  problem  with
diskless nodes is that the software to listen to these broadcasts must
be loaded over the network.  On a busy computer, programs that are not
used  for  a  few seconds will be swapped or paged out.  When they are
activated again, they must  be  swapped  or  paged  in.    Whenever  a
broadcast is sent, every computer on the network needs to activate the
routing software in order to process the broadcast.  This  means  that
many  diskless  computers will be doing swapping or paging at the same
time.  This is likely to cause a temporary overload  of  the  network.
Thus  it is very unwise for diskless machines to run any software that
requires them to listen to broadcasts.


5.4.2 Proxy ARP


Proxy ARP is an alternative technique for letting  gateways  make  all
the routing decisions.  It is applicable to any broadcast network that
uses ARP or a similar technique for mapping  Internet  addresses  into
network-specific   addresses   such   as  Ethernet  addresses.    This
presentation will  assume  Ethernet.    Other  network  types  can  be
acccomodated  if  you  replace "Ethernet address" with the appropriate
network-specific address, and ARP with the protocol used  for  address
mapping by that network type.

In  many  ways  proxy  ARP  it is similar to using a default route and
redirects, however it uses a different mechanism to communicate routes
to  the  host.   With redirects, a full routing table is used.  At any
given moment, the host knows what gateways it is routing datagrams to.
With  proxy  ARP,  you  dispense  with explicit routing tables, and do
everything at the level of Ethernet addresses.  Proxy ARP can be  used
for all destinations, only for destinations within your network, or in
various combinations.  It will be simplest to explain it as  used  for
all  addresses.    To  do  this, you instruct the host to pretend that
every computer in  the  world  is  attached  directly  to  your  local
                                  27


Ethernet.  On Unix, this would be done using a command

      route add default 128.6.4.2 0

where  128.6.4.2  is  assumed  to  be the IP address of your host.  As
explained above, the metric of 0 causes everything that  matches  this
route  to be sent directly on the local Ethernet.  Alternatively, some
systems will allow you to get the same effect by setting a subnet mask
of  0.   If you do this, you may have to take precautions to make sure
that it isn't reset by an ICMP subnet mask broadcast by a system  that
knows the real subnet mask.

When  a  datagram  is to be sent to a local Ethernet destination, your
computer needs to know the Ethernet address of the  destination.    In
order  to find that, it uses something generally called the ARP table.
This is simply a mapping from Internet address  to  Ethernet  address.
Here's a typical ARP table.  (On our system, it is displayed using the
command "arp -a".)

    FOKKER.RUTGERS.EDU (128.6.5.16) at 8:0:20:0:8:22 temporary
    CROSBY.RUTGERS.EDU (128.6.5.48) at 2:60:8c:49:50:63 temporary
    CAIP.RUTGERS.EDU (128.6.4.16) at 8:0:8b:0:1:6f temporary
    DUDE.RUTGERS.EDU (128.6.20.16) at 2:7:1:0:eb:cd temporary
    W20NS.MIT.EDU (18.70.0.160) at 2:7:1:0:eb:cd temporary
    OBERON.USC.EDU (128.125.1.1) at 2:7:1:2:18:ee temporary
    gatech.edu (128.61.1.1) at 2:7:1:0:eb:cd temporary
    DARTAGNAN.RUTGERS.EDU (128.6.5.65) at 8:0:20:0:15:a9 temporary

Note that it is simply a list of IP addresses  and  the  corresponding
Ethernet  address.  The "temporary" indicates that the entry was added
dynamically using ARP, rather than being put into the table manually.

If there is an entry for the address in the ARP table, the datagram is
simply  put  on  the Ethernet with the corresponding Ethernet address.
If not, an "ARP request" is broadcast, asking for the destination host
to  identify  itself.   This request is in effect a question "will the
host with Internet  address  128.6.4.194  please  tell  me  what  your
Ethernet address is?".  When a response comes back, it is added to the
ARP table, and future datagrams  for  that  destination  can  be  sent
without delay.

This  mechanism  was  originally  designed  only  for  use  with hosts
attached directly to a single Ethernet.  If you need to talk to a host
on  a different Ethernet, it was assumed that your routing table would
direct you to a gateway.    The  gateway  would  of  course  have  one
interface  on  your Ethernet.  Your computer would then end up looking
up the address of that gateway using  ARP.    It  would  generally  be
useless  to  expect  ARP to work directly with a computer on a distant
network.  Since it isn't on the same  Ethernet,  there's  no  Ethernet
address you can use to send datagrams to it.  And when you send an ARP
request for it, there's nobody to answer the request.

Proxy ARP is based on the  concept  that  the  gateways  will  act  as
proxies  for  distant  hosts.    Suppose  you  have  a host on network
128.6.5, with address 128.6.5.2.  (computer A  in  diagram  below)  It
                                  28


wants  to send a datagram to host 128.6.4.194, which is on a different
Ethernet (subnet 128.6.4). (computer C in diagram below)  There  is  a
gateway  connecting  the  two subnets, with address 128.6.5.1 (gateway
R):

              network 1               network 2
               128.6.5                 128.6.4
        ============================  ==================
          |              |        |    |      |    |
       ___|______   _____|____  __|____|__  __|____|____
       128.6.5.2    128.6.5.3   128.6.5.1   128.6.4.194
                                128.6.4.1
       __________   __________  __________  ____________
       computer A   computer B   gateway R   computer C


Now suppose computer A sends an ARP request for computer  C.  C  isn't
able  to  answer  for  itself.  It's on a different network, and never
even sees the ARP request.  However gateway R can act on  its  behalf.
In  effect,  your  computer  asks "will the host with Internet address
128.6.4.194 please tell me what your Ethernet address  is?",  and  the
gateway   says  "here  I  am,  128.6.4.194  is  2:7:1:0:eb:cd",  where
2:7:1:0:eb:cd is actually the Ethernet address of the gateway.    This
bit  of  illusion  works  just  fine.    Your  host  now  thinks  that
128.6.4.194  is  attached  to  the   local   Ethernet   with   address
2:7:1:0:eb:cd.    Of  course it isn't.  But it works anyway.  Whenever
there's a datagram to be sent to 128.6.4.194, your host  sends  it  to
the specified Ethernet address.  Since that's the address of a gateway
R, the gateway gets  the  datagram.    It  then  forwards  it  to  the
destination.

Note that the net effect is exactly the same as having an entry in the
routing table saying  to  route  destination  128.6.4.194  to  gateway
128.6.5.1:

    128.6.4.194          128.6.5.1           UGH          pe0

except  that  instead  of  having the routing done at the level of the
routing table, it is done at the level of the ARP table.

Generally it's better to use the routing  table.    That's  what  it's
there for.  However here are some cases where proxy ARP makes sense:

   - when you have a host that does not implement subnets

   - when you have a host that does not respond properly to redirects

   - when you do not want to have to choose a specific default gateway

   - when your software is unable to recover from a failed route

The  technique  was first designed to handle hosts that do not support
subnets.  Suppose that you have a subnetted network.  For example, you
have  chosen  to break network 128.6 into subnets, so that 128.6.4 and
128.6.5 are separate.  Suppose you  have  a  computer  that  does  not
                                  29


understand  subnets.    It  will  assume that all of 128.6 is a single
network.  Thus it will be difficult to establish routing table entries
to  handle  the  configuration  above.    You  can't tell it about the
gateway explicitly using "route add 128.6.4.0 128.6.5.1  1"  Since  it
thinks  all of 128.6 is a single network, it can't understand that you
are trying to tell it where to send  one  subnet.    It  will  instead
interpret  this command as an attempt to set up a host route to a host
whose address is 128.6.4.0.  The only thing that would work  would  be
to  establish  explicit host routes for every individual host on every
other subnet.  You can't depend upon default gateways and redirects in
this  situation either.  Suppose you said "route add default 128.6.5.1
1".  This would establish the gateway 128.6.5.1 as a default.  However
the  system  wouldn't  use  it  to  send  datagrams  to other subnets.
Suppose the host is  128.6.5.2,  and  wants  to  send  a  datagram  to
128.6.4.194.    Since  the destination is part of 128.6, your computer
considers it to be on the same network as itself, and  doesn't  bother
to look for a gateway.

Proxy  ARP  solves  this  problem by making the world look the way the
defective implementation expects it to look.  Since  the  host  thinks
all  other  subnets  are part of its own network, it will simply issue
ARP requests for them.  It expects to get  back  an  Ethernet  address
that  can  be used to establish direct communications.  If the gateway
is practicing proxy ARP, it will respond with the  gateway's  Ethernet
address.    Thus  datagrams  are  sent  to the gateway, and everything
works.

As you can see, no specific configuration is needed to use  proxy  ARP
with a host that doesn't understand subnets.  All you need is for your
gateways to implement proxy ARP.    In  order  to  use  it  for  other
purposes, you must explicitly set up the routing table to cause ARP to
be used.  By default, TCP/IP implementations will  expect  to  find  a
gateway  for any destination that is on a different network.  In order
to make them issue ARP's, you must explicitly  install  a  route  with
metric  0,  as  in the example "route add default 128.6.5.2 0", or you
must set a subnet mask of 0.

It is obvious that proxy ARP is reasonable  in  situations  where  you
have hosts that don't understand subnets.  Some comments may be needed
on the other situations.  Generally TCP/IP implementations  do  handle
ICMP  redirects  properly.   Thus it is normally practical to set up a
default route to some gateway, and depend upon the  gateway  to  issue
redirects  for  destinations  that  should  use  a  different gateway.
However in case you ever run into an implementation that does not obey
redirects,  or cannot be configured to have a default gateway, you may
be able to make things work by depending upon proxy ARP.    Of  course
this  requires  that  you be able to configure the host to issue ARP's
for all destinations.    You  will  need  to  read  the  documentation
carefully  to  see  exactly  what routing features your implementation
has.

Sometimes you may choose to depend upon  proxy  ARP  for  convenience.
The  problem  with  routing tables is that you have to configure them.
The simplest configuration is simply to establish a default route, but
even  there  you  have  to  supply some equivalent to the Unix command
                                  30


"route add default ...".  Should you  change  the  addresses  of  your
gateways,  you  have  to  modify this command on all of your hosts, so
that they point to the new default gateway.  If you set up  a  default
route  that depends upon proxy ARP (i.e. has metric 0), you won't have
to change your configuration files when gateways change.   With  proxy
ARP,  no  gateway  addresses  are  given  explicitly.  Any gateway can
respond to the ARP request, no matter what its address.

In order to save you from having  to  do  configuration,  some  TCP/IP
implementations  default  to  using ARP when they have no other route.
The most flexible implementations allow you to mix strategies.    That
is,  if  you  have  specified  a  route for a particular network, or a
default route, they will use that route.  But if there is no route for
a  destination, they will treat it as local, and issue an ARP request.
As long as your gateways support proxy ARP, this allows such hosts  to
reach any destination without any need for routing tables.

Finally,  you  may  choose to use proxy ARP because it provides better
recovery from failure.  This choice is very much dependent  upon  your
implementation.    The next section will discuss the tradeoffs in more
detail.

In situations where  there  are  several  gateways  attached  to  your
network,  you  may  wonder how proxy ARP allows you to choose the best
one.  As described above,  your  computer  simply  sends  a  broadcast
asking  for  the  Ethernet address for a destination.  We assumed that
the gateways would be set up to respond to this broadcast.   If  there
is  more  than  one  gateway,  this  requires coordination among them.
Ideally, the gateways will have a  complete  picture  of  the  network
topology.    Thus  they are able to determine the best route from your
host to any destination.  If the gateways coordinate among themselves,
it  should  be  possible  for  the best gateway to respond to your ARP
request.  In practice, it may not  always  be  possible  for  this  to
happen.    It  is fairly easy to design algorithms to prevent very bad
routes.  For example, consider the following situation:

          1             2            3
        -------  A  ----------  B ----------

1, 2, and 3 are networks.  A and B are gateways, connecting network  2
to  1 or 3.  If a host on network 2 wants to talk to a host on network
1, it is fairly easy for gateway  A  to  decide  to  answer,  and  for
gateway  B  to  decide  not  to.   Here's how: if gateway B accepted a
datagram for network 1, it would have to forward it to gateway  A  for
delivery.   This would mean that it would take a datagram from network
2 and send it right back out on network 2.  It is very  easy  to  test
for  routes  that involve this sort of circularity.  It is much harder
to deal with a situation such as the following:


                                  31


                         1
                  ---------------
                    A        B
                    |        | 4
                    |        |
                  3 |        C
                    |        |
                    |        | 5
                    D        E
                  ---------------
                         2

Suppose a computer on network 1 wants to send a  datagram  to  one  on
network  2.  The route via A and D is probably better, because it goes
through only one intermediate network (3).  It is also possible to  go
via  B,  C,  and  E,  but  that path is probably slightly slower.  Now
suppose the  computer  on  network  1  sends  an  ARP  request  for  a
destination on 2.  It is likely that A and B will both respond to that
request.  B is not quite as good a route as A. However it  is  not  so
bad  as  the case above.  B won't have to send the datagram right back
out onto network 1.  It is unable  to  determine  there  is  a  better
alternative  route  without  doing  a  significant  amount  of  global
analysis on the network.  This may not be practical in the  amount  of
time available to process an ARP request.


5.4.3 Moving to New Routes After Failures


In  principle,  IP  routing  is  capable of handling line failures and
gateway crashes.  There  are  various  mechanisms  to  adjust  routing
tables  and  ARP  tables to keep them up to date.  Unfortunately, many
major implementations of TCP/IP have  not  implemented  all  of  these
mechanisms.   The net result is that you have to look carefully at the
documentation for your implementation,  and  consider  what  kinds  of
failures  are  most  likely.   You then have to choose a strategy that
will work best for your site.  The basic choices  for  finding  routes
have all been listed above:  spying on the gateways' routing protocol,
setting up a default route and depending  upon  redirects,  and  using
proxy  ARP.    These methods all have their own limitations in dealing
with a changing network.

Spying on the gateways' routing protocol is theoretically the cleanest
solution.  Assuming that the gateways use good routing technology, the
tables that they broadcast  contain  enough  information  to  maintain
optimal  routes  to all destinations.  Should something in the network
change (a line or a gateway  goes  down),  this  information  will  be
reflected  in  the  tables,  and  the routing software will be able to
update the hosts' routing tables appropriately.  The disadvantages are
entirely practical.  However in some situations the robustness of this
approach may outweight the disadvantages.  To summarize the discussion
above, the disadvantages are:

   - If  the  gateways  are  using  sophisticated  routing  protocols,
                                  32


     configuration may be fairly complex.  Thus you will be faced with
     setting up and maintaining configuration files on every host.

   - Some  gateways  use proprietary routing protocols.  In this case,
     you may not  be  able  to  find  software  for  your  hosts  that
     understands them.

   - If your hosts are diskless, there can be very serious performance
     problems associated with listening to routing broadcasts.

Some gateways may be able  to  convert  from  their  internal  routing
protocol  to  a simpler one for use by your hosts.  This could largely
bypass the first two disadvantages.  Currently there is no  known  way
to get around the third one.

The  problems  with  default  routes/redirects  and with proxy ARP are
similar: they both have trouble dealing with  situations  where  their
table  entries  no  longer  apply.    The only real difference is that
different tables are involved.  Suppose a gateway goes down.   If  any
of  your current routes are using that gateway, you may be in trouble.
If you are depending upon the routing table, the major  mechanism  for
adjusting  routes is the redirect.  This works fine in two situations:

   - where the default gateway is not the best  route.    The  default
     gateway can direct you to a better gateway

   - where  a distant line or gateway fails.  If this changes the best
     route, the current gateway can redirect you to the  gateway  that
     is now best

The case it does not protect you against is where the gateway that you
are currently sending your datagrams to crashes.  Since it is down, it
is  unable to redirect you to another gateway.  In many cases, you are
also unprotected if your default  gateway  goes  down,  since  routing
starts by sending to the default gateway.

The  situation  with proxy ARP is similar.  If the gateways coordinate
themselves properly,  the  right  one  will  respond  initially.    If
something  elsewhere  in  the  network  changes,  the  gateway you are
currently issuing can issue a  redirect  to  a  new  gateway  that  is
better.    (It is usually possible to use redirects to override routes
established by proxy ARP.)  Again, the  case  you  are  not  protected
against  is  where the gateway you are currently using crashes.  There
is no equivalent to failure of a default gateway,  since  any  gateway
can respond to the ARP request.

So  the big problem is that failure of a gateway you are using is hard
to recover from.  It's hard because the main  mechanism  for  changing
routes  is  the  redirect,  and  a  gateway  that  is down can't issue
redirects.  Ideally, this problem should be  handled  by  your  TCP/IP
implementation,   using   timeouts.    If  a  computer  stops  getting
responses, it should cancel the existing route, and try to establish a
new  one.    Where  you are using a default route, this means that the
TCP/IP implementation must be able to declare a route as down based on
a  timeout.  If you have been redirected to a non-default gateway, and
                                  33


that route is declared down, traffic will return to the default.   The
default gateway can then begin handling the traffic, or redirect it to
a different gateway.  To handle  failure  of  a  default  gateway,  it
should  be possible to have more than one default.  If one is declared
down, another will be used.  Together, these  mechanisms  should  take
care of any failure.

Similar  mechanisms can be used by systems that depend upon proxy ARP.
If a connection is timing out, the ARP table entry that it uses should
be  cleared.   This will cause a new ARP request, which can be handled
by a gateway that is still up.  A simpler mechanism would simply be to
time  out  all  ARP entries after some period.  Since making a new ARP
request has a very low overhead, there's no problem with  removing  an
ARP entry even if it is still good.  The next time a datagram is to be
sent, a new request will be made.    The  response  is  normally  fast
enough that users will not even notice the delay.

Unfortunately,   many   common   implementations   do  not  use  these
strategies.  In Berkeley 4.2, there is no automatic way of getting rid
of  any  kind of entry, either routing or ARP.  They do not invalidate
routes or ARP entries based on failures.  If  gateway  crashes  are  a
significant  problem,  there may be no choice but to run software that
listens to the routing protocol.  In Berkeley 4.3, routing entries are
removed  when  TCP connections are failing.  ARP entries are still not
removed.  This makes the default route strategy  more  attractive  for
4.3 than proxy ARP.  Having more than one default route may also allow
for recovery from failure of a default gateway.  Note however that 4.3
only  handles  timeout for connections using TCP.  If a route is being
used only by services based on UDP, it will not recover  from  gateway
failure.    While  the  "traditional" TCP/IP services use TCP, network
file systems generally do not.  Thus 4.3-based systems still  may  not
always be able to recover from failure.

In  general,  you  should  examine  your  implementation  in detail to
determine what sort of error recovery strategy it uses.  We hope  that
the  discussion in this section will then help you choose the best way
of dealing with routing.

There is one more strategy that some older implementations use.  It is
strongly  discouraged,  but we mention it here so you can recognize it
if you see it.  Some implementations detect gateway failure by  taking
active  measure to see what gateways are up.  The best version of this
is based on a list of all gateways that are currently in use.    (This
can  be  determined  from  the routing table.)  Every minute or so, an
echo request datagram is sent to each such  gateway.    If  a  gateway
stops responding to echo requests, it is declared down, and all routes
using it revert to the default.   With  such  an  implementation,  you
normally supply more than one default gateway.  If the current default
stops responding, an alternate is chosen.  In some cases,  it  is  not
even  necessary  to  choose an explicit default gateway.  The software
will  randomly  choose  any  gateway  that  is   responding.      This
implementation  is  very  flexible  and  recovers  well from failures.
However a large network full of such implementations will waste a  lot
of  bandwidth  on  the  echo  datagrams  that are used to test whether
gateways  are  up.    This  is  the  reason  that  this  strategy   is
                                  34


discouraged.


6. Bridges and Gateways


This  section  will  deal  in  more detail with the technology used to
construct larger networks.  It  will  focus  particularly  on  how  to
connect  together  multiple  Ethernets,  token rings, etc.  These days
most networks are hierarchical.  Individual hosts attach to local-area
networks  such  as  Ethernet or token ring.  Then those local networks
are connected via some combination of backbone networks and  point  to
point  links.    A  university might have a network that looks in part
like this:

     ________________________________
     |   net 1      net 2    net 3  |        net 4            net 5
     | ---------X---------X-------- |      --------         --------
     |                         |    |         |                 |
     |  Building A             |    |         |                 |
     |               ----------X--------------X-----------------X
     |                              |  campus backbone network  :
     |______________________________|                           :
                                                         serial :
                                                           line :
                                                         -------X-----
                                                             net  6

Nets 1, 2 and 3 are in one building.  Nets 4 and 5  are  in  different
buildings  on  the  same  campus.  Net 6 is in a somewhat more distant
location.  The diagram above shows nets 1, 2, and  3  being  connected
directly,  with switches that handle the connections being labelled as
"X".  Building A is connected to  the  other  buildings  on  the  same
campus  by  a backbone network.  Note that traffic from net 1 to net 5
takes the following path:

   - from 1 to 2 via the direct connection between those networks

   - from 2 to 3 via another direct connection

   - from 3 to the backbone network

   - across the backbone network from building A to  the  building  in
     which net 5 is housed

   - from the backbone network to net 5

Traffic  for  net  6 would additionally pass over a serial line.  With
the setup as shown, the same switch  is  being  used  to  connect  the
backbone  network  to net 5 and to the serial line.  Thus traffic from
net 5 to net 6 would not need to go through the backbone, since  there
is a direct connection from net 5 to the serial line.

This section is largely about what goes in those "X"'s.
                                  35


6.1 Alternative Designs


Note  that  there  are alternatives to the sort of design shown above.
One is to use point to point lines or switched lines directly to  each
host.   Another is to use a single-level of network technology that is
capable of handling both local and long-haul networking.


6.1.1 A mesh of point to point lines


Rather than connecting hosts to a local network such as Ethernet,  and
then   interconnecting  the  Ethernets,  it  is  possible  to  connect
long-haul serial lines directly to the individual computers.  If  your
network   consists   primarily  of  individual  computers  at  distant
locations, this might make sense.  Here would be  a  small  design  of
that type.

          computer 1                computer 2             computer 3
              |                         |                      |
              |                         |                      |
              |                         |                      |
          computer 4 -------------- computer 5 ----------- computer 6

In  the design shown earlier, the task of routing datagrams around the
network is handled by special-purpose switching units shown as  "X"'s.
If  you  run lines directly between pairs of hosts, your hosts will be
doing this sort of routing and switching,  as  well  as  their  normal
computing.    Unless  you  run  lines  directly  between every pair of
computers, some systems will end up handling traffic for  others.  For
example,  in this design, traffic from 1 to 3 will go through 4, 5 and
6.  This is certainly possible, since most TCP/IP implementations  are
capable of forwarding datagrams.  If your network is of this type, you
should think of your hosts as also acting as gateways.   Much  of  the
discussion  below  on  configuring  gateways will apply to the routing
software that you run on your hosts.  This sort  of  configuration  is
not as common as it used to be, for two reasons:

   - Most large networks have more than one computer per location.  In
     this case it is less expensive to set up a local network at  each
     location than to run point to point lines to each computer.

   - Special-purpose  switching  units have become less expensive.  It
     often makes sense to offload the routing and communications tasks
     to a switch rather than handling it on the hosts.

It is of course possible to have a network that mixes the two kinds of
techology.  In this case,  locations  with  more  equipment  would  be
handled  by  a hierarchical system, with local-area networks connected
by switches.  Remote locations with a single computer would be handled
by  point  to  point lines going directly to those computers.  In this
case the routing software used on the remote computers would  have  to
be  compatible  with that used by the switches, or there would need to
                                  36


be a gateway between the two parts of the network.

Design decisions of this type are typically made after  an  assessment
of  the  level  of network traffic, the complexity of the network, the
quality of routing software available for the hosts, and  the  ability
of the hosts to handle extra network traffic.


6.1.2 Circuit switching technology


Another  alternative  to  the hierarchical LAN/backbone approach is to
use circuit switches connected to each individual computer.   This  is
really  a  variant  of  the  point  to point line technique, where the
circuit switch allows each system to have what  amounts  to  a  direct
line to every other system.  This technology is not widely used within
the TCP/IP community, largely because the TCP/IP protocols assume that
the  lowest  level  handles  isolated  datagrams.    When a continuous
connection  is  needed,  higher  network  layers  implement  it  using
datagrams.    This  datagram-oriented  technology  does  not  match  a
circuit-oriented environment very closely.  In order  to  use  circuit
switching  technology,  the IP software must be modified to be able to
build and tear down virtual circuits as appropriate.  When there is  a
datagram  for a given destination, a virtual circuit must be opened to
it.  The virtual circuit would  be  closed  when  there  has  been  no
traffic  to  that  destination  for  some time.  The major use of this
technology is for  the  DDN  (Defense  Data  Network).    The  primary
interface  to  the  DDN is based on X.25.  This network appears to the
outside as a distributed X.25 network.  TCP/IP software  intended  for
use with the DDN must do precisely the virtual circuit management just
described.     Similar   techniques   could   be   used   with   other
circuit-switching  technologies, e.g. ATT's DataKit, although there is
almost no software currently available to support this.


6.1.3 Single-level networks


In some cases new developments in wide-area networks can eliminate the
need  for hierarchical networks.  Early hierarchical networks were set
up because the only convenient  network  technology  was  Ethernet  or
other  LAN's, and those could not span distances large enough to cover
an entire campus.  Thus it  was  necessary  to  use  serial  lines  to
connect  LAN's  in  various  locations.    It  is now possible to find
network technology whose characteristics are similar to Ethernet,  but
where  a  single  network  can  span a campus.  Thus it is possible to
think of using a single large network, with no hierarchical structure.

The  primary  limitations  of  a  large   single-level   network   are
performance  and  reliability  considerations.  If a single network is
used  for  the  entire  campus,  it  is  very  easy  to  overload  it.
Hierarchical   networks  can  handle  a  larger  traffic  volume  than
single-level networks if traffic patterns have a reasonable amount  of
                                  37


locality.  That is, in many applications, traffic within an individual
department tends to be greater than traffic among departments.

Let's look at a concrete example.  Suppose there are  10  departments,
each  of  which  generates 1 Mbit/sec of traffic.  Suppose futher than
90% of that traffic is to other systems  within  the  department,  and
only  10%  is  to  other  departments.  If each department has its own
network, that network only needs to handle 1 Mbit/sec.   The  backbone
network connecting the department also only needs 1 Mbit/sec capacity,
since it is handling 10% of 1 Mbit from each department.  In order  to
handle  this  situation  with a single wide-area network, that network
would have to be able to handle the  simultaneous  load  from  all  10
departments, which would be 10 Mbit/sec.

However  this example was carefully constructed to be favorable to the
hierarchical design.  If more of the  traffic  in  the  department  is
going  to  other  departments,  then  the  backbone will need a higher
bandwidth.  For example, suppose that a campus has a  few  centralized
resources,  e.g.  mainframes  and  other  large systems in a computing
center.  If  most  of  the  network  traffic  is  from  small  systems
attempting  to get to the central system, then the argument above does
not work.  In this case a hierarchy may still be useful.   However  it
doesn't  reduce  the bandwidth required for the long-haul network.  In
the example above, if all 10 departments communicated  primarily  with
systems  at the computer center, the backbone would have to be able to
carry all of their traffic, 10Mbits per second.  The  computer  center
would  either attach its systems directly to the backbone, or it would
have a "departmental" network with a capacity of  10Mbits  per  second
rather than the 1Mbits per second needed by the other departments.

The   second  limitation  on  single-level  networks  is  reliability,
maintainability and security.  Wide-area networks are  more  difficult
to  diagnose  and  maintain than local-area networks, because problems
can be introduced from any building to which the network is connected.
They  also  make traffic visible in all locations.  For these reasons,
it is often sensible to handle local  traffic  locally,  and  use  the
wide-area  network  only  for  traffic  that  actually must go between
buildings.  However if you have a situation where  each  location  has
only  one  or  two  computers, it may not make sense to set up a local
network at each location, and a single-level network may make sense.


6.1.4 Mixed designs


In practice,  few  large  networks  have  the  luxury  of  adopting  a
theoretically pure design.

It is very unlikely that any large network will be able to avoid using
a hierarchical design.  Suppose we  set  out  to  use  a  single-level
network.  Even if most buildings have only one or two computers, there
will be some location where there are enough that a local-area network
is justified.  The result is a mixture of a single-level network and a
hierachical network.  Most buildings have  their  computers  connected
                                  38


directly  to  the  wide-area  network, as with a single-level network.
However in one building there is a local-area network which  uses  the
wide-area  network  as  a  backbone,  connecting to it via a switching
unit.

On the other side of the story, even network designers with  a  strong
commitment  to  hierarchical networks are likely to find some parts of
the network where it simply doesn't make economic sense to  install  a
local-area  network.    So  a  host  is put directly onto the backbone
network, or tied directly to a serial line.

However you should think carefully before  making  ad  hoc  departures
from  your  design  philosophy in order to save a few dollars.  In the
long run, network maintainability is going to depend upon your ability
to make sense of what is going on in the network.  The more consistent
your technology is, the more likely you are to be able to maintain the
network.


6.2 An introduction to alternative switching technologies


This  section will discuss the characteristics of various technologies
used to switch datagrams between networks.  In effect, we  are  trying
to  fill  in  some  details  about the black boxes assumed in previous
sections.  There are three basic types of switches, generally referred
to as repeaters, bridges, and gateways, or alternatively as level 1, 2
and 3 switches (based on the level of the  OSI  model  at  which  they
operate).    Note however that there are systems that combine features
of more than one of these, particularly bridges and gateways.

The most important dimensions on which switches  vary  are  isolation,
performance, routing and network management facilities.  These will be
discussed below.

The most serious difference is between repeaters  and  the  other  two
types  of  switch.    Until recently, gateways provided very different
services from bridges.  However these two technologies are now  coming
closer  together.  Gateways are beginning to adopt the special-purpose
hardware that has characterized bridges in  the  past.    Bridges  are
beginning to adopt more sophisticated routing, isolation features, and
network management, which have characterized  gateways  in  the  past.
There  are  also systems that can function as both bridge and gateway.
This means that at the moment, the crucial  decision  may  not  be  to
decide  whether  to  use  a  bridge  or  a gateway, but to decide what
features you want in a switch  and  how  it  fits  into  your  overall
network design.


                                  39


6.2.1 Repeaters


A repeater is a piece of equipment that connects two networks that use
the same technology.  It receives every data packet on  each  network,
and retransmits it onto the other network.  The net result is that the
two networks have exactly the same  set  of  packets  on  them.    For
Ethernet or IEEE 802.3 networks there are actually two different kinds
of repeater.  (Other network technologies may not need  to  make  this
distinction.)

A  simple  repeater  operates at a very low level indeed.  Its primary
purpose is to get around limitations in cable length caused by  signal
loss or timing dispersion.  It allows you to construct somewhat larger
networks than you would otherwise be able to construct.    It  can  be
thought  of  as  simply  a two-way amplifier.  It passes on individual
bits in the signal, without doing any processing at the packet  level.
It even passes on collisions.  That is, if a collision is generated on
one of  the  networks  connected  to  it,  the  repeater  generates  a
collision  on  the  other  network.  There is a limit to the number of
repeaters that you can use in a network.  The  basic  Ethernet  design
requires  that signals must be able to get from one end of the network
to the other within a specified amount of time.    This  determines  a
maximum  allowable length.  Putting repeaters in the path does not get
around this limit.  (Indeed each repeater adds some delay, so in  some
ways  a repeater makes things worse.)  Thus the Ethernet configuration
rules limit the number of repeaters that can be in any path.

A "buffered repeater" operates at the level  of  whole  data  packets.
Rather  than passing on signals a bit at a time, it receives an entire
packet from one network into an internal buffer and  then  retransmits
it  onto  the other network.  It does not pass on collisions.  Because
such low-level features  as  collisions  are  not  repeated,  the  two
networks continue to be separate as far as the Ethernet specifications
are concerned.  Thus there  are  no  restrictions  on  the  number  of
buffered  repeaters  that can be used.  Indeed there is no requirement
that both of the networks be of  the  same  type.    However  the  two
networks  must  be sufficiently similar that they have the same packet
format.  Generally this means that  buffered  repeaters  can  be  used
between two networks of the IEEE 802.x family (assuming that they have
chosen the same address  length  and  maximum  packet  size),  or  two
networks  of  some other related family.  A pair of buffered repeaters
can be used to connect two networks via a serial line.

Buffered repeaters share with simple repeaters the most basic feature:
they  repeat every data packet that they receive from one network onto
the other.  Thus the two networks end up with exactly the same set  of
packets on them.


                                  40


6.2.2 Bridges and gateways


A  bridge  differs from a buffered repeater primarily in the fact that
it exercizes some selectivity as to what datagrams it forwards between
networks.    Generally  the  goal  is  to increase the capacity of the
system by keeping local traffic confined to the network  on  which  it
originates.  Only traffic intended for other networks goes through the
bridge.  So far this  description  would  also  apply  to  a  gateway.
Bridges  and  gateways differ in the way they determine what datagrams
to forward.  A bridge uses only the OSI level 2 address.  In the  case
of  Ethernet  or  IEEE  802.x networks, this is the 6-byte Ethernet or
MAC-level address. (The term  "MAC-level  address"  is  more  general.
However  for  the  sake of concreteness, examples in this section will
assume that Ethernet is being used.  You  may  generally  replace  the
term  "Ethernet  address"  with  the  equivalent MAC-level address for
other similar technologies.)  A bridge does not examine  the  datagram
itself,  so  it  does  not  use  the  IP address or its equivalent for
routing decisions.  In contrast, a gateway bases its decisions on  the
IP address, or its equivalent for other protocols.

There are several reasons why it matters which kind of address is used
for decisions.  The most basic is that  it  affects  the  relationship
between  the  switch  and  the  upper  layers  of  the  protocol.   If
forwarding is done at the level of the MAC-level address (bridge), the
switch  will  be  invisible to the protocols.  If it is done at the IP
level, the switch will be visible.  Let's give an example.   Here  are
two networks connected by a bridge:

              network 1          network 2
               128.6.5            128.6.4
        ==================  ================================
          |            |      |            |             |
       ___|______    __|______|__   _______|___   _______|___
       128.6.5.2        bridge       128.6.4.3     128.6.4.4
       __________    ____________   ___________   ___________
       computer A                   computer B    computer C


Note that the bridge does not have an IP address.  As far as computers
A, B, and C are concerned,  there  is  a  single  Ethernet  (or  other
network)  to which they are all attached.  This means that the routing
tables must be set up so that computers on both  networks  treat  both
networks  as local.  When computer A opens a connection to computer B,
it first broadcasts an ARP request asking for  computer  B's  Ethernet
address.    The  bridge  must  pass  this  broadcast from network 1 to
network 2.  (In general, bridges must pass all broadcasts.)  Once  the
two computers know each other's Ethernet addresses, communications use
the Ethernet address as the destination.  At that  point,  the  bridge
can  start  exerting  some  selectivity.   It will only pass datagrams
whose Ethernet destination address is  for  a  machine  on  the  other
network.  Thus a datagram from B to A will be passed from network 2 to
1, but a datagram from B to C will be ignored.

In order to make this  selection,  the  bridge  needs  to  know  which
                                  41


network  each machine is on.  Most modern bridges build up a table for
each network  to  which  they  are  connected,  listing  the  Ethernet
addresses  of  machines  known to be on that network.  They do this by
watching all of the datagrams on each network.  When a datagram  first
appears  on  network 1, it is reasonable to conclude that the Ethernet
source address corresponds to a machine on network 1.

Note that a bridge must look at every datagram on  the  Ethernet,  for
two  different reasons.  First, it may use the source address to learn
which machines are on which network.  Second,  it  must  look  at  the
destination address in order to decide whether it needs to forward the
datagram to the other network.

As mentioned above, generally bridges must pass  broadcasts  from  one
network to the other.  Broadcasts are often used to locate a resource.
The ARP request is a typical example of this.  Since the bridge has no
way  of  knowing  what  host is going to answer the broadcast, it must
pass it on to the other network.  Some  bridges  have  user-selectable
filters.  With them, it is possible to block some broadcasts and allow
others.  You might allow ARP broadcasts (which are essential for IP to
function),  but confine less essential broadcasts to one network.  For
example, you might choose not to pass  rwhod  broadcasts,  which  some
systems  use  to  keep  track  of  every  user logged into every other
system.  You might decide that it is  sufficient  for  rwhod  to  know
about the systems on a single segment of the network.

Now let's take a look at two networks connected by a gateway

              network 1                   network 2
               128.6.5                     128.6.4
        ====================      ==================================
          |              |          |              |             |
       ___|______    ____|__________|____   _______|___   _______|___
       128.6.5.2     128.6.5.1  128.6.4.1    128.6.4.3     128.6.4.4
       __________    ____________________   ___________   ___________
       computer A           gateway           computer B    computer C


Note  that  the  gateway  has IP addresses assigned to each interface.
The  computers'  routing  tables  are  set  up  to   forward   through
appropriate  address.    For  example,  computer A has a routing entry
saying that it should use the  gateway  128.6.5.1  to  get  to  subnet
128.6.4.

Because  the  computers  know  about the gateway, the gateway does not
need to scan all the packets on the Ethernet.  The computers will send
datagrams  to  it  when  appropriate.  For example, suppose computer A
needs to send a message to computer B. Its routing table will tell  it
to  use  gateway  128.6.5.1.    It  will issue an ARP request for that
address.  The gateway will respond to the ARP  request,  just  as  any
host  would.  From then on, datagrams destined for B will be sent with
the gateway's Ethernet address.


                                  42


6.2.3 More about bridges


There are several advantages to using  the  MAC-level  address,  as  a
bridge  does.   First, every packet on an Ethernet or IEEE network has
such an address.  The address is in the same place for  every  packet,
whether  it  is  IP,  DECnet,  or  some  other  protocol.   Thus it is
relatively fast to get the address from the packet.   A  gateway  must
decode  the  entire IP header, and if it is to support protocols other
than IP, it must have software for each such  protocol.    This  means
that  a bridge automatically supports every possible protocol, whereas
a gateway requires specific provisions for  each  protocol  it  is  to
support.

However  there  are  also disadvantages.  The one that is intrinsic to
the design of a bridge is

   - A bridge must look at every packet on the network, not just those
     addressed  to  it.    Thus it is possible to overload a bridge by
     putting it on a very busy network, even if very little traffic is
     actually going through the bridge.

However there is another disadvantage that is based on the way bridges
are usually built.  It is possible in principle to design bridges that
do not have this disadvantage, but I don't know of any plans to do so.
It stems from the fact that bridges do not  have  a  complete  routing
table  that describes the entire system of networks.  They simply have
a list of the Ethernet addresses that lie on  each  of  its  networks.
This means

   - Networks  that  use  bridges cannot have loops in them.  If there
     were a loop,  some  bridges  would  see  traffic  from  the  same
     Ethernet address coming from both directions, and would be unable
     to decide which table to put that address  in.    Note  that  any
     parallel  paths  to the same destination constitute a loop.  This
     means  that  multiple  paths  cannot  be  used  for  purposes  of
     splitting the load or providing redundancy.

There  are  some  ways  of  getting around the problem of loops.  Many
bridges allow configurations with redundant connections, but turn  off
links  until  there are no loops left.  Should a link fail, one of the
disabled ones is then brought back into service.  Thus redundant links
can  still  buy  you  extra  reliability.    But they can't be used to
provide extra capacity.  It is also possible to build  a  bridge  that
will  make  use  of  parallel point to point lines, in the one special
case where those lines go between a  single  pair  of  bridges.    The
bridges  would  treat  the two lines as a single virtual line, and use
them alternately in round-robin fashion.

The process of disabling redundant  connections  until  there  are  no
loops  left  is  called  a "spanning tree algorithm".  This name comes
from the fact that a tree is defined as a pattern of connections  with
no loops.  Thus one wants to disable connections until the connections
that are left form a tree that "spans" (includes) all of the  networks
in  the  system.  In order to do this, all of the bridges in a network
                                  43


system must communicate among themselves.  There is an  IEEE  proposal
to  standardize  the protocol for doing this, and for constructing the
spanning tree.

Note that there is a tendency  for  the  resulting  spanning  tree  to
result  in  high  network  loads  on certain parts of the system.  The
networks near the "top of the tree" handle all traffic between distant
parts  of  the  network.  In a network that uses gateways, it would be
possible to put in an extra link between parts  of  the  network  that
have  heavy  traffic between them.  However such extra links cannot be
used by a set of bridges.


6.2.4 More about gateways


Gateways have their own advantages and disadvantages.   In  general  a
gateway  is more complex to design and to administer than a bridge.  A
gateway must participate in all of the protocols that it  is  designed
to  forward.  For example, an IP gateway must respond to ARP requests.
The IP standards also require it to completely process the IP  header,
decrementing the time to live field and obeying any IP options.

Gateways  are  designed to handle more complex network topologies than
bridges.  As such, they have a different (and  more  complex)  set  of
decisions  to make.  In general a bridge has fairly simple decision to
make: should it forward a datagram, and if so which  interface  should
it send the datagram out?  When a gateway forwards a datagram, it must
decide what host or gateway to send the datagram  to  next.    If  the
gateway  sends  a datagram back onto the same network it came from, it
should also issue a redirect to the source of the datagram telling  it
to  use a better route.  Many gateways can also handle parallel paths.
If there are several equally good paths to a destination, the  gateway
will  alternate  among  them in round-robin fashion.  (This is done by
some bridges also, though it is less common there.    In  both  cases,
there  are some issues raised by round-robin alternation.  It tends to
lead to datagrams arriving in an order different  than  the  order  in
which  they  were  sent.    This  can  complicate  processing  by  the
destination host.  Some older  TCP/IP  implementations  have  bugs  in
handling out of order datagrams.)

In  order  to  handle these decisions, a gateway will typically have a
routing table that looks very much  like  a  host's.    As  with  host
routing  tables, a gateway's table contains an entry for each possible
network number.  For each network, there is  either  an  entry  saying
that that network is connected directly to the gateway, or there is an
entry saying that traffic for that network should be forwarded through
some  other  gateway  or  gateways.    We  will  describe the "routing
protocols" used to build up this information later, in the  discussion
on how to configure a gateway.


                                  44


6.3 Comparing the switching technologies


Repeaters,  buffered repeaters, bridges, and gateways form a spectrum.
Those devices near the beginning of the  list  are  best  for  smaller
networks.    They  are  less expensive, and easier to set up, but less
general.  Those near the end of the list  are  suitable  for  building
more complex networks.  Many networks will contain a mixture of switch
types, with repeaters being used  to  connect  a  few  nearby  network
segments,  bridges  used  for somewhat larger areas, and gateways used
for long-distance links.

Note that this document so far has  assumed  that  only  gateways  are
being  used.  The section on setting up a host described how to set up
a routing table  listing  the  gateways  to  use  to  get  to  various
networks.    Repeaters  and bridges are invisible to IP.  So as far as
previous sections are concerned, networks connected by them are to  be
considered a single network.  Section 3.4 describes how to configure a
host in the case  where  several  subnets  are  carried  on  a  single
physical  network.  The same configuration should be used when several
subnets are connected by repeaters or bridges.

As mentioned above, the most important dimensions  on  which  switches
vary are isolation, performance, routing, network management.


6.3.1 Isolation


Generally  people  use switches to connect networks to each other.  So
they are normally thinking  of  gaining  connectivity,  not  providing
isolation.  However isolation is worth thinking about.  If you connect
two networks and  provide  no  isolation  at  all,  then  any  network
problems  on  other  networks suddenly appear on yours as well.  Also,
the two networks together may have enough traffic  to  overwhelm  your
network.  Thus it is well to think of choosing an appropriate level of
protection.

Isolation comes in  two  kinds:  isolation  against  malfunctions  and
traffic  isolation.  In order to discuss isolation of malfunctions, we
have to have a taxonomy of malfunctions.  Here are the  major  classes
of malfunctions, and which switches can isolate them:

   - Electrical  faults,  e.g.    a short in the cable or some sort of
     fault that distorts the signal.  All types of switch will confine
     this  to  one  side  of  the switch: repeater, buffered repeater,
     bridge, gateway.  These are worth  protecting  against,  although
     their frequency depends upon how often your cables are changed or
     disturbed.  It is rare for this sort of fault  to  occur  without
     some disturbance of the cable.

   - Transceiver and controller problems that general signals that are
     valid electrically but nevertheless incorrect (e.g. a continuous,
     infinitely  long  packet,  spurious  collisions,  never  dropping
                                  45


     carrier).  All except the  simple  repeater  will  confine  this:
     buffered  repeater, bridge, gateway.  (Such problems are not very
     common.)

   - Software malfunctions that  lead  to  excessive  traffic  between
     particular  hosts  (i.e.  not  broadcasts).  Bridges and gateways
     will isolate these.  (This type of failure is fairly rare.   Most
     software and protocol problems generate broadcasts.)

   - Software  malfunctions  that lead to excessive broadcast traffic.
     Gateways will isolate these.  Generally bridges will not, because
     they  must pass broadcasts.  Bridges with user-settable filtering
     can protect against some  broadcast  malfunctions.    However  in
     general  bridges  must  pass ARP, and most broadcast malfunctions
     involve ARP.    This  problem  is  not  severe  on  single-vendor
     networks  where software is under careful control.  However sites
     with  complex  network  environments  or   experimental   network
     software may see problems of this sort regularly.

Traffic isolation is provided by bridges and gateways.  The most basic
decision is how many computers can  be  put  onto  a  network  without
overloading  its capacity.  This requires knowledge of the capacity of
the network, but also how the hosts will use  it.    For  example,  an
Ethernet  may  support  hundreds of systems if all the network is used
for is remote logins and an occasional file transfer.  However if  the
computers  are diskless, and use the network for swapping, an Ethernet
will support between 10 and 40, depending upon their  speeds  and  I/O
rates.

When you have to put more computers onto a network than it can handle,
you split it into several networks and put some sort of switch between
them.    If  you  do  the split correctly, most of the traffic will be
between machines on the same piece.  This means putting clients on the
same  network  as  their servers, putting terminal servers on the same
network as the hosts that they access most commonly, etc.

Bridges and gateways generally  provide  similar  degrees  of  traffic
isolation.    In both cases, only traffic bound for hosts on the other
side of the switch is passed.  However see the discussion on routing.


6.3.2 Performance


Absolute performance limits are becoming less of an issue as time goes
on,  since the switching technology is improving.  Generally repeaters
can handle the full bandwidth of the network.  (By their very  nature,
a  simple  repeater must be able to do so.) Bridges and gateways often
have performance limitations of  various  sorts.    Bridges  have  two
numbers  of  interest:  packet  scanning  rate  and  throughput.    As
explained above, a bridge must look at every packet  on  the  network,
even  ones that it does not forward.  The number of packets per second
that it can scan in this way is the packet scanning rate.   Throughput
applies  to both bridges and gateways.  This is the rate at which they
                                  46


can forward traffic.  Generally this depends upon datagram size.

Normally the number of datagrams per second that  a  unit  can  handle
will  be  greater for short datagrams than long ones.  Early models of
bridge varied from a few hundred datagrams per second to around  7000.
The higher speeds are for equipment that uses special-purpose hardware
to speed  up  the  process  of  scanning  packets.    First-generation
gateways  varied  from  a  few hundred datagrams per second to 1000 or
more.  However second-generation gateways  are  now  available,  using
special-purpose  hardware  of  the same sophistication as that used by
bridges.  They can handle on the order of 10000 datagrams per  second.
Thus  at  the  moment high-performance bridges and gateways can switch
most of the bandwidth of an Ethernet.   This  means  that  performance
should  no  longer  be  a  basis for choosing between types of switch.
However within a given type of switch, there are still specific models
with  higher or lower capacity.  And there may still be differences in
price/performance.  This is particularly true at the  low  end.    The
least  expensive bridges are currently less than half the price of the
least expensive gateway.

Unfortunately there  is  no  single  number  on  which  you  can  base
performance estimates.  The figure most commonly quoted is packets per
second.  Be aware that most vendors count a datagram only once  as  it
goes through a gateway, but that one prominent vendor counts datagrams
twice.  Thus their switching rates must be deflated by a factor of  2.
Also,  when comparing numbers make sure that they are for datagrams of
the same size.  A simple performance model is

    processing time = switching time + datagram size * time per byte

That is, the  time  to  switch  a  datagram  is  normally  a  constant
switching  time,  representing  interrupt  latency, header processing,
routing table lookup, etc., plus a component proportional to  datagram
size,  representing  the  time needed to do any datagram copying.  One
reasonable approach to reporting performance is to give datagrams  per
second  for  minimum and maximum size datagrams.  Another is to report
limiting switching speed in datagrams per  second  and  throughput  in
bytes per second, i.e.  the two terms of the equation above.


6.3.3 Routing


Routing  refers  to  the  technology  used  to  decide where to send a
datagram next.  Of course for a repeater this is not an  issue,  since
repeaters forward every packet.

The  routing  strategy  for  a  bridge  turns  into two decisions: (1)
enabling and disabling links in order to maintain the  spanning  tree,
and  (2) deciding whether it should forward any particular packet, and
out what interface (if the bridge is capable of handling more than two
interfaces).    The  second  decision  is  usually based on a table of
MAC-level addresses.  As described above, this is built up by scanning
traffic  visible  from  each  interface.  The goal is to forward those
                                  47


packets whose destination is on the other side of the  bridge.    This
algorithm  requires  that  the  network configuration have no loops or
redundant lines.  Less sophisticated bridges  leave  this  up  to  the
system  designer.  With these bridges, you must set up your network so
that there are no loops in  it.    More  sophisticated  bridges  allow
arbitrary  topology,  but  disable  links until no loops remain.  This
provides extra reliability.  If a link fails, an alternative link will
be  turned  on  automatically.    Bridges  that  work  this way have a
protocol that allows them to detect when a unit must  be  disabled  or
reenabled,  so  that  at  any  instant the set of active links forms a
"spanning tree".  If you require the extra  reliability  of  redundant
links,  make  sure  that  the  bridges  you use can disable and enable
themselves in this way.  There is currently no official  standard  for
the  protocol  used among bridges, although there is a standard in the
proposal stage.  If you buy bridges from more than  one  vendor,  make
sure that their spanning-tree protocols will interoperate.

Gateways generally allow arbitrary network topologies, including loops
and  redundant  links.    Because  of  their  more   general   routing
algorithms,  gateways  must  maintain  a  model  of the entire network
topology.  Different routing techniques maintain models of greater  or
lesser   complexity,   and  use  the  data  with  varying  degrees  of
sophistication.  Gateways that handle IP should generally support  the
two  Internet  standard  routing  protocols:  RIP (Routing Information
Protocol)  and  EGP  (External  Gateway   Protocol).      EGP   is   a
special-purpose protocol for use in networks where there is a backbone
under a separate administration.  It allows exchange  of  reachability
information  with  the  backbone  in  a  controlled way.  If you are a
member of such a network, your gateway must  support  EGP.    This  is
becoming  common  enough  that it is probably a good idea to make sure
that all gateways support EGP.

RIP is a protocol designed to handle routing within small to  moderate
size networks, where line speeds do not differ radically.  Its primary
limitations are:

   - It cannot be used with networks where any path goes through  more
     than  15  gateways.  This range may be further reduced if you use
     an optional feature for giving a slow line a weight  larger  than
     one.

   - It  cannot  share  traffic  between parallel lines (although some
     implementations allow this if the lines are between the same pair
     of gateways).

   - It cannot adapt to changes in network load.

   - It  is  not well suited to situations where there are alternative
     routes through lines of very different speeds.

   - It may not be stable in networks where lines or gateways change a
     lot.

Some  vendors supply proprietary modifications to RIP that improve its
operation with EGP or increase the maximum path length beyond 15,  but
                                  48


do  not  otherwise modify it very much.  If you expect your network to
involve gateways from more  than  one  vendor,  you  should  generally
require  that  all of them support RIP, since this is the only routing
protocol that is generally available.  If you expect  to  use  a  more
sophisticated  protocol in addition, it may be useful for the gateways
to translate between their own protocol and RIP.    However  for  very
large  or  complex  networks,  there  may be no choice but to use some
other protocol throughout.

More sophisticated routing protocols are possible.  The  primary  ones
being considered today are cisco System's IGRP, and protocols based on
the SPF (shortest-path first) algorithms.  In general these  protocols
are designed for larger or more complex networks.  They are in general
stable under a wider  variety  of  conditions,  and  they  can  handle
arbitrary combinations of line type and speed.  Some of them allow you
to  split  traffic  among  parallel  paths,  to  get  better   overall
throughput.    Some newer technologies may allow the network to adjust
to take into account paths that are overloaded.  However at the moment
I  do  not  know of any commercial gateway that does this.  (There are
very serious problems with maintaining stable  routing  when  this  is
done.) There are enough variations among routing technology, and it is
changing rapidly enough, that you should discuss your proposed network
topology  in  detail with all of the vendors that you are considering.
Make sure that their technology can  handle  your  topology,  and  can
support  any  special  requirements  that you have for sharing traffic
among parallel lines, and for adjusting topology to take into  account
failures.    In  the  long  run,  we expect one or more of these newer
routing protocols to attain the status of a standard, at least on a de
facto basis.  However at the moment, there is no generally implemented
routing technology other than RIP.

One additional routing topic to consider is policy-based routing.   In
general routing protocols are designed to find the shortest or fastest
possible path for every datagram.  In some cases, this is not desired.
For  reasons  of  security, cost accountability, etc., you may wish to
limit certain paths to certain uses.   Most  gateways  now  have  some
ability to control the spread of routing information so as to give you
some administrative control over the way routes are used.    Different
gateways  vary  in the degree of control that they support.  Make sure
that you discuss any requirements that you have for control  with  all
prospective gateway vendors.


6.3.4 Network management


Network  management  covers  a  wide variety of topics.  In general it
includes gathering statistical data and status information about parts
of  your network, and taking action as necessary to deal with failures
and  other  changes.    The  most  primitive  technique  for   network
monitoring  is  periodic  "pinging"  of  critical hosts.  Pinging is a
monitoring technique that depends on an "echo" datagram.   This  is  a
specific  type  of  datagram  that  requests an immediate reply.  Most
TCP/IP implementations contain a program (usually called "ping")  that
                                  49


sends  an echo to a specified host.  If you get a reply, you know that
the host is up, and that the network connection to the host works.  If
you  don't  get  a reply, you know that something is wrong with one of
the other.  By pinging a reasonable sample of hosts, you can  normally
tell  what  is  going on.  If all the hosts on a network suddenly stop
returning pings, it is reasonable to conclude that the  connection  to
that  network  has  gone  bad.  If one host stops returning pings, but
other hosts on the same network still do, then  it  is  reasonable  to
conclude that the host has crashed.

More  sophisticated  network  monitoring  requires  the ability to get
specific status and statistical information from  various  devices  on
the  network.   These should include various sorts of datagram counts,
as well as counts of errors of various kinds.  This data is likely  to
be  most detailed in a gateway, since the gateway classifies datagrams
using the protocols, and may even respond to certain types of datagram
itself.    However  bridges  and even buffered repeaters can certainly
have counts of datagrams forwarded, interface errors, etc.  It  should
be possible to collect this data from a central monitoring point.

There  is  now an official TCP/IP approach to network monitoring.  The
first stages use a related set of protocols, SGMP and SNMP.   Both  of
these  protocols  are designed to allow you to collect information and
to make changes in configuration parameters  for  gateways  and  other
entities  on  your  network.   You can run the corresponding interface
programs on any host in your network.    SGMP  is  now  available  for
several  commercial  gateways,  as  well  as for Unix systems that are
acting as gateways.  There is a limited set of information  which  any
SGMP  implementation  is  required  to  supply,  as  well as a uniform
mechanism for vendors to add information of their own.  By late  1988,
the  second  generation  of this protocol, SNMP, should be in service.
This is a slightly more sophisticated protocol.  It has with it a more
complete  set  of  information  that  can be monitored, called the MIB
(Management Information Base).  Unlike the somewhat ad hoc  collection
of  SGMP  variables,  the  MIB  is  the  result  of numerous committee
deliberations involving a number of vendors and users.  Eventually  it
is  expected  that  there will be a TCP/IP equivalent of CMIS, the ISO
network monitoring service.  However CMIS, and  its  protocols,  CMIP,
are  not  yet  official  ISO  standards,  so  they  are  still  in the
experimental stages.

In general terms all of these protocols  accomplish  the  same  thing:
They  allow  you to collect critical information in a uniform way from
all vendors' equipment.  You send commands as  UDP  datagrams  from  a
network  management  program  running  on  some  host in your network.
Generally the interaction is fairly simple,  with  a  single  pair  of
datagrams exchanged: a command and a response.  At the moment security
is fairly simple.  It  is  possible  to  require  what  amounts  to  a
password  in  the  command.   (In SGMP it is referred to as a "session
name", rather  than  a  password.)  More  elaborate,  encryption-based
security is being developed.

You  will  probably  want to configure the network management tools at
your disposal to do several different things.  For short-term  network
monitoring,  you will want to keep track of switches crashing or being
                                  50


taken down for maintenance, and of failure of communications lines and
other  hardware.  It is possible to configurate SGMP and SNMP to issue
"traps" (unsolicited messages) to a specified host or  list  of  hosts
when  some  of  these  critical events occur (e.g. lines up and down).
However it is unrealistic to expect a switch to  notify  you  when  it
crashes.    It  is  also  possible for trap messages to be lost due to
network failure or overload.  Thus  you  can't  depend  completely  on
traps.    You  should  also  poll  your  switches  regularly to gather
information.  Various displays are available, including a map of  your
network  where items change color as their status changes, and running
"strip charts" that  show  datagram  rates  and  other  items  through
selected switches.  This software is still in its early stages, so you
should expect to see a lot of change here.  However at the very  least
you  should  expect  to  be notified in some way of failures.  You may
also want to be able to take actions  to  reconfigure  the  system  in
response  to  failures,  although  security  issues make some managers
nervous about doing that through the existing management protocols.

The second type of monitoring you are likely  to  want  to  do  is  to
collect information for use in periodic reports on network utilization
and  performance.    For  this,  you  need  to  sample   each   switch
perodically,  and  retrieve numbers of interest.  At Rutgers we sample
hourly, and get the number of datagrams forwarded for IP and DECnet, a
count  of reloads, and various error counts.  These are reported daily
in some detail.  Monthly summaries are produced giving traffic through
each  gateway,  and a few key error rates chosen to indicate a gateway
that is being overloaded (datagrams dropped in input and output).

It should be possible to use monitoring techniques of this  kind  with
most  types  of switch.  At the moment, simple repeaters do not report
any statistics.  Since they do not generally have processors in  them,
doing  so  would  cause  a  major  increase in their cost.  However it
should be possible to put  network  management  software  in  buffered
repeaters,  bridges,  and  gateways.   Gateways are the most likely to
contain sophisticated  network  management  software.    Most  gateway
vendors  that  handle  IP  are  expected  to  implement the monitoring
protocols described above.  Many bridge vendors make  some  provisions
for   collecting   performance   data.      Since   bridges   are  not
protocol-specific, most of them do not have the software necessary  to
implement  TCP/IP-based  network management protocols.  In some cases,
monitoring can be done only by typing commands to a  directly-attached
console.    (We  have  seen one case where it is necessary to take the
bridge out of service to gather this data.)  In  other  cases,  it  is
possible  to  gather data via the network, but the monitoring protocol
is ad hoc or even proprietary.

Except for very small networks, you should probably  insist  that  any
switch  more  complex than a simple repeater should collect statistics
and provide some way of querying  them  remotely.    Portions  of  the
network  that  do  not  support  such  operations  can be monitored by
pinging.  However ping simply detects  gross  failures.  It  does  not
allow  you  to  look  at  the  noise  level of a serial line and other
quantities needed to do high-quality maintenance.  In  the  long  run,
you  can  expect  the  most  software  to  be  available  for standard
protocols such as SGMP/SNMP and CMIS.  However proprietary  monitoring
                                  51


tools may be sufficient as long as they work with all of the equipment
that you have.


6.3.5 A final evaluation


Here is a summary of the places where each kind of  switch  technology
is normally used:

   - Repeaters are normally confined to a single building.  Since they
     provide no traffic isolation, you must make sure that the  entire
     set of networks connected by repeaters can carry the traffic from
     all of the computers on it.   Since  they  generally  provide  no
     network  monitoring tools, you will not want to use repeaters for
     a link that is likely to fail.

   - Bridges and gateways should be placed sufficiently frequently  to
     break  your  network  into pieces for which the traffic volume is
     manageable.  You may want to place bridges or  gateways  even  in
     places  where  traffic  level  alone  would  not require them for
     network monitoring reasons.

   - Because bridges must pass broadcast datagrams, there is  a  limit
     to the size network you can construct using them.  It is probably
     a good idea to limit  the  network  connected  by  bridges  to  a
     hundred systems or so.  This number can be increased somewhat for
     bridges with good facilities for filtering.

   - Because certain kinds of  network  misbehavior  will  be  passed,
     bridges should be used only among portions of the network where a
     single group is responsible for diagnosing problems.  You have to
     be  crazy  to  use  a  bridge between networks owned by different
     organizations.  Portions of your network  where  experiments  are
     being  done  in network technology should always be isolated from
     the rest of the network by gateways.

   - For many applications it is more important to  choose  a  product
     with  the  right  combination  of performance, network management
     tools, and other features  than  to  make  the  decision  between
     bridges and gateways.


7. Configuring Gateways


This  section  deals  with  configuration  issues that are specific to
gateways.  Gateways that handle  IP  are  themselves  Internet  hosts.
Thus  the  discussions  above  on  configuring  addresses  and routing
information apply to gateways as well as to hosts.  The exact way  you
configure  a  gateway will depend upon the vendor.  In some cases, you
edit files stored on a disk  in  the  gateway  itself.    However  for
reliability reasons most gateways do not have disks of their own.  For
                                  52


them, configuration information is stored in non-volatile memory or in
configuration  files  that  are uploaded from one or more hosts on the
network.

At a minimum, configuration involves specifying  the  IP  address  and
address  mask  for each interface, and enabling an appropriate routing
protocol.  However generally a few other options are desirable.  There
are often parameters in addition to the IP address that you should set
for each interface.

One important parameter is the broadcast address.  As explained above,
older  software may react badly when broadcasts are sent using the new
standard broadcast address.  For this reason, some vendors  allow  you
to choose a broadcast address to be used on each interface.  It should
be set using your knowledge of what  computers  are  on  each  of  the
networks.    In  general  if the computers follow current standards, a
broadcast address of 255.255.255.255 should be used.    However  older
implementations  may  behave better with other addresses, particularly
the address that uses zeros for the host number.    (For  the  network
128.6  this  would be 128.6.0.0.  For compatibility with software that
does not implement subnets, you would use 128.6.0.0 as  the  broadcast
address  even  for  a  subnet  such as 128.6.4.) You should watch your
network with  a  network  monitor  and  see  the  results  of  several
different  broadcast address choices.  If you make a bad choice, every
time the gateway sends a routing update broadcast,  many  machines  on
your  network  will respond with ARP's or ICMP errors.  Note that when
you change the broadcast address in  the  gateway,  you  may  need  to
change  it on the individual computers as well.  Generally the idea is
to change the address on the systems that you can  configure  to  give
behavior that is compatible with systems that you can't configure.

Other interface parameters may be necessary to deal with peculiarities
of the network it is connected to.  For example,  many  gateways  test
Ethernet  interfaces  to make sure that the cable is connected and the
transceiver is working correctly.  Some of these tests will  not  work
properly  with  the older Ethernet version 1 transceivers.  If you are
using such a transceiver, you would have  to  disable  this  keepalive
testing.    Similarly, gateways connected by a serial line normally do
regular testing to make sure that the line is still  working.    There
can be situations where this needs to be disabled.

Often  you  will have to enable features of the software that you want
to use.  For example, it is often necessary to  turn  on  the  network
management  protocol explicitly, and to give it the name or address of
a host that is running software to accept traps (error messages).

Most gateways have options that relate to security.    At  a  minimum,
this may include setting password for making changes remotely (and the
"session name" for SGMP).  If you need to control  access  to  certain
parts  of  your  network,  you will also need to define access control
lists or whatever other mechanism your gateway uses.

Gateways that load configuration information over the network  present
special  issues.    When  such  a  gateway  boots,  it sends broadcast
requests of various kinds, attempting to find its Internet address and
                                  53


then  to load configuration information.  Thus it is necessary to make
sure that there is some computer that is prepared to respond to  these
requests.    In  some cases, this is a dedicated micro running special
software.  In other cases, generic software is available that can  run
on a variety of machines.  You should consult your vendor to make sure
that this can be arranged.  For reliability reasons, you  should  make
sure  that  there  is  more  than  one  host  with the information and
programs that your gateways need.  In some  cases  you  will  have  to
maintain  several  different files.  For example, the gateways used at
Rutgers use a program called "bootp" to supply their Internet address,
and  they then load the code and configuration information using TFTP.
This means that we have to maintain a file  for  bootp  that  contains
Ethernet  and  Internet addresses for each gateway, and a set of files
containing other configuration information for each gateway.  If  your
network  is  large,  it is worth taking some trouble to make sure that
this information remains consistent.  We keep master copies of all  of
the  configuration information on a single computer, and distribute it
to other systems when it changes, using the Unix  utilities  make  and
rdist.    If  your  gateway  has  an  option  to  store  configuration
information in non-volatile memory, you will eliminate some  of  these
logistical  headaches.    However this presents its own problems.  The
contents of non-volatile memory should be backed up  in  some  central
location.   It will also be harder for network management personnel to
review configuration  information  if  it  is  distributed  among  the
gateways.

Starting   a   gateway   is   particularly  challenging  if  it  loads
configuration information from  a  distant  portion  of  the  network.
Gateways  that  expect  to  take  configuration  information  from the
network generally issue broadcast requests on all of the  networks  to
which  they  are  connected.    If there is a computer on one of those
networks that is prepared  to  respond  to  the  request,  things  are
straightforward.    However  some  gateways may be in remote locations
where there are no  nearby  computer  systems  that  can  support  the
necessary protocols.  In this case, it is necessary to arrange for the
requests to be routed back to the network where there are  appropriate
computers.  This requires what is strictly speaking a violation of the
basic design philosophy for gateways.  Generally a gateway should  not
allow  broadcasts  from  one  network  to  pass through to an adjacent
network.  In order to allow  a  gateway  to  get  information  from  a
computer  on  a  different  network,  at  least one of the gateways in
between will have to be configured to pass  the  particular  class  of
broadcasts  used  to retrieve this information.  If you have this sort
of configuration, you should test the loading process regularly.    It
is  not  unusual  to  find  that gateways do not come up after a power
failure because someone changed the configuration of  another  gateway
and made it impossible to load some necessary information.


                                  54


7.1 Configuring routing for gateways


The final topic to be considered is configuring routing.  This is more
complex for a gateway than for a normal host.    Most  TCP/IP  experts
recommend that routing be left to the gateways.  Thus hosts may simply
have a default route that points to the nearest gateway.    Of  course
the  gateways  themselves  can't  get by with this.  They need to have
complete routing tables.

In order to understand how to configure a gateway, we have to look  in
a  bit more detail at how gateways communicate routes.  When you first
turn on a gateway, the only networks it knows about are the ones  that
are   directly   connected   to  it.    (They  are  specified  by  the
configuration information.)  In order to find out how to get  to  more
distant  parts  of  the  network,  it engages in some sort of "routing
protocol".  A routing protocol is simply a protocol that  allows  each
gateway  to advertise which networks it can get to, and to spread that
information from one gateway to the next.   Eventually  every  gateway
should  know  how to get to every network.  There are different styles
of routing protocol.  In one common type, gateways talk only to nearby
gateways.    In  another  type,  every  gateway  builds  up a database
describing every other gateway in the system.    However  all  of  the
protocols have some way for each gateway in the system to find out how
to get to every destination.

A metric is some number or set of numbers that can be used to  compare
routes.    The  routing  table is constructed by gathering information
from other gateways.  If two other gateways claim to be able to get to
the  same destination, there must be some way of deciding which one to
use.  The metric is used to make that decision.  Metrics all  indicate
in  some  general  sense the "cost" of a route.  This may be a cost in
dollars  of  sending  datagrams  over  that  route,   the   delay   in
milliseconds,  or  some  other measure.  The simplest metric is just a
count of the number of gateways along the path.  This is  referred  to
as  a  "hop  count".   Generally this metric information is set in the
gateway configuration files, or is derived from information  appearing
there.

At  a minimum, routing configuration is likely to consist of a command
to enable the routing protocol that you want to  use.    Most  vendors
will have a prefered routing protocol.  Unless you have some reason to
choose another, you should use that.  The normal reason  for  choosing
another  protocol  is  for  compatibility with other kinds of gateway.
For example, your network may be  connected  to  a  national  backbone
network  that  requires  you to use EGP (exterior gateway protocol) to
communicate routes with it.  EGP is only appropriate for that specific
case.    You  should  not use EGP within your own network, but you may
need to use it  in  addition  to  your  regular  routing  protocol  to
communicate  with a national network.  If your own network has several
different types of gateway, then  you  may  need  to  pick  a  routing
protocol  that  all of them support.  At the moment, this is likely to
be RIP (Routing Information Protocol).  Depending upon the  complexity
of  your  network,  you  could  use  RIP  throughout it, or use a more
sophisticated protocol among the gateways that support it, and use RIP
                                  55


only at the boundary between gateways from different vendors.

Assuming  that  you  have  chosen a routing protocol and turned it on,
there are some additional decisions that you may need to make.  One of
the  more  basic configuration options has to do with supplying metric
information.  As indicated above, metrics are numbers which  are  used
to decide which route is the best.  Unsophisticated routing protocols,
e.g. RIP, normally just count hops.  So a route that passes through  2
gateways  would  be  considered better than one that passes through 3.
Of course if the latter route used 1.5Mbps lines and the  former  9600
bps  lines,  this  would  be  the  wrong  decision.  Thus most routing
protocols allow you to set parameters to take this sort of thing  into
account.  With RIP, you would arrange to treat the 9600 bps line as if
it were several hops.  You would  increase  the  effective  hop  count
until  the  better route was chosen.  More sophisticated protocols may
take the bit rate of the line into account automatically.  However you
should  be on the lookout for configuration parameters that need to be
set.    Generally  these  parameters  will  be  associated  with   the
particular  interface.   For example, with RIP you would have to set a
metric value for the interface connected to the 9600 bps line.    With
protocols  that  are  based on bit rate, you might need to specify the
speed  of  each  line  (if  the   gateway   cannot   figure   it   out
automatically).

Most  routing  protocols  are  designed  to let each gateway learn the
topology of the entire network, and to choose the best possible  route
for  each  datagram.  In some cases you may not want to use the "best"
route.  You may want traffic to stay out of a certain portion  of  the
network  for  security  or  cost  reasons.   One way to institute such
controls is by specifying routing options.  These options  are  likely
to be different for different vendors.  But the basic strategy is that
if the rest of the network doesn't know about a  route,  it  won't  be
used.    So  controls normally take the form of limiting the spread of
information about routes whose use you want to control.

Note that there  are  ways  for  the  user  to  override  the  routing
decisions made by your gateways.  If you really need to control access
to a certain network, you will have to do two separate things:

   - Use routing controls to make sure that the gateways use only  the
     routes you want them to.

   - Use access control lists on the gateways that are adjacent to the
     sensitive networks.

These two mechanisms act at different levels.   The  routing  controls
affect  what  happens  to most datagrams: those where the user has not
specified routing manually.  Your routing mechanism must be set up  to
choose an acceptable route for them.  The access control list provides
an additional limitation which prevents users from supplying their own
routing and bypassing your controls.

For  reliability  and  security reasons, there may also be controls to
allow you to list the gateways from which you will accept information.
It  may  also  be possible to rank gateways by priority.  For example,
                                  56


you might decide to listen to routes from within your own organization
before   routes  from  other  organizations  or  other  parts  of  the
organization.  This would  have  the  effect  of  having  traffic  use
internal  routes  in preference to external ones, even if the external
ones appear to be better.

If you use several different routing protocols, you will probably have
some  decisions  to  make regarding how much information to pass among
them.  Since multiple routing  protocols  are  often  associated  with
multiple  organizations,  you  must be sure to make these decisions in
consultation  with  management  of  all  of  the  relevant   networks.
Decisions  that  you  make may have consequences for the other network
which are not immediately obvious.  You might think it would  be  best
to  configure  the gateway so that everything it knows is passed on by
all routing protocols.  However here are some reasons why you may  not
want to do so:

   - The  metrics  used  by  different  routing  protocols  may not be
     comparable.  If you  are  connected  to  two  different  external
     networks,  you  want to specify that one should always be used in
     preference to the other, or that the nearest one should be  used,
     rather  than  attempting  to  compare metric information received
     from the two networks to see which has the better route.

   - EGP is particularly sensitive, because the  EGP  protocol  cannot
     handle  loops.    Thus  there  are  strict  rules  governing what
     information may be communicated to a backbone that uses EGP.   In
     situations  where  EGP  is being used, management of the backbone
     network should help you configure your routing.

   - If you have slow lines in your network (9600 bps or slower),  you
     may  prefer  not  to send a complete routing table throughout the
     network.  If you are connected to an external  network,  you  may
     prefer  to treat it as a default route, rather than to inject all
     of its routing information into your routing protocol.


                                  57