textfiles/phreak/password.sec


 ***************************************************************
 **                                                           **
 **          A Case History on Password Security              **
 **   by Robert Morris (Mr. Internet Worm) and Ken Thompson   **
 **       Aquired by Weapons, Distriubted by THUGS.           **
 **              A Solsbury Hill Text File.                   **
 **                                                           **
 ***************************************************************


delim $$ Password Security: A  Case  History  Robert  Morris  Ken
Thompson  This  paper  describes the history of the design of the
password security scheme on a remotely accessed time-sharing sys-
tem.   The  present  design was the result of countering observed
attempts to penetrate the system.  The  result  is  a  compromise
between  extreme security and ease of use.  INTRODUCTION Password
security on the time-sharing system [1] is provided by a  collec-
tion  of  programs whose elaborate and strange design is the out-
growth of many years of experience  with  earlier  versions.   To
help  develop  a secure system, we have had a continuing competi-
tion to devise new ways to attack the security of the system (the
bad  guy)  and,  at  the  same  time, to devise new techniques to
resist the new attacks (the good guy).  This competition has been
in  the  same  vein  as  the competition of long standing between
manufacturers of armor plate and those of armor-piercing  shells.
For this reason, the description that follows will trace the his-
tory of the password system rather  than  simply  presenting  the
program  in  its current state.  In this way, the reasons for the
design will be made clearer, as the design cannot  be  understood
without  also understanding the potential attacks.  An underlying
goal has been to provide password security at minimal  inconveni-
ence  to the users of the system.  For example, those who want to
run a completely open system without passwords, or to have  pass-
words  only at the option of the individual users, are able to do
so, while those who require all of their users to have  passwords
gain  a high degree of security against penetration of the system
by unauthorized users.  The password system must be able not only
to  prevent  any access to the system by unauthorized users (i.e.
prevent them from logging in at all), but it  must  also  prevent
users  who  are already logged in from doing things that they are
not authorized to do.  The so called ``super-user'' password, for
example,  is  especially  critical because the super-user has all
sorts of permissions and has essentially unlimited access to  all
system  resources.   Password security is of course only one com-
ponent of overall system security, but it is  an  essential  com-
ponent.   Experience has shown that attempts to penetrate remote-
access systems have been  astonishingly  sophisticated.   Remote-
access  systems  are peculiarly vulnerable to penetration by out-
siders as there are threats at the  remote  terminal,  along  the
communications link, as well as at the computer itself.  Although
the security of a password encryption algorithm is an interesting
intellectual  and mathematical problem, it is only one tiny facet
of a very large problem.  In practice, physical security  of  the
computer, communications security of the communications link, and
physical control of the computer itself loom as far  more  impor-
tant  issues.   Perhaps most important of all is control over the
actions of ex-employees, since they are not under any direct con-
trol  and  they may have intimate knowledge about the system, its
resources, and methods of access.  Good system security  involves
realistic  evaluation of the risks not only of deliberate attacks
but also of casual unauthorized access and accidental disclosure.
PROLOGUE  The  UNIX  system was first implemented with a password
file that contained the actual passwords of all  the  users,  and
for  that  reason  the  password file had to be heavily protected
against being either read  or  written.   Although  historically,
this  had  been  the technique used for remote-access systems, it
was completely unsatisfactory for several reasons.  The technique
is  excessively vulnerable to lapses in security.  Temporary loss
of protection can occur when the password file is being edited or
otherwise  modified.   There  is  no way to prevent the making of
copies by privileged  users.   Experience  with  several  earlier
remote-access  systems showed that such lapses occur with fright-
ening frequency.  Perhaps the most memorable  such  occasion  oc-
curred  in the early 60's when a system administrator on the CTSS
system at MIT was editing the password file  and  another  system
administrator  was  editing  the daily message that is printed on
everyone's terminal on login.  Due to a  software  design  error,
the temporary editor files of the two users were interchanged and
thus, for a time, the password file was printed on every terminal
when  it  was  logged in.  Once such a lapse in security has been
discovered, everyone's password must be changed,  usually  simul-
taneously,  at a considerable administrative cost.  This is not a
great matter, but far more serious is  the  high  probability  of
such  lapses going unnoticed by the system administrators.  Secu-
rity against unauthorized disclosure of the passwords was, in the
last  analysis, impossible with this system because, for example,
if the contents of the file system are put on  to  magnetic  tape
for  backup, as they must be, then anyone who has physical access
to the tape can read anything on it with  no  restriction.   Many
programs must get information of various kinds about the users of
the system, and these programs in general should have no  special
permission  to  read  the  password  file.  The information which
should have been in the password file  actually  was  distributed
(or  replicated)  into  a number of files, all of which had to be
updated whenever a user was added to or dropped from the  system.
THE  FIRST  SCHEME  The  obvious  solution is to arrange that the
passwords not appear in the system at all, and it is  not  diffi-
cult  to  decide  that this can be done by encrypting each user's
password, putting only the encrypted form in the  password  file,
and  throwing  away  his original password (the one that he typed
in).  When the user later tries to log  in  to  the  system,  the
password  that  he  types  is encrypted and compared with the en-
crypted version in the password file.  If the two match, his  lo-
gin  attempt  is  accepted.  Such a scheme was first described in
[3, p.91ff.].  It also seemed advisable to  devise  a  system  in
which  neither  the password file nor the password program itself
needed to be protected against being read by  anyone.   All  that
was  needed  to  implement these ideas was to find a means of en-
cryption that was very difficult to invert, even when the encryp-
tion  program  is  available.   Most  of  the standard encryption
methods used (in the past) for encryption of messages are  rather
easy  to invert.  A convenient and rather good encryption program
happened to exist on the system at the time; it simulated the  M-
209 cipher machine [4] used by the U.S. Army during World War II.
It turned out that the M-209 program was usable, but with a given
key,  the ciphers produced by this program are trivial to invert.
It is a much more difficult matter to find out the key given  the
cleartext input and the enciphered output of the program.  There-
fore, the password was used not as the text to be  encrypted  but
as the key, and a constant was encrypted using this key.  The en-
crypted result was entered into the password  file.   ATTACKS  ON
THE  FIRST  APPROACH  Suppose  that the bad guy has available the
text of the password encryption program and the complete password
file.  Suppose also that he has substantial computing capacity at
his disposal.  One obvious approach to penetrating  the  password
mechanism is to attempt to find a general method of inverting the
encryption algorithm.  Very possibly this can be  done,  but  few
successful  results  have  come to light, despite substantial ef-
forts extending over a period  of  more  than  five  years.   The
results have not proved to be very useful in penetrating systems.
Another approach to penetration is simply to keep  trying  poten-
tial passwords until one succeeds; this is a general cryptanalyt-
ic approach called key search.  Human beings being what they are,
there  is a strong tendency for people to choose relatively short
and simple passwords that they can remember.  Given free  choice,
most people will choose their passwords from a restricted charac-
ter set (e.g. all lower-case  letters),  and  will  often  choose
words  or  names.   This  human  habit makes the key search job a
great deal easier.  The critical factor involved in key search is
the  amount of time needed to encrypt a potential password and to
check the result against an entry in the password file.  The run-
ning  time  to  encrypt  one  trial password and check the result
turned out to be approximately 1.25 milliseconds on  a  PDP-11/70
when  the encryption algorithm was recoded for maximum speed.  It
is takes essentially no more time to  test  the  encrypted  trial
password against all the passwords in an entire password file, or
for that matter, against any collection of  encrypted  passwords,
perhaps  collected  from many installations.  If we want to check
all passwords of length n that  consist  entirely  of  lower-case
letters,  the number of such passwords is $26 sup n$.  If we sup-
pose that the password consists  of  printable  characters  only,
then  the  number of possible passwords is somewhat less than $95
sup n$.  (The standard  system  ``character  erase''  and  ``line
kill'' characters are, for example, not prime candidates.) We can
immediately estimate the running time of a program that will test
every  password  of  a  given  length  with all of its characters
chosen from some set of characters.  The  following  table  gives
estimates of the running time required on a PDP-11/70 to test all
possible character strings of length $n$ chosen from various sets
of  characters:  namely,  all  lower-case letters, all lower-case
letters plus digits, all alphanumeric characters, all  95  print-
able  ASCII  characters,  and  finally  all 128 ASCII characters.
cccccc  cccccc  nnnnnn.           26  lower-case   36  lower-case
letters   62    alphanumeric 95    printable    all   128   ASCII
n       letters and
digits      characters      characters      characters
1       30   msec.        40   msec.        80   msec.        120
msec.       160 msec.  2       800 msec.       2 sec.  5 sec.  11
sec. 20 sec.  3       22 sec. 58  sec. 5  min.  17  min. 43  min.
4       10   min. 35  min. 5  hrs.  28  hrs. 93  hrs.   5       4
hrs.  21 hrs. 318 hrs.  6       107 hrs.   One  has  to  conclude
that it is no great matter for someone with access to a PDP-11 to
test all lower-case alphabetic strings up  to  length  five  and,
given  access  to the machine for, say, several weekends, to test
all such strings up to six characters in length.  By using such a
program  against  a  collection  of actual encrypted passwords, a
substantial fraction of all the passwords will be found.  Another
profitable  approach for the bad guy is to use the word list from
a dictionary or to use a list of names.   For  example,  a  large
commercial  dictionary  contains  typicallly about 250,000 words;
these words can be checked in about five minutes.  Again,  a  no-
ticeable  fraction  of any collection of passwords will be found.
Improvements and extensions will be (and have been)  found  by  a
determined  bad  guy.   Some ``good'' things to try are: The dic-
tionary with the words spelled backwards.  A list of first  names
(best  obtained  from  some  mailing  list).   Last names, street
names, and city names also work well.   The  above  with  initial
upper-case  letters.   All  valid  license  plate numbers in your
state.  (This  takes  about  five  hours  in  New  Jersey.)  Room
numbers,  social  security  numbers,  telephone  numbers, and the
like.  The authors have conducted experiments to try to determine
typical  users'  habits  in  the choice of passwords when no con-
straint is put on their choice.  The results were  disappointing,
except  to the bad guy.  In a collection of 3,289 passwords gath-
ered from many users over a long period of time; 15 were a single
ASCII  character;  72  were  strings of two ASCII characters; 464
were strings of three ASCII characters; 477 were string  of  four
alphamerics;  706 were five letters, all upper-case or all lower-
case; 605 were six letters, all lower-case.   An  additional  492
passwords appeared in various available dictionaries, name lists,
and the like.  A total of 2,831, or 86% of this sample  of  pass-
words fell into one of these classes.  There was, of course, con-
siderable overlap between the dictionary results and the  charac-
ter string searches.  The dictionary search alone, which required
only five minutes to run, produced about one third of  the  pass-
words.   Users  could  be  urged (or forced) to use either longer
passwords or passwords chosen from a larger character set, or the
system  could itself choose passwords for the users.  AN ANECDOTE
An entertaining and instructive example is the  attempt  made  at
one  installation  to  force  users to use less predictable pass-
words.  The users did not choose their own passwords; the  system
supplied them.  The supplied passwords were eight characters long
and were taken from the character set  consisting  of  lower-case
letters  and  digits.   They  were  generated  by a pseudo-random
number generator with only $2 sup 15$ starting values.  The  time
required  to  search (again on a PDP-11/70) through all character
strings of length 8 from a 36-character alphabet  is  112  years.
Unfortunately, only $2 sup 15$ of them need be looked at, because
that is the number of possible outputs of the random number  gen-
erator.   The  bad  guy  did,  in fact, generate and test each of
these strings and found every one of the  system-generated  pass-
words  using  a  total  of only about one minute of machine time.
IMPROVEMENTS TO THE FIRST APPROACH Slower  Encryption  Obviously,
the  first  algorithm used was far too fast.  The announcement of
the DES encryption algorithm [2] by the National Bureau of  Stan-
dards  was  timely and fortunate.  The DES is, by design, hard to
invert, but equally valuable is the fact  that  it  is  extremely
slow  when  implemented in software.  The DES was implemented and
used in the following way: The  first  eight  characters  of  the
user's password are used as a key for the DES; then the algorithm
is used to encrypt a constant.  Although this constant is zero at
the   moment,   it   is   easily   accessible  and  can  be  made
installation-dependent.  Then the DES algorithm  is  iterated  25
times  and  the resulting 64 bits are repacked to become a string
of 11 printable characters.  Less Predictable Passwords The pass-
word  entry  program  was  modified so as to urge the user to use
more obscure passwords.  If the user enters an  alphabetic  pass-
word  (all upper-case or all lower-case) shorter than six charac-
ters, or a password from a larger character set shorter than five
characters, then the program asks him to enter a longer password.
This further reduces the efficacy of key search.  These  improve-
ments  make it exceedingly difficult to find any individual pass-
word.  The user is warned of the risks and if he  cooperates,  he
is very safe indeed.  On the other hand, he is not prevented from
using his spouse's name if he wants to.  Salted Passwords The key
search  technique is still likely to turn up a few passwords when
it is used on a large collection of passwords, and it seemed wise
to  make this task as difficult as possible.  To this end, when a
password is first entered, the password program obtains a  12-bit
random  number  (by reading the real-time clock) and appends this
to the password typed in by the user.  The concatenated string is
encrypted and both the 12-bit random quantity (called the $salt$)
and the 64-bit result of the  encryption  are  entered  into  the
password  file.   When  the user later logs in to the system, the
12-bit quantity is extracted from the password file and  appended
to  the typed password.  The encrypted result is required, as be-
fore, to be the same as the remaining 64  bits  in  the  password
file.   This  modification  does not increase the task of finding
any individual password, starting from scratch, but now the  work
of testing a given character string against a large collection of
encrypted passwords has been multiplied by  4096  ($2  sup  12$).
The  reason for this is that there are 4096 encrypted versions of
each password and one of them has been picked  more  or  less  at
random  by the system.  With this modification, it is likely that
the bad guy can spend days of computer  time  trying  to  find  a
password on a system with hundreds of passwords, and find none at
all.  More important is the fact that it becomes  impractical  to
prepare  an  encrypted  dictionary in advance.  Such an encrypted
dictionary could be used to crack new passwords  in  milliseconds
when  they  appear.   There is a (not inadvertent) side effect of
this modification.  It becomes  nearly  impossible  to  find  out
whether  a  person with passwords on two or more systems has used
the same password on all of them, unless you already  know  that.
The  Threat  of  the DES Chip Chips to perform the DES encryption
are already commercially available and they are very  fast.   The
use  of  such a chip speeds up the process of password hunting by
three orders of magnitude.  To avert this possibility, one of the
internal  tables  of  the  DES  algorithm (in particular, the so-
called E-table) is changed in a way that depends  on  the  12-bit
random  number.   The  E-table  is inseparably wired into the DES
chip, so that the commercial chip cannot be used.  Obviously, the
bad  guy could have his own chip designed and built, but the cost
would be unthinkable.  A Subtle Point To  login  successfully  on
the UNIX system, it is necessary after dialing in to type a valid
user name, and then the correct password for that user name.   It
is  poor  design to write the login command in such a way that it
tells an interloper when he has typed in  a  invalid  user  name.
The response to an invalid name should be identical to that for a
valid name.  When the slow encryption algorithm was first  imple-
mented,  the encryption was done only if the user name was valid,
because otherwise there was no encrypted password to compare with
the  supplied password.  The result was that the response was de-
layed by about one-half second if the name was valid, but was im-
mediate if invalid.  The bad guy could find out whether a partic-
ular user name was valid.  The routine was modified to do the en-
cryption  in  either  case.  CONCLUSIONS On the issue of password
security, UNIX is probably better than most systems.  The use  of
encrypted  passwords  appears reasonably secure in the absence of
serious attention of experts in the field.  It is also worth some
effort  to  conceal even the encrypted passwords.  Some UNIX sys-
tems have instituted what is called an ``external security code''
that  must be typed when dialing into the system, but before log-
ging in.  If this code is changed periodically, then someone with
an old password will likely be prevented from using it.  Whenever
any security procedure is instituted that attempts to deny access
to unauthorized persons, it is wise to keep a record of both suc-
cessful and unsuccessful attempts to get at the secured resource.
Just  as  an  out-of-hours  visitor to a computer center normally
must not only identify himself, but a record is usually also kept
of  his entry.  Just so, it is a wise precaution to make and keep
a record of all attempts to log into a remote-access time-sharing
system,  and  certainly all unsuccessful attempts.  Bad guys fall
on a spectrum whose one end is someone with ordinary access to  a^@
system  and whose goal is to find out a particular password (usu-
ally that of the super-user) and, at the other end,  someone  who
wishes  to  collect as much password information as possible from
as many systems as possible.  Most  of  the  work  reported  here
serves  to  frustrate  the  latter type; our experience indicates
that the former type of bad guy never was  very  successful.   We
recognize  that  a  time-sharing system must operate in a hostile
environment.  We did not attempt to hide the security aspects  of
the  operating system, thereby playing the customary make-believe
game in which weaknesses of  the  system  are  not  discussed  no
matter how apparent.  Rather we advertised the password algorithm
and invited attack in the belief that this approach would  minim-
ize  future  trouble.   The approach has been successful.  Refer-
ences Ritchie, D.M. and Thompson, K.  The UNIX Time-Sharing  Sys-
tem.   Comm.  ACM  17 (July 1974), pp. 365-375.  Proposed Federal
Information Processing Data Encryption Standard.  Federal  Regis-
ter  (40FR12134), March 17, 1975 Wilkes, M. V.  Time-Sharing Com-
puter Systems.  American Elsevier, New York, (1968).  U.  S.  Pa-
tent Number 2,089,603.


$