367 lines
18 KiB
Plaintext
367 lines
18 KiB
Plaintext
|
|
Archive-name: cryptography-faq/part08
|
|
|
|
|
|
This is the eighth of ten parts of the sci.crypt FAQ. The parts are
|
|
mostly independent, but you should read the first part before the rest.
|
|
We don't have the time to send out missing parts by mail, so don't ask.
|
|
Notes such as ``[KAH67]'' refer to the reference list in the last part.
|
|
|
|
|
|
|
|
Contents
|
|
|
|
8.1. How do I recover from lost passwords in WordPerfect?
|
|
8.2. How do I break a Vigenere (repeated-key) cipher?
|
|
8.3. How do I send encrypted mail under UNIX? [PGP, RIPEM, PEM, ...]
|
|
8.4. Is the UNIX crypt command secure?
|
|
8.5. How do I use compression with encryption?
|
|
8.6. Is there an unbreakable cipher?
|
|
8.7. What does ``random'' mean in cryptography?
|
|
8.8. What is the unicity point (a.k.a. unicity distance)?
|
|
8.9. What is key management and why is it important?
|
|
8.10. Can I use pseudo-random or chaotic numbers as a key stream?
|
|
8.11. What is the correct frequency list for English letters?
|
|
8.12. What is the Enigma?
|
|
8.13. How do I shuffle cards?
|
|
8.14. Can I foil S/W pirates by encrypting my CD-ROM?
|
|
8.15. Can you do automatic cryptanalysis of simple ciphers?
|
|
8.16. What is the coding system used by VCR+?
|
|
|
|
|
|
8.1. How do I recover from lost passwords in WordPerfect?
|
|
|
|
WordPerfect encryption has been shown to be very easy to break.
|
|
The method uses XOR with two repeating key streams: a typed password
|
|
and a byte-wide counter initialized to 1+<the password length>. Full
|
|
descriptions are given in Bennett [BEN87] and Bergen and Caelli
|
|
[BER91].
|
|
|
|
Chris Galas writes: ``Someone awhile back was looking for a way to
|
|
decrypt WordPerfect document files and I think I have a solution.
|
|
There is a software company named: Accessdata (87 East 600 South,
|
|
Orem, UT 84058), 1-800-658-5199 that has a software package that will
|
|
decrypt any WordPerfect, Lotus 1-2-3, Quatro-Pro, MS Excel and Paradox
|
|
files. The cost of the package is $185. Steep prices, but if you
|
|
think your pw key is less than 10 characters, (or 10 char) give them a
|
|
call and ask for the free demo disk. The demo disk will decrypt files
|
|
that have a 10 char or less pw key.'' Bruce Schneier says the phone
|
|
number for AccessData is 801-224-6970.
|
|
|
|
8.2. How do I break a Vigenere (repeated-key) cipher?
|
|
|
|
A repeated-key cipher, where the ciphertext is something like the
|
|
plaintext xor KEYKEYKEYKEY (and so on), is called a Vigenere cipher.
|
|
If the key is not too long and the plaintext is in English, do the
|
|
following:
|
|
|
|
1. Discover the length of the key by counting coincidences.
|
|
(See Gaines [GAI44], Sinkov [SIN66].) Trying each displacement of
|
|
the ciphertext against itself, count those bytes which are equal.
|
|
If the two ciphertext portions have used the same key, something
|
|
over 6% of the bytes will be equal. If they have used different
|
|
keys, then less than 0.4% will be equal (assuming random 8-bit bytes
|
|
of key covering normal ASCII text). The smallest displacement which
|
|
indicates an equal key is the length of the repeated key.
|
|
|
|
2. Shift the text by that length and XOR it with itself. This
|
|
removes the key and leaves you with text XORed with itself. Since
|
|
English has about 1 bit of real information per byte, 2 streams of
|
|
text XORed together has 2 bits of info per 8-bit byte, providing
|
|
plenty of redundancy for choosing a unique decryption. (And in fact
|
|
one stream of text XORed with itself has just 1 bit per byte.)
|
|
|
|
If the key is short, it might be even easier to treat this as a
|
|
standard polyalphabetic substitution. All the old cryptanalysis
|
|
texts show how to break those. It's possible with those methods, in
|
|
the hands of an expert, if there's only ten times as much text as key.
|
|
See, for example, Gaines [GAI44], Sinkov [SIN66].
|
|
|
|
8.3. How do I send encrypted mail under UNIX? [PGP, RIPEM, PEM, ...]
|
|
|
|
Here's one popular method, using the des command:
|
|
|
|
cat file | compress | des private_key | uuencode | mail
|
|
|
|
Meanwhile, there is a de jure Internet standard in the works called
|
|
PEM (Privacy Enhanced Mail). It is described in RFCs 1421 through
|
|
1424. To join the PEM mailing list, contact pem-dev-request@tis.com.
|
|
There is a beta version of PEM being tested at the time of this
|
|
writing.
|
|
|
|
There are also two programs available in the public domain for encrypting
|
|
mail: PGP and RIPEM. Both are available by FTP. Each has its own
|
|
newsgroup: alt.security.pgp and alt.security.ripem. Each has its own FAQ
|
|
as well.
|
|
|
|
PGP is most commonly used outside the USA since it uses the RSA algorithm
|
|
without a license and RSA's patent is valid only (or at least primarily)
|
|
in the USA.
|
|
|
|
RIPEM is most commonly used inside the USA since it uses the RSAREF which
|
|
is freely available within the USA but not available for shipment outside
|
|
the USA.
|
|
|
|
Since both programs use a secret key algorithm for encrypting the body of
|
|
the message (PGP used IDEA; RIPEM uses DES) and RSA for encrypting the
|
|
message key, they should be able to interoperate freely. Although there
|
|
have been repeated calls for each to understand the other's formats and
|
|
algorithm choices, no interoperation is available at this time (as far as
|
|
we know).
|
|
|
|
8.4. Is the UNIX crypt command secure?
|
|
|
|
No. See [REE84]. There is a program available called cbw (crypt
|
|
breaker's workbench) which can be used to do ciphertext-only attacks
|
|
on files encrypted with crypt. One source for CBW is [FTPCB].
|
|
|
|
8.5. How do I use compression with encryption?
|
|
|
|
A number of people have proposed doing perfect compression followed by
|
|
some simple encryption method (e.g., XOR with a repeated key).
|
|
|
|
Unfortunately, you can only compress perfectly if you know the exact
|
|
distribution of possible inputs. For all practical purposes it's
|
|
impossible to describe ``the typical English text'' beyond coarse
|
|
characteristics such as single-letter frequencies. You can build up
|
|
more and more sophisticated models of your inputs, but if the enemy
|
|
has a slightly more accurate model, he'll be able to find some
|
|
redundancy in your compressed output.
|
|
|
|
Note that nearly all practical compression schemes, unless they
|
|
have been designed with cryptography in mind, produce output that
|
|
actually starts off with high redundancy. For example, the output of
|
|
UNIX compress begins with a well-known three-byte ``magic number''
|
|
that can serve as an entering wedge for cryptanalysis.
|
|
|
|
This is not to say that compression before encryption is inherently a
|
|
bad idea; it just has to be done very, very carefully, and by no means
|
|
removes the need for strong encryption.
|
|
|
|
Compression after encryption is silly.
|
|
|
|
8.6. Is there an unbreakable cipher?
|
|
|
|
Yes. The one-time pad is unbreakable; see part 4. Unfortunately the
|
|
one-time pad requires secure distribution of as much key material as
|
|
plaintext.
|
|
|
|
Of course, a cryptosystem need not be utterly unbreakable to be
|
|
useful. Rather, it needs to be strong enough to resist attacks by
|
|
likely enemies for whatever length of time the data it protects is
|
|
expected to remain valid.
|
|
|
|
8.7. What does ``random'' mean in cryptography?
|
|
|
|
Cryptographic applications demand much more out of a pseudorandom
|
|
number generator than most applications. For a source of bits to be
|
|
cryptographically random, it must be computationally impossible to
|
|
predict what the Nth random bit will be given complete knowledge of
|
|
the algorithm or hardware generating the stream and the sequence of
|
|
0th through N-1st bits, for all N up to the lifetime of the source.
|
|
|
|
A software generator (also known as pseudo-random) has the function
|
|
of expanding a truly random seed to a longer string of apparently
|
|
random bits. This seed must be large enough not to be guessed by
|
|
the opponent. Ideally, it should also be truly random (perhaps
|
|
generated by a hardware random number source).
|
|
|
|
Those who have Sparcstation 1 workstations could, for example,
|
|
generate random numbers using the audio input device as a source of
|
|
entropy, by not connecting anything to it. For example,
|
|
|
|
cat /dev/audio | compress - >foo
|
|
|
|
gives a file of high entropy (not random but with much randomness in
|
|
it). One can then encrypt that file using part of itself as a key,
|
|
for example, to convert that seed entropy into a pseudo-random
|
|
string.
|
|
|
|
When looking for hardware devices to provide this entropy, it is
|
|
important really to measure the entropy rather than just assume that
|
|
because it looks complicated to a human, it must be "random". For
|
|
example, disk operation completion times sound like they might be
|
|
unpredictable (to many people) but a spinning disk is much like a
|
|
clock and its output completion times are relatively low in entropy.
|
|
|
|
8.8. What is the unicity point (a.k.a. unicity distance)?
|
|
|
|
See [SHA49]. The unicity distance is an approximation to that amount
|
|
of ciphertext such that the sum of the real information (entropy) in
|
|
the corresponding source text and encryption key equals the number
|
|
of ciphertext bits used. Ciphertexts significantly longer than this
|
|
can be shown probably to have a unique decipherment. This is used to
|
|
back up a claim of the validity of a ciphertext-only cryptanalysis.
|
|
Ciphertexts significantly shorter than this are likely to have
|
|
multiple, equally valid decryptions and therefore to gain security
|
|
from the opponent's difficulty choosing the correct one.
|
|
|
|
Unicity distance, like all statistical or information-theoretic
|
|
measures, does not make deterministic predictions but rather gives
|
|
probabilistic results: namely, the minimum amount of ciphertext
|
|
for which it is likely that there is only a single intelligible
|
|
plaintext corresponding to the ciphertext, when all possible keys
|
|
are tried for the decryption. Working cryptologists don't normally
|
|
deal with unicity distance as such. Instead they directly determine
|
|
the likelihood of events of interest.
|
|
|
|
Let the unicity distance of a cipher be D characters. If fewer than
|
|
D ciphertext characters have been intercepted, then there is not
|
|
enough information to distinguish the real key from a set of
|
|
possible keys. DES has a unicity distance of 17.5 characters,
|
|
which is less than 3 ciphertext blocks (each block corresponds to
|
|
8 ASCII characters). This may seem alarmingly low at first, but
|
|
the unicity distance gives no indication of the computational work
|
|
required to find the key after approximately D characters have been
|
|
intercepted.
|
|
|
|
In fact, actual cryptanalysis seldom proceeds along the lines used
|
|
in discussing unicity distance. (Like other measures such as key
|
|
size, unicity distance is something that guarantees insecurity if
|
|
it's too small, but doesn't guarantee security if it's high.) Few
|
|
practical cryptosystems are absolutely impervious to analysis; all
|
|
manner of characteristics might serve as entering ``wedges'' to crack
|
|
some cipher messages. However, similar information-theoretic
|
|
considerations are occasionally useful, for example, to determine a
|
|
recommended key change interval for a particular cryptosystem.
|
|
Cryptanalysts also employ a variety of statistical and
|
|
information-theoretic tests to help guide the analysis in the most
|
|
promising directions.
|
|
|
|
Unfortunately, most literature on the application of information
|
|
statistics to cryptanalysis remains classified, even the seminal
|
|
1940 work of Alan Turing (see [KOZ84]). For some insight into the
|
|
possibilities, see [KUL68] and [GOO83].
|
|
|
|
8.9. What is key management and why is it important?
|
|
|
|
One of the fundamental axioms of cryptography is that the enemy is in
|
|
full possession of the details of the general cryptographic system,
|
|
and lacks only the specific key data employed in the encryption. (Of
|
|
course, one would assume that the CIA does not make a habit of telling
|
|
Mossad about its cryptosystems, but Mossad probably finds out anyway.)
|
|
Repeated use of a finite amount of key provides redundancy that can
|
|
eventually facilitate cryptanalytic progress. Thus, especially in
|
|
modern communication systems where vast amounts of information are
|
|
transferred, both parties must have not only a sound cryptosystem but
|
|
also enough key material to cover the traffic.
|
|
|
|
Key management refers to the distribution, authentication, and
|
|
handling of keys.
|
|
|
|
A publicly accessible example of modern key management technology
|
|
is the STU III secure telephone unit, which for classified use
|
|
employs individual coded ``Crypto Ignition Keys'' and a central Key
|
|
Management Center operated by NSA. There is a hierarchy in that
|
|
certain CIKs are used by authorized cryptographic control
|
|
personnel to validate the issuance of individual traffic keys and
|
|
to perform installation/maintenance functions, such as the
|
|
reporting of lost CIKs.
|
|
|
|
This should give an inkling of the extent of the key management
|
|
problem. For public-key systems, there are several related issues,
|
|
many having to do with ``whom do you trust?''
|
|
|
|
8.10. Can I use pseudo-random or chaotic numbers as a key stream?
|
|
|
|
Chaotic equations and fractals produce an apparent randomness from
|
|
relatively compact generators. Perhaps the simplest example is a
|
|
linear congruential sequence, one of the most popular types of random
|
|
number generators, where there is no obvious dependence between seeds
|
|
and outputs. Unfortunately the graph of any such sequence will, in a
|
|
high enough dimension, show up as a regular lattice. Mathematically
|
|
this lattice corresponds to structure which is notoriously easy for
|
|
cryptanalysts to exploit. More complicated generators have more
|
|
complicated structure, which is why they make interesting pictures---
|
|
but a cryptographically strong sequence will have no computable
|
|
structure at all.
|
|
|
|
See [KNU81], exercise 3.5-7; [REE77]; and [BOY89].
|
|
|
|
8.11. What is the correct frequency list for English letters?
|
|
|
|
There are three answers to this question, each slightly deeper than
|
|
the one before. You can find the first answer in various books:
|
|
namely, a frequency list computed directly from a certain sample of
|
|
English text.
|
|
|
|
The second answer is that ``the English language'' varies from author
|
|
to author and has changed over time, so there is no definitive list.
|
|
Of course the lists in the books are ``correctly'' computed, but
|
|
they're all different: exactly which list you get depends on which
|
|
sample was taken. Any particular message will have different
|
|
statistics from those of the language as a whole.
|
|
|
|
The third answer is that yes, no particular message is going to have
|
|
exactly the same characteristics as English in general, but for all
|
|
reasonable statistical uses these slight discrepancies won't matter.
|
|
In fact there's an entire field called ``Bayesian statistics'' (other
|
|
buzzwords are ``maximum entropy methods'' and ``maximum likelihood
|
|
estimation'') which studies questions like ``What's the chance that a
|
|
text with these letter frequencies is in English?'' and comes up with
|
|
reasonably robust answers.
|
|
|
|
So make your own list from your own samples of English text. It will
|
|
be good enough for practical work, if you use it properly.
|
|
|
|
8.12. What is the Enigma?
|
|
|
|
``For a project in data security we are looking for sources of
|
|
information about the German Enigma code and how it was broken by
|
|
the British during WWII.''
|
|
|
|
See [WEL82], [DEA85], [KOZ84], [HOD83], [KAH91].
|
|
|
|
8.13. How do I shuffle cards?
|
|
|
|
Card shuffling is a special case of the permutation of an array of
|
|
values, using a random or pseudo-random function. All possible output
|
|
permutations of this process should be equally likely. To do this, you
|
|
need a random function (modran(x)) which will produce a uniformly
|
|
distributed random integer in the interval [0..x-1]. Given that
|
|
function, you can shuffle with the following [C] code: (assuming ARRLTH
|
|
is the length of array arr[] and swap() interchanges values at the two
|
|
addresses given)
|
|
|
|
for ( n = ARRLTH-1; n > 0 ; n-- ) swap( &arr[modran( n+1 )], &arr[n] ) ;
|
|
|
|
modran(x) can not be achieved exactly with a simple (ranno() % x) since
|
|
ranno()'s interval may not be divisible by x, although in most cases the
|
|
error will be very small. To cover this case, one can take ranno()'s
|
|
modulus mod x, call that number y, and if ranno() returns a value less
|
|
than y, go back and get another ranno() value.
|
|
|
|
See [KNU81] for further discussion.
|
|
|
|
8.14. Can I foil S/W pirates by encrypting my CD-ROM?
|
|
|
|
Someone will frequently express the desire to publish a CD-ROM with
|
|
possibly multiple pieces of software, perhaps with each encrypted
|
|
separately, and will want to use different keys for each user (perhaps
|
|
even good for only a limited period of time) in order to avoid piracy.
|
|
|
|
As far as we know, this is impossible, since there is nothing in standard
|
|
PC or workstation hardware which uniquely identifies the user at the
|
|
keyboard. If there were such an identification, then the CD-ROM could be
|
|
encrypted with a key based in part on the one sold to the user and in
|
|
part on the unique identifier. However, in this case the CD-ROM is one
|
|
of a kind and that defeats the intended purpose.
|
|
|
|
If the CD-ROM is to be encrypted once and then mass produced, there must
|
|
be a key (or set of keys) for that encryption produced at some stage in
|
|
the process. That key is useable with any copy of the CD-ROM's data.
|
|
The pirate needs only to isolate that key and sell it along with the
|
|
illegal copy.
|
|
|
|
8.15. Can you do automatic cryptanalysis of simple ciphers?
|
|
|
|
Certainly. For commercial products you can try AccessData; see
|
|
question 8.1. We are not aware of any FTP sites for such software,
|
|
but there are many papers on the subject. See [PEL79], [LUC88],
|
|
[CAR86], [CAR87], [KOC87], [KOC88], [KIN92], [KIN93], [SPI93].
|
|
|
|
8.16. What is the coding system used by VCR+?
|
|
|
|
One very frequently asked question in sci.crypt is how the VCR+ codes
|
|
work. See [SHI92] for an attempt to describe it.
|