64 lines
3.0 KiB
Plaintext
64 lines
3.0 KiB
Plaintext
|
|
The Commodore ARC format
|
|
|
|
Disclaimer: The description below is a mere extrapolation from the files and
|
|
programs I have encountered.
|
|
|
|
ARC is used to compress multiple files into a single archive. There is no
|
|
signature at the beginning of the archives.
|
|
Normally archives start at the beginning of the file but there can be
|
|
different extractors prepended to the archive. A reliable method to get the
|
|
start offset of the archive is reading the BASIC line at the beginning of the
|
|
extractor, subtracting 6 from the BASIC line number and multiplying it by 254.
|
|
In case the argument of the SYS instruction starts with the digit 7, you have
|
|
to subtract 1 from the start offset.
|
|
The archives contain one or more blocks that consist of a file header and
|
|
the compressed data of the file. The original file data is compressed using LZ
|
|
compression bundled with a dynamic Huffman algorithm and is protected with an
|
|
16-bit checksum.
|
|
|
|
The file header has the following structure:
|
|
|
|
POSITION DESCRIPTION
|
|
$00 Entry version (1 or 2)
|
|
$01 Compression method (0-2 for version 1 and 0-5 for version 2)
|
|
$02-$03 Checksum of the file
|
|
$04-$06 Size of the original file data (warning, only three bytes)
|
|
$07-$08 Size of the packed file data in blocks
|
|
$09 Type of the file
|
|
$0A Length of the original file name
|
|
$0B-(N-1) Original name of the file
|
|
|
|
The following two entries only exist in version 2 entries:
|
|
N Record length of the original file
|
|
(N+1)-(N+2) Original date stamp of file (MS-DOS format)
|
|
|
|
Compression methods are the following: 0 means that the file is stored
|
|
without any compression, 1 is for RLE (run length encoding), 2 is for Huffman
|
|
algorithm, 3 is for LZ compression, 4 is for Huffman algorithm with RLE and 5
|
|
is for LZ compression in one pass.
|
|
|
|
The MS-DOS date stmap format packs the last modification date into a word:
|
|
|
|
BIT POSITION DESCRIPTION
|
|
0- 4 Day (1-31)
|
|
5- 9 Month (1-12)
|
|
10-15 Year minus 1980
|
|
|
|
The file name is a Pascal-style PETSCII string, its length being its first
|
|
character.
|
|
|
|
You can read through an ARC archive with the following algorithm:
|
|
|
|
1. Determine the start of the archive, with the above mentioned method.
|
|
2. If you reached the end of the file, stop.
|
|
3. Read in the first part of the header, until the length of the file name
|
|
(11 bytes). Read in the file name (fetch its length from the header) and,
|
|
if a version 2 entry, the record length and the date stamp (3 bytes),
|
|
too. Now you can process the header.
|
|
4. If the file is RLE-packed then read in the RLE control byte (1 byte). Now
|
|
you can process the file.
|
|
4. Add the block count of the packed file size, multiplied by 254, to the
|
|
file position of the beginning of the header. By seeking there you get to
|
|
the next header, goto step 2.
|