CHOP
Version 3.1

Author: Walter J. Kennamer    Compuserve PPN: 74025,514

CHOP breaks big files into smaller ones. A number of options are
supported to determine exactly where the breaks take place. This
version also allows you to extract a portion of a file.

More information about CHOP follows; but first, a word from my lawyers:

------------------------------------------------------------------------------
Copyright (c) 1986, 1987 Walter J. Kennamer. All Rights Reserved.

You are free to use, copy and distribute CHOP providing that:

    NO FEE IS CHARGED FOR USE, COPYING OR DISTRIBUTION.

    IT IS NOT MODIFIED IN ANY WAY.

    THIS DOCUMENTATION FILE (UNMODIFIED) ACCOMPANIES ALL COPIES.

This program is provided AS IS without any warranty, expressed or
implied, including but not limited to fitness for a particular purpose.

------------------------------------------------------------------------------

Usage:
    A>CHOP infile [-switches]

"Infile" is an unambiguous file name (i.e., no wildcards are allowed).
The original input file will be unchanged. The output files will have
the same stem as the input file, but the extension will be numbered
consecutively from 1. For example, if you break FOO.BAR into three
smaller files, FOO.BAR will be unchanged, and there will be three
output files--FOO.1, FOO.2 and FOO.3.

Each switch requires a separate switch character (e.g., "-a -b" rather
than "-ab"). Switch order does not matter, unless you enter mutually-
exclusive switches, in which case the last one takes precedence.

Type CHOP by itself to see help.

==============================
Command line options:
==============================

This table summarizes command line options. Options can be preceded
by a hyphen (-) or a slash (/) and can be upper or lower case.

These options determine how much of the file will be output.

    -Bx          Beginning byte to extract (default = 1).
    -Ex          Ending byte to extract (default = end of file).

These options determine how the file will be partitioned. They
are mutually exclusive.

    -Px          Chop file into x pieces (default = 2).
    -Sxxx        Chop file into xxx-sized pieces.

These options determine where the data comes from and where it goes.

    -Ifilename   Read input from "filename".
    -Odirectory  Send output to "directory".
    -T           Trample over existing files.

These options determine if the break will occur at a precise byte or at a
set of characters near the computed boundary. -R and -X are mutually
exclusive. -N, -A, -H, -L, -C and -W have no meaning if -X is selected.
-R is the default option. The term "return character(s)" means the
character(s) that determine where the file will be chopped.

    -R           Try to chop at a "return" character (default is CR/LF).
    -Nfoo        Define a sequence of "return" characters (e.g., "foo").
    -A           Chop after the "return" characters (default).
    -H           Chop before the "return" characters.
    -Lxxx        Limit search for "return" characters to xxx bytes.
    -C           Make "return" characters case sensitive.
    -W           Chop at each occurrence of the "return" string.
    -X           Chop at the exact computed byte.

    -Mxxx        Define the maximum number of chops (default = 256).
    -Gxxx        Start output file numbering with xxx.
    -Q           Quiet. Do not show program status on screen.
    -Z           Do not insert a Ctrl-Z EOF at end of each output file.
    -J           Pause for a keystroke between chops.

==============================
Terminology and other notes:
==============================

The term "computed break point" means the place in the file where the
split would normally occur if CHOP were not doing something special
about return characters. For example, if you have a file of 100,000
bytes that is being split into 5 parts, the computed break points are
at these bytes:
    20,000
    40,000
    60,000
    80,000
You can force the breaks to occur exactly at these points by using the
-X (exact) switch on the command line. Or, you can let CHOP try to
find a logical breaking point in the file (normally a carriage return /
line feed).

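The arithmetic behind the computed break points can be sketched in a few
lines of Python. This is only an illustration of the example above, not
CHOP's own code: the function name is made up here, and the use of integer
division is an assumption, since this documentation does not say how CHOP
rounds when the file size is not an exact multiple of the piece count.

```python
def computed_break_points(total_bytes: int, pieces: int) -> list[int]:
    """Return the byte positions (counted from 1) after which each split falls."""
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    # Assumption: equal-sized pieces using integer division.
    piece_size = total_bytes // pieces
    # A file cut into N pieces has N - 1 break points.
    return [piece_size * i for i in range(1, pieces)]


print(computed_break_points(100_000, 5))  # [20000, 40000, 60000, 80000]
```
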
The term "return string" or "return characters" means the sequence of
one or more characters that defines a newline, or some other
interesting boundary in the file. CHOP assumes that you would prefer
to split the file at a natural boundary, rather than just someplace in
the middle. By default, CHOP assumes the break should occur at a
carriage return / line feed character sequence (CR LF -- Hex 0D 0A).
Thus, if CHOP plans to break the file at the 1000th byte, it will
actually look a little past byte 1000 to try to find a newline (CR
LF) and split it there, rather than at byte 1000.

You can redefine the return string to be something else. For example,
Compuserve messages begin with the "#:" character sequence. By
defining this sequence as the return string, you instruct CHOP to split
the file only between messages--no message would be split across CHOP
output files. You would define "#:" as the return string by using the
switch "-n#:" on the command line (see examples).

CHOP ordinarily splits a file after the return string. You can make
the split happen before the return string by using the -h switch. You
would probably want to use this switch in the preceding Compuserve
example since the "#:" characters mark the beginning of a message. You
would probably want them to be the first characters in a new file,
rather than the last characters in the preceding file.

You can also limit how far CHOP is willing to search for the return
string. The -L parameter determines how many bytes forward of the
computed break point CHOP looks for the return. The default is 1000
bytes. If it cannot find the return string within the number of bytes
specified with -L, CHOP breaks the file at the computed point. CHOP
never breaks a file before the computed point. As a consequence, the
last CHOP output file will typically be a little smaller than the
earlier ones: the differences between the computed break points and
the actual return boundaries mount up.

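The search described above (look forward from the computed break point,
stop at the -L limit, then cut after or before the return string) can be
sketched in Python as follows. This is a rough model under stated
assumptions, not CHOP's implementation: the function and parameter names
are invented for illustration, the exact edges of the search window are a
guess, and the comparison here is always case sensitive, whereas CHOP only
becomes case sensitive with -C.

```python
def find_break(data: bytes, computed: int, return_string: bytes = b"\r\n",
               limit: int = 1000, before: bool = False) -> int:
    """Return how many bytes of `data` go into the current piece.

    computed       computed break point (bytes that would precede the split)
    return_string  the "return" characters (-N); CR/LF by default
    limit          how far past the computed point to search (-L)
    before         cut before the return string (-H) instead of after it (-A)
    """
    # Assumption: the whole return string must lie inside the search window.
    window_end = min(len(data), computed + limit)
    pos = data.find(return_string, computed, window_end)
    if pos == -1:
        # Not found within the limit: fall back to the computed point.
        return computed
    return pos if before else pos + len(return_string)


messages = b"#: 1\r\nhello\r\n#: 2\r\nworld\r\n"
cut = find_break(messages, 8, b"#:", before=True)
print(messages[:cut])  # b'#: 1\r\nhello\r\n'
```

Because the search only ever moves forward, the result is never smaller
than the computed point, which matches the note that CHOP never breaks a
file before it.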
CHOP always counts bytes from 1: the first byte of a file is byte 1 and
the second byte is byte 2. If you specify byte 0, CHOP assumes you mean
the first byte.

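For readers more used to zero-based offsets, the byte numbering used by -B
and -E can be modeled in Python as shown below. The function name is
hypothetical. Treating the ending byte as inclusive is inferred from the
examples in this file (-e2000 copies the first 2000 bytes, and -b3000
-e3999 selects 1000 bytes); it is not stated as a rule anywhere.

```python
from typing import Optional


def extract(data: bytes, begin: int = 1, end: Optional[int] = None) -> bytes:
    """Return bytes `begin` through `end`, both counted from 1 and inclusive."""
    begin = max(begin, 1)      # byte 0 is treated as the first byte
    if end is None:
        end = len(data)        # default: through the end of the file
    # Convert the 1-based inclusive range to a Python slice.
    return data[begin - 1:end]


print(extract(b"ABCDEFGH", 0, 3))  # b'ABC'
print(extract(b"ABCDEFGH", 3))     # b'CDEFGH'
```
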
CHOP will ordinarily decline to overwrite any existing files, but will
display a message and halt instead. If you want to trample over
existing files (I had to use "trample"--T was the only letter left),
use the -t switch on the command line. If -t is specified, CHOP will
write over any files that get in its way.

Use the -g switch to change the beginning file number. For example, if
you want to chop FOO.BAR into several pieces, but you want the first
one to be numbered FOO.8, use the -g8 switch to set the starting
number.

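The output naming rule (same stem, numeric extension, optional starting
number and output directory) can be pictured with this small Python sketch.
It is illustrative only: the function and its parameters are invented here,
and it does not model the overwrite check that -t disables.

```python
from pathlib import PureWindowsPath
from typing import Optional


def output_names(infile: str, pieces: int, start: int = 1,
                 out_dir: Optional[str] = None) -> list[str]:
    """List the output file names for `pieces` chops of `infile`.

    start    first extension number (the -G switch; default 1)
    out_dir  output directory (the -O switch; default is the input's directory)
    """
    path = PureWindowsPath(infile)
    target = PureWindowsPath(out_dir) if out_dir else path.parent
    # Keep the stem, replace the extension with a running number.
    return [str(target / f"{path.stem}.{n}") for n in range(start, start + pieces)]


print(output_names("FOO.BAR", 3))           # ['FOO.1', 'FOO.2', 'FOO.3']
print(output_names("FOO.BAR", 2, start=8))  # ['FOO.8', 'FOO.9']
```
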
==============================
Examples:
==============================

    CHOP foo.bar

chops FOO.BAR into two pieces, FOO.1 and FOO.2. If you do not use either
the -P or -S switches, CHOP assumes you want to split the file into two
pieces.


    CHOP foo.bat -p5

chops the foo.bat program into five files of approximately equal size.
The breaks take place after a CR/LF pair.


    CHOP foo.bat -s2000 -x

chops foo.bat into 2000-byte pieces. The first chop occurs exactly at
byte 2000. Note that the output file will actually have 2001 bytes,
counting the control-Z added to the file (though you can suppress it
with the -z switch).


    CHOP foo.bat -e2000

copies the first 2000 bytes of foo.bat to FOO.1. By default, copying
begins at the first byte of the file.


    CHOP foo.bat -e2000 -p2

puts the first 2000 bytes of foo.bat into 2 files of about 1000 bytes each.


    CHOP foo.bat -b3000 -e3999 -p2 -r

puts the 1000 bytes in foo.bat between byte 3000 and 3999 into 2 files.
The chop will occur at the first CR/LF pair after byte 3500.


    CHOP foo.bat -s20000 -nMSG: -oD:\CHOPOUT -g5 -c

chops foo.bat into pieces of approximately 20,000 bytes. The first chop
will occur immediately after the first occurrence of the string "MSG:"
(case sensitive because of -c) after byte 20000. The output files will be
D:\CHOPOUT\FOO.5, FOO.6, etc.


    CHOP -id:\pdq\cserv.thd -p10 -n#: -h -l2000

chops d:\pdq\cserv.thd into about 10 pieces, with the breaks occurring
immediately before the character string "#:" (used by Compuserve forum
software to designate a new message). CHOP will search up to 2000 bytes
past the computed break point for the "#:" string. If it cannot find
"#:" within 2000 bytes, it will give up and break at the computed point.


    CHOP cserv.thd -h -n#: -w -t

chops cserv.thd into many files--one for each occurrence of the "#:"
string. The -w switch implies an unlimited search limit and overrides
the -P, -S and -X switches. The -t switch tells CHOP to overwrite any
files (CSERV.1, CSERV.2, etc.) that are already there.


    CHOP program.pas -nPROCEDURE -w -h

chops the program.pas Pascal file into many files--one for each
procedure. After executing this command, each of the procedures in
"program.pas" will be in a separate file.


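To make the effect of -w combined with -h concrete, here is a minimal
Python sketch that cuts a buffer at every occurrence of a return string.
It is a model of the behavior described above, not CHOP's code. The names
are invented, the match is case sensitive (CHOP is only case sensitive with
-C), and skipping an empty leading piece when the data starts with the
return string is an assumption.

```python
def chop_at_each(data: bytes, return_string: bytes, before: bool = True) -> list[bytes]:
    """Split `data` at every occurrence of `return_string`.

    before=True cuts in front of the string (like -H); False cuts after it (like -A).
    """
    pieces = []
    start = 0          # where the current piece begins
    search_from = 0    # where the next search begins
    while True:
        pos = data.find(return_string, search_from)
        if pos == -1:
            break
        cut = pos if before else pos + len(return_string)
        if cut > start:                      # never emit an empty piece
            pieces.append(data[start:cut])
            start = cut
        search_from = pos + len(return_string)
    if start < len(data):
        pieces.append(data[start:])          # whatever follows the last cut
    return pieces


source = b"PROGRAM x;\nPROCEDURE a;\nbegin end;\nPROCEDURE b;\nbegin end;\n"
for piece in chop_at_each(source, b"PROCEDURE"):
    print(piece)
```

Note that anything in front of the first PROCEDURE (the program header in
this sample) ends up in a piece of its own.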
    CHOP foo.bar -n$0C -p3

This example shows how to put hex codes in the "return" string. The
codes must be exactly two digits (i.e., precede single digit hex codes
with a 0) and must be preceded by a dollar sign. This example causes
foo.bar to be chopped into three pieces, with the chop taking place at a
hex 0C character (ASCII decimal 12, or formfeed).


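The "$" hex notation can be decoded as in the Python sketch below. It
shows one plausible reading of the rule, not CHOP's parser: the function
name is hypothetical, and treating a "$" that is not followed by two hex
digits as a literal dollar sign is an assumption, since this file does not
say what CHOP does in that case.

```python
import string


def parse_return_string(spec: str) -> bytes:
    """Turn a -N argument such as "$0C" or "MSG:" into the bytes to search for."""
    out = bytearray()
    i = 0
    while i < len(spec):
        pair = spec[i + 1:i + 3]
        if spec[i] == "$" and len(pair) == 2 and all(c in string.hexdigits for c in pair):
            out.append(int(pair, 16))        # "$0C" becomes the single byte 0x0C
            i += 3
        else:
            # Assumption: any other character, including a lone "$", is literal.
            out += spec[i].encode("latin-1")
            i += 1
    return bytes(out)


print(parse_return_string("$0C"))     # b'\x0c'
print(parse_return_string("$0D$0A"))  # b'\r\n'
```
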
    FOR %1 in (*.TXT) DO CHOP %1 -p4

This example illustrates how to use the batch FOR command to CHOP a
series of files as specified by the wildcard (*.TXT). In this case,
each .TXT file will be chopped into four pieces of approximately equal
size.


==============================
Rejoining Chopped Files
==============================

You can use the DOS copy command to rejoin chopped files. For example,
this command rejoins two text files--FOO.1 and FOO.2--to recreate the
original FOO.BAR file.

    COPY FOO.1/a+FOO.2 FOO.BAR

If FOO.BAR, FOO.1 and FOO.2 are binary files, use the /b switch with the
COPY command:

    COPY FOO.1/b+FOO.2 FOO.BAR

The /b switch causes DOS to treat control-Z characters as legitimate
data (instead of EOF marks) when the files are joined.

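If DOS is not at hand, the pieces can also be joined with a few lines of
Python. The sketch below is an assumption-laden stand-in, not an equivalent
of COPY: the function name is made up, and it simply drops one trailing
Ctrl-Z (hex 1A) from each piece, which is only appropriate when CHOP added
that byte (that is, when -z was not used). Pass strip_eof=False for pieces
written with -z.

```python
def rejoin(pieces: list[bytes], strip_eof: bool = True) -> bytes:
    """Concatenate chopped pieces, optionally dropping one trailing Ctrl-Z from each."""
    joined = bytearray()
    for piece in pieces:
        if strip_eof and piece.endswith(b"\x1a"):
            piece = piece[:-1]   # assumed to be the EOF mark CHOP appended
        joined += piece
    return bytes(joined)


print(rejoin([b"first half\r\n\x1a", b"second half\r\n\x1a"]))
# b'first half\r\nsecond half\r\n'
```

To use it on real files, read each piece in binary mode (for example with
open("FOO.1", "rb")) and write the result in binary mode as well.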
==============================
Support
==============================

If you have any problems with CHOP or any suggestions about how to
improve it, please contact me on Compuserve (PPN 74025,514) or write to
me at 1801 E. 12th., Apt 1118, Cleveland, OH 44114.