349 lines
15 KiB
Plaintext
349 lines
15 KiB
Plaintext
CHAPTER 3 OPERATION AND REQUIREMENTS
|
||
|
||
|
||
Creating Programs to Assemble
|
||
|
||
Before you invoke A86 you must have an assembly-language source
|
||
program to assemble. A source program is an ASCII text file,
|
||
created with the text editor of your choice. The editor must
|
||
produce a file that is free of internal records known only to the
|
||
editor. Some of the fancier word processors will require you to
|
||
use a "plain text" mode to insure that the file is free of such
|
||
records.
|
||
|
||
This manual will fully explain to you the correct syntax of an
|
||
A86 program, but it is not intended to teach you about the
|
||
86-family instruction set, or about assembly-language interfacing
|
||
to your computer or your operating system.
|
||
|
||
The instruction set charts in Chapters 6 and 7 give concise,
|
||
one-line descriptions of each instruction, but they don't go into
|
||
any detail about instruction usage. For such detail, I recommend
|
||
either one of the two books The 8086/8088 Primer by Stephen P.
|
||
Morse, or The 80286 Architecture by Morse and Albert. The latter
|
||
book covers the 8087/287 and is recommended if you have a
|
||
floating-point coprocessor, or if you wish to explore the
|
||
expanded capabilities of the 286. (My 386 book is the latest in
|
||
a series in which those two books are predecessors.)
|
||
|
||
To learn how to make system calls to input from keyboard or disk,
|
||
output to screen, printer or disk, etc., you need a book that
|
||
covers the MS-DOS operating system and the BIOS for the IBM-PC,
|
||
or whatever computer you have if it's non-IBM-BIOS-compatible (if
|
||
you don't know whether or not it's compatible, it probably is).
|
||
I used Peter Norton's Programmer's Guide to the IBM-PC. If
|
||
you're less familiar with assembly language, you will probably
|
||
want his Assembly Language Guide to the IBM-PC instead.
|
||
|
||
|
||
Program Invocation
|
||
|
||
To invoke A86, you must provide a program invocation line, either
|
||
typed to the console when the DOS command prompt appears, or
|
||
included in a batch file. The program invocation line consists
|
||
of the following:
|
||
|
||
1. The program name A86.
|
||
|
||
2. The names of the source files you want to assemble. You may
|
||
use the wild card delimiters * and ? if you wish, to denote a
|
||
group of source files to be assembled. A86 will sort all
|
||
matching names into alphabetical order for each wild card
|
||
specification; so the files will be assembled in the same
|
||
order even if they get jumbled up within a directory.
|
||
|
||
A86 identifies the end of the source file names when it sees a
|
||
name with no extension, or a name with the default object
|
||
extension (COM, BIN or OBJ, as described shortly). Sorry, you
|
||
cannot have a source file with the default object extension.
|
||
3-2
|
||
|
||
3. You may optionally provide the word TO, to separate the source
|
||
file names from the output file names.
|
||
|
||
4. The name of the output program. If you do not provide an
|
||
extension, A86 will assume one of the following extensions:
|
||
|
||
a. .OBJ if you invoked the +O switch, for linkable object file
|
||
production.
|
||
|
||
b. .BIN if there is no +O switch, but there is an ORG 0 in
|
||
your program.
|
||
|
||
c. .COM otherwise.
|
||
|
||
If you want your program file to have no extension, you end
|
||
the file name with a period.
|
||
|
||
You have the option to omit both the program file name and the
|
||
symbol table file name from the invocation. If you do so, A86
|
||
will output the program source.COM (or source.OBJ or
|
||
source.BIN) and the symbol table source.SYM; where "source" is
|
||
a name derived from the list of source files, according to the
|
||
rules described in the section "Strategies for Source File
|
||
Maintenance" later in this chapter.
|
||
|
||
5. The name of the symbol table file. You do not need to give
|
||
the .SYM extension: A86 will produce a file with extension
|
||
.SYM in any case. In earlier versions of A86 I had allowed
|
||
other extensions to be specified, but this meant that by
|
||
carelessly permuting names on the command line, you could
|
||
destroy a source file-- not good!
|
||
|
||
You can omit the name of the symbol table file. If you do so,
|
||
A86 will use the same root as the output program name. If you
|
||
desire no symbol table file, specify the +S switch in your
|
||
invocation line or A86 environment variable (described later
|
||
in this chapter).
|
||
|
||
|
||
Assembler Switches
|
||
|
||
In addition to input and output file names, you may intersperse
|
||
assembler switch settings anywhere after the A86 program name.
|
||
They are all acted upon immediately, no matter where they are on
|
||
the command line. Some of the switches are discussed in more
|
||
detail elsewhere; I'll summarize them here:
|
||
|
||
+C causes the assembler to output symbol names with lower case
|
||
letters to its OBJ and SYM files. The case of letters is
|
||
still ignored during assembly. I output the name as it
|
||
appears in the last PUBLIC or EXTRN directive containing it;
|
||
if there is no such directive, I use the first occurrence of
|
||
the symbol to control which letters are output lower case.
|
||
(+C duplicates Microsoft MASM's /mx switch.)
|
||
3-3
|
||
|
||
+c causes the assembler to consider the case of letters within
|
||
all non-built-in symbols as significant both during assembly
|
||
and for output. Thus, for example, you can define different
|
||
symbols X and x. (+c duplicates MASM's /ml switch.)
|
||
|
||
+D causes the default base for numeric constants to be decimal,
|
||
even if the constants have leading zeroes.
|
||
|
||
-D causes the default base to be hexadecimal if there is a
|
||
leading zero; decimal otherwise.
|
||
|
||
+E causes the error-message-augmented source file to be written
|
||
to yourname.ERR within the current directory, in all cases.
|
||
With +E, A86 will never rewrite your original source file.
|
||
|
||
-E causes A86 to insert error messages into your source file,
|
||
whenever the file is in the current directory. If the file
|
||
is not in the current directory, A86 write an ERR file no
|
||
matter what the E switch setting is.
|
||
|
||
+F causes A86 to generate the 287 form of floating point
|
||
instructions (no implicit FWAIT bytes are generated before
|
||
the instructions). This mode can also be specified in the
|
||
program with the .287 directive.
|
||
|
||
+f causes A86 to support emulation of the 8087. When A86 sees a
|
||
floating point instruction, it generates external references
|
||
to be resolved by the standard emulation library (provided by
|
||
Microsoft, Borland, etc.). When you LINK your program to the
|
||
emulation library, the floating point instructions are
|
||
emulated by software. NOTE you must be assembling to a
|
||
linkable OBJ file for this mode to have effect; otherwise, +f
|
||
is ignored.
|
||
|
||
-F causes emulation and default-287 to be disabled. You'll
|
||
still get 287 generation if there is a .287 directive in your
|
||
program.
|
||
|
||
+Ln causes A86 to implement one or more of the following minor
|
||
options for code-generation. All these options enhance MASM
|
||
compatibility. The first three do so at the expense of
|
||
program size. The number n should be the sum of the numbers
|
||
for each of the options selected. For example, +L10 will
|
||
select the options numbered 2 and 8.
|
||
|
||
1 causes A86 to generate a longer (3-byte) instruction form
|
||
for an unconditional JMP instruction to a forward
|
||
reference local label, e.g. JMP >L1. A86 normally
|
||
assumes that since it's a local label, it will be nearby
|
||
and the short, 2-byte form will work. With this option
|
||
your code will usually be longer than necessary, but
|
||
you'll be spared having to occasionally go back and code
|
||
an explicit JMP LONG >L1.
|
||
|
||
2 causes A86 to refrain from optimizing the LEA instruction.
|
||
Without this option A86 will replace an LEA with a
|
||
shorter, equivalent MOV when it sees the chance.
|
||
3-4
|
||
|
||
4 causes A86 to generate a slightly more inefficient internal
|
||
format for memory references within an OBJ file. The
|
||
Power C compiler's MIX utility requires the inefficient
|
||
form. The makers of Power C refused to support their
|
||
customers on this by enhancing MIX, so I am forced to
|
||
offer this option.
|
||
|
||
8 causes A86 to assume that all ambiguous forward reference
|
||
operands to instructions other than jumps or calls refer
|
||
to memory variables and not offsets or constant values.
|
||
You can override this on a one-by-one basis, with the
|
||
OFFSET operator.
|
||
|
||
-L causes A86 to revert to its default for all the above
|
||
options.
|
||
|
||
+O causes A86 to produce a linkable .OBJ file when the output
|
||
file name extension is not explicitly given.
|
||
|
||
-O causes A86 to produce an executable .COM file when the output
|
||
file name extension is not explicitly given.
|
||
|
||
+S suppresses the creation of the symbol table (.SYM) file in
|
||
normal (no errors) assembly. This is overridden if you give
|
||
an explicit symbols file name in the invocation line.
|
||
|
||
-S causes the symbol table file to be created in all cases.
|
||
|
||
+X causes A86 to require that undefined names be explicitly
|
||
declared with an EXTRN when A86 is producing a linkable .OBJ
|
||
file. The X switch has no effect when A86 is making a .COM
|
||
file.
|
||
|
||
-X causes A86 to quietly assume that all undefined names are
|
||
valid external references.
|
||
|
||
The default setting for all the switches is "minus". Multiple
|
||
switches can be specified with a single sign; e.g. +OX is the
|
||
same as +O+X.
|
||
|
||
|
||
|
||
The A86 Environment Variable
|
||
|
||
To allow you to customize A86, the assembler examines the MS-DOS
|
||
environment variable named "A86" when it is invoked. If there is
|
||
such a variable, its contents are inserted before the invocation
|
||
command tail, as if you had typed them yourself.
|
||
|
||
For example, if you execute the command SET A86=+OX while in DOS
|
||
(typically in the AUTOEXEC.BAT file run when the computer is
|
||
started), then the O and X switches will be "plus", unless
|
||
overridden with a "minus" setting in the command line.
|
||
3-5
|
||
|
||
You may also include one or more file names in the A86
|
||
environment variable. Those files will always be assembled
|
||
first, before the files you specify on the command line. This
|
||
allows you to set up a library of macro definitions, which will
|
||
always be automatically available to your programs. Thus, for
|
||
example, the DOS command SET A86=C:\A86\MACDEF.8 +OX will cause
|
||
both the O and X switches to default ON, but will also cause the
|
||
file MACDEF.8 of subdirectory A86 of drive C to always be
|
||
assembled.
|
||
|
||
|
||
Using Standard Input as a Command Tail
|
||
|
||
The following feature is a bit advanced. If you're not familiar
|
||
with the practice of redirecting standard input, you may safely
|
||
skip this section.
|
||
|
||
A86 can also be configured to take its command arguments from
|
||
standard input, in addition to the invocation command tail or the
|
||
A86 environment variable. This allows A86 to be used in those
|
||
menu-driven systems that don't generate command tails for
|
||
programs. It also allows other programs to create lists of files
|
||
to be assembled, then "pipe" the list to A86.
|
||
|
||
Here's how the feature works: when the command argument A86 is an
|
||
ampersand &, A86 will prompt for standard input. If the
|
||
ampersand is seen but there are other things following it, the
|
||
ampersand is ignored.
|
||
|
||
For example, you can place a list of file names and switch
|
||
settings into a file called FILELIST. You can then invoke the
|
||
assembler via
|
||
|
||
A86 <FILELIST &
|
||
|
||
which will cause the contents of the FILELIST file to be used as
|
||
a command line.
|
||
|
||
You may place an ampersand at the end of your A86 environment
|
||
variable. If you do so, then A86 will prompt for file names
|
||
whenever it is invoked without any command arguments (you type
|
||
A86 followed immediately by the ENTER key to the MS-DOS prompt).
|
||
This is the mode used if you have a menu system that can't
|
||
generate an invocation command tail.
|
||
|
||
Note: when you redirect standard input so that it comes from a
|
||
file, A86 will read all the lines of the file (up to a limit of
|
||
1023 bytes), and substitute spaces for the line breaks. Thus you
|
||
may give the file names on individual lines, for readability.
|
||
However, if the feature is invoked manually (no redirection), so
|
||
that you are typing in the line after the prompt, A86 will take
|
||
the first line only. You need to give all your switches and
|
||
files on that one line.
|
||
3-6
|
||
|
||
Strategies for Source File Maintenance
|
||
|
||
A86 encourages modular programming, by letting you break your
|
||
source into separate files, with complete impunity. A86 has no
|
||
concern whatsoever for file breaks-- it treats the sequence of
|
||
files as a single source code stream.
|
||
|
||
You should consider one or more of the following strategies,
|
||
which I have adopted in my source file management:
|
||
|
||
1. I name all my A86 source files with the same extension, which
|
||
is found on no other files. The particular extension I have
|
||
chosen is ".8". I did not choose the more common .ASM, because
|
||
I have a few source files designed for MSDOS's assembler. If
|
||
you don't like .8, I would suggest .A86.
|
||
|
||
2. I keep a separate subdirectory on my hard disk for each
|
||
multi-source-file A86 program I have. Then the simple command
|
||
"A86 *.8" performs the assembly for the current directory's
|
||
program.
|
||
|
||
3. I exploit the fact that A86 expands wild cards into
|
||
alphabetical order. Whenever I want a source file to be
|
||
assembled first (e.g., when it contains variable
|
||
declarations), I append a decimal digit to the start of the
|
||
file name: 0 for the first file, 1 for the second, etc., for
|
||
however many files that need to be explicitly ordered. If a
|
||
file needs to come last, I append a "Z" to the start of the
|
||
file name.
|
||
|
||
To accommodate this strategy, I have programmed A86 to a
|
||
somewhat complicated algorithm for determining the default
|
||
output file name. I use the name of the first source file;
|
||
but I truncate the first character if it is a decimal digit.
|
||
However, you may have a general-purpose file that must come
|
||
first; so I have provided the following exception: if you have
|
||
a source file whose name begins with the digit "9", that name
|
||
(without the 9) is used. If you don't like this, you can
|
||
always explicitly give the program name you want: "A86 *.8
|
||
MYPROG".
|
||
|
||
|
||
System Requirements for A86
|
||
|
||
A86 requires MS-DOS V2.00 or later. No BIOS or lower-level calls
|
||
are made by A86, so A86 should run on any MS-DOS machine. Please
|
||
let me know if you find this not to be the case.
|
||
|
||
A86 itself is a small program, and it is fairly flexible about
|
||
the memory it uses. You can assemble small files with only 32K
|
||
bytes of memory beyond the program itself, which in the current
|
||
version is under 22K bytes-- a total of 54K bytes beyond the
|
||
operating system. The more memory you have, the more capacity A86
|
||
has, in symbol table size, source file size, and output file
|
||
size. If it can, A86 will use up to 256K bytes of memory.
|
||
3-7
|
||
|
||
As I just noted, in a small-memory system A86 severely limits the
|
||
size of source files. But remember that this doesn't hurt you
|
||
badly: you can split up source files with impunity. A86
|
||
assembles a sequence of files as if it were a single source
|
||
stream (similar to the INCLUDE facility of other assemblers).
|
||
|
||
|