CHAPTER 12   COMPATIBILITY WITH OTHER ASSEMBLERS


I gave heavy priority to compatibility when I designed A86; a
priority just a shade behind the higher priorities of
reliability, speed, convenience, and power. For those of you who
feel that "close, but incompatible" is like saying "a little bit
pregnant", I'm sorry to report that A86 will not assemble all
Intel/IBM/MASM programs, unmodified. But I do think that a vast
majority of programs can, with a little massaging, be made to
assemble under A86. Furthermore, the massaging can be done in
such a way as to make the programs still acceptable to that old,
behemoth assembler.

Version 3.00 of A86 has many compatibility features not present
in earlier versions. Among the features added since A86 was
first released are: more general forward references, double
quotes for strings, "=" as a synonym for EQU, the RADIX
directive, and the COMMENT directive. If you tried feeding an
old source file to a previous A86 and were dismayed by the number
of error messages you got, try again: things might be more
manageable now.


Conversion of MASM programs to A86

Following is a list of the things you should watch out for when
converting from MASM to A86:

1. You need to determine whether the program was coded as a COM
   program or as an EXE program. All COM programs coded for MASM
   will contain an ORG 100H directive somewhere before the start
   of the code. EXE programs will contain no such ORG, and will
   often contain statements that load named segments into
   registers. If the program was coded as EXE, you must either
   assemble it (using the +O option) to an OBJ file to be fed to
   LINK, or you must eliminate the instructions that load segment
   registers-- in a COM program they often aren't necessary
   anyway, since COM programs are started with all segment
   registers already pointing to the same value.

   A good general rule is: when in doubt, try assembling to an
   OBJ file.
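
   As a rough illustration, the statements that load named
   segments in an EXE-style source usually resemble the first
   fragment below (DATASEG and START are made-up names, not
   taken from any particular program). If the program is being
   converted to a single-segment COM file, they can typically be
   dropped, as in the second fragment:

     START:                  ; EXE style
       MOV AX,DATASEG        ; fetch the segment's paragraph number
       MOV DS,AX             ; and point DS at it

       ORG 100H              ; COM style
     START:                  ; DS already equals CS on entry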

2. You need to determine whether the program is executing with
   all segment registers pointing to the same value. Simple COM
   programs that fit into 64K will typically fall into this
   category. Most EXE programs, programs that use huge amounts
   of memory, and programs (such as memory-resident programs)
   that take over interrupts typically have different values in
   segment registers.

   If there are different values in the segment registers, then
   there may be instructions in the program for which the old
   assembler generates segment override prefixes "behind your
   back". You will need to find such references, and generate
   explicit overrides for them. If there are data tables within
   the program itself, a CS-override is needed. If there are
   data structures in the stack segment not accessed via a
   BP-index, an SS-override is needed. If ES points to its own
   segment, then an ES-override is needed for accesses (other
   than STOS and MOVS destinations) to that segment. In the
   interrupt handlers to memory-resident programs, the "normal"
   handler is often invoked via an indirect CALL or JMP
   instruction that fetches the doubleword address of the normal
   handler from memory, where it was stored by the initialization
   code. That CALL or JMP often requires a CS-override-- watch
   out!

   If you want to remain compatible with the old assembler, then
   code the overrides by placing the segment register name, with
   a colon, before the memory-access operand in the instruction.
   If you do not need further compatibility, you can place the
   segment register name before the instruction mnemonic. For
   example:

     MOV AL,CS:TABLE[SI]   ; if you want compatibility do this
     CS MOV AL,TABLE[SI]   ; if not you can do it this way
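
   The interrupt-handler case described in this item might be
   sketched as follows. OLD_VEC and MY_HANDLER are hypothetical
   names, and the sketch assumes the saved doubleword lives in
   the same segment as the handler code:

     OLD_VEC DD ?            ; filled in by the initialization code

     MY_HANDLER:
       PUSH AX               ; (the handler's own work would go here)
       POP AX
       JMP CS:OLD_VEC        ; chain to the normal handler; DS cannot
                             ;   be trusted inside an interrupt

   The override is written in the operand form, so the line
   should remain acceptable to the old assembler as well.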

3. You should use a couple of A86's switches to maximize
   compatibility with MASM. I've already mentioned the +O switch
   to produce .OBJ files. You should also assemble with the +D
   switch, which disables A86's unique parsing of constants with
   leading zeroes as hexadecimal. The RADIX command in your
   program will also do this. And you should use the +L15 switch,
   which disables a few other A86 features that might have
   reduced compatibility. See Chapter 3 for a detailed
   explanation of these switches.
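
   Assuming a source file named MYPROG.ASM (a hypothetical name),
   an invocation that combines the three switches might look like
   the line below. The exact command-line rules are those given
   in Chapter 3, so treat this as a sketch only:

     A86 +O +D +L15 MYPROG.ASM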

4. A86 is a bit more restrictive with respect to forward
   references than MASM, but not as much as it used to be. You'll
   probably need to resolve just a few ambiguous references by
   appending " B" or " W" to the forward reference name. One
   common reference that needs a bit more recoding is the
   difference of two forward references, often used to refer to
   the size of a block of allocated memory. You handle this by
   defining a new symbol representing the size, using an EQU
   right after the block is declared, and then replacing the
   forward-reference difference with the size symbol.
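
   For instance, suppose the MASM source computes a buffer length
   from two labels that are both defined later in the file. One
   way to recode it is sketched below; BUFFER, BUFFER_END and
   BUFFER_SIZE are illustrative names only:

     ; MASM original, early in the program:
     ;   MOV CX,BUFFER_END-BUFFER     ; two forward references

       MOV CX,BUFFER_SIZE             ; one size symbol instead

     ; later, where the block is declared:
     BUFFER      DB 256 DUP (?)
     BUFFER_SIZE EQU $-BUFFER         ; size is known at this point

   The EQU sits directly after the block, so the difference is
   computed from values that are already defined.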

5. A86's macro definition and conditional assembly language is
   different from MASM's. Most macros can be translated by
   replacing the named parameters of the old macros with the
   dedicated names #n of the A86 macro language; and by replacing
   ENDM with #EM. Other constructs have straightforward
   translations, as illustrated by the following examples. Note
   that examples involving macro parameters have double pound
   signs, since the condition will be tested when the macro is
   expanded, not when it is defined.

     MASM construct          Equivalent A86 construct

     IFE expr                #IF ! expr
     IFB <PARM3>             ##IF !#S3
     IFNB <PARM4>            ##IF #S4
     IFIDN <PARM1>,<CX>      ##IF "#1" EQ "CX"
     IFDIF <PARM2>,<SI>      ##IF "#2" NE "SI"
     .ERR                    (any undefined symbol)
     .ERRcond                TRUE EQU 0FFFF
                             TRUE EQU cond
     EXITM                   #EX
     IRP ... ENDM            #RX1L ... #ER
     REPT 100 ... ENDM       #RX1(100) ... #ER
     IRPC ... ENDM           #CX ... #EC

   The last three constructs, IRP, REPT, and IRPC, usually occur
   within macros; but in MASM they don't have to. The A86
   equivalents are valid only within macros-- if they occur in
   the MASM program outside of a macro, you duplicate them by
   defining an enclosing macro on the spot, and calling that
   macro once, right after it is defined.

   To retain compatibility, you isolate the old macro definitions
   in an INCLUDE file (A86 will ignore the INCLUDE directive),
   and isolate the A86 macro definitions in a separate file, not
   used in an MASM assembly of the program.
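
   As a small worked case of the parameter and ENDM substitutions
   described in this item, here is a made-up two-operand macro
   in both dialects. SAVE2, REG1 and REG2 are invented for the
   illustration; check the result against the macro chapter
   before relying on it:

     ; MASM version, kept in the INCLUDE file
     SAVE2 MACRO REG1,REG2
       PUSH REG1
       PUSH REG2
     ENDM

     ; A86 version, kept in the A86-only file
     SAVE2 MACRO
       PUSH #1
       PUSH #2
     #EM

   Either version would be invoked the same way in the body of
   the program, for example SAVE2 AX,BX.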

6. A86 supports the STRUC directive, with named structure
   elements, just like MASM, with two exceptions: A86 does not
   save initial values declared in the STRUC definition, and A86
   does not allow assembly of instances of structure elements.

   For example, the MASM construct

     PAYREC STRUC
       PNAME DB 'no name given'
       PKEY  DW ?
     ENDS

     PAYREC 3 DUP (?)
     PAYREC <'Eric',1811>

   causes A86 to accept the STRUC definition, and define the
   structure elements PNAME and PKEY correctly; but the PAYREC
   initializations need to be recoded. If it isn't vital to
   initialize the memory with the specific definition values, you
   could recode the first PAYREC as:

     DB ((TYPE PAYREC) * 3) DUP ?

   If you must initialize values, you do so line by line,
   padding the name to the 13-byte length of PNAME:

     DB 'Eric         '
     DW 1811

   If there are many such initializations, you could define a
   macro INIT_PAYREC containing the DB and DW lines.

7. A86 does not support a couple of the more exotic features of
   MASM assembly language: the RECORD directive and its
   associated operators WIDTH and MASK; and the usage of
   angle-brackets to initialize structure records. These
   features would have added much complication to the internal
   structure of symbol tables in A86, degrading the speed and the
   reliability of the assembler. I felt that their use was
   sufficiently rare that it was not worth including them for
   compatibility.

   If your old program does use these features, you will have to
   re-work the areas that use them. Macros can be used to
   duplicate the record and structure initializations. Explicit
   symbol declarations can replace the usage of the WIDTH and
   MASK operators.
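
   For example, a MASM record such as

     STATUS RECORD RDY:1,FLT:3,CNT:4

   packs three fields into one byte, with CNT in the low bits.
   The layout and names are invented for this illustration. The
   values that the bare field names and the WIDTH and MASK
   operators would have produced can be declared explicitly:

     CNT       EQU 0       ; shift count (the bare field name)
     CNT_WIDTH EQU 4       ; replaces WIDTH CNT
     CNT_MASK  EQU 0FH     ; replaces MASK CNT
     FLT       EQU 4
     FLT_WIDTH EQU 3
     FLT_MASK  EQU 70H
     RDY       EQU 7
     RDY_WIDTH EQU 1
     RDY_MASK  EQU 80H

   References such as MASK CNT in the old source would then be
   edited to use CNT_MASK, and so on.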


Compatibility symbols recognized by A86

A86 has been programmed to ignore a variety of lines that have
meaning to Intel/IBM/MASM assemblers; but which do nothing for
A86. These include lines beginning with a period (except .RADIX,
which is acted upon), percent sign, or dollar sign; and lines
beginning with ASSUME, INCLUDE, PAGE, SUBTTL, and TITLE. If you
are porting your program to A86, and you wish to retain the
option of returning to the other assembler, you may leave those
lines in your program. If you decide to stay with A86, you can
remove those lines at your leisure.

In addition, there is a class of symbols now recognized by A86 in
its .OBJ mode, but still ignored in .COM mode. This includes
NAME, END, and PUBLIC.

Named SEGMENT and ENDS directives written for other assemblers
are, of course, recognized by A86's .OBJ mode. In non-OBJ mode,
A86 treats these as CODE SEGMENT directives. A special exception
to this is the directive

  segname SEGMENT AT atvalue

which is treated by A86 as if it were the following sequence:

  segname EQU atvalue
  STRUC

This will accomplish what is usually intended when SEGMENT AT is
used in a program intended to be a COM file.
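
To make the substitution concrete, here is how a typical fragment
addressing the BIOS data area might be read. The names BIOS_DATA
and KB_FLAG are illustrative, and the comments reflect the
description above rather than a verified listing:

  BIOS_DATA SEGMENT AT 40H     ; read as BIOS_DATA EQU 40H, then STRUC
            DB 17H DUP (?)     ; filler up to offset 17H
  KB_FLAG   DB ?               ; keyboard shift flags at 0040:0017
  BIOS_DATA ENDS

    MOV AX,BIOS_DATA           ; AX = 40H
    MOV ES,AX
    MOV AL,ES:KB_FLAG          ; KB_FLAG is just the offset 17H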


Conversion of A86 Programs to Intel/IBM/MASM

I consider this section a bit of a blasphemy, since it's a little
silly to port programs from a superior assembler, to run on an
inferior one. However, I myself have been motivated to do so
upon occasion, when programming for a client not familiar with
A86; or whose computer doesn't run A86; who therefore wants the
final version to assemble on Intel's assembler. Since my
assembler/debugger environment is so vastly superior to any other
environment, I develop the program using my assembler, and port
it to the client's environment at the end.

The main key to success in following the above scenarios is to
exercise supreme will power, and not use any of the wonderful
language features that exist on A86, but not on MASM. This is
often not easy; and I have devised some methods for porting my
features to other assemblers:

1. I hate giving long sequences of PUSHes and POPs on separate
   lines. If the program is to be ported to a lesser assembler,
   then I put the following lines into a file that only A86 will
   see:

     PUSH2 EQU PUSH
     PUSH3 EQU PUSH
     POP2  EQU POP
     POP3  EQU POP

   I define macros PUSH2, PUSH3, POP2, POP3 for the lesser
   assembler, that PUSH or POP the appropriate number of
   operands. Then, everywhere in the program where I would
   ordinarily use A86's multiple PUSH/POP feature, I use one or
   more of the PUSHn/POPn mnemonics instead.
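
   The macros for the lesser assembler are not listed in this
   chapter. One plausible MASM-style definition of two of them
   (the parameter names R1 and R2 are arbitrary) would be:

     PUSH2 MACRO R1,R2
       PUSH R1
       PUSH R2
     ENDM

     POP2 MACRO R1,R2
       POP R1
       POP R2
     ENDM

   With these in place, a line such as PUSH2 AX,BX expands to
   two PUSHes under MASM, while under A86 the EQU turns the same
   line into the multiple-operand PUSH AX,BX.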

2. I refrain from using the feature of A86 whereby constants with
   a leading zero are default-hexadecimal. All my hex constants
   end with H.

3. I will usually go ahead and use my local labels L0 through L9;
   then at the last minute convert them to a long set of labels
   in sequence: Z100, Z101, Z102, etc. I take care to remove all
   the ">" forward reference specifiers when I make the
   conversion. The "Z" is used to isolate the local labels at
   the end of the lesser assembler's symbol table listing. This
   improves the quality of the final program so much that it is
   worth the extra effort needed to convert L0--L9's to
   Z100--Zxxx's.
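
   A before-and-after sketch of that conversion, using an
   invented copy loop (the instructions themselves are only
   filler for the illustration):

     ; A86 original
     L1: LODSB
         CMP AL,0
         JE >L2            ; forward reference to the next L2
         STOSB
         JMP L1
     L2: RET

     ; after conversion for the other assembler
     Z100: LODSB
           CMP AL,0
           JE Z101         ; the ">" specifier is gone
           STOSB
           JMP Z100
     Z101: RET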

4. I will place declarations B EQU DS:BYTE PTR 0 and W EQU
   DS:WORD PTR 0 at the top of the program. Recall that A86 has
   a "duplicate definition" feature whereby you can EQU an
   already-existing symbol, as long as it is equated to the value
   it already has. This feature extends to the built-in symbols
   B and W, so A86 will look at those equates and essentially
   ignore them. On the old assembler, the effect of the
   declarations is to add A86's notation to the old language.
   Example:

     B EQU DS:BYTE PTR 0
     W EQU DS:WORD PTR 0

     MOV AX,W[0100]    ; replaces MOV AX, DS:WORD PTR 0100
     MOV AL,B[BX]      ; replaces MOV AL, DS:BYTE PTR [BX]