textfiles/computers/DOCUMENTATION/a12.txt

CHAPTER 12   COMPATIBILITY WITH OTHER ASSEMBLERS


I gave heavy priority to compatibility when I designed A86; a
priority just a shade behind the higher priorities of
reliability, speed, convenience, and power.  For those of you who
feel that "close, but incompatible" is like saying "a little bit
pregnant", I'm sorry to report that A86 will not assemble all
Intel/IBM/MASM programs, unmodified.  But I do think that a vast
majority of programs can, with a little massaging, be made to
assemble under A86.  Furthermore, the massaging can be done in
such a way as to make the programs still acceptable to that old,
behemoth assembler.

Version 3.00 of A86 has many compatibility features not present
in earlier versions.  Among the features added since A86 was
first released are:  more general forward references, double
quotes for strings, "=" as a synonym for EQU, the RADIX
directive, and the COMMENT directive.  If you tried feeding an
old source file to a previous A86 and were dismayed by the number
of error messages you got, try again: things might be more
manageable now.


Conversion of MASM programs to A86

Following is a list of the things you should watch out for when
converting from MASM to A86:

1. You need to determine whether the program was coded as a COM
   program or as an EXE program.  All COM programs coded for MASM
   will contain an ORG 100H directive somewhere before the start
   of the code.  EXE programs will contain no such ORG, and will
   often contain statements that load named segments into
   registers.  If the program was coded as EXE, you must either
   assemble it (using the +O option) to an OBJ file to be fed to
   LINK, or you must eliminate the instructions that load segment
   registers-- in a COM program they often aren't necessary
   anyway, since COM programs are started with all segment
   registers already pointing to the same value.

   A good general rule is: when it doubt, try assembling to an
   OBJ file.

2. You need to determine whether the program is executing with
   all segment registers pointing to the same value.  Simple COM
   programs that fit into 64K will typically fall into this
   category.  Most EXE programs, programs that use huge amounts
   of memory, and programs (such as memory-resident programs)
   that take over interrupts typically have different values in
   segment registers.
                                                             12-2

   If there are different values in the segment registers, then
   there may be instructions in the program for which the old
   assembler generates segment override prefixes "behind your
   back".  You will need to find such references, and generate
   explicit overrides for them.  If there are data tables within
   the program itself, a CS-override is needed.    If there are
   data structures in the stack segment not accessed via a
   BP-index, an SS-override is needed. If ES points to its own
   segment, then an ES-override is needed for accesses (other
   than STOS and MOVS destinations) to that segment.  In the
   interrupt handlers to memory-resident programs, the "normal"
   handler is often invoked via an indirect CALL or JMP
   instruction that fetches the doubleword address of the normal
   handler from memory, where it was stored by the initialization
   code.  That CALL or JMP often requires a CS-override-- watch
   out!

   If you want to remain compatible with the old assembler, then
   code the overrides by placing the segment register name, with
   a colon, before the memory-access operand in the instruction.
   If you do not need further compatibility, you can place the
   segment register name before the instruction mnemonic.  For
   example:

    MOV AL,CS:TABLE[SI]   ; if you want compatibility do this
    CS MOV AL,TABLE[SI]   ; if not you can do it this way

3. You should use a couple of A86's switches to maximize
   compatibility with MASM.  I've already mentioned the +O switch
   to produce .OBJ files.  You should also assemble with the +D
   switch, which disables A86's unique parsing of constants with
   leading zeroes as hexidecimal.  The RADIX command in your
   program will also do this. And you should use the +L15 switch,
   that disables a few other A86 features that might have reduced
   compatibility. See Chapter 3 for a detailed explanation of
   these switches.

4. A86 is a bit more restrictive with respect to forward
   references than MASM, but not as much as it used to be. You'll
   probably need to resolve just a few ambiguous references by
   appending " B" or " W" to the forward reference name.  One
   common reference that needs a bit more recoding is the
   difference of two forward references, often used to refer to
   the size of a block of allocated memory.  You handle this by
   defining a new symbol representing the size, using an EQU
   right after the block is declared, and then replacing the
   forward-reference difference with the size symbol.

5. A86's macro definition and conditional assembly language is
   different than MASM's.  Most macros can be translated by
   replacing the named parameters of the old macros with the
   dedicated names #n of the A86 macro language; and by replacing
   ENDM with #EM.  Other constructs have straightforward
   translations, as illustrated by the following examples.  Note
   that examples involving macro parameters have double pound
   signs, since the condition will be tested when the macro is
   expanded, not when it is defined.
                                                             12-3

   MASM construct              Equivalent A86 construct

   IFE expr                     #IF ! expr
   IFB <PARM3>                  ##IF !#S3
   IFNB <PARM4>                 ##IF #S4
   IFIDN <PARM1>,<CX>           ##IF "#1" EQ "CX"
   IFDIF <PARM2>,<SI>           ##IF "#2" NE "SI"
   .ERR                         (any undefined symbol)
   .ERRcond                     TRUE EQU 0FFFF
                                TRUE EQU cond
   EXITM                        #EX
   IRP ... ENDM                 #RX1L ... #ER
   REPT 100 ...ENDM             #RX1(100) ... #ER
   IRPC ... ENDM                #CX ... #EC

   The last three constructs, IRP, REPT, and IRPC, usually occur
   within macros; but in MASM they don't have to.  The A86
   equivalents are valid only within macros-- if they occur in
   the MASM program outside of a macro, you duplicate them by
   defining an enclosing macro on the spot, and calling that
   macro once, right after it is defined.

   To retain compatibility, you isolate the old macro definitions
   in an INCLUDE file (A86 will ignore the INCLUDE directive),
   and isolate the A86 macro definitions in a separate file, not
   used in an MASM assembly of the program.

6. A86 supports the STRUC directive, with named structure
   elements, just like MASM, with one exception: A86 does not
   save initial values declared in the STRUC definition, and A86
   does not allow assembly of instances of structure elements.

   For example, the MASM construct

   PAYREC STRUC
      PNAME  DB 'no name given'
      PKEY   DW ?
   ENDS

   PAYREC 3 DUP (?)
   PAYREC <'Eric',1811>

   causes A86 to accept the STRUC definition, and define the
   structure elements PNAME and PKEY correctly; but the PAYREC
   initializations need to be recoded.  If it isn't vital to
   initialize the memory with the specific definition values, you
   could recode the first PAYREC as:

   DB  ((TYPE PAYREC) * 3) DUP ?

   If you must initialize values, you do so line by line:

   DB 'Eric         '
   DW ?

   If there are many such initializations, you could define a
   macro INIT_PAYREC containing the DB and DW lines.
                                                             12-4

7. A86 does not support a couple of the more exotic features of
   MASM assembly language: the RECORD directive and its
   associated operators WIDTH and MASK; and the usage of
   angle-brackets to initialize structure records.  These
   features would have added much complication to the internal
   structure of symbol tables in A86; degrading the speed and the
   reliability of the assembler.  I felt that their use was
   sufficiently rare that it was not worth including them for
   compatibility.

   If your old program does use these features, you will have to
   re-work the areas that use them.  Macros can be used to
   duplicate the record and structure initializations.  Explicit
   symbol declarations can replace the usage of the WIDTH and
   MASK operators.


Compatibility symbols recognized by A86

A86 has been programmed to ignore a variety of lines that have
meaning to Intel/IBM/MASM assemblers; but which do nothing for
A86.  These include lines beginning with a period (except .RADIX,
which is acted upon), percent sign, or dollar sign; and lines
beginning with ASSUME, INCLUDE, PAGE, SUBTTL, and TITLE. If you
are porting your program to A86, and you wish to retain the
option of returning to the other assembler, you may leave those
lines in your program.  If you decide to stay with A86, you can
remove those lines at your leisure.

In addition, there is a class of symbols now recognized by A86 in
its .OBJ mode, but still ignored in .COM mode.  This includes
NAME, END, and PUBLIC.

Named SEGMENT and ENDS directives written for other assemblers
are, of course, recognized by A86's .OBJ mode.  In non-OBJ mode,
A86 treats these as CODE SEGMENT directives.  A special exception
to this is the directive

    segname SEGMENT AT atvalue

which is treated by A86 as if it were the following sequence:

    segname EQU atvalue
    STRUC

This will accomplish what is usually intended when SEGMENT AT is
used in a program intended to be a COM file.
                                                             12-5

Conversion of A86 Programs to Intel/IBM/MASM

I consider this section a bit of a blasphemy, since it's a little
silly to port programs from a superior assembler, to run on an
inferior one.  However, I myself have been motivated to do so
upon occasion, when programming for a client not familiar with
A86; or whose computer doesn't run A86; who therefore wants the
final version to assemble on Intel's assembler.  Since my
assembler/debugger environment is so vastly superior to any other
environment, I develop the program using my assembler, and port
it to the client's environment at the end.

The main key to success in following the above scenarios is to
exercise supreme will power, and not use any of the wonderful
language features that exist on A86, but not on MASM. This is
often not easy; and I have devised some methods for porting my
features to other assemblers:

1. I hate giving long sequences of PUSHes and POPs on separate
   lines.  If the program is to be ported to a lesser assembler,
   then I put the following lines into a file that only A86 will
   see:

      PUSH2 EQU PUSH
      PUSH3 EQU PUSH
      POP2 EQU POP
      POP3 EQU POP

   I define macros PUSH2, PUSH3, POP2, POP3 for the lesser
   assembler, that PUSH or POP the appropriate number of
   operands.  Then, everywhere in the program where I would
   ordinarily use A86's multiple PUSH/POP feature, I use one or
   more of the PUSHn/POPn mnemonics instead.

2. I refrain from using the feature of A86 whereby constants with
   a leading zero are default-hexadecimal.  All my hex constants
   end with H.

3. I will usually go ahead and use my local labels L0 through L9;
   then at the last minute convert them to a long set of labels
   in sequence: Z100, Z101, Z102, etc.  I take care to remove all
   the ">" forward reference specifiers when I make the
   conversion.  The "Z" is used to isolate the local labels at
   the end of the lesser assembler's symbol table listing. This
   improves the quality of the final program so much that it is
   worth the extra effort needed to convert L0--L9's to Z100--
   Zxxx's.

4. I will place declarations B EQU DS:BYTE PTR 0 and W EQU
   DS:WORD PTR 0 at the top of the program.  Recall that A86 has
   a "duplicate definition" feature whereby you can EQU an
   already-existing symbol, as long as it is equated to the value
   it already has.  This feature extends to the built in symbols
   B and W, so A86 will look at those equates and essentially
   ignore them.  On the old assembler, the effect of the
   declarations is to add A86's notation to the old language.
   Example:
                                                             12-6

     B EQU DS:BYTE PTR 0
     W EQU DS:WORD PTR 0
     MOV AX,W[0100]       ; replaces MOV AX, DS:WORD PTR 0100
     MOV AL,B[BX]         ; replaces MOV AL, DS:BYTE PTR [BX]