2253 lines
85 KiB
Plaintext
2253 lines
85 KiB
Plaintext
A Proposed Assembly Language Syntax For 65c816 Assemblers
|
||
by Randall Hyde
|
||
|
||
|
||
This is a proposed standard for 65c816 assembly language. The
|
||
proposed standard comes in three levels: subset, full, and extended. The
|
||
subset standard is intended for simple (or inexpensive) products,
|
||
particularly those aimed at beginning 65c816 assembly language programmers.
|
||
The full standard is the focus of this proposal. An assembler meeting the
|
||
full level adopts all of the requirements outlined in this paper. The
|
||
extended level is a mechanism whereby a vendor can claim full compliance
|
||
with the standard and point out that there are extensions as well. An
|
||
assembler cannot claim extended level compliance unless it also complies with
|
||
the full standard. An assembler, no matter how many extensions are
|
||
incorporated, will have to claim subset level unless the full standard is
|
||
supported. This ensures that programmers who do not use any assembler
|
||
extensions can assemble their programs on any assembler meeting the full or
|
||
extended compliance levels.
|
||
|
||
In addition to the items required for compliance, this proposal
|
||
suggests several extensions in the interests of compatibility with existing
|
||
65c816 assemblers. These recommendations are not required for full
|
||
compliance with the standard, they're included in this proposal as suggestions
|
||
to help make conversion of existing programs easier. The suggestions are
|
||
presented in two levels: recommended and optional. Recommended items should
|
||
be present in any decent 65c816 package. Inclusion of the optional items
|
||
is discouraged (since there are other ways to accomplish the same operation
|
||
within the confines of the standard) but may be included in the assembler
|
||
at the vendor's discretion to help alleviate conversion problems.
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
65c816 Instruction Mnemonics
|
||
----------------------------
|
||
|
||
|
||
All of the following mnemonics are required at the subset, full,
|
||
and extended standard levels.
|
||
|
||
The following mnemonics handle the basic 65c816 instruction set:
|
||
|
||
ADC - add with carry
|
||
AND - logical AND
|
||
BCC - branch if carry clear
|
||
BCS - branch if carry set
|
||
BEQ - branch if equal
|
||
BIT - bit test
|
||
BMI - branch if minus
|
||
BNE - branch if not equal
|
||
BPL - branch if plus
|
||
BRA - branch always
|
||
BRK - break point instruction
|
||
BVC - branch if overflow clear
|
||
BVS - branch if overflow set
|
||
CLC - clear the carry flag
|
||
CLD - clear the decimal flag
|
||
CLI - clear the interrupt flag
|
||
CLP - clear bits in P
|
||
CLR - store a zero into memory
|
||
CMP - compare accumulator
|
||
CPX - compare x register
|
||
CPY - compare y register
|
||
CSP - call system procedure
|
||
DEC - decrement acc or memory
|
||
DEX - decrement x register
|
||
DEY - decrement y register
|
||
EOR - exclusive-or accumulator
|
||
HLT - halt (stop) the clock
|
||
INC - increment acc or memory
|
||
INX - increment x register
|
||
INY - increment y register
|
||
JMP - jump to new location
|
||
JSR - jump to subroutine
|
||
LDA - load accumulator
|
||
LDX - load x register
|
||
LDY - load y register
|
||
MVN - block move (decrement)
|
||
MVP - block move (increment)
|
||
NOP - no operation
|
||
ORA - logical or accumulator
|
||
PHA - push accumulator
|
||
PHP - push p
|
||
PHX - push x register
|
||
PHY - push y register
|
||
PLA - pop accumulator
|
||
PLP - pop p
|
||
PLX - pop x register
|
||
PLY - pop y register
|
||
PSH - push operand
|
||
PUL - pop operand
|
||
RET - return from subroutine
|
||
ROL - rotate left acc/mem
|
||
ROR - rotate right acc/mem
|
||
RTI - return from interrupt
|
||
RTL - return from long subroutine
|
||
RTS - return from short subroutine
|
||
SBC - subtract with carry
|
||
SED - set decimal flag
|
||
SEI - set interrupt flag
|
||
SEP - set bits in P
|
||
SHL - shift left acc/mem
|
||
SHR - shift right acc/mem
|
||
STA - store accumulator
|
||
STX - store x register
|
||
STY - store y register
|
||
SWA - swap accumulator halves
|
||
TAD - transfer acc to D
|
||
TAS - transfer acc to S
|
||
TAX - transfer acc to x
|
||
TAY - transfer acc to y
|
||
TCB - test and clear bit
|
||
TDA - transfer D to acc
|
||
TSA - transfer S to acc
|
||
TSB - test and set bit
|
||
TSX - transfer S to X
|
||
TXA - transfer x to acc
|
||
TXS - transfer x to S
|
||
TXY - transfer x to y
|
||
TYA - transfer y to acc
|
||
TYX - transfer y to x
|
||
WAI - wait for interrupt
|
||
XCE - exchange carry with emulation bit
|
||
|
||
Comments:
|
||
|
||
CLP replaces REP in the original 65c816 instruction set, since CLP
|
||
is a tad more consistent with the original 6502 instruction set. See
|
||
"recommended options" for the status of REP. CLR replaces the STZ
|
||
instruction. Since STA, STX, and STY are used to store 65c816 registers,
|
||
STZ seems to imply that there is a Z register. Using CLR (clear) eliminates
|
||
any confusion. CSP (call system procedure) replaces the COP mnemonic. COP
|
||
was little more than a software interrupt in both intent and implementation.
|
||
CSP helps make this usage a little clearer. HLT replaces the STP mnemonic.
|
||
STP, like the STZ mnemonic, implies that the P register is being stored
|
||
somewhere. HLT (for halt) is just as obvious as "stop the clock" yet it
|
||
doesn't have the same "look and feel" as a store instruction. JML and JSL
|
||
are not really required by the new standard; but see recommended options
|
||
concerning these two instructions. Most of the new 65c816 push and pull
|
||
instructions have been collapsed into two instructions: PSH and PUL.
|
||
|
||
PEA label becomes PSH #label
|
||
PEI (label) becomes PSH label
|
||
PER label becomes PSH @label
|
||
PHB becomes PSH DBR
|
||
PHD becomes PSH D
|
||
PHK becomes PSH PBR
|
||
|
||
PLB becomes PUL DBR
|
||
PLD becomes PUL D
|
||
|
||
These mnemonics are more in line with the original design of the 6502
|
||
instruction set whereby the mnemonic specifies the operation and the operand
|
||
specifies the addressing mode and address. The RET instruction gets converted
|
||
to RTS or RTL, depending on the type of subroutine being declared. RTS and
|
||
RTL still exist in order to force a short or long return. SHL and SHR (shift
|
||
left and shift right) are used instead of ASL and LSR. The 6500 family has
|
||
NEVER supported an arithmetic shift left instruction. The operation performed
|
||
by the ASL mnemonic is really a logical shift left. To simplify matters, SHL
|
||
and SHR are used to specify shift left and shift right. SWA (swap accumulator
|
||
halves) is used instead of XBA. Since this is the only instruction that
|
||
references the "B" accumulator, there's no valid reason for even treating
|
||
the accumulator as two distinct entities (this is just a carry-over from the
|
||
6800 MPU). Likewise, since the eight-bit accumulator cannot be distinguished
|
||
from the 16-bit accumulator on an instruction by instruction basis (it depends
|
||
on the setting of the M bit in the P register), the accumulator should always
|
||
be referred to as A, regardless of whether the CPU is in the eight or sixteen
|
||
bit mode. Therefore, instructions like TCD, TCS, TDC, and TSC should be
|
||
replaced by TAD, TAS, TDA, and TSA. For more info on these new mnemonics,
|
||
see the section on "recommended options".
|
||
|
||
|
||
Built-in Macros
|
||
---------------
|
||
|
||
The following instructions actually generate one or more instructions.
|
||
They are not required at the subset level, but are required at the full and
|
||
extended levels.
|
||
|
||
|
||
ADD - emits CLC then ADC
|
||
BFL - emits BEQ (branch if false)
|
||
BGE - emits BCS
|
||
BLT - emits BCC
|
||
BTR - emits BNE (branch if true)
|
||
BSR - emits PER *+2 then BRA (short) or PER *+3 then BRL (long)
|
||
SUB - emits SEC then SBC
|
||
|
||
|
||
Recommended Options
|
||
-------------------
|
||
|
||
The following mnemonics are aliases of existing instructions. The
|
||
(proposed) standard recommends that the assembler support these mnemonics,
|
||
mainly to provide compatibility with older source code, but does not
|
||
recommend their use in new programs. Some (or all) of these items may be
|
||
removed from the recommended list in future revisions of the standard. None
|
||
of these recommended items need be present at the subset level. If these
|
||
are the only extensions over and above the full syntax, the assembler
|
||
CANNOT claim to be an extended level assembler.
|
||
|
||
ASL BRL COP JML JSL LSR PEA PEI PER
|
||
PHB PHK PHK PLB PLD REP TCD TCS TDC
|
||
TSC TRB WDM XBA
|
||
|
||
|
||
|
||
|
||
Symbols, Constants, and Other Items
|
||
-----------------------------------
|
||
|
||
Symbols may contain any reasonable number of characters at the full
|
||
level. At the subset compliance level, at least 16 characters should be
|
||
supported and 32 is recommeded. A "reasonable" number of characters should
|
||
be at least 64 if the implementor needs a maximum value.
|
||
|
||
Symbols must begin with an alphabetic character and may contain
|
||
(only) the following symbols: A-Z, a-z, 0-9, "_", "$", and "!". The
|
||
assembler must be capable of treating upper and lower case alphabetic
|
||
characters identically. Note that this does not disallow an assembler from
|
||
allowing the programmer to choose that upper and lower case be distinct, it
|
||
simply requires that in the default case, upper and lower case characters
|
||
are treated identically. Note that the standard does not require case
|
||
sensitivity in the assembler (and, in fact, recommends against it).
|
||
Therefore, anyone foolish enough (for many, many reasons) to create variables
|
||
that differ only in the case of the letters they contain is risking port-
|
||
ability problems (as well as maintenence, readability, and other problems).
|
||
|
||
The following symbols are reserved and may not be redefined within
|
||
the program:
|
||
|
||
A, X, Y, S, DBR, PBR, D, M, P
|
||
|
||
Nor may these symbol appear as fields to a record or type definition (which
|
||
will be described later).
|
||
|
||
|
||
Constants take six different forms: character constants, string
|
||
constants, binary constants, decimal constants, hexadecimal constants and
|
||
set constants.
|
||
|
||
Character constants are created by surrounding a single character by
|
||
a pair of apostrophes or quotation marks, e.g., "s", "a", '$', and 'p'. If
|
||
the character is surrounded by apostrophes, then the ASCII code for that
|
||
character WITH THE H.O. BIT CLEAR will be used. If the quotation marks are
|
||
used, then the ASCII code for the character WITH THE H.O. BIT SET will be
|
||
used. If you need to represent the apostrophe with the H.O. bit clear or a
|
||
quotation mark with the H.O. bit set, simply double up the characters, e.g.,
|
||
|
||
'''' - emits a single apostrophe.
|
||
"""" - emits a single quotation mark.
|
||
|
||
String constants are generated by placing a sequence of two or more
|
||
characters within a pair of apostrophes or quotation marks. The choice of
|
||
apostrophe or quotation mark controls the H.O. bit, as for character
|
||
constants. Likewise, to place an apostrophe or quote within a string
|
||
delimited by the same character, just double up the apostrophe or quotation
|
||
mark:
|
||
|
||
'This isn''t bad!' - generates --This isn't bad--
|
||
"He said ""Hello""" - generates --He said "Hello"--
|
||
|
||
|
||
Binary integer constants consist of a sequence of 1 through 32 zeros
|
||
or ones preceded by a percent sign ("%"). Examples:
|
||
|
||
%10110010
|
||
%001011101
|
||
%10
|
||
%1100
|
||
|
||
Decimal integer constants consist of strings of decimal digits without
|
||
any preceding characters. E.g., 25, 235, 8325, etc. Decimal constants
|
||
may be (optionally) preceded by a minus sign.
|
||
|
||
Hexadecimal constants consist of a dollar sign ("$") followed by
|
||
a string of hexadecimal digits (0..9 and A..F). Values in the range $0
|
||
through $FFFFFFFF are allowed.
|
||
|
||
Set constants are only required at the full and extended compliance
|
||
levels. A set constant consists of a list of items surrounded by braces,
|
||
e.g., {0,3,5}. For more information, see the .SET directive.
|
||
|
||
|
||
|
||
Address Expressions
|
||
-------------------
|
||
|
||
Most instructions and many pseudo-opcode/assembler directives require
|
||
operands of some sort. Often these operands contain some sort of address
|
||
expression (some, ultimately, numeric or string value). This proposed
|
||
standard defines the operands, precision, accuracy, and available operations
|
||
that constitutes an address expression.
|
||
|
||
Precision: all integer expressions are computed using 32 bits. All string
|
||
expressions are computed with strings up to 255 characters in length. All
|
||
floating point operations are performed using IEEE 80-bit extended floating
|
||
point values (i.e., Apple SANE routines). All set operations are performed
|
||
using 32 bits of precision.
|
||
|
||
Accuracy: all integer operations (consisting of two 32-bit operands and an
|
||
operator on those operands) must produce the correct result if the actual
|
||
result can fit within 32 bits. If an overflow occurs, the value is truncated
|
||
and only the low order 32 bits are retained. If an underflow occurs, zero
|
||
is used as the result. If an overflow or underflow occurs, a special bit will
|
||
be set (until the next value is computed) that can be tested by the ".IFOVR"
|
||
and ".IFUNDR" directives. Other than that, such errors are ignored. All
|
||
arithmetic is performed using unsigned arithmetic operations. All
|
||
floating point operations follow the IEEE (and Apple SANE) suggestions, and
|
||
are otherwise ignored by the assembler. Any string operation producing a
|
||
string longer than 255 characters produces an assembly time error. All set
|
||
operations must be exact.
|
||
|
||
Integer operations: The following integer operations must be provided at all
|
||
compliance levels:
|
||
|
||
+ (binary) adds the two operands.
|
||
- (binary) subracts second operand from the first.
|
||
* multiplies the two operands.
|
||
/ divides the first operand by the second.
|
||
\ divides the first operand by the second and returns the remainder.
|
||
& logically ANDs the two operands.
|
||
| logically ORs the two operands.
|
||
^ logically XORs the two operands.
|
||
|
||
|
||
=
|
||
<> These operators compare the two operands (unsigned comparison) and
|
||
< return 1 if the comparison is true, 0 otherwise.
|
||
>
|
||
<=
|
||
>=
|
||
|
||
- (unary) negates (2's complement) the operand
|
||
~ (unary) complements (inverts - 1's complement) the operand
|
||
|
||
|
||
The following operators must be provided at the full and extended compliance
|
||
levels:
|
||
|
||
<- shifts the first operand to the left the number of bits specified by the
|
||
second operand.
|
||
-> shifts the first operand to the right the number of bits specified by the
|
||
second operand.
|
||
|
||
@ (unary) subtracts the location counter at the beginning of the current
|
||
statement from the following address expression.
|
||
|
||
% (ternary, e.g.: X%Y:Z) This operator extracts bits Y through Z from X and
|
||
returns that result right justified.
|
||
|
||
|
||
Floating point operations: floating point numbers and operations are required
|
||
only at the full and extended levels. The following operations must be
|
||
available as well:
|
||
|
||
+ adds the two operands.
|
||
- subtracts the second operand from the first.
|
||
* multiplies the two operands.
|
||
/ divides the first operand by the second.
|
||
- (unary) negates the operand.
|
||
|
||
=
|
||
<> These operators compare the two operands and
|
||
< return 1 if the comparison is true, 0 otherwise.
|
||
>
|
||
<=
|
||
>=
|
||
|
||
|
||
|
||
String operations: strings and string operations are not required at the
|
||
subset level, but the standard recommends their presence. The following
|
||
string operations must be provided at the full and extended levels:
|
||
|
||
+ concatenates two strings
|
||
% (ternary, e.g., X%Y:Z) returns the substring composed of the characters in
|
||
X starting at position Y of length Z. Generate an error if X doesn't
|
||
contain sufficient characters.
|
||
|
||
=
|
||
<> These operators compare the two operands and
|
||
< return 1 if the comparison is true, 0 otherwise.
|
||
>
|
||
<=
|
||
>=
|
||
|
||
|
||
Set operations: sets and set operations are required only at the full and
|
||
extended levels. The following set operations must be provided:
|
||
|
||
+ union of two sets (logical OR of the bits).
|
||
* intersection of two sets (logical AND of the bits).
|
||
- set difference (set one ANDed with the NOT of the second set)
|
||
|
||
= returns 1 if the two sets are equal, zero otherwise.
|
||
<> returns 1 if the two sets are not equal, zero otherwise.
|
||
< returns 1 if the first set is a proper subset of the second.
|
||
<= returns 1 if the first set is a subset of the second.
|
||
> returns 1 if the first set is a proper superset of the second.
|
||
>= returns 1 if the first set is a superset of the second.
|
||
|
||
% (ternary, e.g., X % Y:Z) extracts elements Y..Z from X and returns those
|
||
items.
|
||
|
||
|
||
In addition to the above operators, several pre-defined functions are also
|
||
available. Note that these functions are not required at the subset
|
||
compliance level, only at the full and extended levels:
|
||
|
||
float(i) - Converts integer "i" to a floating point value.
|
||
trunc(r) - Converts real "r" to a 32-bit unsigned integer (or generates an
|
||
error).
|
||
valid(r) - returns "1" if r is a valid floating point value, 0 otherwise
|
||
(for example, if r is NaN, infinity, etc.)
|
||
length(s)- returns the length of string s.
|
||
lookup(s)- returns "1" if s is a valid symbol in the symbol table.
|
||
value(s) - returns value of symbol specified by string "s" in the symbol
|
||
table.
|
||
type(s) - returns type of symbol "s" in symbol table. Actual values
|
||
returned are yet to be defined.
|
||
mode(a) - returns the addressing mode of item "a". Used mainly in macros.
|
||
STR(s) - returns string s with a prefixed length byte.
|
||
ZRO(s) - returns string s with a suffixed zero byte.
|
||
DCI(s) - returns string s with the H.O. bit of its last char inverted.
|
||
RVS(s) - returns string s with its characters reversed.
|
||
FLP(s) - returns string s with its H.O. bits inverted.
|
||
IN(v,s) - returns one if value v is in set s, zero otherwise.
|
||
|
||
|
||
The following integer functions must be present at all compliance levels:
|
||
|
||
LB(i),
|
||
LBYTE(i),
|
||
BYTE(i) - returns the L.O. byte of i.
|
||
HB(i),
|
||
HBYTE(i) - returns byte #1 (bits 8-15) of i.
|
||
BB(i),
|
||
BBYTE(i) - returns bank byte (bits 16-23) of i.
|
||
XB,
|
||
XBYTE(i) - returns H.O. byte of i.
|
||
LW(i),
|
||
LWORD(i),
|
||
WORD(i) - returns L.O. word of i.
|
||
HW(i),
|
||
HWORD(i) - returns H.O. word of i.
|
||
WORD(i)
|
||
|
||
Pack(i,j)- returns a 16-bit value whose L.O. byte is the L.O. byte of i and
|
||
whose H.O. byte is the L.O. byte of j.
|
||
|
||
Pack(i,j,k,l)- returns a 32-bit value consisting of (i,j,k,l) where i is the
|
||
L.O. byte and l is the H.O. byte. Note: l is optional. If
|
||
it isn't present, substitute zero for l.
|
||
|
||
|
||
|
||
|
||
The order of evaluation for an expression is strictly left to right
|
||
unless parentheses are used to modify the precedence of a sub-expression.
|
||
Since parentheses are used to specify certain indirect addressing modes, the
|
||
use of paretheses to override the strict left-to-right evaluation order
|
||
introduces some ambiguity. For example, should the following be treated
|
||
as jump indirect through location $1001 or jump directly to location $1001?
|
||
|
||
JMP ($1000+1)
|
||
|
||
The ambiguity is resolved as follows: if the parenthesis is the first char-
|
||
acter in the operand field, then the indirect addressing mode is assumed.
|
||
Otherwise, the parentheses are used to override the left-to-right precedence.
|
||
The example above would be treated as a jump indirect through location $1001.
|
||
If you wanted to jump directly to location $1001 in this fashion, the state-
|
||
ment could be modified to
|
||
|
||
JMP 0+($1000+1)
|
||
|
||
so that the parenthesis is no longer the first character in the operand
|
||
field.
|
||
|
||
The use of parentheses to override the left-to-right precedence is
|
||
only required at the full and extended compliance levels. It is not
|
||
required at the subset compliance level.
|
||
|
||
|
||
|
||
|
||
|
||
Expression Types
|
||
----------------
|
||
|
||
Expressions, in addition to having a value associated with
|
||
them, also have a specific type. The three basic types of expressions are
|
||
integer, floating point, and string expressions. Integer expressions can
|
||
be broken down into subtypes as well. A hierarchical diagram is the easiest
|
||
way to describe integer expressions:
|
||
|
||
|
||
|
||
integers ------ constants ------------ user defined (enumerated) types
|
||
| |
|
||
| +----- simple numeric constants
|
||
|
|
||
|
|
||
+-- addresses ------------ direct page addresses
|
||
|
|
||
+----- absolute addresses --- full 16-bit
|
||
| |
|
||
| +- relative 8-bit
|
||
|
|
||
+----- long addresses
|
||
|
||
This diagram points out that there are two types of integer expres-
|
||
sions: constants and addresses. Further, there are two types of constants
|
||
and four types of addresses. Before discussion operations on these different
|
||
types of integer values, their purpose should be presented.
|
||
|
||
Until now, most 65xxx assembler did little to differentiate between
|
||
the different types of integer values. In this proposed standard, however,
|
||
strong type checking is enforced. Whereas in previous assemblers you could
|
||
use the following code:
|
||
|
||
label equ $1000
|
||
lda #Label
|
||
sta Label
|
||
|
||
such operations are illegal within the confines of the new standard. The
|
||
problem with this short code segment is that the symbol "label" is used as
|
||
both an integer constant (in the LDA instruction) and as an address
|
||
expression (in the STA instruction). To help prevent logical errors from
|
||
creeping into a program, the assembler doesn't allow the use of addresses
|
||
where constants are expected and vice versa. To that end, a new assembler
|
||
directive, CON, is used to declare constants while EQU is used to declare
|
||
an (absolute) address. Symbols declared by CON cannot be (directly) used
|
||
as an address. Likewise, symbols declared by EQU (and others) cannot be
|
||
used where a constant is expected (such as in an immediate operand).
|
||
|
||
Although this type checking can be quite useful for locating bugs
|
||
within the source file, it can also be a source of major annoyance. Some-
|
||
times (quite often, in fact) you may want to treat an address expression
|
||
as a constant or a constant expression as an address. Two functions are
|
||
used to coerce these expressions to their desired form: PTR and OFS.
|
||
PTR(expr) converts the supplied constant expression to an address expression.
|
||
OFS(expr) converts the supplied address expression to a constant expression.
|
||
The following is perfectly legal:
|
||
|
||
Cons1 CON $5A
|
||
DataLoc EQU $1000
|
||
lda #OFS(DataLoc)
|
||
sta PTR(Cons1)
|
||
|
||
For more information, see the section on assembler directives. PTR and OFS
|
||
are required at all compliance levels of this proposed standard.
|
||
|
||
While any constant value may be used anywhere a constant is allowed,
|
||
the 65c816 microprocessor must often differentiate between the various types
|
||
of address expressions. This is particularly true when emitting code since
|
||
the length of an instruction depends on the particular address expression.
|
||
If an expression contains only constants, direct page values, absolute
|
||
values, or long values, there isn't much of a problem. The assembler uses
|
||
the specified type as the addressing mode. If the expression contains mixed
|
||
types, the resulting type is as follows:
|
||
|
||
Expression contains: Result is:
|
||
| |
|
||
| |
|
||
+------------+-- Constants - Constant
|
||
| |
|
||
+-- Direct | - Direct
|
||
|
|
||
+--+ Absolute - Absolute
|
||
|
|
||
+--+- Long - Long
|
||
|
||
Allowable forms:
|
||
|
||
constant
|
||
direct constant+direct
|
||
absolute constant+absolute
|
||
long constant+long
|
||
absolute+long
|
||
constant+absolute+long
|
||
|
||
|
||
This says that if you expression contains only constants, then the
|
||
result is a constant. If it contains a mixture of constants and direct
|
||
page addresses, the result is a direct page address. Note that direct page
|
||
addresses cannot be mixed with other types of addresses. An error must be
|
||
reported in this situation (although you could get around it with an
|
||
expression of the form "abs+OFS(direct)"). Likewise, adding a constant to
|
||
an absolute address produces an absolute address. Adding an absolute and
|
||
a long address produces a long address, etc.
|
||
|
||
Sometimes, you need to force an expression to be a certain type.
|
||
For example, the instruction "LDA $200" normally assembles to a load
|
||
absolute from location $200 in the current data bank. If you need to force
|
||
this to location $200 in bank zero, regardless of the content of the DBR,
|
||
the address expression must be coerced to a long address. Coercion of this
|
||
type is accomplished with the ":D", ":A", ":L", and ":S" expression suffixes.
|
||
To force "LDA $200" to be assembled using the long address mode, the in-
|
||
struction is modified to be "LDA $200:L". The coercion suffix must always
|
||
follow the full address expression. The ":S" (for short branches) suffix
|
||
is never required, since a short branch (for BRA and BSR) is always assumed,
|
||
but it is included for completeness. For BRA and BSR, the ":L" suffix is
|
||
used to imply a long branch (+/- 32K) rather than the long addressing mode.
|
||
|
||
Caveats: If ":D" or ":A" is used to coerce a large address expression
|
||
to direct or absolute, the high order byte(s) of the expression are truncated
|
||
and ignored. The assembler must assume that when a programmer uses these
|
||
constructs he knows exactly what he's doing. Therefore, "LDA $1001:D" will
|
||
happily assemble this instruction into a "LDA $01" instruction despite the
|
||
actual value of the address expression.
|
||
|
||
|
||
|
||
|
||
|
||
Addressing Mode Specification
|
||
-----------------------------
|
||
|
||
65c816 addressing modes are specified by certain symbols in the op-
|
||
erand field. A quick rundown follows:
|
||
|
||
Addressing mode Format(s) Example(s)
|
||
--------------- ------------------ ----------------------
|
||
|
||
Immediate #<expression> LDA #0
|
||
=<expression> CMP =LastValue
|
||
|
||
Direct Page <expression> LDA DPG
|
||
<expression>:D LDA ANY:D
|
||
|
||
Absolute <expression> LDA ABS
|
||
<expression>:A LDA ANY:A
|
||
|
||
Long <expression> LDA LONG
|
||
<expression>:L LDA ANY:L
|
||
|
||
Accumulator {no operand} ASL
|
||
INC
|
||
|
||
Implied {no operand} CLC
|
||
SED
|
||
|
||
Direct, Indirect,
|
||
Indexed by Y (<direct expr>),Y LDA (DPG),Y
|
||
(<direct expr>).Y LDA (ANY:D).Y
|
||
|
||
Direct, Indirect,
|
||
Indexed by Y, Long [<direct expr>],Y LDA [DPG],Y
|
||
[<direct expr>].Y LDA [DPG].Y
|
||
|
||
Direct, Indexed by X,
|
||
Indirect (<direct expr>,X) LDA (DPG,X)
|
||
(<direct expr>.X) LDA (ANY:D.X)
|
||
|
||
Direct, Indexed by X <direct expr>,X LDA DPG,X
|
||
<direct expr>.X LDA DPG.X
|
||
|
||
Direct, Indexed by Y <direct expr>,Y LDX DPG,Y
|
||
<direct expr>.Y LDX DPG.Y
|
||
|
||
Absolute, Indexed by X <abs expr>,X LDA ABS,X
|
||
<abs expr>.X LDA ANY:A.X
|
||
|
||
Long, Indexed by X <long expr>,X LDA ANY:L,X
|
||
<long expr>.X LDA LONG.X
|
||
|
||
Absolute, Indexed by Y <abs expr>,Y LDA ANY:A,Y
|
||
<abs expr>.Y LDA ABS.Y
|
||
|
||
Program Counter
|
||
Relative (branches) <expression> BRA ABS
|
||
@<expression> BRA @ABS
|
||
|
||
PC Relative (PSH) @<expression> PSH @ABS
|
||
|
||
Absolute, Indirect (<abs expr>) JMP (ABS)
|
||
|
||
Absolute, Indexed,
|
||
Indirect (<abs expr>,X) JMP (ABS,X)
|
||
(<abs expr>.X) JMP (ABS.X)
|
||
|
||
Direct, Indirect (<dpg expr>) LDA (DPG)
|
||
STA (ANY:D)
|
||
|
||
Stack Relative <expr8>,S LDA 2,S
|
||
<expr8>.S LDA 2.S
|
||
|
||
Stack Relative,
|
||
Indirect, Indexed (<expr8>,S),Y LDA (2,S),Y
|
||
(<expr8).S),Y LDA (2.S),y
|
||
(<expr8),S).Y LDA (2,S).y
|
||
(<expr8).S).Y LDA (2.S).y
|
||
|
||
Block Move <long expr>,<long expr> MVN LONG,LONG
|
||
MVP LONG,LONG
|
||
|
||
|
||
<dpg expr>, DPG- Any direct page expression or symbol.
|
||
<abs expr>, ABS- Any absolute expression or symbol.
|
||
<long expr>, Long- Any long expression or symbol.
|
||
expr8- Any expression evaluating to a value less than
|
||
256.
|
||
|
||
|
||
Note: the only real difference between the existing standard and the proposed
|
||
standard is that the period (".") can be used to form an indexed address ex-
|
||
pression. This is compatible (in practice, as well as philosophy) with the
|
||
record structure mechanism supported by this proposed standard. This syntax
|
||
for the various addressing modes is required at all compliance levels.
|
||
|
||
Suggestion: (<dpg expr>):L, (<dpg expr>):L,Y, and (<dpg expr):L.Y
|
||
should be allowed as substitutes for [<dpg expr>], [<dpg expr>],Y, and
|
||
[<dpg expr].Y, respectively. This, however, is not required by this proposed
|
||
standard.
|
||
|
||
|
||
|
||
|
||
|
||
|
||
Assembler Directives and Pseudo-Opcodes
|
||
---------------------------------------
|
||
|
||
An assembler directive is a message to the assembler to change some
|
||
status or otherwise affect the assembly operation. It does not generate any
|
||
object code. A pseudo-opcode, on the other hand, is not a standard 65c816
|
||
instruction but does generate object code. Examples of assembler directives
|
||
include instructions that turn the listing on or off, define procedures,
|
||
equate labels to values, etc. Examples of pseudo-opcodes include instructions
|
||
like .BYTE which emit bytes of object code based on the instruction's
|
||
parameters.
|
||
|
||
|
||
Equates:
|
||
--------
|
||
|
||
Probably the most important assembler directives are the equates.
|
||
The equate directives let you associate a value and a type with a symbol.
|
||
The possible equates use the syntax:
|
||
|
||
<label> .EQU <16-bit value>
|
||
<label> .EDP <8-bit value>
|
||
<label> .EQL <24-bit value>
|
||
<label> .CON <32-bit value>
|
||
<label> .FCON <SANE floating point value>
|
||
|
||
All except .FCON are required at all compliance levels. .FCON is required
|
||
at the full and extended levels.
|
||
|
||
.EQU lets you define a absolute symbol; an address whose value is
|
||
relative to the DBR. An error should be generated if the value in the
|
||
operand field requires more than 16 bits. The type of the operand expression
|
||
is ignored. It may be a constant expression, a direct page expression, or
|
||
even a long address expression. As long as it's an integer expression an
|
||
can fit into 16 bits, it's quite acceptable.
|
||
|
||
.EDP (equate to direct page) is used to define direct page symbols.
|
||
Again, the operand field may be of any integer type as long as the result
|
||
fits into 8 bits. A recommended synonym for .EDP is .EPZ (equate to page
|
||
zero) in deference to the 6502's zero page addressing mode.
|
||
|
||
.EQL (equate long) defines long address expressions. As usual, the
|
||
operand field may contain any integer expression that fits within 24 bits.
|
||
|
||
.CON (constant) is used to define integer numeric constants. Any
|
||
32 bit numeric value may be specified in the operand field.
|
||
|
||
.FCON (floating point constant) is used to declare symbolic floating
|
||
point constants. Such constants must be stored in the symbol table as
|
||
80-bit SANE extended values.
|
||
|
||
In addition to the typed equates, this proposed standard also allows
|
||
an untyped equate, which takes the form:
|
||
|
||
<label> = <operand>
|
||
|
||
where "<operand>" is any valid operand that may appear in the operand field
|
||
of any instruction. <operand>'s type may be integer, string, floating point
|
||
and may also include an addressing mode. The following are all legal:
|
||
|
||
lbl = 5
|
||
lbl = 5.5
|
||
lbl = "Five"
|
||
lbl = Array,X
|
||
lbl = (dp,s),y
|
||
|
||
Labels defined by "=" may appear anywhere the operand field specified for
|
||
that label is allowed. In general, a simple string substitution should be
|
||
performed when a label defined by "=" is used. Note: a label declared by
|
||
"=" can be redefined without error throughout the program. The "=" directive
|
||
is required only at the full and extended compliance levels.
|
||
|
||
|
||
|
||
Data Definitions:
|
||
-----------------
|
||
|
||
While the equates are probably the most important assembler
|
||
directives, the data definition instructions are probably the most important
|
||
pseudo-opcodes around. These instructions are classed into four groups
|
||
determined by the types of operands they accept. In the following paragraphs
|
||
all optional items are enclosed within braces.
|
||
|
||
The first group of data reservation instructions accept any integer
|
||
type expression as operands. They are:
|
||
|
||
{label} .BYTE {expr1, expr2, ..., exprn}
|
||
{label} .WORD {expr1, expr2, ..., exprn}
|
||
{label} .LONG {expr1, expr2, ..., exprn}
|
||
|
||
If a label is present, it is treated as a statement label within the current
|
||
segment and assigned the value of the location counter before any bytes are
|
||
emitted. For the .BYTE opcode, one byte of data is emitted for each operand
|
||
in the operand field, that byte being the L.O. byte of each expression.
|
||
Operands are purely optional. If no operand appears, then an indeterminate
|
||
value is emitted. The .WORD opcodes outputs two bytes for each expression in
|
||
the operand field (or two indeterminate bytes if no operand is present). The
|
||
.LONG instruction outputs four bytes for each operand. These three pseudo-
|
||
opcodes must be present at all compliance levels.
|
||
|
||
The next group of pseudo-opcodes are used to create tables of
|
||
addresses. As such, they only allow symbols that have been defined by
|
||
.EQU, .EQL, "=" (as applicable), statement labels, procedure labels, and
|
||
segment labels in their operand fields. They are:
|
||
|
||
{label} .OFFS expr1 {,expr2, ..., exprn}
|
||
{label} .ADRS expr1 {,expr2, ..., exprn}
|
||
{label} .PTR expr1 {,expr2, ..., exprn}
|
||
|
||
.OFFS outputs two bytes for each operand; .ADRS outputs three bytes for
|
||
each operand; and .PTR outputs four bytes for each operand. These three
|
||
pseudo-opcodes are only required at the full and extended compliance levels.
|
||
|
||
The third group of declarations are used to create constant tables.
|
||
As such, they only allow symbols declared by .CON. They are:
|
||
|
||
{label} .SHORT expr1 {,expr2, ..., exprn}
|
||
{label} .INTEGER expr1 {,expr2, ..., exprn}
|
||
{label} .LONGINT expr1 {,expr2, ..., exprn}
|
||
|
||
These pseudo-ops output one, two, and four bytes respectively. These
|
||
pseudo-opcodes are not required at the subset compliance level, they are
|
||
required only at the full and extended levels.
|
||
|
||
Note: non-symbolic constants are allowed in any of the above
|
||
pseudo-opcodes. Only symbols should have their type information checked.
|
||
|
||
The last group of data declaration pseudo-opcodes are used to
|
||
initialize floating point values. These pseudo-ops are:
|
||
|
||
{label} .FLOAT {item1, item2, ..., itemn}
|
||
{label} .DOUBLE {item1, item2, ..., itemn}
|
||
{label} .EXTENDED {item1, item2, ..., itemn}
|
||
{label} .COMP {item1, item2, ..., itemn}
|
||
|
||
each instruction generates operands of 4, 8, 10, or 8 bytes in length,
|
||
respectively. If the operand field is left blank, the corresponding bytes
|
||
contain an indeterminate value, but the assembler should initialize them to
|
||
NaN (not a number). These four pseudo-opcodes are required only at the
|
||
full and extended levels.
|
||
|
||
Although not required by the standard, the following data declaration
|
||
directives are recommended and should be supported:
|
||
|
||
{label} .HBYTE expr1 {,expr2, ..., exprn}
|
||
{label} .BBYTE expr1 {,expr2, ..., exprn}
|
||
{label} .XBYTE expr1 {,expr2, ..., exprn}
|
||
{label} .HWORD expr1 {,expr2, ..., exprn}
|
||
|
||
the first three reserve one byte of memory for each operand and store the
|
||
H.O (bits 8-15), bank (bits 16-23), or extra byte (bits 24-31) respectively.
|
||
.HWORD reserves two bytes composed of bits 16-31 for each operand.
|
||
|
||
|
||
Arrays:
|
||
-------
|
||
|
||
Space for arrays and data tables can be reserved using the data
|
||
declaration statement mentioned above in conjunction with the "DUP" operator.
|
||
DUP is a binary operator that takes the form:
|
||
|
||
count DUP (list)
|
||
|
||
where count is some constant value and list is a (possibly empty) list of
|
||
values. The items in (list) are repeated "count" times. For example, the
|
||
following .BYTE statement reserves space for an array of 64 bytes and
|
||
initializes each byte to zero:
|
||
|
||
MyArray .BYTE 64 DUP (0)
|
||
|
||
The following statement reserves 256 bytes consisting of the values 1, 2, 3,
|
||
4, 5, 6, 7, and 8 repeated 32 times:
|
||
|
||
MyArray .BYTE 32 DUP (1,2,3,4,5,6,7,8)
|
||
|
||
|
||
The DUP operator is fully recursive. That is, one of the items in
|
||
the list may, itself, be a list defined by the DUP operator. For example,
|
||
|
||
Example .BYTE 16 DUP (0,1,2 DUP (3,4,5))
|
||
|
||
reserves 128 bytes consisting of the list "0,1,3,4,5,3,4,5" repeated 16 times.
|
||
|
||
If the DUP list is empty, e.g., "16 dup ()", then exactly one item
|
||
is reserved for each entry, but it is not initialized. The following example
|
||
reserves space for 128 uninitialized words:
|
||
|
||
OffsetTable .WORD 128 DUP ()
|
||
|
||
|
||
|
||
|
||
Type definitions:
|
||
-----------------
|
||
|
||
Enumerated data types can be declared with the ".TYPE" directive.
|
||
This directive takes the form:
|
||
|
||
{label} .TYPE item1 {,item2, ..., itemn}
|
||
|
||
The items in the list are assigned consecutive values starting from zero.
|
||
For example, in the following .TYPE statement, the symbols red, green, and
|
||
blue are assigned the values zero, one, and two, respectively:
|
||
|
||
colors .TYPE red,green,blue
|
||
|
||
The symbols in the operand field of a .TYPE statement must be unique and
|
||
undefined elsewhere (within the current scope, more on that later). The
|
||
.TYPE statement above is almost identical to the statements:
|
||
|
||
red .con 0
|
||
green .con 1
|
||
blue .con 2
|
||
|
||
However, there is one major difference. The .TYPE statement also defines a
|
||
symbol specified in the label field. This symbol can be used as a pseudo-
|
||
opcode to reserve space for values of the specified type. In the example
|
||
above, "colors" could be used as a pseudo-opcode to reserve space for the
|
||
values red, green, and blue. To differentiate type declarations from other
|
||
instructions, a special lead-in character is used. The slash ("/") is
|
||
recommended by this standard, but the user should have the option of choosing
|
||
this character via a setup program for the assembler. From the example
|
||
above, colors could be used as a pseudo-opcode in the following manner:
|
||
|
||
Christmas /colors red,green
|
||
Ocean /colors blue,green
|
||
Sky /colors blue
|
||
/colors red
|
||
Primaries /colors red,blue,green
|
||
|
||
Unlike other data reserving pseudo-opcodes, a "/colors" definition only
|
||
allows symbols that appear in the operand field of the associated .TYPE
|
||
statement or one of those symbols in a expression that contains a single
|
||
such symbol plus or minus a numeric constant, as long as the result is still
|
||
within the range of symbols declared for that type. E.g.,
|
||
|
||
Okay /colors red,green+1,blue
|
||
NotOkay1 /colors blue+2 ;Outside allowable range
|
||
NotOkay2 /colors red+blue ;can't add two such symbols
|
||
NotOkay3 /colors $25 ;Not red, green, or blue
|
||
|
||
If you need to coerce an expression to the proper form, simply use the type
|
||
name as a pseudo-function. E.g.,
|
||
|
||
ThisIsOkay /colors colors(0),blue ;Same as red, blue
|
||
|
||
If the operand is not appropriate, the assembler should generate a warning
|
||
and emit the code as though the .BYTE statement were used.
|
||
|
||
|
||
If there isn't a label starting in column one of a .TYPE statement
|
||
then the symbols defined in the operand field are applied to the previous
|
||
.TYPE statement. This allows you to create .TYPEs where several symbols
|
||
(which couldn't possibly fit on a single line) are declared as constants.
|
||
E.g.,
|
||
|
||
colors .TYPE red, yellow, blue
|
||
.TYPE orange, green violet
|
||
.TYPE brown, black, white
|
||
|
||
All of these symbols will be associated with "colors". A maximum of 256
|
||
symbols can be associated with a symbol via the .TYPE statement. Whenever
|
||
the data reservation form is used, exactly one byte is reserved for each
|
||
item in the operand field. If you need to reserve more than a single byte
|
||
for each item, use the record declarations described next.
|
||
|
||
The DUP operator can be used to define enumerated data type arrays,
|
||
e.g.,
|
||
|
||
LotsOfRed /colors 16 DUP (red)
|
||
|
||
|
||
|
||
Another form of the .TYPE statement allows you to declare byte
|
||
subrange values. A definition of this type takes the form:
|
||
|
||
label .TYPE start..stop
|
||
|
||
where start and stop are constant values in the range 0..255 and
|
||
start <= stop. Examples:
|
||
|
||
LessThan10 .TYPE 0..9
|
||
Nibbles .TYPE 0..$F
|
||
PrimaryColors .TYPE red..blue ;From above, is red, yellow, blue
|
||
|
||
|
||
Implementation of the .TYPE statement is required only at the full
|
||
and extended compliance levels.
|
||
|
||
|
||
|
||
Records:
|
||
--------
|
||
|
||
A record data structure can be defined with the ".RECORD" and ".ENDR"
|
||
directives using the syntax:
|
||
|
||
label .RECORD
|
||
<data declarations>
|
||
.ENDR
|
||
|
||
This creates a template, but does not generate any code. An example might
|
||
be:
|
||
|
||
CursorPosn .RECORD
|
||
ROW .BYTE 0
|
||
COLUMN .BYTE 0
|
||
.ENDR
|
||
|
||
This definition creates the type "CursorPosn". Like the .TYPE definitions,
|
||
the symbol defined by .RECORD can be used as a pseudo-opcode to reserve
|
||
storage for a variable. For example, to declare a variable of type
|
||
"CursorPosn" the following statement is used:
|
||
|
||
MyCursor /CursorPosn
|
||
|
||
This statement reserves two bytes, initialized to zeros, at the current
|
||
location counter.
|
||
|
||
Access to the fields of the record is accomplished by using the
|
||
"." operator, just like Pascal. E.g.,
|
||
|
||
lda MyCursor.ROW ;Fetches first byte.
|
||
lda MyCursor.COLUMN ;Fetches the second byte.
|
||
|
||
|
||
In the example above, the ROW and COLUMN fields of each variable
|
||
declared with CursorPosn are always initialized to zero. Any other value
|
||
could have been used by substituting the appropriate value, or an
|
||
indeterminate value could have been specified by the definition:
|
||
|
||
CursorPosn .RECORD
|
||
ROW .BYTE
|
||
COLUMN .BYTE
|
||
.ENDR
|
||
|
||
|
||
On occasion, you may want each record variable definition to
|
||
specify the initial values. This can be accomplished by specifying
|
||
parameters in the record definition. Parameters are specified by the
|
||
symbols: ?0, ?1, ..., ?9. ?0 corresponds to the first parameter, ?1 to
|
||
the second, etc. Consider the following record and variable definitions:
|
||
|
||
CursorPosn .RECORD
|
||
ROW .BYTE ?0
|
||
COLUMN .BYTE ?1
|
||
.ENDR
|
||
|
||
HomePosn /CursorPosn 0,0
|
||
LowerRight /CursorPosn 23,79
|
||
MyCursor /CursorPosn 5,10
|
||
|
||
|
||
The only problem with this definition form is that each CursorPosn
|
||
variable must supply exactly two operands. Sometimes you may want to have
|
||
a default value in the event an operand isn't specified. This is accomplished
|
||
using a record defintion of the form:
|
||
|
||
CursorPosn .RECORD ?0=0,?1=0
|
||
ROW .BYTE ?0
|
||
COLUMN .BYTE ?1
|
||
.ENDR
|
||
|
||
This definition instructions the assembler to allow zero or more parameters,
|
||
defaulting ?0 and ?1 to zero if their respective entries aren't present.
|
||
The .DEFAULT directive can also be used, particularly if you run out of
|
||
room on the .RECORD line:
|
||
|
||
OpenRec .RECORD ?0=0, ?1=1
|
||
.DEFAULT ?2=ZRO('Hello there'), ?3=2
|
||
FirstItem .WORD ?0
|
||
.LONG ?3
|
||
SecondItem .BYTE ?1, ?2
|
||
.ENDR
|
||
|
||
|
||
Record definitions are required at the full and extended compliance
|
||
levels, they are not required at the subset compliance level.
|
||
|
||
|
||
|
||
Sets:
|
||
-----
|
||
|
||
Bit string types can be declared using the .SET directive. .SET is
|
||
used in a manner quite similar to .TYPE except the items in the operand field
|
||
can be any constant whose value is less than 32. Up to 32 items may
|
||
appear in the operand field of a .SET definition. The syntax is
|
||
|
||
label .SET item1 {,item2, ..., itemn} ;n <= 32.
|
||
|
||
An alternate form is to specify the name of some type variable in the operand
|
||
field. The following definition creates a set of integers in the range
|
||
0..9:
|
||
|
||
LessThan10 .TYPE 0..9
|
||
SetOfDigits .SET LessThan10
|
||
|
||
|
||
Declaring a set variable is quite similar to declaring an enumerated
|
||
type variable or a record variable: simply use the set name as a pseudo-opcode
|
||
prefaced by a "/":
|
||
|
||
Digits /SetOfDigits
|
||
|
||
|
||
Set constants are specified by placing the items in the set within
|
||
a pair of braces. E.G.:
|
||
|
||
BitValues .TYPE 0..7
|
||
SetOfBitValues .SET BitValues
|
||
Bits /SetOfBitValues {0,1,2,3}
|
||
;
|
||
;
|
||
lda #{0,2,7}
|
||
sta Bits
|
||
|
||
|
||
The assembler, by default, should allow set constants composed of
|
||
the integer values 0..31. This allows programmers to easily deal with bits
|
||
by bit numbers rather than the integers those bit patterns represent. For
|
||
example, to strip all but the H.O. two bits in the (8-bit) accumulator, the
|
||
instruction "AND #{6,7}" makes a lot more sense than "AND #$C0". All other
|
||
entities appearing within "{" and "}" must appear somewhere in the operand
|
||
field of a .SET statement (or must be a member of a .TYPE definition if that
|
||
type appears in the operand field of a .SET).
|
||
|
||
|
||
|
||
Macros:
|
||
-------
|
||
|
||
Macros are created using the .MACRO and .ENDM directives. The syntax
|
||
for a macro definition is
|
||
|
||
label .MACRO {default parameter values}
|
||
<macro body>
|
||
.ENDM
|
||
|
||
Macros are invoked by placing an underscore, followed by the macro name (the
|
||
label in the .MACRO statement). The user should be able to change the macro
|
||
lead-in character from underscore to some other character via an assembler
|
||
set up program.
|
||
|
||
All labels declared within the macro are local to that definition unless the
|
||
".GLOBAL" directive is used to extend their scope. In general, global
|
||
macro labels (except, possibly, those defined by "=") are not useful anyway
|
||
since a duplicate label error might occur on the second invocation of the
|
||
macro.
|
||
|
||
The macro body consists of a sequence of assembler statements. Most
|
||
reasonable statements may be included in the macro body. The standard does
|
||
not required nested macro definitions. Nor need the macro definitions allow
|
||
.RECORD, .TYPE, or .SET definitions (since labels are local to the macro,
|
||
such definitions are dubious anyway).
|
||
|
||
Macro parameters are specified using ?0, ?1, ..., ?9, just as for
|
||
.RECORD definitions. "?#" can be used to determine the actual number of
|
||
parameters present. "?:expr" can be used to select a parameter using a
|
||
numeric expression. For example, "?:?#-1" returns the value of the last
|
||
parameter specified. Default values for the parameters can be specified
|
||
in the .MACRO operand field, or in a .DEFAULT statement, just like specifying
|
||
default values for .RECORD parameters. E.g.,
|
||
|
||
MyMacro .MACRO ?0=0, ?1=2
|
||
.DEFAULT ?2="Hello there"
|
||
.BYTE ?0
|
||
.WORD ?1
|
||
.BYTE ?2
|
||
.ENDM
|
||
|
||
then:
|
||
|
||
_MyMacro 10,20
|
||
|
||
generates the bytes:
|
||
10, 20, 0, Hello there
|
||
|
||
|
||
Macros, by the very nature, allow a variable number of parameters.
|
||
If more parameters are specified than there are references for, the extra
|
||
parameters are ignored. If fewer parameters are specified than there are
|
||
references for, the additional references will be treated as undefined
|
||
symbols. If you want to be able to force the user to enter an exact number
|
||
of parameters, then use the ?# in the default field to specify a fixed number
|
||
of parameters. The following macro definition requires the user to enter
|
||
exactly two parameters whenever TwoParms is invoked:
|
||
|
||
TwoParms .MACRO ?#=2
|
||
lda ?0
|
||
sta ?1
|
||
.ENDM
|
||
|
||
If the number of parameters is fixed at a certain value, default values
|
||
are not allowed in the macro definition.
|
||
|
||
Since macro parameters, in a macro invocation, are separated by
|
||
commas, you cannot directly create a macro of the form:
|
||
|
||
LDAIX .MACRO ?#=1
|
||
lda ?0
|
||
.ENDM
|
||
|
||
and invoke it by:
|
||
|
||
_LDAIX LBL,X
|
||
|
||
intending the "LDA LBL,X" instruction to be generated. Instead, the macro
|
||
mechanism will think that LBL and X are two different parameters and generate
|
||
an error since only a single parameter is allowed. The "<<" and ">>" symbols
|
||
are used as an escape mechanism to parenthesize such operands. To handle the
|
||
case above, the following statement could be used:
|
||
|
||
_LDAIX <<LBL,X>>
|
||
|
||
and this would generate the instruction "LDA LBL,X".
|
||
|
||
The lookup, value, type, and mode functions are quite useful for
|
||
dealing with macro parameters. The exact values returned by these functions
|
||
will be described at a later time.
|
||
|
||
For additional information on macros and dealing with macro para-
|
||
meters, see the sections on conditional assembly and while loops.
|
||
|
||
Macros are required only at the full and extended compliance levels.
|
||
|
||
|
||
|
||
Address Expression Functions:
|
||
-----------------------------
|
||
|
||
Format:
|
||
|
||
label .FUNC {default parameter values}
|
||
<function body>
|
||
.RETURN expr
|
||
.ENDF
|
||
|
||
The .FUNC statement lets programmers define their own address
|
||
expression functions that can be used in operand fields of assembly language
|
||
statements. The function body typically contains a sequence of equates
|
||
and other value computing statements; it may not contain any code generating
|
||
statements.
|
||
|
||
Like a macro definition, all symbols defined inside an address
|
||
expression function are local to that function. Likewise, default parameters
|
||
may be declared in the operand field of the .FUNC statement or via the
|
||
.DEFAULT statement. Alternately, you can specify that a fixed number of
|
||
parameters are required by using the "?#=expr" item in the operand field
|
||
of the .FUNC statement.
|
||
|
||
The expression following the .RETURN statement is the value returned
|
||
by the addressing mode function. Note that more than one .RETURN may appear
|
||
within the function (perhaps within the confines of a conditional assembly
|
||
sequence). If more than one .RETURN statement is encountered, all but the
|
||
last are ignored. The expression returned in the .RETURN operand field may
|
||
contain addressing modes in addition to the actual expression value. In
|
||
general, anything allowed as a macro parameter can be returned as an address
|
||
expression value.
|
||
|
||
An address expression function is invoked by placing the function
|
||
name in some other expression followed by the parameters enclosed within
|
||
parentheses. The parentheses are required even if the parameter list is
|
||
empty (just like the "C" programming language). Examples follow:
|
||
|
||
StripLONibble .FUNC ?#=1
|
||
value = ?0 AND $F0
|
||
.RETURN value
|
||
.ENDF
|
||
;
|
||
AppendTXT .FUNC ?#=1
|
||
string = ?0 + ".TXT"
|
||
.RETURN string
|
||
.ENDF
|
||
;
|
||
.
|
||
.
|
||
.
|
||
LDA #StripLONibble($FF)
|
||
.
|
||
.
|
||
.
|
||
.BYTE AppendTxt("MyString")
|
||
|
||
The LDA instruction generates
|
||
|
||
LDA #$F0,
|
||
|
||
the .BYTE statement becomes
|
||
|
||
.BYTE "MyString.TXT"
|
||
|
||
The latter example demonstrates that address expression functions can
|
||
return any valid type. This includes strings, records, sets, and any
|
||
other entity allowed in an operand field. Consider the following:
|
||
|
||
LBLX .FUNC ?#=2
|
||
L = ?0-?1,X
|
||
.RETURN L
|
||
.ENDF
|
||
|
||
LDA LBLX($100,10)
|
||
|
||
This generates the code:
|
||
|
||
LDA $100-10,X
|
||
|
||
|
||
Address expression functions are required only at the full and
|
||
extended compliance levels.
|
||
|
||
|
||
|
||
|
||
The Label Type
|
||
--------------
|
||
|
||
The ".LABEL" directive is used to declare a valueless symbol, that is, one which
|
||
is defined but is assigned no particular value. The syntax for the .LABEL directive is:
|
||
|
||
.LABEL symbol1 {, symbol2, ..., symboln}
|
||
|
||
Each symbol appearing in the operand field is inserted into the symbol table as a "label"
|
||
typed symbol.
|
||
|
||
Label-typed symbols are useful mainly in macros and in the operand fields of
|
||
conditional assembly statements. The only operations you can perform using label-typed
|
||
symbols are "=" and "<>". Most of the reserved symbols in the assembler (such as A, X,
|
||
Y, DBR, D, M, S, etc.) are actually label-typed symbols.
|
||
|
||
An example of where you might use a label-typed symbol follows:
|
||
|
||
CmpReg .MACRO ?#=2
|
||
.IF ?0=A
|
||
cmp ?1
|
||
.ELSE
|
||
.IF ?0=X
|
||
cpx ?1
|
||
.ELSE
|
||
.IF ?0=Y
|
||
cpy ?1
|
||
.ELSE
|
||
.PAUSE
|
||
.ENDIF
|
||
.ENDIF
|
||
.ENDIF
|
||
.ENDM
|
||
|
||
The "=" equate can also be used to defined label-typed symbols by specifying a
|
||
label-typed symbol in the operand field, e.g.,
|
||
|
||
ACC = A
|
||
XReg = X
|
||
etc.
|
||
|
||
Note that the last equate above does not allow you to enter indexed by X addressing modes as
|
||
|
||
<expression>,XReg
|
||
|
||
it simply allows you to use a statement of the form:
|
||
|
||
.IF XReg=X
|
||
|
||
and wind up assemblying the code after the ".IF".
|
||
|
||
The ".LABEL" directive is required at the full and extended compliance levels; it
|
||
is not required at the subset compliance level.
|
||
|
||
|
||
|
||
|
||
|
||
Procedures:
|
||
-----------
|
||
|
||
At the full and extended compliance levels, the .PROC and .ENDP
|
||
directives can be used to declare 65c816 procedures (subroutines). Procedure
|
||
declarations take the form:
|
||
|
||
procname .PROC {near|far}
|
||
|
||
<procedure body>
|
||
|
||
.ENDP
|
||
|
||
If an operand appears after the .PROC statement, it must be either "near" or
|
||
"far". If no operand appears, "near" is assumed.
|
||
|
||
The procedure name that appears in the label field of the .PROC
|
||
statement is assigned the current value of the location counter at that
|
||
point in the program. It is also given the type of near procedure or
|
||
far procedure, depending upon the .PROC operand field.
|
||
|
||
All labels defined inside a procedure are local to that procedure
|
||
unless the .GLOBAL directive is used to extend their scope beyond the
|
||
procedure. Therefore, labels inside one procedure may be reused outside
|
||
that procedure. If a label inside a procedure is already defined outside
|
||
that procedure an error is not generated, instead the new label supercedes
|
||
the old one INSIDE THE PROCEDURE (scoping rules are the same as for Pascal).
|
||
Procedures may be nested inside one other, the scoping rules used by Pascal
|
||
apply in such situations.
|
||
|
||
Inside the procedure, RET can be used in place of RTS or RTL. The
|
||
assembler will automatically choose the appropriate version depending upon
|
||
whether the procedure is a near or far procedure. If RTS is used inside a
|
||
FAR procedure or RTL is used inside a NEAR procedure, the assembler will
|
||
generate a warning.
|
||
|
||
The assembler automatically assembles JSR using the absolute or
|
||
long addressing mode depending upon the procedure definition. If the
|
||
assembler supports the JSL mnemonic and a JSL is used to call a NEAR
|
||
procedure, the assembler must generate an warning. If the address expression
|
||
following a JSR was coerced using the ":A" or ":L" suffixes, no warning will
|
||
be generated if the incorrect distance was specified. I.e., the following
|
||
does NOT generate an error:
|
||
|
||
JSR mysub:L
|
||
.
|
||
.
|
||
.
|
||
mysub .PROC NEAR
|
||
.
|
||
.
|
||
.
|
||
|
||
If you use a coercion operator, the assembler assumes that you know what
|
||
you are doing.
|
||
|
||
Note that the use of the .PROC statement is optional. You may con-
|
||
tinue to build and call subroutines without the .PROC directive. However,
|
||
using .PROC allows the assembler to perform additional type checking on
|
||
certain operations. An external data flow analysis program can also use the
|
||
procedure declarations to help locate logical bugs in your code.
|
||
|
||
.PROC and .ENDP are required at all compliance levels of the
|
||
standard.
|
||
|
||
|
||
|
||
|
||
|
||
Module Communication Directives:
|
||
--------------------------------
|
||
|
||
Three directives, .GLOBAL, .PUBLIC, and .EXTERNAL, are used to
|
||
communicate symbolic values across procedure, segment, and module boundaries
|
||
(a module is any one source file which is assembled as a whole unit). The
|
||
.GLOBAL directive is used to make symbols visible outside of procedures,
|
||
macros, functions, and records. The .PUBLIC directive is used to make
|
||
certain symbols visible outside the current module. The .EXTERNAL directive
|
||
is used to make symbols defined outside the current module visible within
|
||
the module.
|
||
|
||
The syntax for the .PUBLIC and .GLOBAL directives is identical, it
|
||
takes the form:
|
||
|
||
.PUBLIC symbol1 {,symbol2, ..., symboln}
|
||
and, .GLOBAL symbol1 {,symbol2, ..., symboln}
|
||
|
||
A label is not allowed in the label field of either mnemonic. The symbols
|
||
specified in the operand field of these two instructions are made known
|
||
outside the procedure or module where they currently reside. If a procedure
|
||
is nested inside another, the .GLOBAL statement makes its symbols known
|
||
only to the procedure encompassing the nested procedure. In the following
|
||
example, LCL is known only inside procedure X1 and X2, not to the whole
|
||
program:
|
||
|
||
X1 .PROC
|
||
.
|
||
.
|
||
X2 .PROC
|
||
.GLOBAL LCL
|
||
.
|
||
.
|
||
.ENDP
|
||
.ENDP
|
||
|
||
If you wanted to make LCL visible at the level above X1, then another
|
||
.GLOBAL statement must appear inside the X1 procedure declaring LCL to
|
||
be global to that procedure.
|
||
|
||
Another alternative is to use the .PUBLIC statement. Any symbol
|
||
declared public with .PUBLIC is instantly visible throughout the program
|
||
(within the confines of the scoping rules). However, keep in mind that
|
||
symbols declared as public are visible outside the current module as well
|
||
and may intefere with other modules.
|
||
|
||
The .EXTERNAL directive is used to obtain access to symbols declared
|
||
outside the current module. The syntax for the .EXTERNAL directive is:
|
||
|
||
.EXTERNAL symbol1:type {,symbol2:type, ..., symboln:type}
|
||
|
||
Again, no label is allowed in the label field of the .EXTERNAL directive.
|
||
The type item is any of NEAR, FAR, CONST, DIRECT, ABS, or LONG.
|
||
|
||
Note: symbols declared with "=", .MACRO, .RECORD, .SET, and .TYPE
|
||
may not appear as operands to the .GLOBAL, .PUBLIC, or .EXTERNAL directives.
|
||
|
||
These directives are not required at the subset compliance level,
|
||
only at the full and extended levels.
|
||
|
||
|
||
|
||
|
||
Segments:
|
||
---------
|
||
|
||
Segments are used to group a collection of logically and physically
|
||
related entities within a program. A segment may contain the program code,
|
||
variables, stack area, direct page area, or other such data. Typically
|
||
a segment is a load module. That is, a segment is loaded as a whole into
|
||
memory. If a program consists of two or more segments, they need not all
|
||
reside in memory at the same time. The memory manager/loader may load
|
||
segments as needed into memory.
|
||
|
||
Segment definitions are required at all compliance levels. All
|
||
programs must consist of at least one segment (this is a source of minor
|
||
incompatibility with existing assemblers). The most general form of the
|
||
segment definition is:
|
||
|
||
label .SEGMENT TYPE=expr {,ALIGN=expr} {,ORG=expr} {,NOCODE}
|
||
|
||
<segment body>
|
||
|
||
.ENDS
|
||
|
||
|
||
.SEGMENT lets you declare any general type of segment. The symbol in the
|
||
label field need not be unique, but if it is redefined elsewhere within the
|
||
current scope, it must appear on a .SEGMENT definition whose type is exactly
|
||
the same as the current definition.
|
||
|
||
Unlike .PROCs, .MACROs, etc., symbols defined inside a segment are
|
||
not local to the segment, but are instantly visible to the reset of the
|
||
module. If you need to declare local variables within a segment, use the
|
||
.LOCAL and .RELEASE directives.
|
||
|
||
The type of segment must be specified in the .SEGMENT operand field.
|
||
The actual segment types will be defined at a later date. For now, assume
|
||
the types used by the Apple //GS loader are specified after the TYPE= item.
|
||
The segment type describes the attributes of the segment, attributes such
|
||
as whether the segment is relocatable or absolute, fixed or movable, etc.
|
||
|
||
The optional ALIGN operand is used to determine some number of bytes
|
||
to which this segment (portion) must be aligned. If ALIGN=1 , the segment
|
||
will be aligned on any byte boundary. If ALIGN=2 then the segment will be
|
||
aligned on a word boundary, etc. Any value between 1 and $10000 can be used
|
||
(ALIGN=$10000 will align the segment on a bank boundary).
|
||
|
||
The ORG=expr option can be used to fix the starting address of the
|
||
segment. This option isn't normally used with code-generating segments.
|
||
It's mainly used to define I/O port addresses and other absolute variables.
|
||
|
||
The NOCODE option is used to declare that a segment will not generate
|
||
any code (i.e., it's just used to declare variables). If any 65c816 instruct-
|
||
ion appears in a NOCODE segment, an error will be generated. All data
|
||
declaring pseudo-opcodes (e.g., .BYTE) must specify indeterminate values else
|
||
an error will be reported.
|
||
|
||
If multiple segments with the same name appear in a module (or
|
||
across modules, for that matter), they will be combined into a single,
|
||
contiguous module by the assembler and/or linker. Consider the following:
|
||
|
||
MyCode .SEGMENT Type=$1AF
|
||
.
|
||
.
|
||
.
|
||
.ENDS
|
||
;
|
||
MyData .SEGMENT Type=$100
|
||
.
|
||
.
|
||
.
|
||
.ENDS
|
||
;
|
||
MyCode .SEGMENT Type=$1AF
|
||
.
|
||
.
|
||
.
|
||
.ENDS
|
||
|
||
|
||
Although MyCode appears in two completely disjoint areas, the assembler/linker
|
||
will combine these items into a single segment. Segments appear in the
|
||
load module in the order they are declared in the source file. In the
|
||
example above, segment MyCode appears before segment MyData (even though
|
||
a portion of MyCode appears after MyData, MyCode was still declared before
|
||
MyData).
|
||
|
||
Segments may be nested, but they don't follow any scoping rules.
|
||
Declaring one segment inside another is no different that declaring those
|
||
two segments completely separate.
|
||
|
||
If you have two separate segments (different names but the same
|
||
type), you can combine them together using the .GROUP directive. This
|
||
directive takes the form
|
||
|
||
label .GROUP seg1, seg2 {,seg3, ..., segn}
|
||
|
||
Referring to "label" refers to the segment obtained by combining the
|
||
segments in the .GROUP operand field.
|
||
|
||
To simply segment usage, there are six predeclared segments. They
|
||
may be declared with the directives:
|
||
|
||
.CODE .DATA .DIRECT
|
||
.STACK .VAR .CONST
|
||
|
||
.CODE is used to declare static, code-generating segments which allow
|
||
65c816 instructions. .DATA is used to declare static data-generating
|
||
segments. .CONST is identical to .DATA except data items inside the
|
||
.CONST directive are read-only. Any attempt to write to items inside a
|
||
.CONST segment should generate an error by the assembler or data flow
|
||
analysis programs. .DIRECT is used to declare segments containing direct
|
||
page variables. This is a NOCODE segment, so only definitions are allowed,
|
||
initial values are illegal. .STACK segments are also NOCODE segments. They
|
||
are useful for declaring stack space down in bank zero. The .VAR segment
|
||
is used like the .DATA segment, except .VAR segments are NOCODE segments.
|
||
They are used for declaring unintialized variables in main RAM.
|
||
|
||
The syntax for these six directives is
|
||
|
||
label .xxxx {ALIGN=expr | ORG=expr}
|
||
|
||
<segment body>
|
||
|
||
.ENDS
|
||
|
||
|
||
|
||
|
||
The ASSUME Directive
|
||
--------------------
|
||
|
||
With the addition of the bank registers and the mode bits in the
|
||
65c816 processor, an assembler can no longer determine the proper addressing
|
||
mode to use in all circumstances without help from the programmer. For
|
||
example, if the assembler encounters an instruction of the form "LDA Label"
|
||
and Label is a statement label inside some segment (i.e., not declared with
|
||
EDP, EQU, EQL, or other type-defining directive), it has no idea whether to
|
||
use the direct, absolute, or long addressing mode. To do so would require
|
||
that the assembler know the current values of the direct page and data bank
|
||
registers at assembly time. Frankly, it is not possible for the assembler
|
||
to always know the content of these registers, hence the programmer must
|
||
manually supply this information to the assembler. This information, as well
|
||
as some other useful information, is supplied to the assembler via the
|
||
.ASSUME directive.
|
||
|
||
The .ASSUME directive uses the syntax:
|
||
|
||
.ASSUME operand1 {,operand2, ..., operandn}
|
||
|
||
where operand(i) is one of the following:
|
||
|
||
DBR:expression24
|
||
DBR:NOTHING
|
||
DP:expression16
|
||
DP:NOTHING
|
||
M:expression1
|
||
M:NOTHING
|
||
X:expression1
|
||
X:NOTHING
|
||
CPU:cpu_type
|
||
|
||
where expression24 is an expression yielding a 24-bit value, expression16 is
|
||
an expression yielding a 16-bit value, expression1 is an expression yielding
|
||
zero or one, NOTHING is a reserved word, and cpu_type is one of {6502, 65c02,
|
||
65802, 65816} or one of the later versions of the 65c816 microprocessor.
|
||
|
||
DP (direct page) is used to let the assembler know where the direct
|
||
page register is pointing. If a segment name is given as the expression,
|
||
that segment must be one that resides in bank zero and is of type DIRECT.
|
||
If the assembler encounters a symbol declared in a segment that is assumed
|
||
to be a direct page segment via the DP:expression operand, the assembler will
|
||
reference that location using the direct page addressing mode (if posssible).
|
||
If the "DP:NOTHING" form is used, the assembler will only use the direct page
|
||
addressing mode if a symbol was declared with the EDP equate. None of the
|
||
segments will be treated as direct page segments, even if they were declared
|
||
as type DIRECT. If you want to simultaneously refer to several segments as
|
||
direct page segments, group them together using the .GROUP directive and
|
||
specify the group name as the expression value after the DP:, i.e.,
|
||
|
||
DPGroup .GROUP DPSeg1, DPSeg2, DPSeg3
|
||
.ASSUME DP:DPGroup
|
||
|
||
By default, the assembler should assume DP:NOTHING.
|
||
|
||
|
||
DBR is used to tell the assembler which segment/bank the DBR (data
|
||
bank register) points at. References to variables within that segment will
|
||
be assembled as absolute references (unless that segment name is also
|
||
specified after DP:expr, in which case the direct page addressing mode will
|
||
be used, if possible). If DBR:NOTHING is specified, absolute addressing will
|
||
be used only for those symbols declared via EQU, all other references will
|
||
be assumed to be long references. Note that the H.O. eight bits of the
|
||
24-bit expression are used. Therefore, to set the DBR assumption to an
|
||
absolute bank in memory, an expression of the form:
|
||
|
||
.ASSUME DBR:$200000 ;Assume DBR=$20
|
||
|
||
must be used. By default, the assembler should assume DBR:NOTHING.
|
||
|
||
Normally, a programmer should use "#" and "=" to specify eight or
|
||
sixteen bit immediate operand sizes. To help ensure upwards compatibility
|
||
with existing source code, a mechanism has been added whereby the "#" is
|
||
used and the .ASSUME directive controls the size of immediate operands. This
|
||
task is achieved using the M:expr and X:expr operands. Normally the assembler
|
||
defaults to M:NOTHING and X:NOTHING. In this mode, "#" specifies 8-bit
|
||
immediate operands and "=" specifies 16-bit operands. If the expression
|
||
following the M or X is zero or one, then any immediate operand containing
|
||
an equal sign is flagged as an error and the "#" specifies an eight-bit
|
||
operand if the expression was 1, a sixteen-bit operand if the expression was
|
||
zero. If the expression evaluates to any other value an error is generated.
|
||
Note that M only affects accumulator and memory operations while X affects
|
||
the index register operations. It is perfectly permissible to have an
|
||
.ASSUME of the form:
|
||
|
||
.ASSUME M:NOTHING,X:1
|
||
|
||
The "=" immediate specifier would be allowed for accumulator operations but
|
||
not for X/Y index register operations.
|
||
|
||
To help ensure compatibility with the existing defacto standard,
|
||
LONGI, LONGA, SHORTI, and SHORTA should be provided as built-in macros
|
||
generating the appropriate .ASSUME statement.
|
||
|
||
The "CPU:cpu_type" operand to the .ASSUME statement lets users
|
||
specify the exact 6500 family CPU they are using. The effect of this
|
||
operand is to "disconnect" certain instructions. If a certain CPU is
|
||
specified and a programmer uses an addressing mode or instruction which isn't
|
||
available on that CPU, the assembler will generate an error. By default,
|
||
the assembler should assume the CPU of the machine on which the assembler
|
||
is intended to run (e.g., 65c816 for Apple //GS machines). If the assembler
|
||
is running on a different processor other than a 6500 family chip, it should
|
||
default to 65c816. The user should be able to choose this default value
|
||
from an assembler set-up program.
|
||
|
||
The .ASSUME directive, and all operands available to it, must be
|
||
supported at all compliance levels.
|
||
|
||
|
||
|
||
Local Symbols
|
||
-------------
|
||
|
||
In addition to local labels automatically specified inside procedures,
|
||
macros, and expression functions, you can also explicitly declare local sym-
|
||
bols within the source file. User-defined local symbols come in two
|
||
varieties: numeric and symbolic.
|
||
|
||
Up to 10 active numeric local labels can be specified at any given
|
||
time. The numeric local labels are similar to those used by D. E. Knuth
|
||
in "The Art of Computer Programming, Vol 1", although the syntax is different.
|
||
Numeric local labels are declared by placing a caret (up-arrow) in front of
|
||
a single decimal digit in the label field. Examples follow:
|
||
|
||
^0 LDX #05
|
||
^9 DEX
|
||
^4 LDA LBL
|
||
|
||
Numeric local labels are referenced with the ">n" and "<n" items, where "n"
|
||
represents a single decimal digit. If the greater than symbol prefaces a
|
||
digit, then the next occurrence of that numeric local label in the source
|
||
file is referenced. If a less than ("<") symbol is used, then the previous
|
||
numeric local label is used. Examples:
|
||
|
||
|
||
LDX #5
|
||
^0 CLR Array,X
|
||
DEX
|
||
BPL <0 ;References 2nd line above.
|
||
;
|
||
LDA Array+2
|
||
bne >0 ;References ^0 below.
|
||
TXA
|
||
^0 STA Array+1
|
||
|
||
Note that multiple occurrences of the same numeric local label may appear
|
||
within the program. The are differentiated by the "<" and ">" symbols.
|
||
|
||
Since "<" and ">" may appear both as operators and as the beginning
|
||
of an operand, a minor ambiguity results. If you see a portion of an ex-
|
||
pression like ">0", does it mean 'is some value greater than zero' or does
|
||
it refer to the next occurrence of "^0"? This is easily handled from context.
|
||
If the ">" or "<" appears where an operator is expected, then the appropriate
|
||
operation is performed. If they appear where an operand is expected and they
|
||
are followed by a single decimal digit, then they are used as lead-ins for
|
||
numeric local labels. Otherwise an error must be generated.
|
||
|
||
Numeric local labels are great for those cases where you need to
|
||
perform a short branch or to set up a small loop and you don't want to use
|
||
meaningless mnemonics like "loop1", "SkipInstr12", etc. Other times, you
|
||
may want to use a meaningful name like "MainLoop" or "ElseQuit", without
|
||
having to worry about conflicts in other parts of the program. Such cases
|
||
are easily handled by the symbolic local label facility specified by this
|
||
proposal. Two assembler directives: .LOCAL and .RELEASE are used to define
|
||
the scope of user-specified local labels. The syntax for these two directives
|
||
is identical, it is:
|
||
|
||
.LOCAL label1 {,label2, ..., labeln}
|
||
.RELEASE label1 {,label2, ..., labeln}
|
||
|
||
A label defined with .LOCAL is confined to the scope of the .LOCAL/.RELEASE
|
||
pair. .LOCAL/.RELEASE pairs may be nested allowing you to redefine a symbol
|
||
to any reasonable depth (say, a minimum of 8 levels).
|
||
|
||
Numeric local labels are required at all compliance levels. Symbolic
|
||
local labels are required at the full and extended compliance levels.
|
||
|
||
|
||
|
||
Conditional Assembly
|
||
--------------------
|
||
|
||
Conditional assembly is handled by the .IF, .ELSE, .IF1, .IF2,
|
||
.IFDEF, .IFNDEF, and .ENDIF directives. .IF is followed by a numeric
|
||
address expression that yields a zero (false) or non-zero (true) result.
|
||
The following code (up to the .ELSE or .ENDIF) is assembled if the result is
|
||
true. Otherwise the code after the .ELSE (if it is present) is assembled in
|
||
its stead. .IF1 and .IF2 assemble their respective code during passes
|
||
one and two. .IFDEF and .IFNDEF accept a single symbol as their parameter
|
||
and test whether or not this symbol is currently defined. The .ELSE
|
||
directive can be used to assemble additional code in the event the tested
|
||
condition is false. Finally, the .ENDIF directive is used to terminate
|
||
a conditional assembly sequence.
|
||
|
||
Conditional assembly blocks can be nested to at least eight levels,
|
||
preferably more. Since all conditional assembly blocks are terminated with
|
||
.ENDIF, there is no need to worry about matching .ELSEs as you would, say,
|
||
in Pascal. Every form of the IF statement is terminated with its own .ENDIF.
|
||
|
||
The .IF1 and .IF2 directives are normally used to print messages and
|
||
perform other minor housekeeping chores. In general, there's absolutely no
|
||
reason why anyone would want to generate code inside one of these conditional
|
||
assembly blocks. Therefore, the assembler may optionally generate an error
|
||
message if the location counter is modified anywhere inside the .IF1/.IF2
|
||
conditional assembly block.
|
||
|
||
.IF, .ELSE, and .ENDIF are required at all compliance levels. .IF1,
|
||
.IF2, .IFDEF, and .IFNDEF are required only at the full and extended com-
|
||
pliance levels.
|
||
|
||
|
||
|
||
While Loops
|
||
-----------
|
||
|
||
Sometimes, especially within macros, you will need some sort of
|
||
looping structure to process parameters or otherwise generate sequences of
|
||
code; the .WHILE/.ENDW directives are used for this purpose. The syntax
|
||
for the while section is:
|
||
|
||
.WHILE expression
|
||
<body of loop>
|
||
.ENDW
|
||
|
||
The instructions in the loop body are repeated as long as the expression
|
||
yields a non-zero value. For the loop to terminate, the variable(s)
|
||
controlling the loop must be defined using the "=" assembler directive
|
||
since this is the only directive that allows you to redefine an instance
|
||
of a variable.
|
||
|
||
The .WHILE directive is especially useful for processing a macro
|
||
(or record definition) with a variable number of parameters. Consider the
|
||
following macro:
|
||
|
||
ByteTable .MACRO
|
||
ParmCnt = ?#
|
||
.WHILE ParmCnt
|
||
.BYTE ?:(?#-ParmCnt)
|
||
ParmCnt = ParmCnt-1
|
||
.ENDW
|
||
.ENDM
|
||
|
||
_ByteTable 0,5,4,2,7
|
||
|
||
This example emits the five bytes 0, 5, 4, 2, and 7 into the object code
|
||
stream.
|
||
|
||
INCLUDE Mechanism
|
||
-----------------
|
||
|
||
A source file include mechanism is provided by the .INCLUDE directive.
|
||
Its syntax is
|
||
|
||
.INCLUDE "filename"
|
||
|
||
The specified file will be inserted at the point of the .INCLUDE directive
|
||
in the current assembly, as though the code were actually inserted at that
|
||
point.
|
||
|
||
The include mechanism must be capable of nested includes up to four
|
||
levels deep. The .INCLUDE directive must be supported at all compliance
|
||
levels of the assembler, although assemblers operating at the subset
|
||
compliance level need not support nested include files.
|
||
|
||
|
||
|
||
|
||
Programs, Modules, and Units
|
||
----------------------------
|
||
|
||
The assembler handles three types of sources files: programs, modules,
|
||
and units. Unless otherwise specified, all source files are assumed to be
|
||
programs. A program is differentiated from a module or unit in that the
|
||
assembler/linker assumes that control is transferred to some point in a
|
||
'program' when it is loaded into memory. Modules and units are assumed to
|
||
be subserviant sections of code that contain data and/or code used by
|
||
programs.
|
||
|
||
By default, a piece of code is assumed to be a program and control
|
||
is transferred to the first byte of that code when the program is loaded
|
||
into memory. This helps improve compatibility with existing source files.
|
||
The .PROGRAM directive can be used to explicitly declare a piece of code as
|
||
a main program, as well as provide an entry address other than the first byte
|
||
of code emitted. The syntax for the .PROGRAM directive is
|
||
|
||
.PROGRAM label
|
||
|
||
where "label" is a program statement label somewhere within the current
|
||
assembly. The address of this label is passed on to the linker/loader where
|
||
it will be used to provide a starting address for the code. All of the
|
||
statements in the source file will be assembled into the program from the
|
||
.PROGRAM directive till the .END directive. If a .PROGRAM directive appears
|
||
in the source file, it must appear before any other statement (other than a
|
||
comment or listing directive) and there may only be one .PROGRAM directive
|
||
encountered per assembly. No modules or units may appear as part of a
|
||
program assembly (see below).
|
||
|
||
The .MODULE directive is used to tell the assembler that it is
|
||
assemblying an object code module which is to be linked into a separate
|
||
program before execution. The .PUBLIC statement is used as the means
|
||
to communicate linkage information to other modules, units, and programs.
|
||
Like the .PROGRAM directive, the .MODULE directive must appear before most
|
||
statements in the source file and the module is terminated with the .END
|
||
directive. However, another module may appear in the source file immediately
|
||
after the .END directive. Such modules are assembled as independent entries
|
||
in a library. The syntax for the .MODULE directive is:
|
||
|
||
.MODULE ModuleName
|
||
|
||
The module name operand is stored as part of the source file for use by the
|
||
linker, but is not otherwise refereced during the assembly process. In fact,
|
||
this symbol may be redefined later in the source file.
|
||
|
||
The .LINK directive can be used to link a module into another module,
|
||
unit, or program at assembly time. The syntax for this directive is:
|
||
|
||
.LINK "filename",ModuleName
|
||
|
||
where filename is the operating system name of the object code file or
|
||
library file containing the module, and ModuleName is the actual module
|
||
name specified with the .MODULE directive. The specified object code is
|
||
inserted into the assembly at the point of the .LINK directive. Access to
|
||
the symbols declared public within the module is accomplished using the
|
||
.EXTERNAL directive.
|
||
|
||
Units are a much more structured form of modules. With a unit,
|
||
you specify not only the symbols visible to the code using the unit, but
|
||
also how that data is used. Units also allow you to pass type checking
|
||
information so the assembler can check for possible logical errors during
|
||
assembly. Finally, as an added bonus, within units you can link in macros,
|
||
records, types, symbols defined by "=", and other entities that cannot be
|
||
handled by modules and the .PUBLIC/.EXTERNAL mechanism.
|
||
|
||
A unit takes the form:
|
||
|
||
.UNIT UnitName
|
||
|
||
<interface section>
|
||
|
||
.BEGIN
|
||
|
||
<implementation section>
|
||
|
||
.END
|
||
|
||
Like .MODULEs, several units may appear in the source file by simply following
|
||
the .END directive with the next unit definition. In fact, .MODULEs and
|
||
.UNITs can be intermixed in the same source file. If more than one module or
|
||
unit appears in the source file, they will be assembled into different slots
|
||
in the object file generated (i.e., a library file will be generated).
|
||
|
||
The interface section of a unit contains those items that will be
|
||
public to the unit. Equates, records, macros, types, sets, and any other
|
||
non-code generating declaration can be used in the interface section (note:
|
||
an exact list of items will be specified later). Such definitions will be
|
||
made available to the code that uses this unit as well as to the code in the
|
||
implementation section. In addition to such declarations, the interface
|
||
section may also contain .PROC definitions and .ENTRY definitions. The
|
||
.PROC definitions simply contain the .PROC statement (which must also appear
|
||
in the implementation section), the .ENTRY definition is used in lieu of the
|
||
.PUBLIC directive and takes the form:
|
||
|
||
label .ENTRY {NEAR or FAR}
|
||
|
||
An example of a simple unit might be:
|
||
|
||
.UNIT SimpleUnit
|
||
MyMac .macro
|
||
lda #0
|
||
sta ?0
|
||
.endm
|
||
;
|
||
ClrSub .proc near
|
||
SetTrue .proc far
|
||
SetIt .entry far
|
||
;
|
||
.BEGIN ;Start implemenation section.
|
||
;
|
||
ClrSub .proc near
|
||
_MyMac $11
|
||
ret
|
||
.endp
|
||
;
|
||
SetTrue .proc far
|
||
lda #1
|
||
SetIt sta $23
|
||
ret
|
||
.endp
|
||
.end
|
||
|
||
|
||
To use the code defined in a unit, the ".USE" directive is used in
|
||
a fashion not unlike the .LINK directive, namely,
|
||
|
||
.USE "filename",UnitName
|
||
|
||
where filename is the operating system pathname and UnitName is the name
|
||
specified in the operand field of the unit directive. Whenever the .USE
|
||
directive appears in a source file, the content of the implementation section
|
||
will be listed if the source listing option is turned on.
|
||
|
||
Whenever the .USE or .LINK directives are employed, the corresponding
|
||
object code is always inserted into the assembly. Therefore the assembler
|
||
is performing double duty, it's acting as both the assembler and linker.
|
||
With units, the assembler always performs the link operation. With modules,
|
||
you can defer the link operation to a separate linkage step, although there
|
||
are only a few instances where this would be beneficial (for example, while
|
||
creating libraries).
|
||
|
||
All of the program linkage directives are optional at the subset
|
||
compliance level, but required at the full and extended levels.
|
||
|
||
|
||
|
||
Listing Controls
|
||
----------------
|
||
|
||
Several directives are used to control the appearence of the assem-
|
||
bled source listing. The exact format of the listing will be specified with-
|
||
in this proposal (although at a later date). The exact listing format must
|
||
be adhered to so that symbolic debuggers can take advantage of an assembled
|
||
source listing saved as a text file for use when stepping through a program.
|
||
|
||
.ON operands
|
||
.OFF operands
|
||
|
||
These two directives are used to turn certain listing options or
|
||
or off. Valid operands include LIST, OBJ, MAC, and COND. LIST controls
|
||
whether or not the source file is listed and supercedes all other options.
|
||
OBJ (if on) will force the assembler to display all bytes of object code
|
||
emitted by an instruction, even if it takes more than one line to display it
|
||
all; if off, OBJ will only display the number of emitted object code bytes
|
||
that fit on the current source line. MAC controls macro expansions during
|
||
the listing. If off, only the macro name, not the expansion, will be dis-
|
||
played. COND controls the printing of statements in a false conditional or
|
||
while loop section.
|
||
|
||
The .TITLE and .SUBTITLE directives let you assign titles and sub-
|
||
titles to the source file. The syntax for these directives is
|
||
|
||
.TITLE "Title of source file"
|
||
.SUBTITLE "Subtitle for this section"
|
||
|
||
The title is displayed at the top of each page and the subtitle is displayed
|
||
immediately below the title. .TITLE always forces a page eject, .SUBTITLE
|
||
never does.
|
||
|
||
The .PAGE directive forces an immediate page ejection on the listing.
|
||
It requires no operands.
|
||
|
||
The .PRINTF directive has the syntax:
|
||
|
||
.PRINTF "Control string" {,operands}
|
||
|
||
It is used in a manner analogous to the PRINTF in the "C" programming
|
||
language. If expressions follow the control string, "%" modifiers in the
|
||
control string specify their output format. E.g.,
|
||
|
||
.PRINTF "Label = $%4h",Label
|
||
|
||
would print
|
||
|
||
Label = $1234
|
||
|
||
assuming the value associated with Label was $1234.
|
||
|
||
|
||
The .PAUSE directive can be used to force an assembly time error.
|
||
It is useful mainly in macros, records, expression functions, etc. to force
|
||
an error if an illegal condition (like bad number of parameters) occurs.
|
||
|
||
The listing control directives are required only at the full and
|
||
extended compliance levels.
|
||
|
||
|
||
|
||
Data Flow Analysis Directives
|
||
-----------------------------
|
||
|
||
The following directives are quite useful to add-on debuggers and
|
||
data flow analysis programs. They are required only at the full and extended
|
||
compliance levels:
|
||
|
||
label .table
|
||
<data table>
|
||
.endt
|
||
|
||
For .table, the label is assigned the current value of the location counter
|
||
and label is treated like a statement label. .TABLE and .ENDT are otherwise
|
||
ignored.
|
||
|
||
label .REF label1 {, label2, ..., labeln}
|
||
|
||
This statement is ignored by the assembler. The statement label, if present,
|
||
is also ignored.
|
||
|
||
|
||
|
||
Other Optional Goodies
|
||
----------------------
|
||
|
||
The following are not required by this proposal, but should be
|
||
provided nonetheless:
|
||
|
||
.system "DOS command"
|
||
|
||
.SYSTEM issues the specified command to the operating system. This command
|
||
is useful for deleteing files during assembly, changing directories, etc.
|
||
|
||
|
||
|
||
|
||
|
||
Operation of the Assembler
|
||
--------------------------
|
||
|
||
Given the structure of the assembler, there's no way it can accomplish
|
||
its job in less than three passes without placing severe burdens on the
|
||
user (I could provide you with a mathematical proof of this, but I don't want
|
||
to bore you to death). Therefore, the standard specifies that the assembler
|
||
must use three (or more) passes to do its job. During the first pass the
|
||
assembler associates labels with segments (and groups of segments), determines
|
||
whether or not those symbols are near or far, and performs other housekeeping
|
||
chores fit for pass one. Pass two of the assembler is equivalent to the
|
||
traditional pass one of an assembler, it computes the values for all of the
|
||
symbols in the program. Pass three generates the actual object code.
|
||
|
||
|
||
|
||
In Addition to the Assembler...
|
||
-------------------------------
|
||
|
||
The standard should also include specifications for a run-time
|
||
library to be provided with the assembler as well as a list of tools
|
||
(e.g., debugger, linker, librarian, etc.) which must be provided with the
|
||
product to meet the full compliance level. I would like to propose the
|
||
following items in the run-time library:
|
||
|
||
TTY_IO: A set of routines to communicate with a text-based
|
||
user console. INIT, GETC, and PUTC are the basic routines. These three
|
||
routines are easily supported on any system supporting a user console.
|
||
|
||
TERMINAL_IO: A set of routines to communicate with a cursor-based
|
||
terminal device. Routines supported should include INIT, GETC, PUTC, GOTOXY,
|
||
HOME, CLREOLN, and CLREOP.
|
||
|
||
CONSOLE_IO: A set of routines to communicate with a DMA-based video
|
||
display device. See the specifications for ANIX's CHARIO driver for the
|
||
routines to be supplied with this library entry.
|
||
|
||
AUX_IO: A driver for a set of one or more serial communication ports.
|
||
Routines should include INITA, SETUPA, GETA, PUTA, STATUSA.
|
||
|
||
PRT_IO: A driver for a set of one or more printer ports. Routines
|
||
should include INITP, SETUPP, PUTP, and STATUSP.
|
||
|
||
NET_IO: A driver for a set of one or more network ports. Routines
|
||
should include INITN, SETUPN, GETPacket, SendPacket, etc.
|
||
|
||
CLK_IO: A driver for a real time clock or clock-calendar unit.
|
||
|
||
FP: An IEEE floating point package for the 65c816 chip.
|
||
|
||
MATH: A set of integer math routines (multiply, divide, extended
|
||
precision, etc.).
|
||
|
||
CONV: A set of conversion routines (binary -> decimal, etc.).
|
||
|
||
FILE_IO: A set of routines that interface to the host's operating
|
||
system providing a common interface to various operating systems.
|
||
|
||
DVC_IO: A hardware independent device I/O package (allowing named
|
||
devices which can be connected through a BIOS (like the AUX_IO and PRT_IO
|
||
packages) to various hardware devices.
|
||
|
||
STD_IO: A set of routines to perform various I/O operations such
|
||
as PRINT, PRINTF, SCANF, PUTI (integer), GETI, PUTH (hex), GETH, etc.
|
||
|
||
MEM_MGR: A set of memory management routines to efficiently allocate
|
||
and deallocate memory.
|
||
|
||
|
||
This is, by no means, an exhaustive list, but a quick sample of the types of
|
||
routines that should be provided.
|
||
|
||
Apple //GS users may complain that many of these routines already
|
||
exist within the confines of the Apple toolbox. The intent, however, is to
|
||
provide a set of useful routines that can be utilized on ANY 65c816 system
|
||
so 65c816 code can be easily ported to systems other than the Apple //GS.
|
||
|
||
|
||
|
||
|
||
|
||
|