911 lines
37 KiB
Plaintext
911 lines
37 KiB
Plaintext
|
|
|||
|
This is a very long article (36k) of interest to hardware
|
|||
|
hackers, assembly language programmers, and machine
|
|||
|
architects. It is a description of how I feel the
|
|||
|
65xxx family should evolve. Don't count on anything
|
|||
|
you read here. Nonetheless, you might find it interesting.
|
|||
|
If you're not one of the aforementioned types, you may
|
|||
|
want to skip the noise which follows...
|
|||
|
|
|||
|
|
|||
|
The 65C816 Dream Machine
|
|||
|
|
|||
|
This essay is an attempt to vent my frustrations.
|
|||
|
While the 65C816 chip is, without question, better than
|
|||
|
the 6502 and 65c02 chips that preceded it, the 65c816
|
|||
|
leaves a lot to be desired. Unless you count
|
|||
|
microcontrollers like the 8048, F8, or 8051, I've never
|
|||
|
encountered a chip as difficult to program in assembly
|
|||
|
language as the 65c816. Those stupid M and X bits cause
|
|||
|
so much trouble I wonder if they're worth the trouble of
|
|||
|
using them. Attempting to use the 65c816 in native mode
|
|||
|
while attempting to coexist with other 6502 routines
|
|||
|
(requiring emulation mode) such as ProDOS 8 can really
|
|||
|
push one's patience. But wait! There's a small chance
|
|||
|
things can be improved. The WDM (William D. Mensch)
|
|||
|
instruction is reserved by the Western Design Center for
|
|||
|
instruction set expansion. While I'm sure Mr. Mensch has
|
|||
|
other plans for this opcode, the following treatise
|
|||
|
provides my views on how this single opcode should be
|
|||
|
used.
|
|||
|
|
|||
|
The WDM opcode should be used in the next version of
|
|||
|
the 65c816 (let's call it the 65c820, just to be amusing)
|
|||
|
to change the instruction set. When the 65c820 resets,
|
|||
|
it should come up in the 6502 emulation mode, just like
|
|||
|
the 65c816 does now. The XCE instruction could be used
|
|||
|
to switch to 65c816 mode just like the existing 65c816
|
|||
|
part. The WDM opcode, which I'll call NAT (for NATive
|
|||
|
mode) will be used to switch the processor to 65c820
|
|||
|
native mode. Once in the 65c820 mode, the 65c820 takes
|
|||
|
on a completely different character. The only bounds
|
|||
|
I've placed on the new instruction set is that if you can
|
|||
|
perform an operation with a single instruction on the
|
|||
|
65c816, you can perform the same thing on the 65c820 with
|
|||
|
a single instruction. All other aspects (including
|
|||
|
timing and instruction size) can vary. I've also taken
|
|||
|
some liberties with the way certain instructions affect
|
|||
|
the flags. For the most part however, 65c816
|
|||
|
instructions have an identical counterpart on the 65c820.
|
|||
|
|
|||
|
Design Issues: There are lot's of reasons for
|
|||
|
designing a new instruction set. My criteria are as
|
|||
|
follows:
|
|||
|
|
|||
|
1) The instruction set must mirror the philosophy of the
|
|||
|
6500 family. A programmer experienced with the 6502
|
|||
|
instruction set must feel comfortable with the 65c820
|
|||
|
instruction set.
|
|||
|
|
|||
|
2) The new instruction set must support high level
|
|||
|
language constructs better than the 6502 and 65c816
|
|||
|
processors.
|
|||
|
|
|||
|
3) The new instruction set must be easy to learn and fun
|
|||
|
to use.
|
|||
|
|
|||
|
4) We must remember that fancy instructions are very
|
|||
|
difficult to implement in silicon. Hence super fancy
|
|||
|
instructions which provide limited functionality must
|
|||
|
be left out. For example, the 65c820 doesn't support
|
|||
|
floating point instructions (although they could be
|
|||
|
added via a coprocessor).
|
|||
|
|
|||
|
5) The only (commercially popular) computer system that
|
|||
|
would ever use the 65c820 is an upgrade of the Apple
|
|||
|
IIGS. Hence the instruction set should contain
|
|||
|
instructions that enhance the operation of an Apple II
|
|||
|
family machine.
|
|||
|
|
|||
|
6) The original 6502 instruction set was designed with a
|
|||
|
small set of basic instructions complemented with a
|
|||
|
large set of addressing modes. The 65c816 strayed
|
|||
|
from this philosophy, the 65c820 returns to it.
|
|||
|
|
|||
|
Based on these design issues, I offer the following
|
|||
|
machine; the 65c820:
|
|||
|
|
|||
|
|
|||
|
_________________________________________________________
|
|||
|
______________________
|
|||
|
|
|||
|
Programmer's Model:
|
|||
|
|
|||
|
The 65c820 will contain several additional registers,
|
|||
|
above and beyond those available on the 65c816. All
|
|||
|
registers are 16 bits. The register bank includes:
|
|||
|
|
|||
|
A, AX -- Accumulator and accumulator extension
|
|||
|
X -- X index register
|
|||
|
Y -- Y index register
|
|||
|
F -- Stack frame pointer
|
|||
|
S -- Stack pointer
|
|||
|
D -- Direct page register
|
|||
|
P -- Program status word
|
|||
|
ABR -- Auxillary bank register
|
|||
|
SBR -- Stack bank register
|
|||
|
DBR -- Data bank register
|
|||
|
PBR -- Program bank register
|
|||
|
PC -- Program counter
|
|||
|
LBound -- Low bounds register
|
|||
|
HBound -- High bounds register
|
|||
|
|
|||
|
A, X, Y, S, D, & PC are mostly identical to their 65c816
|
|||
|
counterparts. AX is the accumulator extension used by the
|
|||
|
multiply and divide instructions. F is a special index
|
|||
|
register, useful for accessing local variables and
|
|||
|
parameters. P differs from the 65c816 version in that it
|
|||
|
is 16-bits wide. Accessing the upper byte of this
|
|||
|
register is a privileged operation (more on that later
|
|||
|
on). DBR and PBR are similar to their 65c816 cousins,
|
|||
|
except they are now 16-bits long and allow you to
|
|||
|
position the program and data banks on any PAGE boundary
|
|||
|
(rather than a bank [64K] boundary). ABR is an auxillary
|
|||
|
data bank register. SBR lets you locate the stack
|
|||
|
anywhere in the 16Mbyte address space. LBound and HBound
|
|||
|
provide some rudimentary memory management functions.
|
|||
|
All memory addresses are added to LBound to produce the
|
|||
|
true physical address. If the result- ing address is
|
|||
|
greater than HBound, an ABORT trap will be issued. This
|
|||
|
allows you to load multiple programs into memory and
|
|||
|
protect them from being walked on by other programs.
|
|||
|
|
|||
|
As I alluded to earlier, certain operations are
|
|||
|
PRIVILEGED. The 65c820's program status word takes the
|
|||
|
following form:
|
|||
|
|
|||
|
15 14 13 12 11 10 9 8 | 7 6 5 4 3 2 1 0 U/S
|
|||
|
I M fpc * * * * | N V * * D dir Z C
|
|||
|
|
|||
|
The low order 8 bits are identical to the 6502's P
|
|||
|
register except the B bit isn't present (it's not
|
|||
|
required) and the I bit has been moved to bit 14. The
|
|||
|
dir bit controls the direction of various string
|
|||
|
instructions (ala 8086). The low order 8 bits are called
|
|||
|
the USRPSW (user program status word). The upper 8 bits
|
|||
|
are called tye SYSPSW (system program status word) and
|
|||
|
can only be accessed while in the system mode. Bit 15
|
|||
|
(U/S) is the user/supervisor bit. This bit determines
|
|||
|
whether or not you are in the user or system (supervisor)
|
|||
|
mode. Bit 14 is the interrupt disable bit. For
|
|||
|
protection reasons, a user mode program cannot have
|
|||
|
access to the interrupt disable bit (by turning off all
|
|||
|
interrupts and not turning them back on, a user mode
|
|||
|
program can cause all kinds of havoc). Bit 13 is the
|
|||
|
memory management bit. If set, the LBound the HBound
|
|||
|
registers determine the location and extent of the
|
|||
|
logical address space. If clear, then the logical
|
|||
|
address space and physical address space are the same.
|
|||
|
The fpc bit determines if a floating point coprocessor is
|
|||
|
installed. If not, the floating point expansion
|
|||
|
instructions will cause an illegal instruction trap,
|
|||
|
otherwise, the FP instructions will be routed to the
|
|||
|
floating point coprocessor. The remaining bits in the P
|
|||
|
register are reserved for future use.
|
|||
|
|
|||
|
|
|||
|
Opcode Format:
|
|||
|
|
|||
|
The 65c820's instruction set is broken down into 32
|
|||
|
classes. They are
|
|||
|
|
|||
|
0-MOV, 1-LEA, 2-LEAA, 3-LEAD, 4-LEAS, 5-XCHG, 6-
|
|||
|
ADD, 7-ADC 8-SUB, 9-SBC, A-CMP, B-AND, C-OR,
|
|||
|
D-XOR, E-ASH, F-LSH 10-ROT, 11-BIT, 12-ADDQ, 13-CMPQ,
|
|||
|
14-exp, 15-exp, 16-exp, 17-exp 18-exp, 19-Scc, 1A-Ccc,
|
|||
|
1B-Icc, 1C-Brnch,1D-Brnch,1E-exp, 1F-exp
|
|||
|
|
|||
|
"exp" refers to expansion.
|
|||
|
|
|||
|
The "typical" instruction format (for opcodes $00..$11)
|
|||
|
is
|
|||
|
|
|||
|
15 14 13 12 11 10 9 8 | 7 6 5 4 3 2 1 0 a a
|
|||
|
a a a a s d | r r r o o o o o
|
|||
|
|
|||
|
where a = addressing mode bits s = size
|
|||
|
(0=byte, 1=word) d = direction (0=to addressing
|
|||
|
mode loc, 1=from addressing mode loc) r = register
|
|||
|
o = opcode (one of the group values above).
|
|||
|
|
|||
|
There are 64 possible addressing modes (since there are
|
|||
|
six "a" bits). The register bits refer to the first
|
|||
|
eight of these addressing modes (0..7).
|
|||
|
|
|||
|
0- A 10- d,F 20- d,X 30- d,Y 1- X
|
|||
|
11- a,F 21- a,X 31- a,Y 2- Y 12-
|
|||
|
n(d,F) 22- a,FX 32- a,FY 3- S 13- n(a,F)
|
|||
|
23- l,X 33- l,Y 4- F 14- n[d,F] 24- (X)
|
|||
|
34- (Y) 5- TOS 15- n[a,F] 25- (d,X) 35-
|
|||
|
(d),Y 6- Imm 16- (d,F) 26- n(d,X) 36-
|
|||
|
n(d),Y 7- d 17- [d,F] 27- [d,X] 37-
|
|||
|
[d],Y 8- a 18- (a,F) 28- n[d,X] 38-
|
|||
|
n[d],Y 9- l 19- [a,F] 29- n(d,FX) 39-
|
|||
|
n(d,F),Y A- d,S 1A- P 2A- n[a,FX] 3A-
|
|||
|
n[a,F],Y B- (d,S),Y 1B- D 2B- [a,FX] 3B-
|
|||
|
[a,F],Y C- (d) 1C- ABR 2C- (a,X) 3C-
|
|||
|
(d),Y+ D- [d] 1D- SBR 2D- n(a,X) 3D-
|
|||
|
(d),-Y E- n(d) 1E- DBR 2E- [a,X] 3E-
|
|||
|
[d],Y+ F- n[d] 1F- PBR 2F- n[a,X] 3F-
|
|||
|
[d],-Y
|
|||
|
|
|||
|
where:
|
|||
|
|
|||
|
A, X, Y, S, F, P, D, ABR, SBR, DBR, and PBR are the
|
|||
|
corresponding 65c820 registers. Imm refers to an
|
|||
|
immediate operand. d refers to an eight-bit value,
|
|||
|
usually (but not always) a direct page address. a refers
|
|||
|
to a 16-bit absolute address. l refers to a 24-bit long
|
|||
|
address n is a displacement of the form one byte, +/-
|
|||
|
64 if the H.O. bit is zero. two bytes, H.O. byte
|
|||
|
first, +/- 16383 if the H.O. bit is one.
|
|||
|
|
|||
|
All addressing mode containing F, FX, or FY are relative
|
|||
|
to the SBR register. Any "d" address appearing in such an
|
|||
|
addressing mode is simply an 8-bit displacement relative
|
|||
|
to the frame pointer. FX means add F and X and use the
|
|||
|
result as the frame pointer. FY is the same, but using
|
|||
|
the Y register.
|
|||
|
|
|||
|
Y+ and -Y are autoincrement and autodecrement addressing
|
|||
|
modes. For autoincrement, the Y register is incremented
|
|||
|
after the value contained in Y is used. For auto-
|
|||
|
decrement, the Y register is decremented before the value
|
|||
|
is used.
|
|||
|
|
|||
|
Addressing modes of the form n[---]-- compute the
|
|||
|
effective address specified by the indirect operation and
|
|||
|
then add the specified offset to the effective address to
|
|||
|
obtain the true effective address. For example, if Y
|
|||
|
contains 5 and location $00 points at $1000 in the DBR,
|
|||
|
then 4(0),y refers to location $1009.
|
|||
|
|
|||
|
The TOS addressing mode refers to the Top Of Stack, more
|
|||
|
on this later.
|
|||
|
|
|||
|
|
|||
|
General Instructions:
|
|||
|
|
|||
|
The general instructions (opcodes $00..$11) all take the
|
|||
|
form:
|
|||
|
|
|||
|
Instr Source, Dest
|
|||
|
|
|||
|
Where Instr is the instruction mnemonic, Source is the
|
|||
|
address of a source operand, and Dest is the address of a
|
|||
|
destination operand. At least one of the two operands
|
|||
|
must be a "register" addressing mode. The register
|
|||
|
addressing modes are the first eight addressing modes
|
|||
|
listed above. If the source operand is a register
|
|||
|
addressing mode, then the direction bit in the
|
|||
|
instruction is zero, otherwise it is one. If the source
|
|||
|
addressing mode is the immediate addressing mode, the
|
|||
|
flags are set by the result of the operation, but nothing
|
|||
|
else is changed. For example, MOVB #0,#n sets the zero
|
|||
|
flag since a zero bit is moved, but the zero isn't
|
|||
|
actually moved anywhere. Note that a "B" or "W" suffix
|
|||
|
is used on the mnemonics to specify the instruction size.
|
|||
|
|
|||
|
Three important register addressing mode greatly enhance
|
|||
|
the capabilities of the 65c820 processor: the TOS, Imm,
|
|||
|
and d register addressing modes. Since d is a register
|
|||
|
addressing mode, any direct page memory location can be
|
|||
|
used as a "register". This greatly enhances the
|
|||
|
flexibility of the 65c820. This effectively gives you
|
|||
|
256 registers to play around with.
|
|||
|
|
|||
|
The Imm addressing mode, since it is a register
|
|||
|
addressing mode, lets you perform operations between any
|
|||
|
register or memory location in the machine (addressable
|
|||
|
by a single instruction) with an immediate operand. For
|
|||
|
example, CMPB #5,2[3,D],Y is perfectly legal. For a
|
|||
|
few instructions, immediate operands don't make much
|
|||
|
sense, such instructions will cause an illegal
|
|||
|
instruction trap (for example, you cannot load the
|
|||
|
effective address of an immediate operand into a
|
|||
|
register).
|
|||
|
|
|||
|
The TOS addressing mode is extremely powerful. If
|
|||
|
you've looked ahead at the expansion instructions, you'd
|
|||
|
have noticed that there aren't any specific push or pop
|
|||
|
instructions (unless you count ENTER, EXIT, SAVE, and
|
|||
|
RESTORE). The TOS addressing mode handles all of this
|
|||
|
for you. You want to push the accumulator onto the
|
|||
|
stack? No problem, MOVW A,TOS will do the job. You
|
|||
|
want to pop the X register off of the stack? Use MOVW
|
|||
|
TOS,X. You want to add the item on the top of stack to
|
|||
|
the item below it on the stack (a VERY common operation
|
|||
|
performed by compilers), just use ADDW TOS,TOS. This
|
|||
|
instruction will pop two words off of the stack, add
|
|||
|
them, and push their sum back onto the stack (leaving two
|
|||
|
bytes on the stack rather than the original four). With
|
|||
|
the TOS addressing mode, you can push (or pop) any value
|
|||
|
anywhere in addressable memory onto the stack with a
|
|||
|
single instruction.
|
|||
|
|
|||
|
|
|||
|
Special (but not expansion) Instructions:
|
|||
|
|
|||
|
There are seven groups of instructions in this category:
|
|||
|
ADDQ, CMPQ, Scc, Ccc, Icc, and the branch instructions.
|
|||
|
|
|||
|
ADDQ (add quick) appears in place of the ubiquitous INC
|
|||
|
and DEC instructions. ADDQ lets you add a four-bit signed
|
|||
|
value to any addressable item. The register bits, along
|
|||
|
with the direction bit, let you specify a signed four-bit
|
|||
|
value. This value is added to the specfied address. The
|
|||
|
immediate operand MUST be the source operand.
|
|||
|
|
|||
|
The CMPQ (compare quick) is similar except a compare
|
|||
|
operation is performed rather than an addition.
|
|||
|
Furthermore, the immediate operand is the destination
|
|||
|
operand rather than the source operand.
|
|||
|
|
|||
|
The Scc (set on condition), Ccc (clear on condition), and
|
|||
|
Icc (invert on condition) instructions are used to set
|
|||
|
boolean values based on the condition codes. These go
|
|||
|
hand in hand with the branch instructions so I'll
|
|||
|
describe them all at once.
|
|||
|
|
|||
|
There are 16 possible conditions, the register and
|
|||
|
direction bits are used to specify the condition. These
|
|||
|
conditions are
|
|||
|
|
|||
|
0- RA/A 4- HI 8- GT C- PL 1- CC/LO
|
|||
|
5- LT 9- EQ D- VC 2- CS/HS 6- GE
|
|||
|
A- NE E- VS 3- LS 7- LE B- MI
|
|||
|
F- SR/N
|
|||
|
|
|||
|
LO (lower) = unsigned less than HS (higher/same) =
|
|||
|
unsigned greater than or equal LS (lower/same) = unsigned
|
|||
|
less than or equal HI (higher) = unsigned greater than LT
|
|||
|
= signed less than GE = signed greater than or equal LE =
|
|||
|
signed greater than or equal GT = signed greater than
|
|||
|
|
|||
|
The Scc, Ccc, and Icc instructions take the form:
|
|||
|
|
|||
|
Scc{b|w} #Imm, Dest Ccc{b|w} #Imm,
|
|||
|
Dest Icc{b|w} #Imm, Dest
|
|||
|
|
|||
|
If the immediate operand is zero, then Scc will store a
|
|||
|
one into the specified location if the condition code is
|
|||
|
met, otherwise a one will be stored. Ccc does just the
|
|||
|
opposite, it stores a zero if the condition is met, one
|
|||
|
otherwise. The Icc instruction will complement the
|
|||
|
specified location (logical NOT) if the condition code is
|
|||
|
met. If the immediate operand is not zero, the the Scc
|
|||
|
in- struction will set the specified bits in the
|
|||
|
destination operand if the condition code is met, the Scc
|
|||
|
instruction will have no effect if the condition is not
|
|||
|
met. The Ccc instruction clears the specified bits in the
|
|||
|
destination operand. The Icc instruction inverts the
|
|||
|
specified bits. The destination bits are specified with
|
|||
|
ones in the immediate operand. For example, SCS #$88,
|
|||
|
$00 will set bits three and seven in memory location zero
|
|||
|
if the carry flag is set, location $00 will be
|
|||
|
unaffected if the carry flag is clear.
|
|||
|
|
|||
|
The SA/CA/IA (Set always, Clear always, Invert always)
|
|||
|
instructions always perform the specified operation. The
|
|||
|
SN/CN/IN (set never, clear never, invert never) behave as
|
|||
|
though the condition code was not met.
|
|||
|
|
|||
|
The branch instructions are unusual compared to those
|
|||
|
encountered thus far. The instruction is only one byte
|
|||
|
long. It takes the form:
|
|||
|
|
|||
|
7 6 5 4 3 2 1 0 o o o --1C or 1D--
|
|||
|
|
|||
|
If the opcode is $1C, then the three "o" bits represent
|
|||
|
condition codes 0..7 above. Note that the BRA instruction
|
|||
|
uses opcode bits %000.
|
|||
|
|
|||
|
If the opcode is $1D, then the three "o" bits represent
|
|||
|
condition codes 8..$F above. There is no BN instruction,
|
|||
|
Opcode %111 is the BSR (branch to subroutine)
|
|||
|
instruction.
|
|||
|
|
|||
|
Unlike the 65c816, branches are not limited to +/- 128
|
|||
|
bytes. A displacement value, similar to the used by the
|
|||
|
general addressing modes allows a one-byte displacement
|
|||
|
of +/- 64 bytes or +/- 16383 bytes. More than enough for
|
|||
|
most cases.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Math expansion instructions:
|
|||
|
|
|||
|
The math expansion instructions (opcode $14) use the
|
|||
|
three register bits as an opcode expansion yield eight
|
|||
|
additional instructions. The instruction format is
|
|||
|
|
|||
|
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 a a
|
|||
|
a a a a s d o o o 1 0 1 0 0 where "aaaaaa"
|
|||
|
is a general addressing mode, "s" is the size (B/W), "d"
|
|||
|
is the direction (load/store), and "ooo" is the sub-
|
|||
|
opcode, decode as follows:
|
|||
|
|
|||
|
0- MUL 1- DIV 2- MOD 3- REM 4- INDX 5- CHK
|
|||
|
6- DIVS 7- FPexp
|
|||
|
|
|||
|
Sub-opcode 7 is reserved for floating point expansion via
|
|||
|
a coprocessor. If the FPC bit in the SYSPSW is not set,
|
|||
|
then executing this opcode will cause an illegal
|
|||
|
instruction trap. If the FPC bit is set, then an
|
|||
|
additional eight bit opcode follows this instruction.
|
|||
|
This opcode value plus the physical address provided by
|
|||
|
the addressing mode, bounds registers, and applicable
|
|||
|
prefix(es) are passed along to the coprocessor.
|
|||
|
|
|||
|
All of these instructions use the 65c820 accumulator
|
|||
|
as the register operand. MULW performs an unsigned 16x16
|
|||
|
multiply, leaving the 32-bit result in A, AX. MULB
|
|||
|
performs an unsigned 8x8 multiply, leaving the result in
|
|||
|
A. DIVW performs an unsigned 32/16 division. The value
|
|||
|
in (A,AX) is divided by the specified operand and the
|
|||
|
quotient is left in (A,AX). DIVB divides the 16-bit
|
|||
|
accumulator by an eight bit value, leaving the result in
|
|||
|
A. DIVS{W|B} perform signed divisions. These two
|
|||
|
instructions operate on the 16-bit accumulator or 8-bit
|
|||
|
accumulator ONLY. The AX register is not used. MOD and
|
|||
|
REM compute the modulo and remainder functions (MOD is
|
|||
|
unsigned, REM is signed). Their register usage is
|
|||
|
identical to DIV/DIVS. There is no need for a signed
|
|||
|
multiply instruction since signed and unsigned
|
|||
|
multiplication produces the same result, assuming you
|
|||
|
ignore the value in AX.
|
|||
|
|
|||
|
The INDX and CHK instructions are used to perform
|
|||
|
array computations. The operand of these two
|
|||
|
instructions points at a pair of bytes or words. The
|
|||
|
INDX instruction multiplies the accumulator by the first
|
|||
|
value and then adds the second value to the accumulator.
|
|||
|
The direction bit in the opcode is ignored. The INDX
|
|||
|
instruction takes two forms: INDXB and INDXW.
|
|||
|
|
|||
|
The CHK instruction compares the value in the
|
|||
|
accumulator against the first and second values. If the
|
|||
|
accumulator lies within these two values (inclusive) then
|
|||
|
the overflow flag is cleared. If the accumulator is
|
|||
|
outside the range of these two values, then the overflow
|
|||
|
flag is set. The direction flag in the opcode is used to
|
|||
|
determine whether a signed or unsigned comparison is
|
|||
|
used. The CHK instruction takes four forms: CHKSB, CHKSW,
|
|||
|
CHKUB, and CHKUW. The "U" and "S" specify unsigned or
|
|||
|
signed.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
String expansion instructions:
|
|||
|
|
|||
|
Opcode $15 is used for string operations. The 65c820
|
|||
|
processor provides four basic string operations: MOVS
|
|||
|
(move), CMPS (compare), XLATS (translate), and FILLS
|
|||
|
(fill). The instruction format is as follows:
|
|||
|
|
|||
|
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 s s
|
|||
|
s d d d l l l o o 1 0 1 0 1
|
|||
|
|
|||
|
where "sss" is the source address, "ddd" is the
|
|||
|
destination address, "lll" is the length address, and
|
|||
|
"oo" is the opcode. Since all of the addresses are three
|
|||
|
bits, they must be register addresses. The source
|
|||
|
address is a sixteen bit value taken from one of the
|
|||
|
register addressing modes. The sixteen-bit value
|
|||
|
obtained at said address is the start of the string
|
|||
|
within the data bank (i.e., relative to the DBR
|
|||
|
register). The destination address is also a sixteen-bit
|
|||
|
register addressing mode value, specifying the start of
|
|||
|
the destination address within the auxillary bank (i.e.,
|
|||
|
relative to the ABR register). The length value is a
|
|||
|
sixteen-bit quantity obtained directly from the register
|
|||
|
addressing mode location. Prefix bytes (described later
|
|||
|
on) are not allowed in front of a string instruction.
|
|||
|
|
|||
|
Opcode assignments:
|
|||
|
|
|||
|
0- MOVS 1- CMPS 2- XLATS 3- FILLS
|
|||
|
|
|||
|
The direction of the string operation is specified by
|
|||
|
the "dir" bit in the USRPSW register. If the bit is
|
|||
|
clear, then the source and destination operands are
|
|||
|
incremented after each string operation. If the "dir"
|
|||
|
bit is clear, then these operands are decremented after
|
|||
|
each string operation.
|
|||
|
|
|||
|
The string instructions take the form:
|
|||
|
|
|||
|
MNEMONIC src, dest, len
|
|||
|
|
|||
|
where src, dest, and len are any of A, X, Y, S, F, TOS,
|
|||
|
#value, or a direct page address. For the these
|
|||
|
operands, the sixteen bit value specified by one of these
|
|||
|
addresses is used, relative to the DBR, as the address
|
|||
|
(or length) of the specified block. An absolute address
|
|||
|
can be specified by an immediate operand. The direct
|
|||
|
page address is the address of the 16-bit value within
|
|||
|
the direct page, it does not mean that the address of the
|
|||
|
block is that address in the direct page. Same with the
|
|||
|
TOS, the value on the top of stack contains the address,
|
|||
|
the top of stack is not the block itself. The len
|
|||
|
operand is always a byte count. Unless an immediate
|
|||
|
operand is specified, the operands are always updated to
|
|||
|
reflect their new value at the termination of the block
|
|||
|
operation.
|
|||
|
|
|||
|
The MOVS instruction is used to move a string of
|
|||
|
bytes from one location to another. A block of "len"
|
|||
|
bytes specified by DBR/src is moved to ABR/dest.
|
|||
|
|
|||
|
The MOVS operation is an example of an instruction
|
|||
|
that does not exactly mirror its 65c816 counterpart. It
|
|||
|
may take two (or more) instructions to perform the same
|
|||
|
operation as the 65c816 MVN and MVP instructions, since
|
|||
|
the direction flag may require adjustment before
|
|||
|
performing a MOVS instruction. Futhermore, the ABR and
|
|||
|
DBR registers may need adjustment before and after the
|
|||
|
MOVS instruction to simulate the MVN and MVP
|
|||
|
instructions. Finally, the actual count is specified by
|
|||
|
length, not count-1 (as on the 65c816), so this may
|
|||
|
require some adjustment if you are translating 65c816
|
|||
|
code instruction by instruction.
|
|||
|
|
|||
|
Example:
|
|||
|
|
|||
|
MVN 0,1
|
|||
|
|
|||
|
can be simulated by
|
|||
|
|
|||
|
MOVW #0,DBR MOVW #1,ABR ADDQ #1,
|
|||
|
A ;Since MVN assumes A contains count-1 MOVS
|
|||
|
X,Y,A MOVW #1,DBR
|
|||
|
|
|||
|
|
|||
|
The CMPS operation compares the two specified
|
|||
|
strings. It does a byte by byte comparison until length
|
|||
|
bytes are compared or a character in the source string is
|
|||
|
not equal to the corresponding character in the
|
|||
|
destination string. The condition codes are set to
|
|||
|
reflect the ordinality of the two strings (so you can use
|
|||
|
any of the branch, Scc, Ccc, or Icc instructions to test
|
|||
|
the results). If the z flag is returned set, then the two
|
|||
|
strings are equal (through the specified length),
|
|||
|
otherwise the source and destination operands are updated
|
|||
|
to point at the differing chars and the length operand is
|
|||
|
updated to show the number of character processed thus
|
|||
|
far (assuming, of course, that these operands weren't
|
|||
|
immediate, in which case they would be ignored).
|
|||
|
|
|||
|
The XLATS instruction is used to translate values in
|
|||
|
a string. The source operand points at a table in the
|
|||
|
DBR. Each character in the dest string is used as an
|
|||
|
index into this table and the value fetched from the
|
|||
|
table is stored over the original character in the
|
|||
|
destination string.
|
|||
|
|
|||
|
The FILLS instruction is used to initialize a string
|
|||
|
with a fixed value. The source operand is an eight-bit
|
|||
|
value. It is stored in successive locations at ABR/dest
|
|||
|
for len bytes. If an immediate value is specified, a
|
|||
|
sixteen-bit value is encoded into the instruction, but
|
|||
|
only the L.O. eight bits are used.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Single byte expansion instructions:
|
|||
|
|
|||
|
These instructions take the form:
|
|||
|
|
|||
|
7 6 5 4 3 2 1 0 o o o 1 0 1 1 0
|
|||
|
|
|||
|
Where "ooo" is decoded as:
|
|||
|
|
|||
|
0- NOP 1- COP 2- BRK 3- SVC 4- RTS 5- RTL 6- RTI 7- EXIT
|
|||
|
|
|||
|
SVC is the "supervisor call" instruction. Its
|
|||
|
intended use is for making operating system calls. It is
|
|||
|
similar in function to the COP instruction.
|
|||
|
|
|||
|
EXIT is used to deallocate local variables in a
|
|||
|
procedure. It undoes the actions of the ENTER
|
|||
|
instruction. Basically it performs the following
|
|||
|
operations:
|
|||
|
|
|||
|
MOV F,S MOV TOS, F
|
|||
|
|
|||
|
The remaining instructions in this group are
|
|||
|
identical to their 65c816 counterparts, so they don't
|
|||
|
require any futher elaboration.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Single byte w/displacement expansion instructions:
|
|||
|
|
|||
|
These instructions take the form:
|
|||
|
|
|||
|
7 6 5 4 3 2 1 0 o o o 1 0 1 1 1
|
|||
|
|
|||
|
Where "ooo" is decoded as:
|
|||
|
|
|||
|
0- SAVE n 1- RESTORE n 2- reserved 3- reserved 4- RTS n
|
|||
|
5- RTL n 6- ADJSP n 7- ENTER n
|
|||
|
|
|||
|
The "n" value immediately following these
|
|||
|
instructions is a displacement value. If bit seven of
|
|||
|
the first byte following the opcode is zero, then the
|
|||
|
remaining six bits are used to specify a signed value in
|
|||
|
the range +/- 64. If bit seven is one, then the
|
|||
|
following 15 bits are used to specify a value in the
|
|||
|
range +/-16383. Except possibly for ADJSP, none of these
|
|||
|
instructions should ever require more than a single byte
|
|||
|
displacement.
|
|||
|
|
|||
|
SAVE is used to quickly push registers from the set
|
|||
|
[A,AX,X,Y,F,D,P] onto the stack. The instruction is
|
|||
|
followed by a single byte with bits 0..6 cor- responding
|
|||
|
to these registers. Bit seven must always be zero.
|
|||
|
|
|||
|
RESTORE does just the opposite of SAVE, it pops the
|
|||
|
specified registers off of the stack.
|
|||
|
|
|||
|
RTS n and RTL n perform the specified return from
|
|||
|
subroutine operations and then add the specified
|
|||
|
displacement to the stack pointer after the return
|
|||
|
address has been popped. This provides a convenient
|
|||
|
mechanism whereby parameters can be removed from the
|
|||
|
stack.
|
|||
|
|
|||
|
The ADJSP n instruction adds the displacement value
|
|||
|
to the stack pointer. This is a shorter version of the
|
|||
|
ADD #value,S instruction. A special case was created for
|
|||
|
this instruction because it gets used all the time in
|
|||
|
languages like "C" or "SDL/65" which allow a variable
|
|||
|
number of parameters.
|
|||
|
|
|||
|
The ENTER n instruction is used to set up an
|
|||
|
activation record when a procedure is initially entered.
|
|||
|
It performs the following operations:
|
|||
|
|
|||
|
MOVW F,TOS MOVW S, F ADJSP n
|
|||
|
|
|||
|
The EXIT instruction can be used to undo the effects of
|
|||
|
this instruction.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Prefix expansion instructions:
|
|||
|
|
|||
|
These instructions take the form:
|
|||
|
|
|||
|
7 6 5 4 3 2 1 0 o o o 1 1 0 0 0
|
|||
|
|
|||
|
where "ooo" is decoded as:
|
|||
|
|
|||
|
0-ABR prefix 1-SBR prefix 2-PBR prefix 3-word index
|
|||
|
prefix 4-dword index prefix 5-qword index prefix 6-
|
|||
|
XBA/SWA 7-EMU
|
|||
|
|
|||
|
XBA and EMU aren't true prefix bytes, they're just
|
|||
|
single byte instructions that didn't conveniently fit
|
|||
|
anywhere else. So I'll describe them first. XBA is
|
|||
|
identical to its 65c816 counterpart, it swaps the bytes
|
|||
|
in the accumulator. EMU switches from 65c820 native mode
|
|||
|
to 65c816 emulation mode. EMU is a privileged
|
|||
|
instruction and will cause a privileged instruction trap
|
|||
|
if executed from the user mode.
|
|||
|
|
|||
|
The first three prefix bytes are used to modify the
|
|||
|
bank used for data accesses. Addressing modes that
|
|||
|
normally access memory through the data bank register
|
|||
|
(which are all memory references except direct, long,
|
|||
|
TOS, and those involving F) can be "tweaked" to access
|
|||
|
memory through the auxillary, stack, or program bank
|
|||
|
registers by prefixing the address with the appropriate
|
|||
|
prefix. For example,
|
|||
|
|
|||
|
MOVW #275, ABR:$1000
|
|||
|
|
|||
|
stores 275 into location $1000 in the auxillary bank
|
|||
|
register rather than the data bank register. Indirect
|
|||
|
addresses of the form (a,X) and n(a,X) present a minor
|
|||
|
problem. Does the prefix specify the bank address of the
|
|||
|
absolute operand or the effective address? I've opted
|
|||
|
for requiring that the absolute operand reside in the
|
|||
|
data bank and the prefix byte determines the effective
|
|||
|
address bank.
|
|||
|
|
|||
|
Any addressing mode utilitizing the frame pointer
|
|||
|
register (F) is always relative to the stack bank
|
|||
|
register. Prefixes are only allowed for the following
|
|||
|
frame-based addressing modes: n(d,F), n(a,F), (d,F),
|
|||
|
(a,F), n(d,FX), and n(d,F),Y. The indirect address
|
|||
|
always comes out of the stack bank, the prefix applies to
|
|||
|
the computed effective address.
|
|||
|
|
|||
|
Although the ABR:/SBR:/PBR: lexemes immediately
|
|||
|
precede the address expression to which they apply (on
|
|||
|
the source line), in the object code, the prefix byte
|
|||
|
always precedes the instruction to which the prefix
|
|||
|
applies. If more than one prefix byte precedes an
|
|||
|
instruction, only the last one is used. If a prefix byte
|
|||
|
precedes an instruction to which the prefix doesn't make
|
|||
|
sense (a branch, for example), then the prefix byte is
|
|||
|
ignored. Finally, the prefix byte will be ignored if
|
|||
|
there isn't an applicable addressing mode in the current
|
|||
|
instruction. E.G.: byt $18 ;ABR prefix
|
|||
|
byte MOVW A,X ;ABR prefix has no meaning
|
|||
|
here.
|
|||
|
|
|||
|
|
|||
|
Three additional prefix bytes apply to the X and Y
|
|||
|
index registers. These are the word index prefix, dword
|
|||
|
index prefix, and qword index prefix. These prefix bytes
|
|||
|
provide scaled indexed addressing modes for the 65c820.
|
|||
|
Without one of these prefixes, the X and Y registers are
|
|||
|
always byte offsets. That is, when used as an index
|
|||
|
register, the contents of X or Y is added directly to the
|
|||
|
effective address being computed. When accessing words,
|
|||
|
pointers (double words), or eight byte values (e.g.,
|
|||
|
floating point) you have to manually adjust the index
|
|||
|
registers by a factor of 2, 4, or 8. The scaled index
|
|||
|
addressing prefix bytes let you avoid this problem. The
|
|||
|
word prefix multiplies the X or Y register value by two
|
|||
|
before using it in the effective address computation.
|
|||
|
Likewise, the dword and qword prefixes multiply X or Y by
|
|||
|
4 or 8 before using the value. In the source code, these
|
|||
|
prefix bytes are specified by the ":W", ":D", and ":Q"
|
|||
|
suffixes:
|
|||
|
|
|||
|
MOVW A,LBL,X:W MOVW
|
|||
|
$0,(PTR),Y:D MOVW $2, 2(PTR),Y:D
|
|||
|
MOVW F,(TBL,X:W) MOVW A,LBL,Y:Q
|
|||
|
|
|||
|
If multiple prefixes appear, only the last one is used.
|
|||
|
If the prefix doesn't apply to the next instruction, it
|
|||
|
is ignored.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Single operand expansion instructions:
|
|||
|
|
|||
|
The $1E expansion instructions are dedicated to
|
|||
|
instructions which require a single operand. The format
|
|||
|
for the opcodes is as follows:
|
|||
|
|
|||
|
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 a a
|
|||
|
a a a a s o o o o 1 1 1 1 0
|
|||
|
|
|||
|
where "aaaaaa" is a general addressing mode, "s" is the
|
|||
|
size (B/W), and "oooo" is one of the following opcodes:
|
|||
|
|
|||
|
0- NOT 8- LAX (load AX register) 1-
|
|||
|
NEG 9- SAX (store AX register) 2- ABS
|
|||
|
A- XAX (exchange AX register) 3- BOOL (0->0 else->1)
|
|||
|
B- LLB (load LBound register) 4- SEX
|
|||
|
C- LHB (load HBound register) 5- ZEX
|
|||
|
D- SLB (store LBound register) 6- JMP
|
|||
|
E- SHB (store HBound register) 7- JSR
|
|||
|
F- VAL (validate memory location)
|
|||
|
|
|||
|
|
|||
|
All of these instructions are followed by a single
|
|||
|
general address expression. Immediate operands are not
|
|||
|
allowed for any of these instructions.
|
|||
|
|
|||
|
NOT- logically compliments the specified value. NEG-
|
|||
|
takes the two's complement of the specified value. ABS-
|
|||
|
takes the absolute value of the specified location. BOOL-
|
|||
|
If the specified location is not zero, a one is stored
|
|||
|
into it.
|
|||
|
|
|||
|
SEX- (that's sign extension, not what you think). SEXB
|
|||
|
checks the high order bit of the specified byte and
|
|||
|
copies it into the H.O. byte of the corresponding
|
|||
|
address. For example, if X contains $0082 then SEXB X
|
|||
|
will store $FF82 into X. If X contains $0002, then SEXB X
|
|||
|
will store $0002 into X. SEXW sign extends the
|
|||
|
specified location into the AX register.
|
|||
|
|
|||
|
ZEX- zero extends the specified value. ZEXW simply
|
|||
|
stores a zero into AX. ZEXB stores a zero into the H.O.
|
|||
|
byte of the specified word.
|
|||
|
|
|||
|
JMP and JSR are like their 65c816 counterparts except any
|
|||
|
valid addressing mode can be used. Note that, unlike
|
|||
|
most other instructions, the result is assumed to be in
|
|||
|
the current program bank unless a long addressing mode is
|
|||
|
specified.
|
|||
|
|
|||
|
LAX, SAX, and XAX allow you to load, store, and exchange
|
|||
|
the contents of the AX register. Note that these three
|
|||
|
instructions plus SEX, ZEX, MUL, DIV, and MOD are the
|
|||
|
only instructions that deal with the AX register.
|
|||
|
|
|||
|
LLB, LHB, SLB, and SHB let you load and save the contents
|
|||
|
of the bounds registers. These are privileged
|
|||
|
instructions which will cause a privilege trap if
|
|||
|
executed from the user mode.
|
|||
|
|
|||
|
VAL- This instruction is used to validate a memory
|
|||
|
location. That is, it tests the specified memory
|
|||
|
location to see if it lies within the range specified by
|
|||
|
the bounds register. The address is a physical address,
|
|||
|
not a translated address. The overflow flag is set if a
|
|||
|
bounds violation would occur. Note that the M bit in the
|
|||
|
SYSPSW need not contain a particular value when using
|
|||
|
this instruction. This is a privileged instruction which
|
|||
|
will cause a privilege violation if executed in the user
|
|||
|
mode.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
BIT expansion instructions:
|
|||
|
|
|||
|
These instructions take the form:
|
|||
|
|
|||
|
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 a a
|
|||
|
a a a a s o o o o 1 1 1 1 1
|
|||
|
|
|||
|
"aaaaaa" is the destination addressing mode. "s" is the
|
|||
|
size (applicable only to MAND, MOR, and MXOR). "oooo" is
|
|||
|
the sub-opcode, decoded as:
|
|||
|
|
|||
|
0- INS dest, start, len 1- EXT dest, start, len 2- FFS
|
|||
|
dest, start, len 3- FFC dest, start, len 4- MAND dest,
|
|||
|
mask 5- MOR dest, mask 6- MXOR dest, mask
|
|||
|
|
|||
|
7..F- reserved.
|
|||
|
|
|||
|
INS is used to insert a value into a bit field. The
|
|||
|
value in the accumulator is shifted to the left "start"
|
|||
|
bits and the the "len" following bits are stored into
|
|||
|
the specified memory location. For example, if memory
|
|||
|
location $00 contains $F0 and the accumulator contains
|
|||
|
$3, then INS $00,2,4 would leave location $00 containing
|
|||
|
$CC. Note that you needn't specify byte or word size as
|
|||
|
this is intrinsic from the length.
|
|||
|
|
|||
|
EXT- extracts a bit field from some location and stores
|
|||
|
the right justified value into the accumulator (zeroing
|
|||
|
out any unused bits). For example, if memory location
|
|||
|
$00 contains $CC and the accumulator contains $FFFF, then
|
|||
|
EXT $0,2,4 would leave the accumulator containing 3 and
|
|||
|
location $00 containing $CC.
|
|||
|
|
|||
|
|
|||
|
FFS finds the first set bit in the specified location.
|
|||
|
The bit position is returned in the accumulator. If
|
|||
|
there were no set bits, the accumulator contains "len"+1.
|
|||
|
|
|||
|
FFC finds the first clear bit in a manner identical to
|
|||
|
FFS.
|
|||
|
|
|||
|
Some notes: These four instructions are followed by a
|
|||
|
single byte. The low order four bits contain the start
|
|||
|
value, the high order four bits contain the length-1.
|
|||
|
"start" + "len" must always be less than or equal to 15.
|
|||
|
FFS and FFC use the direction bit in the USRPSW to
|
|||
|
determine which way to progress in the bit field when
|
|||
|
searching for the set or clear bit.
|
|||
|
|
|||
|
|
|||
|
The MAND, MOR, and MXOR (masked AND, OR, and XOR) will
|
|||
|
AND, OR, or XOR the accumulator into the specified memory
|
|||
|
location. The difference between these three
|
|||
|
instructions and the standard AND, OR, and XOR
|
|||
|
instructions is that they are followed by a byte or word
|
|||
|
(depending on the instruction size) which contains a mask
|
|||
|
for the operation. Wherever a one bit appears in the
|
|||
|
mask, the logical operation will take place, wherever a
|
|||
|
zero bit appears, the destination's bits will be
|
|||
|
unaffected.
|
|||
|
|
|||
|
|
|||
|
_________________________________________________________
|
|||
|
__________________
|
|||
|
|
|||
|
That wraps up my proposed instruction set for the 65c820.
|
|||
|
I'll be happy to discuss my design decisions with anyone
|
|||
|
who's interested. The next step is to try and convince
|
|||
|
someone to actually build this thing! In the mean time,
|
|||
|
I might try writing an interpreter and assembler for it.
|
|||
|
|
|||
|
By the way. Many of you have probably recognized certain
|
|||
|
instructions from this processor or that processor
|
|||
|
sprinkled throughout. To set the record straight, most
|
|||
|
of my ideas have come from my own frustrations with the
|
|||
|
65c816, the 8086 family, and the National Semiconductor
|
|||
|
32000 family. Despite that fact that a lot of you think
|
|||
|
that Intel's parts stink because they're used by IBM,
|
|||
|
don't let that prejudice you against many of the design
|
|||
|
issues here. The 8086 does have a resonable
|
|||
|
archetecture, given the compromises it had to face. It's
|
|||
|
certainly better than the 65c816. I've incorporated a
|
|||
|
lot of the better ideas (like segment prefixes) into the
|
|||
|
design of the 65c820. Once again, don't downplay these
|
|||
|
powerful features just because you don't like IBM.
|
|||
|
|
|||
|
*** Randy Hyde
|
|||
|
|