IBM Personal Computer Assembly
Language Tutorial

Joshua Auerbach
Yale University
Yale Computer Center
175 Whitney Avenue
P. O. Box 2112
New Haven, Connecticut 06520
Installation Code YU
Integrated Personal Computers Project
Communications Group
Communications and Data Base Division
Session C316

IBM PC Assembly Language Tutorial 1

This talk is for people who are just getting started with the PC MACRO
Assembler. Maybe you are just contemplating doing some coding in
assembler, maybe you have tried it with mixed success. If you are here to
get aimed in the right direction, to get off to a good start with the
assembler, then you have come for the right reason. I can't promise you'll
get what you want, but I'll do my best.

On the other hand, if you have already turned out some working assembler
code, then this talk is likely to be on the elementary side for you. If
you want to review a few basics and have nowhere else pressing to go, then
by all means stay.

Why Learn Assembler?
--------------------

The reasons for LEARNING assembler are not the same as the reasons for
USING it in a particular application. But we have to start with some of
the reasons for using it, and then I think the reasons for learning it will
become clear.

First, let's dispose of a bad reason for using it. Don't use it just
because you think it is going to execute faster. A particular sequence of
ordinary bread-and-butter computations written in PASCAL, C, FORTRAN, or
compiled BASIC can do the job just about as fast as the same algorithm
coded in assembler. Of course, interpretive BASIC is slower, but if you
have a BASIC application which runs too slowly you probably want to try
compiling it before you think too much about translating parts of it to
another language.

On the other hand, high level languages do tend to isolate you from the
machine. That is both their strength and their weakness. Usually, when
implemented on a micro, a high level language provides an escape mechanism
to the underlying operating system or to the bare machine. So, for
example, BASIC has its PEEK and POKE. But the route to the bare machine
is often a circuitous one, leading to tricky programming which is hard to
follow.

For those of us working on PCs connected to SHARE-class mainframes, we are
generally concerned with three interfaces: the keyboard, the screen, and
the communication line or lines. All three of these entities raise
machine-dependent issues which are imperfectly addressed by the underlying
operating system or by high level languages.

Sometimes, the system or the language does too little for you. For
example, with the asynch adapter, the system provides no interrupt handler,
no buffer, and no flow control. The application is stuck with the
responsibility for monitoring that port and not missing any characters,
then deciding what to do with all errors. BASIC does a reasonable job on
some of this, but that is only BASIC. Most other languages do less.

Sometimes, the system may do too much for you. System support for the
keyboard is an example. At the hardware level, all 83 keys on the keyboard
send unique codes when they are pressed, held down, and released. But
someone has decided that certain keys, like Num Lock and Scroll Lock, are
going to do certain things before the application even sees them, and
therefore can't be used as ordinary keys.

Sometimes, the system does about the right amount of stuff but does it less
efficiently than it should. System support for the screen is in this
class. If you use only the official interface to the screen you sometimes
slow your application down unacceptably. I said before, don't use
assembler just to speed things up, but there I was talking about mainline
code, which generally can't be speeded up much by assembler coding. A
critical system interface is a different matter: sometimes we may have to
use assembler to bypass a hopelessly inefficient implementation. We don't
want to do this if we can avoid it, but sometimes we can't.

Assembly language code can overcome these deficiencies. In some cases, you
can also overcome these deficiencies by judicious use of the escape valves
which your high level language provides. In BASIC, you can PEEK and POKE
and INP and OUT your way around a great many issues. In many other
languages you can issue system calls and interrupts and usually manage, one
way or another, to modify system memory. Writing handlers to take real-time
hardware interrupts from the keyboard or asynch port, though, is still
going to be a problem in most languages. Some languages claim to let you
do it, but I have yet to see an acceptably clean implementation done that
way.

The real reason why assembler is better than "tricky POKEs" for writing
machine-dependent code, though, is the same reason why PASCAL is better
than assembler for writing a payroll package: it is easier to maintain.

Let the high level language do what it does best, but recognize that there
are some things which are best done in assembler code. The assembler,
unlike the tricky POKE, can make judicious use of equates, macros, labels,
and appropriately placed comments to show what is really going on in this
machine-dependent realm where it thrives.

So, there are times when it becomes appropriate to write in assembler;
given that, if you are a responsible programmer or manager, you will want
to be "assembler-literate" so you can decide when assembler code should be
written.

What do I mean by "assembler-literate"? I don't just mean understanding
the 8086 architecture; I think, even if you don't write much assembler code
yourself, you ought to understand the actual process of turning out
assembler code and the various ways to incorporate it into an application.
You ought to be able to tell good assembler code from bad, and appropriate
assembler code from inappropriate.

Steps to becoming ASSEMBLER-LITERATE
------------------------------------

1. Learn the 8086 architecture and most of the instruction set. Learn
   what you need to know and ignore what you don't. Reading: The 8086
   Primer by Stephen Morse, published by Hayden. You need to read only
   two chapters, the one on machine organization and the one on the
   instruction set.

2. Learn about a few simple DOS function calls. Know what services the
   operating system provides. If appropriate, learn a little about other
   systems too. It will aid portability later on. Reading: appendices D
   and E of the PC DOS manual.

3. Learn enough about the MACRO assembler and the LINKer to write some
   simple things that really work. Here, too, the main thing is figuring
   out what you don't need to know. Whatever you do, don't study the
   sample programs distributed with the assembler unless you have nothing
   better!

4. At the same time as you are learning the assembler itself, you will
   need to learn a few tools and concepts to properly combine your
   assembler code with the other things you do. If you plan to call
   assembler subroutines from a high level language, you will need to
   study the interface notes provided in your language manual. Usually,
   this forms an appendix of some sort. If you plan to package your
   assembler routines as .COM programs you will need to learn to do this.
   You should also learn to use DEBUG.

5. Read the Technical Reference, but very selectively. The most important
   things to know are the header comments in the BIOS listing. Next, you
   will want to learn about the RS 232 port and maybe about the video
   adapters.

Notice that the key thing in all five phases is being selective. It is
easy to conclude that there is too much to learn unless you can throw away
what you don't need. Most of the rest of this talk is going to deal with
this very important question of what you need and don't need to learn in
each phase. In some cases, I will have to leave you to do almost all of
the learning; in others, I will teach a few salient points, enough, I hope,
to get you started. I hope you understand that all I can do in an hour is
get you started on the way.

Phase 1: Learn the architecture and instruction set
----------------------------------------------------

The Morse book might seem like a lot of book to buy for just two really
important chapters; other books devote a lot more space to the instruction
set and give you a big beautiful reference page on each instruction. And
some of the other things in the Morse book, although interesting, really
aren't very vital and are covered too sketchily to be of any real help.

The reason I like the Morse book is that you can just read it; it has a
very conversational style, it is very lucid, it tells you what you really
need to know, and a little bit more which is by way of background; because
nothing really gets belabored too much, you can gracefully forget the
things you don't use. And I very much recommend READING Morse rather than
studying it. Get the big picture at this point.

Now, you want to concentrate on those things which are worth fixing in
memory. After you read Morse, you should relate what you have learned to
this outline.

1. You want to fix in your mind the idea of the four segment registers
   CODE, DATA, STACK, and EXTRA. This part is pretty easy to grasp. The
   8086 and the 8088 use 20 bit addresses for memory, meaning that they
   can address up to 1 megabyte of memory. But the registers and the
   address fields in all the instructions are no more than 16 bits long.
   So, how to address all of that memory? Their solution is to put
   together two 16 bit quantities like this:

   calculation    SSSS0   ---- value in the relevant segment register SHL 4
   depicted in     AAAA   ---- apparent address from register or instruction
   hexadecimal   --------
                  RRRRR   ---- real address placed on address bus

   In other words, any time memory is accessed, your program will supply a
   sixteen bit address. Another sixteen bit address is acquired from a
   segment register, left shifted four bits (one nibble) and added to it
   to form the real address. You can control the values in the segment
   registers and thus access any part of memory you want. But the segment
   registers are specialized: one for code, one for most data accesses,
   one for the stack (which we'll mention again) and one "extra" one for
   additional data accesses.
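
   As a worked illustration of the calculation above, here is a small
   fragment that forms the 20 bit real address of a segment/offset pair in
   DX:AX. It is only a sketch: the names SEGVAL and OFFVAL and their values
   are made up for the example, and the machine of course does this sum in
   hardware on every memory access.

```asm
SEGVAL  EQU     1234H           ;hypothetical segment register value
OFFVAL  EQU     0010H           ;hypothetical 16 bit apparent address

        MOV     AX,SEGVAL       ;AX = 1234H
        MOV     DX,AX           ;keep a copy for the high nibble
        MOV     CL,4
        SHL     AX,CL           ;AX = 2340H, the low 16 bits of SSSS0
        MOV     CL,12
        SHR     DX,CL           ;DX = 0001H, the top 4 bits of SSSS0
        ADD     AX,OFFVAL       ;AX = 2350H, apparent address added in
        ADC     DX,0            ;a carry, if any, goes into the high part
                                ;DX:AX = 0001:2350, real address 12350H
```

   Note that many different segment/offset pairs name the same real
   address; 1235:0000 lands on 12350H just as 1234:0010 does.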

   Most people, when they first learn about this addressing scheme, become
   obsessed with converting everything to real 20 bit addresses. After a
   while, though, you get used to thinking in segment/offset form. You
   tend to get your segment registers set up at the beginning of the
   program, change them as little as possible, and think just in terms of
   symbolic locations in your program, as with any assembly language.

   EXAMPLE:

        MOV    AX,DATASEG
        MOV    DS,AX             ;Set value of Data segment
        ASSUME DS:DATASEG        ;Tell assembler DS is usable
        .......
        MOV    AX,PLACE          ;Access storage symbolically by 16 bit address

   In the above example, the assembler knows that no special issues are
   involved because the machine generally uses the DS register to complete
   a normal data reference.

   If you had used ES instead of DS in the above example, the assembler
   would have known what to do, also. In front of the MOV instruction
   which accessed the location PLACE, it would have placed the ES segment
   prefix. This would tell the machine that ES should be used, instead of
   DS, to complete the address.

   Some conventions make it especially easy to forget about segment
   registers. For example, any program of the COM type gets control with
   all four segment registers containing the same value. This program
   executes in a simplified 64K address space. You can go outside this
   address space if you want but you don't have to.
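
   A minimal sketch of what that convention buys you: a complete COM
   program that touches storage without ever loading a segment register.
   The names CSEG, COUNT, and BEGIN are invented for the illustration; the
   ORG 100H and the INT 20H exit are the usual COM conventions.

```asm
CSEG    SEGMENT
        ASSUME  CS:CSEG,DS:CSEG,ES:CSEG,SS:CSEG
        ORG     100H            ;a COM program is entered at offset 100H
START:  JMP     BEGIN           ;hop over the data
COUNT   DW      0               ;data shares the one 64K segment with code
BEGIN:  MOV     AX,COUNT        ;DS already equals CS, so no setup is needed
        INC     AX
        MOV     COUNT,AX        ;plain 16 bit addresses reach everything
        INT     20H             ;return to DOS
CSEG    ENDS
        END     START
```

   The single ASSUME covers all four registers because they really do hold
   the same value when the program gets control.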

2. You will want to learn what other registers are available and learn
   their personalities:

      AX and DX are general purpose registers. They become special only
      when accessing machine and system interfaces.

      CX is a general purpose register which is slightly specialized for
      counting.

      BX is a general purpose register which is slightly specialized for
      forming base-displacement addresses.

      AX-DX can be divided in half, forming AH, AL, BH, BL, CH, CL, DH,
      DL.

      SI and DI are strictly 16 bit. They can be used to form indexed
      addresses (like BX) and they are also used to point to strings.

      SP is hardly ever manipulated. It is there to provide a stack.

      BP is a manipulable cousin to SP. Use it to access data which has
      been pushed onto the stack.

      Most sixteen bit operations are legal (even if unusual) when
      performed in SI, DI, SP, or BP.
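
   To see a few of these personalities at work, here is a sketch of a COM
   program that totals a short table of bytes. TABLE, TABLEN, and the
   data values are hypothetical; the point is only which register does
   which job.

```asm
CSEG    SEGMENT
        ASSUME  CS:CSEG,DS:CSEG,ES:CSEG,SS:CSEG
        ORG     100H
START:  JMP     BEGIN
TABLE   DB      10,20,30,40     ;hypothetical data
TABLEN  EQU     $-TABLE         ;number of bytes in TABLE
BEGIN:  MOV     BX,OFFSET TABLE ;BX: base address of the table
        MOV     CX,TABLEN       ;CX: the loop count
        XOR     AX,AX           ;AX: running total, starts at zero
NEXT:   ADD     AL,[BX]         ;add one byte into the low half, AL
        ADC     AH,0            ;let any carry ripple into the high half, AH
        INC     BX              ;step to the next byte
        LOOP    NEXT            ;decrement CX, repeat until it reaches zero
        INT     20H             ;AX = 100 (0064H) at this point
CSEG    ENDS
        END     START
```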

3. You will want to learn the classifications of operations available
   WITHOUT getting hung up in the details of how 8086 opcodes are
   constructed.

   8086 opcodes are complex. Fortunately, the assembler opcodes used to
   assemble them are simple. When you read a book like Morse, you will
   learn some things which are worth knowing but NOT worth dwelling on.

   a. 8086 and 8088 instructions can be broken up into subfields and bits
      with names like R/M, MOD, S and W. These parts of the instruction
      modify the basic operation in such ways as: whether it is 8 bit or
      16 bit; if 16 bit, whether all 16 bits of the data are given;
      whether the instruction is register to register, register to
      memory, or memory to register; for operands which are registers,
      which register; for operands which are memory, what base and index
      registers should be used in finding the data.

   b. Also, some instructions are actually represented by several
      different machine opcodes depending on whether they deal with
      immediate data or not, or on other issues, and there are some
      expedited forms which assume that one of the arguments is the most
      commonly used operand, like AX in the case of arithmetic.

   There is no point in memorizing any of this detail; just distill the
   bottom line, which is, what kinds of operand combinations EXIST in the
   instruction set and what kinds don't. If you ask the assembler to ADD
   two things and the two things are things for which there is a legal ADD
   instruction somewhere in the instruction set, the assembler will find
   the right instruction and fill in all the modifier fields for you.

   I guess if you memorized all the opcode construction rules you might
   have a crack at being able to disassemble hex dumps by eye, like you
   may have learned to do somewhat with 370 assembler. I submit to you
   that this feat, if ever mastered by anyone, would be in the same class
   as playing the "Minute Waltz" in a minute; a curiosity only.

   Here is the basic matrix you should remember:

      Two operands:          One operand:
       R <-- M                R
       M <-- R                M
       R <-- R                S  *
       R|M <-- I
       R|M <-- S  *
       S <-- R|M  *

      * -- data moving instructions (MOV, PUSH, POP) only
      S -- segment register (CS, DS, ES, SS)
      R -- ordinary register (AX, BX, CX, DX, SI, DI, BP, SP,
           AH, AL, BH, BL, CH, CL, DH, DL)
      M -- one of the following
              pure address
              [BX]+offset
              [BP]+offset
              any of the above indexed by SI
              any of the first three indexed by DI
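
   The matrix is easier to remember once you have seen each row written
   out. The sketch below is a COM program with one line per legal
   combination, plus two combinations that do not exist, left as comments.
   PLACE and its contents are hypothetical.

```asm
CSEG    SEGMENT
        ASSUME  CS:CSEG,DS:CSEG,ES:CSEG,SS:CSEG
        ORG     100H
START:  JMP     BEGIN
PLACE   DW      1,2,3,4         ;hypothetical storage
BEGIN:  MOV     AX,PLACE        ;R <-- M   (pure address)
        MOV     PLACE+2,AX      ;M <-- R
        MOV     BX,AX           ;R <-- R
        ADD     AX,5            ;R <-- I
        ADD     PLACE,5         ;M <-- I
        MOV     AX,DS           ;R <-- S   (data moving only)
        MOV     ES,AX           ;S <-- R   (data moving only)
        PUSH    DS              ;one operand, S
        POP     DS
        MOV     BX,OFFSET PLACE
        MOV     SI,2
        MOV     AX,[BX+4]       ;M written as [BX]+offset
        MOV     AX,[BX+SI+2]    ;the same form, indexed by SI
        INC     WORD PTR [BX]   ;one operand, M (the size must be stated)
;       MOV     PLACE,PLACE+2   ;M <-- M does not exist: go through a register
;       MOV     DS,1234H        ;S <-- I does not exist: load a register first
        INT     20H
CSEG    ENDS
        END     START
```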

4. Of course, you want to learn the operations themselves. As I've
   suggested, you want to learn the op codes as the assembler presents
   them, not as the CPU machine language presents them. So, even though
   there are many MOV op codes you don't need to learn them. Basically,
   here is the instruction set:

   a. Ordinary two operand instructions. These instructions perform an
      operation and leave the result in place of one of the operands.

      They are

      1) ADD and ADC -- addition, with or without including a carry from
         a previous addition

      2) SUB and SBB -- subtraction, with or without including a borrow
         from a previous subtraction

      3) CMP -- compare. It is useful to think of this as a subtraction
         with the answer being thrown away and neither operand actually
         changed

      4) AND, OR, XOR -- typical boolean operations

      5) TEST -- like an AND, except the answer is thrown away and
         neither operand is changed.

      6) MOV -- move data from source to target

      7) LDS, LES, LEA -- some specialized forms of MOV with side
         effects
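
      The ADD/ADC pairing is what lets a 16 bit machine do longer
      arithmetic. As a sketch, this COM program adds two 32 bit numbers
      held as pairs of words; NUM1, NUM2, and their values are invented
      for the example.

```asm
CSEG    SEGMENT
        ASSUME  CS:CSEG,DS:CSEG,ES:CSEG,SS:CSEG
        ORG     100H
START:  JMP     BEGIN
NUM1    DW      0FFFFH,0001H    ;0001FFFFH, low word stored first
NUM2    DW      0001H,0002H     ;00020001H
BEGIN:  MOV     AX,NUM1         ;low word of the first number
        MOV     DX,NUM1+2       ;high word (MOV leaves the flags alone)
        ADD     AX,NUM2         ;FFFFH + 0001H = 0000H with the carry set
        ADC     DX,NUM2+2       ;0001H + 0002H + carry = 0004H
        INT     20H             ;DX:AX = 0004:0000, that is 00040000H
CSEG    ENDS
        END     START
```

      SUB and SBB pair up the same way for long subtraction.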

   b. Ordinary one operand instructions. These can take any of the
      operand forms described above. Usually, they perform the operation
      and leave the result in the stated place:

      1) INC -- increment contents

      2) DEC -- decrement contents

      3) NEG -- twos complement

      4) NOT -- ones complement

      5) PUSH -- value goes on stack (operand location itself unchanged)

      6) POP -- value taken from stack, replaces current value

   c. Now you touch on some instructions which do not follow the general
      operand rules but which require the use of certain registers. The
      important ones are

      1) The multiply and divide instructions

      2) The "adjust" instructions which help in performing arithmetic
         on ASCII or packed decimal data

      3) The shift and rotate instructions. These have a restriction on
         the second operand: it must either be the immediate value 1 or
         the contents of the CL register.

      4) IN and OUT, which receive data from or send data to one of the
         1024 hardware ports.

      5) CBW and CWD -- convert byte to word or word to doubleword by
         sign extension
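
      A short fragment shows how these register requirements look in
      practice. It is a sketch using registers only, with values chosen
      so the results are easy to check by hand; it would sit inside a
      code segment like any other sequence of instructions.

```asm
        MOV     AL,-3           ;AL = FDH
        CBW                     ;sign extend AL into AX: AX = FFFDH
        MOV     BX,7
        IMUL    BX              ;signed multiply: DX:AX = AX * BX = -21
        MOV     CL,3
        SHL     BX,CL           ;a count other than 1 goes in CL: BX = 56
        SHR     BX,1            ;a count of 1 is written directly: BX = 28
        MOV     AX,100
        CWD                     ;sign extend AX into DX:AX before dividing
        IDIV    BX              ;AX = quotient 3, DX = remainder 16
```

      Notice that the multiply and divide never name AX or DX; their use
      is built into the instructions.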

   d. Flow of control instructions. These deserve study in themselves
      and we will discuss them a little more. They include

      1) CALL, RET -- call and return

      2) INT, IRET -- interrupt and return-from-interrupt

      3) JMP -- jump or "branch"

      4) LOOP, LOOPNZ, LOOPZ -- special (and useful) instructions which
         implement a counted loop similar to the 370 BCT instruction

      5) various conditional jump instructions

   e. String instructions. These implement a limited storage-to-storage
      instruction subset and are quite powerful. All of them have the
      property that

      1) The source of data is described by the combination DS and SI.

      2) The destination of data is described by the combination ES and
         DI.

      3) As part of the operation, the SI and/or DI register(s) is(are)
         incremented or decremented so the operation can be repeated.

      They include

      1) CMPSB/CMPSW -- compare byte or word

      2) LODSB/LODSW -- load byte or word into AL or AX

      3) STOSB/STOSW -- store byte or word from AL or AX

      4) MOVSB/MOVSW -- move byte or word

      5) SCASB/SCASW -- compare byte or word with contents of AL or AX

      6) REP/REPE/REPNE -- a prefix which can be combined with any of
         the above instructions to make them execute repeatedly across a
         string of data whose length is held in CX.

   f. Flag instructions: CLI, STI, CLD, STD, CLC, STC. These can set or
      clear the interrupt (enabled), direction (for string operations),
      or carry flags.

   The addressing summary and the instruction summary given above mask a
   lot of annoying little exceptions. For example, you can't POP CS, and
   although the R <-- M form of LES is legal, the M <-- R form isn't, and
   so on. My advice is

   a. Go for the general rules

   b. Don't try to memorize the exceptions

   c. Rely on common sense and the assembler to teach you about
      exceptions over time. A lot of the exceptions cover things you
      wouldn't want to do anyway.

5. A few instructions are rich enough and useful enough to warrant careful
   study. Here are a few final study guidelines:

   a. It is well worth the time learning to use the string instruction
      set effectively. Among the most useful are

         REP MOVSB           ;moves a string

         REP STOSB           ;initializes memory

         REPNE SCASB         ;look up occurrence of character in string

         REPE CMPSB          ;compare two strings
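
      What these one-liners leave out is the setup: DS:SI, ES:DI, CX, and
      the direction flag all have to be right first. The sketch below is a
      COM program (so DS and ES are already equal) that copies a string
      and then looks for a comma in the copy. SOURCE, TARGET, and the text
      are hypothetical.

```asm
CSEG    SEGMENT
        ASSUME  CS:CSEG,DS:CSEG,ES:CSEG,SS:CSEG
        ORG     100H
START:  JMP     BEGIN
SOURCE  DB      'HELLO, PC'     ;hypothetical data
SRCLEN  EQU     $-SOURCE
TARGET  DB      SRCLEN DUP (?)
BEGIN:  CLD                     ;direction flag clear: SI and DI count upward
        MOV     SI,OFFSET SOURCE ;DS:SI describes the source
        MOV     DI,OFFSET TARGET ;ES:DI describes the destination
        MOV     CX,SRCLEN       ;CX holds the length
        REP     MOVSB           ;copy CX bytes
        MOV     DI,OFFSET TARGET
        MOV     CX,SRCLEN
        MOV     AL,','          ;the character to look for
        REPNE   SCASB           ;scan until a match or until CX runs out
        JNE     NOTFND          ;zero flag clear: no comma in the string
        DEC     DI              ;DI has stepped past the match; back up one
NOTFND: INT     20H
CSEG    ENDS
        END     START
```

      The DEC DI after a successful scan is the detail most often
      forgotten: the pointer is advanced even on the byte that matched.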

   b. Similarly, if you have never written for a stack machine before,
      you will need to exercise PUSH and POP and get very comfortable
      with them because they are going to be good friends. If you are
      used to the 370, with lots of general purpose registers, you may
      find yourself feeling cramped at first, with many fewer registers
      and many instructions having register restrictions. But you have
      a hidden ally: you need a register and you don't want to throw
      away what's in it? Just PUSH it, and when you are done, POP it
      back. This can lead to abuse. Never have more than two
      "expedient" PUSHes in effect and never leave something PUSHed
      across a major header comment or for more than 15 instructions or
      so. An exception is the saving and restoring of registers at
      entrance to and exit from a subroutine; here, if the subroutine is
      long, you should probably PUSH everything which the caller may need
      saved, whether you will use the register or not, and POP it in
      reverse order at the end.

      Be aware that CALL and INT push return address information on the
      stack and RET and IRET pop it off. It is a good idea to become
      familiar with the structure of the stack.
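
      Here is a sketch of the subroutine convention just described. The
      routine BEEP3 is invented for the illustration; it sounds the bell
      three times through DOS function 02 and hands every register back
      the way it found it.

```asm
CSEG    SEGMENT
        ASSUME  CS:CSEG,DS:CSEG,ES:CSEG,SS:CSEG
        ORG     100H
START:  MOV     CX,5            ;the caller is using CX and DX
        MOV     DX,7
        CALL    BEEP3           ;CALL pushes the return address
        INT     20H             ;CX and DX are still 5 and 7 here

BEEP3   PROC    NEAR            ;hypothetical routine: ring the bell 3 times
        PUSH    AX              ;save what the caller may need
        PUSH    CX
        PUSH    DX
        MOV     CX,3
AGAIN:  MOV     DL,7            ;ASCII BEL
        MOV     AH,2            ;DOS function 02: write screen character
        INT     21H
        LOOP    AGAIN
        POP     DX              ;restore in reverse order
        POP     CX
        POP     AX
        RET                     ;RET pops the return address
BEEP3   ENDP
CSEG    ENDS
        END     START
```

      If the PUSHes and POPs do not balance, RET will pop a saved register
      instead of the return address and the program will go off into the
      weeds.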

   c. In practice, to invoke system services you will use the INT
      instruction. It is quite possible to use this instruction
      effectively in a cookbook fashion without knowing precisely how it
      works.

   d. The transfer of control instructions (CALL, RET, JMP) deserve
      careful study to avoid confusion. You will learn that these can be
      classified as follows:

      1) all three have the capability of being either NEAR (CS register
         unchanged) or FAR (CS register changed)

      2) JMPs and CALLs can be DIRECT (target is assembled into
         instruction) or INDIRECT (target fetched from memory or register)

      3) if NEAR and DIRECT, a JMP can be SHORT (less than 128 bytes
         away) or LONG

      In general, the third issue is not worth worrying about. On a
      forward jump which is clearly VERY short, you can tell the assembler
      it is short and save one byte of code:

         JMP SHORT CLOSEBY

      On a backward jump, the assembler can figure it out for you. On a
      forward jump of dubious length, let the assembler default to a LONG
      form; at worst you waste one byte.

      Also leave the assembler to worry about how the target address is
      to be represented, in absolute form or relative form.

   e. The conditional jump set is rather confusing when studied apart
      from the assembler, but you do need to get a feeling for it. The
      interactions of the sign, carry, and overflow flags can get your
      mind stuttering pretty fast if you worry about it too much. What
      it boils down to, though, is

         JZ        means what it says

         JNZ       means what it says

         JG reater this means "if the SIGNED difference is positive"

         JA bove   this means "if the UNSIGNED difference is positive"

         JL ess    this means "if the SIGNED difference is negative"

         JB elow   this means "if the UNSIGNED difference is negative"

         JC arry   assembles the same as JB; it's an aesthetic choice
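
      The signed/unsigned distinction only bites when the high bit is on,
      so an example with FFH makes it concrete. This is a sketch of a
      register-only fragment; the labels ABOVE1, GREAT1, and DONE are
      invented.

```asm
        MOV     AL,0FFH         ;255 if read as unsigned, -1 if read as signed
        CMP     AL,1            ;sets the flags as for AL - 1
        JA      ABOVE1          ;taken: unsigned, 255 is above 1
        JMP     DONE            ;never reached with this value
ABOVE1: CMP     AL,1            ;same comparison, same flags
        JG      GREAT1          ;not taken: signed, -1 is not greater than 1
        JL      DONE            ;taken instead: signed, -1 is less than 1
GREAT1: NOP                     ;never reached with this value
DONE:   NOP
```

      The CMP is identical both times; only your choice of jump says
      whether you meant the operands as signed or unsigned.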

      You should understand that all conditional jumps are inherently
      DIRECT, NEAR, and "short"; the "short" part means that they can't
      go more than 128 bytes in either direction. Again, this is
      something you could easily imagine to be more of a problem than it
      is. I follow this simple approach:

      1) When taking an abnormal exit from a block of code, I always use
         an unconditional jump. Who knows how far you are going to end
         up jumping by the time the program is finished. For example, I
         wouldn't code this:

            TEST    AL,IDIBIT      ;Is the idiot bit on?
            JNZ     OYVEY          ;Yes. Go to general cleanup

         Rather, I would probably code this:

            TEST    AL,IDIBIT      ;Is the idiot bit on?
            JZ      NOIDIOCY       ;No. I am saved.
            JMP     OYVEY          ;Yes. What can we say...
         NOIDIOCY:

         The latter, of course, is a jump around a jump. Some would say
         it is evil, but I submit it is hard to avoid in this language.

      2) Otherwise, within a block of code, I use conditional jumps
         freely. If the block eventually grows so long that the
         assembler starts complaining that my conditional jumps are too
         long I

         a) consider reorganizing the block but

         b) also consider changing some conditional jumps to their
            opposite and using the "jump around a jump" approach as shown
            above.

   Enough about specific instructions!

6. Finally, in order to use the assembler effectively, you need to know
   the default rules for which segment registers are used to complete
   addresses in which situations.

   a. CS is used to complete an address which is the target of a NEAR
      DIRECT jump. On a NEAR INDIRECT jump, DS is used to fetch the
      address from memory but then CS is used to complete the address
      thus fetched. On FAR jumps, of course, CS is itself altered. The
      instruction counter is always implicitly pointing in the code
      segment.

   b. SS is used to complete an address if BP is used in its formation.
      Otherwise, DS is always used to complete a data address.

   c. On the string instructions, the target is always formed from ES and
      DI. The source is normally formed from DS and SI. If there is a
      segment prefix, it overrides the source not the target.
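
   The data-address defaults can be summed up in a few lines. The
   fragment below is a sketch meant to be read (or assembled, so you can
   look at the listing) rather than run, since it assumes BX, SI, and BP
   already hold sensible offsets.

```asm
        MOV     AX,[BX]         ;BX alone: DS completes the address
        MOV     AX,[BX+SI+2]    ;BX with an index: still DS
        MOV     AX,[BP+4]       ;BP in the address: SS completes it
        MOV     AX,ES:[BX]      ;explicit prefix: ES is used instead of DS
        MOV     AX,DS:[BP+4]    ;explicit prefix: DS is used instead of SS
```

   In the listing, the last two instructions each come out one byte
   longer than their unprefixed cousins; that byte is the segment prefix.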

Learning about DOS
------------------

I think the best way to learn about DOS internals is to read the technical
appendices in the manual. These are not as complete as we might wish, but
they really aren't bad; I certainly have learned a lot from them. What you
don't learn from them you might eventually learn via judicious disassembly
of parts of DOS, but that shouldn't really be necessary.

From reading the technical appendices, you learn that interrupts 20H
through 27H are used to communicate with DOS. Mostly, you will use
interrupt 21H, the DOS function manager.

The function manager implements a great many services. You request the
individual services by means of a function code in the AH register. For
example, by putting a nine in the AH register and issuing interrupt 21H you
tell DOS to print a message on the console screen.

Usually, but by no means always, the DX register is used to pass data for
the service being requested. For example, on the print message service
just mentioned, you would put the 16 bit address of the message in the DX
register. The DS register is also implicitly part of this argument, in
keeping with the universal segmentation rules.
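
Put together, the print message call looks like the sketch below, a
complete COM program. MSG and its text are invented for the example; the
one detail not mentioned above is that function 09 expects the message to
end with a dollar sign, which is not printed.

```asm
CSEG    SEGMENT
        ASSUME  CS:CSEG,DS:CSEG,ES:CSEG,SS:CSEG
        ORG     100H
START:  JMP     BEGIN
MSG     DB      'Hello from assembler',13,10,'$'  ;function 09 stops at the $
BEGIN:  MOV     DX,OFFSET MSG   ;DS:DX is the address of the message
        MOV     AH,9            ;function code 09: print message
        INT     21H             ;call the DOS function manager
        INT     20H             ;return to DOS
CSEG    ENDS
        END     START
```

Because this is a COM program, DS already points at the segment holding
MSG; in an EXE program you would have to load DS yourself first.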
|
|||
|
|
|||
|
In understanding DOS functions, it is useful to understand some history and
|
|||
|
also some of the philosophy of MS-DOS with regard to portability. General-
|
|||
|
ly, you will find, once you read the technical information on DOS and also
|
|||
|
the IBM technical reference, you will know more than one way to do almost
|
|||
|
anything. Which is best? For example, to do asynch adapter I/O, you can
|
|||
|
use the DOS calls (pretty incomplete), you can use BIOS, or you can go
|
|||
|
directly to the hardware. The same thing is true for most of the other
|
|||
|
primitive I/O (keyboard or screen) although DOS is more likely to give you
|
|||
|
added value in these areas. When it comes to file I/O, DOS itself offers
|
|||
|
more than one interface. For example, there are four calls which read data
|
|||
|
from a file.
|
|||
|
|
|||
|
The way to decide rationally among these alternatives is by understanding
|
|||
|
the tradeoffs of functionality versus portability. Three kinds of porta-
|
|||
|
bility need to be considered: machine portability, operating system porta-
|
|||
|
bility (for example, the ability to assemble and run code under CP/M 86)
|
|||
|
and DOS version portability (the ability for a program to run under older
|
|||
|
versions of DOS).
|
|||
|
|
|||
|
Most of the functions originally offered in DOS 1.0 were direct descendants
|
|||
|
of CP/M functions; there is even a compatibility interface so that programs
|
|||
|
which have been translated instruction for instruction from 8080 assembler
|
|||
|
to 8086 assembler might have a reasonable chance of running if they use
|
|||
|
only the core CP/M function set. Among the most generally useful in this
|
|||
|
original compatibility set are
|
|||
|
|
|||
|
09 -- print a full message on the screen
|
|||
|
0A -- get a console input line with full DOS editing
|
|||
|
0F -- open a file
|
|||
|
10 -- close a file (really needed only when writing)
|
|||
|
11 -- find first file matching a pattern
|
|||
|
12 -- find next file matching a pattern
|
|||
|
13 -- erase a file
|
|||
|
16 -- create a file
|
|||
|
17 -- rename a file
|
|||
|
1A -- set disk transfer address
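
As a sketch of how two of these calls fit together, the following small COM program prompts with function 09 and then collects a line with function 0A. The names GETLINE, PROMPT and BUFFER are made up for illustration; the buffer layout (maximum length, returned count, then the text) is the one the DOS manual gives for function 0A.

```asm
GETLINE SEGMENT
        ASSUME  CS:GETLINE,DS:GETLINE
        ORG     100H
MAIN:   JMP     BEGIN
PROMPT  DB      'Name? $'       ;function 09 stops printing at the dollar sign
BUFFER  DB      40              ;byte 0: room for 40 characters, CR included
        DB      ?               ;byte 1: DOS stores the count typed, CR excluded
        DB      40 DUP(?)       ;bytes 2 on: the text, ended by a carriage return
BEGIN:  MOV     DX,OFFSET PROMPT
        MOV     AH,9            ;print the prompt
        INT     21H
        MOV     DX,OFFSET BUFFER
        MOV     AH,0AH          ;read a line with full DOS editing
        INT     21H
        RET                     ;return to system
GETLINE ENDS
        END     MAIN
```

On return, the second byte of BUFFER tells you how many characters were typed.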
|
|||
|
|
|||
|
|
|||
|
The next set provide no function above what you can get with BIOS calls or
|
|||
|
more specialized DOS calls. However, they are preferable to BIOS calls
|
|||
|
when portability is an issue.
|
|||
|
|
|||
|
00 -- terminate execution
|
|||
|
01 -- read keyboard character
|
|||
|
02 -- write screen character
|
|||
|
03 -- read COM port character
|
|||
|
04 -- write COM port character
|
|||
|
05 -- print a character
|
|||
|
06 -- read keyboard or write screen with no editing
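
A minimal sketch of the character calls, assuming a COM program: it reads keys with function 01 (which echoes each one) and writes each key again with function 02, so every character shows up twice until you press Enter. KEYS and DONE are illustrative names only.

```asm
KEYS    SEGMENT
        ASSUME  CS:KEYS,DS:KEYS
        ORG     100H
MAIN:   MOV     AH,1            ;function 01: wait for a key, echo it, return it in AL
        INT     21H
        CMP     AL,0DH          ;a carriage return ends the loop
        JE      DONE
        MOV     DL,AL           ;function 02 takes its character in DL
        MOV     AH,2
        INT     21H
        JMP     MAIN
DONE:   MOV     AH,0            ;function 00: terminate execution
        INT     21H
KEYS    ENDS
        END     MAIN
```

Note that DX is not used as an address here; these calls pass a single character in DL or AL.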
|
|||
|
|
|||
|
The standard file I/O calls are inferior to the specialized DOS calls but
|
|||
|
have the advantage of making the program easier to port to CP/M style sys-
|
|||
|
tems. Thus they are worth mentioning:
|
|||
|
|
|||
|
14 -- sequential read from file
|
|||
|
15 -- sequential write to file
|
|||
|
21 -- random read from file
|
|||
|
22 -- random write to file
|
|||
|
23 -- determine file size
|
|||
|
24 -- set random record
|
|||
|
|
|||
|
In addition to the CP/M compatible services, DOS also offers some special-
|
|||
|
ized services which have been available in all releases of DOS. These
|
|||
|
include
|
|||
|
|
|||
|
27 -- multi-record random read.
|
|||
|
28 -- multi-record random write.
|
|||
|
29 -- parse filename
|
|||
|
2A-2D -- get and set date and time
|
|||
|
|
|||
|
All of the calls mentioned above which have anything to do with files make
|
|||
|
use of a data area called the "FILE CONTROL BLOCK" (FCB). The FCB is any-
|
|||
|
where from 33 to 37 bytes long depending on how it is used. You are
|
|||
|
responsible for creating an FCB and filling in the first 12 bytes, which
|
|||
|
contain a drive code, a file name, and an extension.
|
|||
|
|
|||
|
When you open the FCB, the system fills in the next 20 bytes, which
include a logical record length. The initial lrecl is always 128 bytes,
|
|||
|
to achieve CP/M compatibility. The system also provides other useful
|
|||
|
information such as the file size.
|
|||
|
|
|||
|
After you have opened the FCB, you can change the logical record length.
|
|||
|
If you do this, your program is no longer CP/M compatible, but that doesn't
|
|||
|
make it a bad thing to do. DOS documentation suggests you use a logical
|
|||
|
record length of one for maximum flexibility. This is usually a good
|
|||
|
recommendation.
|
|||
|
|
|||
|
To perform actual I/O to a file, you eventually need to fill in byte 33 or
|
|||
|
possibly bytes 34-37 of the FCB. Here you supply information about the
|
|||
|
record you are interested in reading or writing. For the most part, this
|
|||
|
part of the interface is compatible with CP/M.
|
|||
|
|
|||
|
In general, you do not need to (and should not) modify other parts of the
|
|||
|
FCB.
|
|||
|
|
|||
|
The FCB is pretty well described in appendix E of the DOS manual.
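
To make the FCB description concrete, here is a sketch of a COM program that builds an FCB, opens the file, and reads the first 128-byte record. The file name HELLO.ASM and the labels FCBDEMO, FCB and RECORD are assumptions for illustration. The byte numbers in the comments count from one, as in the text above.

```asm
FCBDEMO SEGMENT
        ASSUME  CS:FCBDEMO,DS:FCBDEMO
        ORG     100H
MAIN:   JMP     BEGIN
FCB     DB      0               ;byte 1: drive code (0 = default drive)
        DB      'HELLO   '      ;bytes 2-9: file name, blank padded
        DB      'ASM'           ;bytes 10-12: extension
        DB      25 DUP(0)       ;bytes 13-37: DOS fills in 13-32, byte 33 starts at 0
RECORD  DB      128 DUP(?)      ;disk transfer area for one record
BEGIN:  MOV     DX,OFFSET RECORD
        MOV     AH,1AH          ;set disk transfer address
        INT     21H
        MOV     DX,OFFSET FCB
        MOV     AH,0FH          ;open the file
        INT     21H
        CMP     AL,0            ;AL = 0 means the open worked
        JNE     DONE
        MOV     DX,OFFSET FCB
        MOV     AH,14H          ;sequential read of the next record into RECORD
        INT     21H             ;AL = 0 means a full record was read
DONE:   RET
FCBDEMO ENDS
        END     MAIN
```

Because the file is only read, the sketch skips the close call.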
|
|||
|
Beginning with DOS 2.0, there is a whole new system of calls for managing
|
|||
|
files which don't require that you build an FCB at all. These calls are
|
|||
|
quite incompatible with CP/M and also mean that your program cannot run
|
|||
|
under older releases of DOS. However, these calls are very nice and easy
|
|||
|
to use. They have these characteristics
|
|||
|
|
|||
|
1. To open, create, delete, or rename a file, you need only a character
|
|||
|
string representing its name.
|
|||
|
|
|||
|
2. The open and create calls return a 16 bit value which is simply placed
|
|||
|
in the BX register on subsequent calls to refer to the file.
|
|||
|
|
|||
|
3. There is not a separate call required to specify the data buffer.
|
|||
|
|
|||
|
4. Any number of bytes can be transferred on a single call; no data area
|
|||
|
must be manipulated to do this.
|
|||
|
|
|||
|
The "new" DOS calls also include comprehensive functions to manipulate the
|
|||
|
new chained directory structure and to allocate and free memory.
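
For comparison with the FCB interface, here is a minimal sketch of the same kind of read using the DOS 2.0 calls. The function numbers (3DH open, 3FH read, 3EH close) come from the DOS 2.0 manual rather than from this talk, and the file name and labels are illustrative assumptions.

```asm
HANDLES SEGMENT
        ASSUME  CS:HANDLES,DS:HANDLES
        ORG     100H
MAIN:   JMP     BEGIN
FNAME   DB      'HELLO.ASM',0   ;file name as a string ending in a zero byte
BUFFER  DB      512 DUP(?)
BEGIN:  MOV     DX,OFFSET FNAME
        MOV     AX,3D00H        ;function 3DH (open), AL = 0 for read access
        INT     21H
        JC      DONE            ;carry set: open failed, AX holds an error code
        MOV     BX,AX           ;the 16 bit value that names the file from now on
        MOV     DX,OFFSET BUFFER ;the buffer is named on the read call itself
        MOV     CX,512          ;any number of bytes in one call
        MOV     AH,3FH          ;function 3FH (read)
        INT     21H             ;AX = number of bytes actually read
        MOV     AH,3EH          ;function 3EH (close), handle still in BX
        INT     21H
DONE:   RET
HANDLES ENDS
        END     MAIN
```

There is no FCB, no separate disk transfer address call, and no record length to manage.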
|
|||
|
|
|||
|
Learning the assembler
|
|||
|
----------------------
|
|||
|
|
|||
|
It is my feeling that many people can teach themselves to use the assembler
|
|||
|
by reading the MACRO Assembler manual if
|
|||
|
|
|||
|
1. You have read and understood a book like Morse and thus have a feeling
|
|||
|
for the instruction set
|
|||
|
|
|||
|
2. You know something about DOS services and so can communicate with the
|
|||
|
keyboard and screen and do something marginally useful with files. In
|
|||
|
the absence of this kind of knowledge, you can't write meaningful prac-
|
|||
|
tice programs and so will not progress.
|
|||
|
|
|||
|
3. You have access to some good examples (the ones supplied with the
assembler are not good, in my opinion; I will try to supply you with
some more relevant ones).
|
|||
|
|
|||
|
4. You ignore the things which are most confusing and least useful. Some
|
|||
|
of the most confusing aspects of the assembler include the facilities
for combining segments. But, you can avoid using all but the simplest of
|
|||
|
these facilities in many cases, even while writing quite substantial
|
|||
|
applications.
|
|||
|
|
|||
|
5. The easiest kind of assembler program to write is a COM program. They
might seem harder, at first, than EXE programs because there is an
|
|||
|
extra step involved in creating the executable file, but COM programs
|
|||
|
are structurally very much simpler.
|
|||
|
|
|||
|
|
|||
|
At this point, it is necessary to talk about COM programs and EXE programs.
|
|||
|
|
|||
|
As you probably know, DOS supports two kinds of executable files. EXE pro-
|
|||
|
grams are much more general, can contain many segments, and are generally
|
|||
|
built by compilers and sometimes by the assembler. If you follow the lead
|
|||
|
given by the samples distributed with the assembler, you will end up with
|
|||
|
EXE programs. A COM program, in contrast, always contains just one
|
|||
|
segment, and receives control with all four segment registers containing
|
|||
|
the same value. A COM program, thus, executes in a simplified environment,
|
|||
|
a 64K address space. You can go outside this address space simply by tem-
|
|||
|
porarily changing one segment register, but you don't have to, and that is
|
|||
|
the thing which makes COM programs nice and simple. Let's look at a very
|
|||
|
simple one.
|
|||
|
|
|||
|
The classic text on writing programs for the C language says that the first
|
|||
|
thing you should write is a program which says
|
|||
|
|
|||
|
HELLO, WORLD.
|
|||
|
|
|||
|
when invoked. What's sauce for C is sauce for assembler, so let's start
|
|||
|
with a HELLO program of our own. My first presentation of this will be
|
|||
|
bare bones, not stylistically complete, but just an illustration of what an
|
|||
|
assembler program absolutely has to have:
|
|||
|
|
|||
|
HELLO   SEGMENT                 ;Set up HELLO code and data section
        ASSUME  CS:HELLO,DS:HELLO ;Tell assembler about conditions at entry
        ORG     100H            ;A .COM program begins with 100H byte prefix
MAIN:   JMP     BEGIN           ;Control must start here
MSG     DB      'Hello, world.$' ;But it is generally useful to put data first
BEGIN:  MOV     DX,OFFSET MSG   ;Let DX --> message.
        MOV     AH,9            ;Set DOS function code for printing a message
        INT     21H             ;Invoke DOS
        RET                     ;Return to system
HELLO   ENDS                    ;End of code and data section
        END     MAIN            ;Terminate assembler and specify entry point
|
|||
|
|
|||
|
First, let's attend to some obvious points. The macro assembler uses the
|
|||
|
general form
|
|||
|
|
|||
|
name opcode operands
|
|||
|
|
|||
|
Unlike the 370 assembler, though, comments are NOT set off from operands by
|
|||
|
blanks. The syntax uses blanks as delimiters within the operand field (see
|
|||
|
line 6 of the example) and so all comments must be set off by semi-colons.
|
|||
|
|
|||
|
Line comments are frequently set off with a semi-colon in column 1. I use
|
|||
|
this approach for block comments too, although there is a COMMENT statement
|
|||
|
which can be used to introduce a block comment.
|
|||
|
|
|||
|
Being an old 370 type, I like to see assembler code in upper case, although
|
|||
|
my comments are mixed case. Actually, the assembler is quite happy with
|
|||
|
mixed case anywhere.
|
|||
|
|
|||
|
|
|||
|
As with any assembler, the core of the opcode set consists of opcodes which
|
|||
|
generate machine instructions but there are also opcodes which generate
|
|||
|
data and ones which function as instructions to the assembler itself, some-
|
|||
|
times called pseudo-ops. In the example, there are five lines which gener-
|
|||
|
ate machine code (JMP, MOV, MOV, INT, RET), one line which generates data
|
|||
|
(DB) and five pseudo-ops (SEGMENT, ASSUME, ORG, ENDS, and END).
|
|||
|
|
|||
|
We will discuss all of them.
|
|||
|
|
|||
|
Now, about labels. You will see that some labels in the example end in a
|
|||
|
colon and some don't. This is just a bit confusing at first, but no real
|
|||
|
mystery. If a label is attached to a piece of code (as opposed to data),
|
|||
|
then the assembler needs to know what to do when you JMP to or CALL that
|
|||
|
label. By convention, if the label ends in a colon, the assembler will use
|
|||
|
the NEAR form of JMP or CALL. If the label does not end in a colon, it
|
|||
|
will use the FAR form. In practice, you will always use the colon on any
|
|||
|
label you are jumping to inside your program because such jumps are always
|
|||
|
NEAR; there is no reason to use a FAR jump within a single code section. I
|
|||
|
mention this, though, because leaving off the colon isn't usually trapped
as a syntax error; instead, it will generally cause something more abstruse
to go wrong.
|
|||
|
|
|||
|
On the other hand, a label attached to a piece of data or a pseudo-op never
|
|||
|
ends in a colon.
|
|||
|
|
|||
|
Machine instructions will generally take zero, one or two operands. Where
|
|||
|
there are two operands, the one which receives the result goes on the left
|
|||
|
as in 370 assembler.
|
|||
|
|
|||
|
I tried to explain this before, now maybe it will be even clearer: there
are many more 8086 machine opcodes than there are assembler opcodes to
represent them. For example, there are five kinds of JMP, four kinds of CALL,
|
|||
|
two kinds of RET, and at least five kinds of MOV depending on how you count
|
|||
|
them. The macro assembler makes a lot of decisions for you based on the
|
|||
|
form taken by the operands or on attributes assigned to symbols elsewhere
|
|||
|
in your program. In the example above, the assembler will generate the
|
|||
|
NEAR DIRECT form of JMP because the target label BEGIN labels a piece of
|
|||
|
code instead of a piece of data (this makes the JMP DIRECT) and ends in a
|
|||
|
colon (this makes the JMP NEAR). The assembler will generate the immediate
|
|||
|
forms of MOV because the form OFFSET MSG refers to immediate data and
|
|||
|
because 9 is a constant. The assembler will generate the NEAR form of RET
|
|||
|
because that is the default and you have not told it otherwise.
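
To illustrate the point about one assembler opcode standing for several machine opcodes, here is a sketch with several MOV statements that assemble to different machine instructions. FORMS and TOTAL are illustrative names; the opcode bytes in the comments are from the 8086 instruction set tables.

```asm
FORMS   SEGMENT
        ASSUME  CS:FORMS,DS:FORMS
        ORG     100H
MAIN:   JMP     BEGIN
TOTAL   DW      0               ;a word of storage
BEGIN:  MOV     AX,BX           ;register to register
        MOV     AX,1234H        ;immediate to register        (opcode B8)
        MOV     AX,TOTAL        ;memory to accumulator        (opcode A1)
        MOV     TOTAL,AX        ;accumulator to memory        (opcode A3)
        MOV     CX,TOTAL        ;memory to general register   (opcode 8B)
        MOV     ES,AX           ;register to segment register (opcode 8E)
        RET
FORMS   ENDS
        END     MAIN
```

You write MOV every time; the assembler picks the machine instruction from the operands.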
|
|||
|
|
|||
|
The DB (define byte) pseudo-op is an easy one: it is used to put one or
|
|||
|
more bytes of data into storage. There is also a DW (define word)
|
|||
|
pseudo-op and a DD (define doubleword) pseudo-op; in the PC MACRO assem-
|
|||
|
bler, the fact that a label refers to a byte of storage, a word of storage,
|
|||
|
or a doubleword of storage can be very significant in ways which we will
|
|||
|
see presently.
|
|||
|
|
|||
|
|
|||
|
About that OFFSET operator, I guess this is the best way to make the point
|
|||
|
about how the assembler decides what instruction to assemble: an analogy
|
|||
|
with 370 assembler:
|
|||
|
|
|||
|
PLACE   DC      ......
        ...
        LA      R1,PLACE
        L       R1,PLACE
|
|||
|
|
|||
|
In 370 assembler, the first instruction puts the address of label PLACE in
|
|||
|
register 1, the second instruction puts the contents of storage at label
|
|||
|
PLACE in register 1. Notice that two different opcodes are used. In the
|
|||
|
PC assembler, the analogous instructions would be
|
|||
|
|
|||
|
PLACE   DW      ......
        ...
        MOV     DX,OFFSET PLACE
        MOV     DX,PLACE
|
|||
|
|
|||
|
If PLACE is the label of a word of storage, then the second instruction
|
|||
|
will be understood as a desire to fetch that data into DX. If X is a
|
|||
|
label, then "OFFSET X" means "the ordinary number which represents X's off-
|
|||
|
set from the start of the segment." And, if the assembler sees an ordinary
|
|||
|
number, as opposed to a label, it uses the instruction which is equivalent
|
|||
|
to LA.
|
|||
|
|
|||
|
If PLACE were the label of a DB pseudo-op, instead of a DW, then
|
|||
|
|
|||
|
MOV DX,PLACE
|
|||
|
|
|||
|
would be illegal. The assembler worries about length attributes of its
|
|||
|
operands.
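
A short sketch of the length rule, using illustrative labels BPLACE and WPLACE. It also shows the PTR operator from the assembler manual (not covered in this talk), which lets you override a label's length attribute when you really mean it.

```asm
ATTRS   SEGMENT
        ASSUME  CS:ATTRS,DS:ATTRS
        ORG     100H
MAIN:   JMP     BEGIN
BPLACE  DB      12H,34H         ;two separate bytes
WPLACE  DW      5678H           ;one word
BEGIN:  MOV     DL,BPLACE       ;byte label, byte register: accepted
        MOV     DX,WPLACE       ;word label, word register: accepted
        MOV     DX,WORD PTR BPLACE ;override: both bytes as one word, DX = 3412H
        MOV     DL,BYTE PTR WPLACE ;override: low byte of the word, DL = 78H
        RET
ATTRS   ENDS
        END     MAIN
```

Without the overrides, the last two MOV statements would be rejected for mismatched operand sizes.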
|
|||
|
|
|||
|
Next, numbers and constants in general. The assembler's default radix is
|
|||
|
decimal. You can change this, but I don't recommend it. If you want to
|
|||
|
represent numbers in other forms of notation such as hex or bit, you gener-
|
|||
|
ally use a trailing letter. For example,
|
|||
|
|
|||
|
21H
|
|||
|
|
|||
|
is hexadecimal 21,
|
|||
|
|
|||
|
00010000B
|
|||
|
|
|||
|
is the eight bit binary number pictured.
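
As a quick sketch, here is the same value written several ways, plus two related cases worth knowing. The segment name CONSTS is illustrative; the Q suffix for octal and the leading-zero rule for hex are standard in the macro assembler.

```asm
CONSTS  SEGMENT
        ASSUME  CS:CONSTS,DS:CONSTS
        ORG     100H
MAIN:   JMP     BEGIN
        DB      33              ;decimal, the default radix
        DB      21H             ;hexadecimal, same value as 33
        DB      00100001B       ;binary, same value again
        DB      41Q             ;octal, same value again
        DB      0FFH            ;a hex number must start with a digit, hence the 0
        DB      'A'             ;a character constant, value 41H
BEGIN:  RET
CONSTS  ENDS
        END     MAIN
```

Written as FFH without the leading zero, the assembler would take the constant for a symbol name.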
|
|||
|
|
|||
|
The next elements we should point to are the SEGMENT...ENDS pair and the
|
|||
|
END instruction. Every assembler program has to have these elements.
|
|||
|
|
|||
|
|
|||
|
SEGMENT tells the assembler you are starting a section of contiguous mate-
|
|||
|
rial (code and/or data). The symmetrically named ENDS statement tells the
|
|||
|
assembler you are finished with a section of contiguous material. I wish
|
|||
|
they didn't use the word SEGMENT in this context. To me, a "segment" is a
|
|||
|
hardware construct: it is the 64K of real storage which becomes address-
|
|||
|
able by virtue of having a particular value in a segment register. Now, it
|
|||
|
is true that the "segments" you make with the assembler often correspond to
|
|||
|
real hardware "segments" at execution time. But, if you look at things
|
|||
|
like the GROUP and CLASS options supported by the linker, you will discover
|
|||
|
that this correspondence is by no means exact. So, at risk of maybe con-
|
|||
|
fusing you even more, I am going to use the more informal term "section" to
|
|||
|
refer to the area set off by means of the SEGMENT and ENDS instructions.
|
|||
|
|
|||
|
The sections delimited by SEGMENT...ENDS pairs are really a lot like CSECTs
|
|||
|
and DSECTs in the 370 world.
|
|||
|
|
|||
|
I strongly recommend that you be selective in your study of the SEGMENT
|
|||
|
pseudo-op as described in the manual. Let me just touch on it here.
|
|||
|
|
|||
|
name SEGMENT
|
|||
|
name SEGMENT PUBLIC
|
|||
|
name SEGMENT AT nnn
|
|||
|
|
|||
|
Basically, you can get away with just the three forms given above. The
|
|||
|
first form is what you use when you are writing a single section of assem-
|
|||
|
bler code which will not be combined with other pieces of code at link
|
|||
|
time. The second form says that this assembly only contains part of the
|
|||
|
section; other parts might be assembled separately and combined later by
|
|||
|
the linker.
|
|||
|
|
|||
|
I have found that one can construct reasonably large modular applications
|
|||
|
in assembler by simply making every assembly use the same segment name and
|
|||
|
declaring the name to be PUBLIC each time. If you read the assembler and
|
|||
|
linker documentation, you will also be bombarded by information about more
|
|||
|
complex options such as the GROUP statement and the use of other "combine
|
|||
|
types" and "classes." I don't recommend getting into any of that. I will
|
|||
|
talk more about the linker and modular construction of programs a little
|
|||
|
later. The assembler manual also implies that a STACK segment is required.
|
|||
|
|
|||
|
This is not really true. There are numerous ways to assure that you have a
|
|||
|
valid stack at execution time.
|
|||
|
|
|||
|
Of course, if you plan to write applications in assembler which are more
|
|||
|
than 64K in size, you will need more than what I have told you; but who is
|
|||
|
really going to do that? Any application that large is likely to be coded
|
|||
|
in a higher level language.
|
|||
|
|
|||
|
The third form of the SEGMENT statement makes the delineated section into
|
|||
|
something like a "DSECT;" that is, it doesn't generate any code, it just
|
|||
|
describes what is present somewhere already in the computer's memory.
|
|||
|
|
|||
|
|
|||
|
Sometimes the AT value you give is meaningful. For example, the BIOS work
area is located at segment address 40 hex. So, you might see
|
|||
|
|
|||
|
BIOSAREA SEGMENT AT 40H         ;Map BIOS work area
        ORG     BIOSAREA+10H
EQUIP   DB      ?               ;Location of equipment flags, first byte
BIOSAREA ENDS
|
|||
|
|
|||
|
in a program which was interested in mucking around in the BIOS work area.
|
|||
|
At other times, the AT value you give may be arbitrary, as when you are
|
|||
|
mapping a repeated control block:
|
|||
|
|
|||
|
PROGPREF SEGMENT AT 0           ;Really a DSECT mapping the program prefix
        ORG     PROGPREF+6
MEMSIZE DW      ?               ;Size of available memory
PROGPREF ENDS
|
|||
|
|
|||
|
Really, no matter whether the AT value represents truth or fiction, it is
|
|||
|
your responsibility, not the assembler's, to set up a segment register
|
|||
|
so that you can really reach the storage in question. So, you can't say
|
|||
|
|
|||
|
MOV AL,EQUIP
|
|||
|
|
|||
|
unless you first say something like
|
|||
|
|
|||
|
        MOV     AX,BIOSAREA     ;BIOSAREA becomes a symbol with value 40H
        MOV     ES,AX
        ASSUME  ES:BIOSAREA
|
|||
|
|
|||
|
Enough about SEGMENT. The END statement is simple. It goes at the end of
|
|||
|
every assembly. When you are assembling a subroutine, you just say
|
|||
|
|
|||
|
END
|
|||
|
|
|||
|
but when you are assembling the main routine of a program you say
|
|||
|
|
|||
|
END label
|
|||
|
|
|||
|
where 'label' is the place where execution is to begin.
|
|||
|
|
|||
|
Another pseudo-op illustrated in the program is ASSUME. ASSUME is like the
|
|||
|
USING statement in 370 assembler. However, ASSUME can ONLY refer to seg-
|
|||
|
ment registers. The assembler uses ASSUME information to decide whether to
|
|||
|
assemble segment override prefixes and to check that the data you are try-
|
|||
|
ing to access is really accessible. In this case, we can reassure the
|
|||
|
assembler that both the CS and DS registers will address the section called
|
|||
|
HELLO at execution time. Actually, the SS and ES registers will too, but
|
|||
|
the assembler never needs to make use of this information.
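
Here is a sketch of ASSUME at work outside the main section, reusing the BIOSAREA mapping shown earlier inside a complete COM program. The section name PEEK is made up, and the plain ORG 10H form is a simplifying assumption. ASSUME ES:NOTHING is the manual's way of withdrawing an earlier ASSUME.

```asm
BIOSAREA SEGMENT AT 40H         ;describes storage, generates nothing
        ORG     10H
EQUIP   DB      ?               ;equipment flags, first byte
BIOSAREA ENDS

PEEK    SEGMENT
        ASSUME  CS:PEEK,DS:PEEK
        ORG     100H
MAIN:   MOV     AX,BIOSAREA     ;load the value 40H
        MOV     ES,AX           ;this is what really makes the storage reachable
        ASSUME  ES:BIOSAREA     ;a promise to the assembler, not an instruction
        MOV     AL,EQUIP        ;assembled with an ES segment override prefix
        ASSUME  ES:NOTHING      ;withdraw the promise once ES may change
        RET
PEEK    ENDS
        END     MAIN
```

The ASSUME statements generate no code; only the two MOV instructions that load ES change anything at execution time.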
|
|||
|
|
|||
|
I guess I have explained everything in the program except that ORG
|
|||
|
pseudo-op. ORG means the same thing as it does in many assembly languages.
|
|||
|
It tells the assembler to move its location counter to some particular
|
|||
|
address. In this case, we have asked the assembler to start assembling
|
|||
|
code hex 100 bytes from the start of the section called HELLO instead of at
|
|||
|
the very beginning. This simply reflects the way COM programs are loaded.
|
|||
|
|
|||
|
When a COM program is loaded by the system, the system sets up all four
|
|||
|
segment registers to address the same 64K of storage. The first 100 hex
|
|||
|
bytes of that storage contain what is called the program prefix; this area
|
|||
|
is described in appendix E of the DOS manual. Your COM program physically
|
|||
|
begins after this. Execution begins with the first physical byte of your
|
|||
|
program; that is why the JMP instruction is there.
|
|||
|
|
|||
|
Wait a minute, you say, why the JMP instruction at all? Why not put the
|
|||
|
data at the end? Well, in a simple program like this I probably could have
|
|||
|
gotten away with that. However, I have the habit of putting data first and
|
|||
|
would encourage you to do the same because of the way the assembler has of
|
|||
|
assembling different instructions depending on the nature of the operand.
|
|||
|
|
|||
|
Unfortunately, sometimes the different choices of instruction which can
|
|||
|
assemble from a single opcode have different lengths. If the assembler has
|
|||
|
already seen the data when it gets to the instructions it has a good chance
|
|||
|
of reserving the right number of bytes on the first pass. If the data is
|
|||
|
at the end, the assembler may not have enough information on the first pass
|
|||
|
to reserve the right number of bytes for the instruction. Sometimes the
|
|||
|
assembler will complain about this, something like "Forward reference is
|
|||
|
illegal" but at other times, it will make some default assumption. On the
|
|||
|
second pass, if the assumption turned out to be wrong, it will report what
|
|||
|
is called a "Phase error," a very nasty error to track down. So get in the
|
|||
|
habit of putting data and equated symbols ahead of code.
|
|||
|
OK. Maybe you understand the program now. Let's walk through the steps
|
|||
|
involved in making it into a real COM file.
|
|||
|
|
|||
|
1. The file should be created with the name HELLO.ASM (actually the name
|
|||
|
is arbitrary but the extension .ASM is conventional and useful)
|
|||
|
|
|||
|
2.
|
|||
|
ASM HELLO,,;
|
|||
|
(this is just one example of invoking the assembler; it uses the small
|
|||
|
assembler ASM, it produces an object file and a listing file with the
|
|||
|
same name as the source file. I am not going exhaustively into how to
|
|||
|
invoke the assembler, which the manual goes into pretty well. I guess
|
|||
|
this is the first time I mentioned that there are really two
|
|||
|
assemblers; the small assembler ASM will run in a 64K machine and
|
|||
|
doesn't support macros. I used to use it all the time; now that I have
|
|||
|
a bigger machine and a lot of macro libraries I use the full function
|
|||
|
assembler MASM. You get both when you buy the package).
|
|||
|
|
|||
|
3. If you issue DIR at this point, you will discover that you have
|
|||
|
acquired HELLO.OBJ (the object code resulting from the assembly) and
|
|||
|
HELLO.LST (a listing file). I guess I can digress for a second here
|
|||
|
concerning the listing file. It contains TAB characters. I have found
|
|||
|
there are two good ways to get it printed and one bad way. The bad way
|
|||
|
is to use LPT1: as the direct target of the listing file or to try
|
|||
|
copying the LST file to LPT1 without first setting the tabs on the
|
|||
|
printer. The two good ways are to either
|
|||
|
|
|||
|
a. direct it to the console and activate the printer with CTRL-PRTSC.
|
|||
|
In this case, DOS will expand the tabs for you.
|
|||
|
|
|||
|
b. direct to LPT1: but first send the right escape sequence to LPT1 to
|
|||
|
set the tabs every eight columns. I have found that on some early
|
|||
|
serial numbers of the IBM PC printer, tabs don't work quite right,
|
|||
|
which forces you to the first option.
|
|||
|
|
|||
|
4.
|
|||
|
LINK HELLO;
|
|||
|
(again, there are lots of linker options but this is the simplest. It
|
|||
|
takes HELLO.OBJ and makes HELLO.EXE). HELLO.EXE? I thought we were
|
|||
|
making a COM program, not an EXE program. Right. HELLO.EXE isn't
|
|||
|
really executable; it's just that the linker doesn't know about COM
programs. That requires another utility. You don't have this utility if
|
|||
|
you are using DOS 1.0; you have it if you are using DOS 1.1 or DOS 2.0.
|
|||
|
Oh, by the way, the linker will warn you that you have no stack
|
|||
|
segment. Don't worry about it.
|
|||
|
|
|||
|
5.
|
|||
|
EXE2BIN HELLO HELLO.COM
|
|||
|
|
|||
|
This is the final step. It produces the actual program you will exe-
|
|||
|
cute. Note that you have to spell out HELLO.COM; for a nominally
|
|||
|
rational but actually perverse reason, EXE2BIN uses the default exten-
|
|||
|
sion BIN instead of COM for its output file. At this point, you might
|
|||
|
want to erase HELLO.EXE; it looks a lot more useful than it is.
|
|||
|
|
|||
|
Chances are you won't need to recreate HELLO.COM unless you change the
|
|||
|
source and then you are going to have to redo the whole thing.
|
|||
|
|
|||
|
6.
|
|||
|
HELLO
|
|||
|
|
|||
|
You type hello; that invokes the program, and it says
|
|||
|
|
|||
|
HELLO YOURSELF!!!
|
|||
|
(oops, what did I do wrong....?)
|
|||
|
|
|||
|
What about subroutines?
|
|||
|
-----------------------
|
|||
|
|
|||
|
I started with a simple COM program because I actually think they are easi-
|
|||
|
er to create than subroutines to be called from high level languages, but
|
|||
|
maybe it's really the latter you are interested in. Here, I think you
|
|||
|
should get comfortable with the assembler FIRST with little exercises like
|
|||
|
the one above and also another one which I will finish up with.
|
|||
|
|
|||
|
|
|||
|
Next you are ready to look at the interface information for your particular
|
|||
|
language. You usually find this in some sort of an appendix. For example,
|
|||
|
the BASIC manual has Appendix C on Machine Language Subroutines. The
|
|||
|
PASCAL manual buries the information a little more deeply: the interface
|
|||
|
to a separately compiled routine can be found in the Chapter on Procedures
|
|||
|
and Functions, in a subsection called Internal Calling Conventions.
|
|||
|
|
|||
|
Each language is slightly different, but here are what I think are some
|
|||
|
common issues in subroutine construction.
|
|||
|
|
|||
|
1. NEAR versus FAR? Most of the time, your language will probably call
|
|||
|
your assembler routine as a FAR routine. In this case, you need to
|
|||
|
make sure the assembler will generate the right kind of return. You do
|
|||
|
this with a PROC...ENDP statement pair. The PROC statement is probably
|
|||
|
a good idea for a NEAR routine too even though it is not strictly
|
|||
|
required:
|
|||
|
|
|||
|
FAR linkage:                    | NEAR linkage:
                                |
ARBITRARY SEGMENT               | SPECIFIC SEGMENT PUBLIC
          PUBLIC  THENAME       |          PUBLIC  THENAME
          ASSUME  CS:ARBITRARY  |          ASSUME  CS:SPECIFIC,DS:SPECIFIC
THENAME   PROC    FAR           |          ASSUME  ES:SPECIFIC,SS:SPECIFIC
          ..... code and data   | THENAME  PROC    NEAR
THENAME   ENDP                  |          ..... code and data ....
ARBITRARY ENDS                  | THENAME  ENDP
          END                   | SPECIFIC ENDS
                                |          END
|
|||
|
|
|||
|
With FAR linkage, it doesn't really matter what you call the segment.
You must declare the name by which you will be called in a PUBLIC
pseudo-op and also show that it is a FAR procedure. Only CS will be
initialized to your segment when you are called. Generally, the other
segment registers will continue to point to the caller's segments.
|
|||
|
|
|||
|
With NEAR linkage, you are executing in the same segment as the caller.
|
|||
|
Therefore, you must give the segment a specific name as instructed by
|
|||
|
the language manual. However, you may be able to count on all segment
|
|||
|
registers pointing to your own segment (sometimes the situation can be
|
|||
|
more complicated but I cannot really go into all of the details). You
|
|||
|
should be aware that the code you write will not be the only thing in
|
|||
|
the segment and will be physically relocated within the segment by the
|
|||
|
linker. However, all OFFSET references will be relocated and will be
|
|||
|
correct at execution time.
|
|||
|
|
|||
|
|
|||
|
2. Parameters passed on the stack. Usually, high level languages pass
|
|||
|
parameters to subroutines by pushing words onto the stack prior to
|
|||
|
calling you. What may differ from language to language is the nature
|
|||
|
of what is pushed (OFFSET only or OFFSET and SEGMENT) and the order in
|
|||
|
which it is pushed (left to right, right to left within the CALL state-
|
|||
|
ment). However, you will need to study the examples to figure out how
|
|||
|
to retrieve the parameters from the stack. A useful fact to exploit is
|
|||
|
the fact that a reference involving the BP register defaults to a ref-
|
|||
|
erence to the stack segment. So, the following strategy can work:
|
|||
|
|
|||
|
ARGS    STRUC
        DW      3 DUP(?)        ;Saved BP and return address
ARG3    DW      ?
ARG2    DW      ?
ARG1    DW      ?
ARGS    ENDS
        ...........
        PUSH    BP              ;save BP register
        MOV     BP,SP           ;Use BP to address stack
        MOV     ...,[BP].ARG2   ;retrieve second argument
        (etc.)
|
|||
|
|
|||
|
This example uses something called a structure, which is only available
|
|||
|
in the large assembler; furthermore, it uses it without allocating it,
|
|||
|
which is not a well-documented option. However, I find the above
|
|||
|
approach generally pleasing. The STRUC is like a DSECT in that it
|
|||
|
establishes labels as being offset a certain distance from an arbitrary
|
|||
|
point; these labels are then used in the body of code by beginning them
|
|||
|
with a period; the construction ".ARG2" means, basically, " +
|
|||
|
(ARG2-ARGS)."
|
|||
|
|
|||
|
What you are doing here is using BP to address the stack, accounting
|
|||
|
for the word where you saved the caller's BP and also for the two words
|
|||
|
which were pushed by the CALL instruction.
|
|||
|
|
|||
|
3. How big is the stack? BASIC only gives you an eight word stack to play
|
|||
|
with. On the other hand, it doesn't require you to save any registers
|
|||
|
except the segment registers. Other languages give you a liberal
|
|||
|
stack, which makes things a lot easier. If you have to create a new
|
|||
|
stack segment for yourself, the easiest thing is to place the stack at
|
|||
|
the end of your program and:
|
|||
|
|
|||
|
        CLI                     ;suppress interrupts while changing the stack
        MOV     SSAVE,SS        ;save old SS in local storage (old SP
                                ; already saved in BP)
        MOV     SP,CS           ;switch
        MOV     SS,SP           ;the
        MOV     SP,OFFSET STACKTOP ;stack
        STI                     ;(maybe)
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
Later, you can reverse these steps before returning to the caller. At
|
|||
|
the end of your program, you place the stack itself:
|
|||
|
|
|||
|
         DW    128 DUP(?)       ;stack of 128 words (liberal)
STACKTOP LABEL WORD
|
|||
|
|
|||
|
4. Make sure you save and restore those registers required by the caller.
|
|||
|
|
|||
|
5. Be sure to get the right kind of addressability. In the FAR call
example, only CS addresses your segment. If you are careful with your
ASSUME statements the assembler will keep track of this fact and generate
CS prefixes when you make data references; however, you might want
to do something like
|
|||
|
|
|||
|
         MOV   AX,CS            ;get current segment address
         MOV   DS,AX            ;To DS
         ASSUME DS:THISSEG
|
|||
|
|
|||
|
Be sure you keep your ASSUMEs in synch with reality.
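
The five points above can be pulled together in one skeleton. What follows is
only a sketch under stated assumptions: the names MYSUB, THISSEG, SSAVE and
RESULT are invented for illustration, the three arguments are taken to be plain
word values, AX and BX are taken to be free for the subroutine to use, and the
caller is assumed to expect the subroutine to remove the arguments on return
(hence RET 6). Check each of these against the interface documentation for
your own language.

```asm
; Sketch only: MYSUB, THISSEG, SSAVE and RESULT are illustrative names.
ARGS     STRUC                  ;caller's stack as seen from BP
         DW    3 DUP(?)         ;saved BP, then FAR return offset and segment
ARG3     DW    ?                ;last argument pushed
ARG2     DW    ?
ARG1     DW    ?                ;first argument pushed
ARGS     ENDS

THISSEG  SEGMENT
         ASSUME CS:THISSEG      ;on entry only CS addresses this segment
         PUBLIC MYSUB
SSAVE    DW    ?                ;caller's SS while we run on our own stack
RESULT   DW    ?                ;somewhere to keep the answer
MYSUB    PROC  FAR
         PUSH  BP               ;save caller's BP
         MOV   BP,SP            ;BP addresses the frame and remembers old SP
         MOV   AX,[BP].ARG1     ;fetch the arguments now, while SS is
         ADD   AX,[BP].ARG2     ;still the caller's stack segment
         ADD   AX,[BP].ARG3
         CLI                    ;no interrupts while SS:SP is changing
         MOV   SSAVE,SS         ;assembler supplies the CS prefix
         MOV   SP,CS
         MOV   SS,SP
         MOV   SP,OFFSET STACKTOP
         STI
         PUSH  DS               ;caller's DS goes on our own roomy stack
         MOV   BX,CS
         MOV   DS,BX            ;now DS addresses our data too
         ASSUME DS:THISSEG
         MOV   RESULT,AX        ;plain DS reference, no prefix needed
         POP   DS               ;restore caller's DS
         ASSUME DS:NOTHING      ;keep the ASSUME in step with reality
         CLI
         MOV   SS,SSAVE         ;CS prefix again; caller's SS is back
         MOV   SP,BP            ;old SP was kept in BP
         STI
         POP   BP               ;restore caller's BP
         RET   6                ;FAR return, dropping three argument words
MYSUB    ENDP
         DW    128 DUP(?)       ;our private stack
STACKTOP LABEL WORD
THISSEG  ENDS
         END
```

The arguments are fetched before the stack switch because a [BP] reference
defaults to SS, and once SS has been changed it no longer points at the
caller's stack.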
|
|||
|
|
|||
|
Learning about BIOS and the hardware
|
|||
|
------------------------------------
|
|||
|
|
|||
|
You can't do everything with DOS calls. You may need to learn something
about the BIOS and about the hardware itself. For this, the Technical
Reference is a very good thing to look at.
|
|||
|
|
|||
|
The first thing you look at in the Technical Reference, unless you are
really determined to master the whole ball of wax, is the BIOS listings
presented in Appendix A. Glory be: here is the whole 8K of ROM which deals
with low level hardware support, laid out with comments and everything.
|
|||
|
|
|||
|
In fact, if you are just interested in learning what BIOS can do for you,
|
|||
|
you just need to read the header comments at the beginning of each section
|
|||
|
of the listing.
|
|||
|
|
|||
|
BIOS services are invoked by means of the INT instruction; the BIOS
occupies interrupts 10H through 1FH and also interrupt 5H; actually, of these
seventeen interrupts, five are used for user exit points or data pointers,
leaving twelve actual services.
|
|||
|
|
|||
|
In most cases, a service deals with a particular hardware interface; for
example, BIOS interrupt 10H deals with the screen. As with DOS function
calls, many BIOS services can be passed a function code in the AH register
and possibly other arguments.
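
To make the calling pattern concrete, here is a minimal sketch of a .COM
program that makes two calls to the screen service. The names BIOSDEMO, START,
GETSTATE and TTYOUT are my own inventions for this illustration; the function
codes 0FH (report video state) and 0EH (write a character teletype style) are
the ones described in the header comments of the video section of the BIOS
listing.

```asm
SCREEN   EQU   10H              ;BIOS screen services INT code
GETSTATE EQU   0FH              ;function code: report current video state
TTYOUT   EQU   0EH              ;function code: write a character, teletype style

BIOSDEMO SEGMENT
         ASSUME CS:BIOSDEMO,DS:BIOSDEMO
         ORG   100H             ;standard .COM offset
START    PROC  NEAR
         MOV   AH,GETSTATE      ;select the function
         INT   SCREEN           ;returns AL = mode, AH = columns, BH = active page
         MOV   AL,'*'           ;character to write
         MOV   BL,7             ;foreground color (used only in graphics modes)
         MOV   AH,TTYOUT        ;BH still holds the active page from above
         INT   SCREEN
         RET                    ;near return ends a .COM program
START    ENDP
BIOSDEMO ENDS
         END   START
```

The shape is always the same: function code in AH, other arguments in the
registers the header comments call for, then the INT.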
|
|||
|
|
|||
|
I am not going to summarize the most useful BIOS features here; you will
|
|||
|
see some examples in the next sample program we will look at.
|
|||
|
|
|||
|
The other thing you might want to get into with the Tech reference is the
description of some hardware options, particularly the asynch adapter,
which are not well supported in the BIOS. The writeup on the asynch
adapter is pretty complete.
|
|||
|
|
|||
|
|
|||
|
Actually, the Tech reference itself is pretty complete and very nice as far
as it goes. One thing which is missing from the Tech reference is
information on the programmable peripheral chips on the system board. These
include
|
|||
|
|
|||
|
the 8259 interrupt controller
|
|||
|
the 8253 timer
|
|||
|
the 8237 DMA controller and
|
|||
|
the 8255 peripheral interface
|
|||
|
|
|||
|
To make your library absolutely complete, you should order the INTEL data
|
|||
|
sheets for these beasts.
|
|||
|
|
|||
|
I should say, though, that the only one I ever found I needed to know about
was the interrupt controller. If you happen to have the 8086 Family User's
Manual, there is an appendix there which gives an adequate description of the
8259.
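
For a taste of what dealing with the interrupt controller looks like, here is
a hedged sketch of the two things you most often do with it on the PC: enable
an interrupt level in its mask register, and signal end of interrupt from a
handler. The port addresses 20H and 21H are where the PC places the 8259; the
names IRQDEMO, ENABLE4, HANDLER and IRQ4ON are illustrative only, and level 4
is chosen because the primary asynch adapter uses it.

```asm
INTA00   EQU   20H              ;8259 command port on the PC
INTA01   EQU   21H              ;8259 interrupt mask register port
EOI      EQU   20H              ;nonspecific end-of-interrupt command
IRQ4ON   EQU   11101111B        ;AND mask that clears the level 4 mask bit

IRQDEMO  SEGMENT
         ASSUME CS:IRQDEMO
; Setup code: let level 4 interrupts through.
ENABLE4  PROC  NEAR
         IN    AL,INTA01        ;read the current mask
         AND   AL,IRQ4ON        ;a zero bit means the level is enabled
         OUT   INTA01,AL        ;write the mask back
         RET
ENABLE4  ENDP
; Skeleton of a hardware interrupt handler.
HANDLER  PROC  FAR
         PUSH  AX               ;a handler must preserve every register
;        ... service the device here ...
         MOV   AL,EOI
         OUT   INTA00,AL        ;tell the 8259 this interrupt is finished
         POP   AX
         IRET
HANDLER  ENDP
IRQDEMO  ENDS
         END
```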
|
|||
|
|
|||
|
A final example
|
|||
|
---------------
|
|||
|
|
|||
|
I leave you with a more substantial example of code which illustrates some
|
|||
|
good elementary techniques; I won't claim its style is perfect, but I think
|
|||
|
it is adequate. I think this is a much more useful example than what you
|
|||
|
will get with the assembler:
|
|||
|
|
|||
|
PAGE 61,132
|
|||
|
TITLE SETSCRN -- Establish correct monitor use at boot time
|
|||
|
;
|
|||
|
; This program is a variation on many which toggle the equipment flags
|
|||
|
; to support the use of either video option (monochrome or color).
|
|||
|
; The thing about this one is it prompts the user in such a way that he
|
|||
|
; can select the use of the monitor he is currently looking at (or which
|
|||
|
; is currently connected or turned on) without really having to know
|
|||
|
; which is which. SETSCRN is a good program to put first in an
|
|||
|
; AUTOEXEC.BAT file.
|
|||
|
;
|
|||
|
; This program is highly dependent on the hardware and BIOS of the IBM PC
|
|||
|
; and is hardly portable, except to very exact clones. For this reason,
|
|||
|
; BIOS calls are used in lieu of DOS function calls where both provide
|
|||
|
; equal function.
|
|||
|
;
|
|||
|
|
|||
|
OK. That's the first page of the program. Notice the PAGE statement,
|
|||
|
which you can use to tell the assembler how to format the listing. You
|
|||
|
give it lines per page and characters per line. I have mine set up to print
|
|||
|
on the host lineprinter; I routinely upload my listings at 9600 baud and
|
|||
|
print them on the host; it is faster than using the PC printer.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
There is also a TITLE statement. This simply provides a nice title for
|
|||
|
each page of your listing. Now for the second page:
|
|||
|
|
|||
|
         SUBTTL -- Provide .COM type environment and Data
         PAGE
;
;   First, describe the one BIOS byte we are interested in
;
BIOSDATA SEGMENT AT 40H         ;Describe where BIOS keeps his data
         ORG   10H              ;Skip parts we are not interested in
EQUIP    DB    ?                ;Equipment flag location
MONO     EQU   00110000B        ;These bits on if monochrome
COLOR    EQU   11101111B        ;Mask to make BIOS think of the color board
BIOSDATA ENDS                   ;End of interesting part
;
;   Next, describe some values for interrupts and functions
;
DOS      EQU   21H              ;DOS Function Handler INT code
PRTMSG   EQU   09H              ;Function code to print a message
KBD      EQU   16H              ;BIOS keyboard services INT code
GETKEY   EQU   00H              ;Function code to read a character
SCREEN   EQU   10H              ;BIOS Screen services INT code
MONOINIT EQU   02H              ;Value to initialize monochrome screen
COLORINIT EQU  03H              ;Value to initialize color screen (80x25)
;COLORINIT EQU 01H              ;Alternative value for color screen (40X25)
;
;   Now, describe our own segment
;
SETSCRN  SEGMENT                ;Set operating segment for CODE and DATA
;
         ASSUME CS:SETSCRN,DS:SETSCRN,ES:SETSCRN,SS:SETSCRN    ;All segments
;
         ORG   100H             ;Begin assembly at standard .COM offset
;
MAIN     PROC  NEAR             ;COM files use NEAR linkage
         JMP   BEGIN            ;And, it is helpful to put the data first, but
;                               ;then you must branch around it.
;
;   Data used in SETSCRN
;
CHANGELOC DD   EQUIP            ;Location of the EQUIP, recorded as far pointer
MONOPROMPT DB  'Please press the plus ( + ) key.$'      ;User sees on mono
COLORPROMPT DB 'Please press the minus ( - ) key.$'     ;User sees on color
|
|||
|
|
|||
|
|
|||
|
Several things are illustrated on this page. First, in addition to titles,
the assembler supports subtitles: hence the SUBTTL pseudo-op. Second, the
PAGE pseudo-op can be used to go to a new page in the listing. You see an
example here of the DSECT-style segment in the "SEGMENT AT 40H". Here, our
interest is in correctly describing the location of some data in the
BIOS work area which really is located at segment 40H.
|
|||
|
|
|||
|
|
|||
|
|
|||
|
|
|||
|
You will also see illustrated the EQU instruction, which just gives a
symbolic name to a number. I don't make a fetish of giving a name to every
single number in a program. I do feel strongly, though, that interrupts
and function codes, where the number is arbitrary and the function being
performed is the thing of interest, should always be given symbolic names.
|
|||
|
|
|||
|
One last new element in this section is the define doubleword (DD)
instruction. A doubleword constant can refer, as in this case, to a location in
another segment. The assembler will be happy to use information at its
disposal to properly assemble it. In this case, the assembler knows that
EQUIP is at offset 10H in the segment BIOSDATA, which is at 40H.
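
To make the far pointer concrete, here is a small sketch of what the DD
amounts to in storage. PTRDEMO and EXPLICIT are illustrative names, not part
of the sample program; the BIOSDATA definitions are repeated so the fragment
stands on its own.

```asm
BIOSDATA SEGMENT AT 40H         ;the BIOS work area
         ORG   10H
EQUIP    DB    ?                ;equipment flag byte
BIOSDATA ENDS

PTRDEMO  SEGMENT
CHANGELOC DD   EQUIP            ;far pointer built by the assembler
EXPLICIT DW    0010H            ;the same pointer written by hand:
         DW    0040H            ;offset word first, then segment word
PTRDEMO  ENDS
         END
```

Either form designates the physical address 0040H * 16 + 0010H = 00410H.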
|
|||
|
|
|||
|
         SUBTTL -- Perform function
         PAGE
BEGIN:   CALL  MONOON           ;Turn on mono display
         MOV   DX,OFFSET MONOPROMPT  ;GET MONO PROMPT
         MOV   AH,PRTMSG        ;ISSUE
         INT   DOS              ;IT
         CALL  COLORON          ;Turn on color display
         MOV   DX,OFFSET COLORPROMPT ;GET COLOR PROMPT
         MOV   AH,PRTMSG        ;ISSUE
         INT   DOS              ;IT
         MOV   AH,GETKEY        ;Obtain user response
         INT   KBD
         CMP   AL,'+'           ;Does he want MONO?
         JNZ   NOMONO
         CALL  MONOON           ;yes. give it to him
NOMONO:  RET
MAIN     ENDP
|
|||
|
|
|||
|
The main code section makes use of subroutines to keep the basic flow
simple. About all that's new to you in this section is the use of the BIOS
interrupt KBD to read a character from the keyboard.
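
The sample treats any key other than plus as a vote for color. If you would
rather insist on one of the two keys, a variation along the following lines
could be used. This is only a sketch; KEYDEMO, START, ASK and DONE are names
made up for the illustration.

```asm
KBD      EQU   16H              ;BIOS keyboard services INT code
GETKEY   EQU   00H              ;function code to read a character

KEYDEMO  SEGMENT
         ASSUME CS:KEYDEMO,DS:KEYDEMO
         ORG   100H             ;standard .COM offset
START    PROC  NEAR
ASK:     MOV   AH,GETKEY
         INT   KBD              ;waits for a key; AL = ASCII code, AH = scan code
         CMP   AL,'+'
         JE    DONE
         CMP   AL,'-'
         JNE   ASK              ;neither key: keep waiting
DONE:    RET                    ;AL holds '+' or '-' at this point
START    ENDP
KEYDEMO  ENDS
         END   START
```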
|
|||
|
|
|||
|
Now for the subroutines, MONOON and COLORON:
|
|||
|
|
|||
|
         SUBTTL -- Routines to turn monitors on
         PAGE
MONOON   PROC  NEAR             ;Turn mono on
         LES   DI,CHANGELOC     ;Get location to change
         ASSUME ES:BIOSDATA     ;TELL ASSEMBLER ABOUT CHANGE TO ES
         OR    EQUIP,MONO
         MOV   AX,MONOINIT      ;Get screen initialization value
         INT   SCREEN           ;Initialize screen
         RET
MONOON   ENDP
COLORON  PROC  NEAR             ;Turn color on
         LES   DI,CHANGELOC     ;Get location to change
         ASSUME ES:BIOSDATA     ;TELL ASSEMBLER ABOUT CHANGE TO ES
         AND   EQUIP,COLOR
         MOV   AX,COLORINIT     ;Get screen initialization value
         INT   SCREEN           ;Initialize screen
         RET
COLORON  ENDP
SETSCRN  ENDS                   ;End of segment
         END   MAIN             ;End of assembly; execution at MAIN
|
|||
|
|
|||
|
The instructions LES and LDS are useful ones for dealing with doubleword
addresses. The offset is loaded into the operand register and the segment
into ES (for LES) or DS (for LDS). Once you tell the assembler, with an
ASSUME, that ES now addresses the BIOSDATA segment, it is able to correctly
assemble the OR and AND instructions which refer to the EQUIP byte. An ES
segment prefix is added.
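
The sample loads DI but then refers to EQUIP by name, letting the ASSUME do
the work. An alternative, sketched here, is to go through the loaded pointer
itself with an explicit segment override, in which case no ASSUME for ES is
needed. PTRUSE and SETMONO are illustrative names, and the sketch assumes DS
addresses PTRUSE when SETMONO is called.

```asm
MONO     EQU   00110000B        ;these bits on if monochrome

PTRUSE   SEGMENT
         ASSUME CS:PTRUSE,DS:PTRUSE
CHANGELOC DW   0010H,0040H      ;far pointer to the equipment flag byte
SETMONO  PROC  NEAR
         LES   DI,DWORD PTR CHANGELOC  ;DI = 0010H (offset), ES = 0040H (segment)
         OR    BYTE PTR ES:[DI],MONO   ;explicit ES prefix on the reference
         RET
SETMONO  ENDP
PTRUSE   ENDS
         END
```

The BYTE PTR is needed because a bare [DI] gives the assembler no clue
whether a byte or a word is intended.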
|
|||
|
|
|||
|
To understand the action here, you simply need to know that flags in that
particular byte control how the BIOS screen service initializes the
adapters. BIOS will only work with one adapter at a time; by setting the
equipment flags to show one or the other as installed and calling BIOS screen
initialization, we achieve the desired effect.
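
If you only want to look at those flags rather than change them, BIOS
interrupt 11H (equipment determination) hands back the equipment word in AX.
The following is a sketch of a test for the monochrome setting; WHICH, ISMONO,
EQUIPCHK and VIDBITS are names I have made up for the illustration.

```asm
EQUIPCHK EQU   11H              ;BIOS equipment determination INT code
VIDBITS  EQU   00110000B        ;initial video mode bits in the low byte

WHICH    SEGMENT
         ASSUME CS:WHICH
ISMONO   PROC  NEAR             ;returns with ZF set if flags say monochrome
         INT   EQUIPCHK         ;AX = equipment flags; AL is the EQUIP byte
         AND   AL,VIDBITS       ;isolate bits 4 and 5
         CMP   AL,VIDBITS       ;11 = monochrome, 10 = 80x25 color, 01 = 40x25 color
         RET
ISMONO   ENDP
WHICH    ENDS
         END
```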
|
|||
|
|
|||
|
The rest is up to you.
|
|||
|
|