Getting Started
  Assembling Your First Program
Registers
  General Purpose Registers
  Segment Registers
  Index Registers
  32 bit Registers
  High and Low Registers
Variables
Basic Commands
  MOV
  Shift
Loops and Conditions
  Loops
  Conditionals
  Nested Loops
Arrays
Procedures and Macros
  Procedures
  Macros

Getting Started
Before you begin coding I would get a program called Assembly Editor.
It is basically a DOS based text editor, but it will also assemble and
run your programs for you, and it has a debugger. You should also get
MASM, the Microsoft assembler. I believe the current version is 6.13.
One thing to mention is that assembly language is architecture
dependent. That means that each processor family has its own assembly
language. I will be doing Intel based (x86) assembly language. That
includes Athlon and Cyrix chips, if you were wondering.
Assembling Your First Program
There are a few things that need to be in every
assembly program. Here is a simple program that shows these things.
title Hello World (hello.asm)
.model small
.stack 100h
.data
message db "Hello World",0dh,0ah,'$'    ; '$' ends the string for int 21h function 9
.code
main proc
    mov ax,@data
    mov ds,ax

    mov ah,9
    mov dx,offset message
    int 21h

    mov ax,4C00h
    int 21h
main endp
end main
As you can see it takes a lot of code just to output hello world. To
assemble this with Assembly Editor just go to the run menu. First do
assemble, then link. It will ask if you want to link anything; just
hit enter. Then it will go to DOS; just hit enter until you get back
to the editor. Then click run. It will output Hello World.

Now to explain the program. It starts off with the title of the
program, which can be anything, and then the filename of this file in
parentheses. The next two lines will probably be the same for any
program you write. The model is basically what kind of memory
requirements this program is going to have. Small should suffice for
any programs you will be writing. Next is the size of the stack. The
stack is basically a storage area where you can temporarily put data.
You put stuff on the stack with the push command and remove items from
the stack by popping them. It is a LIFO data structure. That is
last-in first-out, if you didn't know. For instance, say you push
three things onto the stack: the first thing you can pop is the last
thing you put on. In this case the stack has a size of 100h. That h
stands for hex, so this is 256 bytes. Most numbers in assembly are
written in hex, so it might be a good idea to become familiar with it.

Next is the data section. This is where you create all your variables.
Creating variables in assembly is a little different from any high
level language. Instead of having different types of variables like
integers, characters, etc., you define the size of the variable in
bytes. In assembly characters and numbers are all the same; they are
just binary numbers. The sizes that you can define are db, dw, and dd.
Define byte is one byte and is good for small numbers and characters.
Define word is two bytes and is good for most numbers you would use.
Define doubleword is 4 bytes and is for very big numbers.
I created one variable called message. Next is its size, db, which
stands for define byte. Then comes the string, followed by the hex
codes for a carriage return and a line feed. It uses a byte for each
character in this string. The string has to be ended by a terminator
so that DOS knows where to stop. You have probably heard of null
terminated strings, which end with a 0. Function 9 of int 21h, which
this program uses to print, looks for a dollar sign ('$') instead, so
the string should end with '$' rather than 0.

Next up is the code section. The first thing is to start the main
procedure by saying main proc. Every program should have a main. It is
just like the main in a C++ program. The first part is to load the
address of the data segment into the ds register. In order to use the
variables you created in the data segment you must have a register
pointing to their location in memory. @data is a predefined name that
stands for the address of the data segment. It has to go through ax
because a number can not be moved directly into a segment register.
Don't worry, I will explain registers later. In the next part we put 9
into ah and the address (offset) of the message variable into dx so we
can print it. That int 21h is what actually prints it out. int 21h is
a special function that you will use quite often. What it does depends
on what is in the ah register. Next is another int 21h, this time with
4Ch in ah, that basically tells the program to quit. We finish by
ending the main procedure with main endp and then ending the program
with end main. That probably didn't make much sense, but it will
become clear later on.

Like I said earlier, assembly language uses the hex number system. Hex
is based on a set of sixteen digits. It has the regular 0-9 but it
keeps on going up to 15. 10-15 are represented by the letters A-F. Hex
is mainly used because binary numbers are long and a pain to read or
convert to decimal, while hex maps neatly onto binary. Here is an
example of how to convert binary to hex.

  hex       1    6    0    7    9    4
  binary 0001 0110 0000 0111 1001 0100  = 160794h

Every 4 bits of a binary number equal one hex digit. This makes it
easy to convert them. To figure out the hex value of a group you add
up 2 raised to the index of each place that holds a 1, counting the
places from the right starting at 0. For instance 0100 has a 1 in
place 2, so it would be 2^2=4. 1001 has a 1 in place 0 and a 1 in
place 3, so it would be 2^0=1 plus 2^3=8, which gives you 9.

To convert a hex number to binary, take each digit on its own and try
to take 8 out of it, then 4, then 2, then 1. Each time the number
fits, put a 1 in that position and carry on with what is left;
otherwise put a 0. For instance let's do F5A2. F is equal to 15
decimal. 8 fits into 15, so the first bit is a 1 and 7 is left. 4 fits
into 7, so you put a 1 in the second place and 3 is left. You continue
this until you get to zero.

  DEC    15    5   10    2
  HEX     F    5    A    2
  BIN  1111 0101 1010 0010

That is pretty much all you need to know about hex. If you need to do
some quick conversions, the calculator program that comes with Windows
can do it.
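
To see the same conversion from the assembler's side, here is a small sketch. MASM accepts a number written in hex (h suffix), binary (b suffix), or plain decimal, so the three mov lines below should all load the same value. The choice of registers is just for illustration. Note that a hex number starting with a letter needs a leading 0 so the assembler does not mistake it for a name.

```asm
.model small
.stack 100h
.code
main proc
    mov ax,0F5A2h               ; hex
    mov bx,1111010110100010b    ; the same value in binary
    mov cx,62882                ; the same value in decimal

    mov ax,4C00h                ; quit
    int 21h
main endp
end main
```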
Registers
Registers are storage space with a specific purpose. The computer uses
these registers to perform all its operations. Registers are memory
that is built into the chip. They are the fastest memory in your
computer. To do most operations on your data you take it out of main
memory and put it into a register. Each register has a two letter
name.
General Purpose Registers
There are four general purpose registers: AX, BX, CX, and DX. The AX
register is called the accumulator register because many arithmetic
instructions use it automatically, and some are encoded more compactly
when they work on it. BX is called the base register; it can hold the
address of a procedure or a variable and can also perform arithmetic.
CX is called the counter register; it is usually used as a counter
when doing loops. DX is called the data register; it can be used for
arithmetic, and when multiplying 16 bit numbers it will hold the high
16 bits of the product.
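
Here is a short sketch of that last point, with numbers I picked so the product does not fit in 16 bits. After the mul, the high half of the answer should be in dx and the low half in ax.

```asm
.model small
.stack 100h
.code
main proc
    mov ax,1000h    ; first factor goes in the accumulator
    mov bx,0100h    ; second factor in the base register
    mul bx          ; 16 bit multiply: dx:ax = ax * bx
                    ; 1000h * 0100h = 100000h, so dx = 0010h and ax = 0000h

    mov ax,4C00h    ; quit
    int 21h
main endp
end main
```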
Segment Registers
The segment registers keep track of where the different parts of your
program are. The CS register holds the location of where your code
starts. The DS register holds the location of the data segment where
all your variables were created. The SS register holds the location of
the stack.
Index Registers
The index registers hold an offset into an array. They are mainly used
when processing strings. The SI and DI registers are mainly used for
moving strings. SI would be the source and DI would be the
destination.
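
As a rough sketch of SI as source and DI as destination, the program below copies five bytes from one variable to another. It uses the rep movsb instruction, which is my choice for the example and is not covered elsewhere in this tutorial; the variable names fromText and toText are made up.

```asm
.model small
.stack 100h
.data
fromText db "Hello"
toText   db 5 dup(?)
.code
main proc
    mov ax,@data
    mov ds,ax
    mov es,ax               ; movsb writes to es:di, so es must point at the data too

    cld                     ; clear the direction flag so si and di count upward
    mov si,offset fromText  ; source
    mov di,offset toText    ; destination
    mov cx,5                ; number of bytes to copy
    rep movsb               ; copy cx bytes from ds:si to es:di

    mov ax,4C00h            ; quit
    int 21h
main endp
end main
```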
32 bit Registers
With the newer processors and bigger programs came bigger memory
demands, so 32 bit registers were created. They essentially work the
same but have an E in front of their name. EAX, EBX, ECX, and EDX are
the general purpose 32 bit registers.

High and Low Registers
Each of the four general purpose registers has a high and a low part.
For instance AX consists of AH and AL. If you do not need a 16 bit
register you can use these 8 bit parts of the registers. Many
operations use these 8 bit registers. For example, what the int 21h
instruction does depends on what is in the AH register.
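
A quick sketch of how the two halves relate to the whole register. The values are arbitrary; the comments show what ax should hold after each step.

```asm
.model small
.stack 100h
.code
main proc
    mov ax,1234h    ; ah = 12h, al = 34h
    mov al,0FFh     ; only the low half changes: ax = 12FFh
    inc al          ; al wraps around to 00h, ah is untouched: ax = 1200h
    mov ah,0        ; clear the high half: ax = 0000h

    mov ax,4C00h    ; quit (ah = 4Ch picks the function, al = 0 is the exit code)
    int 21h
main endp
end main
```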
Variables
Variables are a little different than in other languages. In assembly
language the computer doesn't care if it is a number, character, or
string. It only cares about the size of the variable. The sizes
include db, dw, and dd, which stand for define byte, word, and
doubleword. A byte is 8 bits and can hold a number up to 255, a word
is 16 bits and can hold a number up to 65,535, and a doubleword is 32
bits and can hold a number up to about 4 billion. If you want negative
numbers too, the range is split in half, so a word holds -32,768 to
32,767. To create a variable the format is name size value. Here are
some examples.
    number   dw 267
    aLetter  db 'F'
    aString  db "Hello World"
    noValue  db ?
The last one is uninitialized.
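
Here is a small sketch of using two of those variables. The main thing it tries to show is that the register has to be the same size as the variable: a word goes into a 16 bit register and a byte goes into an 8 bit one.

```asm
.model small
.stack 100h
.data
number   dw 267
aLetter  db 'F'
.code
main proc
    mov ax,@data
    mov ds,ax

    mov ax,number       ; word variable into a 16 bit register
    mov bl,aLetter      ; byte variable into an 8 bit register
    inc number          ; a variable can also be changed in place: number is now 268

    mov ax,4C00h        ; quit
    int 21h
main endp
end main
```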
Basic Commands
mov
The command you will probably use the most is mov. It basically moves
the source into the destination. The syntax is as follows: mov
dest,source. There are a few rules you must follow. For instance the
destination can not be an immediate value. Also you can not move a
variable directly into another variable; go through a register
instead. One thing to note is that the source is not affected by the
mov operation. It is more of a copy than a move.

Arithmetic Commands
The add command will add the source to the destination. The syntax is
add dest,source. The sub command has the same syntax but it subtracts
the source from the destination. Both of these commands follow the
same rules as mov: the destination can not be an immediate value, and
the source and destination can not both be variables. The inc and dec
commands will increment and decrement, respectively, the operand
given. The mul command takes only one operand and multiplies it by
whatever is in al, ax, or eax, depending on the size of the operand.
The operand can not be an immediate value. With a byte operand the
answer is put in ax; with a word operand it is put in dx and ax, with
the high half in dx; with a 32 bit operand it is put in edx and eax.
The div operation also takes just one operand. With a byte operand it
divides ax by the operand and puts the answer into al and the
remainder into ah. With a word operand it divides the 32 bit number in
dx and ax and puts the answer into ax and the remainder into dx.

Shift
The shl and shr operations will move the bits of the first operand
left or right by the number of places given in the second operand. One
cool use of this is for multiplication and division. The shl and shr
operations are much faster than the mul and div operations. The shl
function will multiply the first operand by 2^second operand. The shr
is the same but it divides, as long as the number is unsigned.
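
The commands above can be tried out with a sketch like this one. The numbers are arbitrary and the comments show what I expect in the registers after each line. The shift count goes through cl because the original 8086 only allows a count of 1 or cl; that is a precaution on my part, and newer processor settings accept an immediate count.

```asm
.model small
.stack 100h
.code
main proc
    mov ax,10
    add ax,5        ; ax = 15
    sub ax,3        ; ax = 12
    inc ax          ; ax = 13
    dec ax          ; ax = 12

    mov bl,4
    mul bl          ; byte multiply: ax = al * bl = 48
    mov bl,5
    div bl          ; byte divide of ax by bl: al = 9 (answer), ah = 3 (remainder)

    mov ax,3
    mov cl,2
    shl ax,cl       ; ax = 3 * 2^2 = 12
    shr ax,1        ; ax = 12 / 2 = 6

    mov ax,4C00h    ; quit
    int 21h
main endp
end main
```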
Loops and Conditions
Loops
Loops are pretty easy in assembly language. The first thing you want
to do is to move into cx the number of times you want to loop. Then
you create a label, which can be called anything, followed by a colon.
Then you put all your code. Where you want to loop, put loop and then
the name of the label you made, without the colon. Each time it is
reached, loop subtracts one from cx and jumps back to the label unless
cx has reached zero. Here is an example.
    mov cx,9
nameLabel2:
    inc si
    inc di
    loop nameLabel2
That code will increment si and di nine times.
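
Since cx counts down while the loop runs, you can also use it inside the loop. This sketch (label name made up) adds up 5+4+3+2+1.

```asm
.model small
.stack 100h
.code
main proc
    mov cx,5        ; loop five times
    mov ax,0        ; running total
addNext:
    add ax,cx       ; cx is 5, then 4, 3, 2, 1
    loop addNext    ; subtract 1 from cx, jump back while cx is not 0
                    ; here ax = 15

    mov ax,4C00h    ; quit
    int 21h
main endp
end main
```

One thing to watch for: if cx is already 0 when the loop instruction is reached, it wraps around and the loop runs 65,536 times.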
Conditional Loops
Before you do any conditions you need to do a cmp operation. It
compares the first operand to the second and then sets some flags that
you can use to determine the relation between the two. Then you can
use operations like je, jne, jle, and jg. They are short for jump if
equal, not equal, less than or equal, and greater than. If the
condition is true it will jump to the label you supplied. Here is an
example.
    mov cx,9
nameLabel2:
    cmp si,di
    je anotherLabel
    inc si
    dec di
    loop nameLabel2
anotherLabel:
This will exit the loop if si and di are equal. If you would rather
just jump without doing a condition you can use the jmp command. Here
is an example of an if then else statement.
    cmp bx,9
    je label1
    mov ax,5
    jmp endLabel
label1:
    mov ax,4
endLabel:
One thing that is different is that it is backwards. The else part is
first and the then part comes next. A while loop is done like this.
label1:
    cmp bx,4
    je label2
    inc bx
    jmp label1
label2:
This while loop will continue until bx is equal to 4. It jumps back
with jmp rather than loop, because loop would also count cx down and
end the loop when cx reached zero.
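
As one more sketch of cmp followed by a conditional jump, this program leaves the larger of two numbers in ax. It uses jge (jump if greater than or equal), which works the same way as the jumps listed above; the label name and the numbers are made up.

```asm
.model small
.stack 100h
.code
main proc
    mov ax,7
    mov bx,12
    cmp ax,bx           ; sets the flags from ax - bx without changing ax
    jge haveLarger      ; skip the copy if ax >= bx
    mov ax,bx           ; otherwise bx is bigger, so copy it into ax
haveLarger:
                        ; ax now holds the larger value, 12
    mov ax,4C00h        ; quit
    int 21h
main endp
end main
```

jg, jge, jl, and jle treat the numbers as signed. If your numbers are unsigned, ja, jae, jb, and jbe are the matching jumps.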
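Nested Loops
This would be a good time to learn how to use the stack. Here is an
example of a nested loop.
    mov cx,5            ; outer loop runs 5 times
nameLabel:
    push cx             ; save the outer counter
    mov cx,9            ; inner loop runs 9 times
nameLabel2:
    mov al,[si]         ; copy one byte from [si] to [di]
    mov [di],al
    inc si
    inc di
    loop nameLabel2
    add di,23
    pop cx              ; restore the outer counter
    loop nameLabel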
As you can see we have two loops. The problem is that both of these
loops use cx as their counter. To remedy this, when we enter the outer
loop we push the value of cx onto the stack, and when the inner loop
is done we pop the top of the stack back into cx.
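
If the last-in first-out order of the stack is still fuzzy, this sketch might help. It pushes three values and pops them back into the same registers, which reverses them.

```asm
.model small
.stack 100h
.code
main proc
    mov ax,1
    mov bx,2
    mov cx,3

    push ax         ; stack holds: 1
    push bx         ; stack holds: 1, 2
    push cx         ; stack holds: 1, 2, 3 (3 is on top)

    pop ax          ; the last thing pushed comes off first: ax = 3
    pop bx          ; bx = 2
    pop cx          ; cx = 1

    mov ax,4C00h    ; quit
    int 21h
main endp
end main
```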
Arrays
Here are some examples of different
kinds of arrays.
    someStrings db "hello world","hello 2    ","hello 3    ",0   ; each string padded to 11 characters
    someNumbers dw 10 dup(?)
    moreNumbers dw 2,1,2,5,5,6,3,2
    someChars   db 'adsf',0
The first variable is an array of strings. One thing you should do is
make all the strings the same length. It makes it a lot easier to use
them. The second variable uses the dup command to make ten empty
locations which you can set later on. You can put a number in those
parentheses instead of the ? and it will put that number in each
location. The third variable is just an array of numbers. Each number
needs to be separated by a comma. The last variable is an array of
characters. To use these arrays you must load the offset of one of
them into a register. Here is an example.
mov si,offset someStrings
Once the offset is in the register you can access the values like
this.
    mov al,[si]
Since si just contains an address you need to put [] around it so that
the value at that address is put into al and not the address itself.
That will load whatever is in the first position of the array, which
would be the letter h and not the whole string, because a string is
itself an array of characters. A byte register is used because each
character is one byte; mov ax,[si] would load two characters at once.
To get to the next character do this.
inc si
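
For an array of words the idea is the same, except each element is two bytes, so si has to move by 2 instead of 1. Here is a sketch that adds up the moreNumbers array from above; the label name is made up.

```asm
.model small
.stack 100h
.data
moreNumbers dw 2,1,2,5,5,6,3,2
.code
main proc
    mov ax,@data
    mov ds,ax

    mov si,offset moreNumbers
    mov cx,8                ; eight numbers in the array
    mov ax,0                ; running total
nextNumber:
    add ax,[si]             ; add the word that si points at
    add si,2                ; each dw element is two bytes
    loop nextNumber
                            ; here ax = 26

    mov ax,4C00h            ; quit
    int 21h
main endp
end main
```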
Procedures and Macros
Procedures and macros are very similar to functions in C++. They can
be used to make repetitive tasks easier.
Procedures
Do you remember the main proc that is in every program? Well, that is
a procedure. A procedure is created by giving it a name and then
putting proc after it. Put your code there. Then end it with a ret,
followed by the name you chose and endp. You should put this all
outside of the main procedure. To use this procedure you use the call
command and then the name of the procedure. Here is an example.
.code
main proc
    mov ax,@data
    mov ds,ax
    call myProc
    mov ax,4c00h
    int 21h
main endp

myProc proc
    inc bx
    ret
myProc endp
This is how it works. When the procedure is called, the program jumps
down to the location of the procedure. It does its stuff, then ret is
reached and it returns to where it was in main.
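
A procedure has no parameter list, so one simple way to hand it a value is to put the value in a register before the call and read the answer from a register afterwards. This sketch does that with a made-up procedure called doubleIt; using ax for both is just my choice here.

```asm
.model small
.stack 100h
.code
main proc
    mov ax,21
    call doubleIt       ; after the call ax = 42
    mov bx,ax           ; keep the answer somewhere before ax is reused

    mov ax,4C00h        ; quit
    int 21h
main endp

; doubleIt: takes a number in ax and returns twice that number in ax
doubleIt proc
    add ax,ax
    ret
doubleIt endp
end main
```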
Macros
A macro is very similar syntactically, but its inner workings are
different. First, it can be passed parameters. Second, instead of the
program jumping to where the macro is located, the assembler pastes a
copy of it into the program wherever it is used. For this reason it is
not a good idea to use a big macro in many places, since every use
makes the program longer. Here is an example of a macro.
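; the macro is defined first, because the assembler has to know it before it is used
aMacro macro theParameter
    mov ax,theParameter
endm

.code
main proc
    mov ax,@data
    mov ds,ax
    aMacro 100          ; expands to: mov ax,100
    mov ax,4c00h
    int 21h
main endp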
This macro is passed the value 100 and the macro puts that
into ax.
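
A macro can take more than one parameter, and a parameter does not have to be a number. In this sketch the made-up macro addTo is given a register and an amount. Each place it is used gets its own pasted copy of the add instruction.

```asm
.model small
.stack 100h

; addTo: adds theAmount to theRegister (both names are just placeholders)
addTo macro theRegister,theAmount
    add theRegister,theAmount
endm

.code
main proc
    mov ax,1
    mov bx,2
    addTo ax,100        ; pasted in as: add ax,100   (ax = 101)
    addTo bx,5          ; pasted in as: add bx,5     (bx = 7)

    mov ax,4C00h        ; quit
    int 21h
main endp
end main
```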