Assembly Language Tutorial


Getting Started
Assembling your First Program
Registers
General Purpose Registers
Segment Registers
Index Registers
32 bit Registers
High and Low Registers
Variables
Basic Commands
MOV
Shift
Loops and Conditions
Loops
Conditionals
Nested Loops
Arrays
Procedures and Macros
Procedures
Macros



Getting Started


Before you begin coding I would get a program called Assembly Editor. It is basically a DOS based text editor, but it will assemble and run your programs for you and it has a debugger. You should also get MASM, the Microsoft assembler. I believe the current version is 6.13. One thing to mention is that assembly language is architecture dependent, which means that each processor family has its own assembly language. I will be doing Intel based (x86) assembly language. That includes compatible chips such as Athlons and Cyrix processors, if you were wondering.



Assembling Your First Program


There are a few things that need to be in every assembly program. Here is a simple program that shows these things.

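title Hello World (hello.asm)

.model small
.stack 100h
.data
message db "Hello World",0dh,0ah,'$'   ; function 9 of int 21h prints up to the '$', not up to a 0
.code

main proc
    mov ax,@data            ; point ds at the data segment
    mov ds,ax

    mov ah,9                ; function 9: print a string
    mov dx,offset message   ; dx = address of the string
    int 21h

    mov ax,4C00h            ; function 4Ch: exit with return code 0
    int 21h
main endp
end main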


As you can see it takes a lot of code just to output Hello World. To assemble this with Assembly Editor just go to the run menu. First do assemble, then link. It will ask if you want to link anything; just hit enter. Then it will go to DOS; just hit enter until you get back to the editor. Then click run. It will output Hello World.
Now to explain the program. It starts off with the title of the program, which can be anything, followed by the filename in parentheses. The next two lines will probably be the same for any program you write. The model tells the assembler what kind of memory layout the program needs. Small gives you one code segment and one data segment, which should suffice for any program you will be writing here. Next is the size of the stack. The stack is a storage area where you can temporarily put data. You put items on the stack with the push command and remove them with the pop command. It is a LIFO structure, that is last-in first-out if you didn't know. For instance, if you push three things onto the stack, the first pop gives you back the last thing you pushed, the next pop gives you the one before it, and so on. In this case the stack has a size of 100h. The h stands for hex, so 100h is 256 bytes. Many numbers in assembly are written in hex so it is a good idea to become familiar with it.
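The push and pop order is easier to see in code. Here is a minimal sketch of it. It borrows the mov command and a few registers that are explained later on, and the file name stack.asm is just a placeholder I picked for the example.

```asm
title Stack Demo (stack.asm)

.model small
.stack 100h
.code

main proc
    mov ax,1
    mov bx,2
    mov cx,3

    push ax         ; stack, top first: 1
    push bx         ; stack, top first: 2 1
    push cx         ; stack, top first: 3 2 1

    pop dx          ; dx = 3, the last value pushed
    pop si          ; si = 2
    pop di          ; di = 1, the first value pushed

    mov ax,4C00h    ; exit to DOS
    int 21h
main endp
end main
```

The values come back in the reverse of the order they went in, which is all that last-in first-out means.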
Next is the data section. This is where you create all your variables. Creating a variable in assembly is a little different from a high level language. Instead of having different types of variables like integers, characters, etc., you define the size of the variable in bytes. In assembly characters and numbers are all the same; they are just binary numbers. The sizes that you can define are db, dw, and dd. Define byte is one byte and is good for small numbers and characters. Define word is two bytes and is good for most numbers you would use. Define double is four bytes and is for very big numbers. I created one variable called message. Next is its size, db, which stands for define byte. Then comes the string, followed by the hex codes for a carriage return (0dh) and a line feed (0ah). It uses a byte for each character in the string. All of this is ended by a terminator that marks the end of the string. You have probably heard of null terminated strings, which end in a 0. Function 9 of int 21h, which this program uses for printing, looks for a dollar sign ('$') instead, so a string printed that way has to end with '$' or DOS will keep printing whatever follows it in memory.
Next up is the code section. The first thing is to start the main procedure by saying main proc. Every program should have a main. It is just like the main in a C++ program. The first part loads the address of the data segment into the ds register. In order to use the variables you created in the data segment you must have a segment register pointing to their location in memory. The @data symbol stands for the address of the data segment. It cannot be moved straight into ds, so it goes into ax first. Don't worry, I will explain registers later. In the next part we put 9 into ah and the offset (address) of the message variable into dx so we can print it. The int 21h is what actually prints it out. int 21h is a call into DOS that you will use quite often. What it does depends on what is in the ah register. Next is another int 21h, this time with 4Ch in ah, which tells DOS that the program is done. We finish by ending the main procedure with main endp and then ending the program with end main, which also names main as the place to start running.
That probably didn't make much sense but it will become clear later on.
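To show how int 21h changes with what is in ah, here is a small sketch that uses function 2 instead of function 9. Function 2 is a DOS service that is not used anywhere else in this tutorial; it prints the single character that is in dl. The file name putchar.asm is just a placeholder.

```asm
title Print Char (putchar.asm)

.model small
.stack 100h
.code

main proc
    mov ah,2        ; function 2: print the character in dl
    mov dl,'A'
    int 21h         ; prints A

    mov ax,4C00h    ; function 4Ch: exit with return code 0
    int 21h
main endp
end main
```

The instruction is the same int 21h both times. Only the number in ah decides whether DOS prints a character or ends the program.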
Like I said earlier, assembly language uses the hex number system a lot. Hex is based on a set of sixteen digits. It has the regular 0-9 but it keeps on going up to 15, with 10-15 represented by the letters A-F. Hex is mainly used because long binary numbers are a pain to read and to convert to decimal, while every hex digit stands for exactly four bits. Here is an example of how to convert binary to hex.

BIN 0001 0110 0000 0111 1001 0100
HEX    1    6    0    7    9    4   = 160794h

Every 4 bits of a binary number equal one hex digit. This makes it easy to convert them. To figure out the value of a group, add up 2 raised to the position of every bit that is a 1, counting positions from 0 on the right. For instance 0100 has a 1 in position 2 so it is 2^2=4. 1001 has a 1 in position 0, which is 2^0=1, and a 1 in position 3, which is 2^3=8, which gives you 9. To convert a hex number to binary, take each digit and try subtracting 8, then 4, 2, and 1 from it. Whenever the number fits, put a 1 in that position and keep going with what is left; otherwise put a 0. For instance let's do F5A2. F is equal to 15 decimal. 8 fits into 15 so the first bit is a 1. That leaves 7, and 4 fits into it, so you put a 1 in the second place. You continue this until you get to zero.

DEC   15    5   10    2
HEX    F    5    A    2
BIN 1111 0101 1010 0010

That is pretty much all you need to know about hex. If you need to do some quick conversions the calculator program that comes with Windows can do it.
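The assembler will also do these conversions for you, since it accepts the same number written in any of the three bases. Here is a minimal sketch using the F5A2 example. The file name bases.asm is a placeholder, and the registers are explained in the next section.

```asm
title Number Bases (bases.asm)

.model small
.stack 100h
.code

main proc
    mov ax,0F5A2h               ; hex: a leading 0 is needed when the number starts with a letter
    mov bx,1111010110100010b    ; the same value written in binary
    mov cx,62882                ; the same value written in decimal

    ; ax, bx and cx now all hold the same 16 bits

    mov ax,4C00h                ; exit to DOS
    int 21h
main endp
end main
```

The h and b on the end tell the assembler which base the number is in. With no letter on the end it is taken as decimal.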


Registers


Registers are storage spaces with a specific purpose. The computer uses these registers to perform all its operations. Registers are memory that is built into the chip. They are the fastest memory in your computer. In order to do most operations on your data you must take it out of main memory and put it into a register. Each 16 bit register has a two letter name.


General Purpose Registers

There are four general purpose registers: AX, BX, CX, and DX. The AX register is called the accumulator register because instructions like mul and div always work on it, and some arithmetic instructions are slightly shorter when they use it. BX is called the base register; it can hold the address of a procedure or a variable and can also perform arithmetic. CX is called the counter register; it is usually used as a counter when doing loops. DX is called the data register; it can be used for arithmetic, and when multiplying 16 bit numbers it holds the high 16 bits of the product.


Segment Registers

The segment registers keep track of where the different parts of your program are. The CS register holds the location of where your code starts. The DS register holds the location of the data segment where all your variables were created. The SS register holds the location of the stack.


Index Registers

The index registers hold offsets into an array. They are mainly used when processing strings. The SI and DI registers are mainly used for moving strings: SI points to the source and DI points to the destination.
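Here is a minimal sketch of SI and DI used as source and destination while copying a short string one byte at a time. The names srcText, dstText, and copyNext are made up for the example, and the mov and loop commands it relies on are covered further down.

```asm
title Copy String (copy.asm)

.model small
.stack 100h
.data
srcText db "Hello"
dstText db 5 dup(?)
.code

main proc
    mov ax,@data
    mov ds,ax

    mov si,offset srcText   ; si points at the first byte to read
    mov di,offset dstText   ; di points at the first byte to write
    mov cx,5                ; number of bytes to copy
copyNext:
    mov al,[si]             ; read one byte from the source
    mov [di],al             ; write it to the destination
    inc si
    inc di
    loop copyNext

    mov ax,4C00h            ; exit to DOS
    int 21h
main endp
end main
```

When the loop finishes, dstText holds the same five characters as srcText.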


32 bit Registers

With the newer processors and bigger programs came bigger memory demands, so 32-bit registers were created. They essentially work the same but have an E in front of their name. EAX, EBX, ECX, and EDX are the general purpose 32 bit registers.


High and Low Registers

Each of the four 16 bit general purpose registers has a high and a low part. For instance AX consists of AH and AL. If you do not need a 16 bit register you can use these 8 bit parts of the registers. Many operations will use these 8 bit registers. What the int 21h instruction does depends on what is in the AH register.
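Here is a minimal sketch of how the two halves relate to the full register. The file name halves.asm is just a placeholder.

```asm
title High and Low (halves.asm)

.model small
.stack 100h
.code

main proc
    mov ax,1234h    ; ah = 12h, al = 34h
    mov al,0FFh     ; only the low byte changes: ax = 12FFh
    mov ah,0        ; only the high byte changes: ax = 00FFh
    mov bl,al       ; 8 bit parts can be copied to each other: bl = 0FFh

    mov ax,4C00h    ; this sets ah = 4Ch and al = 00h in one go
    int 21h
main endp
end main
```

The last mov is the same one that ends every program in this tutorial. Writing 4C00h into ax is just a shortcut for putting 4Ch in ah and 0 in al.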


Variables


Variables are a little different than in other languages. In assembly language the computer doesn't care if it is a number, character, or string. It only cares about the size of the variable. The sizes include db, dw, and dd which stand for define byte, word, and double. A byte is 8 bits and can hold a number up to 255, a word is 16 bits and can hold a number up to 65,535, and a double is 32 bits and can hold a number up to 4,294,967,295. If you want negative numbers too, the top bit is used for the sign and the ranges become -128 to 127, -32,768 to 32,767, and about plus or minus 2.1 billion. To create a variable the format is name size value. Here are some examples.

number dw 267
aLetter db 'F'
aString db "Hello World"
noValue db ?


The last one is uninitialized.
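Here is a minimal sketch of a full program that uses some of these variables. The point is that the register has to be the same size as the variable. The file name vars.asm is a placeholder.

```asm
title Variables (vars.asm)

.model small
.stack 100h
.data
number  dw 267
aLetter db 'F'
noValue db ?
.code

main proc
    mov ax,@data
    mov ds,ax

    mov ax,number       ; word variable into a 16 bit register: ax = 267
    mov bl,aLetter      ; byte variable into an 8 bit register: bl = 'F'
    mov noValue,bl      ; give the uninitialized byte a value
    inc number          ; a variable can be changed in place: number = 268

    mov ax,4C00h        ; exit to DOS
    int 21h
main endp
end main
```

Trying something like mov al,number would be rejected by the assembler because a word does not fit in an 8 bit register.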


Basic Commands


mov
The command you will probably use the most is mov. It copies the source into the destination. The syntax is as follows: mov dest,source. There are a few rules you must follow. For instance the destination can not be an immediate value. Also you can not move a variable directly into another variable; you have to go through a register. Both operands must also be the same size. One thing to note is that the source is not affected by the mov operation. It is more of a copy than a move.
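Here is a minimal sketch of the mov rules, with the moves that are not allowed left in as comments. The variable names first and second are made up for the example.

```asm
title Mov Rules (movrules.asm)

.model small
.stack 100h
.data
first   dw 10
second  dw 0
.code

main proc
    mov ax,@data
    mov ds,ax

    mov ax,25           ; immediate to register
    mov bx,ax           ; register to register, ax still holds 25
    mov second,bx       ; register to variable: second = 25
    mov second,7        ; immediate to variable: second = 7

    ; mov second,first  ; not allowed: variable to variable
    mov ax,first        ; go through a register instead
    mov second,ax       ; second = 10

    ; mov 5,ax          ; not allowed: the destination is an immediate value

    mov ax,4C00h        ; exit to DOS
    int 21h
main endp
end main
```

If you take the semicolon off either of the two commented moves, the assembler will stop with an error on that line.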
Arithmetic Commands
The add command will add the source to the destination. The syntax is add dest,source. The sub command has the same syntax but it subtracts the source from the destination. Both of these commands follow the same rules as mov: the destination can not be an immediate value and the source and destination can not both be variables. The inc and dec commands will increment and decrement, respectively, the operand given. The mul command takes only one operand and multiplies it by whatever is in al, ax, or eax, depending on the size of the operand. The operand can not be an immediate value. For a byte operand the answer is put in ax. For a word operand it is put in dx:ax (high half in dx), and for a 32 bit operand in edx:eax. The div operation also takes just one operand. A byte operand is divided into ax, with the answer going into al and the remainder into ah. A word operand is divided into dx:ax, with the answer going into ax and the remainder into dx.
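Here is a minimal sketch that runs through these commands with small numbers so the results are easy to check by hand. It includes one word sized multiply, where the product is too big for ax and the high part lands in dx. The file name math.asm is a placeholder.

```asm
title Arithmetic (math.asm)

.model small
.stack 100h
.code

main proc
    mov ax,10
    add ax,5        ; ax = 15
    sub ax,3        ; ax = 12
    inc ax          ; ax = 13
    dec ax          ; ax = 12

    mov bl,4
    mul bl          ; byte operand: ax = al * bl = 12 * 4 = 48

    mov bl,5
    div bl          ; byte operand: al = 48 / 5 = 9, ah = remainder 3

    mov ax,1000
    mov cx,100
    mul cx          ; word operand: dx:ax = 100000, so dx = 1 and ax = 86A0h

    mov ax,4C00h    ; exit to DOS
    int 21h
main endp
end main
```

Notice that mul and div never name ax. The accumulator is always one of the operands, so you only write the other one.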
Shift
The shl and shr operations move the bits of the first operand left or right by the number of places given in the second operand. One cool use of this is for multiplication and division by powers of two, where shl and shr are much faster than the mul and div operations. The shl operation multiplies the first operand by 2^second operand. The shr operation is the same but it divides, treating the operand as an unsigned number and throwing away the remainder.
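Here is a minimal sketch of shifting used for multiplying and dividing. One thing this sketch assumes that is not mentioned above: on the original 8086 the count has to be either 1 or the cl register, so larger counts are loaded into cl first. The file name shift.asm is a placeholder.

```asm
title Shifts (shift.asm)

.model small
.stack 100h
.code

main proc
    mov ax,5
    shl ax,1        ; 5 * 2 = 10

    mov cl,3
    shl ax,cl       ; 10 * 2^3 = 80

    shr ax,1        ; 80 / 2 = 40

    mov cl,2
    shr ax,cl       ; 40 / 2^2 = 10

    mov ax,4C00h    ; exit to DOS
    int 21h
main endp
end main
```

Shifting only helps when the number you are multiplying or dividing by is a power of two. For anything else you still need mul and div.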


Loops and Conditions



Loops

Loops are pretty easy in assembly language. The first thing you want to do is to move into cx the number of times you want to loop. Then you create a label, which can be called anything, followed by a colon. Then you put all your code. When you want to loop put loop then the name of the label you made without the colon. The loop command subtracts one from cx and jumps back to the label as long as cx is not zero. Here is an example.

mov cx,9
nameLabel2:
inc si
inc di
loop nameLabel2


That code will increment si and di nine times.
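Since cx counts down on every pass, you can also use its value inside the loop. Here is a minimal sketch that adds up the numbers from 9 down to 1. The label addNext and the file name sum.asm are made up for the example.

```asm
title Loop Sum (sum.asm)

.model small
.stack 100h
.code

main proc
    mov ax,0        ; running total
    mov cx,9        ; loop nine times
addNext:
    add ax,cx       ; adds 9, then 8, and so on down to 1
    loop addNext    ; cx = cx - 1, jump back while cx is not 0

    ; here ax = 45 and cx = 0

    mov ax,4C00h    ; exit to DOS
    int 21h
main endp
end main
```

Be careful not to change cx inside the loop body unless you mean to, because that changes how many times the loop runs.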

Conditional Loops

Before you do any conditions you need to do a cmp operation. It compares the first operand to the second and then sets some flags that you can use to determine the relation between the two. Then you can use operations like je, jne, jle, and jg. They are short for jump if equal, not equal, less than or equal, and greater than. If the condition is true it will jump to the label you supplied; otherwise it continues with the next line. Here is an example.

mov cx,9
nameLabel2:
cmp si,di
je anotherLabel
inc si
dec di
loop nameLabel2
anotherLabel:


This will exit the loop if si and di are equal. If you would rather just jump without doing a condition you can use the jmp command. Here is an example of an if then else statement.

cmp bx,9
je label1
mov ax,5
jmp endLabel
label1:
mov ax,4
endLabel:


One thing that is different is that it is backwards. The else part comes first, then the then part comes next. A while loop is done like this.

label1:
cmp bx,4
je label2
inc bx
jmp label1      ; jmp, not loop: loop would also count cx down and could stop early
label2:


This while loop will continue until bx is equal to 4.
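Here is a minimal sketch of a while loop doing something useful: counting the characters in a string until it reaches the '$' at the end. It goes back to the top with jmp, so the je is the only way out. The labels checkChar and done and the file name strlen.asm are made up for the example.

```asm
title String Length (strlen.asm)

.model small
.stack 100h
.data
message db "Hello World",'$'
.code

main proc
    mov ax,@data
    mov ds,ax

    mov si,offset message
    mov cx,0                ; length so far
checkChar:
    mov al,[si]             ; fetch the current character
    cmp al,'$'              ; is it the end marker?
    je done                 ; yes: leave the loop
    inc cx
    inc si
    jmp checkChar           ; no: go back and test the next one
done:
    ; here cx = 11, the number of characters before the '$'

    mov ax,4C00h            ; exit to DOS
    int 21h
main endp
end main
```

Here cx is just an ordinary counter. The loop command is not used, so nothing is counting it down behind your back.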

Nested Loops

This would be a good time to learn how to use the stack. Here is an example of a nested loop.

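mov cx,5            ; outer loop runs 5 times
nameLabel:
push cx             ; save the outer counter
mov cx,9            ; inner loop runs 9 times
nameLabel2:
mov al,[si]         ; copy one byte from [si] to [di]
mov [di],al
inc si
inc di
loop nameLabel2
add di,23
pop cx              ; restore the outer counter
loop nameLabel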


As you can see we have two loops. The problem is that both of these loops use cx as their counter. To remedy this we push the value of cx onto the stack before setting up the inner loop, and when the inner loop is done we pop the top of the stack back into cx.
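Here is a minimal sketch of the same push and pop pattern as a complete program. Instead of copying data it just counts how many times the inner loop body runs, which makes it easy to check. The labels outer and inner and the file name nested.asm are made up for the example.

```asm
title Nested Loops (nested.asm)

.model small
.stack 100h
.code

main proc
    mov bx,0        ; counts how many times the inner body runs
    mov cx,5        ; outer loop count
outer:
    push cx         ; save the outer counter
    mov cx,9        ; inner loop count
inner:
    inc bx
    loop inner      ; runs 9 times, leaves cx = 0
    pop cx          ; restore the outer counter
    loop outer      ; runs 5 times

    ; here bx = 45, which is 5 * 9

    mov ax,4C00h    ; exit to DOS
    int 21h
main endp
end main
```

Without the push and pop, cx would be 0 when it reached loop outer. The loop command would subtract one and wrap around to 65535, and the outer loop would run far more than 5 times.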


Arrays


Here are some examples of different kinds of arrays.

someStrings db "hello world","hello 2    ","hello 3    ",0
someNumbers dw 10 dup(?)
moreNumbers dw 2,1,2,5,5,6,3,2
someChars db 'adsf',0


The first variable is an array of strings. One thing you should do is make all the strings the same length, padding the short ones with spaces. It makes it a lot easier to use them. The second variable uses the dup command to make ten empty locations which you can set later on. You can put a number in those parentheses instead of the question mark and it will put that number in each location. The third variable is just an array of numbers. Each number needs to be separated by a comma. The last variable is an array of characters. To use these arrays you must load the offset of one of them into a register. Here is an example.

mov si,offset someStrings

Once the offset is in the register you can access the values like this.

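mov al,[si]


Since si just contains an address you need to put [] around it so that the value at that address is put into al and not the address itself. That will load whatever is in the first position of the array, which is the letter h and not the whole string, because a string is itself an array of bytes. Since each character is one byte it goes into an 8 bit register like al; using ax would pick up two characters at once. To get to the next character do this.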

inc si
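For an array of words, a single inc si is not enough, because each dw element takes up two bytes. Here is a minimal sketch that adds up the moreNumbers array by stepping si two bytes at a time. The variable total and the label sumNext are made up for the example.

```asm
title Array Sum (arrsum.asm)

.model small
.stack 100h
.data
moreNumbers dw 2,1,2,5,5,6,3,2
total       dw ?
.code

main proc
    mov ax,@data
    mov ds,ax

    mov si,offset moreNumbers
    mov cx,8                ; number of elements in the array
    mov ax,0                ; running total
sumNext:
    add ax,[si]             ; add the current word
    add si,2                ; each dw element is 2 bytes wide
    loop sumNext

    mov total,ax            ; total = 26

    mov ax,4C00h            ; exit to DOS
    int 21h
main endp
end main
```

The rule of thumb is to move si by the size of one element: 1 for db, 2 for dw, and 4 for dd.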



Procedures and Macros


Procedures and macros are very similar to functions in C++. They can be used to make repetitive tasks easier.

Procedures

Do you remember the main proc that is in every program? Well, that is a procedure. A procedure is created by giving it a name and then putting proc after it. Put your code there. Then end it with a ret, followed by the name you chose and then endp. You should put all of this outside of the main procedure. To use the procedure you use the call command and then the name of the procedure. Here is an example.

.code
main proc
mov ax,@data
mov ds,ax
call myProc
mov ax,4c00h
int 21h
main endp
myProc proc
inc bx
ret
myProc endp


This is how it works. When the procedure is called, the program jumps to the location of the procedure. It does its stuff, then ret is reached and it returns to where it was in main.
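A procedure has no parameter list, so one simple way to hand it a value is to leave the value in a register before the call. Here is a minimal sketch of that. The procedure doubleIt is a made-up helper for the example, and using ax to pass the value is just the convention this sketch picked.

```asm
title Procedure Demo (procs.asm)

.model small
.stack 100h
.code

main proc
    mov ax,3
    call doubleIt   ; ax = 6
    call doubleIt   ; ax = 12

    mov ax,4C00h    ; exit to DOS
    int 21h
main endp

; doubleIt: doubles the value in ax and leaves the result in ax
doubleIt proc
    add ax,ax
    ret             ; go back to the line after the call
doubleIt endp
end main
```

Both calls run the same single copy of doubleIt. Each ret goes back to whichever call the program came from.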

Macros

A macro is very similar syntactically but its inner workings are different. First, it can be passed parameters. Second, instead of the program going to where the macro is located, the assembler pastes a copy of it into the program wherever it is used. For this reason it is not a good idea to use a big macro in many places, because every use adds another copy of its code to the program. Here is an example of a macro.

aMacro macro theParameter   ; defined before it is used so the assembler can expand it
mov ax,theParameter
endm

.code
main proc
mov ax,@data
mov ds,ax
aMacro 100                  ; expands to mov ax,100
mov ax,4c00h
int 21h
main endp


This macro is passed the value 100 and the macro puts that into ax.
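Here is a minimal sketch of a macro with two parameters that is used twice, to show the pasting in action. The macro loadSum and its parameter names are made up for the example. The macro is defined above the code that uses it so the assembler already knows it when it gets to the first use.

```asm
title Macro Demo (macros.asm)

.model small
.stack 100h

; loadSum: puts first + second into ax
loadSum macro first, second
    mov ax,first
    add ax,second
endm

.code

main proc
    loadSum 100,20  ; pasted in as mov ax,100 / add ax,20, so ax = 120
    mov bx,ax
    loadSum bx,5    ; pasted in as mov ax,bx / add ax,5, so ax = 125

    mov ax,4C00h    ; exit to DOS
    int 21h
main endp
end main
```

The finished program contains two separate copies of the mov and add pair, one for each use. There is no call or ret anywhere, which is the difference from a procedure.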