Assembly Unleashed
By Opcode Void
Disclaimer: This paper is © by opcodevoid, and redistribution of this paper without permission
from opcodevoid is a violation of law.
For more information please check out http://www.eliteproxy.com/ . If you have any questions, please post them on our forums.
Assembly is about the fastest language around. Even the experts admit that a well-coded assembly program can outperform the most well-coded C++ program. A good reason is that assembly programs can use CPU-specific instructions a lot. Assembly programmers don't have to follow a compiler's conventions, so they can customize and optimize freely, while C++ programmers must follow the stricter rules of their language.
==================--------------Assembly Definitions #1-----------
Assembly is the fastest thing around, and it is not really one language but more of a readable form of machine code. Why? Because every kind of machine has its own machine code, and so its own assembly. For example, if you wanted to program for a Nintendo you could also program in assembly. "But wait," you say, "I thought Intel invented assembly."
You see, assembly is a term used for the lowest level of programming around. We should always put the machine name before the word assembly, for example
X86 Assembly: [Intel]
6502 Assembly: [Nintendo]
Z80 Assembly: [Gameboy]
Well, you're going to learn X86 Assembly, which is what you want, right? Ok, let's get started. Before we begin, we must also define what machine code really is, because many people get confused on the subject and begin to think machine code is faster than assembly, when assembly is really just machine code written with names. The way an assembler works is that it takes your instructions and translates them to opcodes. What is an opcode? An opcode (operation code) is the machine code for your instruction: a number that is really a command.
Example:
An assembler would assemble the command ret.
It would translate it to the number 195 (C3 in hex), which is the machine code for the command ret. To type that command, hold down Alt and type 195 on the num-pad; it should make a box-drawing character that looks something like +. Congratulations, you just typed machine code. WHAT? You say machine code is 0's and 1's? That is true in a sense. A computer has capacitors that are either charged or not. When a capacitor is not charged it is said to be 0; when it is charged it is said to be 1. The computer takes in a certain number of bits and works on them together. For example, an 8 bit system will read 8 bits like this.
 _
/0\ = An off capacitor
 _
/1\ = An on capacitor
So here is how the computer reads an 8 bit command
 _   _   _   _   _   _   _   _
/0\ /0\ /0\ /0\ /0\ /0\ /0\ /0\
Which will return the Opcode 0, but if it reads something like this
 _   _   _   _   _   _   _   _
/1\ /1\ /1\ /1\ /1\ /1\ /1\ /1\
This will return the Opcode 255
You might be confused if you never learned the binary system; it is very useful for understanding how computers work.
To convert a binary number to a decimal number, look at the examples below. Each bit is multiplied by a power of 2, starting from the rightmost bit.
0001 is really 1 because
1: 1 * 2^0 = 1
0: 0 * 2^1 = 0
0: 0 * 2^2 = 0
0: 0 * 2^3 = 0
Because 1 + 0 + 0 + 0 = 1
But look at this
0011 is really 3 because
1: 1 * 2^0 = 1
1: 1 * 2^1 = 2
0: 0 * 2^2 = 0
0: 0 * 2^3 = 0
Because 1 + 2 + 0 + 0 = 3
So if all were 1's, like 1111, you would calculate it like this
1: 1 * 2^0 = 1
1: 1 * 2^1 = 2
1: 1 * 2^2 = 4
1: 1 * 2^3 = 8
So 1 + 2 + 4 + 8 = 15
This is the binary system. There is nothing special about it; it is just another way for us humans to count numbers. If the computer was a 4 bit computer it would read 4 bits at a time, so if it read 1111 it would execute Opcode 15.
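If you want a machine to check those conversions, here is a minimal sketch. It assumes the NASM assembler, which accepts a b on the end of a number to mean binary. The db command just places one byte in the output file, so this is a data-only file and not a program to run. Assemble it and open the output in a hex viewer; it should contain the three bytes 01 03 0F, which are 1, 3 and 15.

```nasm
; binary.asm: a data-only sketch (assemble it, do not run it)
db 0001b        ; 1  = 1*2^0
db 0011b        ; 3  = 1*2^1 + 1*2^0
db 1111b        ; 15 = 8 + 4 + 2 + 1
```

The file name binary.asm is only an example.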
Here is a table
4 bits = a nibble
2 nibbles (8 bits) = 1 byte
2 bytes (16 bits) = 1 word
Now that you have the binary system down pat, do a few binary problems, since I will not review this lesson again.
Question 1:
What does 1111 equal in decimal form?
Question 2:
What is an Opcode?
Question 3:
What does 1010 equal in decimal form?
Before we move on into the assembly language we must understand the cpu fetch and execute cycle.
---------
CPU |
---------
\
\__________[Memory]
This graph is saying that the cpu reads instructions from memory. After it reads a certain number of bits it executes the instruction. After this it goes on to the next set of bits.
Registers, Variables:
If you have programmed in high level languages you will see a huge difference in assembly. Assembly has a so-called internal "variable" called a register. Registers are not in memory; they are in the cpu itself. These registers have a set group of names and they are subdivided into classes, in a sense. A good start in learning to program in assembly is to know the names of the registers. We will cover 16 bit registers for now. Don't panic, a 16 bit register is just 16 0's or 1's. So let's look at the four most used registers
1.A
2.B
3.C
4.D
These are the most common registers. Now to make things hard, you never refer to the registers like this anymore. Why? Because of the way Intel works.
Intel made its CPUs back in the old 8 bit days, when each register could only hold 8 bits. When they upgraded to 16 bit registers they wanted their 8 bit programs to run on the new 16 bit system, so they let you use a 16 bit register as a pair: the low half of the register and the high half of the register, for example
AH and AL
We use the letter 'L' to signify the low half, and the letter 'H' to signify the high half.
So the low half of A is AL and the high half is AH, and when you stick them together you get the full 16 bit register A. Today we add on X, which stands for Extended, so when referring to register A you must add on either 'L', 'H', or 'X'. In other words
AL = A Lower Half
AH = A Higher Half
AX = A Extended (AH and AL together, the full 16 bits)
This also goes for B, C, and D. Look at the chart below
AX
_______
|AH|AL|
-------
BX
_______
|BH|BL|
-------
CX
_______
|CH|CL|
-------
DX
_______
|DH|DL|
-------
Which is something like this:
              [AX]
_________________________________
|0-0-0-0-0-0-0-0|0-0-0-0-0-0-0-0|
|     [AH]      |     [AL]      |
---------------------------------
So if you change AL you affect AX, and if you change AH you affect AX too.
If you change AX, you affect both AH and AL.
These registers also have names that programmers call them by. I will use the most common ones, since these terms may not be used everywhere.
A = Accumulator Register
B = Base Register
C = Counter Register
D = Data Register
There are many more registers, but we will cover those later.
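Here is a small sketch of how the halves overlap, assuming NASM and a DOS .com file. The values are my own, picked only to make the halves easy to see, and they are written in hex (the h on the end) because one hex digit is exactly 4 bits. The mov, org and int 20h lines are explained later in this paper.

```nasm
; A sketch of how AX, AH and AL overlap (illustration values only)
org 100h
mov ax,1234h    ; AX = 1234h, so AH = 12h and AL = 34h
mov al,0FFh     ; only the low half changes: AX = 12FFh
mov ah,0        ; only the high half changes: AX = 00FFh
int 20h         ; exit back to DOS
```

The program prints nothing; step through it in a debugger if you want to watch the registers change.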
Writing Assembly Programs:
First grab the NASM assembler.
Writing assembly programs requires you to have knowledge of the CPU and of the OS you're writing the program for. So we must understand DOS (since that is what we are programming for) before we move on.
DOS stands for Disk Operating System, and it is pretty simple, but what we want to learn is how it handles programs. How DOS handles them is pretty complex and it is different for each type of program. By type I mean
*.exe
*.com
We will make a *.com file, since it is really simple: it is just plain instructions, while the exe is pretty complex.
A COM file (no, not Component Object Model, for all you ActiveX programmers) is a command file, meaning raw binary: just the machine code with no header. When DOS sees a COM file it loads it into memory at offset 256 (100h in hex) of the segment it picks for the program.
Look at the graph below if you don't understand
0000:
0001:
......
0256: (Your program's instructions are loaded here)
Now we must deal with interrupts. An interrupt interrupts your program while it is running. There are two types of interrupts: software and hardware. Let's deal with hardware first.
A hardware interrupt comes along when the hardware changes state or a certain event occurs, for example when you press a key. The processor then hands control to the keyboard interrupt routine, which is another piece of code called the keyboard handler. Look at the graph below
_______________________
|Your Program Executing|
_______________________
\
\[Someone presses A Key]
|
[Processor hands control to Interrupt Number 9, which is the Keyboard Interrupt]
|
[Keyboard Interrupt does stuff, like check if ctrl-alt-del was pressed]
|
[Keyboard Interrupt returns control to your program]
|
[Your Program Resumes until another Interrupt occurs]
There are many hardware interrupts, and they are very useful because they handle stuff for you.
Software interrupts are interrupts you call on purpose with the command int followed by the number, such as int 33, which calls interrupt 33.
Now you might think that you understand interrupts pretty well, but you must also learn about service numbers, because they choose which function you want the interrupt handler to perform. You mostly choose your service number by setting a number in the higher half of A, which is AH.
Let's deal with some commands, such as the move command, which in assembly is mov. To use the command you choose a register and a number, like
Mov ah,1
This command sets AH, the high half of AX, to 1.
The mov command can do so much more; we will cover what else it can do later.
Now for learning about the assembler. The assembler has some special commands that are not related to the operating system or the CPU in any way. A good example is the comment. The command for the comment is ';' (without the quotes). A good reason to comment your code is so you understand how it works and can document it for others to look at.
; This will never get assembled into machine code
Also, the assembler has to calculate a lot of addresses, so you should help it out by telling it where you are going to be loaded in memory with the org instruction
org 256
Now for the Program
;;;;;;;;;;;;;;======================;;;;;;;;;;;
org 256 ; Tell nasm we will be loaded at that location
mov ah,2 ; set AH, the high half of AX, to 2 (service: print character)
mov dl,1 ; set DL, the low half of DX, to 1 (the character code to print)
int 33 ; call dos interrupt 33 (21h)
int 32 ; call the exit interrupt 32 (20h)
;;;;;;;;;;========End of Program=========;;;;;;;;;;
To assemble it, use nasm like this
c:\>nasm myfile.asm
Then rename the output file with a .com extension
You might feel in way over your head. It is really not that complex, but we must explain more about the dos system.
The main dos interrupt is 33. When you call it, it looks at the value of AH; if it is equal to 2, it uses the print character function, which then looks at the value of DL and prints that character to the screen.
So when calling interrupt 33 with service number 2, remember that it is going to print out the character whose code is in DL. Note that it is not going to be the number '1' but a happy face; I will explain why it does this later.
Now we must look at int 32, which is the exit interrupt for com files. This just exits your program: no registers and no service numbers needed, just plain and simple. I know this might be a little tricky at first, but keep reading and you will soon understand more.
More about Numbers (ASCII and HEX):
There is a good chance you will never see anyone else code the program above like that. They would use hex instead. Hex is the base 16 number system. We use the base 10 number system in everyday life, with these digits
0
1
2
3
4
5
6
7
8
9
While Hex is like this
0
1
2
3
4
5
6
7
8
9
A = 10
B = 11
C = 12
D = 13
E = 14
F = 15
So how do I calculate hex, you ask? It is much like you do binary, except each digit is multiplied by a power of 16.
To calculate the number
1A you would go like this
A: 10 * 16^0 = 10
1: 1 * 16^1 = 16
So 1A equals 26
Let's look at 21 in hex. Note: from now on I will use h at the end of a number to signify hex, like this: 21h
1: 1 * 16^0 = 1
2: 2 * 16^1 = 32
This returns 33
What does 100 in hex equal?
0: 0 * (16 ^ 0) = 0
0: 0 * (16 ^ 1) = 0
1: 1 * (16 ^ 2) = 256
0 + 0 + 256 = 256
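The assembler can do these conversions for you, which makes it a handy way to check your answers. Below is a minimal data-only sketch, assuming NASM: db places a byte and dw places a 16 bit word in the output file. Each line writes the same value twice, once in hex and once in decimal, so the assembled file should read 1A 1A 21 21 00 01 00 01 in a hex viewer. The words show up as 00 01 because the X86 stores the low byte first.

```nasm
; hex.asm: a data-only sketch (assemble it, do not run it)
db 1Ah, 26      ; both bytes are 1A
db 21h, 33      ; both bytes are 21
dw 100h, 256    ; both words are 0100h, stored low byte first as 00 01
```

Note that a hex number must start with a digit, which is why you will see values like 0FFh written with a leading zero.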
From now on I will mostly use hex numbers.
ASCII is a set of characters that are represented by numbers called ASCII codes. You can type ASCII codes with the numpad: hold down Alt, type 65 on the numpad, then let go, and it should return a capital 'A'. Here is an ASCII table I ripped from someone. Note that some ASCII characters are not shown since this was done in Notepad, so the happy face and a couple of other characters won't be shown.
Dec Octal Hex ASCII EBCDIC ASCII Codes
0 000 00 NUL NUL
1 001 01 blk Face SOH SOH ^A
2 002 02 [1] STX STX ^B
3 003 03 ETX ETX ^C
4 004 04 PF EOT ^D
5 005 05 HT ENQ ^E
6 006 06 LC ACK ^F
7 007 07 DEL BEL ^G
8 010 08 ... BS ^H
9 011 09 ... HT ^I
10 012 0A ... SMM LF ^J
11 013 0B VT VT ^K
12 014 0C FF FF ^L
13 015 0D ... CR CR ^M
14 016 0E SO SO ^N
15 017 0F SI SI ^O
16 020 10 DLE DLE ^P
17 021 11 DC1 DC1 ^Q
18 022 12 DC2 DC2 ^R
19 023 13 TM DC3 ^S
20 024 14 RES DC4 ^T
21 025 15 NL NAK ^U
22 026 16 BS SYN ^V
23 027 17 IL ETB ^W
24 030 18 CAN CAN ^X
25 031 19 EM EM ^Y
26 032 1A ... CC SUB ^Z
27 033 1B CU1 ESC ^[
28 034 1C IFS FS ^\
29 035 1D IGS GS ^]
30 036 1E ‑ IRS RS ^^
31 037 1F IUS
32 040 20 DS SP
33 041 21 ! SOS !
34 042 22 " FS "
35 043 23 # #
36 044 24 $ BYP $
37 045 25 % LF %
38 046 26 & ETB &
39 047 27 ' ESC '
40 050 28 ( (
41 051 29 ) )
42 052 2A * SM *
43 053 2B + CU2 +
44 054 2C , ,
45 055 2D - ENQ -
46 056 2E . ACK .
47 057 2F / BEL /
48 060 30 0 0
49 061 31 1 1
50 062 32 2 SYN 2
51 063 33 3 3
52 064 34 4 PN 4
53 065 35 5 RS 5
54 066 36 6 UC 6
55 067 37 7 EOT 7
56 070 38 8 8
57 071 39 9 9
58 072 3A : :
59 073 3B ; CU3 ;
60 074 3C < DC4 <
61 075 3D = NAK =
62 076 3E > >
63 077 3F ? SUB ?
64 100 40 @ SP @
65 101 41 A A
66 102 42 B B
67 103 43 C C
68 104 44 D D
69 105 45 E E
70 106 46 F F
71 107 47 G G
72 110 48 H H
73 111 49 I I
74 112 4A J › J
75 113 4B K . K
76 114 4C L < L
77 115 4D M { M
78 116 4E N + N
79 117 4F O | O
80 120 50 P & P
81 121 51 Q Q
82 122 52 R R
83 123 53 S S
84 124 54 T T
85 125 55 U U
86 126 56 V V
87 127 57 W W
88 130 58 X X
89 131 59 Y Y
90 132 5A Z ! Z
91 133 5B [ $ [
92 134 5C \ * \
93 135 5D ] ) ]
94 136 5E ^ ; ^
95 137 5F _ ª _
96 140 60 ` `
97 141 61 a / a
98 142 62 b b
99 143 63 c c
100 144 64 d d
101 145 65 e e
102 146 66 f f
103 147 67 g g
104 150 68 h h
105 151 69 i i
106 152 6A j j
107 153 6B k , k
108 154 6C l % l
109 155 6D m _ m
110 156 6E n > n
111 157 6F o ? o
112 160 70 p p
113 161 71 q q
114 162 72 r r
115 163 73 s s
116 164 74 t t
117 165 75 u u
118 166 76 v v
119 167 77 w w
120 170 78 x x
121 171 79 y y
122 172 7A z : z
123 173 7B { # {
124 174 7C | @ |
125 175 7D } ' }
126 176 7E ~~ = ~~
127 177 7F "
128 200 80 €
129 201 81 a
130 202 82 ‚ b
131 203 83 ƒ c
132 204 84 „ d
133 205 85 … e
134 206 86 † f
135 207 87 ‡ g
136 210 88 ˆ h
137 211 89 ‰ i
138 212 8A Š
139 213 8B ‹
140 214 8C Œ
141 215 8D
142 216 8E Ž
143 217 8F
144 220 90
145 221 91 ‘ j
146 222 92 ’ k
147 223 93 “ l
148 224 94 ” m
149 225 95 • n
150 226 96 – o
151 227 97 — p
152 230 98 ˜ q
153 231 99 ™ r
154 232 9A š
155 233 9B ›
156 234 9C œ
157 235 9D
158 236 9E ž
159 237 9F Ÿ
160 240 A0
161 241 A1 ¡
162 242 A2 ¢ s
163 243 A3 £ t
164 244 A4 ¤ u
165 245 A5 ¥ v
166 246 A6 ¦ w
167 247 A7 § x
168 250 A8 ¨ y
169 251 A9 © z
170 252 AA ª
171 253 AB «
172 254 AC ¬
173 255 AD
174 256 AE ®
175 257 AF ¯
176 260 B0 °
177 261 B1 ±
178 262 B2 ²
179 263 B3 ³
180 264 B4 ´
181 265 B5 µ
182 266 B6 ¶
183 267 B7 ·
184 270 B8 ¸
185 271 B9 ¹
186 272 BA º
187 273 BB »
188 274 BC ¼
189 275 BD ½
190 276 BE ¾
191 277 BF ¿
192 300 C0 À
193 301 C1 Á A
194 302 C2 Â B
195 303 C3 Ã C
196 304 C4 Ä D
197 305 C5 Å E
198 306 C6 Æ F
199 307 C7 Ç G
200 310 C8 È H
201 311 C9 É I
202 312 CA Ê
203 313 CB Ë
204 314 CC Ì
205 315 CD Í
206 316 CE Î
207 317 CF Ï
208 320 D0 Ð
209 321 D1 Ñ J
210 322 D2 Ò K
211 323 D3 Ó L
212 324 D4 Ô M
213 325 D5 Õ N
214 326 D6 Ö O
215 327 D7 × P
216 330 D8 Ø Q
217 331 D9 Ù R
218 332 DA Ú
219 333 DB Û
220 334 DC Ü
221 335 DD Ý
222 336 DE Þ
223 337 DF ß
224 340 E0 à
225 341 E1 á
226 342 E2 â S
227 343 E3 ã T
228 344 E4 ä U
229 345 E5 å V
230 346 E6 æ W
231 347 E7 ç X
232 350 E8 è Y
233 351 E9 é Z
234 352 EA ê
235 353 EB ë
236 354 EC ì
237 355 ED í
238 356 EE î
239 357 EF ï
240 360 F0 ð 0
241 361 F1 ñ 1
242 362 F2 ò 2
243 363 F3 ó 3
244 364 F4 ô 4
245 365 F5 õ 5
246 366 F6 ö 6
247 367 F7 ÷ 7
248 370 F8 ø 8
249 371 F9 ù 9
250 372 FA ú
251 373 FB û
252 374 FC ü
253 375 FD ý
254 376 FE þ
255 377 FF
Pretty long, huh? Well, ok.
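To tie hex and ASCII together, here is a small sketch assuming NASM and a DOS .com file. It uses the print character service from the first program, but this time with a code from the table: 41h is 65, which is 'A'. The second character is written in quotes so the assembler looks up the code for you. Run it and it should print AB.

```nasm
; A sketch: print two characters by their ASCII codes
org 100h
mov ah,2        ; DOS service 2: print the character in DL
mov dl,41h      ; 41h = 65 = 'A' in the table above
int 21h
mov ah,2        ; set the service again, to be safe
mov dl,'B'      ; the assembler replaces 'B' with its code, 42h
int 21h
int 20h         ; exit back to DOS
```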
Variables and labels:
In high level languages like Visual Basic, you get the impression that variables hold data. This is false. Before we explain variables, it is about time you learned about segmentation and the other registers.
Segmentation and offset is what you're about to learn now. DOS uses segmentation for memory, which combines two values to form the physical (real) memory location, which is termed an address. Here are a few basic terms
Physical (Real)
Memory Location (Address)
There are two parts to a segmented address: the segment and the offset.
You combine them like this: segment:offset, where segment and offset are replaced with numbers, like this 1145:3433
1145 is the segment, 3433 is the offset
Then the processor would calculate the physical address.
Now for registers. The other registers are not like A, B, C, D, since they have no lower half or higher half; each is just one 16 bit register. These registers are really special and you can't just mess with all of them as you please.
Here are the registers
Segment Registers
CS Code Segment
DS Data Segment
SS Stack Segment
ES Extra Segment
(FS) 386 and newer
(GS) 386 and newer
These are segment registers. Most of them you can play with; some you shouldn't touch, such as CS and SS
CS = Code Segment, which is the segment where your code is
SS = Stack Segment; the stack will be explained as we go on
Pointer Registers
SI: Source Index
DI: Destination Index
IP: Instruction Pointer
You can play with SI and DI, but not IP, since it contains the address of the next instruction the cpu executes. Your code is located at CS:IP, because CS is your code's segment and IP is your code's offset.
Now let's deal with the stack. The stack is a special feature of most CPUs, such as the X86. You can put stuff on the stack and get stuff off the stack. The main purpose of the stack is storing temporary data. The stack works in a way that is hard to grasp at first, but these graphs should help. Before we move on, remember that SS holds the stack segment and SP holds the stack offset. To put stuff on the stack you use the command push followed by a register, like this
mov ax,1
push ax
here is what happens
[SP = 20]
|------------[21:0000]
|------------[20:0000] =====Stack starts here
|------------[19:0000]
|------------[18:0000] ---- The value of 1 is here
After the push SP equals 18, because AX is 16 bits, which is 2 bytes (every 8 bits is a byte). So AX is a two byte register, and SP goes down by 2 each time you push something.
Pop ax takes 2 bytes off the stack and puts them in ax. After the pop, SP increases by 2.
now look at this example
mov ax,1
mov bx,2
push ax
push bx
pop ax
pop bx
You might think that ax will hold 1 again, but you're wrong. The stack pointer goes down with each push and comes back up with each pop, so the last value onto the stack is the first one off. Here bx was pushed last, so pop ax gets 2 and pop bx gets 1. They call this method Last In First Out, or LIFO. To restore both registers correctly, the pops have to be in the reverse order of the pushes, like this
mov ax,1
mov bx,2
push ax
push bx
pop bx
pop ax
If you don't understand, keep reading it over; it took me a while to get it too.
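The "wrong" order is not useless, by the way. As a sketch (assuming NASM and a DOS .com file), popping in the same order you pushed is a simple way to swap two registers, and it shows LIFO at work. The values 1 and 2 are the ones from the example above.

```nasm
; A sketch: LIFO order used on purpose to swap AX and BX
org 100h
mov ax,1
mov bx,2
push ax         ; stack holds: 1
push bx         ; stack holds: 1, 2 (2 is on top)
pop ax          ; AX = 2, the last value pushed
pop bx          ; BX = 1
; AX and BX have swapped values and SP is back where it started
int 20h         ; exit back to DOS
```

Two pushes and two pops leave SP unchanged, which matters a lot once subroutines come into play.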
Now that you understand that, let's look at the true meaning of the variable. Variables do not exist in machine code. A variable is just a reference to a certain point in your program, so you don't have to type the address out to refer to a piece of data. An assembler allows you to save the address of a certain location and refer to it by name. This also goes for labels; a label is the same as a variable in most cases. Let's deal with labels first.
To create a label you just write a word, not a reserved word such as ax, bx, or another register or instruction, but a unique word followed by a ':'. An example is
my_first_label:
Mov ax,1
Now whenever you want to set ax to 1 you would jump to that label, and the CPU would begin executing instructions there. How do I jump to a certain location in my code, you might ask? Simple: use the jmp command followed by a label. Look at this program
;=-=========================================
org 100h ;(256) our start location
jmp start ; jump to the address of start
start: ; start is a label
mov ax,1
int 20h ; interrupt 32, exit
;=======================End of program=========
You might notice a few new things like int 20h. When you put 'h' at the end of a number it becomes hex. The assembler will calculate it for you and put 32 in its place. Most people in assembly always use hex numbers, so I suggest you do too.
As you can see, we jump to 'start' with the jump instruction. The assembler replaces the label with an address, like 'jmp <address of start>'. It would be pretty hard to type the address for every jump. We could do it this time since we know start is right below us, but when you have 1000 lines of code, using a label really helps.
Now that we know labels we can deal with variables and data types. A variable is much like a label: it just holds the offset of something. Variables are not declared the same way labels are. To declare a variable, write a word (not a reserved word), then choose your data type, then the value.
Nine db 9
Nine = Name
db = Define Byte, 8 bits (Data type)
9 = Value
Now Nine holds the offset of where 9 is placed in memory. Of course you might want to hold text in memory, not just single 8 bit values. To avoid confusion with the 16 bit word, I will use "string" to refer to a set of letters, like the real world does. To declare a string of one byte letters do this
Message db "Hello World"
Message holds the offset of the letters Hello World. Now that we know enough about that, let's write a program that prints out hello world. Before we do this we must look at dos int 21h (33). The service to print a string is 9; the offset of the string should be in DX and its segment in DS. Dos prints the string at the address DS:DX. How does it know where to stop, you ask? Simple: it prints character by character until it reaches a $ character. With that in mind, let's look at the program
;===============================
org 100h
jmp start
msg db "Hello World$" ; remember the $ character: DOS will keep printing until it sees it
start:
mov ah,9 ; service print String
mov dx, msg ; dx holds the offset of msg
int 21h
int 20h
;===============================End of Program
Now you might be saying, hey, I never set DS to msg's segment. The reason is that DS already has it: when DOS loads a com file it points CS, DS, ES and SS all at the program's segment, and org 100h tells Nasm our starting offset inside it. Setting AH to 9 tells dos to use the print string service.
After that we exit with int 20h.
Writing Smart Programs:
Up until now you have learned basic assembly for X86, but there is so much more, so I decided to tell you how to make your programs react to certain situations. For example, what if you made a program that was supposed to quit when the 'Q' key was pressed? Then you would have to jump to a certain part of your program where you would have the quit instruction. Like this
[DID user press Q]
|
|[IF yes]
| \
| [QUIT]
|
[Continue Program]
How do we see if the user pressed Q? This is what we will deal with now. To test for a certain event, use the compare instruction, which is cmp followed by a register, then a comma, then another register or a number, like this
Cmp ax, 1
Or
Cmp ax,bx
Or
Cmp al,bl
But not
cmp ax,bl
Since ax is 16 bits and bl is 8 bits
After that, test the result by using a conditional jump command. These conditional jump commands only jump if a certain condition is met. They are
jl = Jump if less
jg = Jump if greater
je = Jump if equal
jne = Jump if not equal
jge = Jump if greater or equal
jle = Jump if less or equal
There are many more, but we will deal with these for now.
There are 3 things you are required to set up for this operation to work
1. A label
2. A compare instruction
3. A conditional jump command
Example
cmp ax,1
je ax_is_1
int 20h
ax_is_1:
;rest of code here
What this will do is test if AX has the value of one. If not, it will exit the program; if it does, it will continue at the label. This is how conditional jumps work in assembly.
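Here is a slightly bigger sketch that uses more than one conditional jump after a single cmp, assuming NASM and a DOS .com file. The numbers 5 and 7 and the label names are my own picks for illustration. It prints <, = or > depending on how AX compares to 7; with AX set to 5 it should print <.

```nasm
; A sketch: one compare, several conditional jumps
org 100h
mov ax,5
cmp ax,7        ; compare AX with 7
jl is_less      ; taken here, because 5 is less than 7
je is_equal     ; would be taken if AX were 7
mov dl,'>'      ; neither jump taken, so AX is greater
jmp show
is_less:
mov dl,'<'
jmp show
is_equal:
mov dl,'='
show:
mov ah,2        ; DOS service 2: print the character in DL
int 21h
int 20h         ; exit back to DOS
```

Change the 5 to 7 or 9 and assemble again to see the other two paths.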
Before we type that "Q program" we must learn how to get input from the keyboard. Lucky for us, we have an interrupt that will handle the routine for us and will return control when it gets a key press. To do this we will use int 16h; since dos doesn't do quite what we want, we will use the BIOS interrupt.
BIOS stands for Basic Input/Output System, which contains some functions for the operating system to use. Microsoft doesn't make the BIOS, since it is not a part of the operating system. The BIOS comes from the people who made your computer, such as Compaq or IBM. The BIOS, although made by different companies, offers the same set of functions. We want to call interrupt 16h with service number 0 (which goes in the AH register). When we call it, we transfer control to the BIOS, and we get control back when a key is pressed. After we get control back, AL holds the ASCII code of the key that was pressed.
========---------------------------=============================
org 100h
jmp start
msg db "Please Enter the letter Q",13,10,"$"
start:
repeat:
mov ah,9 ; INT 21h service number 9, print string
mov dx,msg ; set dx to the offset of msg (nasm needs no offset keyword)
int 21h ; call Dos interrupt 21h
mov ah,0 ; INT 16h service 0, which is get keyboard input
int 16h ; call BIOS interrupt 16h, the key comes back in al
cmp al,'Q'
je done
cmp al,'q'
jne repeat
done:
int 20h
You might be confused about a couple of things, such as the variable declaration
msg db "Please Enter the letter Q",13,10,"$"
It is very simple: you can separate things with commas, which is a useful feature since we don't want 13 and 10 to be the text "13" and "10", but rather the byte 13, a carriage return, and the byte 10, a line feed. So when the user doesn't press Q we will skip down a line before asking again
13 = Carriage Return
10 = Line Feed
Next we compare
113 = 'q'
81 = 'Q'
So if the user presses 'Q' it returns 81, and if he presses 'q' it returns 113, so we must check for both. You might also be wondering why I put '' around my letters. That is because I don't want the assembler thinking I am referring to a variable named Q; I want it to replace 'Q' with the ASCII code for Q, which is 81. You must also do this if you want to change the letter to X or something.
BITS Operations:
There comes a time when you need to modify raw binary, since it gives you the most power. With assembly you get native binary modification instructions. We are going to learn about bit operations.
The logical AND operation requires both bits to be 1; if not, it returns 0. Example
D = Decimal or base10
D D D
90 (logical and) 202 = 74(result)
AL BL Result
0 1 0 <Both are not one so return 0>
1 1 1 <Success both are one return 1>
0 0 0
1 0 0
1 1 1
0 0 0
1 1 1
0 0 0
so if you logical and 90 with 202 you get 74
mov al,90
mov bl,202
and al,bl
There are many uses for the logical AND, and it gives you low level editing power. There is also the logical OR, which only requires one bit to be 1 instead of both. It is like this
(result)
D D D
80 (logical or) 161 = 241
AL BL Result
0 1 1
1 0 1
0 1 1
1 0 1
0 0 0
0 0 0
0 0 0
0 1 1
mov al,80
mov bl,161
or al,bl
The last one I'm going to show you is the XOR, which is very useful for everything really. If both bits are the same it returns 0; if exactly one of the bits is 1 it returns 1.
(result)
D D D
255 255 0
1 1 0
1 1 0
1 1 0
1 1 0
1 1 0
1 1 0
1 1 0
1 1 0
These are very useful to learn and will be needed for further lessons.
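Here is a sketch that puts the three operations to work, assuming NASM and a DOS .com file. The AND line uses the numbers from the AND table above. The OR line is an illustration of my own: 'Q' (81) and 'q' (113) differ by exactly one bit, the one worth 32 (20h), so OR-ing that bit in turns an upper case letter into lower case. Run it and it should print q.

```nasm
; A sketch: AND, OR and XOR on real values
org 100h
mov al,90
and al,202      ; AL = 74, as in the AND table above
mov al,'Q'      ; 81  = 01010001
or al,20h       ; set the 32 bit: 01110001 = 113 = 'q'
mov dl,al
mov ah,2        ; DOS service 2: print the character in DL
int 21h         ; prints q
xor al,al       ; any value XOR itself is 0
int 20h         ; exit back to DOS
```

With this trick the "Q program" could get by with one compare after or al,20h instead of two, though that is only one possible way to write it.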
Now let's deal with bit shifting, which is super fast. Bit shifting is moving all the bits of a value left or right. Here is an example of a left shift
mov ax,0
mov al,2
Binary value of AL
0
0
0
0
0
0
1 <2nd position>
0
shl ax,1
Binary value of AL
0
0
0
0
0
1 <3rd position>
0
0
Which returns 4. On older X86 processors this is much faster than the multiply instruction. You can use the shift instructions to optimize your code: each shift left by one multiplies by 2, so shifting left by n multiplies by 2 to the power of n
Since 2^2 = 4
Shifting right does the opposite: it divides by 2. Bit shifting is a great feature of the X86.
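A short sketch of shifting in both directions, assuming NASM and a DOS .com file. The starting value 3 is my own pick. Each shl doubles the value and each shr halves it, so you can follow the numbers in the comments.

```nasm
; A sketch: multiplying and dividing by 2 with shifts
org 100h
mov ax,3
shl ax,1        ; AX = 6
shl ax,1        ; AX = 12
shl ax,1        ; AX = 24, which is 3 * 2^3
shr ax,1        ; AX = 12
shr ax,1        ; AX = 6
int 20h         ; exit back to DOS
```

One thing to keep in mind: shr throws away the bit that falls off the right end, so shifting 7 right once gives 3, not 3.5.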
Advance Shifting
If you have been playing around long enough you will realize that you want a function that prints out the number in a register, such as 42. No, not the ASCII character with code 42, which is *; you want the digits "42" to be printed. Well, there is hope for you, because we are going to design a program to do just that, but before we do we must learn about the call instruction. The call instruction lets you call a place in your program that does a certain job; that place is usually called a subroutine. It is very helpful. Why? Because you only have to type the function once, you can share it with other people with little hassle, and you can reuse the same function over and over again. The call instruction is like a jmp instruction, except you're expected to return to where you left off. Example
jmp somelabel
mov ax,1 ; never gets executed
somelabel:
int 20h
With a call instruction you must return with a ret instruction, like this
call somelabel
mov ax,1 ; gets executed once we return from somelabel
int 20h ; exit so we don't fall into somelabel a second time
somelabel:
ret
Note: be careful with the stack when you use the call instruction, because the processor pushes the IP register (the address of your program's next instruction) onto the stack. So if you push inside a subroutine, remember to pop the same number of times before you ret. We won't make a number printer just yet because that is too advanced, but we will be doing some calling.
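To show the push and pop rule in action, here is a sketch of a tiny subroutine, assuming NASM and a DOS .com file. The name print_char is my own. It saves AX on the stack, uses AH for the DOS service number, then restores AX before the ret, so the stack pointer is back at the return address when ret runs. It should print AB.

```nasm
; A sketch: a subroutine with balanced push and pop
org 100h
mov dl,'A'
call print_char
mov dl,'B'
call print_char
int 20h         ; exit before we fall into the subroutine

print_char:
push ax         ; save the caller's AX
mov ah,2        ; DOS service 2: print the character in DL
int 21h
pop ax          ; restore AX; SP now points at the return address again
ret             ; pops the return address and jumps back to the caller
```

If you left out the pop, ret would take the saved AX value as the return address and the program would jump somewhere random.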
Call Program
A very simple function would be a print-and-ask routine that asks a question and returns the key pressed. How it works: it prints the message whose offset is in DX, waits for a key press, and returns that key in AL.
;===========Program starts here
org 100h ;256 bytes in memory
jmp start
msg db "Do you like Opcodevoid $" ; always remember the $
start:
mov dx,msg
mov ah,9
call msgask
cmp al,'y'
je good
cmp al,'n'
je bad
bad:
good:
int 20h
msgask:
mov ah,9
int 21h ; interrupt 21h service 9, print the string at DS:DX
mov ax,0 ; clear ax, so ah = 0 selects the get key service
int 16h ; keyboard interrupt service 0, get key
ret ; we must always return
;==========End program
A simple program, no great wonder, but it should be good practice for you to make it better by adding some messages.
Advance Assembly:
In assembly terms the program above is not well coded; it should be thrown in the trash. Even though assembly is faster than C++, you still always need to optimize (to show how good assembly really is). Assembly allows you to optimize very easily, but before we get into the advanced stuff let's rewrite the program
;===========Program starts here
org 100h ;256 bytes in memory
jmp start
msg db "Do you like Opcodevoid $" ; always remember the $
start:
mov dx,msg
;no need to set ah to nine here since it is done in the function
call msgask
cmp al,'y'
je good
;no need to compare to 'n' since we fall into the bad code if we do not jump to the good label
bad:
good:
int 20h
msgask:
mov ah,9
int 21h ; interrupt 21h service 9
xor ax,ax ; clearing ax this way only takes 2 bytes, while mov ax,0 took 3!
int 16h ; keyboard interrupt service 0, get key
ret ; we must always return
;============End program
Sure, this can be optimized a lot more, but that was just an example of how to optimize code. The programs we have made so far are not useful for everyday life, so I decided to show you some useful programs, and not just snippets, so you will not be left alone in the dark trying to figure this stuff out. We will continue on in the same format. The first thing we should do, though, is gain a full understanding of addressing and the X86 and its ways.
============================================Addressing========================================================
Addressing is simple. I explained most of it before, but we must learn a fuller explanation of it (at least some more). RAM is the Random Access Memory in the computer. When the processor wants to get instructions or data it reads them from RAM, but how does it communicate with RAM?
Well, think of RAM as slots, where each slot is 8 bits of data. The address you want to read is really the slot you want, so let's say I wanted to read slot 4 from memory: I would simply tell the processor I want slot 4, and it will return 8 bits of data to a register.
MCC Control:
Now don't think the processor can just go into RAM as it pleases; it needs some help
[RAM]
[Slot 1]0000:0001
[Slot 2]0000:0002
[Slot 3]0000:0003
[Slot 4]0000:0004
[CPU]
How can the cpu communicate with the RAM to get its memory? Here is where the MCC comes in.
[RAM]
[Slot 1] 3. (The physical address)
[Slot 2] |
[Slot 3] |
|
|
[MCC]2. (Sure mister CPU just a second)
Step *1* (I want slot 1 MCC go get that for me)
As you can see, I numbered the steps so it is easy to find where you are. In step one the CPU requests RAM from the MCC, then the MCC gets the RAM and returns it (in a sense) to the cpu. Now you might be saying, what about segment and offset?
If you remember correctly, the processor combines the segment and the offset to form a physical address, then sends it to the MCC to get the RAM. It is much more complex than this; I'm oversimplifying so you can better understand. So far you have just been moving addresses around to point to memory, like
mov dx, msg
Wouldn't you like to use the MCC and reach directly into RAM? This is possible with assembly.
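mov byte [bx],1
That doesn't set bx to 1, but rather the memory location bx is holding. The word byte tells the assembler how much memory to write, and in 16 bit code only BX, BP, SI and DI may go inside the brackets, so we use BX here instead of AX. Example
mov bx,15
mov byte [bx],1
mov bx,15 moves the value 15 into the register bx
while mov byte [bx],1 sets location 15 in RAM to 1.
Of course, when you don't specify a segment it uses the default segment, DS,
so you're really saying mov byte [ds:bx],1. So memory location ds:bx will now hold one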
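Here is a sketch of reaching into memory that is safe to run, assuming NASM and a DOS .com file. In 16 bit code only BX, BP, SI and DI are allowed inside the brackets, so the sketch uses BX. Instead of writing to a fixed location like 15, it writes to a byte of its own, which I named cell, then reads it back and prints it. It should print Z.

```nasm
; A sketch: write a byte through BX, then read it back
org 100h
mov bx,cell         ; BX = offset of cell
mov byte [bx],'Z'   ; write one byte to DS:BX, replacing the '?'
mov dl,[bx]         ; read it back (DL is 8 bits, so the size is implied)
mov ah,2            ; DOS service 2: print the character in DL
int 21h             ; prints Z
int 20h             ; exit back to DOS
cell db '?'         ; one byte of data, placed after the exit so it never runs
```

Writing to your own data like this is a lot safer than poking at low memory, where the interrupt table lives.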
Segmentation:
Ok, if you're good at math you will say "AX is a 16 bit register, therefore I can only address 65536 bytes, offsets 0 to 0xFFFF, of memory". Yes, this is true. You see, Intel is very smart. When the first PCs came out they could have a huge amount of memory, 1 megabyte, but their biggest registers could only address 64k, so Intel created a scheme where you pair up two registers to access memory. This is called segmentation, where you have a segment and an offset like so
segment: offset
[es:bx]
where es is the segment and bx is the offset. The physical address (the final address) is calculated like this
es * 16 + BX = Physical address
In other words, the 16 bits of es are shifted left by 4 bits and the 16 bits of BX are added on, which forms the 20 bit physical address. Trust me, using segment and offset might be hard at first, but in DOS programs there is no other way.
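As a worked example, take the address 1145:3433 from earlier and read both numbers as hex: 1145h * 16 = 11450h, and 11450h + 3433h = 14883h. The sketch below does the same calculation in code for whatever ES and BX hold, assuming NASM and a DOS .com file. The 20 bit result doesn't fit in one 16 bit register, so it ends up in the pair DX:AX. The add and adc instructions have not been covered in this paper yet: add adds two values, and adc adds plus the carry left over from the add before it.

```nasm
; A sketch: compute the 20 bit physical address of ES:BX into DX:AX
org 100h
mov ax,es
mov dx,ax
mov cl,4
shl ax,cl       ; AX = low 16 bits of ES * 16
mov cl,12
shr dx,cl       ; DX = the top 4 bits that were shifted out of AX
add ax,bx       ; add the offset to the low part
adc dx,0        ; pass any carry on to the high part
int 20h         ; exit back to DOS
```

The shift counts go through CL because the original 8086 can only shift by 1 or by the count in CL.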