
Assembly Unleashed

By Opcode Void

 

 

Disclaimer: This paper is © by opcodevoid, and redistribution of this paper without permission from opcodevoid is a violation of law.

 

 

For more information please check out http://www.eliteproxy.com/ and, if you have any questions, please post them on our forums.

 

 

 

Assembly is the fastest language around; even the experts admit that a well-coded assembly program can outperform the most well-coded C++ program. A good reason is that assembly programs use CPU-specific instructions a lot. Assembly programmers don't have to follow a compiler's conventions and can therefore customize and optimize freely, while C++ programmers must follow the strict rules of their language.

 

 

==================--------------Assembly Definitions #1-----------

Assembly is the fastest thing around, and it is not really one language but a readable form of machine code. Why? Because each assembly language mirrors the machine code of one particular system. For example, if you wanted to program for the Nintendo, you could also program it in assembly. "But wait," you say, "I thought Intel invented assembly."

You see, assembly is a term used for the lowest level of programming around. We should always put the machine name before the word assembly, for example:

 

 

X86 Assembly: [Intel]

6502 Assembly: [Nintendo]

Z80 Assembly: [Gameboy]

 

 

Well, you're going to learn X86 assembly, which is what you want, right? OK, let's get started. Before we begin, we must also define what machine code really is, because many people get confused on the subject and begin to think machine code is faster than assembly, when assembly really is machine code written in a readable form. The way assembly works is that the assembler takes your instructions and translates them to opcodes. What is an opcode? An opcode (operation code) is the machine code for your instruction: a number that is really a command.

 

Example:

 

An assembler would assemble the command ret.

It would translate it to the number 195 (C3 in hex), which is the machine code for the command ret. To type that command, hold down Alt and type 195 on the numpad; under DOS this should produce the character ├. Congratulations, you just typed machine code. "WHAT?" you say. "Machine code is 0's and 1's." That is true in a sense: a computer has capacitors that are either charged or not. When a capacitor is not charged it is said to be 0; when it is charged it is said to be 1. The computer takes in a certain number of bits and processes them together. For example, an 8-bit system will read 8 bits like this.

 

_

/0\ = A capacitor that is off

_

/1\ = A capacitor that is on

 

 

So here is how the computer reads an 8 bit command

 

_ _ _ _ _ _ _ _

/0\ /0\ /0\ /0\ /0\ /0\ /0\ /0\

 

Which will return the Opcode 0, but if it reads something like this

 

_ _ _ _ _ _ _ _

/1\ /1\ /1\ /1\ /1\ /1\ /1\ /1\

 

This will return the Opcode 255

 

You might be confused if you never learned the binary system; it is very useful for understanding how computers work.

To convert a binary number to a decimal number, look at the examples below.

 

 

0001 is really 1. Read the bits from right to left and multiply each one by its place value (1, 2, 4, 8):

1: 1 * 1 = 1

0: 0 * 2 = 0

0: 0 * 4 = 0

0: 0 * 8 = 0

Because 1 + 0 + 0 + 0 = 1

 

But look at this

 

 

0011 is really 3 because

 

 

1: 1 * 1 = 1

1: 1 * 2 = 2

0: 0 * 4 = 0

0: 0 * 8 = 0

Because 1 + 2 + 0 + 0 = 3

So if all were 1's, like 1111, you could calculate it like this:

1: 1 * 1 = 1

1: 1 * 2 = 2

1: 1 * 4 = 4

1: 1 * 8 = 8

 

So 1 + 2 + 4 + 8 = 15
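The conversion worked through above can be sketched in a few lines of Python. This is only an illustration; the function name binary_to_decimal is made up for this example and is not part of any assembler.

```python
def binary_to_decimal(bits):
    """Add up the place value of every 1 bit, reading right to left."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position  # place values are 1, 2, 4, 8, ...
    return total


print(binary_to_decimal("0001"))  # 1
print(binary_to_decimal("0011"))  # 3
print(binary_to_decimal("1111"))  # 15
```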

 

 

This is the binary system. There is nothing special about it; it is just another way for us humans to write numbers. If the computer were a 4-bit computer it would read 4 bits at a time, so if it read 1111 it would execute opcode 15.

 

Here is a table

 

4 bits = a nibble

2 nibbles (8 bits) = 1 byte

2 bytes (16 bits) = 1 word

 

 

 

Now that you've got the binary system down pat, do a few binary problems, since I will not review this lesson again.

 

 

Question 1:

 

What does 1111 equal in decimal form?

 

Question 2:

What is an Opcode?

 

Question 3:

What does 1010 equal in decimal form?

 

 

 

Before we move on to the assembly language we must understand the CPU's fetch-and-execute cycle.

 

 

 

---------

CPU |

---------

\

\__________[Memory]

 

 

This graph is saying that the CPU reads instructions from memory. After it reads a certain number of bits, it executes the instruction. After this it goes on to the next set of bits.

 

 

Registers, Variables:

 

 

If you have programmed in high-level languages you will see a huge difference in assembly. Assembly has so-called internal "variables" that are called registers. They are not in memory; they are in the CPU itself. These registers have a fixed set of names and are subdivided into classes, in a sense. A good start in learning to program in assembly is to know the names of the registers. We will cover 16-bit registers for now. Don't panic, a 16-bit register is just 16 0's or 1's. So let's look at the four most used registers:

 

 

1.A

2.B

3.C

4.D

 

 

These are the most common registers. Now, to make things hard, you never refer to the registers like this anymore. Why? Because of the way Intel works.

Intel made its CPUs back in the old 8-bit days, when each register could only hold 8 bits. When they upgraded to 16-bit registers they wanted the old 8-bit programs to keep running on the new 16-bit system, so each 16-bit register can also be used as a pair of 8-bit registers, the low half and the high half. For example:

 

AL

AH

 

We use the letter 'L' to signify the low half, and the letter 'H' to signify the high half.

So the low half of A is AL and the high half is AH, and when you stick them together you get a full 16-bit register, which is A. Today we add on X, which stands for Extended, so when referring to register A you must add on either 'L', 'H', or 'X'. In other words:

 

AL = A Lower Half

AH = A Higher Half

AX = AH and AL put together

 

This also goes for B, C, and D. Look at the chart below

 

 

 

 

Ax

_____

|AH|AL|

_______

 

BX

______

|BH|BL|

_______

 

CX

______

|CH|CL|

_______

 

DX

______

|DH|DL|

_______

 

 

Which looks something like this:

 

[AX]

_________________________________

|0-0-0-0-0-0-0-0|0-0-0-0-0-0-0-0|

| [AH] | [AL] |

---------------------------------

 

So if you change AL you affect AX, and if you change AH you affect AX as well. If you change AX, you affect both AH and AL.
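To make the relationship between the halves and the full register concrete, here is a small Python sketch. The helper names combine and split are hypothetical; they only model the arithmetic, with AH as the upper 8 bits of AX and AL as the lower 8 bits.

```python
def combine(ah, al):
    """Build the 16-bit AX value from its high half AH and low half AL."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)


def split(ax):
    """Return (AH, AL) for a 16-bit AX value."""
    return (ax >> 8) & 0xFF, ax & 0xFF


ax = combine(0x12, 0x34)
print(hex(ax))                # 0x1234

ah, al = split(ax)
al = 0xFF                     # changing AL ...
print(hex(combine(ah, al)))   # 0x12ff  ... changes AX, while AH stays 0x12
```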

 

 

These registers also have a set of names that programmers call them. I will use the most common ones, since these terms may not be used everywhere.

 

A = Accumulator Register

 

B = Base Register

 

C = Counter Register

 

D = Data Register

 

 

There are many more registers, but we will cover those later.

 

 

 

 

Writing Assembly Programs:

 

First grab the NASM assembler, which is the assembler used in the rest of this paper.

 

Writing assembly programs requires you to have knowledge of the CPU and of the OS you're writing the program for. So we must understand DOS (since that is what we are programming for) before we move on.

DOS stands for Disk Operating System and is pretty simple, but what we want to learn is how it handles programs. DOS handles them in a fairly complex way, and it is different for each type of program. By type I mean:

 

*.exe

*.com

 

We will make a *.com file, since it is really simple: it is just plain instructions, while the exe format is pretty complex.

A COM file (no, not Component Object Model, for all you ActiveX programmers) is a command file, meaning raw binary. This term may not be used everywhere, though. When DOS sees a COM file it loads it into memory at offset 256 (100h) of the program's segment. Look at the graph below if you don't understand.

 

0000:

0001:

......

0256 (your program's instructions are loaded here)

 

 

Now we must deal with interrupts. An interrupt interrupts your program while it is running. There are two types of interrupts: software and hardware. Let's deal with hardware first.

A hardware interrupt comes along when the hardware changes state or a certain event occurs, for example when you press a key. The processor then hands control to the keyboard interrupt handler, which is another piece of code. Look at the graph below.

 

 

_______________________

|Your Program Executing|

_______________________

\

\[Someone presses A Key]

|

[Processor hands control to interrupt number 9, which is the keyboard interrupt]

|

[Keyboard interrupt does stuff, like check if Ctrl-Alt-Del was pressed]

|

[Keyboard interrupt returns control to your program]

|

[Your program resumes until another interrupt occurs]

 

There are many hardware interrupts, and they are very useful because they handle stuff for you.

Now, software interrupts are interrupts you always call on purpose, with the command int followed by the number, such as int 33, which calls interrupt 33.

Now you might think that you understand interrupts pretty well, but you must also learn about service numbers, because they choose which function you want the interrupt handler to perform. You mostly choose your service number by setting a number in the high half of A, which is AH.

Let's deal with some commands, such as the move command, which in assembly is mov. To use the command you choose a register and a number, like this:

 

Mov ah,1

 

This command sets AH, the high half of AX, to 1.

The mov command can do so much more; we will cover what else it can do later.

 

 

Now for learning about the assembler. The assembler has some special commands that are not related to the operating system or the CPU in any way. A good example is the comment, which is started with ';' (without the quotes). A good reason to use comments is so that you understand how your code works and can document it for others to look at.

 

; This will never get assembled into machine code

 

Also, the assembler has to calculate a lot of addresses, so you should help it out by telling it where you are going to be loaded in memory, with the org instruction:

 

org 256

 

 

Now for the Program

 

;;;;;;;;;;;;;;======================;;;;;;;;;;;

org 256 ; tell nasm we will be loaded at that location

mov ah,2 ; set AH, the high half of AX, to 2 (service: print character)

mov dl,1 ; set DL, the low half of DX, to 1 (the character to print)

int 33 ; call DOS interrupt 33 (21h)

int 32 ; call the exit COM interrupt (20h)

;;;;;;;;;;========End of Program=========;;;;;;;;;;

 

 

 

To assemble it, use nasm like this:

c:\>nasm myfile.asm

Then rename the output file so it has a .com extension.

You might feel in way over your head. It is really not that complex, but we must explain more about the DOS system.

 

The main DOS interrupt is 33 (21h). When you call it, it looks at the value of AH; if it is equal to 2, it uses the print character function, which then looks at the value of DL and prints that character to the screen.

So when calling interrupt 33, service number 2, remember that it is going to print out the character whose code is in DL. Note that this is not going to be the digit '1' but a happy face; I will explain why later.

Now we must look at int 32 (20h), which is exit COM. This just exits your program: no registers and no service numbers needed, just plain and simple. I know this might be a little tricky at first, but keep reading and you will soon understand more.

 

 

More about Numbers (ASCII and HEX):

 

There is a good chance you will never see anyone else code the program above like that. They would use hex instead. Hex is a base-16 number system. We use the base-10 number system in everyday life:

 

0

1

2

3

4

5

6

7

8

9

 

While Hex is like this

0

1

2

3

4

5

6

7

8

9

A = 10

B = 11

C = 12

D = 13

E = 14

F = 15

 

 

"So how do I calculate hex?" you ask. It is much like binary, except each place is worth a power of 16.

To calculate the number 1A you would go like this:

A: 10 * 16^0 = 10

1: 1 * 16^1 = 16

So 1A equals 26

 

Let's look at 21 in hex. Note: from now on I will put h at the end of a number to signify hex, like this: 21h

1: 1 * 16^0 = 1

2: 2 * 16^1 = 32

This returns 33

 

What does 100 in hex equal?

 

0: 0 * (16 ^ 0) = 0

0: 0 * (16 ^ 1) = 0

1: 1 * (16 ^ 2) = 256

0 + 0 + 256 = 256
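The same place-value idea can be checked with a short Python sketch. The names HEX_DIGITS and hex_to_decimal are invented for this illustration.

```python
HEX_DIGITS = "0123456789ABCDEF"


def hex_to_decimal(text):
    """Multiply each hex digit by its power of 16, reading right to left."""
    total = 0
    for position, digit in enumerate(reversed(text.upper())):
        total += HEX_DIGITS.index(digit) * 16 ** position
    return total


print(hex_to_decimal("1A"))   # 26
print(hex_to_decimal("21"))   # 33
print(hex_to_decimal("100"))  # 256
```

So int 21h is interrupt 33, and org 100h is the same as org 256.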

 

From now on I will mostly use hex numbers.

 

 

 

ASCII is a set of characters that are represented by numbers called ASCII codes. You can type ASCII codes with the numpad: hold down Alt and type 65 on the numpad, and it should produce a capital 'A'. Here is an ASCII table I ripped from someone. Note that some characters are not shown, since this was done in Notepad, so the happy face and a couple of other symbols won't appear.

 

 

 

Dec Octal Hex ASCII EBCDIC ASCII Codes

0 000 00 NUL NUL

1 001 01 blk Face SOH SOH ^A

2 002 02 [1] STX STX ^B

3 003 03 ETX ETX ^C

4 004 04 PF EOT ^D

5 005 05 HT ENQ ^E

6 006 06  LC ACK ^F

7 007 07 DEL BEL ^G

8 010 08 ... BS ^H

9 011 09 ... HT ^I

10 012 0A ... SMM LF ^J

11 013 0B VT VT ^K

12 014 0C FF FF ^L

13 015 0D ... CR CR ^M

14 016 0E SO SO ^N

15 017 0F SI SI ^O

16 020 10 DLE DLE ^P

17 021 11  DC1 DC1 ^Q

18 022 12  DC2 DC2 ^R

19 023 13  TM DC3 ^S

20 024 14  RES DC4 ^T

21 025 15  NL NAK ^U

22 026 16 BS SYN ^V

23 027 17 IL ETB ^W

24 030 18 CAN CAN ^X

25 031 19 EM EM ^Y

26 032 1A ... CC SUB ^Z

27 033 1B CU1 ESC ^[

28 034 1C  IFS FS ^\

29 035 1D IGS GS ^]

30 036 1E ‑ IRS RS ^^

31 037 1F IUS US ^_

32 040 20 DS SP

33 041 21 ! SOS !

34 042 22 " FS "

35 043 23 # #

36 044 24 $ BYP $

37 045 25 % LF %

38 046 26 & ETB &

39 047 27 ' ESC '

40 050 28 ( (

41 051 29 ) )

42 052 2A * SM *

43 053 2B + CU2 +

44 054 2C , ,

45 055 2D - ENQ -

46 056 2E . ACK .

47 057 2F / BEL /

48 060 30 0 0

49 061 31 1 1

50 062 32 2 SYN 2

51 063 33 3 3

52 064 34 4 PN 4

53 065 35 5 RS 5

54 066 36 6 UC 6

55 067 37 7 EOT 7

56 070 38 8 8

57 071 39 9 9

58 072 3A : :

59 073 3B ; CU3 ;

60 074 3C < DC4 <

61 075 3D = NAK =

62 076 3E > >

63 077 3F ? SUB ?

64 100 40 @ SP @

65 101 41 A A

66 102 42 B B

67 103 43 C C

68 104 44 D D

69 105 45 E E

70 106 46 F F

71 107 47 G G

72 110 48 H H

73 111 49 I I

74 112 4A J › J

75 113 4B K . K

76 114 4C L < L

77 115 4D M { M

78 116 4E N + N

79 117 4F O | O

80 120 50 P & P

81 121 51 Q Q

82 122 52 R R

83 123 53 S S

84 124 54 T T

85 125 55 U U

86 126 56 V V

87 127 57 W W

88 130 58 X X

89 131 59 Y Y

90 132 5A Z ! Z

91 133 5B [ $ [

92 134 5C \ * \

93 135 5D ] ) ]

94 136 5E ^ ; ^

95 137 5F _ ª _

96 140 60 ` `

97 141 61 a / a

98 142 62 b b

99 143 63 c c

100 144 64 d d

101 145 65 e e

102 146 66 f f

103 147 67 g g

104 150 68 h h

105 151 69 i i

106 152 6A j j

107 153 6B k , k

108 154 6C l % l

109 155 6D m _ m

110 156 6E n > n

111 157 6F o ? o

112 160 70 p p

113 161 71 q q

114 162 72 r r

115 163 73 s s

116 164 74 t t

117 165 75 u u

118 166 76 v v

119 167 77 w w

120 170 78 x x

121 171 79 y y

122 172 7A z : z

123 173 7B { # {

124 174 7C | @ |

125 175 7D } ' }

126 176 7E ~~ = ~~

127 177 7F  " 

128 200 80 €

129 201 81 a

130 202 82 ‚ b

131 203 83 ƒ c

132 204 84 „ d

133 205 85 … e

134 206 86 † f

135 207 87 ‡ g

136 210 88 ˆ h

137 211 89 ‰ i

138 212 8A Š

139 213 8B ‹

140 214 8C Œ

141 215 8D

142 216 8E Ž

143 217 8F

144 220 90

145 221 91 ‘ j

146 222 92 ’ k

147 223 93 “ l

148 224 94 ” m

149 225 95 • n

150 226 96 – o

151 227 97 — p

152 230 98 ˜ q

153 231 99 ™ r

154 232 9A š

155 233 9B ›

156 234 9C œ

157 235 9D

158 236 9E ž

159 237 9F Ÿ

160 240 A0  

161 241 A1 ¡

162 242 A2 ¢ s

163 243 A3 £ t

164 244 A4 ¤ u

165 245 A5 ¥ v

166 246 A6 ¦ w

167 247 A7 § x

168 250 A8 ¨ y

169 251 A9 © z

170 252 AA ª

171 253 AB «

172 254 AC ¬

173 255 AD ­

174 256 AE ®

175 257 AF ¯

176 260 B0 °

177 261 B1 ±

178 262 B2 ²

179 263 B3 ³

180 264 B4 ´

181 265 B5 µ

182 266 B6 ¶

183 267 B7 ·

184 270 B8 ¸

185 271 B9 ¹

186 272 BA º

187 273 BB »

188 274 BC ¼

189 275 BD ½

190 276 BE ¾

191 277 BF ¿

192 300 C0 À

193 301 C1 Á A

194 302 C2 Â B

195 303 C3 Ã C

196 304 C4 Ä D

197 305 C5 Å E

198 306 C6 Æ F

199 307 C7 Ç G

200 310 C8 È H

201 311 C9 É I

202 312 CA Ê

203 313 CB Ë

204 314 CC Ì

205 315 CD Í

206 316 CE Î

207 317 CF Ï

208 320 D0 Ð

209 321 D1 Ñ J

210 322 D2 Ò K

211 323 D3 Ó L

212 324 D4 Ô M

213 325 D5 Õ N

214 326 D6 Ö O

215 327 D7 × P

216 330 D8 Ø Q

217 331 D9 Ù R

218 332 DA Ú

219 333 DB Û

220 334 DC Ü

221 335 DD Ý

222 336 DE Þ

223 337 DF ß

224 340 E0 à

225 341 E1 á

226 342 E2 â S

227 343 E3 ã T

228 344 E4 ä U

229 345 E5 å V

230 346 E6 æ W

231 347 E7 ç X

232 350 E8 è Y

233 351 E9 é Z

234 352 EA ê

235 353 EB ë

236 354 EC ì

237 355 ED í

238 356 EE î

239 357 EF ï

240 360 F0 ð 0

241 361 F1 ñ 1

242 362 F2 ò 2

243 363 F3 ó 3

244 364 F4 ô 4

245 365 F5 õ 5

246 366 F6 ö 6

247 367 F7 ÷ 7

248 370 F8 ø 8

249 371 F9 ù 9

250 372 FA ú

251 373 FB û

252 374 FC ü

253 375 FD ý

254 376 FE þ

255 377 FF

 

 

Pretty long, huh? Well, OK.

 

 

 

 

Variables and labels:

 

 

In high-level languages like Visual Basic, you get the impression that variables hold data. This is false. Before we explain variables, it is about time you learned about segmentation and the other registers.

Segmentation and offsets are what you're about to learn now. DOS uses segmentation for memory, which combines two values to form the physical (real) memory location, which is termed an address. Here are a few basic terms:

 

Physical (Real)

Memory Location (Address)

 

There are two parts to a segmented address: the segment and the offset.

You combine them like this, segment:offset, where segment and offset are replaced with numbers, like this: 1145:3433

1145 is the segment, 3433 is the offset

The processor then calculates the physical address by multiplying the segment by 16 and adding the offset.
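Here is a minimal Python sketch of that calculation for the 8086 in real mode, where the segment is multiplied by 16 (shifted left 4 bits) and the offset is added. The function name physical_address is made up, and the sketch assumes the numbers 1145 and 3433 in the example are hex, as addresses usually are.

```python
def physical_address(segment, offset):
    """Real-mode rule: segment * 16 + offset, kept to the 8086's 20 address bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# 1145:3433, read as hex
print(hex(physical_address(0x1145, 0x3433)))  # 0x14883

# A COM program's first instruction sits at offset 100h of its segment
print(hex(physical_address(0x1145, 0x0100)))  # 0x11550
```

Notice that many different segment:offset pairs can point at the same physical address.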

 

 

 

Now for the registers. The other registers are not like A, B, C, and D, since they have no low half or high half; each is just one register. These registers are really special, and you can't just mess with all of them as you please.

Here are the registers:

 

Segment Registers

 

CS Code Segment

DS Data Segment

SS Stack Segment

ES Extra Segment

(FS) 386 and newer

(GS) 386 and newer

 

 

These are the segment registers. Most of them you can play with, but some you shouldn't, such as CS and SS.

CS = Code Segment, which is the segment where your code is

SS = Stack Segment; the stack will be explained as we go on

 

Pointer Registers

 

SI: Source Index

DI: Destination Index

IP: Instruction Pointer

You can play with SI and DI, but not IP, since it contains the address of the next instruction the CPU executes. Your code is located at CS:IP, because CS is your code's segment and IP is your code's offset.

 

 

Now let's deal with the stack. The stack is a special feature of most CPUs, including the X86: you can put stuff on the stack and get stuff back off it. The main purpose of the stack is storing temporary data. The stack works in a way that is hard to grasp at first, but these graphs should help. Before we move on, remember that SS holds the stack segment and SP holds the stack offset. To put stuff on the stack you use the command push followed by a register, like this:

 

mov ax,1

push ax

 

here is what happens

[SP = 20]

|------------[21:0000]

|------------[20:0000] =====Stack starts here

|------------[19:0000]

|------------[18:0000] ---- The value of 1 is here

 

After the push, SP equals 18. AX is 16 bits, and every 8 bits is a byte, so AX is a two-byte register and each push of it takes SP down by 2.

pop ax takes 2 bytes off the stack and puts them in AX; after the pop, SP increases by 2.

 

now look at this example

 

mov ax,1

mov bx,2

 

push ax

push bx

 

pop ax

pop bx

 

You might think that AX will hold 1 again, but you're wrong. The stack pointer decreases with each push and increases back toward the top with each pop, so the last value put on the stack is the first one out. They call this method Last In First Out, or LIFO. To do it correctly, it has to be like this:

 

mov ax,1

mov bx,2

 

push ax

push bx

 

pop bx

pop ax

 

If you don't understand, keep reading it over; it took me a while to get it too.
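If it helps, here is a toy Python model of the stack. It is only a sketch: the class name Stack16 and its dictionary "memory" are invented for illustration, and the starting SP of 20 is taken from the graph above.

```python
class Stack16:
    """Toy model of the 16-bit stack: push lowers SP by 2, pop raises it by 2."""

    def __init__(self, sp=20):
        self.sp = sp
        self.memory = {}  # maps a stack offset to the 16-bit value stored there

    def push(self, value):
        self.sp -= 2
        self.memory[self.sp] = value & 0xFFFF

    def pop(self):
        value = self.memory.pop(self.sp)
        self.sp += 2
        return value


stack = Stack16(sp=20)
ax, bx = 1, 2
stack.push(ax)           # SP goes from 20 to 18
stack.push(bx)           # SP goes from 18 to 16
ax = stack.pop()         # takes out the LAST value pushed, which was bx
bx = stack.pop()
print(ax, bx, stack.sp)  # 2 1 20
```

Popping in the same order as pushing swaps the two registers, which is exactly the mistake the first example makes.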

 

Now that you understand that, let's look at the true meaning of a variable. Variables do not exist in machine code. A variable is just a reference to a certain point in your program, so that you don't have to type the address out to refer to a piece of data. An assembler allows you to save the address of a certain location and refer to it using a name. This also goes for labels; a label is the same as a variable in most cases. Let's deal with labels first.

To create a label you just write a word (not a reserved word such as ax, bx, another register, or an instruction, but a unique word) followed by a ':'. An example is:

 

my_first_label:

Mov ax,1

 

Now whenever you want to set ax to 1, you would jump to that label, and the CPU would begin executing instructions there. "How do I jump to a certain location in my code?" you might ask. Simple: use the jmp command followed by a label. Look at this program:

 

 

;=-=========================================

org 100h ;(256) our start location

jmp start ; jump to the address of start

 

start: ; start is a label

 

mov ax,1

int 20h ; interrupt 32 (exit)

;=======================End of program=========

 

 

You might notice a few new things, like int 20h. When you put 'h' at the end of a number it becomes hex. The assembler will calculate it for you; it puts 32 in its place. Most people in assembly always use hex numbers, so I suggest you do too.

As you can see, we jump to 'start' with the jump instruction. The assembler replaces that with an address, like this: 'jmp <address of start>'. It would be pretty hard to type the address for every jump. We could do it this time, since we know start is right below us, but when you've got 1000 lines of code, using a label really helps.

Now that we know labels we can deal with variables and data types. A variable is much like a label: it just holds the offset of something. Variables are not declared the same way labels are. To declare a variable, write a word (not a reserved word), then choose your data type, then the value.

 

Nine db 9

 

Nine = Name

 

db = Define Byte, 8 bits (data type)

 

9 = Value

 

Now Nine holds the offset of where 9 is placed in memory. Of course, you might want to hold text in memory, not just single 8-bit values. To avoid confusion with the 16-bit word, I will use "string" to refer to a set of letters, like the real world does. So to declare a string of one-byte letters, do this:

 

Message db "Hello World"

 

Message holds the offset of the letters Hello World. Now that we know enough about that, let's write a program that prints out Hello World. Before we do this we must look at DOS int 21h (33). The service to print a string is 9; the offset of the string should be in DX and its segment in DS. DOS prints the string at the address DS:DX. "How does it know where to stop?" you ask. Simple: it prints character by character until it reaches a $ character. With that in mind, let's look at the program.

 

 

;===============================

org 100h

jmp start

msg db "Hello World$" ; remember the $ character: DOS will keep printing until it sees it

start:

mov ah,9 ; service print String

mov dx, msg ; dx holds the offset of msg

int 21h

int 20h

;===============================End of Program

 

Now you might be saying, "Hey, I never set DS to msg's segment." The reason is that it already has it: in the COM format DOS loads the whole program into one segment and points DS at it, and we told NASM our starting offset. Setting AH to 9 tells DOS to use the print string service. After that we exit with int 20h.
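To see why the $ matters, here is a rough Python imitation of what the print string service does. It is not DOS code; the function name dos_print_string and the sample memory contents are made up for this sketch.

```python
def dos_print_string(memory, offset):
    """Collect characters starting at offset until a '$' byte is reached."""
    chars = []
    while memory[offset] != ord("$"):
        chars.append(chr(memory[offset]))
        offset += 1
    return "".join(chars)


# Pretend this is the data segment; DX would hold the offset 0
memory = b"Hello World$whatever bytes come next"
print(dos_print_string(memory, 0))  # Hello World
```

Without the $, the loop would run on into whatever bytes come next, which is the garbage you see on screen when you forget it.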

 

 

 

Writing Smart Programs:

 

 

Up until now you have learned basic assembly for X86, but there is so much more, so I decided to show you how to make your programs react to certain situations. For example, what if you made a program that was supposed to quit when the 'Q' key was pressed? Then you would have to jump to the part of your program where you have the quit instruction. Like this:

 

 

 

[DID user press Q]

|

|[IF yes]

| \

| [QUIT]

|

[Continue Program]

 

 

How do we see if the user pressed Q? This is what we will deal with now. To test for a certain event, use the compare instruction, which is cmp, followed by a register, then a comma, then another register or a number, like this:

cmp ax,1

Or

cmp ax,bx

Or

cmp al,bl

But not

cmp ax,bl

since ax is 16 bits and bl is 8 bits.

 

After that, test the result by using a conditional jump command. Conditional jumps only jump if a certain condition is met. They are:

jl = jump if less

je = jump if equal

jg = jump if greater

jne = jump if not equal

jge = jump if greater or equal

jle = jump if less or equal

There are many more, but we will deal with these for now.

There are 3 things you are required to set up for this operation to work:

1. A label

2. A compare instruction

3. A conditional jump command

Example

 

 

cmp ax,1

je ax_is_1

int 20h

ax_is_1:

 

;rest of code here

 

What this will do is test if AX has the value of one. If not, it will exit the program; if it does, it will continue. This is how conditional jumps work in assembly.

Before we type that "Q program" we must learn how to get input from the keyboard. Lucky for us, we've got an interrupt that will handle the routine for us and will return control when it gets a key press. To do this we will use int 16h; since DOS doesn't do quite what we want here, we will use the BIOS interrupt.

BIOS stands for Basic Input/Output System, and it contains some functions for the operating system to use. Microsoft doesn't make the BIOS, since it is not a part of the operating system. The BIOS comes from the people who made your computer, such as Compaq or IBM. The BIOS, although made by different companies, always offers the same set of functions. We want to call interrupt 16h with service number 0 (which goes in the AH register). When we call it, control transfers to the BIOS, and we get control back when a key is pressed. When we get control back, AL holds the ASCII code of the key pressed. So if the user types a capital A, it will return 65.

 

 

========---------------------------=============================

org 100h

jmp start

 

msg db "Please Enter the letter Q",13,10,"$"

start:

 

 

repeat:

mov ah,9 ; INT 21h, service number 9 (print string)

mov dx,msg ; set dx to the offset of msg (NASM needs no 'offset' keyword)

int 21h ; call DOS interrupt 21h

mov ah,0 ; INT 16h, service 0, which is get keyboard input

int 16h ; call BIOS interrupt 16h

cmp al,'Q'

je done

cmp al,'q'

jne repeat

done:

int 20h

 

You might be confused about a couple of things, such as the variable declaration:

msg db "Please Enter the letter Q",13,10,"$"

It is very simple: you can separate things with commas. This is a useful feature, since we don't want 13 and 10 to be the text "13" and "10" but rather the byte 13, a carriage return, and the byte 10, a line feed, so that when the user doesn't press Q we skip down a line.

13 = Carriage Return

10 = Line Feed

Next we compare AL to both Q and q, since the user may not have Caps Lock on, and Q and q are two very different letters.

113 = 'q'

81 = 'Q'

So if the user presses 'Q' it returns 81, and if he presses 'q' it returns 113, so we must check for both. You might also be wondering why I put '' around my letters. That is because I don't want the assembler thinking I am referring to a variable named Q; I want it to put in the ASCII code for Q, which is 81. You must do the same if you want to change the letter to X or something.
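A quick Python check of those numbers, purely as an illustration (the variable name message is made up; it mirrors the msg declaration byte for byte):

```python
# The same bytes the assembler emits for: "Please Enter the letter Q",13,10,"$"
message = b"Please Enter the letter Q" + bytes([13, 10]) + b"$"

print(message[-3:])        # b'\r\n$'  (13 is carriage return, 10 is line feed)
print(ord("Q"), ord("q"))  # 81 113
print(chr(65))             # A
```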

 

 

BITS Operations:

 

There comes a time when you need to modify raw binary, since it gives you the most power. With assembly you get native bit-manipulation instructions, so we are going to learn about bit operations.

The logical AND operation requires both bits to be 1; if not, it returns 0. Example:

 

D = Decimal or base10

 

D D D

90 (logical and) 202 = 74(result)

AL BL Result

0 1 0 <Both are not one so return 0>

1 1 1 <Success both are one return 1>

0 0 0

1 0 0

1 1 1

0 0 0

1 1 1

0 0 0

 

so if you logical and 90 with 202 you get 74

 

 

mov al,90

mov bl,202

and al,bl

 

There are many uses for the logical AND, and it gives you low-level editing power. There is also the logical OR, which only requires one of the bits to be 1 instead of both. It is like this:

D D D

8 (logical or) 161 = 169(result)

AL BL Result

0 1 1

0 0 0

0 1 1

0 0 0

1 0 1

0 0 0

0 0 0

0 1 1

 

 

mov al,8

mov bl,161

or al,bl

 

 

The last one I'm going to show you is XOR, which is very useful for all sorts of things. If both bits are the same it returns 0; if exactly one of the bits is 1 it returns 1.

D D D

255 (logical xor) 255 = 0(result)

AL BL Result

1 1 0

1 1 0

1 1 0

1 1 0

1 1 0

1 1 0

1 1 0

1 1 0

These are very useful to learn and will be needed for further lessons.
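You can check all three operations yourself. Here is a small Python sketch; Python's &, | and ^ operators do the same bit-by-bit work as the and, or and xor instructions, and the helper name bits is made up to show the 8 bits of a value.

```python
def bits(value):
    """Show a value as 8 binary digits, the way it would sit in AL."""
    return format(value & 0xFF, "08b")


print(bits(90), bits(202))
print(90 & 202, bits(90 & 202))    # 74 01001010   (and al,bl)
print(8 | 161, bits(8 | 161))      # 169 10101001  (or al,bl)
print(255 ^ 255, bits(255 ^ 255))  # 0 00000000    (xor al,bl)
```

The last line is why xoring a register with itself always clears it.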

 

Now let's deal with bit shifting, which is super fast. Bit shifting is moving bits either left or right. Here is an example of a left shift:

 

 

mov ax,0

 

mov al,2

 

[Binary Value of AL]

 

0

0

0

0

0

0

1<2nd position>

0

 

 

shl ax,1

 

Binary value of AL now

 

0

0

0

0

0

1 <3rd position>

0

0

 

Which returns 4. On the older X86 processors this is much faster than the multiply instruction. You can use the shift instructions to optimize your code: shifting left by n multiplies by 2 to the power n.

Since 2 * 2 = 4

Shifting right does the opposite: it divides by 2 for each position. Bit shifting is a great feature of the X86.
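Here is a short Python sketch of the two shifts on a 16-bit register. The function names shl16 and shr16 are invented for this example; the mask 0xFFFF stands in for the fact that AX only has 16 bits, so anything shifted past the top is lost.

```python
def shl16(value, count):
    """Shift left inside a 16-bit register; bits pushed past bit 15 are dropped."""
    return (value << count) & 0xFFFF


def shr16(value, count):
    """Shift right inside a 16-bit register; zeros come in from the left."""
    return (value & 0xFFFF) >> count


print(shl16(2, 1))       # 4      same as 2 * 2
print(shl16(3, 4))       # 48     same as 3 * 16
print(shr16(4, 1))       # 2      same as 4 / 2
print(shl16(0x8000, 1))  # 0      the top bit fell off the end
```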

 

Advance Shifting

 

If you have been playing around long enough you will realize that you want a function that will print out the number in a register, such as 42. No, not the ASCII character with code 42, which is *; you want the digits "42" to be printed. Well, there is hope for you, because we are going to design a program to do just that. But before we do, we must learn about the call instruction. The call instruction lets you call a place in your program that performs a certain function, usually called a subroutine. It is very helpful. Why? Because you only have to type the function once, you can share it with other people with little hassle, and you can reuse the same function over and over again. The call instruction is like a jmp instruction, except you're expected to return to where you left off. Example:

jmp somelabel

mov ax,1 ; never gets executed

 

somelabel:

int 20h

 

 

With a call instruction you must return with a ret instruction, like this:

call somelabel

mov ax,1 ; gets executed once we return from somelabel

int 20h ; exit here so we do not fall through into somelabel

somelabel:

ret

Note: be careful with the stack when you use the call instruction, because the processor pushes the IP register (the address of your program's next instruction) onto the stack. So if you push inside a subroutine, remember to pop the same number of times before you return. We won't make an ASCII printer just yet, because that is too advanced, but we will be doing some calling.

 

 

 

Call Program

 

 

 

A very simple function would be a print-and-ask routine that asks a question and returns the key pressed. How it works: it prints the message whose offset is stored in DX, waits for a key press, and returns that key in AL.

 

 

 

 

 

;===========Program starts here

org 100h ;256 bytes in memory

jmp start

msg db "Do you like Opcodevoid $" ; always remember the $

 

start:

 

mov dx,msg

mov ah,9

call msgask

cmp al,'y'

je good

cmp al,'n'

je bad

 

bad:

good:

int 20h

msgask:

mov ah,9

int 21h ; interrupt 21h service 9

mov ax,0 ; clear ax, so ah = 0

int 16h ; keyboard interrupt service 0, get key

ret ; we must always return

 

;==========End program

 

 

 

 

A simple program, no great wonder, but it should be good practice for you to make it better by adding some messages.

 

Advance Assembly:

 

 

Now, by assembly standards the program above is not a well-coded program; it should be thrown in the trash. With assembly, although it can be faster than C++, you still always need to optimize (to show how good assembly really is). Assembly makes it very easy to optimize, but before we get into advanced stuff let's rewrite the program.

 

 

 

 

;===========Program starts here

org 100h ; our code starts 256 bytes (100h) into the segment

jmp start

msg db "Do you like Opcodevoid $" ; always remember the $

 

start:

 

mov dx,msg

;no need to set ah to 9 here, since msgask does that itself

call msgask

cmp al,'y'

je good

;no need to compare with 'n', since we fall through to the bad label if we do not jump to good

 

bad:

good:

int 20h

msgask:

mov ah,9

int 21h ; interrupt 21h service 9, print the $-terminated string at dx

xor ax,ax ; clearing ax this way takes only 2 bytes, while mov ax,0 took 3!

int 16h ; keyboard interrupt service 0, get key (returned in al)

ret ; we must always return

 

;============End program
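
To check the size claim for clearing ax, here is a small sketch in Python that simply counts hand-assembled bytes. It assumes the usual 16-bit encodings (B8 followed by a 16-bit immediate for mov ax,0, and 31 C0 for xor ax,ax); the constant names are invented for this example, and your assembler may pick the equivalent 33 C0 form for the xor, which is the same length.

```python
# Hand-assembled encodings, assuming 16-bit code as in a DOS .COM file.
MOV_AX_0 = bytes([0xB8, 0x00, 0x00])   # mov ax,0  : opcode B8 + 16-bit immediate
XOR_AX_AX = bytes([0x31, 0xC0])        # xor ax,ax : opcode 31 + ModR/M byte C0

saved = len(MOV_AX_0) - len(XOR_AX_AX)

print(len(MOV_AX_0), len(XOR_AX_AX), saved)
```

So the xor form saves one byte per use. One difference to keep in mind: xor changes the flags while mov leaves them alone, so the swap is only safe where nothing after it depends on the old flags.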

 

 

Sure, this can be optimized a lot more, but those were just some examples of how to optimize code. The programs we have made so far are not useful for everyday life, so I decided to show you some useful programs (whole programs, not snippets) so you will not be left alone in the dark trying to figure this stuff out. We will continue on in the same format. The first thing we should do, though, is gain a fuller understanding of addressing and the x86 and its ways.

 

 

============================================Addressing========================================================

 

Addressing is simple. I explained most of it before, but we need a fuller explanation of it (at least some more). RAM is the Random Access Memory in the computer; when the processor wants to get instructions or data, it reads them from RAM. But how does it communicate with RAM?

 

Well, think of RAM as slots, where each slot holds 8 bits of data. The address you want to read is really the slot you want, so if I wanted to read slot 4 from memory I would simply tell the processor I want slot 4, and it would return 8 bits of data to a register.

 

MCC Control:

 

 

Now don't think the processor can just go into RAM as it pleases; it needs some help.

 

 

 

 

[RAM]

[Slot 1]0000:0001

[Slot 2]0000:0002

[Slot 3]0000:0003

[Slot 4]0000:0004

[CPU]

 

 

How can the CPU communicate with the RAM to get at its memory? Here is where the MCC comes in:

 

 

[RAM]

[Slot 1] 3. (The physical address)

[Slot 2] |

[Slot 3] |

|

|

[MCC] 2. (Sure, mister CPU, just a second)

[CPU] 1. (I want slot 1; MCC, go get that for me)

 

 

As you can see, I numbered the steps so it is easy to find where you are. In step one the CPU requests RAM from the MCC; then the MCC gets the RAM and returns it (in a sense) to the CPU. Now you might be asking: what about segment and offset?

 

If you remember correctly, the processor takes the segment (multiplied by 16) and the offset and adds them together to form a physical address, then sends it to the MCC to get the RAM. It is much more complex than this; I'm oversimplifying so you can better understand. So far you have just been moving stuff around to point to memory, like

 

mov dx, offset msg

 

Wouldn't you like to use the MCC and reach directly into RAM? This is possible with assembly.

mov byte [bx],1

That doesn't move 1 into bx, but rather into the memory location bx is holding. In 16-bit code only bx, si, di and bp can go inside the brackets, so we use bx here and not ax, and the word byte tells the assembler to store 8 bits. Example:

mov bx,15

mov byte [bx],1

mov bx,15 moves the value 15 into the register bx,

while mov byte [bx],1 stores the value 1 at location 15 in RAM.

Of course, when you don't specify a segment it uses the default segment, DS,

so you're really saying mov byte [ds:bx],1. Memory location ds:bx will now hold one.
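
The difference between a register's value and the memory it points at can be modeled in a few lines of Python. This is only a sketch under simple assumptions: the bytearray named segment stands in for the 64 KB that DS covers, and the pointer register is called bx because 16-bit x86 code only accepts bx, si, di or bp inside the brackets.

```python
# Toy model of one 64 KB data segment; names are illustrative only.
segment = bytearray(0x10000)   # 65,536 one-byte slots, all starting at zero

bx = 15                # mov bx,15        -> the register holds the number 15
segment[bx] = 1        # mov byte [bx],1  -> slot 15 of the segment now holds 1

print(bx)              # the register itself is unchanged
print(segment[15])     # the slot it points at was written
print(segment[14])     # neighbouring slots are untouched
```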

 

 

Segmentation:

 

 

OK, if you're good at math you will say, "A register like BX is only 16 bits, therefore I can only access 65,536 bytes of memory (addresses 0 to 0xFFFF)." Yes, this is true, but you see, Intel was very smart. When the first PCs came out they could address a huge amount of memory for the time, 1 megabyte, but their biggest registers could only reach 64K. So Intel created a scheme where you pair up two registers to access memory; this is called segmentation. You have a segment and an offset, like so:

 

segment: offset

 

[es:bx]

 

where es is the segment and bx is the offset. The physical address (the final address) is calculated like this:

 

es * 16 + BX = Physical address

 

Or, put another way, shift the 16 bits of ES left by 4 bits and add all 16 bits of BX to form the 20-bit physical address. Trust me, using segment and offset might be hard at first, but in 16-bit code like this there is no other way.
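
The segment * 16 + offset formula is easy to try out with a few numbers. The sketch below is in Python so it can be run anywhere; the function name physical_address is made up for this example, and the 20-bit mask is an assumption that models the 20 address lines of the original 8086.

```python
def physical_address(segment: int, offset: int) -> int:
    """Segment * 16 + offset, kept to 20 bits like the 8086 address bus."""
    return ((segment << 4) + offset) & 0xFFFFF


# Shifting left by 4 bits is the same as multiplying by 16.
print(hex(physical_address(0xB800, 0x0000)))

# Two different segment:offset pairs can name the same physical byte.
print(hex(physical_address(0x1234, 0x0005)))
print(hex(physical_address(0x1000, 0x2345)))
```

The last two calls give the same result, which shows that a physical address does not have one unique segment:offset spelling.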