Many parts of this page are based heavily on, or taken directly from, Sean McLaughlin's Learn Assembly in 28 Days, Day 3.
Necessary reading, this. Computers don't count the same way you and I do.
All number systems use a particular radix. Radix is synonymous with "base" if it helps, although I should caution you that saying such things as "All you radix are belong to us" is a great way to get people to throw pointy things at you (but whether it's because of the horrid pun or the tired pop-culture reference is hard to say... :-). To understand what a radix is, consider our everyday system of numbers, which uses base ten.
As you learned in grade school and forgot over summer, in a base ten number each digit specifies a certain power of 10, and as a consequence you need ten different digits to denote any number. The rightmost digit specifies 10⁰, the second digit specifies 10¹, the third 10² and so on. You can, therefore, break down a decimal number, such as 2763₁₀, like this (although it does wind up being redundant):
2763₁₀ = (2 x 10³) + (7 x 10²) + (6 x 10¹) + (3 x 10⁰)
       = (2 x 1000) + (7 x 100) + (6 x 10) + (3 x 1)
       = 2000 + 700 + 60 + 3
       = 2763₁₀
Computers enjoy working with two other bases: binary and hexadecimal. Octal is base-8, and has mostly died out; about the only place you are still likely to meet it is UNIX, which uses it for things like file permissions.
Binary is a base-2 system, so it uses only two digits (0 and 1), and each digit represents a power of 2:
10110101₂ = (1 x 2⁷) + (0 x 2⁶) + (1 x 2⁵) + (1 x 2⁴) + (0 x 2³) + (1 x 2²) + (0 x 2¹) + (1 x 2⁰)
          = (1 x 128) + (0 x 64) + (1 x 32) + (1 x 16) + (0 x 8) + (1 x 4) + (0 x 2) + (1 x 1)
          = 128 + 32 + 16 + 4 + 1
          = 181₁₀
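If you would rather let a computer do the expansion, here is a minimal Python sketch of the same digit-times-power sum. The name positional_value is made up for this page, and the digit alphabet stopping at F is just an assumption to keep it short.

```python
def positional_value(digits, radix):
    """Expand a digit string the long way: sum of digit * radix**position."""
    symbols = "0123456789ABCDEF"  # assumed digit alphabet, enough for radix <= 16
    total = 0
    # Walk from the rightmost digit, which carries radix**0.
    for position, char in enumerate(reversed(digits.upper())):
        digit = symbols.index(char)
        if digit >= radix:
            raise ValueError(f"{char!r} is not a base-{radix} digit")
        total += digit * radix ** position
    return total


print(positional_value("2763", 10))     # 2763
print(positional_value("10110101", 2))  # 181
```

Python's built-in int("10110101", 2) gives the same answer; the loop is spelled out only to mirror the expansion above.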
A single binary digit is familiarly called a bit. Eight bits are called a byte. Other combinations you could hear about:
Name        Size
nibble      4 bits
word        16 bits
dword       32 bits
quadword    64 bits
Since the Z80 can directly manipulate only bytes and words (and nibbles in some circumstances), most of the data handling you do will involve those, so you don't have to concern yourself with the others too much (although it would still be a good idea to familiarize yourself with them).
We will find ourselves working with, or at the very least referencing, the individual bits of a byte or word. The nomenclature:
- If we lay the bits out horizontally, we call the rightmost bit "bit 0", and each bit to the left is given a number one greater.
- The leftmost and rightmost bits are given special names: the leftmost bit is called the high-order bit or the most-significant bit (since it controls the highest power of two of the number, it makes the most significant contribution to the value). The rightmost bit is called the low-order bit or the least-significant bit.
- We can apply these points to nibbles in a byte, bytes in a word or dword, etc. So for example the rightmost byte in a 64-bit quantity would be termed the least-significant byte.
Hexadecimal is base-16, so it uses 16 digits: the regular digits 0 to 9, and the letters A to F that correspond to the decimal values 10 to 15.
1A2F₁₆ = (1 x 16³) + (10 x 16²) + (2 x 16¹) + (15 x 16⁰)
       = (1 x 4096) + (10 x 256) + (2 x 16) + (15 x 1)
       = 4096 + 2560 + 32 + 15
       = 6703₁₀
Hex values have an interesting relationship with binary: take the number 11010011₂. In hex, this value is represented as D3₁₆, but consider the individual digits:
D₁₆ = 1101₂
3₁₆ = 0011₂
Compare these two binary numbers with the original. You should see that one hex digit is equivalent to one nibble. This is what's so great about hexadecimal: converting the binary numbers used by the computer into more manageable hex values is a snap.
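To see the nibble trick in action, here is a small Python sketch that converts binary to hex four bits at a time. The function name binary_to_hex is hypothetical; it just illustrates the grouping described above.

```python
def binary_to_hex(bits):
    """Convert a binary digit string to hex one nibble at a time."""
    # Pad on the left so the length is a multiple of four bits.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    # Cut the string into 4-bit groups, leftmost nibble first.
    nibbles = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # Each nibble maps to exactly one hex digit.
    return "".join("0123456789ABCDEF"[int(nibble, 2)] for nibble in nibbles)


print(binary_to_hex("11010011"))  # D3
```

No multiplication or division is needed at any point, which is why the conversion is so easy to do in your head.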
To designate the base above, we have adopted the notation used by many mathematicians and written the radix as a subscript. Too bad we must write assembly code in plain text, which has no capability for such niceties. The way to denote the radix varies between assemblers, but in all cases it involves attaching an extra character or characters to the number. TASM gives you a choice between a symbolic prefix and an alphabetic suffix.
Prefix Format    Suffix Format    Base
%10011011        10011011b        Binary
$31D0            31D0h            Hexadecimal
@174             174o             Octal
12305 *          12305d           Decimal
* no prefix
It doesn't matter which format you use, provided you don't mix them (like $4F33h). The prefix formats may be easier to read, since the letter sort of gets lost among the numbers (especially if it's upper case).
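As a quick check on what the prefix and suffix formats mean, here is a Python sketch that reads a number written either way. It is only an illustration: parse_literal is a made-up helper, not how TASM itself parses numbers, and it assumes a literal carries a prefix or a suffix but never both.

```python
def parse_literal(text):
    """Return the value of a number written with a radix prefix or suffix."""
    prefixes = {"%": 2, "$": 16, "@": 8}
    suffixes = {"b": 2, "h": 16, "o": 8, "d": 10}
    # A symbolic prefix wins; whatever follows must be digits of that base.
    if text[0] in prefixes:
        return int(text[1:], prefixes[text[0]])
    # Otherwise look for an alphabetic suffix (upper or lower case).
    if text[-1].lower() in suffixes:
        return int(text[:-1], suffixes[text[-1].lower()])
    # No marker at all means plain decimal.
    return int(text, 10)


print(parse_literal("%10011011"))  # 155
print(parse_literal("31D0h"))      # 12752
print(parse_literal("@174"))       # 124
print(parse_literal("12305"))      # 12305
```

In this sketch a mixed form like $4F33h is rejected with a ValueError, because the trailing h is not a hex digit.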
Registers are sections of very expensive RAM inside the CPU that are used to store numbers and rapidly operate on them. There are fourteen registers on this CPU: A B C D E F H I L R PC SP IX and IY. You don't need to concern yourself about registers I, R, PC and SP yet.
The single-letter registers are 8 bits in size, so they can store any number from 0 to 255. Since this is oftentimes inadequate, they can be combined into four register pairs: AF BC DE HL. These, along with IX and IY, are 16-bit, and can store a number in the range 0 to 65535.
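Here is a rough Python sketch of how two 8-bit values make up one 16-bit pair, assuming (as the names H and L hint) that the first register of a pair holds the high byte. The helper names are invented for this example.

```python
def pair_value(high, low):
    """Combine two 8-bit register values into one 16-bit pair value."""
    return (high << 8) | low


def split_pair(value):
    """Split a 16-bit value back into its (high, low) bytes."""
    return value >> 8, value & 0xFF


print(pair_value(0xFF, 0xFF))  # 65535
print(split_pair(0x8325))      # (131, 37)
```

With both halves at their maximum of 255 you get 65535, which is where the upper limit for a register pair comes from.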
These registers are general purpose, to a point. What I mean by that is that you can usually use whichever register you want, but many times you are forced to, or it's just better to, use a specific one. For example, when loading a register other than A, only HL, IX and IY can be used for indirect memory addressing; when loading to or from A, BC and DE can be used as well.
The special uses of the registers:
- A is also called the "accumulator". It is the primary register for arithmetic operations and accessing memory. Indeed, for most 8-bit arithmetic it's the only register that can receive the result.
- B is commonly used as an 8-bit counter.
- C is used when you want to interface with hardware ports.
- F is known as the flags. The bits of this register signify (that is to say they "flag") whether certain events have occurred. For example, one of the flags (bits) reports if the accumulator is zero or not. The uses of the flags will be explained on a later day because we have no use for them at this point.
- I is the high byte of the interrupt vector when the processor is in Interrupt Mode 2 (IM 2); the low byte of the vector comes from the data bus and is functionally random. Note that this vector is first used to read an address from RAM, and that address is then called. You can only load to and from I using A.
- R is the dynamic RAM refresh register. It increases after every instruction by an amount depending on the instruction, so its contents are pseudo-random. You can only load to and from it using A.
The two bytes of all 16-bit registers can be used separately as well.
- AF is only used in pushes and pops.
- HL has two purposes. One, it is like the 16-bit equivalent of the accumulator, i.e. it is used for 16-bit arithmetic. Two, it stores the high and low bytes of a memory address.
- BC is used by instructions and code sections that operate on streams of bytes as a byte counter.
- DE holds the address of a memory location that is a destination.
- PC holds the address of the currently executing instruction. You can only change its contents with jumps, calls and rets. The only way to load directly to it is 'jp (hl)', which you could see as a 'ld pc,hl'.
- SP is the Stack Pointer. It determines where in RAM values are pushed and popped. The only register pair you can load into it is HL (or an index register), and the only way to get its value into a register is with arithmetic like 'add hl,sp': if HL was zero before the addition, it now holds the value of SP.
- IX and IY are funky li'l registers called the index registers. Almost everywhere HL is acceptable, so too are IX and IY. It is important to note that using IX and IY results in slower and more inflated code than HL would (approximately double the size and time) because they were not present on the 8080 (the processor the Z80 is based on), so use them only when necessary (usually when HL is tied up). IX and IY can do something special that no other register can, though; we'll look at that in due time.
To store to a register, you use the LD instruction.
LD destination, source
    Stores the value of source into destination.
The valid operand combinations for LD are tabled here. There are many more, but they involve registers you haven't heard of yet.
Note: imm8: 8-bit immediate value. imm16: 16-bit immediate value.
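Rows are the source, columns are the destination (* = allowed, - = not allowed).
Source   A  B  C  D  E  H  L  BC  DE  HL  (BC)  (DE)  (HL)  (imm16)
A        *  *  *  *  *  *  *  -   -   -   *     *     *     *
B        *  *  *  *  *  *  *  -   -   -   -     -     *     -
C        *  *  *  *  *  *  *  -   -   -   -     -     *     -
D        *  *  *  *  *  *  *  -   -   -   -     -     *     -
E        *  *  *  *  *  *  *  -   -   -   -     -     *     -
H        *  *  *  *  *  *  *  -   -   -   -     -     *     -
L        *  *  *  *  *  *  *  -   -   -   -     -     *     -
BC       -  -  -  -  -  -  -  -   -   -   -     -     -     *
DE       -  -  -  -  -  -  -  -   -   -   -     -     -     *
HL       -  -  -  -  -  -  -  -   -   -   -     -     -     *
(BC)     *  -  -  -  -  -  -  -   -   -   -     -     -     -
(DE)     *  -  -  -  -  -  -  -   -   -   -     -     -     -
(HL)     *  *  *  *  *  *  *  -   -   -   -     -     -     -
(imm16)  *  -  -  -  -  -  -  *   *   *   -     -     -     -
imm8     *  *  *  *  *  *  *  -   -   -   -     -     *     -
imm16    -  -  -  -  -  -  -  *   *   *   -     -     -     -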
You obviously have no clue what difference parentheses make for an operand. You'll see shortly. You can only load to/from I and R using A.
LD A, 25         Stores 25 into register A.
LD D, B          Stores the value of register B into register D.
LD ($8325), A    Stores the value of register A into address $8325 (explained later on).
Some points that should be made clear:
The two operands for LD cannot both be register pairs, except when the destination is SP. You have to load the registers separately:
    ; Since we can't do LD DE, HL...
    LD D, H
    LD E, L
    ; But we can do this:
    LD SP, HL
    ; The following instruction effectively loads HL into PC
    JP (HL)
If you use LD with a number that is too big for the register to hold, you will get an error at assembly time. Storing negative numbers, however, is legal, but the number will get "wrapped" to fit. For example, if you assign -1 to A, it will really hold 255. If you assign -2330 to BC, it will really hold 63206. Adding the negative number to one more than the maximum value the register can hold (that is, to 256 or 65536) gives you the value that will be stored. There is a reason for this phenomenon that will be made clear shortly.
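The wrapping rule is easy to try out. Here is a minimal Python sketch of it; wrapped is a name invented for this page, and the sketch makes no attempt to reproduce the assembler's "too big" error checking.

```python
def wrapped(number, bits):
    """Value a register of the given width ends up holding."""
    modulus = 1 << bits  # one more than the register's maximum (256 or 65536)
    # Negative numbers wrap around by adding the modulus once.
    return number + modulus if number < 0 else number


print(wrapped(-1, 8))      # 255
print(wrapped(-2330, 16))  # 63206
```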
An instruction similar to LD, but functionally different, is EX. Despite the fact that it is very particular about its operands, it is a very useful instruction (90% of the time the registers you want to exchange are HL and DE).
EX DE, HL
    Swaps the value of DE with the value of HL.
If you want to swap another register pair with HL (or an index register) without losing another register, you could do the following (although it is slow):
    PUSH BC
    EX (SP), HL
    POP BC
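If it isn't obvious why those three instructions swap BC and HL, here is a rough Python model of them. The stack is just a Python list and swap_bc_hl is a hypothetical name; this traces the values only, not the real CPU.

```python
def swap_bc_hl(bc, hl):
    """Trace PUSH BC / EX (SP),HL / POP BC using a list as the stack."""
    stack = []
    stack.append(bc)               # PUSH BC: old BC is now on top of the stack
    stack[-1], hl = hl, stack[-1]  # EX (SP),HL: top of stack and HL trade places
    bc = stack.pop()               # POP BC: BC receives the old HL
    return bc, hl


print([hex(value) for value in swap_bc_hl(0x1234, 0xABCD)])  # ['0xabcd', '0x1234']
```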
Registers F and AF cannot be used as operands for the LD instruction. Actually, apart from a few exceptions, these registers cannot be operands for any instruction.
Up to this point there has been an implication that registers are only capable of assuming positive values, but in the real world negative numbers are just as common. Fortunately, there are ways to represent negative numbers. In assembly, we can treat a number as either signed or unsigned. Unsigned means that the number can only take on non-negative values; signed means that the number can be either positive or negative. It is this concept of signed numbers we need to look at.
It turns out that there are many signed numbering schemes, but the only one we're interested in is called two's complement. When we have a signed value in two's complement, the most significant bit of the number is termed the sign bit and its state determines the sign of the number. The existence of the sign bit reduces the number of bits at our disposal to represent the magnitude by one: for a string of eight bits, we can have a numeric range of -128 to +127. For a string of sixteen, it's -32,768 to +32,767, etc.
As to what bearing the state of the sign bit has on the value, it is this: if the sign bit is clear, the value is a positive quantity and is stored normally, as if it were an unsigned number. If the sign bit is set, the value is negative and is stored in two's complement format. To convert a positive number to its negative counterpart, you have two methods, either of which you can choose based on convenience.
- Calculate zero minus the number (like negative numbers in the Real World). If you're confused how to do this, you can consider 0 and 256 (or 65536 if appropriate) to be the same number. Therefore, -6 would be 256 - 6 or 250: %11111010.
- Flip the state of every bit, and add one. Therefore, -6 would be %11111001 + 1 or %11111010.
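Both methods are short enough to sketch in Python so you can check that they agree. The function names are invented for this example, and the default width of 8 bits is an assumption; pass bits=16 for a word.

```python
def negate_by_subtraction(value, bits=8):
    """Method 1: zero minus the number, treating 0 and 2**bits as the same."""
    return (0 - value) % (1 << bits)


def negate_by_flipping(value, bits=8):
    """Method 2: flip every bit, then add one."""
    mask = (1 << bits) - 1           # %11111111 for 8 bits
    return ((value ^ mask) + 1) & mask


print(format(negate_by_subtraction(6), "08b"))  # 11111010
print(format(negate_by_flipping(6), "08b"))     # 11111010
```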
There is one special case of two's complement where negation fails, and that is when you try to negate the smallest negative value:
%10000000    -128
%01111111    Invert all bits
%10000000    Add one (= -128)
Of course -(-128) isn't -128, but the value +128 cannot be represented in two's complement with just eight bits, so the smallest negative value can never be negated.
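You can watch the special case happen with a few lines of Python. This is only a sketch with made-up helper names: negate_byte does the 8-bit negation, and to_signed reads the result back as a signed value.

```python
def negate_byte(byte):
    """Two's complement negation, kept to eight bits."""
    return (-byte) & 0xFF


def to_signed(byte):
    """Read an 8-bit pattern as a signed number (sign bit is bit 7)."""
    return byte - 256 if byte & 0x80 else byte


print(to_signed(negate_byte(0x7F)))  # -127
print(to_signed(negate_byte(0x80)))  # -128
```

Negating +127 works as expected, but negating the pattern for -128 hands back the very same pattern.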
There is an instruction to automate two's complement:
NEG
    Calculates the two's complement of the accumulator.
I'm sure you find the theory wonderfully engrossing, but what you're probably interested in is how the CPU handles the difference between unsigned and signed numbers. The answer is, it doesn't. You see, the beauty of two's complement is that if we add or subtract two numbers, the result will be valid for both signed and unsigned values:
                 unsigned    signed
    %00110010        50          50
  + %11001110     + 206       + -50
  %1 00000000       256           0    (Disqualify ninth bit)
This phenomenon was not lost on the makers of the Z80. Only with two's complement can you use the same hardware for adding signed numbers as for unsigned numbers, and less hardware means a cheaper chip (true, nowadays you can get a fistful of Z80s for fifty cents, but back in 1975 it was a big deal; just look at the 6502).
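To convince yourself that one adder really serves both interpretations, here is a small Python sketch. The helper names are hypothetical; add_bytes throws away the ninth bit the same way the example above does.

```python
def add_bytes(a, b):
    """Add two 8-bit patterns and discard the ninth bit."""
    return (a + b) & 0xFF


def to_signed(byte):
    """Read an 8-bit pattern as a signed number (sign bit is bit 7)."""
    return byte - 256 if byte & 0x80 else byte


# 50 + 206 unsigned is the same bit pattern as 50 + -50 signed.
print(add_bytes(0b00110010, 0b11001110))  # 0

# 250 + 10 unsigned wraps to 4, and -6 + 10 signed is also 4.
result = add_bytes(0b11111010, 0b00001010)
print(result, to_signed(result))  # 4 4
```

The addition itself never asks whether the operands are signed; only the way you read the inputs and the result changes.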