Instruction Sets
Below are the rudiments of typical computer instruction-sets as they affect code-generation by a compiler. The main kinds of machines are 0-address, 1-address, 2-address, and 3-address machines. A particular computer may have instructions of only one kind, or it may have a mixture of instructions of two or more kinds.
General
- Typical operation-code mnemonics:
-
- <opcd> ::= load | sto (store) | add | sub | rev_sub | mult | div | rev_div | and | or | compare |...
<op> ::= + | - | rev- | * | % | rev% | & | or | compare |...
<cond> ::= eq | ne | le | lt | ge | gt |...- note that (x `rev-` y) is (y-x), and that (x `rev%` y) is (y%x).
- Division (div) typically leaves the result and the remainder in a pair of registers on a 2-address machine, possibly an even-odd pair.
- <opcd> ::= load | sto (store) | add | sub | rev_sub | mult | div | rev_div | and | or | compare |...
- There may be instructions for
- bit-shifting (left | right) and (arithmetic | logical | circular).
- As well is general-purpose integer-|index-registers, there may also be separate registers for floating-point arithmetic (possibly single and double precision).
- The following ignores byte- | half-word- | word- | double-word |... addressing. It assumes that memory is in units all of one size -- the word. In fact many machines use byte-addressing where a word is probably 4 bytes (?maybe 8 bytes on a 64-bit machine?) and is probably aligned on boundaries that are multiples of that size. In that case an op-code would also indicate if it stands for a half-word, word or double-word operation where necessary.
0-address machine (reverse Polish) | ||
---|---|---|
where the processor has a stack and some supporting hardware, at least a top of stack (TOS) pointer. |
||
operation | e.g. or comment | |
load_literal <int> | effect: TOS:=TOS+1; stack[TOS]:=<int> load a constant onto the top of stack; this can be used in arithmetic or to get an address onto the stack for use by a load or a store instruction later (it is splitting hairs to argue whether the literal is an address or a constant which might happen to be used as an address elsewhere) |
|
load | effect: stack[TOS]:=memory[stack[TOS]] take the top-of-stack as an address, replace the top-of-stack with the contents of that address. |
|
sto | effect: memory[stack[TOS-1]]:=stack[TOS]; TOS:=TOS-2 store contents of top of stack at the address in stack[TOS-1] then pop the value and the address |
|
<opcd> | where <opcd> is add | sub |... effect: stack[TOS-1] := stack[TOS-1] <op> stack[TOS]; TOS:=TOS-1 |
|
|
1-address machine |
where the processor has a special accumulator which is implicitly used in each arithmetic operation. The accumulator could be the only register, or there could also be one or more index-registers in which case there would very probably be accumulator-to-indexReg transfer operations. In the former case an address is limited to being an integer constant; in the latter case indexed addresses would be allowed. |
|
---|---|---|
operation | alternative notation | e.g. or comment |
load <addr> | acc:=<addr> | load the word at the given address into the acc |
load_indirect1 <addr> | acc := [<addr>] | use the word at the address as a pointer to the word to be loaded into acc; something like this is necessary if there are no index registers, or... |
load_indirect2 | acc := [acc] | ... use the contents of the acc as a pointer to the word to be loaded into the acc; something like this is necessary if there are no index registers |
sto <addr> | <addr>:=acc | store contents of accumulator at <addr> |
store indirect, similar considerations to load indirect | ||
<opcd> <addr> | acc <op>= <addr> | effect: acc := acc <op> memory[<addr>] |
|
2-address machine (?1.5-address?) | ||
---|---|---|
| ||
operation | alternative notation | e.g. or comment |
load <reg>, <reg> | <reg>:=<reg> | load R7, R3, or R7:=R3 transfer from one register to another |
<opcd> <reg>,<reg> | <reg> <op>= <reg> | add R7, R3, or R7 += R3, arithmetic operation on two registers |
load <reg>, <addr> | <reg>:=<addr> | load R7,36[R2], or R7 := 36[R2]. load the register with the contents of the word at the given address |
sto <reg>, <addr> | <addr>:=<reg> | sto R7,36[R2], or 36[R2] := R2, store the contents of the register in the word at the given address |
<opcd> <reg>, <addr> | <reg><op>=<addr> | add R3, 36[R2], or R3+=36[R2] |
jmp <reg> | jmp <reg> | unconditional jump to the address held in the register |
jmp <addr> | jmp <addr> | unconditional jump to the address |
jmp <cond> <addr> | jmp <cond> <addr> | jump to the address if the condition has been set, e.g., by a compare instruction |
BAL <reg>, <addr> | BAL <reg>, <addr> | Branch and link: save the program counter in the register (e.g., for procedure return) and then jump to the address |
Each instruction above applies to a register and a "full" memory address. Arguably a register is a restricted kind of address, so you might call them 1.5-address instructions. There may also be some 2-full-address instructions. These are in fact necessary for any operation on arbitrarily long operands such as strings or decimal numbers; there is also the matter of specifying an operand's length -- either in the instruction, or in the operand, or in a register. | ||
<opcd> <addr>,<addr> | <addr> <op>= <addr> | add 10[R3], 20[R6], or 10[R3] += 20[R6] |
3-address machine | ||
---|---|---|
where three-address instructions act on values in memory, not in registers. | ||
operation | alternative notation | e.g. or comment |
<opcd> <addr> <addr> <addr> | <addr> := <addr> <op> <addr> |
add 10[R1],20[R2],30[R3], or 10[R1]:=20[R2]+30[R3] Combine the second and third operands using the operator (e.g., +) and store the result in the location indicated by the first address. |
|
Register-file machine (3 register addresses) | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
has 2-address (1.5-address) instructions for loading from, and storing in, memory but the other instructions each specify three registers. | ||||||||||
operation | alternative notation | e.g. or comment | ||||||||
load <reg> <addr> | <reg> := <addr> | |||||||||
sto <reg> <addr> | <addr> := reg> | |||||||||
<opcd> <reg><reg><reg> | <reg> := <reg><op><reg> | add R2 R3 R4, or R2:=R3+R4 etc. |
||||||||
On the CDC 6600 machines, address-registers, A0-A7, were paired with general arithmetic registers, R0-R7 (floating-point values, or integer values stored as unnormalised floats). Loading an address into A0-A5 (A6-A7) caused the corresponding general arithmetic register to be loaded from (stored into) the memory word having that address. There were 3-address instructions on the general arithmetic registers:
Note that this form of instruction is efficient for stepping through an array because adding one to an A-register causes the corresponding R-register to be loaded from (stored in) the array. |
The zero-address machine needs more instructions for