A Simple Assembly Language Computer Simulator

This module describes a simple assembly language and a model of the CPU of a computer. You are able to create an assembly language program using the simulator described here and to simulate running the program. The simulator shows you the machine cycle - the fetch, decode, execute steps - as each assembly instruction is executed. I suggest that as you read this module you have the simulator window open so that you can try creating and executing the example programs discussed here. Then you should be able to tackle the exercise questions posed at the end of this module.


Click on the Load button shown here and you should shortly see a window with two tabs labelled Code Builder (selected by default) and Little Man Computer. Click on the Little Man Computer tab to see the design of the CPU simulator.

On the right hand side there is a table of memory addresses numbered from 00 to 99. The column labelled Instruction shows the contents of the memory cells, all of which are empty at the moment. This is where the instructions of the assembly language programs you create will be stored, along with the data that those programs work with. Each memory cell can hold one 3 digit decimal number. You can convert the decimal values to binary representation if you want by using the Mode radio buttons at the bottom of this memory table, but it's just as easy to leave them as decimal values.

In the centre section is the area labelled CPU. In our model computer the CPU consists of only two data areas - the register and the instruction counter. As you should have already learnt, the instruction counter contains the memory address of the next program instruction to be executed. As you run a program and click on the Step button you'll see the value in the Instruction Counter change to specify the memory address of whichever instruction must be executed next.

The model computer CPU does not include instruction and address registers, nor does it have numerous other registers for holding data values, as a real CPU does. We can make do with just the one register. As you Step through the instructions of a program you'll see the value in the Register change as the program manipulates the data. This will become clear in our first example.

The Instruction Counter and the Register can also each hold only a 3 digit decimal value.

On the left is a simple input and output area. Here, when you are stepping through a program you have created you can enter the data required by a read instruction whenever one is encountered, or see the output created by a print instruction whenever one is encountered.

The Instruction Set

Click on the Code Builder tab and you will see a window with two areas labelled Instruction Set and Your Assembly Program. Before demonstrating the actual mechanics of how to use the simulator let's examine what each of the instructions does.

The STOP instruction is very simple - whenever it is encountered it simply tells the computer to stop executing the program.

The LOAD instruction requires that you also specify a memory cell address. This is the purpose of the Address text box at the bottom of the Instruction Set panel. When it is executed the LOAD instruction takes whatever value is in the specified memory cell and places it in the register. The value in the memory cell is left unaltered, i.e. a copy is placed in the register.

The STORE instruction also requires that you specify a memory address in the Address text box. The instruction takes whatever value is in the register and stores it in the specified memory cell. In this case the value in the register is left unaltered.

The ADD and SUBTRACT instructions also require you to specify a memory address. These instructions add or subtract whatever value is in the specified memory cell to/from whatever value is currently in the register. In each case the value in the register is changed, but the value in the memory cell is unaltered.

For the moment we'll skip the BRANCH and STOREA instructions. They are described below.

The READ instruction takes whatever value is in the input stream and places it in the register, while the PRINT instruction takes whatever value is in the register and places it in the output stream. The input and output streams are the only way a human can provide data and see the results. You can think of them as the keyboard and the screen respectively.
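The effect of these instructions on the register and the memory can be sketched in Python. This is only a minimal model written for illustration, not the simulator's own code: the class name `Machine` and its method names are invented here, an empty cell is represented by `None`, and the range and error checks discussed later in this module are left out.

```python
class Machine:
    """Toy model of the simulator's CPU: one register plus 100 memory cells."""

    def __init__(self, inputs):
        self.memory = [None] * 100    # None stands for an empty cell
        self.register = 0
        self.inputs = list(inputs)    # values waiting in the input stream
        self.outputs = []             # values written to the output stream

    def load(self, address):          # copy memory cell -> register
        self.register = self.memory[address]

    def store(self, address):         # copy register -> memory cell
        self.memory[address] = self.register

    def add(self, address):           # register = register + memory cell
        self.register += self.memory[address]

    def subtract(self, address):      # register = register - memory cell
        self.register -= self.memory[address]

    def read(self):                   # input stream -> register
        self.register = self.inputs.pop(0)

    def print_(self):                 # register -> output stream
        self.outputs.append(self.register)


m = Machine(inputs=[7])
m.read()          # register is now 7
m.store(50)       # cell 50 is now 7, register still 7
m.add(50)         # register is now 7 + 7
m.print_()
print(m.outputs)  # [14]
```

Notice that `load` and `store` only copy a value: the source of the copy is never changed, exactly as described above.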

A Simple Program

We have enough at this point to try writing a simple assembly language program and to see how it runs on the simulator. The purpose of the instructions described above will become much clearer as you attempt to use them.

So let's write a program that will add two numbers. The basic approach will be to use the READ instruction twice to obtain the two numbers from the input, the ADD instruction and the PRINT instruction. We'll deliberately make a mistake in our first attempt just to illustrate a basic principle of the computer.

Using the Code Builder tab, click on the READ instruction in the Instruction Set panel and then click on the arrow in the blue box that points towards the Your Assembly Program panel. Notice that the Read instruction is entered on line 00 of your program.

Next click on the READ instruction again and also the right pointing arrow. A second READ instruction is entered on line 01.

Now click on the ADD instruction and try to click on the right pointing arrow. Notice that nothing happens. The ADD instruction is not entered on line 02 of the assembly program. This is because you must specify a memory address by typing it in the Address text box. The memory cell with the address you specify must contain the value that you want to add to whatever is already in the register. You'll probably realise there is something wrong here, but try typing say 99 in the Address text box, and then click on the arrow button. The instruction ADD along with the address 99 is entered as line 02 of your program.

Next click on the PRINT instruction and the arrow, and then the STOP instruction and the arrow. These two will appear on lines 03 and 04 of your program.

This is your first attempt at an assembly program that will add two numbers. To put it in the computer's memory click on the Compile button at the bottom left of the window.

Running the Program

The tab switches to Little Man Computer and you'll see that the memory cells with addresses 00 to 04 contain some data. The data is in the form of 3 digit numbers. These numbers represent the assembly program you just created. 901 is a code for the READ instruction, 399 is a code for the "ADD value from address 99" instruction (3 is called the op code, representing ADD, and 99 is the address part of this particular instruction), 902 is the PRINT instruction and 000 is the code for the STOP instruction.
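Splitting a 3 digit code into its op code and address part is simple arithmetic, which the following Python sketch illustrates. The function names `decode` and `encode` are made up for this example. Note also that the text gives 901 and 902 only as whole codes for READ and PRINT, so treating them as "op code 9 plus a two digit selector" is an assumption made here for uniformity.

```python
def decode(code):
    """Split a 3 digit instruction code into (op code, address part)."""
    return divmod(code, 100)      # first digit, last two digits


def encode(op_code, address):
    """Combine an op code and a two digit address into one 3 digit code."""
    return op_code * 100 + address


print(decode(399))                 # (3, 99)  ADD, address 99
print(decode(901))                 # (9, 1)   READ, taken as a whole code
print(f"{encode(3, 99):03d}")      # 399
print(f"{encode(0, 0):03d}")       # 000      STOP
```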

Each of the instructions in the instruction set has a numeric code. In fact the words LOAD, ADD, STORE etc. are useless to the computer since they cannot be stored in the computer's memory. Consequently there has to be a numeric code for each instruction. In a real computer the code is in binary form, and you could switch the codes in the simulator to binary if you like, but it makes no difference to the principle.

Now that the program is stored in the computer's memory we can try to run (or execute) it. Notice first that the Instruction Counter is set to 00 indicating that the next instruction (the first in this case) of the program can be found in the memory cell with address 00.

Click on the Run button and notice that the Input Stream activates indicating that you should type an integer. Supposing that you want to add 20+10, you should type 20 and press the Enter key. You'll see that the value 20 appears in the register and the Instruction Counter advances to 01. The computer has just executed the first instruction - a READ instruction that places a value from the input stream into the register.

Now click on the Step button and you'll find that the Input Stream activates again. Type the value 10 and press the Enter key. Observe that the Instruction Counter advances and that the value you just typed appears in the Register.

Click on the Step button again. This time you'll see an error window appear with the message "ERROR: instruction ADD, garbage loaded at address 99". I warned you when you created the program that there was something wrong!

You are hoping to add 20 (the first value you entered) to 10 (the current value in the register). But when the first READ instruction obtained the value 20 and placed it in the register, the second READ instruction then replaced the value 20 with the value 10. That is, the first value, 20, was lost. Because the ADD instruction requires a memory address where the value to be added can be found, we had arbitrarily entered 99 as that memory address when we created the program. Now we find that the memory cell at address 99 contains garbage (i.e. no value).

What we need is to store the first value that is read, rather than having it lost/replaced by the second value. The place to store it is in the memory cell with address 99, since that is the address used with the ADD instruction. We don't have to choose cell 99 - we could choose almost any cell, provided the address used in the STORE instruction is the same as the one used in the ADD instruction. Notice I said "almost any cell" - we'll come back to that.

Fixing the Program

Click on the OK button in the error message window and then return to the Code Builder tab, since we need to change the program.

Click on the STORE instruction from the Instruction Set, type the value 99 as the address (i.e. the value in the register will be stored in the memory cell with address 99) and click the right pointing arrow. You'll notice that the STORE 99 instruction is added to the program as line 05.

Line 5 is the wrong position for this instruction. It should come right after the first READ instruction. So click on the STORE instruction (to select/highlight it) and click on the up pointing arrow on the far right to move the selected instruction up to a new position. Keep moving it up until it is placed after the first READ, i.e. as line 01.

Now click the Compile button again to return to the Little Man Computer tab and begin to execute the program.

As soon as you click the Step button to execute instruction code 299 (2 is the op code meaning STORE, and 99 is the address part) check out the contents of memory cell 99 by moving the vertical slider in the memory panel. You should find that the cell with address 99 contains whatever value you just entered for the first READ instruction.

After you enter the value for the second READ instruction you'll step on to the instruction at address 03, the ADD instruction. This time the instruction will succeed and you'll see the contents of the register change. Whatever value you entered for the second READ (which was currently in the register) will have added to it whatever value was stored in cell 99 (i.e. whatever value was entered for the first READ instruction).

Step on to instruction 04 and you'll see the value in the register (i.e. the sum of the two numbers) appear in the Output Stream as a result of the PRINT instruction (code 902). And then at instruction 05 the program will stop.

You now have a working addition program written in this very simple assembly language.
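If you want to check your understanding away from the simulator, the two versions of the program can be traced with a few lines of Python. This is a rough sketch under stated assumptions: the function `run` and the tuple form of the instructions are invented here, empty memory cells are modelled as missing dictionary keys, and the error text only imitates the message shown by the simulator.

```python
def run(program, inputs):
    """Execute a list of (mnemonic, address) tuples and return the output stream."""
    memory = {}                   # address -> value; a missing key is an empty cell
    register = None
    counter = 0                   # instruction counter
    inputs = list(inputs)
    outputs = []
    while True:
        op, *operand = program[counter]
        counter += 1
        if op == "READ":
            register = inputs.pop(0)
        elif op == "STORE":
            memory[operand[0]] = register
        elif op == "ADD":
            if operand[0] not in memory:
                raise ValueError(
                    f"instruction ADD, garbage loaded at address {operand[0]:02d}")
            register += memory[operand[0]]
        elif op == "PRINT":
            outputs.append(register)
        elif op == "STOP":
            return outputs
        else:
            raise ValueError(f"unknown instruction {op}")


first_attempt = [("READ",), ("READ",), ("ADD", 99), ("PRINT",), ("STOP",)]
fixed = [("READ",), ("STORE", 99), ("READ",), ("ADD", 99), ("PRINT",), ("STOP",)]

try:
    run(first_attempt, [20, 10])
except ValueError as err:
    print("ERROR:", err)          # cell 99 was never written

print(run(fixed, [20, 10]))       # [30]
```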

Memory Areas for Program Instructions and Data

I mentioned above that the memory address for storing the first value that is read by the READ instruction could be "almost any cell". Let's see what this means.

Return to the Code Builder tab so that you can change the memory cell address 99 used in the program. Double click on the STORE 99 instruction and you'll find that a small Assembly Code Edit window appears. Change the Address value, currently 99, to 03 and click on the Ok button. Do the same for the ADD 99 instruction and then "compile" the program again.

Now run the program. When you come to the instruction at address 01 (i.e. instruction code 203 representing STORE 03) you'll encounter an ERROR: Memory corruption, storing over code!!! message. This is telling you that you have tried to store a data value in a memory cell that contains a program instruction. Obviously you should not do that since the program would then become meaningless.

A computer's system software typically checks to ensure that no program tries to write data in an area of memory that contains program instructions. This is a basic but crucial requirement.
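One way such a check could work is sketched below in Python. It assumes, purely for illustration, that the compiled program occupies the cells from 00 up to its last instruction and that everything above is free for data; how the simulator actually decides which cells hold code is not stated in this module. The names `StoreOverCodeError` and `checked_store` are hypothetical.

```python
class StoreOverCodeError(Exception):
    """Raised when a STORE would overwrite a program instruction."""


def checked_store(memory, program_length, address, value):
    """Store value at address unless that cell holds part of the program."""
    if address < program_length:
        raise StoreOverCodeError(
            f"Memory corruption, storing over code at address {address:02d}")
    memory[address] = value


memory = [None] * 100
program_length = 6        # the addition program occupies cells 00 to 05

checked_store(memory, program_length, 8, 20)      # fine: cell 08 is free
try:
    checked_store(memory, program_length, 3, 20)  # cell 03 holds ADD
except StoreOverCodeError as err:
    print("ERROR:", err)
```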

Change the addresses used in the STORE and ADD instructions to 08 and compile the program again. It should work properly when you run it now.

Understanding the Data Representation

With the address of the STORE and ADD instructions set to 08 you can easily see the contents of the memory cell 08 as the program runs (without having to scroll down the Memory panel).

As you run the program enter 99 as the first number to be read and observe the value that appears in cell 08. You'll see that it is 099. This is because the memory cells MUST contain 3 decimal digits. The decimal values are actually being represented in a 10's complement code. This is like a 2's complement code except in base 10. You probably observed that the value in the register was also 099. The data representation also applies to the register.

The 2's complement code was discussed in detail in the Data Representation module. One basic fact is that given say 8 bits the number of values that can be represented is 256 and in a 2's complement code half of these are chosen to be the non-negative integers (0 to 127) and half are chosen to be negative integers (-128 to -1).

In a 10's complement code also, half of the possible values are taken to represent positive values and half are taken to represent negative values. Given 3 decimal digits we have 1000 possible values (000 to 999). Half of these (000 to 499) are taken to represent the positive integers 0 to 499 and the remaining half (500 to 999) are taken to represent the negative integers -500 to -1 in that order. You'll see the analogy to the 2's complement code I hope.
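The mapping between integers and 3 digit codes described above can be written as two small Python functions. This is a sketch of the representation only; the names `to_code` and `from_code` are made up here and are not part of the simulator.

```python
def to_code(value):
    """Return the 3 digit 10's complement code (0 to 999) for an integer."""
    if not -500 <= value <= 499:
        raise OverflowError(f"{value} cannot be represented in 3 digits")
    return value % 1000           # negative values wrap around to 500..999


def from_code(code):
    """Return the integer that a 3 digit code (0 to 999) stands for."""
    return code - 1000 if code >= 500 else code


print(f"{to_code(99):03d}")       # 099
print(to_code(-1))                # 999
print(to_code(-500))              # 500
print(from_code(499))             # 499
print(from_code(500))             # -500
```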

So as you run the program enter 499 as the second value to be read. You'll find that this works and the program is able to step on to the next instruction - the ADD instruction (code 308). But this ADD instruction fails. The message is ERROR: instruction ADD, overflow. This is because you have tried to add 99 to 499 the result of which exceeds the possible values in the data representation code. The code 598 (the result of adding 499+099 if it was actually performed) in fact represents the negative number -402. You can see this by recompiling and running the program again, and typing -402 as the first value to be read. When the READ instruction is executed you'll see 598 appear in the register.
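The overflow check can be sketched in Python as follows. The function names `checked_add` and `wrapped` are hypothetical, and the error text merely imitates the simulator's message. The second function shows what the result would stand for if the sum were stored in 3 digits without any check.

```python
LOWEST, HIGHEST = -500, 499       # range of the 3 digit 10's complement code


def checked_add(register, operand):
    """Add two values, refusing any result outside the representable range."""
    total = register + operand
    if not LOWEST <= total <= HIGHEST:
        raise OverflowError("instruction ADD, overflow")
    return total


def wrapped(value):
    """Integer that the 3 digit code of value would represent with no check."""
    code = value % 1000
    return code - 1000 if code >= 500 else code


try:
    checked_add(499, 99)
except OverflowError as err:
    print("ERROR:", err)

print(wrapped(499 + 99))          # -402, i.e. code 598 read as a negative number
print(checked_add(499, -500))     # -1
```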

Explore the limits of the data representation by entering values that you expect to work - both positive and negative values - and values that you expect to fail. Here are some examples to try:

  1. 500 (positive)
  2. -1 (observe the value in the register) + -500 (observe the value in the register)
  3. 499 + -500 (observe the value in the register as a result of the addition)
  4. 0 + -500
There are many other possible values that you can experiment with.

Another Program

You will have noticed that the Instruction Set panel contains instructions named BRANCH, BRANCHZ and BRANCHP. These instructions make the kind of programs we can write considerably more useful.

The BRANCH instruction requires an address and when executed it causes the Instruction Counter to be set to that address. This means that the next instruction to be executed is the one found in the memory cell at this new address. In other words the instructions of the program are not simply executed in sequence, one after the other, but instead it is possible to specify which instruction to execute next. Since that would not be the next one in sequence the execution of instructions has "branched" to a different instruction.

The BRANCHZ and BRANCHP instructions do much the same thing except that they only branch if the register contains 000 (BRANCHZ, i.e. branch if the register contains zero) or a positive number (BRANCHP, i.e. branch if the register contains a positive number).
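The effect of the three branch instructions on the Instruction Counter can be summarised in a short Python function. This is an illustrative sketch: the name `next_counter` is invented, and "positive" is taken here to mean strictly greater than zero, which is an assumption about how the simulator treats a register containing 000 for BRANCHP.

```python
def next_counter(op, address, register, counter):
    """Return the address of the instruction to execute after the current one.

    counter is the address of the instruction being executed now.
    """
    if op == "BRANCH":
        return address                         # always branch
    if op == "BRANCHZ" and register == 0:
        return address                         # branch only on zero
    if op == "BRANCHP" and register > 0:
        return address                         # branch only on a positive value
    return counter + 1                         # otherwise carry on in sequence


print(next_counter("BRANCHZ", 0, 0, 6))    # 0  register is zero, so branch
print(next_counter("BRANCHZ", 0, 5, 6))    # 7  no branch, next in sequence
print(next_counter("BRANCHP", 3, 5, 6))    # 3
print(next_counter("BRANCH", 12, -2, 6))   # 12
```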

Suppose that you want to perform more than one addition with the previous program without having to recompile and begin running it each time. To do this we could do one addition and then read another value which indicates whether to continue with another addition or to stop. For example, if you entered 0 the program could interpret this to mean "do another addition", whereas if you entered any other value the program would stop.

Modify the addition program by adding another READ instruction and placing it after the PRINT instruction. Next add a BRANCHZ instruction. You need to enter an address for this BRANCHZ instruction, i.e. the address of the cell containing the instruction that should be executed if the program branches. If when the program is running you enter 0 for this new READ instruction (meaning that you want to perform another addition) you want the program to go back to the beginning and execute the first READ again since that is the first step in doing a second addition. So the address you enter should be 00. Place this BRANCHZ instruction after the READ that you just added. The program should be as follows:

00  READ
01  STORE  08
02  READ
03  ADD  08
04  PRINT
05  READ
06  BRANCHZ  00
07  STOP
Notice that the READ instruction on line 05 is not meant to read a value that is part of the addition. It is intended solely to be a flag indicating whether to do another addition or to stop.

Compile and begin running the program. As you step through the program instructions you will enter two values, perform the addition and print the result and then encounter the third READ instruction. Enter 0 and observe what happens to the Instruction Counter as you next step to execute the instruction in memory cell 06, i.e. the instruction with code 600 (6 is the op code for BRANCHZ and 00 is the address code indicating where to branch to).

You'll see the Instruction Counter change to 00, and then on the next step the very first READ instruction will be executed, allowing you to enter a second set of numbers to be added. Enter two numbers, add them and print the result and this time for the READ instruction in cell 05 enter any value other than 0. This time the program does not branch back to the beginning (because the value in the register was not 0). Instead it proceeds with the instruction in cell 07 which is the STOP instruction.
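The run just described can be reproduced with a small Python sketch that also records the successive values of the Instruction Counter. The function `run`, the tuple form of the instructions and the variable names are assumptions made for this illustration, and no range or error checks are included.

```python
def run(program, inputs):
    """Run the program; return (output stream, list of instruction addresses visited)."""
    memory = {}
    register = 0
    counter = 0
    inputs = list(inputs)
    outputs, trace = [], []
    while True:
        trace.append(counter)
        op, *operand = program[counter]
        counter += 1                      # default: next instruction in sequence
        if op == "READ":
            register = inputs.pop(0)
        elif op == "STORE":
            memory[operand[0]] = register
        elif op == "ADD":
            register += memory[operand[0]]
        elif op == "PRINT":
            outputs.append(register)
        elif op == "BRANCHZ":
            if register == 0:
                counter = operand[0]      # branch: overwrite the counter
        elif op == "STOP":
            return outputs, trace
        else:
            raise ValueError(f"unknown instruction {op}")


program = [
    ("READ",), ("STORE", 8), ("READ",), ("ADD", 8),
    ("PRINT",), ("READ",), ("BRANCHZ", 0), ("STOP",),
]

# 20 + 10, then 0 (go round again), then 3 + 4, then 1 (stop)
outputs, trace = run(program, [20, 10, 0, 3, 4, 1])
print(outputs)   # [30, 7]
print(trace)     # [0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4, 5, 6, 7]
```

The trace shows the counter jumping from 6 back to 0 after the flag value 0 is read, and moving on from 6 to 7 when the flag is non-zero.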

Modifying Program Instructions

Suppose that we want to perform the following set of calculations:
xi - yi
for i from 1 to 5. That is, we will have 5 numbers x1, x2, x3, x4, and x5 stored in the computer memory, and another 5 numbers y1, y2, y3, y4, and y5 also stored in the memory, and we want to perform the subtraction xi - yi for each pair. The result of each subtraction could be stored back in the same memory cells as the x's.

The following program instructions, although tedious, read the 5 values for the x's and the 5 values for the y's.

00  READ
01  STORE 90
02  READ
03  STORE 91
04  READ
05  STORE 92
06  READ
07  STORE 93
08  READ
09  STORE 94
10  READ
11  STORE 95
12  READ
13  STORE 96
14  READ
15  STORE 97
16  READ
17  STORE 98
18  READ
19  STORE 99
We can suppose that the values for the x's are read first and stored in the cells 90 to 94, and then the y's are read and stored in the cells 95 to 99.

It would be possible to perform the subtractions by writing the following program instructions:

20  LOAD 90
21  SUBTRACT 95
22  STORE 90
This would subtract the first y value (stored in cell 95) from the first x value (loaded into the register from cell 90) and then store the result back in the cell that had contained the first x value (cell 90). These instructions would have to be repeated 5 times changing the addresses of the memory cells each time. This would be quite a tedious program to write, totalling another 15 instructions.

It is possible to write this program with far fewer instructions. You need to realise that as you repeat this set of 3 instructions all that changes are the addresses of the memory cells. 90 and 95 should change to 91 and 96, then 91 and 96 should change to 92 and 97, and so on. Recall that the instruction LOAD 90 is represented by the code 190 (the op code is 1 and the address code is 90) and that this should change to 191 for the next set of 3 instructions.

The STOREA instruction is designed to modify the last two digits of the contents of a memory cell. It is intended to be used to modify a program instruction, i.e. to change the instruction 190 to 191 by modifying the address code 90 to 91 while leaving the op code (1 in this case) unchanged.

We could thus write the instructions

23  LOAD 20
24  ADD 89
25  STOREA 20
where we need to have the value 1 already stored in cell 89. LOAD 20 will cause the instruction code 190 (the contents of cell 20) to be placed in the register, ADD 89 will cause 1 to be added (assuming cell 89 contains the value 1) so that the register now contains 191, and STOREA 20 causes the last two digits in the register (i.e. 91) to replace the last two digits of the instruction in memory cell 20. The instruction has thus been changed from 190 to 191.
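The arithmetic behind STOREA can be illustrated with a short Python sketch that works directly on instruction codes rather than on memory cells. The function name `storea` and the loop around it are assumptions made for illustration; only the rule "replace the last two digits, keep the op code" comes from the description above.

```python
def storea(instruction, register):
    """Return instruction with its last two digits replaced by those of register."""
    op_code = instruction // 100              # first digit is left unchanged
    return op_code * 100 + register % 100     # address part comes from the register


instruction = 190                  # LOAD 90
codes = [instruction]
for _ in range(4):
    register = instruction         # LOAD: copy the instruction code into the register
    register += 1                  # ADD: add the 1 kept in a data cell
    instruction = storea(instruction, register)   # STOREA: write back the address part
    codes.append(instruction)

print(codes)                # [190, 191, 192, 193, 194]
print(storea(190, 345))     # 145: only the address digits 45 are taken from the register
```

Repeating the three modifying instructions therefore steps the LOAD through the cells 90 to 94, which is what the five subtractions require.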

(to be continued ...)