Header Image - Randall Morgan

Tag Archives

7 Articles

Sweet16-GP Assembler – Part 2

by SysOps 0 Comments
This entry is part 7 of 8 in the series Sweet16-GP CPU: A Complete Development Cycle

Last time we left our assembler with only two functions. The strip_lines() function which removes comments and blank lines and the rename_registers() function which replaces the register’s friendly names with the cononical names i.e. Rx where x is a number between 0 and 7 (the register’s index value into the register file).

With these to pre-processing functions out of the way we are ready to start processing our assembly code.  We will use a two-pass method of assembly. The first pass will break each line into its constituent parts and then locate the instruction’s addressing mode, stores it in a field called mode, and calculate the instruction’s actual opcode value and length. In addition it will store a labels in a symbol table to record their value for later use in the second pass or the assembler.

Our main assembler method will take our clean lines, one at a time, and parse them as described above. We will need a table of instructions that includes details about each instruct including it’s addressing mode(s), opcode(s) and length. You may be wondering why the “(s)” above. This is because some of our instructions have multiple addressing modes and therefore multipple opcodes and perhaps even multiple instruction lengths. We will use our table to disambiguate the instruction format into the proper values. 

We will implement our opcode_table as a dict. Each entry will be  indexed (key) by the instruction mnemonic.  The value for each key in the table will be a dict indexed by the addressing mode. Each addressing mode will then have a list of values containing the instruction’s opcode. If the opcode requires a one or two by value, they will be provided as VALUE_L and VALUE_H in the list. This description sounds much more complex than it really is. Refere to the code for our table below and re-read the above paragraph. It should all become clear.

opcode_table = {
    'HALT': {
        'implicit': ['0x00'],
    },
    'BRA': {
        'immediate': ['0x01', VALUE_L],
        'offset': ['0x01', OFFSET],
    },
    'BRC': {
        'immediate': ['0x02', VALUE_L],
        'offset': ['0x02', OFFSET],
    },
    'BRZ': {
        'immediate': ['0x03', VALUE_L],
        'offset': ['0x03', OFFSET],
    },
    'BRN': {
        'immediate': ['0x04', VALUE_L],
        'offset': ['0x04', OFFSET],
    },
    'BRV': {
        'immediate': ['0x05', VALUE_L],
        'offset': ['0x05', OFFSET],
    },
    'BSR': {
        'immediate': ['0x06', VALUE_L, VALUE_H],
        'offset': ['0x06', LABEL_L, LABEL_H],
    },
    'RTS': {
        'implicit': ['0x07'],
    },

    'SET': {
        'direct': ['0x08', VALUE_L, VALUE_H],
    },
    'LD': {
        'register': ['0x10'],
        'indirect': ['0x20'],
    },
    'ST': {
        'register': ['0x18'],
        'indirect': ['0x28'],
    },
    'LDD': {
        'indirect': ['0x30'],
    },
    'STD': {
        'indirect': ['0x38'],
    },
    'POP': {
        'indirect': ['0x40'],
    },
    'STP': {
        'indirect': ['0x48'],
    },
    'ADD': {
        'register': ['0x50'],
    },
    'SUB': {
        'register': ['0x58'],
    },
    'MUL': {
        'register': ['0x60'],
    },
    'DIV': {
        'register': ['0x68'],
    },
    'AND': {
        'register': ['0x70'],
    },
    'OR': {
        'register': ['0x78'],
    },
    'XOR': {
        'register': ['0x80'],
    },
    'NOT': {
        'register': ['0x88'],
    },
    'SHL': {
        'register': ['0x90'],
    },
    'SHR': {
        'register': ['0x98'],
    },
    'ROL': {
        'register': ['0xA0'],
    },
    'ROR': {
        'register': ['0xA8'],
    },
    'POPD': {
        'indirect': ['0xE0'],
    },
    'CPR': {
        'register': ['0xE8'],
    },
    'INC': {
        'register': ['0xF0'],
    },
    'DEC': {
        'register': ['0xF8'],
    },
}

As you can see, our instructions can fall into one of four addressing modes. These include implicit (when the instruction itself indicates the addressing mode),  immediate (when the value needed by the instruction directly follows it), indirect (when the register specified contains the value needed to complete the instructions, and register (when the instruction operates directly on the register specified). To tell the truth, these could be further broken down but that would only complicate things.

OK, now we have our opcode_table we are going to start writing our parser. Our parser is a little bit unorthodox. Since our input language is so constrained we can forego all the formal theory and much of the discipline or parser development. 

I do believe however, that any developer worth their weight should write at least a simple parser at some point. Understanding even a very simple parser expands your understanding of how your tools work whether you’re a web developer, AI master, ETL programmer, or Embedded Systems developer. If you call yourself a Software Engineer or Programmer and you have written at least a simple parser. compiler, or interpreter, then you owe it to yourself to do so! To get you started I recomend DR. Jack Crenshaw’s “Let’s Build A Compiler” and Ruslans’s “Let’s Build A Simple Interpreter” blog serries. If you’re a seasoned developer checkout the book “Language Implementation Patterns”. For those mathematically oriented, checkout the Dragon Book

So what I am going to do is simply plow forward. We will tackle one issue at a time and add functions to handle each issue as it arises. Note that this technique wont work if you’re trying to write a high level language compiler or interpreter. There you will need all the formal methods and a have good design before writing code. What we are going for here is a quick and simple boot-strap assembler. We can write better tools later. For now, we just need something to get us up and running.

OK, the first thing we are going to need is some set of methods to break up the line of source code we feed to our assembler into it’s various fields. We will create an entry point function called assemble which will do our heavy lifting by calling other functions.

We will need to produce some data about the current source line. This will include any label, register, addressing mode, and value. Additionally, we may also want to keep track of the original instruction mnemonic and any intermediate code we produce. So we will need a supporting function to handle this. We will also need some storage for any lables or symbols we encounter during parsing.

The code below is our entry point. It will be responsible for handling our two apasses over our code and  will call the functions that do the heavy lifting.

def assemble(lines, lc=0):
    """Breaks the line into its constituent parts.
        it then locates the instruction's addressing mode,
        stores it in mode, and calculates the instruction's
        actual opcode value and length. """
    objcode = []
    symbols = {}

    # Pass 1 : Parse instructions and create intermediate code
    for lineno, label, (value, register, mode, mnemonic, icode) in parse_lines(lines, symbols):
        # Try to evaluate numeric labels and set the lc (location counter)
        if label:
            try:
                lc = int(eval(label, symbols))
            except (ValueError, NameError):
                symbols[label] = lc

        # Store the resulting object code for later expansion
        if icode:
            objcode.append((lineno, lc, value, register, mode, mnemonic, icode))
            lc += len(icode

Note the call to parse_lines() we will write this function next. But, let’s step back and think about what we need it to do…

Given a line of code like:

start:    SET   R1, 0xFA

We can see we need to break up this line into it’s smaller parts. Luckily, this is pretty easy to do. Also, since each line will follow a similar pattern for the fields, we can easily deduce what each field contains. 

# Exception used for errors
class AssemblyError(Exception):
    pass

def parse_lines(lines, symbols):
    """ Determine line number, mnemonic, register, value, and address mode"""
    for lineno, line in enumerate(lines, 1):
        # Handle labels
        label, *colon, statement = line.rpartition(":")
        try:
            # parse the line into ir (intermediate representation)
            data = lineno, label, parse_opcode(statement, symbols) if statement else (None, None, None, None, None)
            yield data
        except AssemblyError as e:
            print("{0:4d} : Error : {1}".format(lineno, e))

Refering to the code above, we use the enumerate function to keep track of our line numbers. One drawback to doing this approach is that our comment and blank lines wont be counted. Something we could fix by tracking line numbers in the strip_lines() function. But for now, we will do it this way.

Enumerate returns to us an integer value for the line number (lineno) and the line of source code. Next, we use rpartition(“:”) on on the source line to break it into three parts. These parts are return in a tuple containing the part before the seperator (colon in our case, which ends a label declaration), the seperator, and the part after the seperator. If any part is not found (for example, if there is no colon in the source line) then that part returns an empty string value. So if our line does not contains a colon, we will get an empty string for the label and for the seperator followed by the initial line of code which we save in the statement variable.

Once we have gotten any label, we need to take our statement and further parse it into its various parts. If it turns out that our statement is empty, we need to return a default tuple of ‘None’ values otherwise, we make a call to parse_opcode() passing in our statement and our symbol table so that any symbols found in our source line can be handled. If this all goes horribly wrong, we throw an exception and print the error message. We defined a simple exception class above for this purpose.

Our next task is to implement a function to parse the line into opcodes. We called this function pasre_opcode() above. The implementation is shown below:

def parse_opcode(line: str, symbols) -> tuple:
    """ Break the line into its constituent parts.
        Locate any labels and store their definition.
        Then locate the instruction's addressing mode,
        and stores it in mode. Calculate the instruction's
        actual opcode value.

        Returns: tuple(value, register, mode, objcode) where
        value is a string to be evaluated in the second pass.
        register is the register value or empty string if no
        register exists. "mode" is the addressing mode and
        objcode is a dict containing the base opcode value,
        and a list of functions needed to process the
        instruction
    """
    fields = line.split(None, 1)
    nofields = len(fields)
    if nofields > 1:
        extra = fields[1].split(',')
        if len(extra) > 1:
            fields[1] = extra[0]
            fields.extend(extra[1:])
    nofields = len(fields)
    mnemonic = fields[0]
    register = ''
    value = ''

    """ Examples:
            HALT
            BRA 0x1F
            SET R0, 0xFFFE
            LD R2
            STD @R3  
    """
    # Get register and value
    if nofields > 1:
        register = parse_register(fields[1])

    if register == '' and nofields > 1:
        value = parse_value(fields[1])

    if nofields > 2:
        register = parse_register(fields[1])
        value = parse_value(fields[2])

    # Get address mode
    mode = parse_address_mode(fields)

    # Get all addressing modes for this mnemonic
    opcodemodes = opcode_table.get(mnemonic)
    if not opcodemodes:
        raise AssemblyError("Unknown opcode '{}'".format(mnemonic))

    # Get the address mode used in the instruction
    objcode = opcodemodes.get(mode)
    if not objcode:
        raise AssemblyError("Invalid addressing mode '{0}' for {1}".format(mode, mnemonic))

    return value, register, mode, mnemonic, list(objcode)

There is a lot going on here so let’s unpack it. First, we use the split() method to split the statement into two portions. The first parameter to split() gives the character to split on and the second parameter is used to limit the number of splits. What we are doing here is seperating the instruction mnemonic from the rest of the statement. Since some instructions (i.e.: HALT) only have the mnemonic we need to check the number of fields we get back. If it’s only one, then we have an instruction like HALT or RTS. 

However, if we have more than one field returned from the split we need to further decompose the line. We now have a mnemonic in fields[0] and the remainder of the line in fields[1]. We now know we may have something like Rx, @Rx, or Rx, 0xFA following the instruction mnemonic. So the next move is to tray and seperate the remaining portion of the statement on the comma seperator. If the comma is found, one of two cases will be returned i.e.: (register, value) or @register, value). These are stored in the extra variable. We then add the extra fields to the fields variable and assign mnemonic the value in fields[0] which contains the instruction mnemonic.

Next, using the value of ‘nofields’ which indicate what we should expect next, we sort out the remaining fields and store their values into their perspective variables. In each case, we call parse functions which perform a specific sub-task.

Our first sub-task is to locate any register in the fields. At this point we know if a register exists it should be in fields[1]. So we pass this value to the parse_register() function shown below:

def parse_register(field: str) -> str:
    """ Examples:
            R2
            @R3
            R2, 0x1F
    """
    register = field.upper()
    if register.__contains__('R'):
        pos = field.find('R')
        register = field[pos + 1:pos + 2]
    return register

The first thing we do is convert the register value to upper case. Then we check if the string contains ‘R’ and return the character following the ‘R’ which should be the register index value (a single digit between 0 and 7 inclusively). We then return that value to the caller.

It’s possible that we didn’t have a register. In this case, there may be a value in the fields[1] position. So we check if the call to parse_register() returned a value of an empty string. If the latter was returned, we try and parse a value using the function parse_value() passing in fields[1] as a parameter. This situation occurs with the branch instructions where the mnemonic is directly followed by the offset value for the jump.

def parse_value(val: str) -> str:
    """ Try to parse a value, which can be an integer
        in base 2, 8, 10, 16 or a register.

        Returns: A string representation of the integer
        value in base 10, or empty string if the value
        cannot be converted to an integer.
        On Error: return original value.
    """
    # convert value to int from various bases
    if isinstance(val, str):
        # Get any possible prefix
        val = val.strip(' ')
        base = val[0:2]
        if val.isnumeric():
            return str(int(val, 10))
        if base == '0x':
            return str(int(val, 16))
        elif base == '0c':
            return str(int(val, 8))
        elif base == '0b':
            return str(int(val, 2))
        elif val.__contains__('R'):
            return ''
        else:
            return val

As you can see above the parse_value() function must handle many types of values. It first strips off any prefix to the value and tests if this matches any of the supported numerical base indicators. If a match is found the value is converted to an integer and then cast into a string and returned to the caller.

Next, we call parse_address_mode() passing in all the fields as we may need more than one to identify the addressing mode. The parse_address_mode() function can only be understood in the context of the opcode_table. So refer to the table as you gork this code:

def parse_address_mode(fields: list) -> str:
    """ Example inputs:
                ['HALT']                    : implicit mode
                ['BRA', '0x1F']             : immediate mode
                [LD, R2]                    : register
                [STD, @R3]                  : indirect
                ['SET', 'R0,', '0xFFFE']    : direct
                ['BRC' 'end']               : offset
    """
    numfields = len(fields)
    if numfields == 1:
        return 'implicit'
    elif numfields == 2:
        if fields[0] in directives:
            return 'immediate'
        if fields[1].__contains__('@R'):
            return 'indirect'
        elif fields[1].__contains__('R') and fields[1].__contains__(','):
            return 'direct'
        elif fields[1].__contains__('R'):
            return 'register'
        elif parse_value(fields[1]).isnumeric():
            return 'immediate'
        elif parse_value(fields[1]).isalnum() or parse_value(fields[1]).isalpha():
            return 'offset'
    elif numfields == 3 and parse_value(fields[2]).isnumeric():
        return 'direct'
    elif numfields == 3 and '(' in fields[2] and ')' in fields[2]:
        return 'direct'
    else:
        raise ValueError('Expected numeric value got: {expected}'.format(expected=fields[2]))

The parse_address_mode() function uses the data contained in the fields to sort out the addressing mode of the instruction and return it to the caller, parse_opcode() which then stores this value in the mode variable. 

Next we use the mnemonic value to get the instruction data from the opcode_table and then use the mode variable to get the proper opcode value and instruction format.

If all has gone well, we return a tuple containing the value, register, mode, mnemonic, variables and the object code taken from the opcode_table.

At this poiont our first pass is complete. We have all the data necessary to assemble our instruction code with the possible exception of any forward declared symbol. 

We have one last issue before we can try this out. Our opcode_table has method names in some of the fields. We need to implemnt these helper methods:

All of these methods are pretty strainght forward. They simply take the value from the source code and separe it out into high bytes and low bytes in the case of VALUE_H, VALUE_L, LABEL_H, and LABEL_L. The OFFSET function calculates the offset from the current position.

# Functions used in the creation of object code (used in the table below)
def VALUE_L(pc: int, value):
    val = value_to_int(value) & 0xff
    return str(val & 0xff)


def VALUE_H(pc: int, value: int) -> int:
    val = (value_to_int(value) & 0xff00) >> 8
    return str(val & 0xff)
 
  
def LABEL_L(pc, value):
    print(f'LABEL_L PC: {pc}, value: {value}')
    return (value) & 0xff


def LABEL_H(pc, value):
    print(f'LABEL_H PC: {pc}, value: {value}')
    return ((value) & 0xff00) >> 8


def OFFSET(pc, value):
    print(f'OFFSET: {value}')
    return ((value - pc)) & 0xff


def value_to_int(value):
    # convert value to int from various bases
    if isinstance(value, str):
        if value[1:3].lower() == '0x':
            return int(value, 16)
        elif value[1:3].lower == '0c':
            return int(value, 8)
        elif value[1:3].lower() == '0b':
            return int(value, 2)
        else:
            print('Invalid value.')
    elif isinstance(value, int):
        return value
    # else:
    #     raise ValueError("Expected numerical value, got: {0}".format(value))


def register_to_int(reg):
    if isinstance(reg, str):
        if reg.startswith('@R'):
            return int(reg[2:], 16)
        elif reg.startswith('R') and reg.endswith(','):
            return int(reg[1:-1], 16)
        elif reg.startswith('R'):
            return int(reg[1:])
        else:
            raise ValueError("Cannot parse the register: '{0}'".format(reg))

PK, I think we’re ready to try this! Below is the final code. Note I made some adjustments to the code from the last installment so we could test what we’ve done so far. 

"""Sweet16-GP Assembler"""


# Exception used for errors
class AssemblyError(Exception):
    pass


# Functions used in the creation of object code (used in the table below)
def VALUE_L(pc: int, value):
    # print(f'VALUE_L value: {value}')
    #val = value_to_int(value) & 0xff
    #print(f'VALUE_L val: {val}')
    #return str(val & 0xff)
    pass


def VALUE_H(pc: int, value: int) -> int:
    val = (value_to_int(value) & 0xff00) >> 8
    return str(val & 0xff)


def LABEL_L(pc, value):
    return (value) & 0xff


def LABEL_H(pc, value):
    return ((value) & 0xff00) >> 8


def OFFSET(pc, value):
    return ((value - pc)) & 0xff


def value_to_int(value):
    # convert value to int from various bases
    if isinstance(value, str):
        if value[1:3].lower() == '0x':
            return int(value, 16)
        elif value[1:3].lower == '0c':
            return int(value, 8)
        elif value[1:3].lower() == '0b':
            return int(value, 2)
        else:
            print('Invalid value.')
    elif isinstance(value, int):
        return value
    # else:
    #     raise ValueError("Expected numerical value, got: {0}".format(value))


def register_to_int(reg):
    if isinstance(reg, str):
        if reg.startswith('@R'):
            return int(reg[2:], 16)
        elif reg.startswith('R') and reg.endswith(','):
            return int(reg[1:-1], 16)
        elif reg.startswith('R'):
            return int(reg[1:])
        else:
            raise ValueError("Cannot parse the register: '{0}'".format(reg))


opcode_table = {
    'HALT': {
        'implicit': ['0x00'],
    },
    'BRA': {
        'immediate': ['0x01', VALUE_L],
        'offset': ['0x01', OFFSET],
    },
    'BRC': {
        'immediate': ['0x02', VALUE_L],
        'offset': ['0x02', OFFSET],
    },
    'BRZ': {
        'immediate': ['0x03', VALUE_L],
        'offset': ['0x03', OFFSET],
    },
    'BRN': {
        'immediate': ['0x04', VALUE_L],
        'offset': ['0x04', OFFSET],
    },
    'BRV': {
        'immediate': ['0x05', VALUE_L],
        'offset': ['0x05', OFFSET],
    },
    'BSR': {
        'immediate': ['0x06', VALUE_L, VALUE_H],
        'offset': ['0x06', LABEL_L, LABEL_H],
    },
    'RTS': {
        'implicit': ['0x07'],
    },

    'SET': {
        'direct': ['0x08', VALUE_L, VALUE_H],
    },
    'LD': {
        'register': ['0x10'],
        'indirect': ['0x20'],
    },
    'ST': {
        'register': ['0x18'],
        'indirect': ['0x28'],
    },
    'LDD': {
        'indirect': ['0x30'],
    },
    'STD': {
        'indirect': ['0x38'],
    },
    'POP': {
        'indirect': ['0x40'],
    },
    'STP': {
        'indirect': ['0x48'],
    },
    'ADD': {
        'register': ['0x50'],
    },
    'SUB': {
        'register': ['0x58'],
    },
    'MUL': {
        'register': ['0x60'],
    },
    'DIV': {
        'register': ['0x68'],
    },
    'AND': {
        'register': ['0x70'],
    },
    'OR': {
        'register': ['0x78'],
    },
    'XOR': {
        'register': ['0x80'],
    },
    'NOT': {
        'register': ['0x88'],
    },
    'SHL': {
        'register': ['0x90'],
    },
    'SHR': {
        'register': ['0x98'],
    },
    'ROL': {
        'register': ['0xA0'],
    },
    'ROR': {
        'register': ['0xA8'],
    },
    'POPD': {
        'indirect': ['0xE0'],
    },
    'CPR': {
        'register': ['0xE8'],
    },
    'INC': {
        'register': ['0xF0'],
    },
    'DEC': {
        'register': ['0xF8'],
    },
}

def parse_value(val: str) -> str:
    """ Try to parse a value, which can be an integer
        in base 2, 8, 10, 16 or a register.

        Returns: A string representation of the integer
        value in base 10, or empty string if the value
        cannot be converted to an integer.
        On Error: return original value.
    """
    # convert value to int from various bases
    if isinstance(val, str):
        # Get any possible prefix
        val = val.strip(' ')
        base = val[0:2]
        if val.isnumeric():
            return str(int(val, 10))
        if base == '0x':
            return str(int(val, 16))
        elif base == '0c':
            return str(int(val, 8))
        elif base == '0b':
            return str(int(val, 2))
        elif val.__contains__('R'):
            return ''
        else:
            return val


def parse_address_mode(fields: list) -> str:
    """ Example inputs:
                ['HALT']                    : implicit mode
                ['BRA', '0x1F']             : immediate mode
                [LD, R2]                    : register
                [STD, @R3]                  : indirect
                ['SET', 'R0,', '0xFFFE']    : direct
                ['BYTE' '0x1f']             : immediate
                ['WORD' '0x3BFC']           : immediate
                ['STRING' 'This is a test'] : immediate
                ['BRC' 'end']               : offset
    """
    numfields = len(fields)
    if numfields == 1:
        return 'implicit'
    elif numfields == 2:
        # if fields[0] in directives:
        #     return 'immediate'
        if fields[1].__contains__('@R'):
            return 'indirect'
        elif fields[1].__contains__('R') and fields[1].__contains__(','):
            return 'direct'
        elif fields[1].__contains__('R'):
            return 'register'
        elif parse_value(fields[1]).isnumeric():
            return 'immediate'
        elif parse_value(fields[1]).isalnum() or parse_value(fields[1]).isalpha():
            return 'offset'
    elif numfields == 3 and parse_value(fields[2]).isnumeric():
        return 'direct'
    elif numfields == 3 and '(' in fields[2] and ')' in fields[2]:
        return 'direct'
    else:
        raise ValueError('Expected numeric value got: {expected}'.format(expected=fields[2]))


def parse_register(field: str) -> str:
    """ Examples:
            R2
            @R3
            R2, 0x1F
    """
    register = field.upper()
    if register.__contains__('R'):
        pos = field.find('R')
        register = field[pos + 1:pos + 2]
    return register


def parse_opcode(line: str, symbols) -> tuple:
    """ Break the line into its constituent parts.
        Locate any labels and store their definition.
        Then locate the instruction's addressing mode,
        and stores it in mode. Calculate the instruction's
        actual opcode value.

        Returns: tuple(value, register, mode, objcode) where
        value is a string to be evaluated in the second pass.
        register is the register value or empty string if no
        register exists. "mode" is the addressing mode and
        objcode is a dict containing the base opcode value,
        and a list of functions needed to process the
        instruction
    """
    fields = line.split(None, 1)
    nofields = len(fields)
    if nofields > 1:
        extra = fields[1].split(',')
        if len(extra) > 1:
            fields[1] = extra[0]
            fields.extend(extra[1:])
    nofields = len(fields)
    mnemonic = fields[0]
    register = ''
    value = ''

    """ Examples:
            HALT
            BRA 0x1F
            SET R0, 0xFFFE
            LD R2
            STD @R3
    """
    # Get register and value
    if nofields > 1:
        register = parse_register(fields[1])

    if register == '' and nofields > 1:
        value = parse_value(fields[1])

    if nofields > 2:
        register = parse_register(fields[1])
        value = parse_value(fields[2])

    # Get address mode
    mode = parse_address_mode(fields)

    # Get all addressing modes for this mnemonic
    opcodemodes = opcode_table.get(mnemonic)
    if not opcodemodes:
        raise AssemblyError("Unknown opcode '{}'".format(mnemonic))

    # Get the address mode used in the instruction
    objcode = opcodemodes.get(mode)
    if not objcode:
        raise AssemblyError("Invalid addressing mode '{0}' for {1}".format(mode, mnemonic))

    return value, register, mode, mnemonic, list(objcode)


def parse_lines(lines, symbols):
    """ Determine mnemonic, register, value, and address mode"""
    for lineno, line in enumerate(lines, 1):
        # Handle labels
        label, *colon, statement = line.rpartition(":")
        try:
            # parse the line into ir (intermediate representation)
            data = lineno, label, parse_opcode(statement, symbols) if statement \
                else (None, None, None, None, None)
            yield data
        except AssemblyError as e:
            print("{0:4d} : Error : {1}".format(lineno, e))


def assemble(lines, lc=0):
    """Breaks the line into its constituent parts.
        it then locates the instruction's addressing mode,
        stores it in mode, and calculates the instruction's
        actual opcode value and length. """
    objcode = []
    symbols = {}

    # Pass 1 : Parse instructions and create intermediate code
    for lineno, label, (value, register, mode, mnemonic, icode) in parse_lines(lines, symbols):
        # Try to evaluate numeric labels and set the lc (location counter)
        if label:
            try:
                lc = int(eval(label, symbols))
            except (ValueError, NameError):
                symbols[label] = lc

        # Store the resulting object code for later
        # expansion and adjust the location counter lc.
        if icode:
            objcode.append((lineno, lc, value, register, mode, mnemonic, icode))
            lc += len(icode)
    return objcode


def replace_register_names(line):
    """ Replace register names with 'R' + register index.
        Example: ACC becomes R0. STATUS becomes R6."""
    line = line.replace('ACC', 'R0')
    line = line.replace('RETSTACK', 'R4')
    line = line.replace('COMP', 'R5')
    line = line.replace('STATUS', 'R6')
    line = line.replace('PC', 'R7')
    return line


def strip_lines(lines):
    """ Takes a sequence of lines and strips comments and blank lines."""
    for line in lines:
        comment_index = line.find(";")
        if comment_index >= 0:
            line = line[:comment_index]
        line = line.strip()
        line = replace_register_names(line)
        yield line


if __name__ == '__main__':
    text = ';This is a comment\n' \
           'start:  SET  ACC, 0xFFDE  ; This is also a comment\n' \
           'end:    HALT  ; end of program'
    lines = text.split('\n')

    # Remove comments and blank lines
    lines = strip_lines(lines)
    print(assemble(lines))

If you run the above code you should get an output like the following:

 

sweet16-articles-code/articles/part-07/assembler_02.py
 [
  (2, 0, '65502', '0', 'direct', 'SET', ['0x08', , ]), 
  (3, 3, '', '', 'implicit', 'HALT', ['0x00']) 
 ]

As you can see we have the intermediate representation for two instructions. All the info we will need to assemble these instructions into actual machine code. Reading the tuples the first value (2) is the line number from the source. The second value (0) is the location counter at the start of the instruction. The next value (65502) is the value provided in the instruction, followed by the register index (0), the addressing mode (direct), and the mnemonic. The final fields holds the opcode information from the opcode_table.  Try this out on several instructions and see that the results make sense. Write some unit test for the various functions and for the assemble as it stands. 

This has been a long post. Next time we will tackle the second pass of our assembler. Until then, keep coding!

 

Sweet16 CPU Emulator

Part 4

I hope you played around with the emulator and tried to implement some of the CPU instructions yourself. As promised, in this post we will continue our work on the emulator. First, let me show you my implementation of decode_register_type_instr(). 

def decode_register_instr(self, opcode: int, reg_id: int):
    if opcode == 0x08:
        self.exec_set(opcode, reg_id)
    elif opcode == 0x10:
        self.exec_ld(opcode, reg_id)
    elif opcode == 0x18:
        self.exec_st(opcode, reg_id)
    elif opcode == 0x20:
        self.exec_ld_ea(opcode, reg_id)
    elif opcode == 0x28:
        self.exec_st_ea(opcode, reg_id)
    elif opcode == 0x30:
        self.exec_ldd_ea(opcode, reg_id)
    elif opcode == 0x38:
        self.exec_std_ea(opcode, reg_id)
    elif opcode == 0x40:
        self.exec_pop_ea(opcode, reg_id)
    elif opcode == 0x48:
        self.exec_stp_ea(opcode, reg_id)
    elif opcode == 0x50:
        self.exec_add(opcode, reg_id)
    elif opcode == 0x58:
        self.exec_sub(opcode, reg_id)
    elif opcode == 0x60:
        self.exec_mul(opcode, reg_id)
    elif opcode == 0x68:
        self.exec_div(opcode, reg_id)
    elif opcode == 0x70:
        self.exec_and(opcode, reg_id)
    elif opcode == 0x78:
        self.exec_or(opcode, reg_id)
    elif opcode == 0x80:
        self.exec_xor(opcode, reg_id)
    elif opcode == 0x88:
        self.exec_not(opcode, reg_id)
    elif opcode == 0x90:
        self.exec_shl(opcode, reg_id)
    elif opcode == 0x98:
        self.exec_shr(opcode, reg_id)
    elif opcode == 0xA0:
        self.exec_rol(opcode, reg_id)
    elif opcode == 0xA8:
        self.exec_ror(opcode, reg_id)
    elif opcode == 0xE0:
        self.exec_popd_ea(opcode, reg_id)
    elif opcode == 0xE8:
        self.exec_cpr(opcode, reg_id)
    elif opcode == 0xF0:
        self.exec_inc(opcode, reg_id)
    elif opcode == 0xF8:
        self.exec_dec(opcode, reg_id)
    else:
        print(f'Unknown OpCode code {opcode}')
        self.halt_flag = True

As you can see, this is nothing but a long set of “if”, “elif” statements checking for a particular opcode and then dispatching to the exec_xxx method. Finally, if we reach the “else” clause then we have an error. So we print a message and halt. Pretty simple right. The harder part is in the exec_xxx methods. But, only slightly harder. We spent some time building up a set of housekeeping methods that do all the heavy lifting for us. So we only need to call a subset of them passing the correct values to implement any of the exec_xxx methods. 

I’ll present the code for each of the register type instructions below with a short description of what it does. We’ve already seen the code for SET and LD so I’ll skip those.

ST is our next instruction that needs implemented. The description says:

 

The contents of ACC are stored in Rn and the branch conditions are set to reflect the new value of Rn. The carry flag is cleared, and the contents of ACC are left undisturbed.

So this just loads the value in ACC (R0) into Rn. Then it sets the flags to reflect the new value of Rn. Let’s see how we might do this:

def exec_st(self, opcode: int, reg_id: int):
    val = self.get_register(self.ACC)
    self.set_register(reg_id, val)
    self.set_value_flags(val)
    self.clear_flags('C')

Moving on, next we need to handle LD @Rn. Any instruction that uses the @ symbol is using indirect addressing. This means the register Rn is storing an address that we will either write new content into or read the content from. For example, let’s say that R3 contains the value 0x0A (10 in decimal) and we want to get the value located at address 0x0A. We could use LD @R3. R3 contains the memory address we want to get the value from. This shouldn’t be too difficult either:

def exec_ld_ea(self, opcode: int, reg_id: int):
    val = self.peek_byte(self.get_register(reg_id))
    self.inc_register(reg_id)
    self.set_register(self.ACC, val)
    self.set_value_flags(val)
    self.clear_flags('C')

I’ve appended _ea to all instructions that use indirect or effective addressing modes. 

Our housekeeping methods make implementation quick and simple. Now on to ST @Rn. 

As you might have guessed, ST @Rn simply stores the low byte value in the ACC into the address pointed to by Rn and increments Rn. Here’s the official description.

The low order byte of ACC is stored into memory location whose address is resides in Rn. Branch conditions are set to reflect the two byte contents of ACC (R0). The carry flag is cleared and Rn is incremented by 1.

Here’s the code:

def exec_st_ea(self, opcode: int, reg_id: int):
    val = self.get_register_lsb(self.ACC)
    self.poke_byte(self.get_register(reg_id), val)
    self.inc_register(reg_id)
    self.set_value_flags(val)
    self.clear_flags('C')

Most of the register instructions come in two forms, byte and word and are distinguished by the trailing “D” for the word sized version. For example LD is load a byte and LDD is load a word. Respectively, ST is store a byte and STD is store a word. 

Our nest set of instructions are just word versions of their byte sized versions we’ve already dealt with. Their main difference is the need to fetch two bytes and combine them together to form the correct word value. So we’ll just present the code and leave it to you to compare the two versions.

def exec_ldd_ea(self, opcode: int, reg_id: int):
    val = self.peek_word(self.get_register(reg_id))
    self.inc_register(reg_id)
    self.inc_register(reg_id)
    self.set_register(self.ACC, val)
    self.set_value_flags(val)
    self.clear_flags('C')

def exec_std_ea(self, opcode: int, reg_id: int):
    val = self.get_register(self.ACC)
    self.poke_word(self.get_register(reg_id), val)
    self.inc_register(reg_id)
    self.inc_register(reg_id)
    self.set_value_flags(val)
    self.clear_flags('C')

Our next instruction is the POP @Rn instruction. It’s description is:

Rn is decremented by 1. Then the low order byte of ACC (R0) is loaded from the memory locations whose address resides in (the decremented) Rn. The high order byte of ACC is cleared and the status bits are set to reflect the final value of ACC whose content will always be positive. The carry flag is cleared. Because Rn is decremented before prior to loading the ACC, single byte stacks can be implemented using ST @Rn and POP @Rn.

And here’s the code to implement it:

def exec_pop_ea(self, opcode: int, reg_id: int):
    self.dec_register(reg_id)
    val = self.peek_byte(self.get_register(reg_id))
    self.set_register(self.ACC, val)
    self.set_value_flags(val)
    self.clear_flags('C')

STP is a bit unique in that it has a special purpose. It is used for moving memory. It’s a store instruction with indirect addressing followed by a decrement of the pointer value. Here’s the description:

Rn is decremented by 1 and the low order byte of ACC (R0) is stored in to the memory location that resides in Rn. Status bits are set to reflect the final 16 bit value of ACC. The contents of ACC are left undisturbed. STP @Rn and POP @Rn can be used together to move blocks of memory beginning with the greatest address and working downward. 

And the code:

def exec_stp_ea(self, opcode: int, reg_id: int):
    lo = self.get_register_lsb(self.ACC)
    self.dec_register(reg_id)
    self.poke_byte(self.get_register(reg_id), lo)
    self.set_value_flags(lo)
    self.clear_flags('C')

The rest of the register type instructions are pretty easy to figure out. So in the interest of saving time and space I’ll just show the code below. It’s important, however, that you read them and see how they are implemented. Truly, if you read just the first couple you’ll understand them all. So here is the code for the remaining register type instructions:

def exec_add(self, opcode: int, reg_id: int):
    v1 = self.get_register(self.ACC)
    v2 = self.get_register(reg_id)
    isum = v1 + v2
    self.set_value_flags(isum)
    self.set_overflow_flags(v1, v2, isum)
    self.set_carry(v1, v2, isum)
    val = isum & 0xffff
    self.set_register(self.ACC, val)

def exec_sub(self, opcode: int, reg_id: int):
    v1 = self.get_register(self.ACC)
    v2 = self.get_register(reg_id)
    isum = v1 - v2
    self.set_value_flags(isum)
    self.set_overflow_flags(v1, v2, isum)
    self.set_borrow(v1, v2)
    val = isum & 0xffff
    self.set_register(self.ACC, val)

def exec_mul(self, opcode: int, reg_id: int):
    v1 = self.get_register(self.ACC)
    v2 = self.get_register(reg_id)
    isum = v1 * v2
    self.set_value_flags(isum)
    self.set_overflow_flags(v1, v2, isum)
    val = isum & 0xffff
    self.set_register(self.ACC, val)

def exec_div(self, opcode: int, reg_id: int):
    v1 = self.get_register(self.ACC)
    v2 = self.get_register(reg_id)
    isum = v1 // v2
    self.set_value_flags(isum)
    self.set_overflow_flags(v1, v2, isum)
    val = isum & 0xffff
    self.set_register(self.ACC, val)

def exec_and(self, opcode: int, reg_id: int):
    v1 = self.get_register(self.ACC)
    v2 = self.get_register(reg_id)
    val = (v1 & v2) & 0xffff
    self.set_register(self.ACC, val)
    self.set_value_flags(val)

def exec_or(self, opcode: int, reg_id: int):
    v1 = self.get_register(self.ACC)
    v2 = self.get_register(reg_id)
    val = (v1 | v2) & 0xffff
    self.set_register(self.ACC, val)
    self.set_value_flags(val)

def exec_xor(self, opcode: int, reg_id: int):
    v1 = self.get_register(self.ACC)
    v2 = self.get_register(reg_id)
    val = (v1 ^ v2) & 0xffff
    self.set_register(self.ACC, val)
    self.set_value_flags(val)

def exec_not(self, opcode: int, reg_id: int):
    val = ~self.get_register(reg_id)
    self.set_register(reg_id, val)
    self.set_value_flags(val)
    self.clear_flags('C')

def exec_shl(self, opcode: int, reg_id: int):
    val = self.get_register(reg_id) << 1
    self.set_value_flags(val)
    if ((val & 0x10000) >> 16) == 1:
        self.set_flags('C')
    else:
        self.clear_flags('C')
    self.set_register(reg_id, val)

def exec_shr(self, opcode: int, reg_id: int):
    v1 = self.get_register(reg_id)
    val = v1 >> 1
    self.set_value_flags(val)
    if (v1 & 0x1) == 1:
        self.set_flags('C')
    else:
        self.clear_flags('C')
    self.set_register(reg_id, val)

def exec_rol(self, opcode: int, reg_id: int):
    val = self.get_register(reg_id) << 1
    if ((val & 0x10000) >> 16) == 1:
        val |= 0x1
    self.set_value_flags(val)
    self.clear_flags('C')
    self.set_register(reg_id, val)

def exec_ror(self, opcode: int, reg_id: int):
    v1 = self.get_register(reg_id)
    val = v1 >> 1
    if (val & 0x1) == 1:
        val |= 0b1000_0000_0000_0000
    val = val & 0xffff
    self.set_value_flags(val)
    self.clear_flags('C')
    self.set_register(reg_id, val)

def exec_popd_ea(self, opcode: int, reg_id: int):
    self.dec_register(reg_id)
    hi = self.peek_byte(self.get_register(reg_id))
    self.dec_register(reg_id)
    lo = self.peek_byte(self.get_register(reg_id))
    val = (hi << 8) + lo
    self.set_register(self.ACC, val)
    self.set_value_flags(val)
    self.clear_flags('C')

def exec_cpr(self, opcode: int, reg_id: int):
    v1 = self.get_register(self.ACC)
    v2 = self.get_register(reg_id)
    val = (v1 & 0xFFFF) - (v2 & 0xFFFF)
    self.set_value_flags(val)
    self.set_borrow(v1, v2)
    val = val & 0xFFFF
    self.set_register(self.COMP, val)

def exec_inc(self, opcode: int, reg_id: int):
    self.inc_register(reg_id)
    val = self.get_register(reg_id)
    self.set_value_flags(val)
    self.clear_flags('C')

def exec_dec(self, opcode: int, reg_id: int):
    self.dec_register(reg_id)
    val = self.get_register(reg_id)
    self.set_value_flags(val)
    self.clear_flags('C')

Ok, include this code into your skeleton and try some of the instructions out. Remember to write tests for each of them. Both succeeding and failing tests. 

I’ll end this post here today. Next time we’ll implement the remaining 6 or 7 instruction (which are all non register types) and gain a working emulator. 

For completeness, here’s the complete sweet16gp.py code for this post:

"""
 Sweet16GP an implementation of an 8/16 bit CPU Emulator
 based on Steve Wozniak's SWEET16 virtual machine.
"""

from time import sleep, time
import status_bits


class Sweet16GP:
    MEM = []
    REGFILE = [0, 0, 0, 0, 0, 0, 0, 0]
    ACC = 0
    RETSTACK = 4
    COMP = 5
    STATUS = 6
    PC = 7
    cur_instr = 0
    halt_flag = False

    def __init__(self):
        self.cold_boot()

    # booting routines
    def cold_boot(self, mem_size=256):
        self.init_ram(mem_size)
        self.warm_boot()

    def warm_boot(self):
        self.init_regfile()

    # Memory handling routines
    def init_ram(self, mem_size=265):
        self.MEM = []
        for addr in range(mem_size):
            self.MEM.append(0x00)

    def poke_byte(self, addr: int, val: int):
        self.MEM[addr] = val & 0xff

    def peek_byte(self, addr: int) -> int:
        return self.MEM[addr] & 0xff

    def poke_word(self, addr: int, val: int):
        self.poke_byte(addr, (val & 0xff))
        self.poke_byte(addr+1, (val & 0xff00) >> 8)

    def peek_word(self, addr: int) -> int:
        lo = self.peek_byte(addr)
        hi = self.peek_byte(addr+1)
        val = ((hi << 8) + lo) & 0xffff
        return (hi << 8) + lo

    # Register handling routines
    def init_regfile(self):
        for id in range(len(self.REGFILE)):
            self.REGFILE[id] = 0x00

    def get_register(self, id: int) -> int:
        return self.REGFILE[id] & 0xffff

    def get_register_lsb(self, id: int) -> int:
        return self.REGFILE[id] & 0xff

    def get_register_msb(self, id: int) -> int:
        return (self.REGFILE[id] & 0xff00) >> 8

    def set_register(self, id: int, val: int):
        self.REGFILE[id] = val & 0xffff

    def set_register_lsb(self, id: int, val: int):
        hi = self.REGFILE[id] & 0xff00
        self.REGFILE[id] = (hi | (val & 0xff)) & 0xffff

    def set_register_msb(self, id: int, val: int):
        lo = self.REGFILE[id] & 0xff
        self.REGFILE[id] = ((val & 0xff) << 8) | lo

    def inc_register(self, _id: int):
        val = self.get_register(_id)
        val = (val + 1) & 0xffff
        self.set_register(_id, val)

    def dec_register(self, id: int):
        val = self.get_register(id)
        val = (val - 1) & 0xffff
        self.set_register(id, val)

    # Memory handling routines
    def init_ram(self, mem_size=265):
        self.MEM = []
        for addr in range(mem_size):
            self.MEM.append(0x00)

    def poke_byte(self, addr: int, val: int):
        self.MEM[addr] = val & 0xff

    def peek_byte(self, addr: int) -> int:
        return self.MEM[addr] & 0xff

    def poke_word(self, addr: int, val: int):
        self.poke_byte(addr, (val & 0xff))
        self.poke_byte(addr + 1, (val & 0xff00) >> 8)

    def peek_word(self, addr: int) -> int:
        lo = self.peek_byte(addr)
        hi = self.peek_byte(addr + 1)
        val = ((hi << 8) + lo) & 0xffff
        return (hi << 8) + lo

    # Bit twiddling routines
    def bit_set(self, val: int, bit: int) -> int:
        return val | (1 << bit)

    def bit_clear(self, val: int, bit: int) -> int:
        return val & ~(1 << bit)

    def bit_toggle(self, val: int, bit: int) -> int:
        return val ^ (1 << bit)

    def bit_test(self, val: int, bit: int) -> int:
        return (val & (1 << bit)) >> bit

    # STATUS Flags routines
    def set_flags(self, flags: str):
        flags = flags.upper()
        if flags.__contains__('C'):
            self.REGFILE[self.STATUS] |= (1 << status_bits.C)
        if flags.__contains__('Z'):
            self.REGFILE[self.STATUS] |= (1 << status_bits.Z)
        if flags.__contains__('V'):
            self.REGFILE[self.STATUS] |= (1 << status_bits.V)
        if flags.__contains__('N'):
            self.REGFILE[self.STATUS] |= (1 << status_bits.N)

    def clear_flags(self, flags: str):
        flags = flags.upper()
        if flags.__contains__('C'):
            self.REGFILE[self.STATUS] &= ~(1 << status_bits.C)
        if flags.__contains__('Z'):
            self.REGFILE[self.STATUS] &= ~(1 << status_bits.Z)
        if flags.__contains__('V'):
            self.REGFILE[self.STATUS] &= ~(1 << status_bits.V)
        if flags.__contains__('N'):
            self.REGFILE[self.STATUS] &= ~(1 << status_bits.N)

    def test_flag(self, flag: str) -> int:
        flags = flag.upper()
        if flags.__contains__('C'):
            return (self.REGFILE[self.STATUS] & (1 << status_bits.C)) >> status_bits.C
        if flags.__contains__('Z'):
            return (self.REGFILE[self.STATUS] & (1 << status_bits.Z)) >> status_bits.Z
        if flags.__contains__('V'):
            return (self.REGFILE[self.STATUS] & (1 << status_bits.V)) >> status_bits.V
        if flags.__contains__('N'):
            return (self.REGFILE[self.STATUS] & (1 << status_bits.N)) >> status_bits.N

    def set_value_flags(self, val: int):
        # Note we can't set the overflow
        # flag here as we need both input
        # and result values to computer
        # overflow.
        if val == 0:
            self.set_flags('Z')
        else:
            self.clear_flags('Z')

        if val > 0b0111_1111_1111_1111:
            self.set_flags('N')
        else:
            self.clear_flags('N')

        if (val & 0x10000) >> 17:
            self.set_flags('C')
        else:
            self.clear_flags('C')

    def set_overflow_flags(self, v1: int, v2: int, result: int):
        if ((v1 & (1 << 15)) >> 15 & (v2 & (1 << 15)) >> 15) == 1:
            # Both values have the sign bit set
            # So the result should have the sign
            # bit unset. If not, we have overflow
            if not (result & (1 << 15)) > 0:
                self.set_flags('V')
            else:
                self.clear_flags('V')
        elif ((v1 & (1 << 15)) >> 15 | (v2 & (1 << 15)) >> 15) == 0:
            # Both values have the sign bits unset
            # So, the result should have the sign
            # bit unset.
            if ((result & (1 << 15)) >> 15) == 1:
                self.set_flags('V')
            else:
                self.clear_flags('V')

    def set_carry(self, v1, v2, result):
        # value must be unmasked or it won't include the 17th bit.
        if ((result & (1 << 16)) >> 16) == 1:
            self.set_flags('C')
        else:
            self.clear_flags('C')

    def set_borrow(self, v1, v2):
        if v1 < v2:
            self.set_flags('C')
        else:
            self.clear_flags('C')

    # Instruction Decoding
    def decode_inst(self, instr: int):
        if instr > 7:
            opcode = instr & 0b11111000
            reg_id = instr & 0b00000111
            self.decode_register_instr(opcode, reg_id)
        else:
            self.decode_non_register_instr(instr)

    def decode_register_instr(self, opcode: int, reg_id: int):
        if opcode == 0x08:
            self.exec_set(opcode, reg_id)
        elif opcode == 0x10:
            self.exec_ld(opcode, reg_id)
        elif opcode == 0x18:
            self.exec_st(opcode, reg_id)
        elif opcode == 0x20:
            self.exec_ld_ea(opcode, reg_id)
        elif opcode == 0x28:
            self.exec_st_ea(opcode, reg_id)
        elif opcode == 0x30:
            self.exec_ldd_ea(opcode, reg_id)
        elif opcode == 0x38:
            self.exec_std_ea(opcode, reg_id)
        elif opcode == 0x40:
            self.exec_pop_ea(opcode, reg_id)
        elif opcode == 0x48:
            self.exec_stp_ea(opcode, reg_id)
        elif opcode == 0x50:
            self.exec_add(opcode, reg_id)
        elif opcode == 0x58:
            self.exec_sub(opcode, reg_id)
        elif opcode == 0x60:
            self.exec_mul(opcode, reg_id)
        elif opcode == 0x68:
            self.exec_div(opcode, reg_id)
        elif opcode == 0x70:
            self.exec_and(opcode, reg_id)
        elif opcode == 0x78:
            self.exec_or(opcode, reg_id)
        elif opcode == 0x80:
            self.exec_xor(opcode, reg_id)
        elif opcode == 0x88:
            self.exec_not(opcode, reg_id)
        elif opcode == 0x90:
            self.exec_shl(opcode, reg_id)
        elif opcode == 0x98:
            self.exec_shr(opcode, reg_id)
        elif opcode == 0xA0:
            self.exec_rol(opcode, reg_id)
        elif opcode == 0xA8:
            self.exec_ror(opcode, reg_id)
        elif opcode == 0xE0:
            self.exec_popd_ea(opcode, reg_id)
        elif opcode == 0xE8:
            self.exec_cpr(opcode, reg_id)
        elif opcode == 0xF0:
            self.exec_inc(opcode, reg_id)
        elif opcode == 0xF8:
            self.exec_dec(opcode, reg_id)
        else:
            print(f'Unknown OpCode code {opcode}')
            self.halt_flag = True

    def exec_set(self, opcode: int, reg_id: int):
        self.inc_register(self.PC)
        lo = self.peek_byte(self.get_register(self.PC))
        self.inc_register(self.PC)
        hi = self.peek_byte(self.get_register(self.PC))
        val = (hi << 8) + lo
        self.set_register(reg_id, val)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_ld(self, opcode: int, reg_id: int):
        val = self.get_register(reg_id)
        self.set_register(self.ACC, val)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_st(self, opcode: int, reg_id: int):
        val = self.get_register(self.ACC)
        self.set_register(reg_id, val)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_ld_ea(self, opcode: int, reg_id: int):
        val = self.peek_byte(self.get_register(reg_id))
        self.inc_register(reg_id)
        self.set_register(self.ACC, val)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_st_ea(self, opcode: int, reg_id: int):
        val = self.get_register_lsb(self.ACC)
        self.poke_byte(self.get_register(reg_id), val)
        self.inc_register(reg_id)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_ldd_ea(self, opcode: int, reg_id: int):
        val = self.peek_word(self.get_register(reg_id))
        self.inc_register(reg_id)
        self.inc_register(reg_id)
        self.set_register(self.ACC, val)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_std_ea(self, opcode: int, reg_id: int):
        val = self.get_register(self.ACC)
        self.poke_word(self.get_register(reg_id), val)
        self.inc_register(reg_id)
        self.inc_register(reg_id)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_pop_ea(self, opcode: int, reg_id: int):
        self.dec_register(reg_id)
        val = self.peek_byte(self.get_register(reg_id))
        self.set_register(self.ACC, val)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_stp_ea(self, opcode: int, reg_id: int):
        lo = self.get_register_lsb(self.ACC)
        self.dec_register(reg_id)
        self.poke_byte(self.get_register(reg_id), lo)
        self.set_value_flags(lo)
        self.clear_flags('C')

    def exec_add(self, opcode: int, reg_id: int):
        v1 = self.get_register(self.ACC)
        v2 = self.get_register(reg_id)
        isum = v1 + v2
        self.set_value_flags(isum)
        self.set_overflow_flags(v1, v2, isum)
        self.set_carry(v1, v2, isum)
        val = isum & 0xffff
        self.set_register(self.ACC, val)

    def exec_sub(self, opcode: int, reg_id: int):
        v1 = self.get_register(self.ACC)
        v2 = self.get_register(reg_id)
        isum = v1 - v2
        self.set_value_flags(isum)
        self.set_overflow_flags(v1, v2, isum)
        self.set_borrow(v1, v2)
        val = isum & 0xffff
        self.set_register(self.ACC, val)

    def exec_mul(self, opcode: int, reg_id: int):
        v1 = self.get_register(self.ACC)
        v2 = self.get_register(reg_id)
        isum = v1 * v2
        self.set_value_flags(isum)
        self.set_overflow_flags(v1, v2, isum)
        val = isum & 0xffff
        self.set_register(self.ACC, val)

    def exec_div(self, opcode: int, reg_id: int):
        v1 = self.get_register(self.ACC)
        v2 = self.get_register(reg_id)
        isum = v1 // v2
        self.set_value_flags(isum)
        self.set_overflow_flags(v1, v2, isum)
        val = isum & 0xffff
        self.set_register(self.ACC, val)

    def exec_and(self, opcode: int, reg_id: int):
        v1 = self.get_register(self.ACC)
        v2 = self.get_register(reg_id)
        val = (v1 & v2) & 0xffff
        self.set_register(self.ACC, val)
        self.set_value_flags(val)

    def exec_or(self, opcode: int, reg_id: int):
        v1 = self.get_register(self.ACC)
        v2 = self.get_register(reg_id)
        val = (v1 | v2) & 0xffff
        self.set_register(self.ACC, val)
        self.set_value_flags(val)

    def exec_xor(self, opcode: int, reg_id: int):
        v1 = self.get_register(self.ACC)
        v2 = self.get_register(reg_id)
        val = (v1 ^ v2) & 0xffff
        self.set_register(self.ACC, val)
        self.set_value_flags(val)

    def exec_not(self, opcode: int, reg_id: int):
        val = ~self.get_register(reg_id)
        self.set_register(reg_id, val)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_shl(self, opcode: int, reg_id: int):
        val = self.get_register(reg_id) << 1
        self.set_value_flags(val)
        if ((val & 0x10000) >> 16) == 1:
            self.set_flags('C')
        else:
            self.clear_flags('C')
        self.set_register(reg_id, val)

    def exec_shr(self, opcode: int, reg_id: int):
        v1 = self.get_register(reg_id)
        val = v1 >> 1
        self.set_value_flags(val)
        if (v1 & 0x1) == 1:
            self.set_flags('C')
        else:
            self.clear_flags('C')
        self.set_register(reg_id, val)

    def exec_rol(self, opcode: int, reg_id: int):
        val = self.get_register(reg_id) << 1
        if ((val & 0x10000) >> 16) == 1:
            val |= 0x1
        self.set_value_flags(val)
        self.clear_flags('C')
        self.set_register(reg_id, val)

    def exec_ror(self, opcode: int, reg_id: int):
        v1 = self.get_register(reg_id)
        val = v1 >> 1
        if (val & 0x1) == 1:
            val |= 0b1000_0000_0000_0000
        val = val & 0xffff
        self.set_value_flags(val)
        self.clear_flags('C')
        self.set_register(reg_id, val)

    def exec_popd_ea(self, opcode: int, reg_id: int):
        self.dec_register(reg_id)
        hi = self.peek_byte(self.get_register(reg_id))
        self.dec_register(reg_id)
        lo = self.peek_byte(self.get_register(reg_id))
        val = (hi << 8) + lo
        self.set_register(self.ACC, val)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_cpr(self, opcode: int, reg_id: int):
        v1 = self.get_register(self.ACC)
        v2 = self.get_register(reg_id)
        val = (v1 & 0xFFFF) - (v2 & 0xFFFF)
        self.set_value_flags(val)
        self.set_borrow(v1, v2)
        val = val & 0xFFFF
        self.set_register(self.COMP, val)

    def exec_inc(self, opcode: int, reg_id: int):
        self.inc_register(reg_id)
        val = self.get_register(reg_id)
        self.set_value_flags(val)
        self.clear_flags('C')

    def exec_dec(self, opcode: int, reg_id: int):
        self.dec_register(reg_id)
        val = self.get_register(reg_id)
        self.set_value_flags(val)
        self.clear_flags('C')

    def decode_non_register_instr(self, opcode: int):
        if opcode == 0x00:
            self.exec_halt()

    def exec_halt(self):
        self.halt_flag = True

    # Misc Methods
    def dump(self):
        print('\n\nSweet16-GP CPU')
        print('---------------------------------------------------------')
        self.dump_status()
        self.dump_registers()
        self.dump_memory()

    def dump_status(self):
        result = ''
        status = self.REGFILE[self.STATUS]
        # print('Status 0b{:016b}'.format(status))
        if self.bit_test(status, status_bits.C):
            result += 'C, '
        if self.bit_test(status, status_bits.Z):
            result += 'Z, '
        if self.bit_test(status, status_bits.V):
            result += 'V, '
        if self.bit_test(status, status_bits.N):
            result += 'N, '
        result = result[:-2]
        print('Status Flags: 0b{:04b}'.format(self.REGFILE[self.STATUS]))
        print('---------------------------------------------------------')
        print(str(result) + ' \n')

    def dump_registers(self):
        idx = 0
        col = 0
        max_cols = 4
        print('Registers')
        print('---------------------------------------------------------')
        for r in self.REGFILE:
            if idx == 0:
                print('ACC (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            elif idx == 4:
                print('RETPTR (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            elif idx == 5:
                print('COMPARE (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            elif idx == 6:
                print('STATUS (R{:02}):0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            elif idx == 7:
                print('PC (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            else:
                # General purpose register
                print('R{:02}: 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            idx += 1
            col += 1
            if col >= max_cols:
                col = 0
                print()
        print()

    def dump_memory(self, max_bytes=256):
        col = 0
        row = 0

        max_cols = 16
        max_rows = int(max_bytes / max_cols)
        idx = 0
        print('\nAddr   Data                                  MEMORY DUMP')
        print('------------------------------------------------------------------------------------------------------')
        for b in self.MEM:
            if col == 0:
                print('0x{:04x}'.format(idx), end=':  ')
                print('0x{:02x}'.format(b), end=', ')
            else:
                print('0x{:02x}'.format(b), end=', ')
            col += 1
            if col >= max_cols:
                row += 1
                col = 0
                print()
            if row >= max_rows:
                print()
                break;
            idx += 1
        print()

    # CPU control routines
    def run(self):
        while not self.halt_flag and self.get_register(self.PC) < len(self.MEM):
            self.cur_instr = self.peek_byte(self.get_register(self.PC))
            print(f'PC: {hex(self.get_register(self.PC))}, cur_instr: {hex(self.cur_instr)}')
            self.decode_inst(self.cur_instr)
            sleep(1)
            self.inc_register(self.PC)
            

def main():
    cpu = Sweet16GP()
    cpu.set_register(0, 0xffae)
    cpu.poke_byte(0, 0x08)
    cpu.poke_byte(1, 0x05)
    cpu.poke_byte(2, 0xff)
    cpu.run()
    cpu.dump()

if __name__ == '__main__':
    # This method is for simple testing
    # and demonstration purposes.
    main()

Sweet16 CPU Emulator

Part 2

Last time I presented the Sweet16-GP and told you I would present an emulated version of the processor. That’s what I intend to do, so let’s get started.

First, what is a software CPU emulator. A software emulator is software that runs on a computer (called the host) and enables that computer to behave like another computer (called the guest). An emulator typically enables the host system to run software or use peripheral devices designed for the guest system. In our case, our emulator will enable our system to run software written for the Sweet16-GP on any computer capable of running python 3.x. In the near future I may rewrite this emulator in PHP and host it so that students it in my workshops can use it online. But for now, I’ll present the python 3 version. 

So why would we want an emulator? Why not just build the hardware? Having an emulator will give us a reference design for the hardware. It will allow us to work out issues that might arise in our instruction set and our cpu design concept. Additionally, it will allow software to be developed while the hardware is being developed. This means that we can have some working software to test the hardware on it’s first boot up.

So what do we need to create an emulator? Well, we need the same things a real system will need. First, we need a CPU. The CPU consists of the REGISTER FILE, ALU, and CONTROL LOGIC. In addition we need a place to store our code and data (memory). Memory is simple. We can use an array or list. The ALU consists mainly of logic routines, the register file is again just a list. Then we need a few helper functions.  Writing an emulator sounds like it would be really hard. However, once you see the code you realize it’s not difficult at all.

Let’s start by defining a CPU class for the Sweet16-GP.

class Sweet16GP:

def __init__(self):
pass

The first thing our CPU will do is boot. The initial boot up that occurs when power is first applied is called a cold boot. To preform the cold boot we’ll simply write a method. We don’t have this method yet so let’s add it and call it from the __init__() method. 

class Sweet16GP:

def __init__(self):
self.cold_boot()

# booting routines
def cold_boot(self, mem_size=256):
pass

Ok, that’s pretty simple. Now our CPU doesn’t do anything yet and it’s still lacking important parts. First, we need some RAM and our Register File. So let’s give our little CPU 16 bytes of RAM and create our Register File to store our registers R0 – R7. We will implement both the register file and RAM as a simple python list. We’ll also include some constants to allow us to refer to the registers by name (ACC, RETSTACK, COMP, STATUS, and PC).

class Sweet16GP:
MEM = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
REGFILE = [0, 0, 0, 0, 0, 0, 0, 0]
ACC = 0
RETSTACK = 4
COMP = 5
STATUS = 6
PC = 7

def __init__(self):
self.cold_boot()

# booting routines
def cold_boot(self, mem_size=256):
pass

Now it’s starting to take shape. Recall from part 1 that our accumulator is register 0, the subroutine return stack pointer is register 4, the compare results register is register 5, the status register is register 6 and the program counter is register 7. We define constants within the class for these values so we can refer to these registers by name when it is desirable to do so.

So now we have 16 bytes of ram we can place instructions in, and our 8 registers. Now we need a way to fetch the instructions we place in memory. For this we’ll need a flag variable to indicate if the CPU is in a halt state and a variable to store the current instruction.

class Sweet16GP:
MEM = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
REGFILE = [0, 0, 0, 0, 0, 0, 0, 0]
ACC = 0
RETSTACK = 4
COMP = 5
STATUS = 6
PC = 7
cur_instr = 0
halt_flag = False

def __init__(self):
self.cold_boot()

# booting routines
def cold_boot(self, mem_size=256):
pass


The halt_flag will be used to halt processor execution and cur_instr will be used to store our current instruction.The PC will be updated to point to the next instruction/data location.

Our next task is to write some code for fetching instructions from memory. We’ll call this the run method as it’s job will be to walk through memory grabbing the instructions pointed to be the PC. 

# Run methods
def run(self):
while not self.halt_flag and self.get_register(self.PC) < len(self.MEM):
self.cur_instr = self.peek_byte(self.get_register(self.PC))
print(f'PC: {hex(self.get_register(self.PC))}, cur_instr: {hex(self.cur_instr)}')
self.decode_inst(self.cur_instr)
sleep(1)
self.inc_register(self.PC)

Now this method is making use of methods we don’t have but will soon need. I tend to code what I want and then implement it. Looking at the run method we can see we’ll need a few new methods. But before we talk about them let’s walk through what the run method is doing here.

Let’s see what we have so far:

First we enter a while loop and test if the halt flag is set and if the PC points to a valid memory location. Next, I included a print statement to display the value of the program counter and the current instruction. Then we make a call to decode_inst() passing this method the current instruction. decode_inst will be used to decode our instruction. Remember that our hardware processor includes an instruction decoder. We’ll be implementing all the parts of our hardware processor as method in our CPU class.

class Sweet16GP:
    MEM = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    REGFILE = [0, 0, 0, 0, 0, 0, 0, 0]
    ACC = 0
    RETSTACK = 4
    COMP = 5
    STATUS = 6
    PC = 7
    cur_instr = 0
    halt_flag = False

    def __init__(self):
        self.cold_boot()

    # booting routines
    def cold_boot(self, mem_size=256):
        pass

    # Register routines
    def self.get_register(self, reg):
        pass

    def inc_register(self, reg):
        pass

    # Memory access routines
    def peek_byte(self, addr):
        pass

    # Instruction decoding routines
    def decode_inst(self, instr):
        pass

    # CPU control routines
    def run(self):
        while not self.halt_flag and self.get_register(self.PC) < len(self.MEM):
            self.cur_instr = self.peek_byte(self.get_register(self.PC))
            print(f'PC: {hex(self.get_register(self.PC))}, cur_instr: {hex(self.cur_instr)}')
            self.decode_inst(self.cur_instr)
            sleep(1)
            self.inc_register(self.PC)

OK, at this point we can start to fill in a couple of these methods. But before we do there is an important issue we need to discuss. Real hardware has a finite number of bits used to store it’s contents. Python however, can store almost any value in our registers and memory. This isn’t good as it doesn’t represent a real CPU. We need a way to restrict the values stored in our memory and registers to only those values that can be stored in our hardware version of the CPU.

To accomplish this we will use python’s bitwise logical operators. If you’re not familiar with them, you may want to head on over to the python website and read up them: &, |, ~, <<, and >>. First, & is the bitwise logical AND operator. The pipe | character is the bitwise OR operator, and the tilde is the bitwise NOT operator. The double less-than is a bitwise shift left, and the double greater than is a bitwise shift right operator. If you don’t understand these completely and their typical uses, then read up on them.

We can use these operators to manipulate the values we store in our memory and register lists. If we & (and) the values we place in memory or registers with the values 0xff or 0xffff we can limit the values stored to 8 and 16 bits respectively. This is just what we need to restrict our values to a range we can store on our hardware. Let’s implement the get_register() method. This method simply needs to take in an integer value and store it in the proper location in our register file.

def get_register(self, id: int) -> int:
    return self.REGFILE[id] & 0xffff

Simple, isn’t it!

Now we’ve left a couple of things undone. Remember our cold_boot() method? Well it doesn’t do anything right now. In an real processor this would reset things like clocks and counters and put everything into a known state so we have a regular starting point. In our emulator we need it to do the same task. However, most systems also have what is called a warm boot. A warm boot occurs when you reset the system using the reset signal. This signal causes the CPU registers to be set into a known initial state. Just as the cold boot does. In real hardware there may be some differences. In our software version we will take a little bit of leeway here do something the real hardware won’t do. We will clear the ram (set it to all zeros values) in the cold boot and leave it untouched during a warm boot. This will help us in keeping track of what values in ram we’ve changed and make debugging easier. However, it’s important to document this deviation from true emulation of the real hardware which will not zero memory for us. So let’s write the code to initialize the memory to all zero values.

# Memory handling routines
def init_ram(self, mem_size=265):
    self.MEM = []
    for addr in range(mem_size):
        self.MEM.append(0x00)

We also need to fill in our peek_byte() method and while we’re at it, we should just complete our memory manipulation methods. We know we’re going to need to be able to look (peek) at memory addresses and see what value they contain. We will also need to be able to store (poke) values into an address location, and we should be able to do this with both 8 and 16 bit values. So that leaves us needing the following methods: peek_byte(), poke_byte(), peek_word(), and poke_word(). So our full memory manipulation repertoire now looks as follows: 

# Memory handling routines
def init_ram(self, mem_size=265):
    self.MEM = []
    for addr in range(mem_size):
        self.MEM.append(0x00)

def poke_byte(self, addr: int, val: int):
    self.MEM[addr] = val & 0xff

def peek_byte(self, addr: int) -> int:
    return self.MEM[addr] & 0xff

def poke_word(self, addr: int, val: int):
    self.poke_byte(addr, (val & 0xff))
    self.poke_byte(addr+1, (val & 0xff00) >> 8)

def peek_word(self, addr: int) -> int:
    lo = self.peek_byte(addr)
    hi = self.peek_byte(addr+1)
    val = ((hi << 8) + lo) & 0xffff
    return (hi << 8) + lo

For our boot process we will have __init__() call cold_boot() which will call our init_ram() method to clear memory. But we still need to implement a method to clear the register files. We will have cold_boot() call our warm_boot() method which will in turn call init_regfile() which will zero out all our registers. We could do this all in one method but it’s important to keep each step separate as when the hardware is built some aspects of this process may change, and we’ll want the structure in place to support those changes. I realize to some of you with a lot of programming experience you may be saying “Wait! We shouldn’t be trying to anticipate the future. We should just write the least amount of code to handle the current situation.” And in most cases this is true. However, experience has taught me that we want to call out each process our hardware may go through or down the road we may end up refactoring a lot of code. Also, it’s important from a teaching perspective to have each of these processes in place so those just learning can see each process represented. So let’s see what our boot code looks like now.

def __init__(self):
    self.cold_boot()

# booting routines
def cold_boot(self, mem_size=256):
    self.init_ram(mem_size)
    self.warm_boot()

def warm_boot(self):
    self.init_regfile()

# Register handling routines
def init_regfile(self):
    for id in range(len(self.REGFILE)):
        self.REGFILE[id] = 0x00

Ok, while we’re at it, we are going to need a handful of register manipulation methods similar to those for memory manipulation. Recall our registers are 16 bits wide but some instructions work on byte values. We will need to be able to read and write our registers in both word and byte modes. In byte mode we need to be able to read/write both the high and low order bytes in the registers.  We also need to be able to increment and decrement registers. Finally, don’t forget we need to restrict any value going into a register to it’s byte or word size. Below is our complete set of register methods:

# Register handling routines
def init_regfile(self):
    for id in range(len(self.REGFILE)):
        self.REGFILE[id] = 0x00

def get_register(self, id: int) -> int:
    return self.REGFILE[id] & 0xffff

def get_register_lsb(self, id: int) -> int:
    return self.REGFILE[id] & 0xff

def get_register_msb(self, id: int) -> int:
    return (self.REGFILE[id] & 0xff00) >> 8

def set_register(self, id: int, val: int):
    self.REGFILE[id] = val & 0xffff

def set_register_lsb(self, id: int, val: int):
    hi = self.REGFILE[id] & 0xff00
    self.REGFILE[id] = (hi | (val & 0xff)) & 0xffff

def set_register_msb(self, id: int, val: int):
    lo = self.REGFILE[id] & 0xff
    self.REGFILE[id] = ((val & 0xff) << 8) | lo

def inc_register(self, id: int):
    val = self.get_register(id)
    val = (val + 1) & 0xffff
    self.set_register(id, val)

def dec_register(self, id: int):
    val = self.get_register(id)
    val = (val - 1) & 0xffff
    self.set_register(id, val)

OK, all through this process I’ve been keeping something from you. I’ve presented code as if I have been magically writing each method and simply know instinctively that it works! Well, most of these methods did indeed work the first time I wrote them, but not all of them. What I have been leaving out is the testing that was done during development of each method. I am a big fan of Test Driven Development (TDD, however it isn’t for every project).  While developing the emulator, I wrote tests even before writing my methods. This allows you to slow down and think about what you need and how you’re going to implement it. 

I use “unittest” to write unit tests for each method. I test the methods with tests that should succeed and tests that should fail. I try to work out edge cases where there may be issues. Like what happens if I multiply two 16 bit values and place the results into a register without limiting the value to an 8 or 16 bit range as needed? Unit tests are very important in developing code that can be proven to work. However, remember your code is proven only as far as you test it. If you don’t test for a particular input, you can’t prove your code won’t fail given that input. Being human, we will rarely test for all possible values of input. In many cases this would be impractical and in some cases impossible. But we should do our best, within reason, to test each method in every class and every function in our code. I bring this up here as we are going to get quite a bit of code built up here and it’s best to know that your methods work properly before adding more complexity to the project. Below I show a couple test methods. However, I’d like you to write your own tests to test each method we have so far and each method I present in the future. Writing tests is a good habit you should get into doing for all your code.

def test_warm_boot(self):
    from sweet16gp import Sweet16GP
    cpu = Sweet16GP()

    cpu.REGFILE[cpu.ACC] = 0xffff
    cpu.REGFILE[cpu.RETSTACK] = 0xffff
    cpu.REGFILE[cpu.COMP] = 0xffff
    cpu.REGFILE[cpu.STATUS] = 0xffff
    cpu.REGFILE[cpu.PC] = 0xffff

    cpu.warm_boot()
    self.assertEqual(0, cpu.REGFILE[cpu.ACC], 'Failed to clear ACC register (R0)')
    self.assertEqual(0, cpu.REGFILE[cpu.RETSTACK], 'Failed to clear RETSTACK register (R4)')
    self.assertEqual(0, cpu.REGFILE[cpu.COMP], 'Failed to clear COMP register (R5)')
    self.assertEqual(0, cpu.REGFILE[cpu.STATUS], 'Failed to clear STATUS register (R6)')
    self.assertEqual(0, cpu.REGFILE[cpu.PC], 'Failed to clear PC register (R7)')

def test_init_regfile(self):
    from sweet16gp import Sweet16GP
    cpu = Sweet16GP()
    self.assertEqual(8, len(cpu.REGFILE), 'Failed to initialize the register file')

def test_get_register(self):
    from sweet16gp import Sweet16GP
    cpu = Sweet16GP()
    cpu.REGFILE[cpu.ACC] = 0xfafc
    self.assertEqual(0xfafc, cpu.get_register(cpu.ACC), 'Failed to get register ACC (R0)')

The above code is only a short snippet from my tests, which contain multiple tests for each method written. Testing your code this way will make development easier and it can also help people get up to speed with your code by providing some sample calls.  However, you shouldn’t take the tests as absolute proof the code is without error. It’s only proven as far as it is tested! 

Also, note that when possible I try to limit the use of additional methods to setup the pre-test conditions. For example, I directly set the values of registers in the test_warm_boot() method and then call warm_boot() which should reset them all to zero. Then I test with assert methods whether the registers were indeed reset to zero. If I had used the set_register() method to place the pre-test values into the registers and a test failed, I wouldn’t know if it was the set_register() method or the warm_boot()/init_register() methods that failed. However, once I have a good set of tests in place to prove that set_register() works as expected, I can then use it in later tests. This also brings up the point that if you are new to a code base, you should be careful about how you use test code to interpret how to use the code base. In production code you wouldn’t want to directly set values in the registers, as I did in the warm_boot() method above. You would instead want to call the set_register() method to do that for you and get the added bonus of being guaranteed that the value will be limited to the acceptable range of values for that call.

 

Ok, enough of the sidebar on testing. Let’s get back to implementation. If we look over the instructions in part 1 we can see some instructions will require bit manipulation, so we’re going to need a set of method for performing a set of common bit twiddling tasks. For example, given a value we will need methods to set, clear, toggle, and test a particular bit in the value. Let’s see what these look like:

# Bit twiddling routines
def bit_set(self, val: int, bit: int) -> int:
    return val | (1 << bit)

def bit_clear(self, val: int, bit: int) -> int:
    return val & ~(1 << bit)

def bit_toggle(self, val: int, bit: int) -> int:
    return val ^ (1 << bit)

def bit_test(self, val: int, bit: int) -> int:
    return (val & (1 << bit)) >> bit

Go ahead and write some tests for the above routines. Make sure you know what they do and how to use them. These routines mirror the processes that occur in real hardware so it is important to have a good grasp of bit manipulations and how to use them.


These routines lead me to a closely related subject. That of status bits. Recall from part 1 that the CPU contains a STATUS register. This register contains bits that act as flags for certain predefined conditions. These flags almost always represent the state or value of the accumulator (ACC, R0) register. However, they may also represent the results of a CPR instruction. The CPR (Compare) instruction places it’s results in the COMP register (R5) and sets the flags in the Status register to represent the results. The flags bits consist of the N flag which is set if the result of an operation is negative, the V flag which is set if the result of an operation overflows the range of values that can be stored in the machine, the C flag which is set when a carry or borrow occurs in an arithmetic operation, and the Z or zero flag which indicates the previous operation resulted in a zero value.

These flags are used to control program flow via branch instructions. The BEQ instructions is usually preceded with a CPR instruction which compares two values by subtracting one from the other. If the result is zero, the two values were equal and the branch occurs. The overflow and carry flags are used in mathematical operations to test if a result is valid. The N flag can be used to indicate if one value is less or greater than another. 

We need to implement methods to set and test these bits in the STATUS register. First, we need to define in which bit of the STATUS register each flags resides. To do this I created a new file called status_bits.py. It simply holds the definition of each flag’s bit position.

C: int = 0
Z: int = 1
V: int = 6
N: int = 7

As you can see, the two flags reside at each end of the low order byte of the STATUS register. Once we import this file into our main CPU file we will have access to these values. We’ll need these in our STATUS flag routines which follow:

# STATUS Flags routines
def set_flags(self, flags: str):
    flags = flags.upper()
    if flags.__contains__('C'):
        self.REGFILE[self.STATUS] |= (1 << status_bits.C)
    if flags.__contains__('Z'):
        self.REGFILE[self.STATUS] |= (1 << status_bits.Z)
    if flags.__contains__('V'):
        self.REGFILE[self.STATUS] |= (1 << status_bits.V)
    if flags.__contains__('N'):
        self.REGFILE[self.STATUS] |= (1 << status_bits.N)

def clear_flags(self, flags: str):
    flags = flags.upper()
    if flags.__contains__('C'):
        self.REGFILE[self.STATUS] &= ~(1 << status_bits.C)
    if flags.__contains__('Z'):
        self.REGFILE[self.STATUS] &= ~(1 << status_bits.Z)
    if flags.__contains__('V'):
        self.REGFILE[self.STATUS] &= ~(1 << status_bits.V)
    if flags.__contains__('N'):
        self.REGFILE[self.STATUS] &= ~(1 << status_bits.N)

def test_flag(self, flag: str) -> int:
    flags = flag.upper()
    if flags.__contains__('C'):
        return (self.REGFILE[self.STATUS] & (1 << status_bits.C)) >> status_bits.C
    if flags.__contains__('Z'):
        return (self.REGFILE[self.STATUS] & (1 << status_bits.Z)) >> status_bits.Z
    if flags.__contains__('V'):
        return (self.REGFILE[self.STATUS] & (1 << status_bits.V)) >> status_bits.V
    if flags.__contains__('N'):
        return (self.REGFILE[self.STATUS] & (1 << status_bits.N)) >> status_bits.N

def set_value_flags(self, val: int):
    # Note we can't set the overflow
    # flag here as we need both input
    # and result values to computer
    # overflow.
    if val == 0:
        self.set_flags('Z')
    else:
        self.clear_flags('Z')

    if val > 0b0111_1111_1111_1111:
        self.set_flags('N')
    else:
        self.clear_flags('N')

    if (val & 0x10000) >> 17:
        self.set_flags('C')
    else:
        self.clear_flags('C')

def set_overflow_flags(self, v1: int, v2: int, result: int):
    if ((v1 & (1 << 15)) >> 15 & (v2 & (1 << 15)) >> 15) == 1:
        # Both values have the sign bit set
        # So the result should have the sign
        # bit unset. If not, we have overflow
        if not (result & (1 << 15)) > 0:
            self.set_flags('V')
        else:
            self.clear_flags('V')
    elif ((v1 & (1 << 15)) >> 15 | (v2 & (1 << 15)) >> 15) == 0:
        # Both values have the sign bits unset
        # So, the result should have the sign
        # bit unset.
        if ((result & (1 << 15)) >> 15) == 1:
            self.set_flags('V')
        else:
            self.clear_flags('V')

def set_carry(self, v1, v2, result):
    # value must be unmasked or it won't include the 17th bit.
    if ((result & (1 << 16)) >> 16) == 1:
        self.set_flags('C')
    else:
        self.clear_flags('C')

def set_borrow(self, v1, v2):
    if v1 < v2:
        self.set_flags('C')
    else:
        self.clear_flags('C')

Again, write some tests for these routines. Read up on overflow condition detection and binary arithmetic and be sure you understand what is going on here.  Note that for overflow we can’t just do a test and set, as the overflow condition occurs in two’s complement arithmetic and requires that we know the state of the values previous to the operation performed in order to properly calculate the overflow.

Also, the carry flag can only be set using the raw (unmasked) result of an operation, as it depends on the 17th bit of the result being tested. If we use a mask value to limit our results to 16 bits then there will be no carry bit to test, so this method must be called before masking the results to 16 bits.

Now it would be nice to have some methods to allow us to peek into the processor and see its state. The debugger in your IDE is good for this, but having a nice dump of the registers, memory and flags would allow us to visually check the results of an operation without needing the debugger.  Let’s implement a few routines for this just to make our lives easier.

# Misc Methods
def dump(self):
    print('\n\nSweet16-GP CPU')
    print('---------------------------------------------------------')
    self.dump_status()
    self.dump_registers()
    self.dump_memory()

def dump_status(self):
    result = ''
    status = self.REGFILE[self.STATUS]
    #print('Status 0b{:016b}'.format(status))
    if self.bit_test(status,status_bits.C):
        result += 'C, '
    if self.bit_test(status, status_bits.Z):
        result += 'Z, '
    if self.bit_test(status, status_bits.V):
        result += 'V, '
    if self.bit_test(status, status_bits.N):
        result += 'N, '
    result = result[:-2]
    print('Status Flags: 0b{:04b}'.format(self.REGFILE[self.STATUS]))
    print('---------------------------------------------------------')
    print(str(result) +' \n')

def dump_registers(self):
    idx = 0
    col = 0
    max_cols = 4
    print('Registers')
    print('---------------------------------------------------------')
    for r in self.REGFILE:
        if idx == 0:
            print('ACC (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
        elif idx == 4:
            print('RETPTR (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
        elif idx == 5:
            print('COMPARE (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)),end=',  ')
        elif idx == 6:
            print('STATUS (R{:02}):0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
        elif idx == 7:
            print('PC (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
        else:
            # General purpose register
            print('R{:02}: 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
        idx += 1
        col += 1
        if col >= max_cols:
            col = 0
            print()
    print()

def dump_memory(self, max_bytes=256):
    col = 0
    row = 0

    max_cols = 16
    max_rows = int(max_bytes/max_cols)
    idx = 0
    print('\nAddr   Data                                  MEMORY DUMP')
    print('------------------------------------------------------------------------------------------------------')
    for b in self.MEM:
        if col == 0:
            print('0x{:04x}'.format(idx), end=':  ')
            print('0x{:02x}'.format(b), end=', ')
        else:
            print('0x{:02x}'.format(b), end=', ')
        col += 1
        if col >= max_cols:
            row += 1
            col = 0
            print()
        if row >= max_rows:
            print()
            break;
        idx += 1
    print()

That’s quite a bit of code for a post. So let’s see it all together.

"""
 Sweet16GP an implementation of an 8/16 bit CPU Emulator
 based on Steve Wozniak's SWEET16 virtual machine.
"""

from time import sleep, time
import status_bits


class Sweet16GP:
    MEM = []
    REGFILE = [0, 0, 0, 0, 0, 0, 0, 0]
    ACC = 0
    RETSTACK = 4
    COMP = 5
    STATUS = 6
    PC = 7
    cur_instr = 0
    halt_flag = False

    def __init__(self):
        self.cold_boot()

    # booting routines
    def cold_boot(self, mem_size=256):
        self.init_ram(mem_size)
        self.warm_boot()

    def warm_boot(self):
        self.init_regfile()

    # Memory handling routines
    def init_ram(self, mem_size=265):
        self.MEM = []
        for addr in range(mem_size):
            self.MEM.append(0x00)

    def poke_byte(self, addr: int, val: int):
        self.MEM[addr] = val & 0xff

    def peek_byte(self, addr: int) -> int:
        return self.MEM[addr] & 0xff

    def poke_word(self, addr: int, val: int):
        self.poke_byte(addr, (val & 0xff))
        self.poke_byte(addr+1, (val & 0xff00) >> 8)

    def peek_word(self, addr: int) -> int:
        lo = self.peek_byte(addr)
        hi = self.peek_byte(addr+1)
        val = ((hi << 8) + lo) & 0xffff
        return (hi << 8) + lo

    # Register handling routines
    def init_regfile(self):
        for id in range(len(self.REGFILE)):
            self.REGFILE[id] = 0x00

    def get_register(self, id: int) -> int:
        return self.REGFILE[id] & 0xffff

    def get_register_lsb(self, id: int) -> int:
        return self.REGFILE[id] & 0xff

    def get_register_msb(self, id: int) -> int:
        return (self.REGFILE[id] & 0xff00) >> 8

    def set_register(self, id: int, val: int):
        self.REGFILE[id] = val & 0xffff

    def set_register_lsb(self, id: int, val: int):
        hi = self.REGFILE[id] & 0xff00
        self.REGFILE[id] = (hi | (val & 0xff)) & 0xffff

    def set_register_msb(self, id: int, val: int):
        lo = self.REGFILE[id] & 0xff
        self.REGFILE[id] = ((val & 0xff) << 8) | lo

    def inc_register(self, id: int):
        val = self.get_register(id)
        val = (val + 1) & 0xffff
        self.set_register(id, val)

    def dec_register(self, id: int):
        val = self.get_register(id)
        val = (val - 1) & 0xffff
        self.set_register(id, val)

    # Memory handling routines
    def init_ram(self, mem_size=265):
        self.MEM = []
        for addr in range(mem_size):
            self.MEM.append(0x00)

    def poke_byte(self, addr: int, val: int):
        self.MEM[addr] = val & 0xff

    def peek_byte(self, addr: int) -> int:
        return self.MEM[addr] & 0xff

    def poke_word(self, addr: int, val: int):
        self.poke_byte(addr, (val & 0xff))
        self.poke_byte(addr + 1, (val & 0xff00) >> 8)

    def peek_word(self, addr: int) -> int:
        lo = self.peek_byte(addr)
        hi = self.peek_byte(addr + 1)
        val = ((hi << 8) + lo) & 0xffff
        return (hi << 8) + lo

    # Bit twiddling routines
    def bit_set(self, val: int, bit: int) -> int:
        return val | (1 << bit)

    def bit_clear(self, val: int, bit: int) -> int:
        return val & ~(1 << bit)

    def bit_toggle(self, val: int, bit: int) -> int:
        return val ^ (1 << bit)

    def bit_test(self, val: int, bit: int) -> int:
        return (val & (1 << bit)) >> bit

    # STATUS Flags routines
    def set_flags(self, flags: str):
        flags = flags.upper()
        if flags.__contains__('C'):
            self.REGFILE[self.STATUS] |= (1 << status_bits.C)
        if flags.__contains__('Z'):
            self.REGFILE[self.STATUS] |= (1 << status_bits.Z)
        if flags.__contains__('V'):
            self.REGFILE[self.STATUS] |= (1 << status_bits.V)
        if flags.__contains__('N'):
            self.REGFILE[self.STATUS] |= (1 << status_bits.N)

    def clear_flags(self, flags: str):
        flags = flags.upper()
        if flags.__contains__('C'):
            self.REGFILE[self.STATUS] &= ~(1 << status_bits.C)
        if flags.__contains__('Z'):
            self.REGFILE[self.STATUS] &= ~(1 << status_bits.Z)
        if flags.__contains__('V'):
            self.REGFILE[self.STATUS] &= ~(1 << status_bits.V)
        if flags.__contains__('N'):
            self.REGFILE[self.STATUS] &= ~(1 << status_bits.N)

    def test_flag(self, flag: str) -> int:
        flags = flag.upper()
        if flags.__contains__('C'):
            return (self.REGFILE[self.STATUS] & (1 << status_bits.C)) >> status_bits.C
        if flags.__contains__('Z'):
            return (self.REGFILE[self.STATUS] & (1 << status_bits.Z)) >> status_bits.Z
        if flags.__contains__('V'):
            return (self.REGFILE[self.STATUS] & (1 << status_bits.V)) >> status_bits.V
        if flags.__contains__('N'):
            return (self.REGFILE[self.STATUS] & (1 << status_bits.N)) >> status_bits.N

    def set_value_flags(self, val: int):
        # Note we can't set the overflow
        # flag here as we need both input
        # and result values to computer
        # overflow.
        if val == 0:
            self.set_flags('Z')
        else:
            self.clear_flags('Z')

        if val > 0b0111_1111_1111_1111:
            self.set_flags('N')
        else:
            self.clear_flags('N')

        if (val & 0x10000) >> 17:
            self.set_flags('C')
        else:
            self.clear_flags('C')

    def set_overflow_flags(self, v1: int, v2: int, result: int):
        if ((v1 & (1 << 15)) >> 15 & (v2 & (1 << 15)) >> 15) == 1:
            # Both values have the sign bit set
            # So the result should have the sign
            # bit unset. If not, we have overflow
            if not (result & (1 << 15)) > 0:
                self.set_flags('V')
            else:
                self.clear_flags('V')
        elif ((v1 & (1 << 15)) >> 15 | (v2 & (1 << 15)) >> 15) == 0:
            # Both values have the sign bits unset
            # So, the result should have the sign
            # bit unset.
            if ((result & (1 << 15)) >> 15) == 1:
                self.set_flags('V')
            else:
                self.clear_flags('V')

    def set_carry(self, v1, v2, result):
        # value must be unmasked or it won't include the 17th bit.
        if ((result & (1 << 16)) >> 16) == 1:
            self.set_flags('C')
        else:
            self.clear_flags('C')

    def set_borrow(self, v1, v2):
        if v1 < v2:
            self.set_flags('C')
        else:
            self.clear_flags('C')

    # Instruction decoding routines
    def decode_inst(self, instr):
        pass

    # Misc Methods
    def dump(self):
        print('\n\nSweet16-GP CPU')
        print('---------------------------------------------------------')
        self.dump_status()
        self.dump_registers()
        self.dump_memory()

    def dump_status(self):
        result = ''
        status = self.REGFILE[self.STATUS]
        # print('Status 0b{:016b}'.format(status))
        if self.bit_test(status, status_bits.C):
            result += 'C, '
        if self.bit_test(status, status_bits.Z):
            result += 'Z, '
        if self.bit_test(status, status_bits.V):
            result += 'V, '
        if self.bit_test(status, status_bits.N):
            result += 'N, '
        result = result[:-2]
        print('Status Flags: 0b{:04b}'.format(self.REGFILE[self.STATUS]))
        print('---------------------------------------------------------')
        print(str(result) + ' \n')

    def dump_registers(self):
        idx = 0
        col = 0
        max_cols = 4
        print('Registers')
        print('---------------------------------------------------------')
        for r in self.REGFILE:
            if idx == 0:
                print('ACC (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            elif idx == 4:
                print('RETPTR (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            elif idx == 5:
                print('COMPARE (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            elif idx == 6:
                print('STATUS (R{:02}):0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            elif idx == 7:
                print('PC (R{:02}): 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            else:
                # General purpose register
                print('R{:02}: 0x{:04x}'.format(idx, (self.REGFILE[idx] & 0xFFFF)), end=',  ')
            idx += 1
            col += 1
            if col >= max_cols:
                col = 0
                print()
        print()

    def dump_memory(self, max_bytes=256):
        col = 0
        row = 0

        max_cols = 16
        max_rows = int(max_bytes / max_cols)
        idx = 0
        print('\nAddr   Data                                  MEMORY DUMP')
        print('------------------------------------------------------------------------------------------------------')
        for b in self.MEM:
            if col == 0:
                print('0x{:04x}'.format(idx), end=':  ')
                print('0x{:02x}'.format(b), end=', ')
            else:
                print('0x{:02x}'.format(b), end=', ')
            col += 1
            if col >= max_cols:
                row += 1
                col = 0
                print()
            if row >= max_rows:
                print()
                break;
            idx += 1
        print()

    # CPU control routines
    def run(self):
        while not self.halt_flag and self.get_register(self.PC) < len(self.MEM):
            self.cur_instr = self.peek_byte(self.get_register(self.PC))
            print(f'PC: {hex(self.get_register(self.PC))}, cur_instr: {hex(self.cur_instr)}')
            self.decode_inst(self.cur_instr)
            sleep(1)
            self.inc_register(self.PC)
            

if __name__ == '__main__':
    cpu = Sweet16GP()
    cpu.set_register(0, 0xffae)
    cpu.poke_byte(0, 0xbc)
    cpu.poke_byte(1, 0xd7)
    cpu.poke_word(4, 0xabcd)
    cpu.dump()

Now test this code. You only need to write some code to set values in the registers and memory, then call the dump methods and visually inspect the results to ensure they are as expected. This will give us a nice visual for the results of a program run.

Ok, that’s a lot for today. We now have a large set of housekeeping methods. Next time we will revisit the run() method and learn how to decode our instructions .

Salsa – An Operating System for Teaching – Part 1

by SysOps 0 Comments

It seems as one grows older you naturally take on the responsibility of teaching the next generation. Whether it be ranching, machining , or hardware/software development. What ever your background you will naturally find those younger who look to you for advice on one subject or another.

I too seem to have come to that point in my life when those younger than I seem to look to me for advice on hardware, software, and life issues. I find myself teaching how things came to be as they are now. Having some historical context helps many understand the why and how.

As you know, I’ve given many talks on hardware and software and regularly teach topics on the intersection of hardware and software to software developers. Many of these developers have little to no understanding of how software controls hardware or how hardware enables software manipulation of a system. I believe that having a basic understanding of these topics only makes for better developers. Just as I believe that having a knowledge of multiple programming languages and programming paradigms can only make you a better developer.

I have long taught that a good developer has one or two languages they are well versed in and another half dozen that they are familiar enough with to become productive in a weekend. One language doesn’t fit all problems well. In the same way a carpenter wouldn’t have a single hammer in his toolbox a software developer shouldn’t know only a single language. Who would want to write a simple text parsing script in assembly or C? Who would tackle a high speed computer vision problem in python without calling out to C/C++ libraries?

But I digress, the purpose of this series of articles is to introduce a basic Operating System for teaching OS development and the hardware boot process. As usual, I will start with some background. We will move on to a legacy BIOS (Basic Input Output System) boot system and then eventually cover the move to UEFI (Unified Extensible Firmware Interface). So let’s get started.

A Little History

To write your own OS you need to understand a few basics. We will concentrate on x86 and x86-64 hardware. My aim here is not to produce a fully functional OS that can replace Windows, Linux, or iOS. Instead, I only want to show you how to get a basic OS up and running. I’ll give you enough information and point you in the right direction so that if you’re interested, you can find further knowledge. Sometimes the trick to learning a complex subject is knowing just enough to be able to ask the right questions. But you have to have some knowledge to know what questions to ask, and where to ask them. That’s what Salsa is all about, teaching you what to ask next and help you identify where to learn more.

Before we get to far into this article let me say that for a novice developer OS development is difficult. I may leave a lot of details out that a novice may need. However, I’ll try to reduce each topic to the core essentials. Some of the material will require you to seek more information than I have provided here. At the very least you’ll need a familiarity with some kind of programming language.

The first thing we should cover is what is an operating system? In the early days of computing computers came as hardware only. You were provided a manual of the hardware and instruction set. These instructions where either binary machine code or later assembly mnemonics. The programmer had to develop an intimate knowledge of the computer’s hardware and how it all worked together. Each system was very different from the others. No one language ran on more than a single computer (later, a single computer manufacturer’s line). These computers had no operating system. When the machine was turned on, the programmer had to input a loader program that read in the program to be ran. This loader program was usually rewritten as the programs it loaded changed. Eventually, some programmers began to write more universal loaders that could load any program into the computer’s memory and then jump to the beginning of the program to start executing it.

One issue with loaders and the programs of the day was that the addresses of the various subroutines and indeed the address of the program itself was hard-coded into the program. So if the size or location of the loader changed the programs had to have all addresses recalculated. The addition of a single byte to the loader caused every line of the program to be recompiled. At this time these programs where compiled by hand! There were no compilers or interpreters.

To give you a feel for how this was done let’s look at a fictitious machine and it’s machine language:

The Fiction 1 Computational Machine

Instruction Set
Machine Code : Instruction
00000000 : No operation
00000001 : Load Accumulator A with the value following the current byte
00000010 : Load Accumulator B with the value following the current byte
00000011 : Add the values in Accumulators A and B and place the results in A
00000100 : Store the contents of Accumulator A at the memory address stored in the following byte
00000101 : Load the Accumulator A with the value stored at the address stored in the following byte
00000110 : Branch to program address stored in the following byte
00000111 : Subtract the value in Accumulator B from the value in Accumulator A and store in A
00001000 : Test Carry/Barrow Flag. The flag is set when an addition or subtraction results in a carry or barrow. When tested, sets the Accumulator A to zero if the flag was set, otherwise 255 is placed in Accumulator A.
00001001 : Branch to the address stored in the following byte if the Accumulator A contains 255
00001010 : Move the value in Accumulator A into Accumulator B
00001011 : Move the value in Accumulator B into Accumulator A

OK, looking at the machine instructions above let’s try to write a simple program that takes two numbers from memory locations 1000 and 1001 and multiplies them:

The first thing you may notice is that there is no multiplication instructions. So how do we multiply two numbers? Well, multiplication is just repetitive addition. So we simply use addition to do our multiplication. Here’s the pseudo code:
1. Load A with first value
2. Load B with second value
3. Add A and B, result is stored in A
4. Decrement B
5. Is B zero?
6. Yes, we’re done
7. No, go back to step 3

Our program:
Notes: Data Address 1011 will be used to store intermediate results
Data Address 1100 will be used for temporary storage of counter
————————————————————————
Address Instruction
00000000 00000101 # Load Accumulator A with second value
00000001 00001001 # Address of value to place in B
00000010 00001010 # Move value to B
00000011 00000101 # Load A with first value

00000100 00001100 # Address of value to place
00000100 00000011 # Add A and B and store result in A
00000101 00000100 # Save value in A
00000110 00001011 # Data Memory location to hold intermediate result
00000111 00001011 # Move value in B to A
00001000 00000010 # Load B with the value 1
00001001 00000111 # Subtract B from A and store result in A, Also sets the Carry/Borrow flag
00001010 00000100 # Save value in A
00001011 00001100 # Data Memory counter storage location
00001100 00001000 # Test flag and set A value, if flag was set, done!
00001101 00001001 # Branch if flag was cleared, destroys A
00001110 00000110 # Branch to address stored in next byte
00001111 00001110 # Forever loop… jump to self
00010000 00000101 # Retrieve counter in A
00010001 00001100 # Counter temp storage address
00010010 00001010 # Move A to B
00010011 00000101 # Retrieve current value in A
00010100 00000110 # Jump back to step 3 (Add)
00010101 00000100 # Address of step 3

The above machine program for our fictitious machine had to be hand assemble. You simply look up the value of each of the instructions in the instruction listing and then write them down. Once this was done, you then entered these values into the control panel of the machine. This was usually done via a set of switches. For our machine we would need to set 8 switches for the address and 8 more switches for the instruction byte. Then press some other switch to load the data into program memory.

Once our program is finished it simply loops at address 00001110 forever. In a real system you would store the result in a Data Memory location to be read by another program, or display it on a set of lamps. Note that all the addresses of the data and routines are hard-coded.

Eventually, programmers developed mnemonics for the machine instruction values. These came to be known as Assembly code. Assembly is a very low level abstraction on the actual machine code. A single assembly code instructions such as LDA #3 (load A with a value of 3) usually results in a single machine code instruction. However, some assembly language instructions can also result in a handful of machine instructions. Assembly language is much easier for humans to read and write. So it has become the standard language for ultra-low-level code and no-one but the chip designers use machine code anymore.

Fiction 1 Assembly Mnemonics

Mnemonic : Instruction
NOP : No operation
LDA <val> : Load Accumulator A with the value following the current byte
LDB <val> : Load Accumulator B with the value following the current byte
ADD : Add the values in Accumulators A and B and place the results in A
STA ADDR : Store the contents of Accumulator A at the memory address stored in the following byte
LDA [ADDR] : Load the Accumulator A with the value stored at the address stored in the following byte
BRA ADDR : Branch to program address stored in the following byte
SUB : Subtract the value in Accumulator B from the value in Accumulator A and store in A
TST : Test Carry/Barrow Flag. The flag is set when an addition or subtraction results in a carry or barrow. When tested, sets the Accumulator A to zero if the flag was set, otherwise 255 is placed in Accumulator A.
BRC ADDR : Branch to the address stored in the following byte if the Accumulator A contains 255
MOV A, B : Move the value in Accumulator A into Accumulator B
MOV B, A : Move the value in Accumulator B into Accumulator A

Using the assembly mnemonics our program becomes:

Address Instruction
LDA [00001001b] # Load Accumulator A with
MOV A, B # Move value to B
LDA [00001100b] # Load A with value
lp01: ADD # Add A and B and store result in A
STA [00001011b] # Save value in A
MOV B, A # Move value in B to A
LDB 00000001b # Load B with the value 1
SUB # Subtract B from A and store result in A, Also sets the Carry/Borrow flag
STA [00001100b] # Save value in A
TST # Test flag and set A value, if flag was set, done!
loop: BRC loop # Branch if flag was cleared, destroys A
LDA [0001100b] # Retrieve counter in A counter from temp storage
MOV A, B # Move A to B
LDA [00001100b] # Retrieve current value in A
BRA lp01 # Jump back to step 3 (Add

As you can see, using assembly mnemonics the code becomes much easier to read and understand. Looking through the code it doesn’t take long to understand that when a value is given in square brackets it refers to the value stored at the given memory address. If the value is not in brackets it is used as presented. These Mnemonics are much easier to remember and understand when compared to the machine code values shown above.

To see an example of what early computers were like visit: http://oldcomputers.net/altair-8800.html. You may also enjoy this video demonstration of the Altair 8800: https://www.youtube.com/watch?v=vAhp_LzvSWk. The Altair was my first computer. My father purchased a kit and put it together and allowed me to help. We then spent many hours writing simple programs for it and building our own peripherals. The Altair was the start of personal computers. While there were a couple of small computers before it (see http://www.computermuseum.20m.com/kenbak.htm), they never quite took off. The Altair however, found it’s way to the cover of Popular Electronics and found admiration from the likes of Bill Gates who wrote the first BASIC interpreter for the Altair 8800. Which resulted in the Microsoft Corporation.

Ok, so we’ve seen how it was done in the early days. Now our program doesn’t use a loader. However, for most computers loaders are required. The programmer would write a program that started just past the end of the loader program and would use addresses from that point on. Eventually, David Wheeler wrote a relocatable linker/loader for the EDSAC computer. This allowed programmers to forget about the program’s actual location in memory and let the linker/loader figure it out.

With the advent of larger more complex computer systems the need arose to have a set of routines that were permanently stored on the computer and could be called by a management program. These libraries of routines provided canned code for interacting with I/O devices like paper tape readers. They also contained the loaders to load programs from paper tape into memory and begin executing that code. These libraries grew as did the complexity and number of peripherals. Eventually, these machine were produced with a large library of hardware management routines and schedulers to allow compute jobs to be multiplexed in a time-sharing operation. It was these early libraries that eventually became the kernels of today’s operating systems.

Today’s Operating Systems now provide a host of services. However, there is still a small library of code that comes with the hardware and is responsible for initializing the hardware. This code, called the BIOS or UEFI and is executed when the hardware is first powered on. It is responsible for calling the OS loader or “Boot Loader”. The BIOS is older than the UEFI and was introduced in the first PCs back in the 1970’s. It can still be found in an emulation mode in current PCs using UEFI. UEFI was designed to replace the BIOS. It provides many more routines and services and is in step with today’s hardware. That said, we’ll first use the BIOS as it is important to understand the BIOS and how to boot using a BIOS system. In later articles we’ll examine the UEFI.

The Boot Process

The topics we’re going to cover are:

  • How a computer boots using the BIOS
  • How to write low-level programs where there is no operating system.
  • How to configure the CPU so that we can use it’s extended feature.
  • How to bootstrap code written in higher level languages so we can use it to write more complex code.
  • How to create some fundamental services such as device drivers, file systems, and multi-tasking systems.

When we boot our computer, it must start up without any notion of an operating system. Somehow, it must load the operating system. Whether it be Windows, Linux, or DOS. It must load the operating system from some form of permanent storage such as ROM, Hard Disk, Floppy Disk, SD Card, or USB thumb drive.

The pre-OS environment of our computers offers very little in the way of services. At this this stage even a simple file system doesn’t exist. What we do have is a simple collection of basic routines stored in a ROM (usually EEPROM these days). These routines provide access to the system’s basic I/O devices such as screen, keyboard, and disks.

The BIOS performs some low-level tasks such as initializing hardware and testing the basic system, it boots the operating system which is stored on a connected device such as a hard drive. However, the BIOS cannot simply load the operating system from the device. It has no idea of a file system. So the BIOS must read the operating system from a predetermined location. This location is called the Boot Sector. The boot sector is usually 512 bytes long. It is easiest for the BIOS to locate the OS at at the beginning of the disk at Cylinder 0, Head 0, Sector 0. This is the location known as the boot sector. However, most disks will not contain the OS and it would be a shame to waste this storage space on those disks when it could be used for data storage. So the BIOS needs a way to detect if an OS is present in the boot sector. To do this, the BIOS looks for a “Magic Number”, placed at the end of the 512 byte boot sector. This two byte magic number is 0xAA55. So on power up after some initialization work, the BIOS searches through the connected storage devices looking for the magic number. When it finds the magic number it knows that it has found the OS and can load it into memory and jump to the location to start execution of the OS.

If we use a program such as a Hex-Editor that will let us write raw byte values to a file, we can craft a simple boot sector.

e9 fd ff 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
...
00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa

Referring to the code above, the first three bytes are a simple infinite loop in machine code. The last two bytes are the magic number. You may wonder if I got this wrong? I thought you said the magic number was AA55?

Well, I did and it is. CPU’s can store data either Lest Significant Byte (LSB) first or Most Significant Byte (MSB) First . This data storage is known as Endianess. If a CPU stores the the data LSB first, it is said to be Little-Endian. If it stores the data MSB first it is Big-Endian. The x86 CPU family is little endian and so the LSB is stored first.

Most of the time you can ignore a CPU’s endianess. However, if you have to debug compiled code in a debugger you need to know how the machine code is stored or you may completely misread it.

This program is likely the smallest program you can write. But it is a valid bootable program. We can run this program either using an emulator such as qemu or Bochs. I’ll concentrate on using Qemu here.

Creating And Booting Our Stub

First you’ll need a Hex Editor. On Linux you can install Bless Hex Editor with the command: apt install bless On Window you can install wxHexEditor. It can be downloaded from SourceForge or, you may use the hex editor of your choosing. The only requirement is that it must allow you to save the raw bytes to a file.

On Windows 10 you can use the Linux Windows Sub-System. You can find more info here: https://docs.microsoft.com/en-us/windows/wsl/install-win10 This will give you access to most of the needed Linux commands and tools used here. On Windows 10 or earlier versions such as Vista , you can install Cygwin, or MinGW to gain access to the Unix-like tools we will be using. This includes the gcc compiler and the nasm assembler.

In addition to the hex editor you will need an emulator. For Linux, MacOS, and Windows I recommend using Qemu. However, Bochs Emulator can also be used. Note that Bochs requires a configuration file to be setup before using it. Qemu will work out of the box. Qemu can be installed on Linux using the command: apt install qemu On Windows you can download the Windows installer here: https://qemu.weilnetz.de/w64/ . On Windows you’ll need to add it to your path after installation. Their is a good Youtube video on this: https://www.youtube.com/watch?v=al1cnTjeayk&t=17s. You can find the CygWin Project home page here: https://www.cygwin.com/. When installing CygWin make sure to install the gcc compiler and the nasm assembler.

Now that you have your tools installed let’s create our stub program. This program will do nothing but loop infinity. However, it will prove that we can position code and the magic number in the right places to be found and executed by the BIOS.

Open your hex editor and the three hex values e9, fd, ff into the first three locations. Next fill the editor with 00 byte values up to the 0x1ff position (511 in hex). Then replace the last two byte with the magic number remembering to place them using little-endian format. So your file should look like the binary listing above with 55, aa being the last two bytes.

Create a project folder named “salsa”. In that folder create the sub-folders salsa/scr and salsa/bin. Save your hex file as “stub.bin” in the salsa/bin folder. Next, open a terminal in salsa/bin and run the command:

$ dd if=/dev/zero of=salsa.img bs=512 count=2880

Be careful using dd as you can corrupt or even erase your hard drives using it. dd is a utility on Linux for creating disk images. It is powerful and with power comes great responsibility! Read the docs on dd and get to know it. The command I presented above takes an input file defined by the if parameter. Here we passed /dev/zero which is is a special file that provides as many null characters (ASCII NUL, 0x00) as are read from it. A typical uses is to provide a character stream for initializing data storage. Which is exactly what we use it for here.

The bs parameter tells dd the block size to produce and the of parameter is the output file name. The count parameter tell dd how many blocks to copy. If we do the math, 2880 blocks of 512 bytes gives us 1474560. This is the size we need for our 1.44 MB disk image.

Next we need to copy our binary data to the boot sector of our disk image. To do this we use the command:

$ dd if=stub.bin of=salsa.img

Now we have a 1.44 MB floppy disk image file that we can use with Qemu or Bochs. So let’s try and boot it. Use the command:

$ qemu-system-i386 salsa.img

You should see something like:

Since our stub only runs an infinite loop nothing more will happen. No matter how long you wait. So the next step will be to get our stub to do something useful. For that we’re going to need an x86 Assembler.

We’ve covered a lot in this post. So we’ll end here. Be proud that you got the BIOS to locate your code and execute it. In the next post we will learn about the processor and the x86 assembly language. We;ll need to have some basic knowledge of these topics before moving on to build something useful.

Until next time; Happy Coding!

Android Native Development with Java Getting to know Android

Mobile platforms such as Android and IPhone are becoming the de facto computing devices.  Many people have left their laptops and even tablets behind as the power of smart-phones has increased. Now applications that once ran on the desktop are running on smart-phones. Software development is changing. Every product must contend with mobile support whether it be a website or service, or a desktop application. Users now expect that their data will travel with them and they will have access to it from all the devices they own. For the software developer, that means supporting multiple devices, often supporting multiple code bases and feature sets. Android has been around now for about a decade and has grown to own most of the mobile market. IOS is a close second. There are tools out their that will help the mobile developer develop products for both platforms using a single code base. I’ve written about my favorite, Dart/Flutter in other articles. However, these tools often lack either the flexibility to accomplish certain tasks, or they simply do not support a critical function of the device. When this occurs it is often necessary to fall back on native development. Either writing code that your main application can call, or writing the complete application in the platforms native code. This means that even a cross platform developer using tools such as Flutter or Xamarin often need to maintain a considerable amount of platform specific knowledge. Simply put, knowing native development for a platform is important even if you use cross platform tool sets.

I recently began taking some Android course (thanks to a Google Scholarship, THANK YOU GOOGLE!!! ) and thought I would share my knowledge and experience here. These are not the first Android courses I have taken. But I still feel I am far from an expert in Android development. Things change so quickly and I often am playing catch up just trying to stay abreast of new technologies. Sharing my knowledge with others and teaching others helps me ground my own understanding of a topic. So here I will share what I know a little at a time.

Tools Needed

Throughout this series I will be using Android Studio. At the moment it is version 3.0.1. The latest API is 27 revision 1. Android Studio is free and can run on Linux, Mac, and Windows.  It includes an emulator for running your compiled code though, other emulators from 3rd parties are available. AndroidStudio can be downloaded from: https://developer.android.com/studio/index.html If you would like to follow along I suggest you download and install AndroidStudio. Read the “User Guide” particularly the sections starting from “Meet Android Studio” to “Build And Run Your App”. These sections will help you get things setup and walk you through setting up the emulator. I will assume a familiarity with the Java programming language throughout this series. If you have never used Java but have C++ or C# experience, you should be able to follow along just fine.

Step One:

Ok, now that you have Android Studio setup and working and have an virtual device working for testing. It’s time to start coding.

The base for most Android Apps is the Activity. An activity typically loads the User Interface (UI) and provides some interaction with the user. An activity goes through a number of stages during it’s life cycle.  For most applications the activity is the heart of the application. Most applications have one or more activities however, it is possible for an application to be written without using an activity. These are rare however and are more advanced than we will cover here. You can think of an activity as a user task.

Fig. 1.1 Android Activity Life Cycle. Courtesy, www.developer.android.com

Fig. 1.1 Android Activity Life Cycle. Courtesy, www.developer.android.com

As stated, activities have a life cycle. The android framework provides 6 core callback methods that allow you to interact with the activity life cycle. These are: onCreate(), onStart(), onResume(), onPause(), onStop(), and onRestart() as shown in figure 1.1.

Android activities get created and destroyed based on the user’s interaction and system resource requirements. If the user navigates aways from your application and opens another app that requires resources your app is using, your application will be destroyed and then rebuilt when the user navigates back to it. The same actions can occur when the user rotates the device your app is installed on. Rather than doing a complicated dance to redraw the layout in the new orientation, android simply destroys your app and then recreates it when the user returns to it. This may seem like a drastic measure to take however, android works hard to limit power consumption and extend battery life, and this approach aids that goal.

The onCreate() method is fired when the activity is first created. Android requires that if we use an activity we must override this method in our code. In the onCreate() method we perform the activity logic that should only be completed once in the activities life cycle. Here we can load the layout, set class variables, and retrieve data that may be needed for the activity.

The onCreate() method receives an parameter of savedInstanceState of type Bundle. A bundle is a collection of state variables and data saved from the last run of the application. If the application has never been ran before the bundle will be null and the android framework will create it for us on the first run.

So let’s try some code! Open Android Studio and create a new project. Call it “ActivityApp” and select “Empty Activity” for the application type. Android Studio will create some boiler plate code for you. It should look like this:

 [cc lang="java" tab_size="4" lines="20"]
package me.randallmorgan.activityapp;

import android.support.v7.app.AppCompatActivity;
import android.os.Bundle;

public class MainActivity extends AppCompatActivity {

  @Override
  protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);
  }

}
[/cc]

The first line provides our Java package name. This name comes from the concatenation of the app name and the reverse of the domain name you enter when creating the project. This is a typical java standard for package naming. It ensures that no two packages are named exactly the same. So that when these packages are distributed, there is no confusion as to which package is which or who is responsible for the package. Android follows this java standard package naming convention.

Lines 3 and 4 include the needed framework libraries AppCompatActivity and Bundle. Line 6 is the beginning of our application. We declare a public class MainActivity that extends the AppCompatActivity class found in the android framework code. The public, protected, and private prefixes to our class and methods are used to annotate the visibility of the code. We wont worry about these just yet. But we’ll come back to them in a later post.

Next we Override the onCreate() method from the framework’s AppCompatActivity class and provide our own code to initialize our activity. First we call the onCreate() method of our parent calls AppCompatActivity. This ensures that the framework does the work it needs to do to initialize our activity, such as extracting the state information from the saveInstanceState bundle. Finally, we call setContentView() passing in the resource id of our layout. In this case, the default activity_main.xml layout.

The setContentView() method takes the layout file and turns it into an actual view on the screen. We use the resource layout file to store our UI design. We’ll talk about UI design next time. For now, let’s continue with the activity life cycle.

Logging

Android Studio provides a logging system that allows us to see debug messages. These messages can be grouped into various categories such as Error, Warning, Info, Assert, Debug, Verbose, and WTF (What a Terrible Failure). The WTF level is typically reserved for the framework developers and under normal development conditions shouldn’t be used by mobile developers. To use the logging methods we first need to import the “android:util.Log” library. Next we set a tag value. This tag can be any string value but is usually set to the current class name. Then to print out information to the debug output window (Logcat) we only need to call the Log.d() method. The Log.d method takes to strings values and an option Throwable. A throwable is an exception. Exceptions are a type of error handling that allows a program to catch recoverable errors and do some processing on non recoverable errors  (like logging them).

Log.d() is used for debugging messages. There are other Log methods. One for each type of error message we listed earlier. The ones you might use are Log.d(tag, msg), Log.e(tag, msg), log.i(tag, msg), Log.v(tag, msg), Log.a(tag, msg). The Log.wtf(tag, msg) should only be used if you understand what you’re doing. This is for catastrophic errors only!!! Again, it is best left to the Android framework developers.

So let’s see how logging works. Go back to our ActivityApp project. Open the main_activiy.java file. Edit the code as follows:

[cc lang="java" tab_size="2"]
package me.randallmorgan.activityapp;

import android.os.Bundle;
import android.support.v7.app.AppCompatActivity;
import android.util.Log;

public class MainActivity extends AppCompatActivity {
  final String mTag = "MainActivity";
  
  @Override
  protected void onCreate(Bundle savedInstanceState) {
    
    super.onCreate(savedInstanceState);
    setContentView(R.layout.test_activity);
    
    Log.d(mTag, "onCreate called.");

  }
}
[/cc]

We’ve added line to declare a class wide string variable of “mTag” and set it’s value to “MainActivity” following standard convention. Next we added a line to the end of our onCreate() method to log our message passing Log.d() the variable mTag, and our message “onCreate called.”. If you run the app in your emulator and open the Logcat window you should be able to locate our message: … D/MainActivity: onCreate called. You may need to filter the output using the dropdown  to show only the messages in the Debug category.

Visualizing the Activity Life Cycle

Ok, now that we have debugging working, let’s try out some of the other functions that get called during the lifecycle of our activity. Edit your code to read as follows:

[cc lang="java" tab_size="2"]
package me.randallmorgan.activityapp;

import android.os.Bundle;
import android.support.v7.app.AppCompatActivity;
import android.util.Log;

public class MainActivity extends AppCompatActivity {
    final String mTag = "MainActivity";

    @Override
    protected void onCreate(Bundle savedInstanceState) {

        super.onCreate(savedInstanceState);
        setContentView(R.layout.test_activity);
        Log.d(mTag, "onCreate called.");

    }

    @Override
    protected void onStart() {
        super.onStart();
        Log.d(mTag, "onStart called.");
    }

    @Override
    protected void onResume() {
        super.onResume();
        Log.d(mTag, "onResume called.");
    }

    @Override
    protected void onPause() {
        super.onPause();
        Log.d(mTag, "onPause called.");
    }

    @Override
    protected void onStop() {
        super.onStop();
        Log.d(mTag, "onStop called.");
    }

    @Override
    protected void onRestart() {
        super.onRestart();
        Log.d(mTag, "onRestart called.");
    }
    
}
[/cc]

Now compile and run the app. Open the Logcat window and you should see the follow debug messages:

  • onCreate called.
  • onStart called.
  • onResume called.

Now rotate the view in the emulator. You’ll see a bunch of text scroll through the Logcat window. If you examine it you should see the following message:

  • onPause called.
  • onStop called.
  • onCreate called.
  • onStart called.
  • onResume called.

As you can see, our app was stopped by the Android framework and then recreated in the new orientation.

Ok, that’s it for today. Next time we’ll discuss layouts and the resource xml files. Then we’ll look at interacting with our layout.

Newsletter Powered By : XYZScripts.com