
Assembler “as-8080”

The emuStudio assembler for the Intel 8080 CPU is very similar to the Intel assembler, with a few small differences. Features include:

  • macro support (unlimited nesting)
  • include files support
  • data definition
  • relative addressing using labels
  • literals and expressions in various radixes (bin, dec, hex, oct)
  • compiler output is in Intel HEX format

The features are very similar to those of the “as-z80” assembler.

Running from command line

The assembler is provided as part of emuStudio and is usually run from the GUI. It can also be run from the command line, as follows:

  • on Linux:
    > bin/as-8080 [--output output_file.hex] [source_file.asm]
  • on Windows:
    > bin\as-8080.bat [--output output_file.hex] [source_file.asm]

The command line options are:

	--output, -o	file: name of the output file
	--version, -v	: print version
	--help, -h	: this help

Lexical symbols

The assembler does not differentiate between upper and lower case (it is case-insensitive). The token/symbol types are as follows:

Type         Description
Keywords     instruction names; preprocessor directives (org, equ, set, macro, endm, include, if, endif); data definitions (db, dw, ds); CPU registers
Identifiers  ([a-zA-Z_\?@])[a-zA-Z_\?@0-9]* except keywords
Constants    strings or integers
Operators    +, -, *, /, =, mod, and, or, not, xor, shl, shr
Comments     semicolon (;) with text after it until the end of the line


Numeric constants can only be integers, written in one of several radixes. The accepted formats are given as regular expressions:

  • binary numbers: [0-1]+[bB]
  • decimal numbers: [0-9]+[dD]?
  • octal numbers: [0-7]+[oOqQ]
  • hexadecimal numbers: [0-9][0-9a-fA-F]*[hH]
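
As an illustration of the formats above, the following sketch defines the same value (decimal 42) in each radix. The label names are made up for this example.

```asm
BINV: DB 00101010b  ; binary
DECV: DB 42         ; decimal (the d suffix is optional)
OCTV: DB 52q        ; octal (o or q suffix)
HEXV: DB 2Ah        ; hexadecimal
LEAD: DB 0FFh       ; a hex number must start with a digit, hence the leading 0
```

Without the leading 0, FFh would match the identifier pattern instead of the hexadecimal one.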

Characters or strings must be enclosed in single quotes, e.g.: MVI E, '*'


Identifiers must match the following regex: ([a-zA-Z_\?@])[a-zA-Z_\?@0-9]*. This means that an identifier has to start with a letter (a-z or A-Z), an underscore (_), a question mark (?), or the at-sign (@). It can then be followed by any of these characters or by digits.

However, an identifier must not be equal to any keyword.

Instructions syntax

A program is basically a sequence of instructions, separated by new lines. An instruction has optional and mandatory parts:

Syntax part  Required    Notes
LABEL        Optional    Identifier of the memory position, followed by a colon (:). It can be used as a forward or backward reference in instructions which expect a memory address (or 16-bit number).
CODE         Mandatory   Instruction name.
OPERANDS     It depends  If applicable, comma-separated (,) operands of the instruction.
COMMENT      Optional    Semicolon (;) followed by any text until the end of the line.

Fields CODE and OPERANDS must be separated by at least one space. For example:

HERE:   MVI C, 0  ; Put 0 into C register
        DB 3Ah    ; Data constant of size 1 byte
LOOP:   JMP LOOP  ; Infinite loop

Labels are optional. Names of instructions, pseudo-instructions, and registers are reserved by the assembler and cannot be used as labels. A label also cannot be defined more than once.

Operands must be separated with a comma (,). There are several operand types, which represent so-called “address modes”. The allowed address modes depend on the instruction. The possibilities are:

  • Implicit addressing: the instruction has no operands; they are implied by the instruction itself.
  • Register addressing: operands are registers. The 8-bit general-purpose register names are: A, B, C, D, E, H, L. The stack pointer is named SP, and the program status word (used by push / pop instructions) is named PSW. The same register names denote register pairs in 16-bit instructions. For example, DCX D decrements the pair DE.
  • Register indirect addressing: the memory value at the address held in the HL pair is accessed through a special register called M, for example: MOV A, M.
  • Immediate addressing: the operand is an 8-bit constant. It can also be a single character enclosed in single quotes.
  • Direct addressing: the operand is either an 8-bit or a 16-bit constant, which is understood as a memory location (address). For example: SHLD 1234h.
  • Modified page zero: the operand is a 3-bit value (0-7). It represents an “index”, which is multiplied by the constant 8 to give the final memory address. Used in the RST instruction.
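
The address modes listed above can be sketched with one instruction each. This is only an illustrative selection of standard 8080 instructions, not an exhaustive list.

```asm
        NOP             ; implicit: no operands
        MOV A, B        ; register: copy B into A
        DCX D           ; register pair: decrement DE
        PUSH PSW        ; register pair: push A and the flags
        MOV A, M        ; register indirect: load A from the address in HL
        MVI E, '*'      ; immediate: 8-bit constant given as a character
        SHLD 1234h      ; direct: store HL at address 1234h
        RST 2           ; modified page zero: call address 2 * 8 = 0010h
```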

Immediate data or addresses can be defined in various ways:

  • Integer constant
  • Integer constant as a result of evaluation of some expression (e.g. 2 SHL 4, or 2 + 2)
  • Current address - denoted by special variable $. For example, instruction JMP $+6 denotes a jump by 6-bytes further from the current address.
  • Character constants, enclosed in single-quotes (e.g. MVI A, '*')
  • Labels. For example: JMP THERE will jump to the label THERE.
  • Variables, defined with the SET pseudo-instruction.
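
A minimal sketch of these forms, one per line. The names COUNT and THERE are hypothetical and chosen only for the example.

```asm
COUNT   SET 5           ; variable used below
        MVI A, 10       ; integer constant
        MVI A, 2 SHL 4  ; expression, evaluates to 32
        MVI A, '*'      ; character constant
        MVI A, COUNT    ; variable
THERE:  JMP $+6         ; current address plus 6 bytes
        JMP THERE       ; label
```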


An expression is a combination of data constants and operators. Expressions are evaluated at compile time. Expressions must not be defined circularly, i.e. two expressions must not depend on each other. Expressions can be used anywhere a constant is expected.

The following operators are available:

Operator  Notes
+         Addition. Example: DB 2 + 2 evaluates to DB 4.
-         Subtraction. Example: DW $ - 2 evaluates to the current compilation address minus 2.
*         Multiplication.
/         Integer division.
=         Comparison for equality. Returns 1 if the operands are equal, 0 otherwise. Example: DB 2 = 2 evaluates to DB 1.
mod       Remainder after integer division. Example: DB 4 mod 3 evaluates to DB 1.
and       Logical and.
or        Logical or.
xor       Logical xor.
not       Logical not.
shl       Shift left by the given number of bits. Example: DB 1 SHL 3 evaluates to DB 8.
shr       Shift right by the given number of bits.

Operator priorities are as follows:

Priority  Operator               Type
1         ( )                    Unary
2         *, /, mod, shl, shr    Binary
3         +, -                   Unary and binary
4         =                      Binary
5         not                    Unary
6         and                    Binary
7         or, xor                Binary
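
Reading the table with priority 1 as the highest, a few expressions would group as follows. The results in the comments are derived from the table, not from assembler output.

```asm
        DB 2 + 3 * 4      ; * before +: 2 + 12 = 14
        DB (2 + 3) * 4    ; parentheses first: 5 * 4 = 20
        DB 1 + 1 = 2      ; + before =: (1 + 1) = 2 gives 1
        DB 1 SHL 2 + 1    ; shl before +: (1 SHL 2) + 1 = 5
```

When in doubt, parentheses make the intended grouping explicit.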

All operators treat their arguments as 16-bit numbers, and their results are always 16-bit numbers. If an 8-bit number is expected, the result is automatically “cut” using the operation result AND 0FFh. This is often useful, but it can also hide bugs, so the programmer must ensure correctness.
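
A short sketch of the cut, assuming it is applied exactly as described to 8-bit instruction operands. The values in the comments follow from the AND 0FFh rule.

```asm
        MVI A, 1234h      ; 8-bit operand: 1234h AND 0FFh = 34h
        MVI A, 0FFh + 2   ; 0101h is cut to 01h
        LXI H, 1234h      ; 16-bit operand: the value is kept whole
```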

Defining data

Data can be defined using special pseudo-instructions. These accept constants. Negative integers are encoded in two’s complement. The following table describes all data definition pseudo-instructions:

Pseudo-instruction  Notes
DB [expression]     Define byte. The [expression] must be of size 1 byte. A string enclosed in single quotes can also be defined with this pseudo-instruction. For example: DB 'Hello, world!' is equal to DB 'H', DB 'e', etc. on separate lines.
DW [expression]     Define word. The [expression] must fit into 2 bytes. Data are stored in little-endian order.
DS [expression]     Define storage. The [expression] is the number of bytes which should be “reserved”. The reserved space will not be modified in memory. It is similar to “skipping” the given number of bytes.


HERE:  DB 0A3H          ; A3
W0RD1: DB 5*2, 2FH-0AH  ; 0A25
W0RD2: DB 5ABCH SHR 8   ; 5A
STR:   DB 'STRING 1'    ; 535452494E472031
MINUS: DB -03H          ; FD

ADD1: dw COMP          ; 1C3B  (assume COMP is 3B1CH)
ADD2: dw FILL          ; B43E (assume FILL is 3EB4H)
ADD3: dw 3C01H, 3CAEH  ; 013CAE3C
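
DS does not appear in the examples above, so here is a minimal sketch. The labels BUF and AFTER are hypothetical.

```asm
BUF:    DS 16      ; reserve 16 bytes; their content is left untouched
AFTER:  DB 0       ; placed 16 bytes after BUF
```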

Including other source files

It is both useful and good practice to write modular programs. According to the DRY principle, the repetitive parts of the program should be refactored out into functions or modules. Functionally similar groups of these functions or modules can be put into a library, reusable in other programs.

The pseudo-instruction include exists for the purpose of including already written source code into the current program. The pseudo-instruction is defined as follows:

INCLUDE '[filename]'

where [filename] is a relative or absolute path to the file which will be included, enclosed in single quotes. The included file can include other files, but circular includes are not allowed (the compiler will complain).

After the inclusion, the current compilation address (denoted by the $ variable) is advanced by the binary size of the included file.

The current program and the included file share one namespace. This means that a label or variable name must not be defined in both the current program and the included file. The included file “sees” everything in the current program as if it were part of it.


Let a.asm contain:

mvi b, 80h

Let b.asm contain:

include 'a.asm'

Then compiling b.asm will result in:

06 80     ; mvi b, 80h

Origin address

Syntax: ORG [expression]

Sets the value of the $ variable. From that point on, the following instructions will be placed at the address given by the [expression]. Effectively, it is the same as using the DS pseudo-instruction, but instead of giving the number of skipped bytes, we give a concrete memory location (address).

The following two code snippets are equivalent:

Address  Block 1  Block 2   Opcode
2C00     MOV A,C  MOV A,C   79
2C04     DS 12    ORG $+12


Syntax: [identifier] EQU [expression]

Define a constant. The [identifier] is a mandatory name of the constant.

[expression] is the 16-bit expression.

The pseudo-instruction will define a constant - assign a name to the given expression. The name of the constant then can be used anywhere where the constant is expected and the compiler will replace it with the expression.

It is not possible to redefine a constant.
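
A minimal sketch of EQU in use. The names PORT, SIZE, and BUF and the port number are hypothetical.

```asm
PORT    EQU 10h        ; name the constant 10h
SIZE    EQU 4 * 8      ; an expression is allowed, here 32
        OUT PORT       ; same as OUT 10h
BUF:    DS SIZE        ; reserves 32 bytes
```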


Syntax: [identifier] SET [expression]

Define or re-define a variable. The [identifier] is a mandatory name of the variable.

[expression] is the 16-bit expression.

The pseudo-instruction will define a variable - assign a name to the given expression. Then, the name of the variable can be used anywhere where the constant is expected.

It is possible to redefine a variable. This effectively assigns a new expression to the same name and forgets the old one. The reassignment is aware of locality, i.e. code before the redefinition uses the old value, and code after it uses the new value.
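
The locality rule can be sketched as follows. The variable name STEP is hypothetical.

```asm
STEP    SET 1
        MVI A, STEP    ; uses the old value: 1
STEP    SET 2
        MVI B, STEP    ; uses the new value: 2
```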

Conditional assembly


if [expression]
    instructions
endif

At first, the compiler evaluates the [expression]. If the result is 0, instructions between if and endif will be ignored. Otherwise they will be included in the source code.
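
A small sketch of conditional assembly driven by a constant. The name DEBUG and the port number are hypothetical.

```asm
DEBUG   EQU 1          ; set to 0 to leave the first block out
        if DEBUG
        MVI A, '!'     ; assembled only when DEBUG is non-zero
        OUT 1
        endif
        if DEBUG = 0
        NOP            ; assembled only when DEBUG is 0
        endif
```

The second block uses the = operator, which returns 1 when both operands are equal and 0 otherwise.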

Defining and using macros


[identifier] macro [operands]
    instructions
endm

The [identifier] is a mandatory name of the macro.

The [operands] part is a list of identifiers, separated by commas (,). Inside the macro, operands act as constants. If the macro does not use any operands, this part can be omitted.

The namespace of the operand identifiers is macro-local, i.e. the operand names will not be visible outside the macro. Also, the operand names can hide variables, labels, or constants defined in the outer scope.

The macros can be understood as “templates” which will be expanded in the place where they are “called”. The call syntax is as follows:

[macro name] [arguments]

where [macro name] is the macro name as defined above. Then, [arguments] are comma-separated expressions, in the order as the original operands are defined. The number of arguments must be the same as the number of macro operands.

A macro can be defined anywhere in the program, even in an included file. It also does not matter where the macro is called: above or below its definition.


SHV   macro
LOOP: RRC      ; Right rotate with carry
      ANI 7FH  ; Clear MSB of accumulator
      DCR D    ; Decrement rotation counter - register D
      JNZ LOOP ; Jump to next rotation
      endm

The macro SHV can be used as follows:

MVI D,3  ; 3 rotations
SHV      ; Expands to the macro body

Or another definition:

SHV   macro AMT
      MVI D,AMT   ; Number of rotations
LOOP: RRC
      ANI 7FH
      DCR D
      JNZ LOOP
      endm

And usage:

SHV 3    ; 3 rotations

Which has the same effect as the previous example.