I remember the time when I knew the 8088/8086 inside out!
Heck! I knew the 80286 inside out!
The first program (if you wanna call it that) I wrote was in a Casio Pocket Computer; I then moved to a TRS-80 and then a Commodore 64; later the 128, and then an XT, and even on an HP 28 and on a 48 and… well, you get the picture: I’m old.
The important thing here is that the Casio Pocket Computer had to be programmed using a minimalist version of Basic and, for the most part, it did the job. But not for me. I simply wanted to do things that its Basic implementation couldn’t do.
Then, I learned about a programming language known as “assembler” or “assembly”.
This coincided with me having the Commodore 64.
One of the first programs I wrote for the C64 was an “assembler interpreter” that substituted its default Basic-based one.
It was cool and it was a nice way to familiarize myself with all the “hidden secrets” (thanks to the Transactor Book of Bits and Pieces #1) the C64 had, without having to rely on a bunch of PEEKs and POKEs.
Several years later (in 1990, to be specific) I got my first “real” computer: a 286 with 1MB of RAM and a 20MB hard disk.
One of a kind, by the standards of those days.
At that very same time I was taking a class at the university that required you to learn x86 assembly language on your own, so I did… and I loved it. One of the things I loved the most is that the more assembly language you learn, the more you understand the inner workings of the computer you’re working on.
<irrelevant to the article>
As he dismisses the class, the professor calls me over and tells me that, since I have some (apparent) experience, I should write a program that calculates the factorial of any whole/positive number… up to ten.
Just to mess with him I wrote a program that could calculate the factorial of any whole/positive number as long as the machine had enough RAM to perform the computation.
From that day on, I wrote many programs in assembly language, most of them useless, others quite interesting and complex.
</irrelevant to the article>
I remember understanding everything about the 8086 — everything!
The CPU itself, the way memory worked, the latches used to transfer information, the PIC, the PIT, the hard disk controller… everything!
Fast forward a couple of decades…
And so I thought, about 4 years ago, that I could write an 8086 emulator.
This is a project I work on when I’m, well, bored… but honestly, I never thought it’d turn out to be one of my greatest nightmares!
The most important motivations for this project were to:
- Satisfy my ego (knowing that I can code an emulator)
- Hopefully, help others (especially youngsters) better understand how computers work by providing some visual feedback representing the way all the components in an (8086) computer work
So, highly motivated (and even more naively) I started coding away…
One of the first things I did was to create a machine code decoder. That is, a program that could read byte-code machine data and translate it into assembly language.
For example, given this byte-code sequence:
8B 0E 62 00
The decoder produces the following assembly code:
MOV CX, [0062h]
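To give an idea of what decoding that sequence involves, here is a minimal sketch in Python (not the emulator's actual code, which is written in Visual Basic). It assumes a hypothetical function, decode_mov_r16_rm16, that understands a single opcode, 8Bh (MOV r16, r/m16), and follows the documented 8086 ModR/M layout: two bits for the mode, three for the register and three for the register/memory operand.

```python
# Hypothetical sketch: decodes only opcode 8Bh (MOV r16, r/m16).
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
# Effective-address bases selected by the r/m field when mod != 11b.
RM_BASE = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def decode_mov_r16_rm16(code):
    """Decode one MOV r16, r/m16 instruction; return (text, length in bytes)."""
    if code[0] != 0x8B:
        raise ValueError("this sketch only handles opcode 8Bh")
    modrm = code[1]
    mod = modrm >> 6          # bits 7-6: addressing mode
    reg = (modrm >> 3) & 7    # bits 5-3: destination register
    rm = modrm & 7            # bits 2-0: register or memory operand

    if mod == 3:              # register-to-register form
        return f"MOV {REG16[reg]}, {REG16[rm]}", 2
    if mod == 0 and rm == 6:  # special case: direct 16-bit address
        addr = code[2] | (code[3] << 8)   # stored little-endian
        return f"MOV {REG16[reg]}, [{addr:04X}h]", 4
    if mod == 0:              # memory operand, no displacement
        return f"MOV {REG16[reg]}, [{RM_BASE[rm]}]", 2
    if mod == 1:              # memory operand, signed 8-bit displacement
        disp = code[2] - 256 if code[2] > 127 else code[2]
        sign = "+" if disp >= 0 else "-"
        return f"MOV {REG16[reg]}, [{RM_BASE[rm]}{sign}{abs(disp):02X}h]", 3
    # mod == 2: memory operand, 16-bit displacement
    disp = code[2] | (code[3] << 8)
    return f"MOV {REG16[reg]}, [{RM_BASE[rm]}+{disp:04X}h]", 4


# The byte sequence from the example above.
print(decode_mov_r16_rm16(bytes.fromhex("8B0E6200"))[0])  # MOV CX, [0062h]
```

Even for this one opcode there are five different shapes the instruction can take, and the length of the instruction is only known after the second byte has been examined.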
Remember the assembler interpreter for the C64 that I mentioned earlier? Well, that was extremely easy to do because the C64 has one byte-code per instruction.
LDA, STX, BRK, etc… they all have their unique byte-code representation.
Intel’s processors, however, use an encoding mechanism that packs all the available mnemonics, and their addressing modes, into just one or two bytes, which makes decoding them incredibly complex.
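The difference can be sketched in a few lines of Python (again, an illustration and not the code of either project). The 6502 side is a plain table lookup over a handful of real opcodes; the 8086 side uses a hypothetical helper, split_8086_mov, to pull the documented d (direction) and w (width) bits out of a MOV opcode of the form 100010dw.

```python
# A few real 6502 opcodes: each byte names one instruction in one addressing mode.
OPCODES_6502 = {
    0x00: "BRK",
    0xA9: "LDA #imm",
    0xAD: "LDA abs",
    0x86: "STX zp",
    0x8E: "STX abs",
}


def decode_6502(opcode):
    """Table lookup: the opcode byte alone identifies the instruction."""
    return OPCODES_6502.get(opcode, "???")


def split_8086_mov(opcode):
    """Split an 8086 MOV opcode of the form 100010dw into its d and w bits.

    d = 1 means the register field is the destination; w = 1 means 16-bit
    operands. The operands themselves live in the next byte (ModR/M).
    """
    if opcode & 0xFC != 0x88:
        raise ValueError("not a 100010dw MOV opcode")
    return {"d": (opcode >> 1) & 1, "w": opcode & 1}


print(decode_6502(0xA9))
print(split_8086_mov(0x8B))
```

On the 6502 the lookup is the whole job. On the 8086 the opcode byte only says "this is a MOV, register is the destination, operands are 16 bits wide", and the decoder still has to read the following byte to find out which operands.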
Anyway, once I was able to properly decode (interpret byte-code into actual assembly code) I started coding the “emulator”.
The emulator is nothing more than a series of routines that mimic what the actual 8086 processor would do.
For example, this is the emulated code for the CWD mnemonic:
' CWD: copy the sign bit of AX into every bit of DX
If (mRegisters.AX And &H8000) = &H8000 Then
    mRegisters.DX = &HFFFF
Else
    mRegisters.DX = &H0
End If
Then, I implemented the necessary code to also emulate the registers, the flags, the memory and the stack.
The registers were fairly simple to emulate, although doing it well is not a trivial thing, and I think I did quite a good job.
All the registers in the 8086 are 16 bits (2 bytes) wide, but not all of them can be split.
For example, AX (the accumulator) is a 16-bit register that can be split into two additional registers: AH (for the high byte) and AL (for the low byte). But the DS (data segment) register cannot be split, so there’s no direct way of addressing its upper or lower bytes.
This poses a challenge: how do you create a unique object that can represent both splittable and non-splittable registers and that, at the same time, is fast enough to work under such conditions?
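One possible answer, sketched in Python, is a single class with a flag that says whether byte access is allowed. This is not how the emulator does it (that design is not shown here); the names Register16, NotSplittableError, high and low are all made up for the illustration.

```python
class NotSplittableError(Exception):
    """Raised when code asks for the byte halves of a register that has none."""


class Register16:
    """A 16-bit register; high/low byte access is allowed only if splittable."""

    def __init__(self, name, splittable=False):
        self.name = name
        self.splittable = splittable
        self._value = 0

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, v):
        self._value = v & 0xFFFF          # keep only 16 bits

    def _require_split(self):
        if not self.splittable:
            raise NotSplittableError(f"{self.name} has no byte halves")

    @property
    def high(self):
        self._require_split()
        return self._value >> 8

    @high.setter
    def high(self, v):
        self._require_split()
        self._value = ((v & 0xFF) << 8) | (self._value & 0x00FF)

    @property
    def low(self):
        self._require_split()
        return self._value & 0xFF

    @low.setter
    def low(self, v):
        self._require_split()
        self._value = (self._value & 0xFF00) | (v & 0xFF)


ax = Register16("AX", splittable=True)
ax.value = 0x1234
ax.low = 0xFF                 # like writing to AL
print(hex(ax.value))          # 0x12ff

ds = Register16("DS")         # segment registers cannot be split
```

Keeping one 16-bit value and deriving the halves with shifts and masks means AH, AL and AX can never get out of sync, at the cost of a little arithmetic on every byte access.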
Flags support is really easy to implement as long as you keep in mind that the 9 flags supported by the 8086 live inside a single 16-bit register.
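As a rough illustration in Python (not the emulator's code), the flags can be kept in one integer and read or written with bit masks. The bit positions are the documented ones for the 8086 FLAGS register; the Flags class and its method names are invented for this sketch.

```python
# Documented bit positions inside the 8086 FLAGS register.
FLAG_BITS = {
    "CF": 0,   # carry
    "PF": 2,   # parity
    "AF": 4,   # auxiliary carry
    "ZF": 6,   # zero
    "SF": 7,   # sign
    "TF": 8,   # trap
    "IF": 9,   # interrupt enable
    "DF": 10,  # direction
    "OF": 11,  # overflow
}


class Flags:
    """All flags stored in a single 16-bit word."""

    def __init__(self):
        self.word = 0

    def get(self, name):
        return (self.word >> FLAG_BITS[name]) & 1

    def set(self, name, on):
        mask = 1 << FLAG_BITS[name]
        if on:
            self.word |= mask
        else:
            self.word &= ~mask & 0xFFFF


flags = Flags()
flags.set("CF", True)
flags.set("ZF", True)
print(hex(flags.word))   # 0x41
```

Storing the flags as one word also makes instructions such as PUSHF and POPF trivial, since they move the whole word at once.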
I think I’ve done a pretty good job at emulating the memory.
It is done in such a way that the emulator itself can read from it and write to it easily (and quite fast).
Nearly every single operation in the emulation process requires a memory access so this implementation is one of the most important aspects of the whole emulator.
The implementation also uses some shortcuts (helper functions) to simplify the code so that it is easy to read and understand. Remember that one of the purposes of this emulator is to serve as an educational tool, not to be the fastest.
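To make the idea of helper functions concrete, here is a small Python sketch of a memory object. It is an assumption about what such helpers could look like, not the emulator's implementation: the Memory class and its method names are hypothetical. What is not an assumption is the 8086 addressing rule it relies on, where the physical address is the segment shifted left by four bits plus the offset, truncated to 20 bits.

```python
class Memory:
    """1 MB of byte-addressable memory reached through segment:offset pairs."""

    SIZE = 1 << 20   # the 8086 has 20 address lines

    def __init__(self):
        self.data = bytearray(self.SIZE)

    @staticmethod
    def physical(segment, offset):
        """Translate segment:offset into a 20-bit physical address."""
        return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def read_byte(self, segment, offset):
        return self.data[self.physical(segment, offset)]

    def write_byte(self, segment, offset, value):
        self.data[self.physical(segment, offset)] = value & 0xFF

    def read_word(self, segment, offset):
        # Words are stored little-endian: low byte first.
        lo = self.read_byte(segment, offset)
        hi = self.read_byte(segment, offset + 1)
        return lo | (hi << 8)

    def write_word(self, segment, offset, value):
        self.write_byte(segment, offset, value & 0xFF)
        self.write_byte(segment, offset + 1, (value >> 8) & 0xFF)


mem = Memory()
mem.write_word(0x1000, 0x0062, 0xBEEF)
print(hex(mem.read_word(0x1000, 0x0062)))   # 0xbeef
print(hex(Memory.physical(0x1234, 0x0010))) # 0x12350
```

With helpers like these, the code that emulates an instruction can simply ask for "the word at DS:0062h" and never has to deal with address arithmetic or byte order itself.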
The stack was a little trickier due to the way it affects the SS (stack segment) and SP (stack pointer) registers.
Internally, the emulator provides two methods to either push or pop bytes from the stack with ease, and lets the memory manager handle the way the SS and SP registers change their values based on the type of access performed on the stack.
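A simplified Python sketch of that push/pop pair could look like the following. It is an assumption, not the emulator's code: the Machine class is hypothetical, it keeps memory and registers in one object instead of delegating to a separate memory manager, and it moves 16-bit words, which is the unit the 8086 PUSH and POP instructions work with.

```python
class Machine:
    """Just enough state to show how push and pop move SP."""

    def __init__(self):
        self.memory = bytearray(1 << 20)   # 1 MB address space
        self.ss = 0                        # stack segment
        self.sp = 0                        # stack pointer

    def _addr(self, offset):
        # Physical address of SS:offset, truncated to 20 bits.
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, value):
        # The stack grows downward: SP is decremented first, then written to.
        self.sp = (self.sp - 2) & 0xFFFF
        self.memory[self._addr(self.sp)] = value & 0xFF
        self.memory[self._addr(self.sp + 1)] = (value >> 8) & 0xFF

    def pop(self):
        # Read the word at SS:SP, then move SP back up.
        lo = self.memory[self._addr(self.sp)]
        hi = self.memory[self._addr(self.sp + 1)]
        self.sp = (self.sp + 2) & 0xFFFF
        return lo | (hi << 8)


m = Machine()
m.ss, m.sp = 0x2000, 0x0100
m.push(0x1234)
m.push(0xABCD)
print(hex(m.pop()))   # 0xabcd
print(hex(m.pop()))   # 0x1234
print(hex(m.sp))      # 0x100
```

The last value pushed is the first one popped, and after a balanced sequence of pushes and pops SP ends up exactly where it started.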