HV2E32, the Hack CPU V2 emulator
This is the 'advanced' emulator derived from the original HE32 version. It relies on a somewhat upgraded instruction set (which I dubbed 'hack version 2' ...). Built from Rust sources, it outperforms even the original VM-Emulator, thus making jack games actually playable!
Register set
The new instruction set architecture (ISA) features 16 integer registers (R0..R15) and 16 float registers (F0..F15). Though basically orthogonal (i.e. all registers may be used in identical form w/ all instructions), my compiler suite uses pre-defined register names (which will be all too familiar to hack students!).
R0 - SP Stack pointer
R1 - LCL Local variables base pointer
R2 - ARG Function arguments base pointer
R3 - THIS Object base pointer
R4 - THAT Array base pointer
R5 - TEMP0 ->RAM[5] expression evaluation
R6 - GLOBAL Global variables base pointer
R7 - ZERO An 'empty' register (to be used w/ absolute addresses)
R8 - STATIC Static variables base pointer
R9 - ARG1/TEMP7 Argument #1/Temporary value #7
R10 - ARG2/TEMP6 Argument #2/Temporary value #6
R11 - ARG3/TEMP5 Argument #3/Temporary value #5
R12 - TEMP4/ARG5 Temporary value #4/Argument #5
R13 - TEMP3/ARG6 Temporary value #3/Argument #6
R14 - TEMP2/ARG7 Temporary value #2/Argument #7
R15 - TEMP1/ARG8 Temporary value #1/Argument #8
(Compiler shall decide ARG#/TEMP#/FTEMP# usage - well, currently not used ;)
F0 - First floating point register
...
F5 - FTEMP0 Fifth floating point register
...
F7 - FZERO A zero value floating point register
...
F9 - FTEMP7 Temporary value #7
F10 - FTEMP6 Temporary value #6
F11 - FTEMP5 Temporary value #5
F12 - FTEMP4 Temporary value #4
F13 - FTEMP3 Temporary value #3
F14 - FTEMP2 Temporary value #2
F15 - FTEMP1 Temporary value #1
Also, there are several more flags now:
R-QX ---- ---- ---T ---- -MID ---- --NZ
where:
R = Run flag (cpu/emulator active, not halted)
Q = Hardware interrupt (irQ) processing active
X = Exception handling active
T = Timer interrupt enabled
M = Memory access exception
I = Invalid instruction exception
D = Division by zero exception
N = Negative (result of last operation is negative)
Z = Zero (result of last operation is zero)
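The flag layout above can be turned into bit masks. The following is a minimal sketch, assuming that the leftmost character of the layout string is bit 31 and the rightmost is bit 0; the constant and function names are made up for illustration and are not taken from the emulator sources.

```rust
// Bit masks read off the layout "R-QX ---- ---- ---T ---- -MID ---- --NZ".
// Assumption: leftmost character = bit 31, rightmost = bit 0.
pub const FLAG_R: u32 = 1 << 31; // Run
pub const FLAG_Q: u32 = 1 << 29; // Hardware interrupt processing active
pub const FLAG_X: u32 = 1 << 28; // Exception handling active
pub const FLAG_T: u32 = 1 << 16; // Timer interrupt enabled
pub const FLAG_M: u32 = 1 << 10; // Memory access exception
pub const FLAG_I: u32 = 1 << 9; // Invalid instruction exception
pub const FLAG_D: u32 = 1 << 8; // Division by zero exception
pub const FLAG_N: u32 = 1 << 1; // Negative
pub const FLAG_Z: u32 = 1 << 0; // Zero

/// Recompute N and Z from a 32-bit result, leaving all other flags untouched.
pub fn update_nz(flags: u32, result: i32) -> u32 {
    let mut f = flags & !(FLAG_N | FLAG_Z);
    if result < 0 {
        f |= FLAG_N;
    }
    if result == 0 {
        f |= FLAG_Z;
    }
    f
}

/// Render a flag word in the same notation as the layout string:
/// a set flag shows its letter, everything else shows '-'.
pub fn render_flags(flags: u32) -> String {
    const LAYOUT: &str = "R-QX ---- ---- ---T ---- -MID ---- --NZ";
    let mut bit = 32u32;
    LAYOUT
        .chars()
        .map(|c| {
            if c == ' ' {
                return ' ';
            }
            bit -= 1; // 32 non-space characters, so this ends at bit 0
            if c != '-' && (flags & (1 << bit)) != 0 {
                c
            } else {
                '-'
            }
        })
        .collect()
}
```

For instance, `render_flags(FLAG_R | FLAG_Z)` would show a running CPU whose last result was zero.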
Instruction Set Architecture (ISA)
The A instruction
The original 'A' instruction mutated to:
0ddddsss sSIXpoCC CCCCCCCC CCCCCCCC
where
dddd = destination register (direct or indirect), <dr>
ssss = source register (direct or indirect), <sr>
S=0 Load / =1 Store (store flag)
I = indirect <sr> interpretation
X = post-increment on stores and pre-decrement on loads
p = use source register ssss
o = constant used as offset for 1st operand
<const.> = 18-bit constant (C..C)
- =not used
As this is an emulator instruction set, the distinction between Load/Store instructions & ALU instructions need not be made (which is not true for actual physical H/W!). Thus, a combined load AND store in one instruction is possible, thereby reducing the sheer number of instructions the emulator has to process. This should provide a significant speed boost. As with RISC-V, additional pseudo-ops have been introduced for easier reading (& writing!).
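To make the 'A' bit layout concrete, here is a minimal decoding sketch. It assumes the pattern is written most significant bit first (so the leading 0 is bit 31 and the constant occupies bits 17..0). The struct and field names are hypothetical, and the constant is returned raw, since its signedness is not stated here.

```rust
/// Decoded fields of an 'A' instruction (names are illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AInstr {
    pub dr: u8,                // dddd, bits 30..27
    pub sr: u8,                // ssss, bits 26..23
    pub store: bool,           // S, bit 22: false = load, true = store
    pub indirect: bool,        // I, bit 21
    pub inc_dec: bool,         // X, bit 20
    pub use_sr: bool,          // p, bit 19
    pub const_is_offset: bool, // o, bit 18
    pub constant: u32,         // C..C, bits 17..0 (raw, 18 bits)
}

/// Decode a 32-bit word as an 'A' instruction.
/// Returns None if bit 31 is set, i.e. the word is not an 'A' instruction.
pub fn decode_a(word: u32) -> Option<AInstr> {
    if (word >> 31) != 0 {
        return None;
    }
    let bit = |n: u32| ((word >> n) & 1) == 1;
    Some(AInstr {
        dr: ((word >> 27) & 0xF) as u8,
        sr: ((word >> 23) & 0xF) as u8,
        store: bit(22),
        indirect: bit(21),
        inc_dec: bit(20),
        use_sr: bit(19),
        const_is_offset: bit(18),
        constant: word & 0x3_FFFF,
    })
}
```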
The C instruction
The original 'C' instruction has been modified to:
1ddddssssSI-----rrrrcccccc---jjj
where
dddd = destination register (direct or indirect), <dr>
ssss = source register (direct or indirect), <sr>
S = <dr> indirect (store)
I = indirect <sr> interpretation
<rr> = 0..15 (rrrr, operand2 register)
c..c = ALU comp., const load or transfer
j..j = jump type
-=not used
(Yes, there are lots of empty entries for now ...)
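The 'C' layout can be unpacked the same way. This sketch again assumes the pattern is written most significant bit first; the generic `field` helper and all names are hypothetical, and the unused bits (20..16 and 5..3) are simply ignored.

```rust
/// Extract bits hi..=lo (inclusive) from a 32-bit word.
fn field(word: u32, hi: u32, lo: u32) -> u32 {
    (word >> lo) & ((1u32 << (hi - lo + 1)) - 1)
}

/// Decoded fields of a 'C' instruction (names are illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CInstr {
    pub dr: u8,            // dddd, bits 30..27
    pub sr: u8,            // ssss, bits 26..23
    pub dr_indirect: bool, // S, bit 22: <dr> indirect (store)
    pub sr_indirect: bool, // I, bit 21: <sr> indirect
    pub rr: u8,            // rrrr, bits 15..12: operand2 register
    pub comp: u8,          // cccccc, bits 11..6: ALU comp., const load or transfer
    pub jump: u8,          // jjj, bits 2..0: jump type
}

/// Decode a 32-bit word as a 'C' instruction.
/// Returns None if bit 31 is clear, i.e. the word is an 'A' instruction.
pub fn decode_c(word: u32) -> Option<CInstr> {
    if field(word, 31, 31) != 1 {
        return None;
    }
    Some(CInstr {
        dr: field(word, 30, 27) as u8,
        sr: field(word, 26, 23) as u8,
        dr_indirect: field(word, 22, 22) == 1,
        sr_indirect: field(word, 21, 21) == 1,
        rr: field(word, 15, 12) as u8,
        comp: field(word, 11, 6) as u8,
        jump: field(word, 2, 0) as u8,
    })
}
```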
Here is some example code (you get the idea!):
...
// L919 - function Memory.peek 0
Memory.peek: // Memory.peek entry point (Memory, line 00919)
// using 0 local variables
PUSH [ZERO + #STATIC_Memory.0] // Class Memory: pt ram
PUSH [ARG + #0] // Memory.peek( int address ..)
POP TEMP1 // Add [SP-1]+[SP-2] & remove TOS
S [ZERO], TEMP1
ADDTOS [ZERO]
POP THAT // THAT = TOS
PUSH [THAT + #0]
RETURN #1 // return 1 value(s)
...
Also, function calling is really simple now:
...
// L1594 - push constant 3
PUSH #3
CALL #Sys.error, #1
...
CALL & RETURN work in conjunction and provide for the calling contract all in one instruction! Should give it a speed boost as well.
The third improvement - compared to the original hack instruction set - is the introduction of ready-made MUL & DIV instructions (instead of calling the Math.mul & Math.div library functions).
The ICALL instruction is used w/ software interrupts ('syscall'). It's used w/ the inline assembly feature for jack like this:
function int test(int a, int b) {
asm { // Semicolons are used as assembly terminal symbols ...
l arg2, [ARG] // Load two arguments: a
l arg3, [ARG + #1] // & b
icall #15, #1 // Call S/W interrupt #15 subfunction #1 (passed as arg1)
// with arg2/arg3 registers
return #1 // The result will be returned via stack
}
}
Also some cpu/emulator control features like a HALT & BKPT (breakpoint) instruction have been added, as well as exception handling (int #0) & hardware interrupts (int #1).
Local variables - by default - will NOT be initialized on function entry any longer, only the space reservation will be made. Saves on memory & increases speed ...
Internally, the VM 'return' instruction has been changed to 'return n' (together with other stack handling optimizations), which maps nicely to the same updated CPU 'return #n' instruction - this omits superfluous push & pops for void function returns (& thus improves speed as well).
Code generation has changed from absolute to relative addressing for all sorts of jumps & calls. This permits relocatable code.
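Why relative addressing makes code relocatable can be seen in a few lines. This is a generic sketch, not the emulator's code: it assumes the offset is counted from the address of the jump instruction itself (the actual reference point may differ), and both function names are hypothetical.

```rust
/// Offset an assembler would emit for a jump located at `pc` that targets `target`.
/// Assumption: the offset is relative to the address of the jump itself.
pub fn rel_offset(pc: u32, target: u32) -> i32 {
    target.wrapping_sub(pc) as i32
}

/// Resolve a relative jump at run time.
pub fn branch_target(pc: u32, rel: i32) -> u32 {
    pc.wrapping_add(rel as u32)
}

/// Check that a jump still hits the right spot after the whole program
/// has been moved by `base` words, without touching the encoded offset.
pub fn survives_relocation(pc: u32, target: u32, base: u32) -> bool {
    let rel = rel_offset(pc, target);
    branch_target(pc + base, rel) == target + base
}
```

Because jump and target move by the same amount, the encoded offset stays valid and a loader does not need to patch it.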
See spec. for more information or the hack v2 assembly language reference.
'Hardware' Changes, Software extensions
The memory mapped I/O now ranges from RAM[16384] to RAM[24831].
- A random number generator may be queried @ RAM[24578], yielding integers in the range 0..=100 (the upper bound may be redefined via RANDOM_RANGE_MAX in global_defs.rs).
- Up to 4 USB gamepads are now supported @ RAM[24580..24583] respectively. Player #1 buttons are mapped to the keyboard as well!
Mappings are (gamepad -> keyboard #code):
// Joystick / axis cross
Joystick axis Y down -> [Cursor down] (#133)
Joystick axis Y up -> [Cursor up] (#131)
Joystick axis X right -> [Cursor right] (#132)
Joystick axis X left -> [Cursor left] (#130)
// Color buttons
X/Blue -> [X] (#88)
Y/Green -> [Y] (#89)
A/Red -> [A] (#65)
B/Yellow -> [B] (#66)
// Center buttons
Select -> [Enter] (#128) (Somewhat irregular ...)
Start -> [Space] (#32)
// Frontal 'levers'
Right lever -> [End] (#135)
Left lever -> [Home] (#134)
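The mapping listed above can be expressed as a lookup. A minimal sketch follows; the enum and function names are hypothetical, only the key codes are taken from the list.

```rust
/// Gamepad inputs from the mapping list (variant names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PadInput {
    AxisYDown,
    AxisYUp,
    AxisXRight,
    AxisXLeft,
    X,
    Y,
    A,
    B,
    Select,
    Start,
    RightLever,
    LeftLever,
}

/// Keyboard code that a player #1 gamepad input is mapped to.
pub fn key_code(input: PadInput) -> u16 {
    use PadInput::*;
    match input {
        AxisYDown => 133,  // [Cursor down]
        AxisYUp => 131,    // [Cursor up]
        AxisXRight => 132, // [Cursor right]
        AxisXLeft => 130,  // [Cursor left]
        X => 88,           // [X]
        Y => 89,           // [Y]
        A => 65,           // [A]
        B => 66,           // [B]
        Select => 128,     // [Enter]
        Start => 32,       // [Space]
        RightLever => 135, // [End]
        LeftLever => 134,  // [Home]
    }
}
```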
- The timer interrupt may run at the interval specified @ RAM[24584]. The demo (batteries included!) now shows 3 threads running 'simultaneously' - thread scheduling is now part of the 'OS' & driven by the timer interrupt (the - adjustable - tick runs @ 50 ms)! To support multi-threading, an atomic semaphore instruction has been introduced.
- A block device has been added @ RAM[24586..24594]. In detail:
| RAMOFFS. | DESCRIPTION  |
|----------|--------------|
| 24586    | CMD          |
| 24588    | BLOCKLEN     |
| 24590    | OFFSET       |
| 24592    | TARGETBUFFER |
| 24594    | STATUS       |
The demo uses the block device & file system libraries to save & restore the high score. The file system specification is provided here. To create an empty harddisk, simply delete 'hd1.blkdev', a new persistent storage device will be created & formatted automatically. The usual file handle operations like create(), open(), read(), write(), seek() & close() are supported.
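- An RTC can now be found @ RAM[24596..24603]. The layout is:
| RAMOFFS. | DESCRIPTION                                      |
|----------|--------------------------------------------------|
| 24596    | TIMESTAMP (see file system spec. for details!)   |
| 24597    | YEAR                                             |
| 24598    | MONTH                                            |
| 24599    | DAY                                              |
| 24600    | HOUR                                             |
| 24601    | MINUTE                                           |
| 24602    | SECOND                                           |
| 24603    | MILLISECS.                                       |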
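Reading the RTC registers amounts to copying eight consecutive RAM words. The sketch below assumes RAM is visible as a slice of 32-bit words indexed by RAM address; the `RtcTime` struct and `read_rtc` function are hypothetical and not part of the emulator or OS API.

```rust
/// First RTC register (TIMESTAMP), as listed in the layout above.
pub const RTC_BASE: usize = 24596;

/// Snapshot of the eight RTC registers (struct is illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RtcTime {
    pub timestamp: i32, // RAM[24596], format per file system spec.
    pub year: i32,      // RAM[24597]
    pub month: i32,     // RAM[24598]
    pub day: i32,       // RAM[24599]
    pub hour: i32,      // RAM[24600]
    pub minute: i32,    // RAM[24601]
    pub second: i32,    // RAM[24602]
    pub millis: i32,    // RAM[24603]
}

/// Read the RTC block from a RAM image.
/// Returns None if the slice is too short to contain RAM[24596..=24603].
pub fn read_rtc(ram: &[i32]) -> Option<RtcTime> {
    let r = ram.get(RTC_BASE..RTC_BASE + 8)?;
    Some(RtcTime {
        timestamp: r[0],
        year: r[1],
        month: r[2],
        day: r[3],
        hour: r[4],
        minute: r[5],
        second: r[6],
        millis: r[7],
    })
}
```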
- The initial OS boot delivers the code length (i.e. the next free FLASH/ROM address) @ RAM[24604]. In conjunction with the SROM instruction, ROM above the OS section may then be filled by the user. After the initial emulator load, this register may be modified by the user (used by the linkloader to permit program 'stacking' etc.). RAM[24605] contains the previous load address, i.e. zero for the OS, the shell base after booting the shell, the app base after starting an application. These extensions permit loading additional (user) applications. Access to the HV2OS ('hack v2 operating system') functions is permitted, the API is described in this guide. RAM[24606] now is a pointer to a table of loaded apps w/ addresses, maintained by the OS relocator. Debug symbol support - incl. source line display(!) - is available now. RAM[24607] contains a flag indicating changes in RAM[24606].
- A sound device has been added. Supply the name of a wave sound file (located within the rsc/sounds folder) & it will play asynchronously!
| RAMOFFS. | NAME      | DESCRIPTION                       |
|----------|-----------|-----------------------------------|
| 24608    | CMD       | 1=LOAD, 2=PLAY, 4=RESET           |
| 24609    | SOUNDFILE | Pointer to sound file name string |
- Source level debugging is supported. If a relocatable is compiled w/ debug symbols, it is possible to step from line to line w/ [F10] within the source displayed next to the output screen. It is also possible now to jump to a pre-selected target line (use the mouse to select some chars. in the target line of the selected source) via the [F11] key. [F12] permits stopping after leaving the current function ('jump out'). To the right, all context variables will be shown (categorized as function arguments, local variables, object instance data & static variables - and yes, they may be modified as well now!). A stop can be enforced via 'do Sys.bkpt()' from within code or simply by pressing [F6], then [F8] one or more times (context dependent) until source code shows up (if available, that is ...). In the upper half, to the right of the 'HV2 CPU' status display, you'll find a separate stack display w/ some usage hints (say 'call stack' display etc.). Also, the disassembled context is now available (left hand side). A relocatable (see ./jack/apps/Pong folder) has been provided in source code for that purpose. There is a makefile available in this folder as well. Use 'make clean' / 'make' to generate the relocatable with the provided compiler suite. Or just 'make' if you modified the source ...
Function key assignment:
[F4] Reset (program counter to zero, actually a complete restart now ...)
[F5] Run/continue running
[F6] Stop running
[F8] Single step (one assembly instruction)
[F9] Step over (function call w/ return)
[F10] Next line (source level debugger)
[F11] Stop at selected line (source level debugger)
[F12] Step out of function (source level debugger)
- A message 'device' has been introduced (like a 'serial terminal'). This permits message output to the emulator console. Normal printing output can be redirected to this device or duplicated to it alongside the regular output.
| RAMOFFS. | NAME          | DESCRIPTION          |
|----------|---------------|----------------------|
| 24624    | EMU_NEWSEQNUM | (0 = Invalid)        |
| 24625    | EMU_OLDSEQNUM | (old previous value) |
| 24626    | EMU_CHAR      | Character to output  |
- An emulator control word @ RAM[24628] now permits more fine-grained control over the processing behaviour. The Perf.bin executable demonstrates the influence on speed. Don't try in (Rust) debug mode ... 🧘
- A color graphics controller may now be used optionally via the '--CGD' option. It may also be used together with the '--BigScreen' option. The shell automatically detects an available CGD and redirects all output to that screen, yet you may swap the current output screen with the shell command scr <n>, where n=1 activates the B/W screen & n=2 the CGD. A clear screen command is now available at the command line as well ...
|RAMOFFS.| NAME | DESCRIPTION |
:--------|-----------------|-----------------------------------------------:
|24630 | CGD_N_PIPELINES | 1 (maybe later # of pipelines) |
|24631 | CGD_CHANGEFLAG | != 0? Do something & reset |
|24632 | CGD_BG_COLOR | Background colour |
|24633 | CGD_FG_COLOR | Foreground colour |
|24634 | CGD_CMD | See command guide |
|24635 | CGD_PARA1 | Coordinates x1(hi), y1(lo) or similar |
|24636 | CGD_PARA2 | Coordinates x2(hi), y2(lo) etc. |
| ... | ... | ... |
|24645 | CGD_PARA\<n\> | Coordinates x\<n\>(hi), y\<n\>(lo) or similar |
|CMD| NAME |PARAMETERS | DESCRIPTION |
:---|-------------------|-----------------------------|----------------------------------------------------:
| 1 | QUERY_SIZE | - | RETURN: CGD_PARA1 hi=x/lo=y |
| 2 | CMD_CLEAR | - | Clear screen w/ background colour |
| 3 | CMD_TEXT | PARA1:x/y | Text output w/ fore & background |
| | | PARA2:textsize | |
| | | PARA3:String pointer | |
| 4 | CMD_PIXEL | PARA1:x/y | Draw pixel w/ current foreground colour |
| 5 | CMD_LINE | PARA1:x1/y1, PARA2:x2/y2 | Draw line w/ current foreground colour |
| 6 | CMD_RECTANGLE | PARA1:x1/y1, PARA2:x2/y2 | Draw filled rectangle w/ current foregr. colour |
| 7 | CMD_CIRCLE | PARA1:x/y, PARA2:r | Draw filled circle w/ current foreground colour |
| 8 | CMD_SCROLLUP | PARA1:n pixels | Scroll up screen by \<n\> pixels |
| 9 | CMD_SPRITELOAD | PARA1:handle/n, PARA2:file | Load a png file at \<handle\> (0..15) |
| | | PARA3:x/y, PARA4:w/h | pick from offset x/y with w/h size on z plane |
| | | PARA5:hi=z order | \<n\> sprites to use (w/ w as delta) |
| | | PARA6:ref_x, PARA7: ref_y | RAM[\<ref_x\>], RAM[\<ref_y\>] current pos. return |
| | | PARA8:range_x, PARA9:range_y| optional movement range limits |
| | | PARA10:ran._w, PARA11:ran._h| |
| 10| CMD_SPRITESET | PARA1:hiword: hi=z order | drawcmd: 0=hide sprite, 1=draw sprite |
| | | lo=drawcmd | 2=draw repeat w/ loop, 3=draw w/ anim. |
| | | loword: hi=handle, | 4=Draw to range, 5=draw to range w/ anim. |
| | | lo=index | |
| | | PARA2:x/y, | Set sprite[\<handle\>][\<subindex\>] at x/y coord. |
| | | PARA3: dx/dy | Optional: Movement vector (hi=dx/lo=dy) |
| 11| CMD_SPRITESUPDATE | PARA1:flags | Enforce update (1=Draw all, 0=Hide all) |
| 12| CMD_SPRITECOLL | PARA1:handle | Test for collision with a range of other sprites |
| | | PARA2:hi=fromHandle | from/to are assumed subindex 0 sprites |
| | | lo=toHandle | RETURN:PARA1 = -1 -> No intersection, else handle |
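The hi/lo notation in the tables above suggests packed parameter words. The following sketch assumes 32-bit parameter words, with coordinate pairs split into two 16-bit halves and CMD_SPRITESET PARA1 split into four 8-bit fields; these widths are an assumption (see the command guide for the actual encoding), and the function names are hypothetical.

```rust
/// Pack a coordinate pair as "x (hi), y (lo)".
/// Assumption: 32-bit word, 16 bits per coordinate.
pub fn pack_xy(x: u16, y: u16) -> u32 {
    ((x as u32) << 16) | (y as u32)
}

/// Split a packed coordinate word back into (x, y).
pub fn unpack_xy(word: u32) -> (u16, u16) {
    ((word >> 16) as u16, word as u16)
}

/// Pack CMD_SPRITESET PARA1 as described in the command table:
/// hiword: hi = z order, lo = drawcmd; loword: hi = handle, lo = index.
/// Assumption: each of the four fields is 8 bits wide.
pub fn pack_spriteset_para1(z_order: u8, drawcmd: u8, handle: u8, index: u8) -> u32 {
    u32::from_be_bytes([z_order, drawcmd, handle, index])
}
```

With these helpers, drawing sprite handle 3, subindex 0 on z plane 2 would use `pack_spriteset_para1(2, 1, 3, 0)` for PARA1 and `pack_xy(x, y)` for PARA2.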
The game source has been provided for reference.
Impressions
On YouTube (compare speed w/ hv2e32!):
Boing: https://www.youtube.com/watch?v=L_uQlRq6BhI
Hackenstein3D: https://www.youtube.com/watch?v=inFJ5EyOhpM
Space Invaders: https://www.youtube.com/watch?v=jKqC16h59aE
Lots of other jack games too ...
Installation
You will need to have Qt5 installed (e.g. for Ubuntu/Mint):
sudo apt-get install cmake
sudo apt-get install qt5-default qttools5-dev qt5-qmake libqt5designer5
sudo apt-get install libasound2-dev libudev-dev
Next, follow the Rust installation instructions.
Before downloading this repository, make sure you have git-lfs installed!
Make sure you have qt5ct installed ('Qt5 Settings') & set the following env. variable:
QT_QPA_PLATFORMTHEME="qt5ct"
To reproduce my viewing experience, adjust Qt5 settings (command: qt5ct) to:
1st tab 'Appearance':
Style: 'Fusion'
Standard dialogs: 'Default'
Palette: 'Custom'
Color scheme: 'darker'
2nd tab 'Fonts':
General: 'Ubuntu 9'
Monospace: 'Ubuntu Mono 9'
3rd tab 'Icon theme':
Mint-Y-Dark-Sand
4th tab 'Interface':
-- no changes --
5th tab 'Style Sheets':
-- none selected --
Run
The HV2OS operating system is considered to reside in FLASH/ROM memory. After initialization, it loads & starts 'Shell.bin' from the local hard disk (block device) 'hd1'. The provided '*.bin' games may be started by the user from the shell command line. Try 'help' for other commands. You may delete 'hd1.blkdev' any time, the emulator will create a new - though empty - 'disk'. Also, the games & demos are distributed on two disks (for size reasons). Rename the one to be tried to 'hd1.blkdev' (as the emulator currently supports only this block device).
Use 'cargo run --release ./jack/OS/HV2OS/HV2OS.hv2' to run (as in debug mode the performance will be poor).
For autostart in full screen mode use 'cargo run --release ./jack/OS/HV2OS/HV2OS.hv2 --BigScreen --Run' ...
Try 'cargo run --release ./jack/OS/HV2OS/HV2OS.hv2 --CGD' as well, then - once the shell has started - type '$g' to use its variable substitution feature & start a little sprite game!
TSRs ('terminate & stay resident') are now supported by the API (see 'get_TSR_return') & are thus handled by the application itself (see Clock.bin sample). May be used to extend the OS via S/W interrupts ...
If you run 'Pong.bin', you'll notice a halt on startup & restart. This is to demonstrate the variable display & source level debugging features with some suitable data (all var. types for instance!).
Experiencing problems? Try the trace option "--Debug" ...
Profiling
There is an integrated profiling tool available. Every selection via the 'File' menu will transfer current run counters (@instruction level) to the profiler window. Processing percentages are shown from application down to source line level (see tooltip for actual counter data ;). Every 'power cycle' (F6/Stop .. F5/Run) will reset the counters.
(Pseudo-)Cache performance measurement
Via command line parameters, a (pseudo) cache may be activated. The cache size may be changed as well as the number of instruction queues used (--CACHE <bytesize> --QUEUES <n>). The minimum cache size is 4096 bytes. This facility permits testing the ICPL instruction (see docs). When activated, each stop lists the cache status counters and every start resets them. The cache queues do improve the situation massively, the ICPL instruction even further (but only under certain conditions, hence hand crafted only currently ... the automatic ICPL generation by the optimizer is still experimental). This feature should allow an informed decision about how to organize the instruction cache for a possible H/W solution (FPGA).
Compiling
The ./jack/apps/Pong directory contains a compilable sample program. See 'Makefile' for usage of the compiler suite. To compile do:
cd jack/apps/Pong
make clean
make
cd ../../..
To develop, a jack language reference including some new extensions has been provided here, the HV2 OS API reference can be found here.