(August 2020)

Fork me on GitHub

The backstory

I've been fooling around with SW-only 3D graphics for quite some time. A couple of years ago, I ported the main logic into an ATmega328P microcontroller, implementing "points-only" 3D rendering, and driving an OLED display via the SPI interface... at the magnificent resolution of 128x64 :-)

The challenge - run it on the Speccy

So the path to even more useless tinkering was clear:

I had to make this work for my ZX Spectrum 48K! :-)

And as you can see below... I did:

Here's the complete code. There's a simple Makefile driving the build process - so once you have z88dk installed, just type:

make clean all run

The resulting statue.tap file is also committed in the repo, in case you just want to quickly run this in your FUSE emulator.

Compiling z88dk

The cross-compiler used for the compilation is z88dk. If it's not packaged in your distribution, you can easily build it from source:

mkdir -p ~/Github/
cd ~/Github/
git clone https://github.com/z88dk/z88dk/
cd z88dk
git submodule init
git submodule update
./build.sh

You can now use the cross compiler - by just setting up your enviroment (e.g. in your .profile):

export PATH=$HOME/Github/z88dk/bin:$PATH
export ZCCCFG=$HOME/Github/z88dk/lib/config

On 3D projection and Z80 assembly

Since the Speccy's brain is even tinier than the ATmega328P's, I had to take even more liberties: I changed the computation loop to orbit the viewpoint (instead of rotating the statue), thus leading to the simplest possible equations:

int wxnew = points[i][0]-mcos;
int x = 128 + ((points[i][1]+msin)/wxnew);
int y = 96 - (points[i][2]/wxnew);

No multiplications, no shifts; just two divisions, and a few additions/subtractions.

But that was not the end - if one is to reminisce, one must go all the way!

So after almost 4 decades, I re-wrote Z80 assembly - and made much better use of the Z80 registers than any C compiler can.

The result?

Almost a 2x speedup... Reaching the phenomenal speed of 10 frames per sec :-)

Pre-calculating for maximum speed

I was also curious about precalculating the entire paths and the screen memory writes - you can see that code in the precompute branch.

As you can see in the video above, this version runs 4 times faster, at 40 frames per sec. It does take a couple of minutes to precompute everything, though. Since I had all the time in the world to precompute, I used the complete equations (for rotating the statue and 3D projecting) in 8.8 fixed-point arithmetic:

3D Algebra

3D Algebra

The reason for the insane speed, is that I precompute the target pixels' video RAM locations and pixel offsets, leaving almost nothing for the final inner loop, except extracting the memory access coordinates from 16 bits/pixel:

It's also worth noting that the inline assembly version of the "blitter" is 3.5 times faster than the C version. Since these are just reads, shifts and writes, I confess I did not expect to see that much of a difference... But clearly, C compilers for the Z80 need all the help they can get :-)

Next step - the real thing

Now all I need to do is wait for my retirement... so I can use my electronics knowledge to revive my Speccy, and test this code on the real thing, not just on the Free Unix Spectrum Emulator :-)

Then again, maybe you, kind reader, can try this out on your Speccy - and tell me if it works?

 

Discussion in Hackaday



profile for ttsiodras at Stack Overflow, Q&A for professional and enthusiast programmers
GitHub member ttsiodras
 
Index
 
 
CV
 
 
Updated: Sat Oct 21 20:37:23 2023
 

The comments on this website require the use of JavaScript. Perhaps your browser isn't JavaScript capable; or the script is not being run for another reason. If you're interested in reading the comments or leaving a comment behind please try again with a different browser or from a different connection.