Conway's Life in Z80 Assembler - How to optimise RC2014.

  • Published on 21 Nov 2020
  • #z80 #programming #rc2014
    Various tricks and techniques I used to optimise the speed of my RC2014 implementation of Conway's Life that was written in Z80 assembly. So join me in a bit of fun retro programming!
    Conway's Life in Z80 Assembler - How to optimise code.
    Grab the source code: github.com/nco...
    Read the text version: ncot.uk/z80-ho...
    Donate: ko-fi.com/ncot...
  • Science & Technology

Comments • 15

  • @davidparkins1808 2 years ago +2

    Extremely interesting. This is right down to the bare metal; so few, even among "expert" programmers and systems analysts, understand what is going on down in the engine room.

  • @craighart7837 2 years ago +1

    Another optimization is to 'wrap around' the field rather than end at an 'edge' (i.e. exit left, enter right, etc.). This eliminates all the edge-case tests in one go.
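
The wrap-around idea in this comment can be sketched as follows. This is a minimal illustration in Python rather than Z80 assembly, and it is not the code from the video: the names `count_neighbours` and `step` and the list-of-lists grid are assumptions made for the example. Neighbour coordinates are taken modulo the grid size, so no cell needs a separate edge test.

```python
def count_neighbours(grid, row, col):
    """Count live neighbours of (row, col) on a grid whose edges wrap around."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            # The modulo maps -1 to the last index and rows/cols back to 0,
            # so cells on the border need no special-case branch.
            total += grid[(row + dr) % rows][(col + dc) % cols]
    return total


def step(grid):
    """Return the next generation using the standard Life rules (B3/S23)."""
    new_grid = []
    for r in range(len(grid)):
        new_row = []
        for c in range(len(grid[0])):
            n = count_neighbours(grid, r, c)
            alive = grid[r][c] == 1
            new_row.append(1 if n == 3 or (alive and n == 2) else 0)
        new_grid.append(new_row)
    return new_grid


if __name__ == "__main__":
    # A vertical blinker on the left edge turns into a horizontal one
    # that straddles the left and right edges.
    grid = [[0] * 5 for _ in range(5)]
    grid[1][0] = grid[2][0] = grid[3][0] = 1
    for line in step(grid):
        print(line)
```

If the grid dimensions are powers of two, the modulo can be replaced by a bitwise AND with `size - 1`, which is usually the cheaper form on an 8-bit CPU.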

  • @talideon 3 years ago +1

    Nice work optimising it! On the other video, I mentioned using a BSP tree if you want an infinite grid. The upside is that it removes the edge cases and can reduce the amount of computation once you're dealing with larger automata. The downside is that implementing a BSP tree is difficult in and of itself, and particularly difficult in assembly. The one I mentioned was implemented in ARM assembly, but that's a much easier medium than the Z80!

    • @alexloktionoff6833 2 years ago

      Can you provide a good link for the BSP tree method?

  • @MikePerigo 3 years ago +1

    Ahh, the joys of optimising assembly. Nicely explained, simple yet comprehensive. Well done. Now you want to try programming processors with multiple data spaces, such as the Motorola 56000 DSP, which allow several instructions to be performed at the same time. Now optimising your code so that you get the best mix of overlapping instructions is fun!

    • @ncot_tech 3 years ago +1

      DSP programming is what GPU programmers do when their job is too easy, right? 😉

    • @MikePerigo 3 years ago +1

      @ncot_tech Nice comparison, I had forgotten GPU parallelism. My other go-to was Occam programming, for Transputers, but that was a higher level so details were handled by the compiler.

  • @SpeccyMan 2 years ago

    I ran this on my SC114 board - it currently only has a 9600 baud bit-banged serial port - and I'm getting 3 seconds per update, obviously due to the 9600 baud bottleneck. However, I've just purchased a SC139 serial module kit and will try it again once I've constructed it.
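
A rough back-of-the-envelope check of the serial bottleneck described in this comment, written as a short Python sketch. The 8N1 framing (10 bits on the wire per byte) and the 80x24 screen size are assumptions for illustration; neither is stated in the comment, and the function names are hypothetical.

```python
def bytes_per_second(baud, bits_per_byte=10):
    """Serial throughput in bytes per second.

    Assumes 8N1 framing by default: 1 start + 8 data + 1 stop = 10 bits per byte.
    """
    return baud / bits_per_byte


def seconds_to_send(num_bytes, baud, bits_per_byte=10):
    """Time needed to push num_bytes down the serial line."""
    return num_bytes / bytes_per_second(baud, bits_per_byte)


if __name__ == "__main__":
    # Hypothetical full-screen redraw: 80 columns x 24 rows, one byte per cell.
    print(seconds_to_send(80 * 24, 9600))
    # Bytes that fit into the reported 3 seconds per update at 9600 baud.
    print(3 * bytes_per_second(9600))
```

Under these assumptions the line carries 960 bytes per second, so a 3-second update corresponds to roughly 2880 bytes sent per generation. With a bit-banged port the CPU is also busy for the whole transmission, so computing the next generation and sending the screen cannot overlap.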

  • @flexairz 3 years ago

    Great job, but why does it need to use a joystick?

  • @SteveRaynerMakes 2 years ago

    Are you using any plugin for VSCode? I would really like to have syntax highlighting for Z80 in VSCode.

    • @ncot_tech 2 years ago

      There are some Z80 highlighting extensions for VSCode. There's one that explains the opcodes too. I can't remember their names though.

  • @SpeccyMan 2 years ago

    I've been looking at your code for this on GitHub and, given the title of this video, I couldn't help but notice something. You are using an awful lot of JP instructions (absolute jumps that take 3 bytes) when you could be using JR instructions (relative jumps that take just 2 bytes and execute faster) instead. Practically every JP in your code can be replaced with a JR.

    • @ncot_tech 2 years ago

      Cool thanks, useful to know 🙂

    • @ncot_tech 2 years ago

      Cool thanks, useful to know 🙂

    • @jan10n 2 years ago

      A JR is smaller and SLOWER than a JP: JR = 12 T-states, JP = 10.
      With conditional jumps, it's a different story: JR cc = 12 T-states when taken, but only 7 when not taken. JP cc = 10 either way.
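
The trade-off in this thread can be put into numbers with a small Python sketch. The byte sizes and T-state counts are the ones quoted in the comments; the table layout and the helper names (`average_t_states`, `break_even_taken_ratio`, `jr_reaches`) are hypothetical and exist only for this illustration.

```python
# Size and timing of the four Z80 jump forms discussed above.
# "not_taken" is None for unconditional jumps, which are always taken.
JUMPS = {
    "JP nn":    {"bytes": 3, "taken": 10, "not_taken": None},
    "JR e":     {"bytes": 2, "taken": 12, "not_taken": None},
    "JP cc,nn": {"bytes": 3, "taken": 10, "not_taken": 10},
    "JR cc,e":  {"bytes": 2, "taken": 12, "not_taken": 7},
}


def average_t_states(name, p_taken):
    """Expected T-states for a jump that is taken with probability p_taken."""
    jump = JUMPS[name]
    if jump["not_taken"] is None:
        return float(jump["taken"])
    return p_taken * jump["taken"] + (1.0 - p_taken) * jump["not_taken"]


def break_even_taken_ratio():
    """Taken ratio at which JR cc and JP cc cost the same on average."""
    jr, jp = JUMPS["JR cc,e"], JUMPS["JP cc,nn"]
    return (jp["taken"] - jr["not_taken"]) / (jr["taken"] - jr["not_taken"])


def jr_reaches(source, target):
    """True if a 2-byte JR at address source can reach target.

    The signed 8-bit displacement is relative to the address of the
    instruction that follows the JR. 16-bit address wrap-around is ignored.
    """
    displacement = target - (source + 2)
    return -128 <= displacement <= 127


if __name__ == "__main__":
    print(average_t_states("JR cc,e", 0.5))   # branch taken half the time
    print(average_t_states("JP cc,nn", 0.5))
    print(break_even_taken_ratio())
```

With these figures a conditional JR is faster on average only when the branch is taken less than 60% of the time, so a loop-closing branch that is almost always taken favours JP for speed and JR for size. JR also supports only the NZ, Z, NC and C conditions and has a limited range, so not every JP can be swapped mechanically.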