The Madness of Z80 I/O

  • Published on 31 Oct 2024

Comments • 479

  • @bread8070
    @bread8070 9 months ago +141

    To understand why the Z80 has separate memory and I/O spaces you have to look at its history. The Z80 is an enhanced 8080. The 8080 was an enhanced 8008. The 8008 was a single chip version of the Datapoint 2200. The Datapoint 2200 had a processor made from over 100 TTL logic chips. It used (in the original version) serial memory and shift registers, and had a 1-bit bus and 1-bit ALU.
    Accessing I/O was very different to accessing the memory so it needed separate instructions for doing that. The 8008, 8080, Z80, and even 8086 and Pentium have just inherited that separate I/O space for backwards compatibility.
    BTW accessing memory on the Datapoint was very similar to accessing registers. That’s why the instruction set bundles memory reads and writes into the same instructions as register reads and writes - as in LD r,(HL) and LD (HL),r in Z80 mnemonics. It’s fascinating that these quirks are still present in Pentiums 50 years later.
    On to Amstrads: I hate being pedantic, but the gate array doesn’t control RAM banking (that’s a personal soap box), but it does control ROM enables, hence why it needs access to A15 and A14.
    Also, the gate array port address and current settings of register 2 are permanently cached in the BC’ registers. (Register 2 controls video mode and ROM enables). Thus when calling into a ROM, or otherwise changing ROM enables, it just needs to swap to the alt registers, modify C’ and output the new value. This saves it having to read and write such values to/from memory and makes such operations much faster.
    But that’s enough waffle. Thank you for the always excellent videos.

    • @OscarSommerbo
      @OscarSommerbo 9 months ago +10

      The distinction of banking and ROM enable is a crucial one, I think your soapbox is entirely justified.

    • @fr_schmidlin
      @fr_schmidlin 9 months ago +12

      Bonus advantage: The separate I/O space also simplified memory cache implementation years later.

    • @gcewing
      @gcewing 9 months ago +6

      It was quite common in early computer architectures for memory access and I/O to be done very differently. The DEC PDP-8 and Data General Eclipse come to mind. As pointed out, it does have the advantage that you can use all of the address space for memory, which was an important consideration in those days when address spaces were very small by today's standards.

    • @gcewing
      @gcewing 9 months ago +14

      Using "LD" as the mnemonic for almost all instructions that move data around is a feature of the Z80 assembly language only. The standard 8080 assembly language wasn't like that -- it used "MOV" for movements between registers, and various mnemonics starting with "LD" and "ST" for memory access. It was quite quirky in various ways, e.g. it confusingly referred to the 16-bit BC, DE and HL registers as just "B", "D", and "H". The Z80 assembly language did an excellent job of cleaning all that up.
      I guess they didn't unify the mnemonics for IN and OUT because English doesn't really have a single word that works for both directions.

    • @Curt_Sampson
      @Curt_Sampson 9 months ago +8

      The separate I/O address space isn't just for backwards compatibility; it's also convenient in that you can generally use less decoding hardware than you need with memory mapped I/O systems.

  • @TheEulerID
    @TheEulerID 9 months ago +11

    There is nothing mad about Z80 I/O that I can think of. That is from experience, having used the architecture for a fully interrupt-driven embedded application complete with what was a pre-emptive multi-tasking operating system. That included programming CTC, PIO and SIO as well as a couple of backplane-connected VDU cards. It was all mode-2 interrupts, and the nature of the application was that it did not need DMA, as it was low data rate, but it did require very fast interrupt handling, as it was used for timing on a race track.
    It drove two printers, a giant 7 segment display, a single operator console, three timing beams, three car identification systems, start lights, jumped start and car redirection lights. A bit of an oddity, as the original commercial model failed, but the circuit remains, less the (mid 80s) control system, now being a rather large kart track in Milton Keynes, England.
    As the code all had to fit within a 32k ROM, it was compact and was in no way a general purpose OS, and all tasks (7 of them, including an "idle" task) had to be assembled and burned into the ROM with what was the OS core. Tasks were not re-entrant (no need), but there was a large number of shared utility and system routines, all of which were re-entrant and were also shared with the single level interrupt routines (nice on the Z80 with its partial alternative register set). Code that had to be single threaded was simply done under non-interruptible conditions which could be nested. I/O and inter-task communication was done via ring-buffers, although I/O could be direct, as was done in startup via redirecting system routines. The buffers could also be configured as to purpose during startup, although redirecting dot matrix output to a screen would lead to odd results.
    The serial port drivers for various peripherals were driven at (dedicated) task level, using wait states, with the core only handling the interrupt status.
    In short, you could do some very powerful things with Z80 peripheral chips, and some aspects reminded me of what I did as a day job, which was writing operating system code for IBM architecture machines.
    In any event, I was extremely pleased with what I could cram into the Z80's address space, without having to resort to page registers and memory banks. Of the I/O system, at least with standard Z80 peripheral chips, I was very happy.
    As for LDIR and the like, I was used to SS (storage-to-storage) instructions on the IBM, and it is a great way of reducing code size, although anathema to the load-store RISC crowd.

  • @VikOlliver
    @VikOlliver 9 months ago +65

    Amstrad dev here. It wasn't just a cost issue driving the weird I/O mapping. We also had to consider the number of logic gates available in commercial gate arrays. Side note: The prototype hardware used discrete logic to produce a Gate Array Simulator with the same pinout as the final chip, called the "GAS Board." This also held the EEPROMs that eventually became the ROM. Later we would hack off the EEPROM sections to use for ROM development and called these bits "Small GAS Board" which evolved into us calling them Smorgasbords. You're welcome.

    • @fygarOnTheRun
      @fygarOnTheRun 8 months ago +2

      Awesome insider view, thanks a lot!

  • @MarceloSilva-lh9mh
    @MarceloSilva-lh9mh 9 months ago +109

    Hey, big Z80 fan here, since ZX-80 times (TK-83 in Brazil). I wrote a good amount of assembly code back in the day, just for fun. Thank you for addressing a question that has boggled my mind for 40 years. The reason for the OTIR (rather than OUTIR) mnemonic is that the instruction mnemonics in ZX-80 Assembly language had to be kept under 4 characters, for display formatting reasons, when the code had to be shown as mnemonics (Disassembly programs).

    • @fr_schmidlin
      @fr_schmidlin 9 months ago +6

      Exactly. Memory had a very high cost back then, so they had to limit the mnemonics to 4 characters.

    • @stephenamor7762
      @stephenamor7762 8 months ago +2

      And, that is exactly why I'm useless at "meaningful" variable names!

    • @snakezdewiggle6084
      @snakezdewiggle6084 4 months ago

      @MarceloSilva-lh9mh
      You may have meant "Limited to 4 characters"!
      Well, it turns out there are eight 5-character mnemonics, undocumented, of course!
      There was a book published just after the release of the 6128.

  • @EssArrB
    @EssArrB 9 months ago +73

    Coming soon, Noel has a total meltdown over Z80 Interrupt Mode 2 vectored interrupts with Zilog's peripheral chips (CTC, DMA, PIO, SIO)!

    • @ncot_tech
      @ncot_tech 9 months ago +7

      Here be dragons! Especially if you try that on a ZX Spectrum.

    • @laser-sj
      @laser-sj 9 months ago +8

      Mode 2 interrupts worked fantastically. At the time though, I barely understood them 😂

    • @TheEulerID
      @TheEulerID 9 months ago +3

      I used all those apart from DMA in writing a race track control and timing system complete with pre-emptive multi tasking in what was an embedded system. I was used to writing OS code for IBM mainframes, so the approach of separating memory and I/O address spaces came naturally, albeit the programmability of Z80 peripheral chips was very limited compared to writing channel control programs for the mainframe.
      It was, of course, all mode 2 interrupts. However, the engineering of that sort of system is very different to mass market domestic computers.

    • @tvalenca
      @tvalenca 9 months ago

      @EssArrB That was what I was expecting when I saw the thumbnail/title.
      @ncot_tech Especially on any platform that isn't CPC.

    • @smf3472
      @smf3472 9 months ago +2

      @gppsoftware The interrupt table couldn't be in contended RAM, and the table had to be 257 bytes long. In IM 2 the value on the bus is a byte offset into a word table, and because it was essentially random (there was no hardware in the Spectrum that put a byte on the bus during an interrupt acknowledge cycle), you had to make sure it could fetch a valid word for every possible byte. So, for example, you would put your code at 0xd1d1, then have 257 bytes of 0xd1 in RAM at 0xd000 and set I to 0xd0.
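
The arithmetic behind that 257-byte table can be checked with a short Python sketch. It assumes the usual IM 2 behaviour (the CPU reads a little-endian 16-bit handler address from I*256 + the bus byte) and treats the bus byte as arbitrary. The helper name `im2_handler` is made up for illustration.

```python
def im2_handler(memory, i_reg, bus_byte):
    """Return the handler address an IM 2 interrupt would jump to."""
    vector = ((i_reg << 8) | bus_byte) & 0xFFFF
    low = memory[vector]
    high = memory[(vector + 1) & 0xFFFF]   # bus byte 0xFF reads one byte past the page
    return (high << 8) | low

memory = bytearray(0x10000)
# 257 copies of 0xD1 at 0xD000..0xD100, as in the comment above.
for addr in range(0xD000, 0xD000 + 257):
    memory[addr] = 0xD1

# Whatever byte floats on the bus, the handler address is the same.
handlers = {im2_handler(memory, 0xD0, bus) for bus in range(256)}

# With only 256 bytes, bus byte 0xFF would pick up a stray high byte.
short_table = bytearray(0x10000)
for addr in range(0xD000, 0xD000 + 256):
    short_table[addr] = 0xD1
stray = im2_handler(short_table, 0xD0, 0xFF)

print([hex(h) for h in handlers], hex(stray))
```

Every bus value resolves to 0xD1D1 with the full table, while the 256-byte version sends bus byte 0xFF to 0x00D1, which is why the extra byte is needed.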

  • @markrosenthal9108
    @markrosenthal9108 9 months ago +7

    Nostalgia time again.
    I wrote terminal emulators on the Z80 and the IN, OUT instructions were never any problem. The way they were implemented was with a separate 256 byte address space for I/O. They would typically occur only once each in any program, wrapped in assembly subroutines I would name "inchar" and "outchar". Very simple for async serial I/O.
    The real complexity challenge, especially for data communication, was setting up and responding to hardware interrupts from whatever communication controller chip the implementation used - for a Z80 typically a Z8530. The capabilities of the Z8530 SCC still impress me today.
    LDIR is one of my favorite instructions. Used it frequently for block moves with memory mapped video and such.
    Where OUTI and INI came into play was with "block mode" terminals or emulators and/or synchronous communication (again, the Z8530 as an example). For example, HP3000 mini computers could do block mode terminal I/O and had an application programming support layer for that called VPlus. And don't forget "green screen" terminal I/O from IBM. A defining characteristic of these terminals was the hidden data transfer followed by instantaneous refresh of the entire display (LDIR again on the Z80).

    • @bezbotek
      @bezbotek 2 months ago

      The ZX Spectrum 128K by Amstrad used I/O ports #7FFD, #BFFD and #FFFD (that is, 16-bit I/O addresses). Even the instruction IN A,(n) places the previous contents of A on A8-A15 first (this feature was actually used for scanning the ZX Spectrum keyboard).
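
A rough Python sketch of how those 16-bit port addresses are formed. It assumes the documented Z80 behaviour (A on A8-A15 for `IN A,(n)`, B on A8-A15 for the `(C)` forms) and the commonly cited Spectrum keyboard layout, where port 0xFE is read with one low bit in the high byte per half-row. The function names are invented for this example.

```python
def port_in_a_n(a, n):
    """Address bus during IN A,(n): A on A8-A15, n on A0-A7."""
    return ((a & 0xFF) << 8) | (n & 0xFF)

def port_via_c(b, c):
    """Address bus during IN r,(C) / OUT (C),r: B on A8-A15, C on A0-A7."""
    return ((b & 0xFF) << 8) | (c & 0xFF)

# Keyboard scan: A holds a mask with a single 0 bit, n is 0xFE.
half_row_ports = [port_in_a_n(0xFF ^ (1 << row), 0xFE) for row in range(8)]
print([hex(p) for p in half_row_ports])

# A 128K paging write with BC = 0x7FFD puts the full 16-bit value on the bus.
print(hex(port_via_c(0x7F, 0xFD)))
```

The same 8-bit port number 0xFE therefore fans out into eight distinct 16-bit addresses, one per keyboard half-row.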

  • @johanderek3383
    @johanderek3383 9 months ago +18

    I always assumed that the reason why BC is output to the address lines instead of just C is because of how the register pairs are internally wired to the address bus. You would need additional logic for the IN/OUT instructions to mask B instead of just reusing whatever is being used for HL, IX, and IY. When designing a system it would have been better to just stick to the chip designer's intent of having just 256 ports addressable by an 8-bit register.
    PS. Personally I think the IN, OUT, and LD mnemonics were well chosen. Especially the OUT instruction makes it quite clear that you are now messing with the state of an external device out there. I always thought the Intel MOV mnemonic is rubbish because you're copying, not moving. But it's all just mnemonics anyway; one could potentially fork one's favourite assembler on GitHub and come up with better mnemonics.

    • @smf3472
      @smf3472 9 months ago +1

      I agree, it's the 8080 OUT (d3)/IN (db) instructions that made no sense when Zilog implemented them on the Z80. On the 8080 the instruction took an 8-bit port and used it for address bits 0-7 and 8-15. Zilog put the A register on bits 8-15, but the instructions use A for the data bus as well.

    • @markevans2294
      @markevans2294 9 months ago

      @smf3472 I suspect that the behavior of A8-A15 with these two opcodes (D3 & DB) is an unintended side effect, which happens to be different between the I8080/8085 and the Z80.
      Zilog, meanwhile, intended the likes of IN A,(C)/OUT (C),A to be 16-bit I/O instructions, but for some reason didn't document that the port value is BC rather than just C. It also left ED70 & ED71 undocumented, since neither IN (HL),(C) nor OUT (C),(HL) makes any sense.
      IIRC the INI, OUTI, etc. instructions also use the BC register pair to specify the port, but also use the B register as a counter, effectively making these 8-bit I/O instructions.
      Things would have been easier if Zilog had used DE instead of BC for 16-bit I/O. Possibly that was too difficult, since the instruction EX DE,HL (EB) renames/remaps the registers, with an additional level of remapping for the index registers using the DD & FD prefixes.

    • @Keldor314
      @Keldor314 9 months ago

      @markevans2294 B is a special purpose register in that it's the only one that can be used as a counter in autodecrement instructions, such as OUTI, INI, DJNZ, LDI, and so forth. So BC was the only choice that would give them the option of DMA style burst transfers with OTIR and INIR.
      OUT (n),A is also an interesting beast.

    • @undercoveragent9889
      @undercoveragent9889 8 months ago

      Well, I always thought of the register pairs as 'HL' being the 'source' of data, DE being the 'destination' of the data and BC being a 'Binary Counter' that acted like a For/Next counter.
      I loved the Spectrum. _And_ the Z-80.

    • @Merilix2
      @Merilix2 3 months ago

      @smf3472 The 8080 and Z80 are quite different processors, even if the Z80 is designed to be binary compatible. Indeed the whole selected register pair (whatever instruction/phase it is) is passed to the address bus. In some cases (like IX+n/IY+n... instructions) addresses are assembled beforehand in the hidden WZ register pair. Visual-Z80 will give some interesting insights into how the Z80 actually works :).

  • @donaldcongdon9095
    @donaldcongdon9095 9 months ago +4

    That was deeply fascinating Noel! Reminds me why I always tell people that assembly language is the coolest way to program. Keep those deep dives coming. Z80 forever! Thanks.

    • @rty1955
      @rty1955 9 months ago +1

      It sure is. I've been programming in assembly for almost 60 yrs. Began on IBM 1401 then onto 360, 370, 4300 series, S/390, PDP, Data General, CDC and way too many micros to name here. I can do things in assembly that other programmers can't even dream of.
      I was an expert at reading core dumps as well. I see hex and think in binary.

  • @rastersoft
    @rastersoft 9 months ago +22

    Just two details: the Sinclair Spectrum also uses the upper 8 bits of the address bus for addressing the keyboard's half-rows. It also uses the same trick of using one bit for each device: A0 for the ULA, A2 for the ZX Printer, A1 for the memory paging/AY-3-8912 in the 128K, and A3-A4 for the Interface 1/Microdrives, which also leaves only three bits for other devices (like A5 for the Kempston joystick).
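
A toy Python model of that one-bit-per-device decoding, limited to the two assignments from the comment that are easiest to check (A0 for the ULA, A5 for the Kempston joystick). It assumes each device is selected when its address line is low, and the `SELECT_LINE` table and function name are illustrative only.

```python
# Hypothetical table: device name -> address line that selects it when low.
SELECT_LINE = {"ULA": 0, "Kempston joystick": 5}

def selected_devices(port):
    """List every device whose select line is low for this port address."""
    return [name for name, line in SELECT_LINE.items()
            if not (port >> line) & 1]

for port in (0xFE, 0x1F, 0xFF, 0x1E):
    print(hex(port), selected_devices(port))
```

Port 0x1E has both A0 and A5 low, so two devices answer at once. That is the price of this kind of decoding: software has to keep every other device's select bit high.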

    • @fr_schmidlin
      @fr_schmidlin 9 months ago

      So that's where the insane idea came from! 😅
      Thanks for the explanation.

    • @Aeroman66
      @Aeroman66 9 months ago +1

      Thank God I'm not the only one who noticed this.

    • @herrbonk3635
      @herrbonk3635 9 months ago +1

      Same with the ZX81 and ZX80, regarding keyboard scanning.

    • @herrbonk3635
      @herrbonk3635 9 months ago +2

      @fr_schmidlin How is it "insane"? You cannot have seen many hardware designs 😉 Most cheap computers used unorthodox methods and/or (more or less) undocumented aspects of processors or other components to save cost.

  • @seankayll9017
    @seankayll9017 9 months ago +13

    I loved LDIR and used it a lot when Z80 asm programming back in the early 80s.

    • @fr_schmidlin
      @fr_schmidlin 9 months ago +4

      Yes! Incredible how many people forgot that LDIR/LDDR/OTIR/OTDR were a cheap alternative to DMA memory transfers, and advertised as such. DMA was horribly expensive back then.

    • @shaunhw
      @shaunhw 5 months ago

      LDIR was slower per byte transferred than using long strings of LDI instructions, for example using 32 LDIs to copy one line of screen pixels on a Sinclair ZX Spectrum.

    • @cheponis
      @cheponis 4 months ago

      @shaunhw Yes, this is true. The convenience of LDIR was dented by slightly longer per-byte transfer times vs LDIs. Kinda sucked, but I still used LDIR most of the time.
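
For a sense of scale, here is a small Python tally of the two approaches for the 32-byte screen line mentioned above. It uses the documented Z80 timings (LDI 16 T-states, LDIR 21 T-states per repeating iteration and 16 for the last one) and assumes no wait states or memory contention, which a real Spectrum would add. The function names are made up.

```python
LDI_T = 16          # T-states for one LDI
LDIR_REPEAT_T = 21  # LDIR iteration that loops back (BC still non-zero)
LDIR_LAST_T = 16    # final LDIR iteration

def ldir_tstates(count):
    """T-states for one LDIR copying `count` bytes (count >= 1)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return (count - 1) * LDIR_REPEAT_T + LDIR_LAST_T

def unrolled_ldi_tstates(count):
    """T-states for `count` back-to-back LDI instructions."""
    return count * LDI_T

line = 32
print(ldir_tstates(line), unrolled_ldi_tstates(line))
```

That works out to 667 T-states for LDIR against 512 for 32 LDIs, at the cost of 64 bytes of code instead of 2.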

  • @SimonEllwood
    @SimonEllwood 9 months ago +22

    Z80 is a superset of the Intel 8080/8085. Those only had 256 IO addresses.

    • @wearwolf2500
      @wearwolf2500 9 months ago +2

      That gets sent out twice on the address bus (top 8 bits and bottom 8 bits) when doing an IN/OUT instruction.

    • @Torbjorn.Lindgren
      @Torbjorn.Lindgren 9 months ago +10

      Yeah, trying to understand the Z80 after only experiencing 68xx/65xx and without first at least reading up on the 8080 is going to be mindbending. Zilog basically threw the kitchen sink at the already existing 8080 design which is the "why" for so many weird corners - they had to work around an existing design. The 16-bit I/O address is a good example, it had to be 100% compatible, but they saw an opportunity to allow the hardware designer more freedom IF necessary for their design.

    • @fr_schmidlin
      @fr_schmidlin 9 months ago +2

      @Torbjorn.Lindgren You had some of the few sensible/non-fanboyish explanations here.

    • @herrbonk3635
      @herrbonk3635 9 months ago

      @Torbjorn.Lindgren It was their own design though!
      Faggin and Shima designed the 8080 at Intel as well as the Z80 at Zilog.

    • @etchedpixels
      @etchedpixels 9 months ago +5

      Pedantically 8080 but not 8085. 8085 added official different instructions (SIM/RIM) and hidden ones discovered later (LHLX, SHLX, LDSI etc) that were designed for compilers and added a load of 16bit and stack ops. Why Intel hid them nobody seems to know - perhaps to avoid 8086 competition because the 8085 with those instructions ran C and other high level code several times faster than an 8080

  • @static-san
    @static-san 9 months ago +3

    I remember learning years ago that OUT (C), A was really OUT (BC), A. But I also heard that Zilog didn't really intend to do that; as someone mentioned, masking out B would've been more complexity they couldn't initially fit on the chip. So the first Zilog manuals didn't mention it!
    How the Amstrad took advantage of this reminds me of how other home computers had partial decoding, too. TI did that in their 99/4a. In the 4a, some hardware was memory mapped and some was in the 9900's version of I/O ports. But neither were completely decoded, so all the hardware was accessible at multiple addresses.

    • @Merilix2
      @Merilix2 3 months ago

      There is indeed no address masking at all on the chip. The selected register pair is always fully passed through to the address bus, whatever instruction it is. Even during refresh cycle, the whole I/R pair is on the A-bus.

  • @laser-sj
    @laser-sj 9 months ago +5

    LDIR is a fantastic instruction. Used it many times in my past 😂

    • @fr_schmidlin
      @fr_schmidlin 9 months ago +4

      Yes! Incredible how many people forgot that LDIR/LDDR/OTIR/OTDR were a cheap alternative to DMA memory transfers, and advertised as such. DMA was horribly expensive back then.

    • @disdroid
      @disdroid 9 months ago

      Fewer clocks if you use a block of LDI and jump into the right position. LDIR increments the PC, does a compare, then conditionally decrements it again, adding extra cycles.

  • @fghsgh
    @fghsgh 9 months ago +3

    5:16: BC is output because it reuses the same circuitry that LD (BC),A uses. It isn't part of the mnemonic because hardware is _supposed_ to ignore that. OUT (C),A is actually a Z80 extension. The Intel 8080 only had OUT (imm8),A and IN A,(imm8) (using Z80 syntax). Note the 8-bit port number. The Z80 added a variable-port any-register variant to its extended (ED) instruction block.
    10:49: You can actually reach much faster speeds than LDIR (or even than an unrolled LDI loop) using an overengineered PUSH/POP loop, down to 12-13 clock cycles per byte copied, as opposed to 16 and 21 for LDI and LDIR respectively (assuming no wait states). LDI is so slow because the ED opcode fetch slows it down; LDIR is even slower because, instead of having the loop built in, the Z80 just decrements PC twice to go back to the start of the instruction on every iteration, which is a very inefficient operation. (EDIT correction: it may actually be doing a 16-bit+8-bit addition like JR uses. I need to look at the die shot.)
    21:13: FD FD is not empty, it is undocumented. A standalone FD byte actually does the following things:
    1. set a flag that disables all interrupts (including NMI) for one instruction. Other instructions that do this are all the other prefixes, and also EI.
    2. set a flag that replaces any instance of H or L with IY for one instruction, and if (HL) is accessed, read another byte for an index offset
    This means that every instruction that is valid in the base instruction set is also valid in the FD extension. Only some of these are useful though, because a lot of them won't be any different from the base instruction set.
    Therefore, when the Z80 encounters the second FD, it doesn't ignore it and keep reading, it ignores the _first_ FD. The second FD overwrites the first one.
    Another device that uses a Z80 and has it wired up to use all 16 bits is the Texas Instruments 84 Plus C Silver Edition graphing calculator. Except it actually uses all of the ports separately rather than just wiring up some bits to some chips. And it has an ASIC with an integrated CMOS Z80 rather than an original standalone NMOS Z80. Actually, it does 19:50 too: it has a special sequence of 5 instructions that needs to be executed from a "privileged" rom page before letting you access certain hardware ports. However, no check is performed to see if the bytes are executed. You could just read them as regular memory accesses, which means there are a few variants that use undocumented instructions, that can e.g. avoid the IM 1 that is part of the sequence.
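
The "second FD overwrites the first" behaviour described above can be pictured with a tiny Python model. It is only a sketch of that description, not a decoder: it assumes each DD/FD prefix simply replaces the pending HL substitution and costs one 4 T-state opcode fetch, and the names are invented.

```python
PREFIX_TO_INDEX = {0xDD: "IX", 0xFD: "IY"}

def decode_prefixes(code):
    """Return (register standing in for HL, opcode offset, T-states spent on prefixes)."""
    index = "HL"
    pos = 0
    while pos < len(code) and code[pos] in PREFIX_TO_INDEX:
        index = PREFIX_TO_INDEX[code[pos]]   # a later prefix overrides an earlier one
        pos += 1
    return index, pos, 4 * pos

print(decode_prefixes(bytes([0xFD, 0xFD, 0x21, 0x00, 0x40])))  # FD FD LD ..,nn
print(decode_prefixes(bytes([0xFD, 0xDD, 0x21, 0x00, 0x40])))  # FD then DD
print(decode_prefixes(bytes([0x21, 0x00, 0x40])))              # plain LD HL,nn
```

In this model a redundant prefix changes nothing except the time spent fetching it.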

    • @amidarius
      @amidarius 9 months ago

      Any example of an "overengineered PUSH/POP loop", please? 😃

    • @fghsgh
      @fghsgh 9 months ago +3

      @amidarius Uh, alright, so, for a general purpose one:
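      loop:
          ld hl,#        ; first #: difference between source and destination
          add hl,sp
          ld sp,hl
          pop af         ; read 20 bytes into every register pair
          pop bc
          pop de
          pop hl
          exx
          ex af,af'
          pop af
          pop bc
          pop de
          pop hl
          pop ix
          pop iy
          ld sp,#        ; second #: destination pointer, patched by LD (#),SP below
          push iy        ; write the 20 bytes back in reverse order
          push ix
          push hl
          push de
          push bc
          push af
          exx
          ex af,af'
          push hl
          push de
          push bc
          push af
          ld (#),sp      ; third #: address of the second #
          ld a,r         ; R is the loop counter
          jp nz,loop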
      It uses _all_ registers (aside from I, I guess), and uses R as a loop counter. Interrupts (obviously) have to be disabled. The first # is the difference between source and destination, the second # should be initialised to the address of the byte after the destination block, the third # is the address of the second #.
      As the loop increments R 37 times and R wraps around after 128 (which is coprime with 37), every number of iterations up to 128 can be achieved in this manner.
      Of course, the setup for using this method is very expensive, and even then it would copy at 15.9cc per byte (assuming no wait states), which is only marginally better than an unrolled LDI loop at 16cc per byte. And of course you also need to jump into an unrolled LDI loop to cover the remainder after the 20-byte blocks this loop copies. This is a lot of calculations, especially on a Z80 that can't do multiplication or division, but there are definitely cases where several of these values can be hard-coded.
      (I did say it was overengineered)
      However, where this method really shines, is if you know source & destination at assembly time, and are willing to unroll completely:
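          ld sp,#        ; source address
          pop af         ; read 14 bytes
          pop bc
          pop de
          pop hl
          exx
          pop bc
          pop de
          pop hl
          ld sp,#        ; address just past the destination
          push hl        ; write them back in reverse order
          push de
          push bc
          exx
          push hl
          push de
          push bc
          push af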
      This can do 12.5cc per byte(!!), in 14-byte multiples. (and it doesn't use AF' because that would actually slow it down to 12.75cc per byte)
      LDI is slow because it's an extended instruction :(. PUSH/POP are crazy efficient like this because they're one-byte instructions that can do two memory accesses. So copying one byte takes 3 total memory accesses, instead of LDI's 4.
      It's even faster if you just need to zero out a block of memory or something:
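          ld hl,0
          ld b,l              ; B = 0, so DJNZ runs 256 times
          ld sp,block+4096    ; pushes fill memory downwards from the end
      loop:
          push hl             ; 8 pushes = 16 bytes per iteration
          push hl
          push hl
          push hl
          push hl
          push hl
          push hl
          push hl
          djnz loop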
      This clears a 4096-byte block starting at `block`, taking 25875cc in total, for a rate of 6.32cc per byte(!!!!!).
      But remember to save&restore SP. And remember you can use SMC for that too to save another 10cc :p.
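
The cycle figures quoted in this comment can be re-derived with a short Python tally. This is a sketch using the documented Z80 T-state counts with no wait states assumed; the instruction counts are read off the three listings above, and the dictionary keys are just labels chosen for this example.

```python
# Documented Z80 T-state counts (no wait states or contention assumed).
T_STATES = {
    "ld rr,nn": 10, "ld r,r'": 4, "add hl,sp": 11, "ld sp,hl": 6,
    "pop rr": 10, "push rr": 11, "pop ix/iy": 14, "push ix/iy": 15,
    "exx": 4, "ex af,af'": 4, "ld (nn),sp": 20, "ld a,r": 9,
    "jp cc,nn": 10, "djnz taken": 13, "djnz not taken": 8,
}

def total(counts):
    """Sum T-states for a {instruction label: occurrences} mapping."""
    return sum(T_STATES[name] * n for name, n in counts.items())

# General-purpose loop: one pass copies 20 bytes.
general_cc = total({
    "ld rr,nn": 2, "add hl,sp": 1, "ld sp,hl": 1, "pop rr": 8, "push rr": 8,
    "pop ix/iy": 2, "push ix/iy": 2, "exx": 2, "ex af,af'": 2,
    "ld (nn),sp": 1, "ld a,r": 1, "jp cc,nn": 1,
})

# Fully unrolled copy: one pass copies 14 bytes.
unrolled_cc = total({"ld rr,nn": 2, "pop rr": 7, "push rr": 7, "exx": 2})

# Block clear: setup, then 256 iterations of 8 pushes plus DJNZ.
clear_cc = (total({"ld rr,nn": 2, "ld r,r'": 1})
            + 256 * total({"push rr": 8})
            + total({"djnz taken": 255, "djnz not taken": 1}))

print(general_cc / 20, unrolled_cc / 14, round(clear_cc / 4096, 2))
```

The totals come to 318, 175 and 25875 T-states, matching the 15.9, 12.5 and 6.32 cc per byte quoted above.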

    • @fghsgh
      @fghsgh 9 months ago +1

      Okay, I did some math to figure out more about the initialisation. The number of iterations is BC/20. The number of bytes to be copied outside of the loop is BC%20, so the JR offset into a block of unrolled LDIs is (BC-1)%20*2. To find the number to set R to, multiply the number of iterations by 45, add 36, and AND with 127. Or something other than 36 depending on where during the initialisation you're setting it. However, this method only lets you copy up to 2560 bytes (+ 19 for the unrolled LDI loop), so you can save at most 256cc, which is less than the initialisation would probably take. Unless you added an outer loop counter too, or unrolled the inner loop some.
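
The coprimality argument for using R as a loop counter can be checked in a few lines of Python. This sketch only models the counting (R's low 7 bits advance by 37 per pass, as stated in the thread) and deliberately does not try to reproduce the exact start-value formula, which depends on where R is loaded.

```python
from math import gcd

M1_PER_ITERATION = 37   # opcode fetches per pass of the loop, per the thread
R_MODULUS = 128         # only the low 7 bits of R count
BYTES_PER_ITERATION = 20

# R advances by 37 (mod 128) per pass; collect the values it can take.
r_values = {(M1_PER_ITERATION * k) % R_MODULUS for k in range(R_MODULUS)}

# Because 37 and 128 share no factor, all 128 values are visited before
# the sequence repeats, so any pass count from 1 to 128 can be arranged.
all_counts_reachable = (gcd(M1_PER_ITERATION, R_MODULUS) == 1
                        and len(r_values) == R_MODULUS)

max_bytes = R_MODULUS * BYTES_PER_ITERATION
inverse_check = (M1_PER_ITERATION * 45) % R_MODULUS

print(all_counts_reachable, max_bytes, inverse_check)
```

The last value shows that 45 is the multiplicative inverse of 37 modulo 128, and the 2560-byte ceiling is simply 128 passes of 20 bytes.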

    • @amidarius
      @amidarius 9 months ago

      @fghsgh Yes, the first method is really overengineered. 😄 I knew the second one (the 8-byte-multiple version, without EXX, was already fast enough at 13cc/byte) and the third one. Anyway, nice post for all Z80-every-cc-counts newcomers. 😂

  • @AndrewRump
    @AndrewRump 9 months ago +4

    You forgot one important detail!
    On Amstrad, I/O devices must also monitor the memory access pin and, if it is active at the same time as the I/O access pin, ignore the I/O access pin, because the CPU is doing memory banking!
    I was nearly bitten by this when I made an interface for Lego Mindstorms version 0.
    Luckily when I was about to give up because my board would be loaded with random values from time to time (when doing memory banking) I just read about the feature in an Amstrad book.
    Fortunately I had connected the I/O access pin to both pins on a NOR gate. I just had to remove the jumper and add the memory access pin to the chip and everything worked!!! 🎉

    • @fr_schmidlin
      @fr_schmidlin 9 months ago +1

      Man, and I thought the CPC architecture was hacky enough with this video.
      An honest thank you for giving more details!

  • @dhpbear2
    @dhpbear2 9 months ago +3

    12:38 - Easy, the 'B' register is sent to A8-A15 for those who need to decode more than 256 I/O addresses! :)

  • @Zeal8bit
    @Zeal8bit 9 months ago +14

    Great video! I love the Z80 and videos that dive deep into its instruction set 😍
    There is one instruction that you have not talked about, which is very important in my opinion: OUT (n), A
    Instead of using BC, it uses an immediate value as the port. On the Amstrad, this instruction seems to be unusable since the upper address bits are also taken from the A register.
    If we considered the I/O bus as a 16-bit bus, this instruction would need to be of the form OUT (NN), A and that would take even more clock cycles and code space since there is one more byte to fetch.
    Regarding the instruction table, does it also show the undocumented instructions? There are plenty of them, so some "empty" cells in the table would in fact result in undesired behavior on real hardware 😅

  • @bendertherobot910
    @bendertherobot910 9 months ago +1

    Clearly, your channel is one of the best on these topics on YouTube, and so are your editing and storytelling skills. Thanks!!

  • @MarkOfBitcoin
    @MarkOfBitcoin 9 months ago +1

    That is the deepest dive on one instruction I’ve ever seen! Well done 😃

  • @drgusman
    @drgusman 9 months ago +18

    To me it makes sense. If you think about addressing a register-based external IC, you need a /CE signal, an address to select a register and a value to set that register to. Then, with what Zilog did, you can use an address decoder connected to the low 8 bits and /IORQ to drive /CE, the high 8 bits to select the register and the 8 data lines for the value. With this you can have data tables for all the registers of the IC and a simple OTIR will load all of them, or read the full state of the chip with INIR :)
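
A rough Python model of that idea: OTIR walking a data table while B, which sits on A8-A15, doubles as the register index. It follows the documented OUTI behaviour, where B is decremented before it is placed on the address bus. The chip select port 0x10, the table contents and the function name are all hypothetical.

```python
def otir(memory, hl, b, c):
    """Model OTIR. Returns the list of (port, value) writes and the final HL."""
    writes = []
    while True:
        value = memory[hl]
        b = (b - 1) & 0xFF                  # B is decremented before reaching the bus
        writes.append(((b << 8) | c, value))
        hl = (hl + 1) & 0xFFFF
        if b == 0:
            break
    return writes, hl

memory = bytearray(0x10000)
# Hypothetical register table. B counts down, so values are stored from
# register 3 down to register 0.
memory[0x8000:0x8004] = bytes([0xAA, 0xBB, 0xCC, 0xDD])

writes, end_hl = otir(memory, hl=0x8000, b=4, c=0x10)
for port, value in writes:
    print(hex(port), hex(value))
```

One consequence visible here is that the high byte runs from count-1 down to 0, so the table has to be laid out in descending register order.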

    • @smf3472
      @smf3472 9 months ago

      But you also need an address decoder for the RAM and ROM, so it doesn't particularly help. I don't believe Zilog ever considered how simple it would be to hook up. The decision had been made on the 8008, and that had been based on a TTL design that used a delay line instead of RAM, so you couldn't have memory-mapped I/O. The Intel 4004 also had a strange way of accessing RAM that would make memory-mapped I/O impractical. Hence Intel went for non-memory-mapped I/O.

  • @bryede
    @bryede 9 months ago +10

    Old NMOS chips use the capacitance of their transistor gates like a free register from clock cycle to clock cycle. So, the same effect that creates DRAM cells also works to move data through the chip, as long as your cycles aren't so far apart that the transistor discharges. The later CMOS versions are built differently and can be halted completely.

    • @dfs-comedy
      @dfs-comedy 9 months ago +1

      Some CMOS chips still use dynamic logic, so can't be completely halted. This is done to improve circuit density.

    • @cheponis
      @cheponis 4 months ago

      @dfs-comedy Are you sure? I thought all CMOS Z80 chips could operate down to 0 Hz. In fact, that was a way to save power.

    • @electrosys
      @electrosys 2 months ago

      I think he needs a Z84C

  • @francescosacco4969
    @francescosacco4969 9 months ago +3

    Incredible! I loved how in depth you explained and electronically showed the behavior of the Z80. I would love to see more episodes like this!

  • @michaelhaardt5988
    @michaelhaardt5988 9 months ago +30

    For true madness, you could have included the hidden 1 bit output port by manually loading bit 7 in R to latch it during refresh. :) The Z80 just suffers from 8080 compatibility at some places.

    • @NoelsRetroLab
      @NoelsRetroLab  9 หลายเดือนก่อน +2

      Ohhh, I didn't know about that!

    • @andyhu9542
      @andyhu9542 9 หลายเดือนก่อน +1

      And an additional 8-bit port with the I register if you don't use Zilog's interrupt system!

    • @greenaum
      @greenaum 9 หลายเดือนก่อน

      Are you talking about output ports or register space?

    • @smf3472
      @smf3472 9 หลายเดือนก่อน +1

      Nintendo's Popeye arcade PCB uses 1 bit of the I register for acknowledging vblank NMI (and for enabling/disabling vblank NMI). Until you look at the schematics and realize it's latching the bit during a refresh cycle, then it is kinda confusing trying to work out why it's changing the I register.
      OUT (c) makes sense as the 8080 had an OUT instruction that only took an 8-bit port, which was then used for bits 0-7 and also for bits 8-15 on the address bus, and it always outputs the A register on the data bus. What really doesn't make sense is that Zilog made this instruction output the A register on bits 8-15, which is OK for IN but quite useless for OUT. I don't know what they were thinking.

    • @andyhu9542
      @andyhu9542 9 หลายเดือนก่อน +3

      @@greenaum If you latch A8~A15 during the memory refresh cycle, you get the contents of the I register. Therefore, this is like a super easy to use output port that can be loaded with a LD I,A instruction.
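The "hidden port" idea in this thread can be sketched as a small Python model. It assumes the usual description of the refresh cycle (I on A8-A15, R on A0-A7, with R counting only on its low 7 bits so bit 7 keeps whatever LD R,A stored). The latch wiring in `latched_port_bits` is hypothetical, just to show what external hardware clocked during refresh would capture.

```python
def refresh_address(i_reg, r_reg):
    """Address driven during the refresh part of an M1 cycle."""
    return ((i_reg & 0xFF) << 8) | (r_reg & 0xFF)


def next_r(r_reg):
    """R increments on bits 0-6 only; bit 7 is preserved."""
    return (r_reg & 0x80) | ((r_reg + 1) & 0x7F)


def latched_port_bits(i_reg, r_reg):
    """What hypothetical latches clocked by /RFSH would capture."""
    addr = refresh_address(i_reg, r_reg)
    return {"i_port": addr >> 8, "r_bit7": (addr >> 7) & 1}


# LD I,A with A = 0xA5 and LD R,A with bit 7 set, then let R tick a few times.
r = 0x80
for _ in range(3):
    print(latched_port_bits(0xA5, r))
    r = next_r(r)
```

However many instructions run, the latched values stay put until the program reloads I or R.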

  • @rustandmagic
    @rustandmagic 9 หลายเดือนก่อน +17

    I like the Z80 assembly language, I find it logical, the Intel 8080 rubbish on the other hand is madness, I mean "MOV dest,source"!!!!, "LD dest,source" is much more logical, but I guess it is a matter of taste but ok, they could have done the IO instructions better ;)

    • @edgeeffect
      @edgeeffect 9 หลายเดือนก่อน +4

      Yeah, I did my first low-level programming on the Z80... when I moved to CP/M and everything was 8080 based (even though most of the actual hardware was Z80) I hated all those "horrible" mnemonics in 8080 assembler.

    • @LarryRobinsonintothefog
      @LarryRobinsonintothefog 9 หลายเดือนก่อน

      The change to the 80x86 assembly language was different, but you had more memory to access.

    • @deang5622
      @deang5622 8 หลายเดือนก่อน

      Clearly you never used the 6800, 6502, 6809. Far more elegant instruction sets.

    • @rustandmagic
      @rustandmagic 8 หลายเดือนก่อน

      @@deang5622 Used all of them and I agree, but we were talking about Z80s.

    • @LarryRobinsonintothefog
      @LarryRobinsonintothefog 8 หลายเดือนก่อน

      @@deang5622 I've programmed the 6800 and 6809, not the 6502. The 6809 had a MUL instruction; on the Z80 you had to do that manually.

  • @richardkelsch3640
    @richardkelsch3640 9 หลายเดือนก่อน +12

    The ZX Spectrum uses the full BC address for scanning keys.

    • @theALFEST
      @theALFEST 9 หลายเดือนก่อน

      But the Spectrum can use the 'IN A,(FE)' instruction with A holding the high byte of the port address. That is impossible on the Amstrad CPC.

    • @qno-oj3py
      @qno-oj3py 9 หลายเดือนก่อน +1

      Ah, yes. The ZX Spectrum. I used to have one in the 80s. Built an EPROM programmer with it to copy drum computer sounds. I remember the keyboard was in the way. Had to use high addresses to not get data garbled. I used a couple of 374 8-bit buffers and a 273 (from memory). Address decoder was a 138. Hardware debugging with a single-channel oscilloscope.

  • @cjh0751
    @cjh0751 9 หลายเดือนก่อน +8

    Hi Noel, I was just watching your video about the Fujitsu FM-7 last night. I think the YouTube algorithm suggested it because of the British Post Office scandal with the faulty Fujitsu Horizon Post Office software. Anyway, it's always great to see a new video from you.

    • @NoelsRetroLab
      @NoelsRetroLab  9 หลายเดือนก่อน +3

      Haha, that's hilarious if that's the reason it's pushing that video. I should look to see if there's a recent bump :-)

    • @HenkvanHoek
      @HenkvanHoek 9 หลายเดือนก่อน +1

      I was also watching the Post Office episode. Funny how the YouTube algorithm works.

    • @disdroid
      @disdroid 9 หลายเดือนก่อน

      Maybe Fujitsu forgot about the upper address line in the software?

  • @GodmanchesterGoblin
    @GodmanchesterGoblin 9 หลายเดือนก่อน +4

    I love the LDIR instruction. It's great for fast block copying. Less well known is that if the two blocks overlap and begin only one memory location apart, it's possible to clear a large block of memory just by zeroing the first byte and then using LDIR to copy to the next byte repeatedly until the entire block is initialised.
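The overlapping-copy trick is easy to check with a tiny Python model of LDIR. This is a sketch only: `ldir` and `fill` are illustrative names, and the model ignores flags and timing. It just copies (HL) to (DE), bumps both pointers, and counts BC down to zero.

```python
def ldir(memory, hl, de, bc):
    """Model Z80 LDIR on a bytearray; returns the final (HL, DE, BC)."""
    while True:
        memory[de] = memory[hl]
        hl = (hl + 1) & 0xFFFF
        de = (de + 1) & 0xFFFF
        bc = (bc - 1) & 0xFFFF
        if bc == 0:
            return hl, de, bc


def fill(memory, start, length, value):
    """Fill by seeding one byte and letting LDIR smear it forward."""
    memory[start] = value               # LD (HL),value
    if length > 1:
        # HL = start, DE = start + 1, BC = length - 1
        ldir(memory, start, start + 1, length - 1)


ram = bytearray(range(16))
fill(ram, 4, 8, 0x00)
print(ram.hex(" "))
```

Each byte LDIR reads is the one it wrote on the previous step, which is why the seed value propagates through the whole block.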

    • @jeromethiel4323
      @jeromethiel4323 9 หลายเดือนก่อน +3

      Yep. This is exactly how the TRS-80 cleared video memory. You can also use LDIR to shift the video memory one character up, down, left, or right. It's all in how you set up the pointers. Way back in the day I wrote a small machine code routine that piggybacked off the DOS CMD command (I didn't have a disk system, so no DOS), and it could fill the screen with a character or move the screen as I noted above. Used it for writing games back in the day.

    • @TheEulerID
      @TheEulerID 9 หลายเดือนก่อน

      I was used to SS (storage-storage) instructions on IBM mainframes, so I wasn't outraged by the existence of LDIR on the Z80, albeit the IBM MVC instruction encodes a fixed length. For variable lengths, iterating using registers, you have MVCL, which is closer to how LDIR works.
      There is another way, using an EX instruction to execute those SS instructions and, essentially, temporarily overwrite the length encoded in the second byte of the instruction (where the length-1 is encoded on an SS format instruction), but I fear that would cause explosive outrage...

    • @flatfingertuning727
      @flatfingertuning727 9 หลายเดือนก่อน

      If one wants to fill a bunch of 256-byte chunks of memory with a 4-byte pattern, using a sequence like LP1: LD (HL),A / INC L / LD (HL),C / INC L / LD (HL),D / INC L / LD (HL),E / INC L / JP NZ,LP1 / INC H / DEC B / JP NZ,LP1 will be a fair bit faster (about 13.5 cycles/byte) than LDIR. LDIR is handy, but it's tragically slow because of how much time it spends using a 4-bit ALU to update BC.
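The 13.5 cycles/byte figure can be sanity-checked with a little arithmetic in Python. The T-state counts below are the standard published ones for a plain Z80 with no wait states (an assumption; machines that stretch cycles, like the CPC, will differ).

```python
# Standard Z80 T-state counts, assuming no wait states.
T = {"LD (HL),r": 7, "INC r": 4, "DEC r": 4, "JP cc,nn": 10}

LDIR_CYCLES_PER_BYTE = 21  # while BC != 0


def pattern_fill_cycles_per_byte():
    """Cost of the LP1 loop above, averaged over one 256-byte page."""
    # One inner pass stores 4 bytes: 4 x (LD (HL),r + INC L), then JP NZ,LP1.
    inner_pass = 4 * (T["LD (HL),r"] + T["INC r"]) + T["JP cc,nn"]
    # 64 passes per page, plus INC H, DEC B and the outer JP NZ,LP1 once per page.
    per_page = 64 * inner_pass + T["INC r"] + T["DEC r"] + T["JP cc,nn"]
    return per_page / 256


print(pattern_fill_cycles_per_byte(), "vs", LDIR_CYCLES_PER_BYTE)
```

The per-page overhead is small, so the average lands just above 13.5 cycles per byte.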

    • @mikafoxx2717
      @mikafoxx2717 9 หลายเดือนก่อน

      LDIR is almost like a hacked-in CPU bound DMA. It's very handy though, and makes for small binaries that can move a lot of data, even if not the fastest way to do it.

    • @PS-bp4ju
      @PS-bp4ju 9 หลายเดือนก่อน

      @@TheEulerID Both LDIR and MVCL are 2-byte instructions and both ... are slow as hell. In places where performance matters, MVCL is changed to a bunch of MVCs and LDIR to LDIs. But anyway, the fastest way to fill/copy data on the Z80 is via the stack.
      What's also funny, MVCL is also using more registers than specified in the instruction.

  • @captainboing
    @captainboing 9 หลายเดือนก่อน +3

    Long term professional Z80 programmer here. I prefer to think of things the other way around. Very simply the Z80 had 16bit IO addressing. The CPC approach capitalized on this properly and it allowed whole pages of IO to be used for your own hardware. Used this by decoding page FF as enable for existing 8bit addressed IO. Everything else that ignored the top 8bits was either sloppy or following the 8080/8085 method... Unless you were being clever and getting 16bits of data in your OUT (C),A by grabbing the B register from A8-A15.

    • @smf3472
      @smf3472 9 หลายเดือนก่อน

      The z80 design doesn't really scream 16 bit i/o addressing to me. They messed up the IN/OUT instructions by putting the A register on the top 8 bits, which is fine for IN but for OUT it means the data lines and top 8 address lines have to have the same value. In reality the z80 has 8 bit I/O addresses, but an implementation detail means that you can achieve 16 bit if you write the software in a specific way.

    • @captainboing
      @captainboing 9 หลายเดือนก่อน +1

      @@smf3472 that "certain way" meant choosing which you were going to use before you started designing the system. Not the word salad it is in Zaks or the datasheet, but just think of it that way and it works. If the documentation simply said BC appears on the address bus and A on the data bus, it is very simple - and defo 16 bit IO addressing. That is what actually happens, AND... With 65,000 IO addresses it did encourage "rationalised" designs resulting in lots of mirroring of addresses - but that's what you get when designing to a price-point. We used to "gut" 6128s and use them as embedded controllers in weather RADAR equipment with some older IO hardware using legacy 8 bit addressing. Used IO page 0FFxxh with 8 diodes and some simple logic to pick the top byte as the enable for existing hardware. A tiddly little daughter board was all that was needed for upgrading from legacy.
      IN A,x and OUT x,A were legacy commands and provided code/operability continuity with the Intel 8080/85 designs that were the main target of Zilog at the time. You are right that the A register appearing on A8-A15 is a bit odd - I am sure there must have been some reason for it - did any design ever use it?
      There are historically a lot of areas where the Z80 could have been a BIG improvement over the 8080... many of the complex instructions often do not provide code-size or execution-time advantages (SET, RES etc) and things like LDIR can be done much quicker by clever use of SP - albeit not as arbitrarily. The 4 bit ALU was a major area they could have spent time on, the addition of a MUL command would have been so easy and a real coup over Intel, and 16bit relative addressing for CALLs, JUMPs and IX/Y. There were great improvements over the 8080 but a lot were seldom used because they didn't go far enough. In Zilog's defence, the actual deployment of the RISC philosophy was a decade away, even though the 6502 performed very well with a slower clock and only three registers. Motorola's 6800 series was even better.

    • @mibnsharpals
      @mibnsharpals 5 หลายเดือนก่อน

      @@smf3472 The question is whether the 16bit addressing was intentional. Some people simply assume that there is a glitch when using the IN/OUT(c), so that the B register then displays the upper bits.

  • @ErazerPT
    @ErazerPT 9 หลายเดือนก่อน +3

    Fun video. As for the last part, that sounds a damn lot like what the Graffiti "display adapter" for the Amiga did. It just "kept state" with the graphics data being outputted, and if it received a very specific setup that made very little sense in normal use, bam, it was not bitplane data but "chunky" data. Sure, it's ugly, but when you're repurposing existing stuff for new purposes, that's how you go...
    p.s. in the software side, it's the equivalent of when you have to keep using some data structure that already exists without changing it, you abuse it a bit on the set up side, and on the other end if it's set in the "right way", you know that the content is not straight "regular data" but "data that needs further decoding". The amount of abuse string data gets with "json/xml inside" kludges makes these hardware shenanigans look tame ;)

  • @jeroentaverne8232
    @jeroentaverne8232 9 หลายเดือนก่อน +2

    Sinclair ZX80,ZX81 and Spectrum actually used A15..A8 as row selector for the keyboard matrix when executing an IN instruction to read the columns. Just to save costs for a real output register to select the row.

  • @Randrew
    @Randrew 9 หลายเดือนก่อน +2

    Before watching your video, I'm gonna say: August 30, 2006 I edited the wiki Z80 page to add "undocumented 16-bit I/O-addressing" information. Actually the documentation was in the Z80 Hardware Reference Manual all along (still got it around here somewhere) but isn't too clear about it. It shows the BC register being asserted onto the full 16-bit address bus in the OUT (C),A and inverse instructions. I actually took advantage of this feature in the late '80s when interfacing some 16-bit (address) I/O cards in a Z80 system.
    Now, on to see what you have to say ;)

  • @OscarSommerbo
    @OscarSommerbo 9 หลายเดือนก่อน +5

    This video all but convinced me to learn z80(e) assembler, so much fun stuff in the Z80, compared to the 6510, which I programmed for the last time in the 80s.

    • @smf3472
      @smf3472 9 หลายเดือนก่อน +1

      I have done both 6502 and Z80 in recent years, I find 6502 vastly more enjoyable than Z80.

  • @Lord-Sméagol
    @Lord-Sméagol 9 หลายเดือนก่อน +1

    On LDIR/LDDR if you need a bit more speed, use several LDI/LDD instructions and loop them; LDIR is 21 cycles per iteration, LDI is 16.
    On INIR/OTIR, I think they can be useful for a RAM disk; using 256-byte 'sectors', the B countdown with INIR/OTIR can be used to address each byte of the sector, making the read and write code very small and fast.
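For the LDI point, the break-even is simple enough to compute. A minimal Python sketch, assuming the unrolled LDIs are closed with a `JP PE,loop` (10 T-states, taken while BC is non-zero) and that the block length is a multiple of the unroll factor; both of those are assumptions for illustration, not something stated above.

```python
LDI_CYCLES = 16      # one LDI
LDIR_CYCLES = 21     # one LDIR iteration while BC != 0
JP_CYCLES = 10       # JP PE,loop closing the unrolled group (assumed loop form)


def unrolled_ldi_cycles_per_byte(ldi_per_loop):
    """Average T-states per byte for N LDIs followed by one JP PE."""
    return (LDI_CYCLES * ldi_per_loop + JP_CYCLES) / ldi_per_loop


for n in (1, 2, 4, 16):
    print(n, unrolled_ldi_cycles_per_byte(n))
```

A single LDI plus a jump is slower than LDIR, so the gain only shows up once a few LDIs share each jump.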

  • @0cgw
    @0cgw 9 หลายเดือนก่อน +2

    Yep, having converted assembler to hex codes before I got a proper assembler for my ZX Spectrum back in the early 80s, I immediately shouted at the screen #C9 = ret. 😄

    • @Foersom_
      @Foersom_ 9 หลายเดือนก่อน

      I did the same but using Spectravideo and MSX.

  • @TheUtuber999
    @TheUtuber999 9 หลายเดือนก่อน +4

    6:18 You can absolutely single-step a Z-80. Just use static RAM instead of dynamic RAM in your implementation. There is a video from five years ago on my channel as one example. Cheers.

    • @melkiorwiseman5234
      @melkiorwiseman5234 9 หลายเดือนก่อน +5

      I think what he's saying is that just like the 6502, some early Z80s used CMOS dynamic RAM for their internal registers, meaning that a certain minimum clock speed needed to be maintained in order for the registers to be refreshed. That's something I didn't know about. I thought that all Z80 CPUs always used static RAM for their internal registers.

    • @johncochran8497
      @johncochran8497 9 หลายเดือนก่อน +1

      You can't single step the original Z80 design.
      The original Z80 introduced in 1976 used a dynamic design in NMOS and had a minimum clock frequency of 250kHz. Going any slower could cause a loss of state and a crash.
      The CMOS version of the Z80, which was introduced in 1985, had a static design and could be single stepped.

    • @deang5622
      @deang5622 8 หลายเดือนก่อน

      I wouldn't use SRAM over DRAM just so I could single step through the code. That is a very poor engineering decision. SRAM is a lot more expensive per byte compared to DRAM and not necessarily a good choice in a design where the production cost is sensitive.
      Just use a decent in-circuit emulator.
      Use the proper debugging tools.

    • @TheUtuber999
      @TheUtuber999 8 หลายเดือนก่อน +1

      @@deang5622 You're kidding, right? A single chip with 64kb of SRAM (the maximum addressable amount of memory for a Z-80) costs about $3.95 USD.

  • @korsibat
    @korsibat 9 หลายเดือนก่อน +6

    As this video is not about Z80 I/O being mad but instead as you state in the video it's the Amstrad CPC hardware being the odd

    • @mibnsharpals
      @mibnsharpals 5 หลายเดือนก่อน

      thats right :-)

  • @EricStringer
    @EricStringer 9 หลายเดือนก่อน +1

    Note there are two versions of the Z80 NMOS vs. CMOS
    CMOS you can single-step the clock input

  • @etmax1
    @etmax1 9 หลายเดือนก่อน +4

    This sort of operation was done by every IC manufacturer of the time in one way or another. They didn't do VHDL synthesis and all that nice stuff we do now, and transistor counts were limited to fewer than 10,000 around this time, so if an instruction had an artefact then so be it. What's important is that the things they say happen, actually happen. If you look at the 6502, it has stacks of undocumented features/instructions that came about by a chance of sorts.

    • @herrbonk3635
      @herrbonk3635 9 หลายเดือนก่อน

      Exactly, well put.

  • @bendertherobot910
    @bendertherobot910 9 หลายเดือนก่อน +1

    Oh, sorry. I'm not sure about this, but reading new Zilog Z80 manuals I realized that these have a lot of mistakes. I prefer to read old manuals (like those stored in Bitsavers website). By the way: Happy New Year, Noel!! Thanks for the awesome video (as always)!!!

  • @ojonasar
    @ojonasar 9 หลายเดือนก่อน +1

    Many many years ago my older brother and I built a Z80 system for a Geiger counter logging device that output to a thermal printer. It used a single EPROM, 1 I/O input port and 1 I/O output port - no RAM, as there were sufficient registers to not need it, even using one of them to substitute for the stack.

  • @klausmoritzpeitzsch690
    @klausmoritzpeitzsch690 9 หลายเดือนก่อน +1

    Thx for your great content! I was already lost after a couple of minutes since I am still in the breadboard phase of my Z80 build and am still busy understanding the architecture as such. Therefore, good to know about the I/O tricks.

  • @Mistasparkaru
    @Mistasparkaru 9 หลายเดือนก่อน +5

    All the Z80s I've used were quite happy running at hertz speeds, i.e. one switch press per clock tick. I've only tested ~5 CPUs though.

    • @isaacmarinobavaresco7397
      @isaacmarinobavaresco7397 9 หลายเดือนก่อน +10

      CMOS Z80s run OK at zero clock, but for the old NMOS devices, the minimum speed is about 125 kHz.

    • @____________________________.x
      @____________________________.x 9 หลายเดือนก่อน

      @@isaacmarinobavaresco7397 "the Clock Pulse Width (Low) maximum is 2000 nsecs (2 usecs). so for a square-wave clock with a 50% duty cycle, the minimum clock frequency is 250 KHz" - I thought it was around 500 kHz myself, but that's what the net says. hth
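The arithmetic behind that quoted 250 kHz figure, as a quick Python sketch. The 2000 ns limit is taken from the quote above rather than checked against a datasheet, and `min_clock_hz` is just an illustrative helper.

```python
def min_clock_hz(max_low_ns, low_fraction=0.5):
    """Lowest clock frequency that keeps the low phase within max_low_ns.

    low_fraction is the share of each period the clock spends low
    (0.5 for a 50% duty cycle square wave).
    """
    period_ns = max_low_ns / low_fraction
    return 1e9 / period_ns


print(min_clock_hz(2000))          # 50% duty cycle
print(min_clock_hz(2000, 0.25))    # clock low for only a quarter of the period
```

Note the limit is on the low time, not on the frequency itself, so a clock with a short low phase could in principle run slower.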

    • @cokesandwich1668
      @cokesandwich1668 5 หลายเดือนก่อน

      I'm pretty sure mine ran on +5 VDC only. I presume that means it was CMOS, not NMOS?
      In any case I had trouble when I first built the system. It ran all janky.
      So I replaced the 2.5 MHz clock with something around 2 Hz and it worked fine.
      So I pulled out the scope and found crazy high noise on the +5 V rail.
      I cleaned that up and things ran just fine at 2.5 MHz.

    • @cheponis
      @cheponis 4 หลายเดือนก่อน

      @@cokesandwich1668 The NMOS Z80 also used only +5V, which was one of its major HW selling points compared with the 8080, which needed three supplies: +5V (Vcc), +12V (Vdd), and -5V (Vbb, the substrate bias).

  • @cbmeeks
    @cbmeeks 9 หลายเดือนก่อน +7

    Man, the scratches on that Z80. That CPU has some stories to tell. lol
    Great video! (and that's coming from a hardcore 6502 fan)

    • @Flashy7
      @Flashy7 9 หลายเดือนก่อน +2

      Those might be from Noel's fingernails while figuring it out :)

    • @mikafoxx2717
      @mikafoxx2717 9 หลายเดือนก่อน

      6502 definitely seems a bit more elegant.. z80 might be easier to program, though. But it has some real wonky things. Shadow registers, output bus shenanigans, 8080 backwards compatibility you have to keep in mind if you want to run your software on both, like CPM software usually did..

  • @scottlarson1548
    @scottlarson1548 9 หลายเดือนก่อน +2

    I think one of the epiphanies of understanding my computer as a kid was figuring out how a chip like the 6850 serial port was at a certain memory address. I assumed it was something clever and elegant and I remember being disappointed when I saw it required a whole bunch of logic chips connected to *all* of the address lines just to enable the chip at that address.

    • @flatfingertuning727
      @flatfingertuning727 9 หลายเดือนก่อน

      If one is using e.g. 8Kx8 RAM and ROM chips, and can afford 8K of address spacing for eight I/O devices, one would use one 74LS138 to select the RAM chips, ROM chips, and the second 74LS138 to select a particular I/O device. When using separate I/O space, one would still need two 74LS138 chips.

    • @scottlarson1548
      @scottlarson1548 9 หลายเดือนก่อน

      @@flatfingertuning727 You mean have the 6850 registers filling up an entire 8K block of memory space? 😬

    • @flatfingertuning727
      @flatfingertuning727 9 หลายเดือนก่อน

      @@scottlarson1548 The second 74LS138 would partition an 8K chunk of address space into eight 1K chunks, one for each I/O device. That would reduce the amount of directly-addressable memory from 64K to 56K, but despite some people's excessive aversion to bank switching, having 64K of linear RAM address space is for many tasks not the most useful means for a system to support 64K of RAM.
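The two-decoder arrangement described in this thread can be modelled in a few lines of Python. This is a sketch under assumptions: which 8K block is handed over to I/O (block 7 here, 0xE000-0xFFFF) is hypothetical, and the model ignores the 74LS138 enable inputs and any control signals.

```python
IO_BLOCK = 7  # hypothetical choice: 0xE000-0xFFFF is given over to I/O


def decode(address):
    """Two cascaded 3-to-8 decoders, in the style of a pair of 74LS138s."""
    # First decoder: A15-A13 pick one of eight 8K blocks (RAM/ROM chips).
    block = (address >> 13) & 0x7
    if block != IO_BLOCK:
        return ("memory", block)
    # Second decoder, enabled by the I/O block: A12-A10 pick one of
    # eight 1K windows, one per I/O device.
    device = (address >> 10) & 0x7
    return ("io", device)


for addr in (0x0000, 0xDFFF, 0xE000, 0xE400, 0xFFFF):
    print(f"{addr:04X} -> {decode(addr)}")
```

Each device's registers repeat throughout its 1K window, which is the 56K-of-64K trade-off mentioned above.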

  • @ojonasar
    @ojonasar 9 หลายเดือนก่อน +3

    The ZX Spectrum uses the full 16 bits when reading the keyboard.

  • @jrkorman
    @jrkorman 9 หลายเดือนก่อน +3

    Zaks' "Programming The Z80" was my bible when I got started back in 1980. Hand assembling my code for probably the first 6 months, as well as hand disassembling! This was on the Radio Shack Model 1 computer. You really LEARN when you're doing it that way!
    The "problem" probably came from Zilog "overloading" existing instruction "code". In later years I learned to be very cautious about overloading/extending code because of weird stuff in the base code.

    • @Plons0Nard
      @Plons0Nard 9 หลายเดือนก่อน +1

      Rodney Zaks was quite active in those days. I had two books of him for the 6502. 🤝

    • @retrozmachine1189
      @retrozmachine1189 9 หลายเดือนก่อน

      I had both the Z80 and 6502 books but only have the Z80 one still. I think mine is a 1982 edition, packed in a box somewhere so it's not at hand.

  • @ncot_tech
    @ncot_tech 9 หลายเดือนก่อน +1

    OK this is comforting to know and that the documentation is randomly inconsistent, and that when I was doing some Z80 on my Agon Light I wasn't in fact going insane when things were acting strangely. Also I guess they called it OTIR to keep the mnemonic as four characters?

  • @adilsongoliveira
    @adilsongoliveira 9 หลายเดือนก่อน +1

    Yeah, new Noel Labs video. Great way to close my Friday :)

  • @semibiotic
    @semibiotic 9 หลายเดือนก่อน +2

    The 16-bit bus extension is also implicitly used in the legacy IN A,(NN) / OUT (NN),A instructions.

    • @johncochran8497
      @johncochran8497 9 หลายเดือนก่อน

      Not really. For the "in a,(n)" and "out (n),a" opcodes, the upper 8 bits of the address bus are the contents of the "A" register at the start of the operation. So you can sorta get the behavior you want using the "in a,(n)" operation by loading A with the desired upper 8 bits prior to the IN opcode. But unfortunately, that doesn't apply to the OUT operation: if you attempt to use the upper 8 bits, that simply results in a set of 256 port addresses, each of which will only ever receive a single possible value.
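That limitation is easy to demonstrate with a small Python model of what each instruction drives onto the buses. This is a sketch based only on the behaviour described in this thread (A on A8-A15 for the `(n)` forms, B on A8-A15 for the `(C)` forms); the function names are made up for illustration.

```python
def out_n_a(a, n):
    """OUT (n),A: returns (address_bus, data_bus). A appears on both."""
    return ((a << 8) | n, a)


def in_a_n(a, n):
    """IN A,(n): address the CPU drives; the device supplies the data."""
    return (a << 8) | n


def out_c_r(b, c, r):
    """OUT (C),r: BC on the address bus, register r on the data bus."""
    return ((b << 8) | c, r)


def reachable_out_values(high_byte, n):
    """Data values that can ever reach port (high_byte, n) via OUT (n),A."""
    return {data
            for a in range(256)
            for addr, data in [out_n_a(a, n)]
            if addr >> 8 == high_byte}


print(reachable_out_values(0x12, 0xFE))     # only one value can arrive
print(hex(in_a_n(0x7F, 0xFE)))              # IN can still pick any high byte
print(out_c_r(0x7F, 0xFE, 0x55))            # OUT (C),r decouples address and data
```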

    • @semibiotic
      @semibiotic 9 หลายเดือนก่อน

      @@johncochran8497 I didn't say that it is always a useful extension.

  • @michaelcrisp562
    @michaelcrisp562 8 หลายเดือนก่อน

    Hi Noel, great trip down memory lane, thanks. I cut my teeth on Z80 assembly back in the day, just like you I used a switch to toggle the clock line. Pretty sure it worked without issue ie totally static design. I think the original Zilog data even mentioned this in the clock specifications dc to 2MHz, I remember being impressed by this. keep up the good work 😊

  • @TSteffi
    @TSteffi หลายเดือนก่อน

    Fun fact:
    With the eZ80, Zilog embraced the 16 bit IO. They renamed the OUT (c),r instruction to OUT (bc),r. Same for the corresponding IN instructions. And they added new variants of OTIR (OTIRX) and INIR (INIRX) that use DE for the port address instead of BC.

  • @mogwaay
    @mogwaay 9 หลายเดือนก่อน +2

    Great video Noel, always here for a good nerdy deep dive! I feel like the 8088 does something similar with the high address lines with its I/O OUT instruction, hmm, might look into that some time... Cheers!

  • @MK-jo1gi
    @MK-jo1gi 9 หลายเดือนก่อน +6

    LDIR is not an eyesore! I used it to copy blocks around in my Schneider 6128 between memory banks. It was very handy, not having to write the loop. :)

    • @fr_schmidlin
      @fr_schmidlin 9 หลายเดือนก่อน +3

      Yes! It's incredible how many people forgot that LDIR/LDDR/OTIR/OTDR were a cheap alternative to DMA memory transfers, and advertised as such. DMA was horribly expensive back then.

    • @GodmanchesterGoblin
      @GodmanchesterGoblin 9 หลายเดือนก่อน +2

      Yes. I modded my Spectrum back in 1982-83 up to 80k (16k with two 32k banks) and used LDIR as a blitter to create some pretty slick animations by pre-drawing multiple images in RAM and then repeated LDIR to drop them into the active screen buffer.

  • @Leahi84
    @Leahi84 9 หลายเดือนก่อน +2

    Happy new year! Great to see a new video.

    • @NoelsRetroLab
      @NoelsRetroLab  9 หลายเดือนก่อน +1

      Thanks! Happy new year too!

  • @fr_schmidlin
    @fr_schmidlin 9 หลายเดือนก่อน +7

    Holy mother of cr*p! This video is not about the madness of the Z80 I/O, it's about the madness of the CPC I/O! And I thought the SMS already had some very unwise shortcuts. 😱
    And decoding I/O like "special" opcodes on instruction fetch states is not only an immense detour to workaround a nearsighted architecture decision, but it's very costly hardware-wise. To the point of being insanely expensive if it was to be implemented in the 80s. Much more than the damn single 74LS138 that Amstrad omitted from the design, and could have avoided this whole mess.
    At times like this it becomes clear how the MSX was a real computer architecture, instead of just a bunch of hackily wired together chips. And how clever, flexible and expansible its architecture was.

    • @Foersom_
      @Foersom_ 9 หลายเดือนก่อน +1

      Well said.

  • @stinchjack
    @stinchjack 9 หลายเดือนก่อน +3

    Yes, Zilog could have picked a better mnemonic than "out (C),r". But I believe on the Intel 8080 "push b" was used instead of "push bc", so Zilog did improve some of the mnemonics. A lot of the rest of the complaints are Amstrad-specific.
    More interestingly/annoyingly for me, the Z80 adds an extra wait state to I/O cycles compared with memory cycles. The devices I have connected to the Z80 either don't need it at all, or need much longer delays than 1 cycle.
    I am glad for the Z80's separate I/O space; it means that I/O selection can be done easily with a 74LS138 (address decoder chip).

    • @taxessux
      @taxessux 9 หลายเดือนก่อน

      The wait states were necessary for many parts in the day. Memory speed was all important.

    • @mikafoxx2717
      @mikafoxx2717 9 หลายเดือนก่อน

      Wasn't the wait state arbitrarily long, depending on when the ready was received? Checking would at least eat a cycle, though, even if it's instant.

    • @stinchjack
      @stinchjack 9 หลายเดือนก่อน

      the Z80 doesnt have ready pin ...

    • @mikafoxx2717
      @mikafoxx2717 9 หลายเดือนก่อน

      @@stinchjack Ah, must be thinking of another processor

    • @stinchjack
      @stinchjack 9 หลายเดือนก่อน

      @@mikafoxx2717 I have an 8253, an 8251, SAA1099, 58167 RTC , HT6542, and YM3812 connected to my project fine at 6MHz without needing additional wait states. LCD screen will only go as far as 1.5 Mhz tho.
      Im not sure which chips would have needed extra wait states at 3.5MHz/4Mhz ?

  • @john2001plus
    @john2001plus 8 หลายเดือนก่อน

    I have programmed Z80 in 4 different decades. First on a TRS80, and finally on Gameboy Color. I didn't know any of this stuff. Thank you.

  • @WacKEDmaN
    @WacKEDmaN 9 หลายเดือนก่อน

    Excellent stuff Noel... now i understand why CPCs OUT instruction is handled different to the rest of them!.. and why it doesnt like the LDIR (and other loop) instructions in certain circumstances..

  • @ian_b
    @ian_b 9 หลายเดือนก่อน

    This has puzzled me since 1982 or so. Thanks for addressing it!

  • @alzalame
    @alzalame 3 หลายเดือนก่อน

    Pretty nice explanation, thank you .

  • @domramsey
    @domramsey 9 หลายเดือนก่อน +1

    Literally no idea what you just said. But I watched It all, of course.

  • @cheponis
    @cheponis 4 หลายเดือนก่อน

    The LDIR instruction is genius. Same with LDDR. And, I rather much liked the LD instruction "LOAD WITH " It's as close to the "=" in C as possible. Very logical. I/O is just different, and that's OK. IN and OUT are how one thinks of I/O -- heck I/O is an abbreviation for IN and OUT. The 6502 didn't even have I/O instructions.

  • @chainq68k
    @chainq68k 9 หลายเดือนก่อน +6

    I'm not a Z80 guy, but I know the I/O space from old school, pre-protected mode x86 programming, and this video triggered my PTSD. ... And made me remember why I'm a Motorola 68k fan. I mean, all retro computers are fun in their own way, but in some cases, their creators were so busy to determine if they could, they did not stop for a moment to think if they should. :) Still, we love them for their faults too.

    • @andrewclegg9501
      @andrewclegg9501 9 หลายเดือนก่อน +2

      I’ve always found z80 and x86 clunky, probably as i started with 6502 then 68k

    • @fnjesusfreak
      @fnjesusfreak 9 หลายเดือนก่อน +2

      x86 has common origins with Z80 - that's why their ASM look so similar.

    • @ArneChristianRosenfeldt
      @ArneChristianRosenfeldt 9 หลายเดือนก่อน +2

      @@andrewclegg9501 I hate this overflow flag pin hack on 6502. And IO range in zero page on later derivatives. Give me a 16 bit z address register instead to look for tape data at the normal location!

    • @herrbonk3635
      @herrbonk3635 9 หลายเดือนก่อน +3

      I don't understand your point really. What's wrong with a separate I/O-address space? To me, it feels pretty natural. And if you dislike complexity I cannot really see how you can be a 68000 fan... 😉

    • @melkiorwiseman5234
      @melkiorwiseman5234 9 หลายเดือนก่อน +1

      On the subject of protected mode, it occurred to me a year or two back that you could implement a "protected" mode on almost any CPU (but in particular on the Z80) by using bank switched RAM and some hardware to detect which banks were switched in and what addresses and ports were being accessed by what instructions and at what memory locations.
      Specifically, you could implement a NMI to force the program to return to the OS if the running program attempted to directly access ports, memory bank switching to any bank not assigned to it by the OS, or a direct call to an OS routine without going through the correct call address. Access to all of the above would have to be done by the program setting up data in registers and then jumping to a particular address (a-la CP/M) in order to perform the function via the OS.
      I wonder how often that was actually done IRL and how effective it was?

  • @lister_of_smeg6545
    @lister_of_smeg6545 9 หลายเดือนก่อน +2

    FD 70 xx is LD (IY+x), B
    FD 07 xx will do RLCA, followed by whatever instruction xx is.

    • @NoelsRetroLab
      @NoelsRetroLab  9 หลายเดือนก่อน

      Oh crap, did I reverse the opcode? Oops! You get what I meant though.

    • @lister_of_smeg6545
      @lister_of_smeg6545 9 หลายเดือนก่อน

      ​@@NoelsRetroLab Yep :) It's quite clever how they've chosen instructions that will put a particular register on the data bus, which can be snooped and read as a parameter by the Dandanator.

  • @ChrisWalshZX
    @ChrisWalshZX 8 หลายเดือนก่อน

    Fantastic Video.
    I'm a ZX Spectrum software developer and that's the best explanation I've seen for IN (C) and OUT (C) not having B included in the mnemonic. As a software developer, I often use LDI, LDIR, LDD, LDDR, but rarely have I ever used OUTI, OTIR, INI, INIR, just the regular IN/OUT, so the need for the counter versions hasn't really occurred.
    I'm glad modern hardware does full address decoding. :-)
    The Amstrad CPC info was new to me. Thanks.

  • @abyssal-space65
    @abyssal-space65 9 หลายเดือนก่อน +3

    Running a Z80 on breadboard at 10 MHz without issues, even doing hardware serial through a Z80 SIO chip... otherwise - very nice overview of IORQ :D

    • @NoelsRetroLab
      @NoelsRetroLab  9 months ago

      That's surprising. I never tried, but I would have thought that it would have all sorts of interference at that rate. Heck, look at my IORQ signal. It looks horrible already! :-)

    • @herrbonk3635
      @herrbonk3635 9 months ago +1

      Me too, although partly on veroboard, but using a vanilla 4 MHz rated Z80 (some manufacturers' "4 MHz" Z80s work at that speed, some don't). The same design was stable at 12 MHz too, when using a 6 MHz rated Z80. (Not really at 16 MHz though; that one was sensitive to glitches.) It very much comes down to using fast enough RAM and ROM!

  • @etchedpixels
    @etchedpixels 9 months ago +2

    The OUT behaviour for A and the 256-port side also comes from the 8080, which the Z80 was trying to be compatible with at some level. This is also why the IN A,(n) and OUT (n),A instructions don't affect flags but IN r,(C) does.
    Guess what the 8080A did when you did an OUT instruction? It put the contents of A on both the upper and lower halves of the address bus. This is why the Z80 has that behaviour: to be compatible with hardware that relied upon this (e.g. by decoding some I/O off each half of the bus to avoid the usual problem with fan-out limits on the low bits).
    The choice of having an actual I/O space with IN and OUT goes back to the 8008, and so presumably to the board of discrete logic in the terminal that it was supposed to replace.

    • @smf3472
      @smf3472 9 months ago +2

      Close, 8080 OUT/IN instructions took a byte which is the 8 bit port and that gets put on both bits 0-7 and 8-15 of the address bus, OUT puts the A register on the data bus and IN sets the A register from the value put on the data bus by the hardware. Zilog versions of those instructions are different, the 8 bit port goes on bits 0-7 and the A register is put on bits 8-15, which is pretty useless for the OUT instruction as it's also the value that will be put on the data bus.
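A minimal Python sketch of the address-bus behaviour described in this exchange. It only models which 16-bit value appears on the address bus during an I/O cycle; the function names are made up for illustration and do not come from any emulator.

```python
def io_address_8080(port: int) -> int:
    """8080 IN n / OUT n: the port byte is mirrored on A0-A7 and A8-A15."""
    port &= 0xFF
    return (port << 8) | port


def io_address_z80_immediate(port: int, a: int) -> int:
    """Z80 IN A,(n) / OUT (n),A: n goes on A0-A7, the A register on A8-A15."""
    return ((a & 0xFF) << 8) | (port & 0xFF)


def io_address_z80_c(b: int, c: int) -> int:
    """Z80 IN r,(C) / OUT (C),r: the whole BC pair goes on the address bus."""
    return ((b & 0xFF) << 8) | (c & 0xFF)


if __name__ == "__main__":
    print(hex(io_address_8080(0xFE)))                 # port mirrored on both halves
    print(hex(io_address_z80_immediate(0xFE, 0x12)))  # A in the high byte
    print(hex(io_address_z80_c(0x7F, 0xFD)))          # B in the high byte
```

With OUT (n),A the high byte is always the same value that is being written, which is why only the (C) forms are practical for 16-bit port decoding.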

  • @rebeccaabraham8652
    @rebeccaabraham8652 9 months ago

    Oh gods - this is a trip down memory lane! Used to work with the Hitachi 64180 - an enhanced Z80 - in assembly language - writing disk utilities…. Great fun - even if the company wasn’t up to much - and I then moved to GEC and started playing with unix workstations; but I’ll always have fond memories of the Z80 days!

  • @PebblesChan
    @PebblesChan 9 months ago +2

    The Microbee computer also uses A[8:15] as a supplementary data bus for 16-bit I/O instructions, and uses the IN instructions for some output port & register functionality. 😊

    • @herrbonk3635
      @herrbonk3635 9 months ago

      Yes, and many embedded systems too (including some of my own).

  • @emesde
    @emesde 8 months ago

    On MSX you have a choice: you can use I/O ports or memory-mapped I/O. It has a slot select mechanism where you can choose from 16 slots across 4 memory locations (16 KB pages). There is also some other I/O addressing in the standard which makes it possible to go beyond 256 I/O ports. All these things were standardized. I think that is probably why there are so many extensions for this system.

  • @Blitterbug
    @Blitterbug 9 months ago

    Interesting take on what seemed perfectly reasonable to me back in 1982! In fact I remember feeling that the 6502 was crippled by comparison, until I realised how much more work it could do per clock, and became somewhat of a 6502 convert. Z80 code still seems sensible, tbh.

  • @timhill9039
    @timhill9039 9 months ago

    Another peculiarity of the Z80 is the added IX and IY registers. The Intel 8080 (which the Z80 was designed to be a superset of) had various registers (BC, DE, HL) that could be used as 16-bit registers, but each could also be used as two 8-bit registers (B, C, D, E, H, L). However, the new IX and IY registers could only be 16 bits, not 2x8 bits. In most cases, anywhere in the Z80 instruction set you could use the HL register, you could instead use the IX and IY registers (very useful). It didn't take too long to notice that the opcodes for these IX and IY instructions were the SAME as the equivalent HL instruction, but with one of two opcode prefix bytes that changed the meaning of the next opcode to use either IX or IY instead. Armed with that, hackers (like me) experimented and discovered that if you added those opcode prefixes to instructions that accessed either the 8-bit H or L registers, the CPU would access the upper or lower 8 bits of the IX or IY registers. Again, very useful, and to this day I don't know why Zilog didn't document this feature (no doubt the instruction set designers didn't realize that this was how the chip design ended up operating).
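The prefix trick described here can be sketched in a few lines of Python. This is only an illustration: `with_index_prefix` is a made-up helper, and it covers just the register forms (instructions that use the (HL) memory operand also gain a displacement byte when prefixed, which this sketch does not model).

```python
IX_PREFIX = 0xDD  # prefix byte that redirects HL/H/L to IX/IXH/IXL
IY_PREFIX = 0xFD  # prefix byte that redirects HL/H/L to IY/IYH/IYL


def with_index_prefix(hl_opcode: bytes, register: str) -> bytes:
    """Turn an HL-based encoding into its IX or IY equivalent by prepending the prefix."""
    prefix = {"IX": IX_PREFIX, "IY": IY_PREFIX}[register]
    return bytes([prefix]) + hl_opcode


LD_HL_NN = bytes([0x21, 0x34, 0x12])  # LD HL,1234h (documented)
LD_H_N = bytes([0x26, 0x56])          # LD H,56h (documented)

if __name__ == "__main__":
    # DD 21 34 12 is the documented LD IX,1234h
    print(with_index_prefix(LD_HL_NN, "IX").hex(" "))
    # FD 26 56 is the undocumented load into the high byte of IY
    print(with_index_prefix(LD_H_N, "IY").hex(" "))
```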

  • @zxborg9681
    @zxborg9681 9 months ago +2

    Very interesting analysis. I always treated IO as still fundamentally based on the 256 byte space of the 8080/8085, and saw the extra B register output on the address MSB as just a half-implemented idea for an abortive 16-bit IO space architecture idea. But the Amstrad explanation is interesting, how they made use of it after all. Thanks for the deep dive.

  • @tmbarral664
    @tmbarral664 9 months ago

    Brilliant explanation, full of details ! Kudos!

  • @fluiditynz
    @fluiditynz 9 months ago

    I had an Amstrad CPC664 many years ago and rolled my own DIY expansion for it. Looking at the data book for the Z80, I expected to be able to use interrupt mode 2, the one with the most features (I connected a UART, for which I wrote a 68HC11 cross assembler to program 68HC11E2 chips, and inputs to scan a DIY mouse). Unfortunately, Amstrad chairman Alan Sugar had chosen interrupt mode 1? (it was a long time ago), which did not support real-time interrupts. As a consequence, I dropped down to polled encoder inputs and my mouse only worked very slowly. Early days!😆 I remember writing assembler routines accessed via Amstrad Locomotive BASIC and setting up BASIC-configurable screen colour redirects and an automatically cycling palette, then writing maths art 3D Z-scaled algorithms with some modulo cycling through the palette. So much fun, and my pot-head flatmates were awed by the psychedelic results! Writing the 68HC11 cross assembler was a great experience. It took me a month of my spare time and I was so pleased the assembler mnemonics were not too CISC; Motorola's micros had elegant mnemonics.

  • @scsirob
    @scsirob 9 months ago +2

    Great coverage of the subject, thanks! Using 'blank' or 'undocumented' opcodes is indeed a clever hack, but not without risk. There are Z-80 compatible CPUs that have additional opcodes in the unused space, such as the Hitachi HD64180. A program may auto-detect the use of that CPU by attempting an opcode sequence that behaves like you described on a genuine Z80, but has a different result on a Hitachi chip. Granted, the chances of running into such software on an Amstrad aren't that high ;)

  • @kensmith5694
    @kensmith5694 9 months ago

    IIRC, the ZX80 ran a line to the expansion connector that allowed you to prevent the address decode on the unit from doing its thing. This cost them very little extra hardware but also allowed some very creative things to be done with a ZX80.

  • @liontuga155
    @liontuga155 9 months ago

    Oh, the memories... Tried to make a 1 bit sound sampler of sorts with INIR back in the 80’s on my Spectrum. Good times! :D

  • @greenaum
    @greenaum 9 months ago +1

    The Z80 includes hardware to automatically refresh DRAM, where other CPUs needed extra hardware to do that. So it sends out the odd request to RAM that's not in the code it's running. If you were to put IO there, it might end up being triggered by those refresh cycles. Of course, a ROM chip won't mind an attempt to refresh it, it'll just do nothing. So it makes sense there's an extra IN / OUT instruction that just raises the right pins so some device can respond appropriately.

  • @gaku8108
    @gaku8108 9 months ago

    In the case of OUTI, B is decremented first.
    OUTI: B←B-1 (C)←(HL) HL←HL+1
    INI: (HL)←(C) B←B-1 HL←HL+1
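The ordering above matters because B also drives the upper half of the address bus. A rough Python sketch of one OUTI and one INI step, assuming memory is a plain dict and `read_port` is a hypothetical callback standing in for the hardware (flags are ignored):

```python
def outi(b, c, hl, memory):
    """One OUTI step: returns (new_b, new_hl, port_address, data_written)."""
    data = memory[hl]
    b = (b - 1) & 0xFF      # B is decremented first...
    port = (b << 8) | c     # ...so the already-decremented B appears on A8-A15
    hl = (hl + 1) & 0xFFFF
    return b, hl, port, data


def ini(b, c, hl, memory, read_port):
    """One INI step: returns (new_b, new_hl, port_address)."""
    port = (b << 8) | c     # the original B is on A8-A15 during the read
    memory[hl] = read_port(port) & 0xFF
    b = (b - 1) & 0xFF      # B is decremented after the port access
    hl = (hl + 1) & 0xFFFF
    return b, hl, port


if __name__ == "__main__":
    mem = {0x8000: 0xAA}
    print(outi(0x10, 0xFE, 0x8000, mem))                     # port high byte is 0x0F
    print(ini(0x10, 0xFE, 0x9000, mem, lambda p: p >> 8))    # port high byte is 0x10
```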

  •  9 months ago

    Sinclair did the same. There are ports actively being used in the ZX Spectrum 128 and the Amstrad-produced Spectrums as well: port #1FFD (+2A, +3x mem ctl), #7FFD (128 mem ctl), #BFFD (AY port), #FFFD (AY port).
    Another usable method is memory-mapped I/O. For example, on the ZX Spectrum there's a ROM in the first 16K of the address space. By pulling up the ROMCS signal (it has the required resistor already), the ROM chip will not get selected and the whole 0-$4000 range is now available for I/O, or for external ROM/RAM/whatever. It can even be triggered by a memory write to a location where the ROM resides. This is how most of the extensions for the Spectrum work.

  • @herrbonk3635
    @herrbonk3635 9 months ago +4

    _"Totally different"_ Really? The Z80 manual as well as other sources are all saying that the full BC register is placed on the address bus. And if you use OUT (C),B the B register appears on the data bus as well, of course. Not sure what you are making a big fuss about?

  • @taxessux
    @taxessux 9 months ago

    You also need to take into account that I/O instructions made it into the x86 realm, so far that I/O cycles were a special cycle on the PCI bus. The limited number of I/O addresses was a constant annoyance, mostly because a large swath was generally taken out of each page to account for aliasing of old ISA cards. The ability to have so many devices on the PCIe bus would cause legacy devices to run out of allocated I/O areas. I think that happened somewhere around 10 Ethernet controllers.
    Fortunately, when NOT using legacy access modes, we were able to scrounge up more I/O ranges. It was amazing to me how long the I/O ranges lasted in x86 CPUs. As a BIOS programmer, it was hell.

    • @boptillyouflop
      @boptillyouflop 9 months ago

      The fact that the PC survived so many massive changes (protected mode, 32bits, out of order, 64bits, PCI, GUI, the insane variety in SVGA cards and sound cards, 3d acceleration, multi-core, VGA, paging) is definitely some kind of miracle.

  • @antoninkolouch5161
    @antoninkolouch5161 9 months ago

    The standard Sinclair Interface-1 was also using M1 to capture rst #8 to swap shadow ROM with a different code to handle it. The rst #8 was called by the standard BASIC interpreter in any case of syntax error so it allows easy expansion of commands.

  • @laser31415
    @laser31415 8 months ago

    I wish some of these videos had been available in 1984; it would have saved me many, many hours of trial and error. Every now and then I still code on my old Z80-based TS2068. A few years ago I breadboarded (TTL logic only) my own dot matrix printer interface for it. That had been on my bucket list for 37 years.

  • @ehsnils
    @ehsnils 9 months ago

    I remember the fact that the Z80 couldn't be single stepped, which caused me some headaches in debugging once. Took a second or two before the processor went socks up and a restart was needed. I realized pretty soon what the problem was, but it still was a headache.
    Since I started with the Z80 processor the I/O it has seems normal to me.
    A more interesting aspect is to use "IN (BC),A" (sorry if I borked it a bit), but then you could use the instruction to create a matrix decoder of a keypad.

  • @cbmeeks
    @cbmeeks 9 months ago +4

    OK, now I really know why I prefer the 6502. I designed a 6502 SBC a few years ago and it was pretty simple. In the 6502, you can get 32K of RAM and 32K of ROM with only 256 bytes of IO with THREE TTL chips. So you don't have to waste a lot of RAM for IO.

    • @retrozmachine1189
      @retrozmachine1189 9 months ago +2

      Neither do you need to waste a lot of memory space for IO if you decide this is the IO scheme you want on a Z80 either rather than port based. Guess what, you can do the decode with a low number of TTLs too. Not saying your like of the 6502 is flawed of course, but your errr logic, is.

    • @fr_schmidlin
      @fr_schmidlin 9 months ago +2

      Please (*1) don't mistake the CPC nonsense with the Z80 architecture. If you want something more sensible, take a look at the SMS architecture or, even better, the MSX architecture. (The SMS still had some cheap shortcuts, since it was a legacy from earlier cheaper designs of the SG-1000)
      *1: Honest remark, no offense meant

    • @cbmeeks
      @cbmeeks 9 months ago +2

      @@retrozmachine1189 not sure what you meant by my logic being flawed. I wasn't putting the Z80 architecture down. I was simply stating you don't have to waste a lot of memory space for IO on a 6502 design.

    • @cbmeeks
      @cbmeeks 9 months ago +1

      @@fr_schmidlin Yeah, I figured this was more a strangeness of the CPC. No offense taken.

    • @smf3472
      @smf3472 9 months ago

      @@gppsoftware The 6502 was designed to compete with the intel 4004 or TTL, while the Z80 was an 8080 super set. So I think it's fair to say it was more primitive, however in many ways the design makes for much faster code. The 6502 zero page addressing allows for 256 8 bit registers, or 128 16-bit registers. You could argue that it's not really registers, but the equivalent z80 code is often slower. I've written an 8080 interpreter for 6502 and it takes average of 14 6502 clocks per 8080 clock. I have been looking at going the other way and it is nowhere near as efficient.

  • @jtsiomb
    @jtsiomb 9 months ago +1

    Using the high address byte during I/O is far from a unique oddity. The ZX Spectrum, which you mentioned as a counter-example, in fact does use the high byte as a key matrix row select: in a,(c) with c being FEh will load the state of the row selected by b into a.
    In fact, if you think the CPC is mad, you'll find the Spectrum utterly insane, with its resistor-split bus, garbage on the data bus during interrupts forcing you to fill all 256 and A HALF slots of the vector table with the same address, and that address has to have both its bytes identical... the ULA using a single bit to decode accesses, thus grabbing all the even I/O address space.... it's so bad ... but it was my first computer....
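As a rough illustration of the keyboard row select mentioned above, here is a Python sketch that works out which Spectrum half-rows a given value of B selects when reading port FEh. The key layout table is written from memory of the commonly published keyboard map and should be treated as an assumption; the function names are invented for this example.

```python
# Half-row wired to each high address line; a 0 bit in B selects that half-row.
HALF_ROWS = [
    "CAPS SHIFT Z X C V",        # A8
    "A S D F G",                 # A9
    "Q W E R T",                 # A10
    "1 2 3 4 5",                 # A11
    "0 9 8 7 6",                 # A12
    "P O I U Y",                 # A13
    "ENTER L K J H",             # A14
    "SPACE SYMBOL SHIFT M N B",  # A15
]


def keyboard_port(b: int) -> int:
    """16-bit port address seen on the bus for IN A,(C) with C = FEh."""
    return ((b & 0xFF) << 8) | 0xFE


def selected_half_rows(b: int) -> list:
    """Half-rows whose address line is low for this value of B."""
    return [HALF_ROWS[bit] for bit in range(8) if not (b >> bit) & 1]


if __name__ == "__main__":
    print(hex(keyboard_port(0xFB)), selected_half_rows(0xFB))
    print(len(selected_half_rows(0x00)), "half-rows scanned at once with B = 0")
```

Setting several bits of B low at once scans several half-rows together, which is a common way to check for "any key pressed".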

    • @NoelsRetroLab
      @NoelsRetroLab  9 months ago

      Maybe that will have to be part 2 😃

  • @r00tyschannel52
    @r00tyschannel52 9 months ago

    Having very recently written a Z80 emulator (not deliberately, but more as a proof of concept for a framework for designing CPU emulation), I found that documentation is an interesting thing. The actual Zilog manual is verbatim the same as the one you showed, and this was clear to me: for OUT (C) you just slap BC onto the address lines, and for OUT (n) you put the A register onto the high-order bits.
    But! Try finding some consistent documentation for how the Half Carry flag is handled, especially when it comes to the ADC/SBC instructions! That was a rollercoaster ride. For anyone embarking down this route, it's actually not that bad.
    For 8-bit arithmetic operations, take the first operand, put it into 16-bit storage and & it with 0xf. Then add (or subtract) each of the other values to it one at a time (INCLUDING THE 1 for carry set on the ADC/SBC instructions), with & 0xf applied to each. So if you have 1f + 2b + 49 you would add f + b + 9 = 23. At the end, if the result is above 0xf, set half carry. For 16-bit operations you do pretty much the same, except you use & 0xfff on each value, and at the end, if the value is > 0xfff, set half carry.
    Finding a single place with this actual information was a nightmare.
    Also, I want to say the ZX Spectrum does NOT only use the low-order values. While it's true the ULA is addressed by 0xFE, it does look at the high-order bits when reading the keyboard. The high bits address the 5 keys you want to scan right now.
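The half-carry recipe described in this comment can be written down as a short Python sketch. It covers addition only (ADD/ADC and their 16-bit forms); the function names are invented for illustration, and subtraction is left out.

```python
def half_carry_add8(a: int, b: int, carry_in: int = 0) -> bool:
    """Half carry for 8-bit ADD/ADC: carry out of bit 3 of the low nibbles."""
    return ((a & 0x0F) + (b & 0x0F) + carry_in) > 0x0F


def half_carry_add16(a: int, b: int, carry_in: int = 0) -> bool:
    """Half carry for 16-bit add: carry out of bit 11 of the low 12 bits."""
    return ((a & 0x0FFF) + (b & 0x0FFF) + carry_in) > 0x0FFF


if __name__ == "__main__":
    print(half_carry_add8(0x0F, 0x01))      # low nibbles overflow
    print(half_carry_add8(0x0E, 0x01))      # no overflow without carry
    print(half_carry_add8(0x0E, 0x01, 1))   # the incoming carry tips it over
    print(half_carry_add16(0x0FFF, 0x0001))
```

Masking before adding keeps the upper bits from hiding the nibble overflow, which is the point the comment makes about applying & 0xf to every value, including the carry.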

  • @cthutu
    @cthutu 9 months ago

    If I recall, IN A,(nn) and OUT (nn),A instructions put the A register on the upper 8 bits along with nn for lower 8 bits.

  • @Frisky0563
    @Frisky0563 9 months ago

    I would want a PCB. What a mess I learned about the Z80 in School and switched to MC6800. I enjoyed your video very much 🎈

  • @ingmarm8858
    @ingmarm8858 9 months ago +2

    I love LDIR, have since the 1970's 🙂

  • @DouglasFish
    @DouglasFish 9 months ago +4

    I think Ben would still be proud

    • @thek3743
      @thek3743 9 months ago

      No. Ben's videos are much easier to follow.

  • @michaelmoorrees3585
    @michaelmoorrees3585 9 months ago

    I spent the bulk of the 1980s using the Z80 almost exclusively, with a little deviation using the 8086. But I used mostly Intel peripheral chips (8251, 8253, 8255, & 8259) with that Z80. Yeah, a little extra TTL "glue" to mate the signaling. Into the 90s, I went to mostly x86 when a full processor was needed, and microcontrollers when it was simpler, starting with the HC05, since it was a joy to code in assembly. I moved over to the AVR line, with a transient pass through a few 8051 projects.
    Dollar-wise, the Z80 was still the biggest selling processor well into the 1990s. This was when an individual Z80 sold for under a buck, compared to x86, including early Pentiums, selling in the $200 range each!

  • @Bunny99s
    @Bunny99s 9 months ago

    As far as I remember the classical gameboy also used a slightly modified Z80 which didn't use the in / out instructions at all. Everything in the gameboy was memory mapped. Input, synthesiser, graphics output and even the "network link". I once wrote a simple assembler / disassembler to read and write gameboy roms in Delphi back in the days. I toyed around a bit with a gameboy emulator but nothing serious. Though it was a fun time. Since I'm kind of a "data-messi" I'm sure I still have the assembler somewhere as well as the GB emulator (I think it was "rew") ^^.

  • @matthouben4242
    @matthouben4242 9 months ago +3

    A few remarks:
    1. The loading of BC instead of just C on the address bus.
    This is most likely caused by the fact that the address bus is driven by one 16 bit internal register, that cannot be divided into an upper and lower 8 bit. So it must be supplied with a 16 bit value. That is why they have to use BC, not just C.
    Next question would be: why not OUT (BC),r instead of OUT (C),r ?
    I guess that they really only wanted an I/O space with 256 addresses, and the fact that B also ends up on the address bus is a side effect. It works just fine like that. Also: if you would use (BC),r, things like OUTI, INI, OTIR etc. would no longer work properly (as you stated yourself).
    Conclusion: it is really designed as a 256-port I/O space addressed by register C. The fact that B also ends up on the address bus is an irrelevant side effect. What Amstrad did is a travesty.
    2. LDI, LDIR, LDD, LDDR
    These block move instructions are superb: the first time a microprocessor had SIMD (Single Instruction, Multiple Data) instructions. They are not strange and awkward, but a step ahead in microprocessor design.
    Coming from the Z80, these SIMD instructions are one of the things I really missed on the 6502. Another is the lack of 16 bit memory pointer registers.
    3. Why I/O mapped I/O at all?
    Back then, memory space was scarce. Sacrificing pieces of your precious memory address space for I/O was a waste. You could argue that it would only take 256 bytes at most, but in those days hardware was expensive, and to really lose only 256 bytes you would need additional address decoding logic. This was often not done, so you lost far more than 256 bytes to the memory-mapped I/O and, as a side effect, the same I/O device could be addressed via multiple memory addresses.
    4. Using empty spaces in the opcode table
    As you pointed out, there are a lot of empty spaces in the opcode table when you have extended (2 bytes) opcodes. However, this is a risky business, as a lot of these unused opcodes actually really do something. These are known as "undocumented instructions" and the Z80 has quite a lot of them, and they are occasionally used as real instructions in software. So you must be very careful, and that makes it IMHO a terrible hack.
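Point 3 in the list above (incomplete decoding making one device answer at many addresses) can be illustrated with a small Python sketch. The masks and values below are hypothetical examples, not taken from any particular machine.

```python
def device_selected(address: int, mask: int, value: int) -> bool:
    """A device is selected when the address bits it actually decodes match."""
    return (address & mask) == value


def alias_count(mask: int, value: int) -> int:
    """How many of the 65536 addresses select the device with this decoding."""
    return sum(1 for address in range(0x10000) if device_selected(address, mask, value))


if __name__ == "__main__":
    # Full 16-bit decode of a single register at a hypothetical address C000h.
    print(alias_count(0xFFFF, 0xC000))
    # Decoding only the top 8 address bits: a 256-byte page is consumed.
    print(alias_count(0xFF00, 0xC000))
    # Decoding only A15 and A14 (cheapest logic): a whole 16 KB block is consumed.
    print(alias_count(0xC000, 0xC000))
```

The fewer address lines the decoder looks at, the cheaper the logic and the more address space the device swallows, which is the trade-off the comment describes.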

    • @NoelsRetroLab
      @NoelsRetroLab  9 months ago +1

      Great points all around! Thanks!

    • @Foersom_
      @Foersom_ 9 months ago

      @matthouben4242 Good comment, completely agree.

    • @smf3472
      @smf3472 9 months ago +2

      The Z80 uses I/O mapped I/O because the 8080 did; this was a decision that came from the 8008 and was influenced by the 4004. It was less about memory space being scarce and more because it just made that particular design easier. The Intel 4004 especially was a very odd design, due to pin count limitations.
      It's rather odd that Zilog didn't document all the IX/IY prefix opcodes. They only documented the opcodes that work on DE & HL as 16bit values. But you can also access the IX/IY registers as 8 bit opcodes. That is if you can live with the speed penalty of using IX/IY opcodes.

    • @matthouben4242
      @matthouben4242 9 months ago

      @@smf3472 I/O mapped I/O vs. memory mapped I/O goes way further back than microprocessors. Yes, the Z80 had I/O mapped I/O because of its 100% code compatibility with the 8080, and intel was the one to choose I/O mapped I/O.
      Neither way of doing I/O is really superior to the other. What can complicate memory mapped I/O is the usage of cached memory. You cannot cache I/O operations, so you must keep track to not cache the memory parts that are used for I/O (difficult if the cache is inside the CPU) or use dedicated memory operations/instructions that are not cached.
      Regarding the undocumented instructions: there is a nice theory about how this came to be in the video below:
      th-cam.com/video/DLSUAVPKeYk/w-d-xo.htmlsi=Xy1wKWptsgwUOUSy&t=867

    • @smf3472
      @smf3472 9 months ago

      @@matthouben4242 I'm not going to say I/O mapped I/O was invented by Intel, but they had wanted to make it easy to upgrade from the 8008 to the 8080. So the decision goes at least as far back as the 8008. If you look at the Intel 4004 and the origins of the 8008, then you would be surprised if Intel had chosen memory mapped I/O for the 8080. The 6502, on the other hand, was designed to be compatible with the 6800, which used memory mapped I/O (which made sense for the 6800).

  • @alanclarke4646
    @alanclarke4646 9 months ago

    Hi Noel. The CPC printer port only requires 1 address, and in fact the OS only uses &EF00 to communicate with the printer. So a simple mod to the printer port so that the printer only receives the strobe signal if A7 is low means that a new peripheral can listen for A12 low, A7 high and has 128 addresses to choose from. Or the OS could be modified to use &EC00, printer port mod to listen for A10 low, and we have 512 I/O addresses available. Even better, if you don't intend to use a printer the whole of &EC00 to &EFFF can be used for a total of 1024 addresses.

  • @turbinegraphics16
    @turbinegraphics16 9 months ago +1

    Very interesting to see how Amstrad does it, as the Master System does it in a quicker, more efficient way, with the VDP doing most of the work.

  • @briannebeker2119
    @briannebeker2119 9 months ago +1

    LD instructions always read and write data; it is just a matter of what the source and destination are. That is why there are no separate read and write instructions.
    The IO instructions are much more limited, as they were an add-on to the instruction set and don't need numerous addressing modes like the LD instruction; they are targeted at what the function is. IO instructions were meant to use 8-bit addresses, but there are 16 address bits and they need to be set to something. Using B as the upper address byte means you can use 16-bit addresses for IO if you choose.

  • @the-pink-hacker
    @the-pink-hacker 9 months ago

    This makes me so thankful that the TI-84 (eZ80) doesn't use the IN or OUT instructions at all. Good ol' memory mapped hardware.