C and Assembly Language: How To!

  • Published on 7 Jan 2025

Comments •

  • @Clank-j6w
    @Clank-j6w 1 year ago +91

    "Hey it's Dave, operating system developer for the Kim-1"

  • @spazda_mx5
    @spazda_mx5 1 year ago +45

    I understood about 5% of this, but still enjoyed it and look forward to more 😊

    • @milk-it
      @milk-it 1 year ago +4

      Keep watching and reading up on it. It's like learning a foreign language. Once you've seen, heard and practiced it enough, you'll be fluent in it!

    • @lucasgerosa4177
      @lucasgerosa4177 1 year ago +4

      Just watch it 19 more times then

    • @NoOne-ev3jn
      @NoOne-ev3jn 1 year ago

      I understood 0.5% of your 5% 😅

    • @EIsenah
      @EIsenah 1 year ago

      Same over here

    • @kcvinu
      @kcvinu 7 months ago

      I also had a keen interest in assembly language. Finally got my hands on Microsoft's MASM assembler. Created a DLL and used it in Python. It had some nice functions to create a window. But it ran 4 times slower than DLLs I made in other languages like C. With that, I just stopped my MASM endeavor.

  • @SquallSf
    @SquallSf 1 year ago +6

    ClearScreen is very unoptimized!
    1. You need to clear 320x200/8 = 8000 bytes, but your program clears 8192. That is 192 extra passes through the inner loop, which is a lot of wasted work on a 6502.
    2. Handling the outer loop is very slow! If you do the math (it is obvious in the code too), the outer loop runs 32 ($20) times. So you could optimize that by:
    LDX #32
    ...
    DEX
    BNE :-
    This way you will reduce the instructions in half (from 2 to 1), and the generated code will be just 1 byte (instead of 4 as it is now).
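
The arithmetic behind both points can be checked with a small C sketch. The constant and function names are illustrative assumptions, not taken from the video's source, and the 32-page outer loop is assumed from the comment above.

```c
#include <assert.h>

/* Illustrative constants; the names are assumptions, not from the video's source. */
enum {
    SCREEN_WIDTH  = 320,   /* pixels per row */
    SCREEN_HEIGHT = 200,   /* pixel rows */
    PAGE_SIZE     = 256,   /* bytes covered by one full pass of the 8-bit Y index */
    PAGES_CLEARED = 32     /* outer loop count ($20) described in the comment */
};

/* Bytes actually needed for a 1-bit-per-pixel bitmap. */
int screen_bytes(void)
{
    return SCREEN_WIDTH * SCREEN_HEIGHT / 8;
}

/* Bytes written by a loop that clears whole 256-byte pages. */
int bytes_cleared(int pages)
{
    assert(pages > 0);
    return pages * PAGE_SIZE;
}

/* Stores performed beyond the visible screen. */
int wasted_stores(void)
{
    return bytes_cleared(PAGES_CLEARED) - screen_bytes();
}
```

Calling `wasted_stores()` gives the 192 extra stores mentioned in point 1.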

  • @topperdude2007
    @topperdude2007 1 year ago +13

    Fun project! Reminds me of when we did something similar back in the early 90's (our undergrad days) - buddy and I started by reverse engineering and writing an application that did the same thing as Norton Disk Doctor in C. This was in the pre-Windows days and on the old (state of the art back then) 80286 based computer with floppy disks since PCs were not as widely available in our country back then and we had to do all the development after school hours by reserving computers in our school's lab (unless the seniors wanted it for their projects).
    Anyways, once we re-wrote the entire thing in Assembly, boy was it fast! Much faster than Norton's version and about 2.x times faster than our own C based version. Love watching these videos - brings back some nostalgic memories. Thank you, Sir! 👍

    • @sjococo
      @sjococo 1 year ago

      Yes, good old times. A 60 KB RAM disk in video card memory, to make use of unused video RAM while in text mode, to name just one remarkable program.

    • @milk-it
      @milk-it 1 year ago

      Awesome stuff, dude!

  • @rjy8960
    @rjy8960 1 year ago +1

    I had a Commodore 16 when I was a kid and spent a lot of time messing with the monitor and writing assembly with it and began playing with self modifying code. It was instrumental in me building a love for coding in assembly which I did professionally for quite a few years albeit with 4-bit and other 8-bit families primarily.
    I've spent time with higher level languages but never developed the same affection that I have for assembly.

  • @RonaldvanderPutten
    @RonaldvanderPutten 1 year ago +2

    I'm just smiling... old school stuff... back to the good ol' days!

  • @milk-it
    @milk-it 1 year ago +4

    Love it. These small projects in C and Assembly on the older hardware along with the makefile overview are short and sharp enough to practice with, even if I have to convert it to 680xx Assembly and C on the Amiga. The structure and procedures are essentially the same. Thanks, Dave!

  • @morganskinner3863
    @morganskinner3863 1 year ago +51

    The Z80 has an instruction that makes setting a load of bytes to the same value easy, so the CLS function in Z80 is effectively (once the registers have been set up) a single opcode. I used this to amaze my computing teacher in 1981. Happy days!

    • @JohnnieWalkerGreen
      @JohnnieWalkerGreen 1 year ago +4

      Forty years after using the MPF-1, I remember that the "LD HL" opcode is "21", and "LD DE" is "11".

    • @SerBallister
      @SerBallister 1 year ago +2

      How do interrupts work with such a long instruction? Is the whole opcode uninterruptible? Or does the Z80 have special handling to resume the opcode halfway?

    • @morganskinner3863
      @morganskinner3863 1 year ago +5

      To zero out 1K at 0x4000, you would do this…
      LD HL, 0x4000
      LD DE, 0x4001
      LD BC, 0x03FF
      LD (HL), 0
      LDIR
      I think LDIR is interruptible, but don't know for sure; it's 40-odd years since I used it in anger!
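
For readers without a Z80 to hand, here is a hedged C model of how LDIR can act as a fill. LDIR copies the byte at HL to DE, increments both, and decrements BC until it reaches zero, so a fill works by seeding the first byte and pointing DE one byte past HL. The function names and the flat 64 KB `mem` array are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the Z80 LDIR instruction over a flat 64 KB memory image:
   copy (HL) to (DE), increment both, decrement BC, repeat until BC is 0. */
void ldir(uint8_t *mem, uint16_t hl, uint16_t de, uint16_t bc)
{
    do {
        mem[de] = mem[hl];
        hl++;
        de++;
        bc--;
    } while (bc != 0);
}

/* Fill `count` bytes starting at `start` with `value` using the LDIR idiom:
   seed the first byte, then let the overlapping copy carry it forward. */
void fill_with_ldir(uint8_t *mem, uint16_t start, uint16_t count, uint8_t value)
{
    assert(count >= 2);   /* BC = 0 would make LDIR copy a full 64 KB */
    mem[start] = value;   /* LD (HL), value */
    ldir(mem, start, (uint16_t)(start + 1), (uint16_t)(count - 1));
}
```

With `start = 0x4000` and `count = 0x0400`, the model clears exactly 1 KB and leaves the bytes on either side untouched.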

    • @JohnnieWalkerGreen
      @JohnnieWalkerGreen 1 year ago +5

      @@SerBallister Z80 completes the current instruction before servicing the interrupt, even if it is interrupted in the middle of its execution.

    • @milk-it
      @milk-it 1 year ago +1

      Nice going.

  • @naukowiec
    @naukowiec 1 year ago +1

    Brings back good memories of writing hardware-specific code, and lots of days debugging interpreted languages by looking at machine code.

  • @mcmaddie
    @mcmaddie 1 year ago +1

    Haven't mixed C and asm since mid 90's. Brings back memories.

  • @d.jensen5153
    @d.jensen5153 1 year ago

    I liked this a lot! 6502 assembly on the Apple II is where my computer education began. Assembly and C are still what I enjoy the most. It's the proliferation of platforms that has kept me busy.

  • @alphabasic1759
    @alphabasic1759 1 year ago +5

    Personal opinion… multi-language (and in particular C and assembly) is very powerful. Used this combo heavily back in the early 80s.

  • @stepannovotny4291
    @stepannovotny4291 1 year ago

    Thank you! I have done this in a hacky way in SDCC +SDAS so I am delighted to see some additional perspectives on this sort of thing.

  • @toast_on_toast1270
    @toast_on_toast1270 1 year ago +1

    Very nice, only looked at assembly briefly in 1st year cs when I wasn't paying attention - now I understand the beauty of it

    • @toby9999
      @toby9999 1 year ago

      There is beauty to low level coding with asm and C that I miss in these days of bloatware.

  • @stonedhackerman
    @stonedhackerman 1 year ago +2

    Awesome! I still think stories/explanations of Windows internals were the best and most interesting content on this channel, but this is awesome too.

  • @Davemte34108
    @Davemte34108 1 year ago

    Brings back memories of the coding I did during the 80's and 90's in a steel mill in Northwest Indiana.

  • @RonZuckerman
    @RonZuckerman 1 year ago +2

    Good stuff, Dave! I never worked in 6502 assembly, but it looks close enough to old 8-bit micros that I programmed in the past that I was able to follow the code pretty well.

  • @SEEMERIDECOM
    @SEEMERIDECOM 1 year ago +44

    Would love to see this ASM version running next to the C only version for comparison.
    When I rewrite something in ASM I always keep timings. Even when I'm optimizing I start with the slow code and as I make each improvement, I verify if it's actually an improvement. Sometimes, you make it slower.

    • @Ittiz
      @Ittiz 1 year ago +3

      seconded

    • @greatwolf.
      @greatwolf. 1 year ago +2

      third. Would also be interesting to compare the assembly generated by the compiler vs the hand-written version. Is it still substantially faster even with `-O3`? Where would the speedup come from?

    • @TAP7a
      @TAP7a 1 year ago +3

      Turns out modern compilers are wicked smaht

    • @kayakMike1000
      @kayakMike1000 1 year ago

      Well, try to do count down loops instead of count up.
      Depends on which compiler.

    • @kayakMike1000
      @kayakMike1000 1 year ago +2

      @@greatwolf. godbolt!

  • @jacoblf
    @jacoblf 1 year ago

    This is great. Thanks. I love how you don't waste time getting into a project.

  • @muddyexport5639
    @muddyexport5639 1 year ago +2

    Cool code and explanation. I like the conciseness. Efficient...

  • @TheRojo387
    @TheRojo387 1 year ago

    Bing Chat taught me a handy trick for even faster compilation: inlining machine code directly instead. Hand-assembling code might seem tedious at first, but it's far more rewarding once you get the hang of how it's done, and the memory mapping of your hardware. The technique is no secret, and involves setting a function pointer to your hand-assembled snippet, causing it to be executed as machine code whenever that function is called.

  • @ronaldroe4548
    @ronaldroe4548 1 year ago +1

    Can't speak for anyone else, but I love these videos.

  • @carltone
    @carltone 1 year ago

    Thanks for creating this well-done video. It was a trip down memory lane for me. My programming career began in the late 70s and early 80s writing 8080, then 8085 assembler (better ICE) using an Intel MDS210. I was building industrial apps on single-board processors. I vaguely remember using the linker/locator to place code at specific addresses (read: EPROMs). I had one EPROM that I could interchange with variable data. The blurry good old days. 😊

    • @rwatson2609
      @rwatson2609 1 year ago +1

      Ha, I remember doing this as well back about then on the Intel 8039 microcontroller series. Built my own eraser from a bug zapper and made a programmer from a ton of switches and discrete components, but just like you, they are pretty fuzzy memories. I had lots of time on my hands back then.

  • @fbodirector7464
    @fbodirector7464 1 year ago +1

    This is exactly the type of content I subbed for.

  • @mhoover
    @mhoover 1 year ago

    Back in the 80s I did this a lot with IBM BASIC and 8088 assembly. Man, I'm old!

  • @chrisdixon5241
    @chrisdixon5241 1 year ago

    Great video Dave, I'm really enjoying this series.
    In the pursuit of performance, I'd probably have unrolled those loops from the start at the expense of a little more memory, but seeing it run it looked fast enough!
    Great work!

    • @DavesGarage
      @DavesGarage 1 year ago

      Thanks! The C version was much slower, so it was worth the work!

  • @BleuSquid
    @BleuSquid 1 year ago

    You take back what you said about Makefiles! I frikken love them.

    • @DavesGarage
      @DavesGarage 1 year ago +1

      They're like a redheaded stepchild. You love them, but...

  • @RobertLBarnard
    @RobertLBarnard 1 year ago +3

    Thanks for going through this demo.
    Curious how you became familiar with MOS & the KIM kit?
    I worked on the Motorola 680x line for an industrial process control company (we also used DEC VAX and PDP on top of it all); fun times chasing bits. But I never played with the MOS stuff, though I remember it was well regarded.
    Oh, the funnest part of that job was bootstrapping up an HP 3060a automated test system (bed of nails, HP-IB, HP "calculator", the whole she-bang!). Nearly 40 years later... I "think" I miss it. No, probably not.

    • @DavesGarage
      @DavesGarage 1 year ago

      I never saw one in the day, but they're so similar to the PET and C64 in general architecture that a KIM is like a "mini PET" in a way!

  • @DanielFSmith
    @DanielFSmith 1 year ago +2

    You can save 1 cycle/byte by using absolute addressing for STA instead of indirect. (Assuming you don't object to self-modifying code, and that your code's not going into ROM.)

    • @nkronert
      @nkronert 1 year ago +3

      As a kid doing assembly programming on the C64, I was somehow "scared" of these indirect addressing modes, so I would always do those self-modifying code loops. Never spent time to realize that in ROM this wouldn't work 😊

  • @dougpark1025
    @dougpark1025 1 year ago

    I have only very rarely written any assembly. In a case like this I would probably start by looking at the code the C compiler generated and then start counting clock cycles to see if there are any obvious optimizations that could be made.
    It would be useful to provide an example of how C passes arguments into a function in assembly as well.

  • @gerakore8948
    @gerakore8948 1 year ago

    i remember using assembly to speed up a 3d engine i wrote in qbasic. worked really well actually. fun times. the assembly basically pushed memory straight to the video memory. this allowed me to stack different layers together beforehand so there was no need to clear the screen.

  • @Fbiman93
    @Fbiman93 1 year ago

    Love your shirt. I also love your explanation of C and assembly, two things I really want to learn. 👍

  • @rickmellor
    @rickmellor 1 year ago

    My favorite channel. I click “like” before “play”. 😂

  • @treeturtle9378
    @treeturtle9378 1 year ago +2

    Love the shirt Dave 👍

  • @andrewdunbar828
    @andrewdunbar828 1 year ago

    Good stuff Dave. Might've been good to mention why name mangling is even needed/done for C at all.

  • @igot64problems42
    @igot64problems42 1 year ago

    7:37 Yes, that's a bug - it will not store 0 at location SCREEN+$1E00. It will store 0 at SCREEN+$1EA0 on the first iteration but it will also store 0 at SCREEN+$1F40 which is the first byte just off the screen. There's a couple of ways you could fix this. One way would be to start Y at 0 then do INY and CPY #$A0. Executing the compare on every iteration will slow things down slightly.
    Another way would be to use the Negative flag. To do this, you'll have to unwind the loop into 4 instead of 2 so that Y can start at a positive number; a value less than 128:
    ldy #$4F
    : sta SCREEN+$1EF0,y
    sta SCREEN+$1EA0,y
    sta SCREEN+$1E50,y
    sta SCREEN+$1E00,y
    dey
    bpl :-
    When Y rolls around from 0 to $FF, it's a negative number and it won't branch.
    A quick fix to the problem would be to just do STA SCREEN+$1E00 straight after the loop.
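
The off-by-one can be reproduced on any machine by modeling the loops in C. This is a sketch under assumptions: the "original" loop is reconstructed from the comments in this thread (ldy #$A0, two stores, dey, bne), offsets are relative to SCREEN, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets are relative to SCREEN. The bitmap is 8000 bytes (0x0000-0x1F3F),
   so offset 0x1F40 is the first byte past the screen. */
enum { SCREEN_SIZE = 0x1F40 };

/* Loop as described in the thread: ldy #$A0 / sta / sta / dey / bne.
   `screen` needs SCREEN_SIZE + 1 bytes so the stray store can be observed. */
void clear_original(uint8_t *screen)
{
    uint8_t y = 0xA0;
    assert(screen != 0);
    do {
        screen[0x1E00 + y] = 0;
        screen[0x1EA0 + y] = 0;
        y--;                      /* dey */
    } while (y != 0);             /* bne: exits before Y = 0 is ever used */
}

/* Four-way unrolled fix from the comment: ldy #$4F ... dey / bpl. */
void clear_bpl(uint8_t *screen)
{
    uint8_t y = 0x4F;
    assert(screen != 0);
    do {
        screen[0x1EF0 + y] = 0;
        screen[0x1EA0 + y] = 0;
        screen[0x1E50 + y] = 0;
        screen[0x1E00 + y] = 0;
        y--;                      /* dey: 0 wraps around to $FF */
    } while ((y & 0x80) == 0);    /* bpl: loop while bit 7 (the negative flag) is clear */
}
```

Running `clear_original` on a buffer pre-filled with a marker value leaves offset 0x1E00 untouched and overwrites offset 0x1F40, while `clear_bpl` clears exactly offsets 0x1E00 through 0x1F3F.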

  • @keptil
    @keptil 1 year ago

    It's been like 2 months since I watched one of your videos, only to come back and find that you've grown a Jedi Master beard.
    I'm okay with this, Obi Wan Dave..obi...
    I'll figure out a jedi name for you eventually.

  • @H2Obsession
    @H2Obsession 1 year ago

    Great video. Regarding 07:30, the scroll screen will fail to clear $BE00 (= SCREEN+$1E00) like you suspected. A fixed, easy-to-understand version is:
    lda #0
    ldy #0
    : sta SCREEN+$1E00,y
    sta SCREEN+$1EA0,y
    iny
    cpy #$A0
    bne :-
    However this adds an extra instruction (cpy #$A0) inside the loop which is bad for speed. You wrote it as assembly for speed right? A better but harder-to-understand version is:
    lda #0
    ldy #$A0
    : dey
    sta SCREEN+$1E00,y
    sta SCREEN+$1EA0,y
    bne :-
    This runs at same speed as original, and doesn't miss clearing address $BE00. It looks weird because DEY instruction appears 3 lines above the BNE instruction... but it works because the two STA instructions do not affect any flags.
    Oh the joys of assembly language!
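
The decrement-first version relies on STA leaving the flags alone, so BNE still sees the zero flag set by DEY. A minimal C model of that loop, with offsets relative to SCREEN and an illustrative function name, looks like this:

```c
#include <assert.h>
#include <stdint.h>

/* Model of: ldy #$A0 / dey / sta SCREEN+$1E00,y / sta SCREEN+$1EA0,y / bne.
   Offsets are relative to SCREEN; the function name is an assumption. */
void clear_dey_first(uint8_t *screen)
{
    uint8_t y = 0xA0;             /* ldy #$A0 */
    assert(screen != 0);
    do {
        y--;                      /* dey sets the zero flag when Y reaches 0 */
        screen[0x1E00 + y] = 0;   /* sta does not change any flags */
        screen[0x1EA0 + y] = 0;
    } while (y != 0);             /* bne tests the flag left behind by dey */
}
```

Because Y runs from $9F down to 0 inside the stores, both 160-byte halves are covered and nothing past offset 0x1F3F is written.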

  • @deevs3973
    @deevs3973 1 year ago

    Love the video. Takes me back to DOS programming, especially with the make files. I did Clipper (Compiled DBase/Foxbase) database programming with C and ASM function libraries I coded to provide things missing in the language that I needed.
    BTW.. Be aware of replies from imposters posing as Dave. I've seen a few on here.

  • @NoX-512
    @NoX-512 1 year ago +1

    Yes, it's a bug. If my memory serves, STA doesn't affect the status flags, so you could move DEY to the start of the loop. Otherwise, you can subtract 1 in your address calculation (SCREEN + $1E9F, SCREEN + $1DFF).

  • @An.Individual
    @An.Individual 1 year ago

    In the meantime and in-between time, I'm waiting for your next video. 😁

  • @JeremyNasmith
    @JeremyNasmith 1 year ago +6

    I love the work you're up to for the KIM-1. Is your end goal to give it most or all of the functionality we expect from other Commodore machines' Kernals? That would be a great series indeed.

    • @DavesGarage
      @DavesGarage 1 year ago +6

      That's sort of what I've been tinkering with, but with no keyboard interface, not sure how far I'll take it!

    • @lwilton
      @lwilton 1 year ago +1

      @@DavesGarage I've got an old KIM-1 single board computer, stock except that I increased the memory by something like 8K, if I recall. I also have a wide-carriage IBM Executive typewriter with a pin-feed platen that GE hacked with solenoids on the type bars, and leaf switches under the keys. It was originally used on a computer monitoring the tin plating line at Kaiser Steel. I kludged up a board to plug onto the PIA outputs that would monitor the switches and drive the solenoids, and had myself a teleprinter. It was pretty much useless, but I could do keyboard/printer I/O.

    • @chinesepopsongs00
      @chinesepopsongs00 1 year ago +1

      @@DavesGarage I have a 6502-based board that was used in the late 80s and early 90s for home automation, before the term was invented. It was mainly used in hospitals and prisons to automate and secure things. The board has a ton of I/O, both serial and parallel. It does not have normal video out, but it has a header for an LCD display; I have a 2-line model attached to it. It has 32 KB RAM, 16 KB ROM, and 16 KB reserved for I/O space. I started to write an operating system for it somewhere in the 90s, but it was never completed as I lost interest. I feel like I should dig the thing out of the attic and finish it. The reason I have it is that it is the second revision of that board, and I was the one who redesigned it. The first generation had 16 KB RAM and 8 KB ROM and some other minor differences.

    • @Davemte34108
      @Davemte34108 1 year ago

      @@lwilton Sounds about right, while working at LTV's Indiana Harbor Works (former Youngstown), GE was the main contractor for the rolling mills, similar things happened.

  • @rbolo29
    @rbolo29 1 year ago

    I don't understand too much of this, but appreciate the effort.

  • @wkjagt
    @wkjagt 1 year ago

    Really cool video! I've never tried mixing C with assembly, but now I want to :-). About local labels in ca65 (I think what you're using are actually called unnamed labels), you can also branch two (or more) colons ahead with `:++`, `:+++` etc. Same thing when branching back multiple labels, you add multiple minuses. Local labels are also a thing by the way. They're labels that start with a @ (at sign), and they're only visible between two non-local labels. This is handy when you want to reuse common label names like @loop, @done, etc.

  • @patrickmcginnis7
    @patrickmcginnis7 1 year ago

    wow, ok. I never could run C on a 65 series cpu ... so props for that. We burned eproms and had to run our machine lang. from there. I still have my 6502E in a box, I'm not inspired to re-invent the wheel as you are. I'm happy that there's experts out there that can appreciate unbloated code, I find the compilers and tools today (and even the next couple generations of coders that have come after us) are lazy AF. We had to fit 300KB on a floppy. I still run smallish programs that are

  • @greg4367
    @greg4367 1 year ago

    This brought back all the reasons I played in the Z80/8080 world and HATED the 6502.

    • @toby9999
      @toby9999 1 year ago +2

      I loved the 6502. I wanted to learn Z80 but didn't have a Z80 system. My favourite was the 68000.

  • @kencreten7308
    @kencreten7308 1 year ago +2

    Lots of fun. Thanks, Sir.

  • @VoidloniXaarii
    @VoidloniXaarii 1 year ago

    Beautiful engineering! Thank you for sharing that! ❤

  • @JohnnieWalkerGreen
    @JohnnieWalkerGreen 1 year ago

    My second C compiler (after Unix V6) was Mark Williams C for MS/PC-Dos.

  • @jemdeweare6432
    @jemdeweare6432 1 year ago

    Nice to see the combo, thank you Dave.

  • @pekahon
    @pekahon 1 year ago

    Z80 chips have two sets of instructions: the common, documented and tested ones, and an outer layer of undocumented instructions that may work only on some chips.

  • @ElectronicFanArm
    @ElectronicFanArm 1 year ago

    Hi Dave, I liked your video very much; thanks for sharing. Do you have a video on working with real mode and protected mode on a PC without an OS?

  • @JPEaglesandKatz
    @JPEaglesandKatz 1 year ago

    Think I only used the MAC65+ assembler on the 6502 (atari) back in the days but that sure looks familiar :) Nice to see a follow up video on this

  • @jacobpalm
    @jacobpalm 1 year ago +1

    Very interesting! I've always been fascinated by these kinds of low-level optimizations, and how programmers can squeeze more performance out of the hardware.
    Do you have any timings or similar to show how much faster the ASM routines were compared to the C ones?

  • @nathantron
    @nathantron 1 year ago +1

    Do you know if there's a way to use the compiler to merge a bunch of cpp files into one, letting it do the header and cpp merging for us so we don't have duplicate code?

  • @thisisreallyme3130
    @thisisreallyme3130 1 year ago

    @Dave's Garage Is that binary shirt from Geek Tropical? I need this. Please post if you have an affiliate link to order this AWESOME shirt.

  • @sukivirus
    @sukivirus 1 year ago +2

    I see C programming, I just hit like :)

  • @moshehalevihalemo1604
    @moshehalevihalemo1604 1 year ago

    I love your shirt with 1s and 0s all over it 🙂

  • @NullPointer
    @NullPointer 1 year ago

    When I wrote my tiny kernel for x86, to scroll the screen I just moved the entire block of memory up, except for the first row, instead of doing it row by row. It has the disadvantage that you lose whatever was there, but it's blazing fast if you don't care about that.
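
A single-block scroll of this kind can be sketched in portable C with `memmove`. The 80x25 geometry, the 16-bit cell layout, and the function name are assumptions chosen to resemble PC text mode, not details from the comment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { COLS = 80, ROWS = 25 };   /* assumed text-mode geometry */

/* Scroll a text buffer up by one row with a single block move,
   then blank the bottom row. Each cell is a 16-bit character/attribute pair. */
void scroll_up(uint16_t *cells, uint16_t blank)
{
    assert(cells != NULL);
    /* Source and destination overlap, so memmove (not memcpy) is required. */
    memmove(cells, cells + COLS, (size_t)(ROWS - 1) * COLS * sizeof cells[0]);
    for (int col = 0; col < COLS; col++)
        cells[(ROWS - 1) * COLS + col] = blank;
}
```

The top row is overwritten and lost, which is the trade-off the comment mentions; a caller that needs scrollback would have to save that row first.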

  • @elmiguel1969
    @elmiguel1969 1 year ago

    Dave, I have not read through the 100+ comments, so someone may have asked this already.
    In the _ClearScreen code: Is there a particular reason why you do
    inc dest_hi
    ldx dest_hi
    cpx #>SCREEN + $20
    bne :-
    instead of loading the X-register with ldx #$20 after ldy #0 and then change the logic to
    dex
    bne :-
    It would save you quite some cycles.

    • @DavesGarage
      @DavesGarage 1 year ago

      I don't have the code in front of me, but that sounds like a good idea!

  • @michaelguerrero7232
    @michaelguerrero7232 1 year ago

    Just awesome!!! Retro to the steel awesome!!! Thank you

  • @Felice_Enellen
    @Felice_Enellen 1 year ago

    Is cc65 not smart enough to turn certain kinds of loops into register-indexed loops, rather than stack-variable-indexed loops?
    When I was writing code as a professional game dev and needed asm-speed code, I would try to write out a good implementation in asm and then create C or C++ code that generated something as close as possible to it, with comments on why certain things were arranged _just so._ This allowed the code to be portable, to be read and even debugged by asm-unaware programmers, and yet be highly performant.
    Another option is to have a reference version in C/C++ and a per-platform version that overrides it if present.

  • @SassyToll
    @SassyToll 1 year ago

    This is brilliant Dave Thank you

    • @DavesGarage
      @DavesGarage 1 year ago +1

      Glad you enjoyed it!

  • @techsteph
    @techsteph 1 year ago

    Hi Dave, I was wondering: what is the shortest, yet most useful, program you ever wrote?
    Using old MS-DOS Debug I got down to 5 bytes of code, and it is the fastest program I ever wrote :) Any idea what it is?

  • @colinmaharaj
    @colinmaharaj 1 year ago

    Since C++Builder went to Clang, their assembler uses AT&T syntax. Kinda tricky. I started programming in 1991, using Borland C. I liked how easy it was to do assembler. I used it to write high-performance text-screen I/O, editors, COM port VT200 terminals (competing with Procomm Plus), all that nice stuff.
    Never used Microsoft compilers, so don't ask me about Visual Studio lol 😂

    • @toby9999
      @toby9999 1 year ago

      I have only used Microsoft compilers, so don't ask me about the others :)
      Actually, not quite true. I did dabble with other stuff back in the 70s and 80s but MS for the past 30 years.

  • @mirror1766
    @mirror1766 1 year ago

    Always enjoy the programming videos. Wondered if the 1 to 127 count of bits in the last video should have been 0 to 127 or if 0 was left out for some special reason. Would be interesting to know what was going on in the C copy vs your assembly copy to compare the steps vs performance. Ironic that this was updating the display faster than Microsoft would on my Tandy 1000 DOS prompt but figured I ran into a compatibility fallback of sorts having updated the OS.

  • @godfreypoon5148
    @godfreypoon5148 1 year ago

    With only a few chips, you could have a row offset register for the video RAM address...

  • @Peter-House-Jr
    @Peter-House-Jr 1 year ago

    I would really like to see how you put together your development environment for this project. Looks like you are using VSCode. What compiler toolchain are you using, and how are you transferring your final bits to the KIM-1?
    Miss the Gentle Giant - where has he gone?

  • @mp-kq3vc
    @mp-kq3vc 1 year ago

    Thank you Dave! I was actually just today trying to figure out how system("cls"), system("clear") and the other variants work. I was digging through stdlib.h and couldn't find them at all. I see now that simply clearing the screen efficiently is way, way lower-level than I had realized. Hope the C lessons continue! Maybe one day you could explain the seemingly impossible "Press a key to continue" magic trick in C.

    • @casperes0912
      @casperes0912 1 year ago

      You can just think of that as a mutex waiting on an interrupt either spin locking or yielding back to the OS depending on complexity of the system

  • @enablerrelbane
    @enablerrelbane 1 year ago

    Where did you get the shirt? Do you have a link to it?

  • @PhotonicNoodle
    @PhotonicNoodle 1 year ago +1

    Why does he say just shy of 8k bytes at 1:47 ? Isn't this exactly 8k?

    • @DavesGarage
      @DavesGarage 1 year ago +1

      No. It's precisely 8000 bytes (320x200/8) but 8K is 8192, so "just shy" of 8K.

    • @Snablarns
      @Snablarns 1 year ago

      No, 0x2000 (0xA000 - 0xBFFF) is 8192, known as 8 KiB under the IEC binary prefixes.
      Looks like a mistake was made during recording. Could've happened to anyone - apparently even the best of us.

    • @DatBoi_TheGudBIAS
      @DatBoi_TheGudBIAS 4 months ago

      @@Snablarns Windows also treats KiB and its multiples as KB and its multiples.
      I guess it's easier to treat 1024 as KB because, well, it's used as 1024 pretty much everywhere lol.
      Rare is the occasion a person uses KiB and KB properly.

  • @michaelterrell
    @michaelterrell 1 year ago

    I had a KIM-1 that I salvaged from an old audiometer, but it disappeared a few years back when I was moving.

    • @DavesGarage
      @DavesGarage 1 year ago

      I had a brand new VT200A in the original box, it got lost in my last move. No idea how or why!

    • @michaelterrell
      @michaelterrell 1 year ago

      @@DavesGarage I lost two 10' by 20' warehouses full of computer and test equipment not long after 9/11. I was laid off the Friday before, and I was moving stuff from them to my house. I ended up bedridden for a little over two years. I had over a dozen PET computers, at least ten disk drives and a pile of 4023 printers. I also lost an 8023 printer, a wide-carriage dot matrix that you could send a lot of formatting commands to.
      I still have one of the original laser printers, with the same print engine that HP made famous.
      There is a SWTPC computer out in my shop as well. I have an Altos 586 computer that I've promised to a computer museum. I'm in my 70s. I started working in a TV shop at 13. I was building telemetry equipment for the aerospace industry until I ended up disabled.
      Some of the oddest computers that I've had were made by National Semiconductor under their 'Datachecker' name. They were early POS systems that filled an entire enclosed relay rack. Both systems were dead, and after way too long of not finding any information, I scrapped them for parts. I recovered ten of the large laser scanners. I sold the laser tubes and power supplies, along with the front-surface mirrors. I got more from the scrap aluminum housings than I paid for both systems.

  • @georgecop9538
    @georgecop9538 1 year ago

    This is similar to linking the bootloader written in asm and the kernel to create the binary. Great video anyways!

  • @BinaryAdventure
    @BinaryAdventure 5 months ago

    Dave, where do I get that shirt?

  • @benetelrae
    @benetelrae 1 year ago

    The Dan Flashes sponsorship finally hit.

  • @okaro6595
    @okaro6595 1 year ago

    6:04 I fail to see how it works when it increments both src_hi and dest_hi at the same time even though the 320 is not divisible by 256.

    • @bill3143
      @bill3143 1 year ago

      Other than the initialization of the src & dest pointers and the final row detection, there is no actual concept of rows in the copy. You are correct that copying 256 bytes wouldn't copy an entire row, but since it's not row-centric it doesn't matter. The first time through the loop, 256 bytes are copied from row 1 to row 0. In the second iteration, bytes 256->319 of row 1 are copied to row 0 and bytes 0->191 of row 2 are copied to row 1, and so on. The check for "BE" is how you know you're at the last row.
      Sorry if it's not clear, a diagram would work much better.

    • @jerry-p
      @jerry-p 1 year ago

      The screen buffer is 8000 bytes, and he's copying 8000-320 bytes which is 0x1e00 (7680) bytes. So, he's copying 30 (0x1e) 256-byte pages, a page at a time with the inner loop. Then he blasts (most of) the last line (320 bytes) to zeroes. As he suggested, he's not actually zeroing the 0th byte on the last line because he does the "dey", "bne :-". The fix is pretty simple for 6502 programmers; maybe he'll show us next time. I think he's also zeroing the byte at 0xbf40 because of the way he indexes, but that's probably not a problem either.
      To get the zeroth byte cleared, change the STA instructions to:
      : sta screen+$1ea0-1,y
      sta screen+$1e00-1,y
      then it actually clears just the bytes desired. Off-by-one bug.
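
The page-at-a-time copy described in this thread can be modeled in C to show that row boundaries never matter. This is a sketch under assumptions: the inner loop is taken to be lda (src),y / sta (dest),y with Y wrapping after 256 bytes, and the constant and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum {
    TEXT_ROW_BYTES = 320,   /* one 8-pixel-high text row: 40 bytes x 8 lines */
    SCREEN_BYTES   = 8000,  /* 320 x 200 pixels at 1 bit per pixel */
    PAGE_BYTES     = 256
};

/* Scroll up one text row by copying whole 256-byte pages, mirroring an
   inner Y loop plus an outer loop that bumps both pointer high bytes.
   Returns the number of pages copied. */
int scroll_by_pages(uint8_t *screen)
{
    const uint8_t *src = screen + TEXT_ROW_BYTES;
    uint8_t *dest = screen;
    int pages = 0;

    /* 8000 - 320 = 7680 bytes, which is exactly 30 pages. */
    assert((SCREEN_BYTES - TEXT_ROW_BYTES) % PAGE_BYTES == 0);

    while (dest < screen + (SCREEN_BYTES - TEXT_ROW_BYTES)) {
        uint8_t y = 0;
        do {
            dest[y] = src[y];   /* lda (src),y / sta (dest),y */
            y++;                /* iny */
        } while (y != 0);       /* bne: Y wraps to 0 after 256 bytes */
        src += PAGE_BYTES;      /* inc src_hi */
        dest += PAGE_BYTES;     /* inc dest_hi */
        pages++;
    }
    return pages;
}
```

The copy runs in ascending order with the source ahead of the destination, so the overlap is safe. The bottom row still holds its old contents afterwards and has to be cleared separately, which is where the loops discussed above come in.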

  • @thisisreallyme3130
    @thisisreallyme3130 1 year ago

    Yessss.. more CC65… TY!! 🎉

  • @JonBailey
    @JonBailey 1 year ago

    @davepl - is the scroll routine something that could see a performance benefit from loop unrolling?

    • @SquallSf
      @SquallSf 1 year ago

      Yes. In general, everything that runs multiple times in a loop benefits from unrolling, because you reduce the overhead per iteration. However, in the early days memory was very expensive and in short supply. That is why size was the preferred "optimization". For example, some of these early machines were sold with 4K of RAM.

  • @shawnj4545
    @shawnj4545 ปีที่แล้ว

    I guess you can what you want, but typically you'd want to set your variables in assembly as zpsym instead of just hard coding a zero page address like you did (with CA65/CC65). I'm assuming you did that to simplify for viewers.
    PS. Will you write a full conio implementation for the kim-1 for cc65. :)

  • @kermitdafrog8
    @kermitdafrog8 1 year ago

    What optimization do you use during the build process?

  • @__hannibaal__
    @__hannibaal__ 1 year ago

    Hello. What is the name of this assembly editor, and where can I get it?

  • @joshhiner729
    @joshhiner729 1 year ago

    Great video!! Do you have a github page with the examples you use in your videos so we can download and get hands on with them? Thanks for the great videos either way.

  • @KurtSchwind
    @KurtSchwind 1 year ago

    It was probably for the 286, but I'm almost certain my C compiler let me inline the ASM code. Or am I mis-remembering? Like literally ASM { mov AX ..... }

    • @SquallSf
      @SquallSf 1 year ago

      Yeah, many high-level languages (C, Pascal, ...) allow you to inline asm. The C that Dave uses (CC65) has terrible syntax for it: each instruction requires its own asm(..), which makes for ugly code. On top of that, you can't use some tricks that improve readability (like macros) or conveniences like local unnamed labels. So keeping the asm code in a separate file is much easier and more readable.

  • @dand4485
    @dand4485 1 year ago

    Curious: rather than iterating line by line and copying row 1 to row 0, row 2 to row 1, and so on, why not just move all 8K-320 bytes from row 1's address to row 0's address, then fill from 8K-320 onward with zeros for the last line? That is only two operations, and it seems the CPU would copy the data faster than with multiple successive 320-byte copies...

    • @0LoneTech
      @0LoneTech 1 year ago

      Yes. Also, due to an 8-bit quirk in the 6502, it's faster to do page-aligned (256-byte) loops. Each time an indexed operation crosses a page boundary, it costs an extra cycle. This is slightly complicated by the two pointers crossing pages at different times.
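The "one big move, then one fill" idea from this thread can be sketched in host-side C. This is an assumption-laden illustration, not the routine from the video: it assumes an 8000-byte buffer made of 25 text lines of 320 bytes each (the figures used in the comments), and the names `scroll_up_one_line` and `fill_with_line_numbers` are hypothetical. `memmove` is used rather than `memcpy` because the source and destination ranges overlap. On a real 6502 there is no block-move instruction, so this would still compile down to a byte loop; the sketch only shows the two-operation structure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    LINE_BYTES   = 320,                      /* one text line of bitmap data */
    TEXT_LINES   = 25,
    SCREEN_BYTES = LINE_BYTES * TEXT_LINES   /* 8000 */
};

/* Scroll the whole screen up by one text line:
 * 1. move lines 1..24 down to the start of the buffer (7680 bytes),
 * 2. zero the last line (320 bytes). */
void scroll_up_one_line(uint8_t *screen)
{
    assert(screen != NULL);
    memmove(screen, screen + LINE_BYTES, SCREEN_BYTES - LINE_BYTES);
    memset(screen + SCREEN_BYTES - LINE_BYTES, 0, LINE_BYTES);
}

/* Test pattern: every byte of text line k holds the value k. */
void fill_with_line_numbers(uint8_t *screen)
{
    assert(screen != NULL);
    for (int line = 0; line < TEXT_LINES; line++)
        memset(screen + line * LINE_BYTES, line, LINE_BYTES);
}
```

After `fill_with_line_numbers` followed by `scroll_up_one_line`, the first line holds the old contents of line 1 and the last line is all zeros.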

  • @johnkiddjr01
    @johnkiddjr01 1 year ago

    We finally find out what he's really doing with all the subs he's been collecting at 8:46

    • @DavesGarage
      @DavesGarage 1 year ago

      It's like soylent green. The subs is people!

  • @jumeldipancaputra87
    @jumeldipancaputra87 1 year ago

    Sir, with the rise of Zig, which is said to be already proven faster than assembly language for systems programming, will assembly be replaced?

    • @DavesGarage
      @DavesGarage 1 year ago +1

      For very complex problems, yes. But for interfacing with hardware, assembly is still sometimes necessary, I suppose. But the number of places you need ASM is incredibly small these days!

    • @jumeldipancaputra87
      @jumeldipancaputra87 1 year ago

      @@DavesGarage OK, thanks for your answer, Sir.

  • @drganesh108
    @drganesh108 1 year ago

    Sir, there is a new language known as Rust. Is it better than C for performance and for using multi-core processors?

    • @KC9UDX
      @KC9UDX 1 year ago

      I've heard there are multi-core 65xx chips, but do you actually program them?

  • @theoriginalrecycler
    @theoriginalrecycler 1 year ago +1

    I’m interested

  • @stephenelliott7071
    @stephenelliott7071 1 year ago

    Loving the low level coding episodes.

  • @UncleKennysPlace
    @UncleKennysPlace 1 year ago +1

    _"Make_ files are cryptic and annoying ..." and I thought it was just me who thought that.

  • @SRG-Learn-Code
    @SRG-Learn-Code 1 year ago

    Cool shirt! What does it say?

  • @ncot_tech
    @ncot_tech 1 year ago

    "Makefiles are cryptic and annoying"
    I find them to be like regexps. So long as you don't take your eyes off the screen or blink, it all makes sense. The second your attention wanders, symbols switch places...

  • @PrimalNaCl
    @PrimalNaCl 1 year ago +1

    That is a sweet shirt!

  • @lucidmoses
    @lucidmoses 1 year ago

    Instead of dealing with the name mangling in ASM, wouldn't it be simpler to deal with it in C? Like so:
    extern "C" {
    void ClearScreen(void);
    void ScrollScreen(void);
    }
    Then remove the _ from the Asm code.

    • @Z80Fan
      @Z80Fan 1 year ago +1

      extern "C" is a C++ construct that tells the compiler to use C linkage; it wouldn't make any sense in C.

    • @redcrafterlppa303
      @redcrafterlppa303 1 year ago

      I'm not sure but I think
      extern void ClearScreen();
      and
      extern "C" void ClearScreen();
      are identical, and C does add the underscore even on extern C symbols.
      Most likely it's a naive attempt to prevent calling random assembly symbols from C that weren't meant for that (since with this logic you can't call assembly symbols that don't start with an underscore).
      But please correct me if I'm wrong. These are just thoughts and assumptions based on the video and my limited knowledge of assembly and C compilers.

    • @lucidmoses
      @lucidmoses 1 year ago

      @@Z80Fan C doesn't have Name Mangling. So either he is using a C++ compiler or a C compiler with a Name Mangling extension. Either way the extern "C" scope should tell it to use the standard C function naming for the linker.

    • @lucidmoses
      @lucidmoses 1 year ago

      @@redcrafterlppa303 Unfortunately this falls under the category of 'up to the compiler', which is why I wasn't sure in the original comment. If his compiler does name mangling on C code, then it should also support the override scope.

    • @casperes0912
      @casperes0912 1 year ago +1

      @@lucidmoses It's a platform thing. GCC and Clang on macOS will add underscores but not on Linux.

  • @nuketheswamp7774
    @nuketheswamp7774 1 year ago

    Good stuff. Takes me back to when I tried to optimize graphics routines in perhaps my favorite MS product ever, QuickC. If the C version was too slow you could just mix in a block of assembly right in the C source. Really made those blits fast when you could combine unrolling loops and banging in words at a time instead of bytes! I'm a little bit sad that my next machine will probably be Linux running Windows in VMs, since MS seems to regard my PC as theirs for data mining :(

  • @v9turner
    @v9turner 1 year ago

    Ah the fun of trying to optimize screen handling on a slow 8bit CPU. For my atari 800 I built a dedicated card that plugged into the right cart slot to do multiply by 40 ! (40 bytes/line)

    • @KC9UDX
      @KC9UDX 1 year ago

      Nice! But then you're restricted to 40 columns! What if you wanted 38? 😁
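For comparison with the hardware multiplier mentioned in this thread, a common software approach on CPUs without a multiply instruction is to split the constant into powers of two: 40 = 32 + 8, so y * 40 = (y << 5) + (y << 3). The sketch below shows that decomposition in C. It is not what the commenter built, and the function name `row_offset_40` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of row y on a screen with 40 bytes per row,
 * computed with two shifts and one add instead of a multiply.
 * The result must fit in 16 bits, hence the bound on y. */
uint16_t row_offset_40(uint16_t y)
{
    assert(y <= UINT16_MAX / 40);
    return (uint16_t)((y << 5) + (y << 3));  /* y*32 + y*8 */
}
```

A different row width needs a different decomposition (for example 38 = 32 + 4 + 2), which is the flexibility a fixed-function card gives up.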

  • @GlenHHodges
    @GlenHHodges 1 year ago

    Has anyone decoded Dave’s shirt yet?

    • @DavesGarage
      @DavesGarage 1 year ago

      Congratulations. You’ve just discovered the secret message. Please send your answer to Old Pink, care of the funny farm, Chalfont.

    • @GlenHHodges
      @GlenHHodges 1 year ago

      @@DavesGarage 01010111 01101001 01101100 01101100 00100000 01100100 01101111 00100001

  • @Lion_McLionhead
    @Lion_McLionhead 1 year ago

    Lions just make the compiler output an assembly listing for a hello world function & copy the calling conventions. No-one memorizes the .export & name mangling except for maybe a job interview.

  • @solomongrundysfoot
    @solomongrundysfoot 1 year ago

    awesome work

  • @sparthir
    @sparthir 1 year ago

    Who else is trying to decode Dave's shirt? ;)