Microarch Club
  • 44
  • 40 357
The WebAssembly Founding Story | Ben Titzer
Ben Titzer tells the story of how WebAssembly evolved out of collaboration between the Firefox team at Mozilla and the V8 team at Google, which was initially motivated by the introduction of asm.js.
Clip is from Episode 8 of the Microarch Club Podcast: microarch.club/episodes/1000/
Microarch Club: microarch.club/
X / Twitter: MicroarchClub
Views: 270

Videos

How a Sea of Nodes Compiler Works | Ben Titzer
618 views · 2 months ago
Ben Titzer walks through how a Sea of Nodes compiler works, and evaluates both the technical and organizational trade-offs compared to traditional basic block approaches. Clip is from Episode 8 of the Microarch Club Podcast: microarch.club/episodes/1000/ Microarch Club: microarch.club/ X / Twitter: MicroarchClub
Fitting High-Level Languages on Microcontrollers | Ben Titzer
181 views · 3 months ago
Ben Titzer describes optimization techniques that can be used to bring high-level language features to microcontrollers, such as reference compression, romization, and reachable members analysis. Clip is from Episode 8 of the Microarch Club Podcast: microarch.club/episodes/1000/ Microarch Club: microarch.club/ X / Twitter: MicroarchClub
1000: Ben Titzer
1K views · 3 months ago
Ben Titzer joins to talk about the history and future of WebAssembly, the design and implementation of V8's TurboFan optimizing compiler, and the Virgil programming language. We also discuss bringing high-level language features to constrained hardware, the V8 team's response to the Spectre and Meltdown side-channel attacks, and how to design high performance virtual machines. Ben's Site: s3d.c...
Successful Hardware Relies on Robust Software Ecosystems | Kay Li
188 views · 3 months ago
Kay Li discusses the rise of custom silicon and emphasizes the importance of a software ecosystem in the success of new hardware, using NVIDIA's CUDA platform as an example. Clip is from Episode 7 of the Microarch Club Podcast: microarch.club/episodes/111/ Microarch Club: microarch.club/ X / Twitter: MicroarchClub
Why Verification Became the Chip Design Bottleneck | Kay Li
315 views · 3 months ago
Kay Li explains why design verification has become a bottleneck in the production of new chips, and how workflows used in software development can be applied to logic design to improve the situation. Clip is from Episode 7 of the Microarch Club Podcast: microarch.club/episodes/111/ Microarch Club: microarch.club/ X / Twitter: MicroarchClub
The Power & Limitations of Verilator | Kay Li
401 views · 3 months ago
Kay Li describes using Verilator, an open source Verilog / SystemVerilog simulator, in industry, detailing the differences between cycle-based and event-based simulators, and the benefits of running simulation in parallel. Clip is from Episode 7 of the Microarch Club Podcast: microarch.club/episodes/111/ Microarch Club: microarch.club/ X / Twitter: MicroarchClub
111: Kay Li
204 views · 3 months ago
Kay Li joins to talk about custom hardware used in high-frequency trading, development workflows for FPGA and ASIC design, and why verification has become a bottleneck in the design process. We also discuss SiLogy, the startup Kay founded with Paul Kim to improve the design workflow, including their experience applying to and going through YCombinator, their initial target market, and how the p...
How Quantum Computers Work | Rick Altherr
245 views · 3 months ago
Rick Altherr walks through how quantum computers work in practice, focusing primarily on IonQ's trapped-ion architecture. Rick also touches on the current state of quantum computing, and describes challenges that need to be addressed in order to advance adoption. Clip is from Episode 6 of the Microarch Club Podcast: microarch.club/episodes/110/ Microarch Club: microarch.club/ X / Twitter: twitt...
How to Trust Your Hardware | Rick Altherr
2.8K views · 3 months ago
Rick Altherr describes techniques for detecting whether hardware has been tampered with, specifically focusing on Google's Titan chip, which serves as a hardware Root of Trust (RoT) by interposing the Serial Peripheral Interface (SPI) bus between privileged components and boot firmware flash. Clip is from Episode 6 of the Microarch Club Podcast: microarch.club/episodes/110/ Microarch Club: micr...
USBAnywhere and the History of Baseboard Management Controllers | Rick Altherr
175 views · 3 months ago
Rick Altherr tells the story of how the Baseboard Management Controller (BMC) came to exist, and describes the role they play in server architecture. Rick also goes into detail on the USBAnywhere vulnerability that they discovered at Eclypsium, and how it allowed an attacker to easily connect to a server and remotely mount any USB device of their choosing. Clip is from Episode 6 of the Microarc...
The Google Minkowski Project: Temporospatial SDN | Rick Altherr
163 views · 3 months ago
Rick Altherr describes working on the Minkowski Project at Google, which was a temporospatial software-defined network (TS-SDN) controller capable of routing over dynamic link topologies. The project provided the underlying networking layer for other Google moonshot efforts, such as Project Loon and an LEO satellite internet project that preceded Starlink. Minkowski lives on as Google spinout A...
The Impact of Micro-Optimizations at Google Scale | Rick Altherr
1.1K views · 3 months ago
Rick Altherr walks through how, when working at a company at the scale of Google, ruthlessly optimizing performance can have massive financial impact. Rick's anecdotes include finding a CPU scheduler bug impacting Google search auto-complete, and allocating reserved PCIe pins for USB access to expansion cards. Clip is from Episode 6 of the Microarch Club Podcast: microarch.club/episodes/110/ Mi...
Apple's PowerPC to x86 Transition | Rick Altherr
361 views · 4 months ago
Rick Altherr describes the internal secrecy around Apple's transition from PowerPC to x86, as well as the work necessary to quickly port their hardware performance analysis tools from one architecture to the other. Rick also recounts having to work from two separate buildings in the subsequent transition to ARM with the launch of the iPhone. Clip is from Episode 6 of the Microarch Club Podcast:...
110: Rick Altherr
391 views · 4 months ago
Rick Altherr joins to talk about working on hardware performance analysis tools at Apple during the PowerPC to x86 transition, building flight control software for internet satellites at Google, discovering vulnerabilities in baseboard management controllers, and much more. We also spend an extended portion of the conversation on Rick's current work in quantum computing, including comparing and...
The Compiler Explorer Story | Matt Godbolt
2.7K views · 4 months ago
Making Sense of x86 Microarchitecture | Matt Godbolt
2.6K views · 4 months ago
The Hardware That Powers Financial Trading | Matt Godbolt
1.4K views · 4 months ago
How Acorn Computers Became Arm | Matt Godbolt
1.1K views · 4 months ago
MOS 6502 vs. Zilog Z80 | Matt Godbolt
11K views · 4 months ago
101: Matt Godbolt
795 views · 4 months ago
Is Oxide Computer Company Reinventing the Wheel? | Nathanael Huffman
1.7K views · 4 months ago
What Happens When the Oxide Server Boots? | Nathanael Huffman
997 views · 4 months ago
How FPGAs are Used in Medical Imaging | Nathanael Huffman
212 views · 4 months ago
The Magic of FPGAs | Nathanael Huffman
249 views · 4 months ago
How Java Got Started at Sun Microsystems | Robert Garner
746 views · 4 months ago
100: Nathanael Huffman
267 views · 5 months ago
SPARC Architecture Design Decisions | Robert Garner
270 views · 5 months ago
How SPARC Got Its Name | Robert Garner
195 views · 5 months ago
Why Didn't Xerox Win the PC Wars? | Robert Garner
164 views · 5 months ago

Comments

  • @axelBr1
    @axelBr1 a day ago

    Compiler Explorer is amazing. One of the design choices in C++ (and possibly other languages such as Java or JavaScript; it's been a long time since I used them) is where to create your instances. A best practice is to create them where they are used, but what happens if that is in a loop? I always thought that, for performance reasons, it would be better to create the instances required in the loop before entering the loop. Then one day it dawned on me that compilers are pretty smart: I can create the instance within the loop and the compiler will move the creation of the instance to before the loop. Using Compiler Explorer, I was surprised to find that the compiler is even smarter than that: because it knows the instance isn't used outside of the loop, it doesn't need to create the instance at all.

  • @Dygear
    @Dygear a month ago

    No, they are putting a racing slick on it tho.

  • @compu85
    @compu85 a month ago

    Great interview!! Thanks for posting it!

  • @MichaelKahle
    @MichaelKahle a month ago

    Nice interview. I'd be interested to know WHAT parts of the Oxide stack they are not able to get support and code for from the vendor. What incentive does a hardware manufacturer have to keep their drivers closed source?

    • @THB192
      @THB192 a month ago

      On their podcast they've said they don't own the PSP, and they've said they also don't own the firmware on some of the power supply parts. Probably some other stuff, too. I'm pretty sure there is no open FTL layer on any SSDs, so that's probably also closed off.

  • @jasonleschnik9780
    @jasonleschnik9780 2 months ago

    Years of abstraction and poorly designed firmware to "get the job done" have accumulated an enormous amount of technical debt. This feels like a really refreshing approach, "Clean slate Cloud" with a hint of Bryan wanting a similar visibility in Hardware that DTrace afforded software. Bravo.

  • @24playermaker
    @24playermaker 2 months ago

    Kay seems to be a very smart guy, but lacks an understanding of the full ASIC product life cycle. Throughout the conversation, he made statements that were ambiguous or flat-out wrong. For instance, claiming that the reason XXX company's chip was delayed was a bug escape. Whether or not that is true, there's always more to a chip delay, ranging from poor initial planning and last-minute features to process technology yield issues, etc.

  • @24playermaker
    @24playermaker 2 months ago

    I would have to disagree with the statement made about verification. Most of the work is in verification, hence most people end up doing it regardless of whether they were top of their class. In fact, I know many highly smart individuals who have only worked in verification.

  • @newtonchutney
    @newtonchutney 2 months ago

    Dan, what IEMs do you use? It kinda looks like your spects' arms reach into your ears.. 😅 Anyhow, please do continue making clips, I really wish to listen to your podcasts, but I end up not having time.. 🥲 Awsm job!

  • @angeldude101
    @angeldude101 3 months ago

    The biggest issue that I see is that people rely on this crutch _so much_ that an immense amount of effort and money has been put in to making it as fast as possible, to the point that it can often be faster than the much simpler integer arithmetic.

  • @adama7752
    @adama7752 3 months ago

    1:45:00 Web Assembly could catch up to hardware of 20 years ago (Thread/MultiCPU).

  • @alecsei393ify
    @alecsei393ify 3 months ago

    Very good content!

  • @KabelkowyJoe
    @KabelkowyJoe 3 months ago

    Tried to listen all he was sooo annoying T H here was involved in making Itanium cant even mention his name AI of YT removes any comment cause he devunks climate scam, got me interested in Itanium this goy dont even understand this small part, so unlucky architecture it was just like Transmeta real VLIW not fake as their

  • @borealis8uno
    @borealis8uno 3 months ago

    If RAM speed is your bottleneck, the Z80 wins. The Amstrad/Schneider CPC is an example: if it had been designed around the MOS 6502, it would have run it at 1 MHz, while with the Z80 it was able to run at 4 MHz (~3.3 effective, but still somewhat more computing power than the MOS)...

  • @BGBTech
    @BGBTech 3 months ago

    For testing my hobby CPU core, I have mostly been using Verilator to good effect (and only occasionally get around to running on an actual FPGA), but thus far it mostly works OK. Though, I also make fairly heavy use of an instruction set level emulator, as it can run programs at roughly the same speeds they would run on the FPGA, whereas Verilator is several orders of magnitude slower. But, I can also note that I don't use any 3rd party logic or modules (and pretty much no budget; as my CPU/ISA is a single-person hobby project), ... I don't think I have done too badly though (and it can pass the "does it run Doom?" test at least). Nevermind the seemingly never ending battle with deficiencies and bugs on the software side of things (the "OS" I am running on it still falls well short of a real OS; and porting another more mainstream OS to my custom ISA seems like a fairly big effort; as I am also using a non-mainstream C compiler, with binaries based on a modified PE/COFF, etc...).

  • @johnjakson444
    @johnjakson444 3 months ago

    I was in need of a Verilog cycle simulator some 30 years ago. I was writing Verilog-style code in C and it was a horrible experience; I even had a crude C-to-Verilog converter for FPGA synthesis. I took a few weeks off to create a cross compiler from Verilog to C (V2C) and within a short time had a decently useful tool that could handle single-clock-domain RTL code. There was a rule that had to be followed: every Verilog module had to have all outputs registered, and all assign statements had to be in time/event order as if they were C code. If those constraints are adhered to, then 1000s of lines of Verilog could be exploded into 10,000s of lines of C code, with the global registers at the end as C global ints. The only snag I hit was that the Visual C++ compiler could not compile a function bigger than maybe 500 LOC, so I had to add a partitioner to break the 10,000 LOC single function into 100s of functions of about 100 LOC each. All the state of the chip's modules becomes an int array of maybe 100,000 nets. The final performance was that every Verilog assignment would be replaced by a similar C-style assignment, with the tool taking care of building the rat's nest. It's a shame I did not maintain the code and keep it up to date; I have been meaning to go back to it, but I don't do ASIC/FPGA design anymore. Such a tool could be useful in allowing C++ code to be written as a mix of C procedural code with HDL/RTL blocks in C form that can be guaranteed to perform properly as if they were hardware. I even envisage a GUI frontend written as hardware with RTL widgets.
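
    As a rough illustration of the scheme described in the comment above (registered outputs, assignments evaluated in dependency order, all state held in plain integers), here is a minimal Python sketch of a cycle-based simulation loop. The 4-bit counter design and the names `step` and `simulate` are hypothetical, chosen only for illustration; this is not the commenter's V2C tool.

```python
# Cycle-based simulation sketch (hypothetical design: 4-bit counter with enable).
# Combinational assignments run in dependency order, like ordinary statements,
# and every output is latched once per clock cycle.

def step(state, inputs):
    """Advance the design by one clock cycle and return the new register state."""
    # Combinational logic, evaluated top to bottom from the *current* registers.
    if inputs["enable"]:
        next_count = (state["count"] + 1) & 0xF
    else:
        next_count = state["count"]
    wrapped = 1 if (inputs["enable"] and state["count"] == 0xF) else 0
    # Registered outputs: the new values only become visible on the next cycle.
    return {"count": next_count, "wrapped": wrapped}


def simulate(cycles, enable=1):
    """Run the design for a number of cycles from reset and return final state."""
    state = {"count": 0, "wrapped": 0}
    for _ in range(cycles):
        state = step(state, {"enable": enable})
    return state
```

    Because nothing is scheduled by events, one pass through `step` per clock is all the simulator does, which is roughly why this style maps so directly onto generated C.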

  • @womp6338
    @womp6338 3 months ago

    They don’t

  • @kayakMike1000
    @kayakMike1000 3 months ago

    x86... Sadly you can't really know because of the management engine. Arm has Trustzone, which is only as good as you trust Arm or the vendor.

  • @Zaniahiononzenbei
    @Zaniahiononzenbei 3 months ago

    It's so regrettable that we haven't advanced much from 1970s software. Backwards compatibility is great, but jesus, the structuring of PowerShell/NuShell is so much better.

  • @andrewgrant788
    @andrewgrant788 3 months ago

    The BBC Micro used a 6502, the ZX Spectrum used a Z80. Clearly the 6502 must be better.

  • @HarshKapadia
    @HarshKapadia 3 months ago

    Interesting! Thank you!

  • @franciscotoro827
    @franciscotoro827 3 months ago

    I have a question I hope someone can shed some light on for me. These 2 processors were developed in the 70s and used throughout the 80s, even into the 90s, and are even in use today for some things. Question 1: was the 6502 made in the 70s exactly the same as the ones made in 1988? A modded 6502 was used in the 2600, but then the standard 6502 was in the 400 and 800, the 5200 and Lynx, and it was used in all the early Commodores and the Apple I, II, III... My question: were processors super simple in their function back then, or sooo capable that the rest of the computer had to catch up with them, like RAM, coprocessors, and video components? If you look at games from 1980 on home PCs and systems compared to a game from 1989, the difference is almost as big as the jump from Wolfenstein to Halo, but now imagine I told you both games were run on the same CPU... I think you would have questions.

  • @BobBeatski71
    @BobBeatski71 3 months ago

    Hmmm, Java byte code to MC68000 assembly... brb....

  • @jk55.
    @jk55. 3 months ago

    👍

  • @testolog
    @testolog 3 months ago

    Anyway, China will destroy half the internet in the world when they invade Taiwan, because a lot of telecommunication companies just buy electronics that were made in China)

  • @josephlunderville3195
    @josephlunderville3195 3 months ago

    Love the slow pace and the combination of history and expertise here. I don't know if what you're doing here will be broadly popular, but it's deeply meaningful to me as a technologist who loves knowing more about how the tools I work with came to be, and I bet it will be important to a swathe of historians in the future too.

  • @capability-snob
    @capability-snob 3 months ago

    2:01:43 that's The Mill. I'm not entirely on board with Matt's point here. It's less about trusting compilers, and more about putting the power in the hand of the developer - whether that's someone writing a compiler, a language runtime, or hand cobbling assembler. It sure would be nice to have better tools for indirect branches in ia64, but it's hard to argue with advance and speculative loads, and being able to branch on whether they are successful. No spectres here.

  • @Calilasseia
    @Calilasseia 3 months ago

    Ah, so the Z80 only had a 4-bit ALU? That explains how they were able to implement a half carry bit for the DAA instruction in the flags register! And solves neatly a problem for emulator writers too. Wish I'd known this years ago!
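
    The half-carry flag mentioned above falls out naturally when an 8-bit add is performed as two 4-bit steps. A minimal sketch, assuming a simple nibble-at-a-time model (the function name `add8_nibblewise` is hypothetical, and this models only the arithmetic, not the Z80's actual circuitry):

```python
def add8_nibblewise(a, b, carry_in=0):
    """Add two 8-bit values as two 4-bit additions.

    Returns (result, half_carry, carry). The half carry is simply the carry
    out of the low-nibble addition, which is what DAA-style BCD correction needs.
    """
    # First pass: low nibbles plus incoming carry.
    lo = (a & 0xF) + (b & 0xF) + carry_in
    half_carry = lo >> 4
    # Second pass: high nibbles plus the carry from the low nibble.
    hi = (a >> 4) + (b >> 4) + half_carry
    carry = hi >> 4
    result = ((hi & 0xF) << 4) | (lo & 0xF)
    return result, half_carry, carry
```

    For an emulator that adds all 8 bits at once, the same flag can be recovered afterwards by checking whether the low nibbles alone overflowed.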

  • @chaitanyakumar3809
    @chaitanyakumar3809 4 months ago

    Where should one look for documentation of these reverse engineering processes?

  • @johnjakson444
    @johnjakson444 4 months ago

    In 1979 I had the privilege of reverse engineering about a dozen processor chips: F8, 1802, 8080, 8085, Z80, 9900, 8086, 68000, Z8000, and others. By far my favourites were the 16/32-bit machines. For its MOS simplicity the 9900 was a marvel. It was really only a 1-bit serial CPU, so it was very economical with transistors, and at speed it could really only run at maybe 250 kips, taking at least 18 cycles for 16-bit register ops, but it had the grown-up architecture of the minicomputer it was derived from, the TI990. I had programmed the 9900 at TI too, so I was biased. Later the 68000 would be my wife. As for the 6502 vs Z80: 2 very different beasts, neither of which I would want to write grown-up code for. Of course my real job at the time was to work on the design of a UK microprocessor for parallel programming. I was impressed with my BBC though, a decent piece of kit with loads of language compilers for it. I was much less impressed by that Sinclair QL POS with the crippled 68008 in it, but that was replaced with a Mac ASAP.

  • @imgod2u
    @imgod2u 4 months ago

    Itanium wasn't the only failure of VLIW for general computing. You all are old enough to remember Transmeta and their efforts and its spiritual and technical successor, nVidia's Project Denver. The argument of VLIW being capable of general compute works fine (even with the complexity of software memory management) if and only if you avoid variable latency. Which can be done if you can fit everything inside SRAM. As soon as you go to DRAM that entire argument gets thrown out. Especially if you go to DRAM and you're not the only processor trying to access DRAM. For less general compute workloads (DSP's, AI processors), sure, VLIW is great. Yet -- as almost every AI accelerator architect is learning -- there's still a lotta general compute that they need to stick a (usually small) CPU like a SiFive or even a not-so-small CPU (like Grace) because in between chained models (not to mention dynamism within models), you need some branchy code and real-time buffer management that simply can't be offline compiled.

  • @KabelkowyJoe
    @KabelkowyJoe 4 months ago

    0:45 Wait, what? Wasn't MacOS compiled for other platforms ever since NeXTSTEP in the 90s!? Wasn't it compiled for ARM already? Oh yeah, "quickly port their hardware performance analysis tools". Their hardware PERFORMANCE analysis TOOLS! Debuggers, profilers, because it wasn't cross compiled anymore.

    • @KabelkowyJoe
      @KabelkowyJoe 4 months ago

      1:45 PDF kinda thing

    • @KabelkowyJoe
      @KabelkowyJoe 4 months ago

      2:10 That is sign of lie

  • @ssl3546
    @ssl3546 4 months ago

    I wish the full videos would have the video, since you already recorded it and use it for the snippet videos you publish. It's way more interesting to watch a video when you can see people's faces.

  • @capability-snob
    @capability-snob 4 months ago

    The am29k is such a beautiful chip, outstanding you got Philip in!

  • @ssl3546
    @ssl3546 4 months ago

    it's so wild that a guy with under a thousand subscribers gets these interviews and does a good job

  • @haiphamle3582
    @haiphamle3582 4 months ago

    What an inspiring story! Godbolt did not become a verb for no reason. It helps people take a deeper look into what is happening under the hood.

  • @andrewdunbar828
    @andrewdunbar828 4 months ago

    I'm an old Z80 coder on the Speccy and only ever wrote the tiniest little bit of 6502 once on the Apple II and I always just believed the 6502 was faster, because everybody always says so. But then the CERBERUS came out about two years ago, with a 6502 and Z80 on it and all of their tests, including BBC Basic running on both, showed the Z80 was actually faster than the 6502.

    • @Calilasseia
      @Calilasseia 3 months ago

      The picture is a little more complicated. Memory copies on a Z80 are far faster, because after the register initialisation, you run a single instruction - LDIR. You have to write a software loop to achieve the same result on a 6502. But 6502 table indexing is faster, because the instructions to do so were native from the beginning, not bolt-ons post-8080. If you need to use IX or IY for your table indexing, that's slower than the 6502 equivalent. Tasks that take advantage of the Z80 being able to perform 16 bit arithmetic in one instruction, or take advantages of goodies such as LDIR, will always outpace other 8-bit CPUs, but interrupt handling is complicated on a Z80 (sometimes requiring support chips) and pushing 8 registers will always be slower than pushing 4. Also, any Z80 instruction requiring a prefix byte (DD/FD for IX/IY register use, CB for BIT instructions etc) will run more slowly than instructions on a CPU that doesn't use a prefix byte.

    • @andrewdunbar828
      @andrewdunbar828 3 months ago

      @@Calilasseia In Speccy game code nobody uses LDIR and friends in fast loops because they're famously slow. The fastest way to clear memory is to move the stack pointer to the end of the memory block you want to clear and push all the registers over and over. First game I disassembled that did this was 1985's Starion. There's a similar technique for copying. Nobody uses IX/IY in fast loops either.

    • @Calilasseia
      @Calilasseia 3 months ago

      @@andrewdunbar828 ... that approach won't work for COPYING memory blocks though, which is the use case I specified.

    • @andrewdunbar828
      @andrewdunbar828 3 months ago

      @@Calilasseia The approach does work and is used. YouTube deletes comments with links, but Google for "How To Write ZX Spectrum Games - Chapter 13" "Double Buffering". Another source if you Google "Chasing the raster on the ZX Spectrum in Sidewize". There's probably a bunch out there.
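
      A rough cycle count shows why both sides of this thread have a point. The sketch below assumes the commonly documented timings (LDIR: 21 T-states per repeated byte and 16 for the last; POP and PUSH of a register pair: 10 and 11 T-states; 6502 `LDA abs,X`, `STA abs,X`, `INX`, `BNE`: 4, 5, 2 and 3 cycles when the branch is taken, 2 when it falls through). It ignores setup code, stack pointer reloads, interrupts, page crossings and contended memory, and the function names are hypothetical.

```python
# Documented per-instruction timings, taken as assumptions for this estimate.
Z80_LDIR_REPEAT = 21   # T-states per byte while BC != 0
Z80_LDIR_LAST = 16     # T-states for the final byte
Z80_POP = 10           # T-states to pop a 16-bit register pair
Z80_PUSH = 11          # T-states to push a 16-bit register pair
M6502_LOOP_BODY = 4 + 5 + 2   # LDA abs,X + STA abs,X + INX
M6502_BNE_TAKEN = 3
M6502_BNE_NOT_TAKEN = 2


def ldir_tstates(n):
    """T-states for LDIR to copy n bytes (n >= 1)."""
    return (n - 1) * Z80_LDIR_REPEAT + Z80_LDIR_LAST


def stack_copy_tstates(n):
    """T-states for an unrolled POP/PUSH copy of n bytes (n even).

    Overhead for repointing SP between the pops and pushes is not counted.
    """
    return (n // 2) * (Z80_POP + Z80_PUSH)


def mos6502_loop_cycles(n):
    """Cycles for a simple indexed 6502 copy loop of n bytes (1 <= n <= 256)."""
    return n * M6502_LOOP_BODY + (n - 1) * M6502_BNE_TAKEN + M6502_BNE_NOT_TAKEN
```

      For a 256-byte block this gives 5371 T-states for LDIR, 2688 for the unrolled stack copy, and 3583 cycles for the 6502 loop. The counts are not directly comparable across the two CPUs, since they typically ran at different clock rates.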

  • @smoothemoveexlax
    @smoothemoveexlax 4 months ago

    Can we get support for other CPU platforms including ARM and RISC-V targets? That would be super useful.

    • @Evan490BC
      @Evan490BC 4 months ago

      There is support for both ARM and RISC-V, as far as I know.

  • @momoanddudu
    @momoanddudu 4 months ago

    Re predicting branches before they're decoded: the CPU does it based on the address at which the command is stored. That is, when the PC is the address of the branch, the branch predictor predicts where the branch command stored there, and seen in the past, will go. Possibly that memory block holds a DLL, and it was replaced by the time execution returned to the same address. That means that at decode time, the CPU has to handle the possibility that the address no longer contains a branch, or, a bit trickier, that it contains a different branch. Usually the CPU can easily tell it's a different command, flush the pipe and branch predictor, and restart. If it happens to contain exactly the same branch command (as in binary memory content) that would behave differently due to the preceding code, I don't know if CPUs actually detect this or just suffer a few branch misses until they learn the new behavior.

    • @haiphamle3582
      @haiphamle3582 4 months ago

      For the case having the same address, that should be when a context switch happens, and the same virtual address appears from another process, right? Maybe hardware will also flush the branch predictor upon a context switch. For the case where the branch behaves differently, based on previous conditions, CPU designers have many different strategies to cope with, for example, the CPU can store a few recent results and use them to select the final result. There are many other techniques described by Agner Fog (Also mentioned by Godbolt) here: www.agner.org/optimize/microarchitecture.pdf
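
      To make the idea in this thread concrete, here is a minimal sketch of an address-indexed predictor that mixes in recent branch outcomes. It assumes a textbook design (a table of 2-bit saturating counters indexed by the branch address XORed with a short global history); the class and function names are hypothetical, and real CPUs use far more elaborate schemes.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by (pc XOR global history)."""

    def __init__(self, table_bits=4):
        self.mask = (1 << table_bits) - 1
        # Counter values 0..3; 0-1 predict not taken, 2-3 predict taken.
        self.counters = [1] * (1 << table_bits)
        self.history = 0  # most recent outcomes, one bit each

    def _index(self, pc):
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        """Predict from the address alone; the instruction is not decoded yet."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Train on the real outcome once the branch resolves."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


def count_mispredictions(predictor, pc, outcomes):
    """Feed a sequence of outcomes for one branch and count wrong guesses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses
```

      With these defaults, an always-taken branch at address 0x40 is mispredicted five times while the history register fills up, and correctly after that, which mirrors the "few branch misses until it learns the new behavior" scenario described above.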

  • @AnnatarTheMaia
    @AnnatarTheMaia 4 months ago

    I dream of building an OpenSPARC T2 server someday, with the first prototype implemented in FPGA.

  • @AnnatarTheMaia
    @AnnatarTheMaia 4 months ago

    SPARC is still one of my favorite architectures, and I still do a lot of porting of modern open source software to SPARC in 2024.

  • @andrewdunbar828
    @andrewdunbar828 4 months ago

    through which they went by

  • @andrewdunbar828
    @andrewdunbar828 4 months ago

    "Sine Nordic Country" = Denmark

  • @insu_na
    @insu_na 4 months ago

    I bet he must hate `consteval`, people not just compiling their code on his AWS instances but also letting it compute stuff 😂

  • @surters
    @surters 4 months ago

    Yeah, it's quite wild that it's actually the instruction fetcher that must guess where to go, as the decode is 6-7 cycles down and the retirement is maybe 200 cycles down ... all this through the power of xor and saturating counters taking a vote on which saturating counter to use.

  • @oidpolar6302
    @oidpolar6302 4 months ago

    So, "Parallela" became "Tilera"?

  • @jamesphilemon8010
    @jamesphilemon8010 4 months ago

    It's so good to hear Australian and English technologists telling their stories in entertaining ways as only they can.

  • @walterpark8824
    @walterpark8824 4 months ago

    And thank you so much for introducing me to Agner Fog's work.

  • @AnnatarTheMaia
    @AnnatarTheMaia 4 months ago

    Will you add Sun Studio compilers too? (If it must be, there are Sun Studio compilers for GNU / Linux).

    • @MattGodbolt
      @MattGodbolt 4 months ago

      We'll add pretty much anything that installs easily and folks submit a PR for. There's two PRs usually required; one to add the installation to our infra repo and then another to configure it (if it's simple and looks like clang/gcc). If it requires more work there are per compiler customisation points. Google for "how to add a compiler to compiler explorer" if you're interested 🎉

  • @user-st1nj5pd5o
    @user-st1nj5pd5o 4 months ago

    Great talk 🦜

  • @Heckatomba
    @Heckatomba 4 months ago

    The name mentioned at 1:45 is Agner Fog.