Back when humor wasn't banned from compilers, I had heard of the "long long" and decided to see what it thought of a long long long. The error message was "long long long is too long."
And the funny thing, it increments after the ; for elemental types, but for classes it does it immediately. (OK, C has no classes, and C is not a subset of C++)
@@steffenbendel6031 C is mostly a subset of C++, with some silly divergence making C less ergonomic because the standards body didn’t want mangled symbols. There’s also a few “hidden” differences that may on occasion bite you: for example, character constants are still type int in C, but are rather sensibly type char in C++. Plus even if you never touch classes templates & variadics, type introspection, and proper function overloading push C++ over the top for me.
I worked on a commercial product, started in 1980s. The first public version was on the IBM AS/400 [now I5 series] and it used 128 bit addresses. The pain making it work was severe. Later on it was published for s390, Linux,Unix,Windows... both 32 and 64 bit... and that was relatively easy. The hard work had all been done.
I have to say as a newbie that learning C has made me understand a lot of fundamentals better. In other languages it felt like the added abstraction didn't really allow me to see the inner workings as clearly.
Yeah, it makes you wonder why the geniuses who knew the hardware intimately by writing assembly all day were desperate for high-level languages with lots of high-order abstractions, doesn't it?
@@JerehmiaBoaz They probably weren't actually desperate for "high-level languages with lots of high-order abstractions". It's more likely that someone else wanted them, so they begrudgingly created them.
That's a trap. C is very far from the metal in very important respects and if you think that it works like the hardware below it does, you're in for some really nasty surprises. You expect an integer overflow to result in a wraparound? That's not in the spec, and a lot of compiler optimizations will actually show you that C does not work like the underlying hardware there. If you wanna know how hardware works, ~~learn~~ write a Forth.
I'm an old fart, and trust me... don't fall into the immature type of zealotry you get with stuff like this article. The fact is, C is the reason we have lot of the tech we have today - period. You could argue its time has passed and things evolve (as they should) but hating on something that served us is no better than hating old buildings that don't meet modern standards. It's a great language that's just dated. Should we move on? Yes. Does it help to learn it? Yes, for educational purposes. Should you learn something else too? Yes. Don't listen to the zealots. They're going nowhere in life and will only have their rants online where nobody successful cares.
C (and sometime C++) is the lingua franca of embedded systems. Of all the dozens of embedded processor in your computer right now, most of them are written in C. (flash management processor, hard disk drive processor, PCIe PHY controller, touch panel processor, etc , device drivers.. to name just a few)
Congratulations you discovered why C can run on literally anything with some sort of processing unit and memory (it's because the entire standard is up to the hardware's implementation).
So true, this is the reason for so much different ISAs and even different architectures, such as GPUs ou DSPs, using C language most of the time. They can use C programming model and adapat it to the ISA/model specifics.
Good luck running all those modern languages on a DSP with 40bit integer and 16bit char. C can handle it beautifully. It is not C which has a problem. The problem is the false hardware assumptions in the other languages.
NO! This is a complete misunderstanding. It takes an immense amount of backend compiler work to support even a single target, and this isn't made easier by C's type system since the stdint types still have to be supported. The ideal, rewrite history scenario is 1) C's types always functioned like stdint types and 2) compiler authors didn't name targets the same thing if they had different ABIs.
@@WHYUNODYLAN It's not easier *today* when all architectures work in a broadly similar way there is a reason the original type sizes in C were implementation defined and that's because you could have an old supercomputer from the 70s that only worked with 24 bit words as its smallest addressable memory unit and you needed to support that.
size_t: unsigned integer type than can address any offset within any memory section. ssize_t signed integer that covers at least [−1, ½(SIZE_MAX−1)] (NB!, the only negative value that is guaranteed to supported is −1) rsize_t: intersection of size_t and ssize_t. intptr_t: signed integer type that express any memory address and uses two's complement. uintptr_t: unsigned version of intptr_t. ptrdiff_t: this one is a real mess.
Wait, intptr_t is actually defined as using two's complement? While regular numbers are like "Ehh, we don't know what we want and what we are, we just exist"?
ptrdiff_t is just whatever is most relevant for storing the pointer subtraction result, and is always a signed type. (Come to think of it, it is actually a stupid idea to have it be signed, because if it is of the size of a pointer (which makes sense and is how it usually is) but signed, it can only handle differences whose absolute value is not more than half the memory size, and everything beyond that is either overflow or underflow which is UB for signed types. It would make much more sense to have it be unsigned so that wraparound on it is well defined and it produces the right result but without worrying for signed integer overflow/underflow)
@@kodirovsshik difference and comparison operations between two pointers are defined only if they belong to the same object (which means, if you calculate difference between &a and &b, you actually have an undefined behavior). signed result is needed so you can tell if one address is smaller or larger than the other. if you make it unsigned, all the negative values are actually end up as very large unsigned values, and i'm pretty sure this will cause larger issues with UB than a integer overflow ever would (for example, imagine having a machine that doesn't have an int type of the exact size as the pointer type, adding ptrdiff_t to the pointer would probably result into the out-of-bounds address).
@@dziuaftermidnight In order to tell if one address is larger that the other, there is a much more intuitive &a < &b instead of &a - &b < 0. ptrdiff_t being signed offers nothing in this scenario
Well, the optimal C implementation for the hardware in a toaster and the hardware in a supercomputer are completely different, that's why its vague there isn't a one size fits all solution. C's flexibility for HW manufacturers was always its appeal.
Any language can be compiled to run on any hardware. Just need to have a written compiler. Which is pretty standard. People choose C, because of what it can do - compared to other languages. It's a mid-level language. It allows you to interact directly with hardware. Something other languages abstract, meaning you have no ability to do much of anything advanced. You can't write drivers in Java.
Sadness. Don't blame C for only solving 90% of "computers are not all the same" problems. It was never a problem with C. Heck, as a C programmer, I also hate dealing with linkers, calling conventions, and ABI's. Its the ugly underbelly of software engineering. The alternative is we hard-code binary interfaces for every piece of hardware we might expect to interface with. Its the very reason libraries, drivers, and operating systems exist.
im sure gcc, clang hardcode it already, and by the look of this article so does every other language. (Either that, or they use C). So its just hard code all the way down
Agreed. The author doesn't really understand the distinction between the System Call Interface and the C wrapper library that most people interact with. He does not understand that all C implementations are exactly that and they have freedom to do their own thing. This is not a failure of C or even other languages, it is the nature of the beast and he is simply unaware that people "don't bother" writing their own SysCall wrappers - they just reuse glibc or whatever. The Linux System Call Interface is fixed and Linus does a LOT to ensure it never breaks (the rants he has about this 😅...) There is nothing to stop Rust from implementing a SysCall wrapper library that bypasses glibc. They just haven't done the work.
The article wasn't attacking the idea of a portable binary interface specification language, it was pointing out how insane C must appear to be in fulfilling that role. In web programming there's protobufs, graphql, openapi, none of which demand that you parse an unparseable language if you wish to interact with the system exposing the API.
What is long? It holds at least 32bits worth of data, says the C standard. It holds exactly 32bits of data, says Windows. It holds exactly a pointer, says Linux.
@@plaintext7288 long is at least 32 bits. Your compiler may throw an error or insert code to do the math for a 32bit number with a 8/16 bit cpu. This will be horribly inefficient, but it will work.
The thing that I don't get is the obsession of making a new language for every single niche task these days. It's a lot like the web dev, "how to lose focus on mastering a single thing". Ideally I want to go back to a standard few powerful languages and just use a well defined lib for a certain task.
If you create a core library that you want others to use to create programs with and if for example it's a library implementing a new protocol or part of it then it should be written in C. If it isn't then I assume the library isn't a serious project. Core libraries need to be fast, use small amount of memory, not crash on a failure (looking at you rust, with panic on memory allocation failure), be able to use on any system and platform and from any programming language.
LLVM is part of it. You don't actually have to make a full compiler anymore. You can make a language parser for any syntax, plug the result into an LLVM backend, and claim your new language has "comparable performance to C".
C avoids any ABI by design. Its up to the OS and architecture to do the ABI. The function calling convention (part of the ABI) is the prologue and epilogue the architecture probably suggests to include...
12:50 I dont know what zig people has told you to believe that zig is literally a fully fledged C compiler but its literally just clang with additional flags 😅
The ABI suffers from having to continue to be "C". And C suffers from having to continue to be "the ABI". It feels like we'd need a software equivalent of the phonetic alphabet, i.e., a synthetic language that nobody actually uses, and doesn't even make sense on its own (no "grammar"), just for the purpose of creating well-defined interfaces. But then, that language would probably have to "talk C" too. Sigh...
I build I c header parser a long time ago. The first try was just to recognize keywords and attempt a translation, much like a macro processor. I klash-trashed around with it for some time, and the program got quite large. Finally I decided that it wouldn't work in terms of handing all case, so I realized that the program needed to be the top half of a C compiler with the bottom (actual code generation) chopped off. I rewrote the program as a scanner/preprocessor followed by a syntax follower. That worked much better.
Zig does a lot of things that might as well be magic Macros? Nah, just use the Zig but tell it to do it at compile time Cross compiling? Yes and fuck it, use Zig as a C or C++ cross-compilation toolchain too Build system? Just use Zig for that too, and use it to build anything, including things that aren't even written in Zig Binary size? Smaller than C in most cases Absolute madness.
So... the author is crying because it takes more than a couple of lines of code to do that? The typical solution is... a library. Maybe something like llvm.... If you're doing compilery stuff, why complain that you might need to employ a complier library?
@@ITR But as should have been made painfully clear in this article, communicating with an ABI is NOT a simple thing. You NEED a compiler library to do something like that. And further, the point is that this would be true even if C were NOT the 'de facto' ABI for the system. And libraries like LLVM can do that. You could use it to snarf up a C++ header and then emit code in your own stuff where needed to interact with the interface in that header.
The fun starts when you work on systems were the smallest addressable memory cell is 16bit wide, so your "char" is 16bit; this is true from some DSPs. This gives the mind blowing results that sizeof(uint16_t) is 1, and sizeof(uint32_t) is 2..
Back in the early 90's the Texas Instruments C40 DSP C compiler had char at 32 bits. Actually was a pretty nice dev environment though, I enjoyed it. It had an interesting harvard-like yet flat single address space
The article author doesn't seem to know that the C libraries are themselves just a wrapper for the syscalls that actually connect to the kernel. the kernel is entirely language agnostic, but has a method that is specific to each CPU/ISA it's ported to that enables a function call to a protected kernel runtime level on that ISA (in x86_64 that is: load the function arguments and a code that specifies the function to be called into specific registers and then execute the SYSCALL instruction, but since this is specific to x86_64, different methods get used for arm, risc-V, MIPS, PPC, even x86_32). If you want to compile and create an executable without using the C libraries and their ABI, the answer is simple: implement the kernel interface directly. If you don't want to implement them all directly with your equivalent of an asm() directive, just implement one generic syscall instruction in your compiler that will spit out the right instructions for each architecture/OS you want to support and then make wrapped functions for each of the kernel routines you want to access. It'll be some work, but you can stop bitching for a while.
@@krumbergifyDo a degree, yes, but it's kinda impossible on some platforms where the syscall interface is undocumented or unstable, in which case, it still uses whatever wrapper library exists to make the syscalls.
"The article author doesn't seem to know that the C libraries are themselves just a wrapper for the syscalls that actually connect to the kernel.", no the vast majority of C libraries do far more than just "call the kernel".
@@phillipsusi1791 You're suffering the same lack of understanding that the author of the article does. Whatever standard library has will provide all the same kinds of functionality as glibc, except it relies on glibc to handle syscalls. There is nothing stopping the Rust maintainers from implementing their own SysCall wrappers. Easier on Linux/Unix, not so easy on Windows... But if Rust really is "the future", Microsoft will start to implement a native Rust interface anyway... As to the question of "unstable SysCall interface"... This is answered by Linus MANY "do not break userland" rants; if this is an issue, use a better-managed OS/kernel.
size_t is the same size as the memory address size - 64 bit or 32 bit; It's also the type which the sizeof() macro returns. There is an equivalent signed version, ssize_t which is for functions which want to return a size in bytes or a negative error value
I work with DSPs and microcontrollers. I intentionally put a buffer overflow in the serial console driver to allow for firmware upgrades because I didn't have time to write a proper bootloader before release date. The shellcode both installs the bootloader and fixes the buffer overflow. I hid it from my boss by having the DMA driver, not the CPU, perform the attack, so static code analysis wouldn't catch it. - "Hey, how did you upgrade that without a bootloader?" (Proceeds to show him a very short assembly program that reads from the console FIFO and dumps it into flash) - "That wasn't in your code" - "Nope, but look, it calls the DMA driver to read until line idle without any bounds check, then we just overwrite the vector table mapping to point the reset vector to the shellcode and send the CPU reset command". For those who don't know, JEDEC compliant flash chips wrap around to address 0 when reading past the end of memory. All you need is some way to mask off the address bits that aren't used, which looks like a safety check when writing since it prevents writing past the end of memory and throwing error codes.
This quote and how he is saying it, always pops up in my head. But I think the quote is taken out of context and Linus actually means: "Nothing better than C for Linux Kernel."
@@thingsiplay That is because 90% of people who are quoting it are getting the quote from the same video that's been recommended to everyone and their grandma at this point. I don't even think most people actually watch the video. They just see the title and it sticks.
Just a grumpy old embedded engineer here. The question is what your job is. If your job is to make software that must work on a CPU that has less than 256 pins or operating voltage more than 1.5V, you are most probaly left with only C/C++ toolchains. Assuming you want to avoid assembly. The hardware mentioned above is basically everything your world is operated by. Want something working above 80 degC? Something that should work on a wide voltage range? If one does not want to learn new assembly for each of the targets at hand, C is the language of the cheap hardware. To my surprise, the most vocal people against C is the ones who use joe, vi, emacs or some other -vintage- exotic editors.
In the 80's I worked on spell checkers for typewriters and handheld devices. Even when a C compiler existed for the obscure CPUs being used, it was generally buggy and failed to even meet the loose "standard" provided by the original K&R book. It sounds like it is a little bit better now, but still a big pain.
I remember arguing with a professor's assistant decades ago about the notion of wobbly definitions. I felt, and still do to this day, that programmers need to know how big a number they can store in a given type, and they need to be able to rely on that. He made a vague, hand waving argument about how they don't really, and how software can be more portable if wobbly definitions are used. More portable? The definitions for things *change* when you port, and that makes software *more* portable? That's not obviously fallacious? Either I was missing something that I'm still missing to this day, or that fellow was just kind of ideologically captured. That reminds me of an argument I had with a high school calculus teacher about a theoretical baseball out in space slowing down on it's own with no external forces acting on it. I said if there's no friction or gravity or anything, won't it just go on in a straight line at the same speed forever? No, there would be some rate of change of it's speed, the teacher insisted. Sure there would, I said, but it would be zero. No, they insisted. It would be almost zero, but not quite zero. It would be an epsilon. (The closest thing to zero that's not zero.) Ok, so if it's non-zero, in what direction will it be? In the direction opposite the one it's moving in. Really? Moving relative to what? The ether? It seemed to me that the analogy he was making to try to explain what epsilon is was breaking down a bit. "Maybe we should be talking about a circumstance under which the formula describing the values we're tracking actually involves epsilon?" I got told by him and by class members not to argue with the teacher. I think I facepalmed at the time, and I still facepalm remembering it. Fortunately the calculus/physics teacher at the community college affirmed my understanding. I didn't tell my high school teacher. (One facepalm was sufficient.)
Hah! I'm sure many of us have had terrible arguments with teachers when we catch them out on their own lack of understanding. I agree that code usually needs to make some assumptions about the ranges of values you can store. It's not usually *dire* because `int` is often big enough and a smaller computer will naturally not be able to work on as large a data set, but you're not wrong. That said, there is some truth to increased portability from "wobbly definitions," but it's more about the language itself. You see, C comes from a time when some major hardware platforms had a 36-bit word size and you couldn't count on a (hardware-supported) 8-bit value or maybe any power-of-2 type size. Two's complement wasn't universal either. If C had demanded exact bit sizes for each type, or two's complement, then no C for you! Computer hardware looks a lot different-and a lot more uniform-50+ years later, and it's easy to forget why certain things don't make as much sense anymore.
@@PassifloraCerulea That, sir, is a very good point! And, gosh, C is still used in all kinds of crazy embedded environments. I feel a bit stupid now. I suppose it does make sense to have a non-specific "int" type, and then have platform-dependent types like "int8" and so on.
@@PassifloraCerulea Yeah, such teacher things come in my mind a lot. Even though most of my teachers were really good, I seem to remember most vividly the bad ones because I felt strong emotions. I wish it was the other way around.
I think the wobbly definitions make the code portable in really trivial cases . Like a for loop that doesn't go past 255 or something . I always use stdint.h for uint32_t . I also never use enums because it makes me uncomfortable going "they are probably 32bit ints" . Probably ? That's going to be bad news if I want to bit pack those enums .
yeah the non-zero friction is actually due to virtual particles popping in an out of existence. It's a very weird phenomenon that's only recently been discovered/confirmed/whatever
Programming is like a lake, a very old untouched lake with no fish in it. The surface is all nice and clear but deep down there's mud and decaying plant matter. I like diving deep in there cuz I like rolling around in that mud but it sure feels icky every time lol.
30:50: How are you going to have int32_t on a 9- or 7-bit machine? And what are you going to do when you a size, a memory address expressed as an integer, or a difference between pointers? Sure, you don't need short, int, long, and long long, anymore as you have {u,}int_{fast,least}N_t nowadays, but there still are integer types where you don't want a number.
Isn't __int128 a compiler extension? So different implementation in different compilers is expected. And you don't need glibc to create file you can make syscall directly
Sure, you can make a syscall directly... if you know what ABI the kernel that you are running on speaks. And even the same kernel running on the same CPU can speak different ABIs.
Ok... this observation doesn't make it any easier for languages that aren't C to be able to call functions defined in C header files that contain the text "__int128" in them
Also, calling syscalls directly is not supported by any OS except linux IIUC. For example windows requires that you link to KERNEL32.DLL and call functions there instead of syscalls, and the API for it is defined in windows.h which I'm sure contains every possible horror you can find in C.
5:03 I would argue that's partially untruth. You could just do a wrapper around single every system call (this would require writing assembly) and you would be good to go. The major question is: are you willing to do that?
Her core thesis is that writing a standard library for systems is impossible because you HAVE to either use libc or write assembly, and "of course" writing assembly is impossible, therefore it cannot be done without C.
Everyone gets so hung up on the syscall example and completely misses the FFI thing, which is not about syscalls only, but also about how you make different programming languages *of any kind, in any situation* interact with each other. I don't know about you, but when I'm programming in CL and trying to get a compute shader to run, I'm not gonna try and figure out how to do it in Assembly, especially if I want what I'm doing to be portable.
size_t is used in std c/++ because it's the return type of sizeof. The reason they do *that* is because anytime you're working with arrays, indices, byte sizes, etc. are all times where that value can be potentially converted to/from a pointer or used as a pointer offset. That means it needs to be able to hold any value a valid pointer could so on 64bit machines it needs to be a 64 bit int, on 32 bit it needs to be 32 bits, etc. Thus size_t avoids all of the nonsense of int vs long vs long long, etc and avoids the hardcoded sizing of int32_t and just says "whatever the size of your pointer is the size of the integer you should use for indexing, here you go". C++ mucks this up a bit because it has extra large pointers in some cases due to multiple inheritance, v-tables, etc.
The problem seems to be that versioning and compatibility aren't represented explicitly in C definitions, but are implied from the field order of structs or type alignment. For a universal base ABI, you'd really want something like protobuf, which has explicit field numbers, a first-class concept of "unknown fields," and a defined, platform-independent serialization. All of the clever workarounds that MS does in the minidump example should really be part of the interface specification language. Also, we got ourselves into trouble by using the same type for "an integer (of some size)," "a pointer," "an offset," "an address," and "an integer that is fast for this CPU." Obviously, it's about 40 years too late to do anything about any of this, but we can shake our fists at the sky.
8:56 While the real sizes of those depend on the implementation of the data type, the minimum size for a short is 16 bits, 32 for a long and 64 for a long long. A short long would be the same as a short and a short short (half of a short) would be the same as a char.
9:00 When I was a programming student, I had a friend I'd correspond with by email every now and then exchanging homework assistance, and I'd just gotten introduced to C (via GCC) coming from Java experience in high school, and I personally maintain that the funniest exchange she and I had ever had was when I was messing around with the type system in C, learning about the sizes of different types, and managed to get back the error message "'long long long' is too long for GCC". Like "oh, 'long long', that's fine, but 'long long LONG'??? That's going too far buddy!" 🤣
Yea, int was supposed to be 16 bits on 16 bit CPUs, 32 bits on 32 bit CPUs and 64 bits on 64 bit CPUs, but they kept it at 32 bit for x86_64 because "changing it would break things" ( namely badly written software ). I have worked on a CPU where int was 24 bits and long was 32. Go figure.
@@phoneywheeze That's a completely unrelated concept. Even if you did malloc your ints, you still need to pull them into registers to actually do any math with them, and at that point you need to know how many bits the CPU will add together when you ask C to grab two ints and add them to know if your program will break in the event that, say, the user asks to make a table with more than 65535 rows.
You can call these functions including the binaries of the required C libraries into your compiled code by calling them in assembly or machine code. But use them as your programming lingua franca is worse than use C.
I've had to deal with this at work. We have literal meetings over ABI/API compatibility and together design a header and review it before it ships. (I'm just a junior dev tho) Thankfully, there are standard methods to do this well and maintain compatibility, but it comes with extreme cost. When a new feature is at potential odds with the old one, and there is a change that old and new types can mix, then oh boy. For example, what do you do when multiple apps load your shared object. Shared objects are typically loaded only once into memory but mapped to different virtual addresses b/w apps. The object file has both the versioned APIs - v1 and v2. One uses v1, the other uses v2. But both the apps talk to each other. You'd think everything would "compute" in v1 but the second app does not "know" about the first app. Remember that struct members are just offsets from a pointer so if you get a v2 object from the second app but was created by v1, in your library you're reading out of bounds in your library ... but it's the fault of the second app! (It should've set the version to 1 instead of 2). I dreamed up this scenario, but I wouldn't be surprised if it happened at some point in time. Hey hey what if apps exported their ABI info as json :D along with the binary. Then we can just use that to interop. This would also allow languages to change layouts (as in consecutive members don't need to be in the same order in the binary as they are declared)
This is just another case of the diamond dependency problem seen with objects, and should be solved the same way. Make the base abi definitions include semantic versioning info, and resolve it at link time or load time. If it cannot be resolved at that time, don't let the code run. C has always been defined as a source portable language, not an abi portable language, but some system and low level library developers don't help.
I remember thinking about this when I worked mostly on systems programming. I think it's more a problem with operating systems than it is C. An OS could have a more abstract Interface Definition Language. I think fuschia does for instance. As with most things in operating system design it's terrible to try and theorize about. The most shocking thing I learned as systems programmer was that Microsoft is occasionally better than the linux developers. Their backwards compatibility with ABIs is better imo and the ability to link shared libraries without needing the full .so to compile is clearly better afaict. I don't know how glibc can exist like this in the year of our lord 2024 I must be missing something.
at a time it was compiler dependant, but nowadays they shift to be platform dependant, but there's a reference definition for the minimum size of each integer, and the minimum size of a int is a 16bit integer
Now I feel old and way under knowledge. This debate of frustration takes me back to nearly the same debate in '71 (when my friends were introduced to BASIC) and again mid 80's (when MS published 7 volumes of Windows libraries. I am sure this was debated before my time as well. Each time the debate split folks into a couple of camps. Both sides said F-it I roll my own and those that dug into figuring out the next level.
C is turing complete. And therefore it can run any (computational) program, or any algorithm you can come up with. Incidentally, CSS is also turing complete. CSS is actually a programming language.
As someone who's studied enough C++ to know how it differs from C, the moment I heard Rust and Swift use C to talk to each other, I knew somebody, somewhere had F----d up big time and we're all paying for it now... Good to know my hunch was correct!
I've read their blog before/know who the author is: They have written the Rustonomicon, _the_ thing to read about unsafe rust. The "sounds like skill issues" by the Primagen at 6:20 and the chat at 6:23 "WTF? Even ChatGPT would do better than this article" made me once again realize that the Chat and often times The Primagen too have absolutely no idea about some part of programming.
She also did a ton of work getting Rust to have functional FFI with many different C compilers. She is absolutely the person to defer to on this kind of stuff.
Does that not confirm the so called "skill issue" then? Because She's an expert with unsafe Rust not a C expert. Nevertheless, that is still experience she got for unsafe Rust interfacing with ABIs. Judging by the other comments here, it seems the ABIs goes deeper than just C
@@JArielALamus for one thing, I'd wager she's more of a "C expert" than most good C programmers are, since compiler people generally need to know a lot more specifics about languages. For another, I assume "skill issue" was a joke response to the "you can't actually parse C" claim.
@@WHYUNODYLAN I don't know about that. I have seen the responses to this article and for the looks of it, she is indeed talking outside her area of expertise. I'm not an expert either, so I can't say for sure, I can only go by the information I'm seeing. Also, by those same responses, it seems working with unsafe Rust is a really challenging task. I'll take a deeper look into all of this, I may learn new things. Anyways, thanks for pointing out about the Rustonomicon author
Fushcia is the only modern OS that doesn't rely on C lib. It uses it as shim for programs that rely on c lib but it's not required. It uses language agnostic Fuchsia Interface Definition Language (FIDL) to describe binary interface.
C is almost a real programming language. Assembly is the real programming language, except that assembly language is a collection of hundreds of wildly different languages. Z80 assembly is very different from PIC16 or DSP56k or S08 or S12X, ARM, IBM 370, RGP30… all different. That’s the source of the type/size issues in C. I’m lucky enough to program embedded systems but even then, C isn’t the same between different processor platforms.
@@thingsiplay are you implying anything? this style of writing is popular with young people. it is styled differently than older folks might be used to, but this is right up my alley as a twenty-something years old. one sentence in a paragraph for the purpose of emphasis is quite common for our generation
as a C and C++ embedded SoftEng I'm in the unique position to understand size_t as an important type. you see, when making cross platform code that handles memory you need an (usually unsigned) integer type that is guaranteed to line up with pointer types within the system. this is done to reduce or avoid conversions when performing pointer arithmetic or storing pointers as address values(see how C handles ++ operator or addition with any type pointers, and why) when seeing how those things operate on a basic level you get to understand why those inconsistencies exist, and why for anyone above system/embedded those language types should be used either as a "I just need whatever type here" for things that are not cross platform, or not at all. leave it for people like me to know which type to put in the "int64_t" typedef line as we are already making and dealing with stupid system defines on a daily basis and need those system specific types for many things. lastly. even where I work, no one wants to touch "long", it's a remnant of the transition out of 16bit to 32bit and was therefore 32bit originally. "long" in contrast to "short". but what about int? who the fuck even knows at this point? those types were used by people back in the days where the use of typedefs was basically reserved for structs and unions, when token-pasting wasn't even a thing yet, and many more stupid limitations, they didn't know any better.
C is beautiful, anyone who thinks otherwise never got there through Fortran 77 and Basic before that. C is the lightsaber of programming languages, if you cut your leg off with it then you should have stuck to blasters.
My first job out of college was using mainly C on an HP-UX Unix machine in early 2000's. The app controlled time clocks, sent signals to machinery in the plant, etc. It had to be real-time. My boss and I would write some assembly to pull in because the C function wasn't fast enough. Great times!!! After my boss and I left that company around the same time I went to C# and never looked back. I still to other stuff SQL, Javascript, Go, etc... now at a robotics company I'm writing C++ for micro controllers. I still do C# / SQL there to communicate to the robots.
I think what programming language authors forget is that there are standard ABIs like the SysV and Windows ABIs (these are not C ABIs). If you use them, you could just link with any language that has a compiler for that specific ABI. No need to parse any C code, just mark the function as external and let the compiler emit code that conforms to that ABI and then let the linker link stuff together. It's a widely known functioning interface that language authors somehow forgot how to use. Yes, I get that people don't want to deal with all of the object files but how else are you gonna glue two languages together.
@@kuhluhOG IIRC I read through the AMD64 SysV abi when I read this article a while back. 128 bit integers aren't really defined in the ABI so each compiler is free to do whatever. The answer is that if you use weird types on function boundaries you are likely going to have problems, just push it to the stack.
@@Exilum I don't think it's the bigger problem here. It's that languages are trying to parse C instead of linking with it. C *should* be like any other language and adhere to the ABI.
Tsoding Has done a lot of work with C FFI and stuff. It may be useful to take a look at his approach. At the end of the day C isn't too much complicated than ASM. Technically all those types are just bits and bytes on the assembly level.
int was supposed be 64 bit on 64 platforms. But so many people used well... very poor programming practices in the 32bit era and just assumed type sizes. Of course when you compile for a new target this breaks shit everywhere.
C didn't really have standard intX_t types back then. So I believe when you don't have a way to explicitly specify the size of an integer you start to assume sizes of int/long.
The best thing objective-c and swift have are protocols from the small talk message based roots. It lets you do soft inheritance where things don’t have to conform structurally like memory mapping stuff in c
Longs are not well-defined. That's the point. The type sizes change based upon the platform. It can be literally equal to an Int on some platforms. You must inquire using #ifdef kind of things to figure out what applies at compile time. The problem is not that C has no ABI, but that NO ONE clearly defined an ABI and said, "This is what we're using." In fact, if you do that, you're pushed into the world of JiT compilers. There is an easier way to do it: publishing a standard for defining interfaces that involves Linkers and Loaders doing some last-minute fixups between different binaries at load time. You'd basically assume a fixed ABI across all machines, then have the Linker/Loader inject helper transforms to the local environment. It's different from JiT only in that, once done for a given machine, it could easily be cached for the next run-time and there is (theoretically) no overhead that isn't necessary for ANY ABI abstraction to work on pre-compiled software. No one has written such a standard or modified a linker yet (that I'm aware). The alternative for fully-compiled code is this: "C is the standard OS-writing language. They must have C. Let's use C as a glue to patch together our interfaces."
6:44 Nnnnope! I'm actually in the process of writing a C/header parser. Yeah, it's a b!tch, but you just have to put in the work. But then again, I'm old school; I'm quite used to burying my head in code for days/months/years on a single project.
You should check out the link given for "parsing C is basically impossible" in the article, which contains such beautiful test cases as: typedef long T, U; enum {V} (*f(T T, enum {U} y, int x[T+U]))(T t); // The above declares a function f of type: // (long, enum{U}, ptr(int)) -> ptr (long -> enum{V}) T x[(U)V+1]; // T and U again denote types; V remains visible These are some absolutely hellish things which even existing compilers get wrong, and writing a correct parser from the pile of prose rules in the C standard is not at all clear, especially when there are mistakes in the standard as well (some pointed out in the same paper). The paper's title is "A Simple, Possibly Correct LR Parser for C11" - even after that in-depth investigation the authors still can't claim to have a correct parser for C11. If you think it's "just a matter of putting in the work" I think you don't understand what you are up against. Then again, if you are okay with being wrong in a bunch of edge cases then yes, it's comparatively simple, and nothing to apologize for either, even the big leagues are doing that because the actual C standard is bonkers.
Same. Whenever I am doing C stuff, I (almost) always use bit-defined types like uint64_t unless I can't such as interoping with spicy Win32 stuffs which is distressing frequently for me since a lot of the stuff I do C/C++ stuff for is game hooks (speed run tools, randomizers, etc.) where my code will be injected into the process and I'll be working with Win32 and DirectX APIs a lot of the time.
8:52 I call 64-bit integers "long" and Notch does so too as it seems, but in Python3's struct module a simple "long" is identical to an "int", a 32-bit integer. There you need a "long long". But I thought "long" meant "it's longer than the default"?
@@thingsiplay Does it, though? Can't speak for programming but what I've found with art is that you end up trying and trying so much to make it work with prompting, you'd be better off having it done professionally by someone with experience in the first place.
@@Mayhzon That's because you are not satisfied with the many produced content. You produce a lot, meaning high quantity, with low quality. That's why you are looking for a professional. So my statement aligns with your statement as well. :-) voila
@@zeratax It's the thing with "long". What is a "long"? How many bits does it have? What's the alignment? C does everything it can do to not give an answer to that. It all depends on the OS-cpu combination. If you work on a programming language, you want things like that to be hammered in stone.
@@krux02 i just don’t see where exactly the article states that every other language should do this instead of c. all i see is the author describing why c is a protocol and not a language and how insanely complicated this problem is.
this article doesnt quite grasp that my day job is writing the same c code that runs on your 128-bit processor... but for a 16-bit AVR. And with slightly different typing, it is surprisingly portable
Confusing article that makes almost no sense at all, even if the author had the right ideas, clearly couldn't express them. This could be explained in less than 5 minutes by Casey :shrug:
C only got ”int” integers. Want it longer? use ”long int”, still not enough? use ”long long int”. Need an other bit? ”unsinged long long int”. Changed your mind, actually didn’t need that something shorter than an int? No problem ”short int” to your rescue. If you want values of specific byte lenght, use the standard integer types from stdint.h, here is a few examples: int8_t, int16_t, uint8_t.
The issue is that C ABI the protocol is written in C the programming language, so you have to understand the latter to be able to parse interface definitions using the former. This is stupid and unnecessary and causes all of C's shortcomings as a language to impact the quality of interfaces that would otherwise have nothing whatsoever to do with C.
@digama0 wrong. The ABI is a protocol. Protocols ate not language dependent. You are as foolish as the article author. He did eventually state the real culprit which wasn't the ABI. It's that "major OS" all describe the interoperability to their API and communication between shared libraries using C headers which are hard to parse. Just like the article, you are whining and not properly constructing or emphasizing the real reason. So this article mainly scrambled the minds of noobs as the main point was glossed over while he blamed the wrong thing.
@digama0 it has literally nothing to do with C ABI but is a protocol. The article did no justice and glossed over that "major OS" and their APIs and shared library formats are the whole source of the issue. So what you said shows lack of understanding of the entire thing. Although I think a lot of people are blindly following that article which mixed things up completely
All in all it was a whiny article about OS and shared library interfaces using C header files which are hard to parse. And it was presented as a war against C ABI which has nothing to do with it. Quite lame
@@gregorymorse8423 "All in all it was a whiny article about OS and shared library interfaces using C header files which are hard to parse." Yes. This is dumb but it's also the reality of what people (like Aria!) have to deal with in order to interface with the rest of the world. How do you expect progress to be made if we don't speak out when things are bad? I assure you this is not coming from a lack of understanding, but rather from knowing just how messed up the whole situation is and yet how little can be done to improve it short of moving all the big players in the world away from "the C ABI". Sure, it's the fault of these big players for using C, but it's also C's fault for being a bad ABI. It never really planned to be one, so that's justifiable, but it is one nonetheless and as long as people ignore its shortcomings we won't get anything better.
There is a reason why types that don't defined a set bit width are standard for C and C++. You can't guarantee that a particular platform will have a particular size. If you have just i8 and i16, but you need to write code for a platform that doesn't have either of those, what do you do? While this is EXTREMELY unlikely in a modern context, both C and C++ are obsessed with "backwards compatibility" Something once compiled for an DSP56000 should still be compile-able for an DSP56000 . I disagree with this stance, but i'm in the minority.
The TLDR of what I've seen here is that all the changes they described is similar to how some programming languages allow for function overloading, but they're doing it at the memory and symbol table level. This is almost identical to how 64-bit instruction set extended the 32-bit instruction set, but this time they'd be doing it for an entire language (C), except that doesn't really work when switching across operating system because C has data types (like long and long long) that don't translate well. What's even the fix to this?
When Rust has an OS written in Rust they'll have a stable ABI, when they have two OSs written in Rust they'll realize they don't have a stable ABI anymore, especially if there's commercial competition between those OSs. (not even thinking of competing Rust compilers or instruction sets). That's the real history and origin of ABI issues. People blaming the language are misguided. You can look enviously at languages running in a single source sandbox wrapped in bubble wrap with training wheels on, but that's not the environment C or Rust want to run in.
In embedded we always use int32_t, uint8_t etc. Makes life much simpler, I can't stand that most codebase for PC is 99% ints, long longs and other stuff.. Using anything else makes 0 sense and is forbidden by every dev. We do the same when writing embedded C++ code and we stay away from std:: and everything bloated from it.
because when you do embedded you know the architecture of the machine. PC codebases need to be able to run on different types of machines with minimal changes.
@@theheadpriest on PC you have 2-3 architectures. In embedded you can reuse same codebase between dozens architectures and 8, 16, 32 bit processors with different endianess.
Back when I started programming, there was no common interface between operating systems. To create a compiler you wrote in assemble calling the system specific APIs. If you wanted to port to a new system, you started over from scratch. If you don't like C you can still do this 😅. Of course many of the interfaces are C-compatible 😮. Still you aren't using C. You are just spending 1000 times the effort. As far as standardization goes. Try using pre-ANSI C and make it work on hundreds of compilers on dozens of CPUs. Back then you could not even assume that the word size was a multiple of 8. While trying to make the current situation better is admirable, this still sounds like a crybaby who doesn't know how good they have it.
This is why I never understood why we're so determined to even define something like intmax_t. Every type should just have its bit count attached to it and we should declare every variable we use with that scope instead of relying on the compiler to figure it out.
It made sense back in the day where some machines were 16bit others were 21 others 12 and such but we definitely need a ... not c but a &c (reference c) protocol that doesnt accoutn for now entirely defunct and never gonna happen again 21bit machines and such, but for that you'd need literally everyone in the world to agree to recompile every single thing to that new but correctly stable abi with defined sizes
by definition they are different, i agree that the whole int, short int, long int, long long int is a mess and it's arguably worse for portability, but the definition for the intmax_t and uintmax_t have nothing to do with precise width integers, it's by definition the largest int that the standard defines, this only problem is with c23, because the standard defines a 128 bit precise width integer, while also allowing intmax_t to not be equal to sizeof(int128_t), the standard is not technically finished yet, so this is not a problem, specially if you cite __int128_t, that it isn't part of the standard, and this is not a compiler thing anymore (in the old days the size of any integer type was compiler dependant), nowadays this is defined by the platform so you don't need to rely on compilers to know the width of each type
None of this even mentioned data field padding between unaligned types in structs even when you get the size right. That's a whole other thing and is compiler dependent and adjustable in compiler options. I once had to communicate between fortran and C and the same types with compatible sizes in data blocks (structs) were padded differently by default between the Fortran and the C compilers. That was ~30 years ago. This isn't a new problem, and even C had to try to be compatible with earlier languages on occasion.
Ok, to limit misconceptions - C standard officially do not support int128 as a type - it is a “modern” compiler extensions that are not guaranteed to have a stable ABI between compilers and topic author knowingly put this exact fact to test an ABI to make a discussion feel cringey. This is a problem of modern C/C++ as stated that many types are required to be updated to 128 bit versions (like intmax/size) if wider integers are introduced as part of standard it will not be an ABI break as there is no guarantee about width of this types - but all code will suddenly be broken as everything works on them. BTW: this is the reason why we have 128 bit (and more) capable processors but no company risk to advance further as this will break C/C++.
c23 is supposed to have int128_t, however they will allow you to have intmax_t not capable of holding a int128_t value, but yeah , the issue in the article is not a problem
Wait... does this mean that any C-program written to take advantage of AVX-512 uses something like int128 instead of something like int512? (side-note, I see here that Unreal Engine actually has something called int512 as a type definition) But what about some of the newer existing/planned RISC-V vector-extended CPU's? The spec' goes all the way up to 4096 bits wide instructions! :O Is everyone just gonna' leave that on the table? Apologies if this post is silly, since I'm quite the beginner.
@@predabot__6778 in this case it does not have a stable ABI, so the programs can't have interoperability, each program has it's own definition that only suits their needs, in fact even the System V ABI for i128 in x86_64 is completely optional
@@predabot__6778 I don't think C itself knows about SIMD. It is the compiler that may for example unroll a loop to a SIMD call. An extension may also be created to explicitly call SIMD or you can use assembly.
8:53 Truth. For years I had longs defined in the dynamic types on my application platform/OS. I just never thought much about them. The system is adaptive, so the size isn't an issue. However, longs always require 'deviant' code since there is no agreed-upon size for them. So, "Removal of longs" is on my TODO list for my platform. No big loss.
Back when humor wasn't banned from compilers, I had heard of the "long long" and decided to see what it thought of a long long long. The error message was "long long long is too long."
Such a shame. We need to bring back humor and easter eggs
And long long is two long :^)
@@Sven-W Long long long long would be two too long :^)
That one is, sadly, not from MPW C.
C love you long^2 time max
Alternative title: dyslexic man quits job to read articles full-time
We all have "dyslexic man that quit his job to read articles to us while we drive to work"-itis
Either that or ligma.
@@usmanmehmood55 what's ligma?
@@DeusEx3ligmaclang
ligma nutz
C isn't a programming language... It's a lifestyle
yes, skill issues in C lead to programming in rust.
C is a mindset. C is a grindset.
- said no one, ever
Rawdogging bits 4 lyfe, Bois!
Like putting a label on a lifestyle. So they can water it down
18:51 C++ increments the value of C, but yields the old value
Why do i love this comment, i must have clang for brains
Bjarne Stroustrup should have called it C-- as it yields C and then by the subsequent sequence point every new feature makes it worse.
And the funny thing, it increments after the ; for elemental types, but for classes it does it immediately. (OK, C has no classes, and C is not a subset of C++)
@@____uncompetativec-- is another language too 😅 it is how haskell gets compiled
@@steffenbendel6031 C is mostly a subset of C++, with some silly divergence making C less ergonomic because the standards body didn’t want mangled symbols. There’s also a few “hidden” differences that may on occasion bite you: for example, character constants are still type int in C, but are rather sensibly type char in C++. Plus even if you never touch classes templates & variadics, type introspection, and proper function overloading push C++ over the top for me.
"Sometimes, I don't think I realize, just how much of shoulders we stand on"... So true.
Ugly malformed shoulders
Probably at least an intmax_t amount of shoulders.
@@macchiato_1881 undefined amount of shoulders
I worked on a commercial product, started in 1980s. The first public version was on the IBM AS/400 [now I5 series] and it used 128 bit addresses. The pain making it work was severe. Later on it was published for s390, Linux,Unix,Windows... both 32 and 64 bit... and that was relatively easy. The hard work had all been done.
I have to say as a newbie that learning C has made me understand a lot of fundamentals better. In other languages it felt like the added abstraction didn't really allow me to see the inner workings as clearly.
Not only as a newbie.. It gives you so much understanding of how computers work its silly.
Yeah, it makes you wonder why the geniuses who knew the hardware intimately by writing assembly all day were desperate for high-level languages with lots of high-order abstractions, doesn't it?
@@JerehmiaBoaz They probably weren't actually desperate for "high-level languages with lots of high-order abstractions".
It's more likely that someone else wanted them, so they begrudgingly created them.
That's a trap. C is very far from the metal in very important respects and if you think that it works like the hardware below it does, you're in for some really nasty surprises. You expect an integer overflow to result in a wraparound? That's not in the spec, and a lot of compiler optimizations will actually show you that C does not work like the underlying hardware there.
If you wanna know how hardware works, ~~learn~~ write a Forth.
I'm an old fart, and trust me... don't fall into the immature type of zealotry you get with stuff like this article. The fact is, C is the reason we have lot of the tech we have today - period. You could argue its time has passed and things evolve (as they should) but hating on something that served us is no better than hating old buildings that don't meet modern standards. It's a great language that's just dated. Should we move on? Yes. Does it help to learn it? Yes, for educational purposes. Should you learn something else too? Yes. Don't listen to the zealots. They're going nowhere in life and will only have their rants online where nobody successful cares.
Two programming languages kissing each other and realising they are both C is like having two blind folded cousins in a kissing booth.
And their love child is a horribly disfigured abomination. The analogy holds well.
Sweet home...
@ragectl
Hot
@@theseangleAlabama
C (and sometime C++) is the lingua franca of embedded systems. Of all the dozens of embedded processor in your computer right now, most of them are written in C. (flash management processor, hard disk drive processor, PCIe PHY controller, touch panel processor, etc , device drivers.. to name just a few)
Congratulations you discovered why C can run on literally anything with some sort of processing unit and memory (it's because the entire standard is up to the hardware's implementation).
So true, this is the reason for so much different ISAs and even different architectures, such as GPUs ou DSPs, using C language most of the time. They can use C programming model and adapat it to the ISA/model specifics.
Good luck running all those modern languages on a DSP with 40bit integer and 16bit char.
C can handle it beautifully.
It is not C which has a problem. The problem is the false hardware assumptions in the other languages.
@@vonnikonThe problem is that the author fundamentally doesn't understand ABIs.
NO! This is a complete misunderstanding. It takes an immense amount of backend compiler work to support even a single target, and this isn't made easier by C's type system since the stdint types still have to be supported. The ideal, rewrite history scenario is 1) C's types always functioned like stdint types and 2) compiler authors didn't name targets the same thing if they had different ABIs.
@@WHYUNODYLAN It's not easier *today* when all architectures work in a broadly similar way there is a reason the original type sizes in C were implementation defined and that's because you could have an old supercomputer from the 70s that only worked with 24 bit words as its smallest addressable memory unit and you needed to support that.
size_t: unsigned integer type than can address any offset within any memory section.
ssize_t signed integer that covers at least [−1, ½(SIZE_MAX−1)] (NB!, the only negative value that is guaranteed to supported is −1)
rsize_t: intersection of size_t and ssize_t.
intptr_t: signed integer type that express any memory address and uses two's complement.
uintptr_t: unsigned version of intptr_t.
ptrdiff_t: this one is a real mess.
Wait, intptr_t is actually defined as using two's complement? While regular numbers are like "Ehh, we don't know what we want and what we are, we just exist"?
@@dixaba C23 finally requires two's complement representation for all signed integers, idk about intptr_t in the old standards tho
ptrdiff_t is just whatever is most relevant for storing the pointer subtraction result, and is always a signed type.
(Come to think of it, it is actually a stupid idea to have it be signed, because if it is of the size of a pointer (which makes sense and is how it usually is) but signed, it can only handle differences whose absolute value is not more than half the memory size, and everything beyond that is either overflow or underflow which is UB for signed types. It would make much more sense to have it be unsigned so that wraparound on it is well defined and it produces the right result but without worrying for signed integer overflow/underflow)
@@kodirovsshik difference and comparison operations between two pointers are defined only if they belong to the same object (which means, if you calculate difference between &a and &b, you actually have an undefined behavior). signed result is needed so you can tell if one address is smaller or larger than the other. if you make it unsigned, all the negative values are actually end up as very large unsigned values, and i'm pretty sure this will cause larger issues with UB than a integer overflow ever would (for example, imagine having a machine that doesn't have an int type of the exact size as the pointer type, adding ptrdiff_t to the pointer would probably result into the out-of-bounds address).
@@dziuaftermidnight In order to tell if one address is larger that the other, there is a much more intuitive &a < &b instead of &a - &b < 0. ptrdiff_t being signed offers nothing in this scenario
To C, or not to C.
we should C# 😔
@@blobglo I just barfed in my mouth, a bit.
I C what you did there.
to C is to believe
why not brain fuck
Well, the optimal C implementation for the hardware in a toaster and the hardware in a supercomputer are completely different, that's why its vague there isn't a one size fits all solution. C's flexibility for HW manufacturers was always its appeal.
Any language can be compiled to run on any hardware. Just need to have a written compiler. Which is pretty standard. People choose C, because of what it can do - compared to other languages. It's a mid-level language. It allows you to interact directly with hardware. Something other languages abstract, meaning you have no ability to do much of anything advanced. You can't write drivers in Java.
Sadness. Don't blame C for only solving 90% of "computers are not all the same" problems. It was never a problem with C. Heck, as a C programmer, I also hate dealing with linkers, calling conventions, and ABI's. Its the ugly underbelly of software engineering. The alternative is we hard-code binary interfaces for every piece of hardware we might expect to interface with. Its the very reason libraries, drivers, and operating systems exist.
im sure gcc, clang hardcode it already, and by the look of this article so does every other language. (Either that, or they use C). So its just hard code all the way down
You can use FIDL to describe ABI and allow automatic tools to generate required bindings. That's what Fuchsia uses.
Agreed.
The author doesn't really understand the distinction between the System Call Interface and the C wrapper library that most people interact with.
He does not understand that all C implementations are exactly that and they have freedom to do their own thing.
This is not a failure of C or even other languages, it is the nature of the beast and he is simply unaware that people "don't bother" writing their own SysCall wrappers - they just reuse glibc or whatever.
The Linux System Call Interface is fixed and Linus does a LOT to ensure it never breaks (the rants he has about this 😅...)
There is nothing to stop Rust from implementing a SysCall wrapper library that bypasses glibc. They just haven't done the work.
The article wasn't attacking the idea of a portable binary interface specification language, it was pointing out how insane C must appear to be in fulfilling that role. In web programming there's protobufs, graphql, openapi, none of which demand that you parse an unparseable language if you wish to interact with the system exposing the API.
@@gnuemacscoin interesting. i was wondering what would happen if linux kernel exposed a protobuf api when I saw that linux interface.
What is long? It holds at least 32bits worth of data, says the C standard. It holds exactly 32bits of data, says Windows. It holds exactly a pointer, says Linux.
What if I'm compiling for some tiny 8/16 bit cpu running Linux?
@plaintext7288 you use a uint8_t and uint16_t
@@plaintext7288 Linux doesn't support 16bit computers.
@@plaintext7288 you can still emulate larger types. (that's how we used 128bit types before)
@@plaintext7288 long is at least 32 bits. Your compiler may throw an error or insert code to do the math for a 32bit number with a 8/16 bit cpu. This will be horribly inefficient, but it will work.
Yah, C is the supreme authority from which all other languages, tools, apps are derived.
Mother ship.
The thing that I don't get is the obsession of making a new language for every single niche task these days. It's a lot like the web dev, "how to lose focus on mastering a single thing".
Ideally I want to go back to a standard few powerful languages and just use a well defined lib for a certain task.
C was great in the past.
If you create a core library that you want others to use to create programs with and if for example it's a library implementing a new protocol or part of it then it should be written in C. If it isn't then I assume the library isn't a serious project. Core libraries need to be fast, use small amount of memory, not crash on a failure (looking at you rust, with panic on memory allocation failure), be able to use on any system and platform and from any programming language.
LLVM is part of it. You don't actually have to make a full compiler anymore. You can make a language parser for any syntax, plug the result into an LLVM backend, and claim your new language has "comparable performance to C".
C avoids any ABI by design. Its up to the OS and architecture to do the ABI. The function calling convention (part of the ABI) is the prologue and epilogue the architecture probably suggests to include...
12:50 I dont know what zig people has told you to believe that zig is literally a fully fledged C compiler but its literally just clang with additional flags 😅
Me overhere using custom types for everything and never using built-ins..... Ahhh i love c
Do you write your own string class as every programmer did in the 90th?
The ABI suffers from having to continue to be "C". And C suffers from having to continue to be "the ABI". It feels like we'd need a software equivalent of the phonetic alphabet, i.e., a synthetic language that nobody actually uses, and doesn't even make sense on its own (no "grammar"), just for the purpose of creating well-defined interfaces. But then, that language would probably have to "talk C" too. Sigh...
If the ABI is well written, it does not need to talk C but C needs to talk the ABI. The ABI comes before C. C is just a language like any other
Nn
> You can't parse a c header
Zig: skill issue
I build I c header parser a long time ago. The first try was just to recognize keywords and attempt a translation, much like a macro processor. I klash-trashed around with it for some time, and the program got quite large. Finally I decided that it wouldn't work in terms of handing all case, so I realized that the program needed to be the top half of a C compiler with the bottom (actual code generation) chopped off. I rewrote the program as a scanner/preprocessor followed by a syntax follower. That worked much better.
Zig does a lot of things that might as well be magic
Macros? Nah, just use the Zig but tell it to do it at compile time
Cross compiling? Yes and fuck it, use Zig as a C or C++ cross-compilation toolchain too
Build system? Just use Zig for that too, and use it to build anything, including things that aren't even written in Zig
Binary size? Smaller than C in most cases
Absolute madness.
So... the author is crying because it takes more than a couple of lines of code to do that? The typical solution is... a library. Maybe something like llvm.... If you're doing compilery stuff, why complain that you might need to employ a complier library?
@@ovalteen4404 Because the purpose *isn't* to compile c++. The purpose is to communicate with an ABI.
@@ITR But as should have been made painfully clear in this article, communicating with an ABI is NOT a simple thing. You NEED a compiler library to do something like that. And further, the point is that this would be true even if C were NOT the 'de facto' ABI for the system. And libraries like LLVM can do that. You could use it to snarf up a C++ header and then emit code in your own stuff where needed to interact with the interface in that header.
At which point do we throw everything away and do THE big rewrite?
It would be funny if history just repeated itself even after THE big rewrite
never
@AlFasGD i approve of a big rewrite every 64 years
Things like extreme archaic ISO standards and all the industrial applications C is used it keep it from evolving.
The amount of times I think about this is unhealthy.
The fun starts when you work on systems were the smallest addressable memory cell is 16bit wide, so your "char" is 16bit; this is true from some DSPs. This gives the mind blowing results that sizeof(uint16_t) is 1, and sizeof(uint32_t) is 2..
Back in the early 90's the Texas Instruments C40 DSP C compiler had char at 32 bits. Actually was a pretty nice dev environment though, I enjoyed it. It had an interesting harvard-like yet flat single address space
The article author doesn't seem to know that the C libraries are themselves just a wrapper for the syscalls that actually connect to the kernel. the kernel is entirely language agnostic, but has a method that is specific to each CPU/ISA it's ported to that enables a function call to a protected kernel runtime level on that ISA (in x86_64 that is: load the function arguments and a code that specifies the function to be called into specific registers and then execute the SYSCALL instruction, but since this is specific to x86_64, different methods get used for arm, risc-V, MIPS, PPC, even x86_32). If you want to compile and create an executable without using the C libraries and their ABI, the answer is simple: implement the kernel interface directly. If you don't want to implement them all directly with your equivalent of an asm() directive, just implement one generic syscall instruction in your compiler that will spit out the right instructions for each architecture/OS you want to support and then make wrapped functions for each of the kernel routines you want to access. It'll be some work, but you can stop bitching for a while.
Go does this, right?
@@krumbergifyDo a degree, yes, but it's kinda impossible on some platforms where the syscall interface is undocumented or unstable, in which case, it still uses whatever wrapper library exists to make the syscalls.
"The article author doesn't seem to know that the C libraries are themselves just a wrapper for the syscalls that actually connect to the kernel.", no the vast majority of C libraries do far more than just "call the kernel".
How about using hardware api's like opengl? i'm pretty sure that all GPU vendors wrote their opengl API implementations for their gpus in C
@@phillipsusi1791 You're suffering the same lack of understanding that the author of the article does.
Whatever standard library has will provide all the same kinds of functionality as glibc, except it relies on glibc to handle syscalls.
There is nothing stopping the Rust maintainers from implementing their own SysCall wrappers.
Easier on Linux/Unix, not so easy on Windows... But if Rust really is "the future", Microsoft will start to implement a native Rust interface anyway...
As to the question of "unstable SysCall interface"... This is answered by Linus MANY "do not break userland" rants; if this is an issue, use a better-managed OS/kernel.
size_t is the same size as the memory address size - 64 bit or 32 bit; It's also the type which the sizeof() macro returns. There is an equivalent signed version, ssize_t which is for functions which want to return a size in bytes or a negative error value
I work with DSPs and microcontrollers. I intentionally put a buffer overflow in the serial console driver to allow for firmware upgrades because I didn't have time to write a proper bootloader before release date. The shellcode both installs the bootloader and fixes the buffer overflow. I hid it from my boss by having the DMA driver, not the CPU, perform the attack, so static code analysis wouldn't catch it.
- "Hey, how did you upgrade that without a bootloader?"
(Proceeds to show him a very short assembly program that reads from the console FIFO and dumps it into flash)
- "That wasn't in your code"
- "Nope, but look, it calls the DMA driver to read until line idle without any bounds check, then we just overwrite the vector table mapping to point the reset vector to the shellcode and send the CPU reset command". For those who don't know, JEDEC compliant flash chips wrap around to address 0 when reading past the end of memory. All you need is some way to mask off the address bits that aren't used, which looks like a safety check when writing since it prevents writing past the end of memory and throwing error codes.
"Nothing better than C" - Linus Torvalds
This quote and how he is saying it, always pops up in my head. But I think the quote is taken out of context and Linus actually means: "Nothing better than C for Linux Kernel."
@@thingsiplay That is because 90% of people who are quoting it are getting the quote from the same video that's been recommended to everyone and their grandma at this point. I don't even think most people actually watch the video. They just see the title and it sticks.
@@soulextracter it's so annoying along with that atrocious quote by Stroustrup about people complaining about languages (I'm team C)
@@rusi6219 whats the quote?
@@poggarzz something about languages that aren't complained about aren't in use
the problem with regular C is that it is not approved directly by god, unlike holyC which is provided to us straight by divine intelligence
Just a grumpy old embedded engineer here. The question is what your job is. If your job is to make software that must work on a CPU that has less than 256 pins or operating voltage more than 1.5V, you are most probaly left with only C/C++ toolchains. Assuming you want to avoid assembly. The hardware mentioned above is basically everything your world is operated by. Want something working above 80 degC? Something that should work on a wide voltage range? If one does not want to learn new assembly for each of the targets at hand, C is the language of the cheap hardware. To my surprise, the most vocal people against C is the ones who use joe, vi, emacs or some other -vintage- exotic editors.
In the 80's I worked on spell checkers for typewriters and handheld devices. Even when a C compiler existed for the obscure CPUs being used, it was generally buggy and failed to even meet the loose "standard" provided by the original K&R book.
It sounds like it is a little bit better now, but still a big pain.
I thought that's what Forth was for.
How long is an unsigned shlong?
69 inches
a bit longer shlong
Same size as a signed shlong.
Anyone who hasn't signed their shlong is untrustworthy.
I remember arguing with a professor's assistant decades ago about the notion of wobbly definitions. I felt, and still do to this day, that programmers need to know how big a number they can store in a given type, and they need to be able to rely on that. He made a vague, hand waving argument about how they don't really, and how software can be more portable if wobbly definitions are used. More portable? The definitions for things *change* when you port, and that makes software *more* portable? That's not obviously fallacious? Either I was missing something that I'm still missing to this day, or that fellow was just kind of ideologically captured.
That reminds me of an argument I had with a high school calculus teacher about a theoretical baseball out in space slowing down on it's own with no external forces acting on it. I said if there's no friction or gravity or anything, won't it just go on in a straight line at the same speed forever? No, there would be some rate of change of it's speed, the teacher insisted. Sure there would, I said, but it would be zero. No, they insisted. It would be almost zero, but not quite zero. It would be an epsilon. (The closest thing to zero that's not zero.) Ok, so if it's non-zero, in what direction will it be? In the direction opposite the one it's moving in. Really? Moving relative to what? The ether? It seemed to me that the analogy he was making to try to explain what epsilon is was breaking down a bit. "Maybe we should be talking about a circumstance under which the formula describing the values we're tracking actually involves epsilon?" I got told by him and by class members not to argue with the teacher. I think I facepalmed at the time, and I still facepalm remembering it. Fortunately the calculus/physics teacher at the community college affirmed my understanding. I didn't tell my high school teacher. (One facepalm was sufficient.)
Hah! I'm sure many of us have had terrible arguments with teachers when we catch them out on their own lack of understanding. I agree that code usually needs to make some assumptions about the ranges of values you can store. It's not usually *dire* because `int` is often big enough and a smaller computer will naturally not be able to work on as large a data set, but you're not wrong.
That said, there is some truth to increased portability from "wobbly definitions," but it's more about the language itself. You see, C comes from a time when some major hardware platforms had a 36-bit word size and you couldn't count on a (hardware-supported) 8-bit value or maybe any power-of-2 type size. Two's complement wasn't universal either. If C had demanded exact bit sizes for each type, or two's complement, then no C for you! Computer hardware looks a lot different-and a lot more uniform-50+ years later, and it's easy to forget why certain things don't make as much sense anymore.
@@PassifloraCerulea That, sir, is a very good point! And, gosh, C is still used in all kinds of crazy embedded environments. I feel a bit stupid now. I suppose it does make sense to have a non-specific "int" type, and then have platform-dependent types like "int8" and so on.
@@PassifloraCerulea Yeah, such teacher things come in my mind a lot. Even though most of my teachers were really good, I seem to remember most vividly the bad ones because I felt strong emotions. I wish it was the other way around.
I think the wobbly definitions make the code portable in really trivial cases . Like a for loop that doesn't go past 255 or something .
I always use stdint.h for uint32_t . I also never use enums because it makes me uncomfortable going "they are probably 32bit ints" .
Probably ? That's going to be bad news if I want to bit pack those enums .
yeah the non-zero friction is actually due to virtual particles popping in an out of existence. It's a very weird phenomenon that's only recently been discovered/confirmed/whatever
Programming is like a lake, a very old untouched lake with no fish in it. The surface is all nice and clear but deep down there's mud and decaying plant matter. I like diving deep in there cuz I like rolling around in that mud but it sure feels icky every time lol.
No more netflix = no more hoodie?
The hoodie was company property
we need to give him a new hoodie
30:50: How are you going to have int32_t on a 9- or 7-bit machine? And what are you going to do when you a size, a memory address expressed as an integer, or a difference between pointers? Sure, you don't need short, int, long, and long long, anymore as you have {u,}int_{fast,least}N_t nowadays, but there still are integer types where you don't want a number.
Isn't __int128 a compiler extension? So different implementation in different compilers is expected. And you don't need glibc to create file you can make syscall directly
Sure, you can make a syscall directly... if you know what ABI the kernel that you are running on speaks. And even the same kernel running on the same CPU can speak different ABIs.
Ok... this observation doesn't make it any easier for languages that aren't C to be able to call functions defined in C header files that contain the text "__int128" in them
Also, calling syscalls directly is not supported by any OS except linux IIUC. For example windows requires that you link to KERNEL32.DLL and call functions there instead of syscalls, and the API for it is defined in windows.h which I'm sure contains every possible horror you can find in C.
5:03 I would argue that's partially untruth. You could just do a wrapper around single every system call (this would require writing assembly) and you would be good to go. The major question is: are you willing to do that?
Her core thesis is that writing a standard library for systems is impossible because you HAVE to either use libc or write assembly, and "of course" writing assembly is impossible, therefore it cannot be done without C.
Everyone gets so hung up on the syscall example and completely misses the FFI thing, which is not about syscalls only, but also about how you make different programming languages *of any kind, in any situation* interact with each other. I don't know about you, but when I'm programming in CL and trying to get a compute shader to run, I'm not gonna try and figure out how to do it in Assembly, especially if I want what I'm doing to be portable.
C is love, c is life and also goddamn nice
size_t is used in std c/++ because it's the return type of sizeof. The reason they do *that* is because anytime you're working with arrays, indices, byte sizes, etc. are all times where that value can be potentially converted to/from a pointer or used as a pointer offset. That means it needs to be able to hold any value a valid pointer could so on 64bit machines it needs to be a 64 bit int, on 32 bit it needs to be 32 bits, etc. Thus size_t avoids all of the nonsense of int vs long vs long long, etc and avoids the hardcoded sizing of int32_t and just says "whatever the size of your pointer is the size of the integer you should use for indexing, here you go". C++ mucks this up a bit because it has extra large pointers in some cases due to multiple inheritance, v-tables, etc.
I don’t C any problems here
The problem seems to be that versioning and compatibility aren't represented explicitly in C definitions, but are implied from the field order of structs or type alignment. For a universal base ABI, you'd really want something like protobuf, which has explicit field numbers, a first-class concept of "unknown fields," and a defined, platform-independent serialization. All of the clever workarounds that MS does in the minidump example should really be part of the interface specification language. Also, we got ourselves into trouble by using the same type for "an integer (of some size)," "a pointer," "an offset," "an address," and "an integer that is fast for this CPU."
Obviously, it's about 40 years too late to do anything about any of this, but we can shake our fists at the sky.
"Rust make people angry that things don't looks like Rust"
After the last 3 years coding only in swift I can relate.
8:56 While the real sizes of those depend on the implementation of the data type, the minimum size for a short is 16 bits, 32 for a long and 64 for a long long. A short long would be the same as a short and a short short (half of a short) would be the same as a char.
Not on all systems and not by the standard.
Tower of Babel. I hope I still speak Python after the collapse.
9:00 When I was a programming student, I had a friend I'd correspond with by email every now and then exchanging homework assistance, and I'd just gotten introduced to C (via GCC) coming from Java experience in high school, and I personally maintain that the funniest exchange she and I had ever had was when I was messing around with the type system in C, learning about the sizes of different types, and managed to get back the error message "'long long long' is too long for GCC". Like "oh, 'long long', that's fine, but 'long long LONG'??? That's going too far buddy!" 🤣
Yea, int was supposed to be 16 bits on 16 bit CPUs, 32 bits on 32 bit CPUs and 64 bits on 64 bit CPUs, but they kept it at 32 bit for x86_64 because "changing it would break things" ( namely badly written software ). I have worked on a CPU where int was 24 bits and long was 32. Go figure.
Don't forget 80 bit floats!
I'm not that deep, but if the authors problem is not being sure if int is 32 bits or 64 bits, why not use malloc and pointer?
@@phoneywheeze That doesn't really have anything to do with it.
@@phoneywheeze That's a completely unrelated concept. Even if you did malloc your ints, you still need to pull them into registers to actually do any math with them, and at that point you need to know how many bits the CPU will add together when you ask C to grab two ints and add them to know if your program will break in the event that, say, the user asks to make a table with more than 65535 rows.
You can call these functions including the binaries of the required C libraries into your compiled code by calling them in assembly or machine code. But use them as your programming lingua franca is worse than use C.
I've had to deal with this at work. We have literal meetings over ABI/API compatibility and together design a header and review it before it ships. (I'm just a junior dev tho)
Thankfully, there are standard methods to do this well and maintain compatibility, but it comes with extreme cost. When a new feature is at potential odds with the old one, and there is a change that old and new types can mix, then oh boy.
For example, what do you do when multiple apps load your shared object. Shared objects are typically loaded only once into memory but mapped to different virtual addresses b/w apps. The object file has both the versioned APIs - v1 and v2. One uses v1, the other uses v2. But both the apps talk to each other. You'd think everything would "compute" in v1 but the second app does not "know" about the first app. Remember that struct members are just offsets from a pointer so if you get a v2 object from the second app but was created by v1, in your library you're reading out of bounds in your library ... but it's the fault of the second app! (It should've set the version to 1 instead of 2).
I dreamed up this scenario, but I wouldn't be surprised if it happened at some point in time. Hey hey what if apps exported their ABI info as json :D along with the binary. Then we can just use that to interop. This would also allow languages to change layouts (as in consecutive members don't need to be in the same order in the binary as they are declared)
This is just another case of the diamond dependency problem seen with objects, and should be solved the same way. Make the base abi definitions include semantic versioning info, and resolve it at link time or load time. If it cannot be resolved at that time, don't let the code run.
C has always been defined as a source portable language, not an abi portable language, but some system and low level library developers don't help.
Seeing people argue about tools to get job done and disregarding where this tool shine is beyond me
We should make whatever ABI HolyC uses the FFI standard instead.
I second this. Down with the C ABI. Praise be to HolyC.
I remember thinking about this when I worked mostly on systems programming. I think it's more a problem with operating systems than it is C. An OS could have a more abstract Interface Definition Language. I think fuschia does for instance. As with most things in operating system design it's terrible to try and theorize about. The most shocking thing I learned as systems programmer was that Microsoft is occasionally better than the linux developers. Their backwards compatibility with ABIs is better imo and the ability to link shared libraries without needing the full .so to compile is clearly better afaict. I don't know how glibc can exist like this in the year of our lord 2024 I must be missing something.
9:10 its actually compiler dependant for example TurboC has 16 bit ints iirc
at a time it was compiler dependant, but nowadays they shift to be platform dependant, but there's a reference definition for the minimum size of each integer, and the minimum size of a int is a 16bit integer
Now I feel old and way under knowledge. This debate of frustration takes me back to nearly the same debate in '71 (when my friends were introduced to BASIC) and again mid 80's (when MS published 7 volumes of Windows libraries. I am sure this was debated before my time as well. Each time the debate split folks into a couple of camps. Both sides said F-it I roll my own and those that dug into figuring out the next level.
C is turing complete. And therefore it can run any (computational) program, or any algorithm you can come up with.
Incidentally, CSS is also turing complete. CSS is actually a programming language.
I thought it was obvious, since CSS stands for CpluSpluS. (right?)
As someone who's studied enough C++ to know how it differs from C, the moment I heard Rust and Swift use C to talk to each other, I knew somebody, somewhere had F----d up big time and we're all paying for it now... Good to know my hunch was correct!
I've read their blog before/know who the author is: They have written the Rustonomicon, _the_ thing to read about unsafe rust.
The "sounds like skill issues" by the Primagen at 6:20 and the chat at 6:23 "WTF? Even ChatGPT would do better than this article" made me once again realize that the Chat and often times The Primagen too have absolutely no idea about some part of programming.
She also did a ton of work getting Rust to have functional FFI with many different C compilers. She is absolutely the person to defer to on this kind of stuff.
Does that not confirm the so called "skill issue" then? Because She's an expert with unsafe Rust not a C expert. Nevertheless, that is still experience she got for unsafe Rust interfacing with ABIs. Judging by the other comments here, it seems the ABIs goes deeper than just C
@@JArielALamus for one thing, I'd wager she's more of a "C expert" than most good C programmers are, since compiler people generally need to know a lot more specifics about languages. For another, I assume "skill issue" was a joke response to the "you can't actually parse C" claim.
@@WHYUNODYLAN I don't know about that. I have seen the responses to this article and for the looks of it, she is indeed talking outside her area of expertise. I'm not an expert either, so I can't say for sure, I can only go by the information I'm seeing.
Also, by those same responses, it seems working with unsafe Rust is a really challenging task.
I'll take a deeper look into all of this, I may learn new things. Anyways, thanks for pointing out about the Rustonomicon author
@@JArielALamus No, she is in fact an expert on this exact topic.
Each long should increase the number of bits by a random amount, and not by a multiple of eight, in fact, only by prime number amounts.
C is the means by which all is revealed.
the fall is coming
C-ing is believing.
Fushcia is the only modern OS that doesn't rely on C lib. It uses it as shim for programs that rely on c lib but it's not required. It uses language agnostic Fuchsia Interface Definition Language (FIDL) to describe binary interface.
C is not a language, it's something more!
IT'S AN EXPERIENCE!
@@juanmacias5922 IT'S A JOURNEY!
C is almost a real programming language. Assembly is the real programming language, except that assembly language is a collection of hundreds of wildly different languages. Z80 assembly is very different from PIC16 or DSP56k or S08 or S12X, ARM, IBM 370, RGP30… all different. That’s the source of the type/size issues in C. I’m lucky enough to program embedded systems but even then, C isn’t the same between different processor platforms.
I really dislike articles which have a new paragraph for every single sentence. They probably do it, so the article looks longer.
no, it's for emphasis
Reddit spacing
@@卛 For people who can't read?
@@thingsiplay are you implying anything?
this style of writing is popular with young people. it is styled differently than older folks might be used to, but this is right up my alley as a twenty-something years old.
one sentence in a paragraph for the purpose of emphasis is quite common for our generation
@@卛 It doesn't matter what generation it is, I dislike the way how the article is structured.
This article sounds like "My problem with C are the other languages"
Can't we all just get a long?
as a C and C++ embedded SoftEng I'm in the unique position to understand size_t as an important type.
you see, when making cross platform code that handles memory you need an (usually unsigned) integer type that is guaranteed to line up with pointer types within the system.
this is done to reduce or avoid conversions when performing pointer arithmetic or storing pointers as address values(see how C handles ++ operator or addition with any type pointers, and why)
when seeing how those things operate on a basic level you get to understand why those inconsistencies exist, and why for anyone above system/embedded those language types should be used either as a "I just need whatever type here" for things that are not cross platform, or not at all.
leave it for people like me to know which type to put in the "int64_t" typedef line as we are already making and dealing with stupid system defines on a daily basis and need those system specific types for many things.
lastly.
even where I work, no one wants to touch "long", it's a remnant of the transition out of 16bit to 32bit and was therefore 32bit originally. "long" in contrast to "short".
but what about int? who the fuck even knows at this point? those types were used by people back in the days where the use of typedefs was basically reserved for structs and unions, when token-pasting wasn't even a thing yet, and many more stupid limitations, they didn't know any better.
C MENTIONED LETS GO
Go mentioned!
ABI is not defined by C. C needs to comply with the ABI. Any language that compiles to byte code needs to comply with the ABI of the system.
I find it hilarious how much rust fanbois complain about C abis and apis, while rust in itself is an unstable, constantly breaking mess.
C is beautiful, anyone who thinks otherwise never got there through Fortran 77 and Basic before that.
C is the lightsaber of programming languages, if you cut your leg off with it then you should have stuck to blasters.
It is the best programming language ever, and I realized that after 15 years of programming in everything else and then revisiting C.
Welcome to the dark side of wizardry
My first job out of college was using mainly C on an HP-UX Unix machine in early 2000's. The app controlled time clocks, sent signals to machinery in the plant, etc. It had to be real-time. My boss and I would write some assembly to pull in because the C function wasn't fast enough. Great times!!! After my boss and I left that company around the same time I went to C# and never looked back. I still to other stuff SQL, Javascript, Go, etc... now at a robotics company I'm writing C++ for micro controllers. I still do C# / SQL there to communicate to the robots.
I think what programming language authors forget is that there are standard ABIs like the SysV and Windows ABIs (these are not C ABIs). If you use them, you could just link with any language that has a compiler for that specific ABI. No need to parse any C code, just mark the function as external and let the compiler emit code that conforms to that ABI and then let the linker link stuff together. It's a widely known functioning interface that language authors somehow forgot how to use. Yes, I get that people don't want to deal with all of the object files but how else are you gonna glue two languages together.
With the exception that the article stated that two compilers (gcc and clang) disagree on the interpretation of a part of the SysV ABI.
There was a mention of the SysV ABI being problematic too.
@@kuhluhOG IIRC I read through the AMD64 SysV abi when I read this article a while back. 128 bit integers aren't really defined in the ABI so each compiler is free to do whatever. The answer is that if you use weird types on function boundaries you are likely going to have problems, just push it to the stack.
@@kuhluhOG this is probably Cs fault and not the ABI, or the ABI is just badly written. But there's no use of changing it now
@@Exilum I don't think it's the bigger problem here. It's that languages are trying to parse C instead of linking with it. C *should* be like any other language and adhere to the ABI.
Tsoding Has done a lot of work with C FFI and stuff. It may be useful to take a look at his approach.
At the end of the day C isn't too much complicated than ASM. Technically all those types are just bits and bytes on the assembly level.
int was supposed be 64 bit on 64 platforms. But so many people used well... very poor programming practices in the 32bit era and just assumed type sizes. Of course when you compile for a new target this breaks shit everywhere.
C didn't really have standard intX_t types back then. So I believe when you don't have a way to explicitly specify the size of an integer you start to assume sizes of int/long.
The best thing objective-c and swift have are protocols from the small talk message based roots. It lets you do soft inheritance where things don’t have to conform structurally like memory mapping stuff in c
Very interesting article but I really didn't like the tone. It felt very high-horse and dismissive even where it wasn't
look up the author. It will explain things 😂
The author is pretentious skynet
@@pavlinggeorgiev Aria Beingessner? I couldn't find anything strange with a basic google search.
Longs are not well-defined. That's the point. The type sizes change based upon the platform. It can be literally equal to an Int on some platforms. You must inquire using #ifdef kind of things to figure out what applies at compile time.
The problem is not that C has no ABI, but that NO ONE clearly defined an ABI and said, "This is what we're using." In fact, if you do that, you're pushed into the world of JiT compilers. There is an easier way to do it: publishing a standard for defining interfaces that involves Linkers and Loaders doing some last-minute fixups between different binaries at load time. You'd basically assume a fixed ABI across all machines, then have the Linker/Loader inject helper transforms to the local environment. It's different from JiT only in that, once done for a given machine, it could easily be cached for the next run-time and there is (theoretically) no overhead that isn't necessary for ANY ABI abstraction to work on pre-compiled software.
No one has written such a standard or modified a linker yet (that I'm aware).
The alternative for fully-compiled code is this: "C is the standard OS-writing language. They must have C. Let's use C as a glue to patch together our interfaces."
6:44 Nnnnope! I'm actually in the process of writing a C/header parser. Yeah, it's a b!tch, but you just have to put in the work. But then again, I'm old school; I'm quite used to burying my head in code for days/months/years on a single project.
You should check out the link given for "parsing C is basically impossible" in the article, which contains such beautiful test cases as:
typedef long T, U;
enum {V} (*f(T T, enum {U} y, int x[T+U]))(T t);
// The above declares a function f of type:
// (long, enum{U}, ptr(int)) -> ptr (long -> enum{V})
T x[(U)V+1]; // T and U again denote types; V remains visible
These are some absolutely hellish things which even existing compilers get wrong, and writing a correct parser from the pile of prose rules in the C standard is not at all clear, especially when there are mistakes in the standard as well (some pointed out in the same paper). The paper's title is "A Simple, Possibly Correct LR Parser for C11" - even after that in-depth investigation the authors still can't claim to have a correct parser for C11. If you think it's "just a matter of putting in the work" I think you don't understand what you are up against. Then again, if you are okay with being wrong in a bunch of edge cases then yes, it's comparatively simple, and nothing to apologize for either, even the big leagues are doing that because the actual C standard is bonkers.
Same. Whenever I am doing C stuff, I (almost) always use bit-defined types like uint64_t unless I can't such as interoping with spicy Win32 stuffs which is distressing frequently for me since a lot of the stuff I do C/C++ stuff for is game hooks (speed run tools, randomizers, etc.) where my code will be injected into the process and I'll be working with Win32 and DirectX APIs a lot of the time.
31:08 is size_t supposed to be unsigned and same size as memory address?
size_t is always unsigned and it's the type returned by the sizeof operator
I like the way things are, C is the only sane language, thank god it's everywhere.
Still a language, but also a protocol.
8:52 I call 64-bit integers "long" and Notch does so too as it seems, but in Python3's struct module a simple "long" is identical to an "int", a 32-bit integer. There you need a "long long". But I thought "long" meant "it's longer than the default"?
AI = Productivity goes up, quality goes down
^^This. (Not sure about the productivity, either).
@@3polygons "Productivity" in the sense of "Quantity". I just used that term in sarcasm manner as an alias for quantity.
@@thingsiplay I see.. :)
@@thingsiplay Does it, though?
Can't speak for programming but what I've found with art is that you end up trying and trying so much to make it work with prompting, you'd be better off having it done professionally by someone with experience in the first place.
@@Mayhzon That's because you are not satisfied with the many produced content. You produce a lot, meaning high quantity, with low quality. That's why you are looking for a professional. So my statement aligns with your statement as well. :-) voila
"You wanna run on everything or not kid?"
Hes got it backwards. C bends to arbitrary structure layouts and whatnot. He wants every other language to do that instead of C.
idk where it says that?
@@zeratax It's the thing with "long". What is a "long"? How many bits does it have? What's the alignment? C does everything it can do to not give an answer to that. It all depends on the OS-cpu combination. If you work on a programming language, you want things like that to be hammered in stone.
@@krux02 i just don’t see where exactly the article states that every other language should do this instead of c. all i see is the author describing why c is a protocol and not a language and how insanely complicated this problem is.
@@krux02but you don't really want such things to be hammered down when you are writing your own os or a compiler for your own platform.
She, not he
this article doesnt quite grasp that my day job is writing the same c code that runs on your 128-bit processor... but for a 16-bit AVR. And with slightly different typing, it is surprisingly portable
Hi, this is nice, how do you do that?
Confusing article that makes almost no sense at all, even if the author had the right ideas, clearly couldn't express them. This could be explained in less than 5 minutes by Casey :shrug:
No you are just stupid
C only got ”int” integers.
Want it longer? use ”long int”, still not enough? use ”long long int”. Need an other bit? ”unsinged long long int”. Changed your mind, actually didn’t need that something shorter than an int? No problem ”short int” to your rescue. If you want values of specific byte lenght, use the standard integer types from stdint.h, here is a few examples: int8_t, int16_t, uint8_t.
stdint.h is a pretty recent thing, and it was a real pain before it became widely available on most (still not all) of the platforms.
unsigned doesn't give extra bits, it just shifts the negative, to start with 0
The author is confusing C ABI, a protocol with C the programming language. Maybe its just clickbait but that part of the argument was nonsense
The issue is that C ABI the protocol is written in C the programming language, so you have to understand the latter to be able to parse interface definitions using the former. This is stupid and unnecessary and causes all of C's shortcomings as a language to impact the quality of interfaces that would otherwise have nothing whatsoever to do with C.
@digama0 wrong. The ABI is a protocol. Protocols ate not language dependent. You are as foolish as the article author. He did eventually state the real culprit which wasn't the ABI. It's that "major OS" all describe the interoperability to their API and communication between shared libraries using C headers which are hard to parse. Just like the article, you are whining and not properly constructing or emphasizing the real reason. So this article mainly scrambled the minds of noobs as the main point was glossed over while he blamed the wrong thing.
@digama0 it has literally nothing to do with C ABI but is a protocol. The article did no justice and glossed over that "major OS" and their APIs and shared library formats are the whole source of the issue. So what you said shows lack of understanding of the entire thing. Although I think a lot of people are blindly following that article which mixed things up completely
All in all it was a whiny article about OS and shared library interfaces using C header files which are hard to parse. And it was presented as a war against C ABI which has nothing to do with it. Quite lame
@@gregorymorse8423 "All in all it was a whiny article about OS and shared library interfaces using C header files which are hard to parse." Yes. This is dumb but it's also the reality of what people (like Aria!) have to deal with in order to interface with the rest of the world. How do you expect progress to be made if we don't speak out when things are bad? I assure you this is not coming from a lack of understanding, but rather from knowing just how messed up the whole situation is and yet how little can be done to improve it short of moving all the big players in the world away from "the C ABI". Sure, it's the fault of these big players for using C, but it's also C's fault for being a bad ABI. It never really planned to be one, so that's justifiable, but it is one nonetheless and as long as people ignore its shortcomings we won't get anything better.
There is a reason why types that don't defined a set bit width are standard for C and C++.
You can't guarantee that a particular platform will have a particular size.
If you have just i8 and i16, but you need to write code for a platform that doesn't have either of those, what do you do?
While this is EXTREMELY unlikely in a modern context, both C and C++ are obsessed with "backwards compatibility"
Something once compiled for an DSP56000 should still be compile-able for an DSP56000 .
I disagree with this stance, but i'm in the minority.
The TLDR of what I've seen here is that all the changes they described is similar to how some programming languages allow for function overloading, but they're doing it at the memory and symbol table level. This is almost identical to how 64-bit instruction set extended the 32-bit instruction set, but this time they'd be doing it for an entire language (C), except that doesn't really work when switching across operating system because C has data types (like long and long long) that don't translate well. What's even the fix to this?
that's the article in a nutshell but seems on the last note you didn't get it, there are no solutions, it just becomes your problem when you have to
@@itsmeagain1415 Thank you! That's what I thought as well but it seems I didn't word it clearly enough.
When Rust has an OS written in Rust they'll have a stable ABI, when they have two OSs written in Rust they'll realize they don't have a stable ABI anymore, especially if there's commercial competition between those OSs. (not even thinking of competing Rust compilers or instruction sets). That's the real history and origin of ABI issues. People blaming the language are misguided. You can look enviously at languages running in a single source sandbox wrapped in bubble wrap with training wheels on, but that's not the environment C or Rust want to run in.
In embedded we always use int32_t, uint8_t etc. Makes life much simpler, I can't stand that most codebase for PC is 99% ints, long longs and other stuff..
Using anything else makes 0 sense and is forbidden by every dev.
We do the same when writing embedded C++ code and we stay away from std:: and everything bloated from it.
because when you do embedded you know the architecture of the machine. PC codebases need to be able to run on different types of machines with minimal changes.
@@theheadpriest on PC you have 2-3 architectures.
In embedded you can reuse same codebase between dozens architectures and 8, 16, 32 bit processors with different endianess.
@@theheadpriestif you need a different data for your platform (eh memory address) then you should write a new program.
Back when I started programming, there was no common interface between operating systems. To create a compiler you wrote in assemble calling the system specific APIs. If you wanted to port to a new system, you started over from scratch.
If you don't like C you can still do this 😅. Of course many of the interfaces are C-compatible 😮. Still you aren't using C. You are just spending 1000 times the effort.
As far as standardization goes. Try using pre-ANSI C and make it work on hundreds of compilers on dozens of CPUs. Back then you could not even assume that the word size was a multiple of 8.
While trying to make the current situation better is admirable, this still sounds like a crybaby who doesn't know how good they have it.
This is why I never understood why we're so determined to even define something like intmax_t. Every type should just have its bit count attached to it and we should declare every variable we use with that scope instead of relying on the compiler to figure it out.
It made sense back in the day where some machines were 16bit others were 21 others 12 and such but we definitely need a ... not c but a &c (reference c) protocol that doesnt accoutn for now entirely defunct and never gonna happen again 21bit machines and such, but for that you'd need literally everyone in the world to agree to recompile every single thing to that new but correctly stable abi with defined sizes
by definition they are different, i agree that the whole int, short int, long int, long long int is a mess and it's arguably worse for portability, but the definition for the intmax_t and uintmax_t have nothing to do with precise width integers, it's by definition the largest int that the standard defines, this only problem is with c23, because the standard defines a 128 bit precise width integer, while also allowing intmax_t to not be equal to sizeof(int128_t), the standard is not technically finished yet, so this is not a problem, specially if you cite __int128_t, that it isn't part of the standard, and this is not a compiler thing anymore (in the old days the size of any integer type was compiler dependant), nowadays this is defined by the platform so you don't need to rely on compilers to know the width of each type
None of this even mentioned data field padding between unaligned types in structs even when you get the size right. That's a whole other thing and is compiler dependent and adjustable in compiler options. I once had to communicate between fortran and C and the same types with compatible sizes in data blocks (structs) were padded differently by default between the Fortran and the C compilers. That was ~30 years ago. This isn't a new problem, and even C had to try to be compatible with earlier languages on occasion.
Ok, to limit misconceptions - C standard officially do not support int128 as a type - it is a “modern” compiler extensions that are not guaranteed to have a stable ABI between compilers and topic author knowingly put this exact fact to test an ABI to make a discussion feel cringey.
This is a problem of modern C/C++ as stated that many types are required to be updated to 128 bit versions (like intmax/size) if wider integers are introduced as part of standard it will not be an ABI break as there is no guarantee about width of this types - but all code will suddenly be broken as everything works on them.
BTW: this is the reason why we have 128 bit (and more) capable processors but no company risk to advance further as this will break C/C++.
c23 is supposed to have int128_t, however they will allow you to have intmax_t not capable of holding a int128_t value, but yeah , the issue in the article is not a problem
Wait... does this mean that any C-program written to take advantage of AVX-512 uses something like int128 instead of something like int512? (side-note, I see here that Unreal Engine actually has something called int512 as a type definition) But what about some of the newer existing/planned RISC-V vector-extended CPU's? The spec' goes all the way up to 4096 bits wide instructions! :O Is everyone just gonna' leave that on the table? Apologies if this post is silly, since I'm quite the beginner.
@@predabot__6778 in this case it does not have a stable ABI, so the programs can't have interoperability, each program has it's own definition that only suits their needs, in fact even the System V ABI for i128 in x86_64 is completely optional
@@predabot__6778 I don't think C itself knows about SIMD. It is the compiler that may for example unroll a loop to a SIMD call. An extension may also be created to explicitly call SIMD or you can use assembly.
8:53 Truth. For years I had longs defined in the dynamic types on my application platform/OS. I just never thought much about them. The system is adaptive, so the size isn't an issue. However, longs always require 'deviant' code since there is no agreed-upon size for them. So, "Removal of longs" is on my TODO list for my platform. No big loss.
Terrible: that’s the long and short of it
It's not that C isn't a language. It's the only language. All these others are just newer C compilers
*code generator