Ah shoot, and P.S.: so if at some point some guy was flipping switches to program the first computer, how do you do that with things like quantum computers?
Hi, this is the author. Thank you both for reading the article and for responding with this information. I have been a long-time lurker at HN, and when I saw the traffic numbers on my blog I wondered if someone had linked to the post here. It has been gratifying, I admit, that at least a few people have taken an interest in my weird little musings.
Anyway. If I understand you correctly: at the bottom of it all, there is a thing called an assembler, which is a kind of compiler, that makes the code existing "above" the assembler understandable to the actual hardware.
And I'm guessing there's a reason we don't just have people coding in assembly all the time? Probably it is a huge pain to code at the assembly level? If so -- could you redesign the hardware so it is easier just to code directly, I guess, "on" the hardware? Then you wouldn't have to worry about all these languages talking to each other. Plus, I bet the computer would run faster since you don't have to have it doing as much stuff, right?
Potentially dumb question: for computers that are not electric, like mechanical computers or quantum computers, do you essentially have to make a whole new assembly language?
It's hard to express some of this in layman's terms, but it's good practice for me because I plan to write a book about exactly this.
Generally what a CPU does is: fetch an instruction (represented by a number) from memory, run it, then repeat with the next instruction. (A little hand-waving here: modern CPUs can also rewrite the instructions they're given into equivalent-but-faster ones in pursuit of speed, but that's the general idea.) So, if you wanted, you could write the exact numbers representing the instructions you want to run into memory and let the CPU run them. But this would be very hard to do, so there are tools:
- Instead of writing stuff into memory, you write it into a file, and the computer has an operating system that knows how to read files and put them into memory.
- Instead of writing the numbers, you can write little text descriptions of the instructions, like "add" instead of the number "4," and the assembler handles the translation (as well as some other bookkeeping) for you.
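To make that fetch-run-repeat loop concrete, here it is sketched in a few lines of Python. Everything in it is invented for illustration (the opcode numbers, the single-accumulator design), not any real chip's instruction set:

```python
# A toy fetch-decode-execute loop. The "machine code" is just a list
# of numbers; the opcodes (1 = load, 4 = add, 0 = halt) are made up.
memory = [1, 10,   # load 10  -> accumulator = 10
          4, 32,   # add 32   -> accumulator = 42
          0]       # halt

acc = 0   # the accumulator register
pc = 0    # the program counter: where to fetch the next instruction

while memory[pc] != 0:           # 0 = halt
    opcode = memory[pc]          # fetch
    if opcode == 1:              # decode + execute: load
        acc = memory[pc + 1]
    elif opcode == 4:            # decode + execute: add
        acc += memory[pc + 1]
    pc += 2                      # step to the next instruction

print(acc)  # prints 42
```

This is exactly the "write the exact numbers into memory" scenario: the program is nothing but numbers, and what each number means is whatever the machine's designer decided.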
So, why not write in assembly all the time? Because it's a lot of work: the instructions available on a CPU are very low level, mostly moving numbers around in memory and doing basic arithmetic. It's much easier to work at a higher level of abstraction, like "take this list of people and sort it by last name," than at the level of "add 3 to this number and store the result at memory location 47." Providing that higher level of abstraction is what a compiler does: you write in a higher-level language, and it generates code in another language (like assembly) that does the same thing. In practice, only things that need to run very fast, or things where no other tools are available, are written in assembly.
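For a concrete sense of that gap, here's the "sort this list of people by last name" job in a high-level language (Python, in this case); the machinery underneath expands it into an enormous number of those tiny load/compare/store steps so you don't have to:

```python
# High-level: say *what* you want, not how to shuffle bytes around.
people = [("Grace", "Hopper"), ("Ada", "Lovelace"), ("Charles", "Babbage")]
by_last_name = sorted(people, key=lambda person: person[1])

print(by_last_name)
# [('Charles', 'Babbage'), ('Grace', 'Hopper'), ('Ada', 'Lovelace')]
# Underneath, the machine still runs the low-level version:
# compare these two values, move this one over there, repeat.
```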
Why not make assembly higher level, then? Complicated question. One: some chips do, to a degree; there are chips designed to make hand-coding for them easier. But most don't, because it ends up being both faster and easier not to: you can use less hardware if you support fewer and simpler instructions, and if you tailor those instructions toward making compilers easy to write for them, compilation ends up faster and simpler too. As for the computer being faster because it's doing less: it's actually doing more!
Remember that the compiler step usually happens before you run the program: someone writes source code, feeds it to a compiler or assembler, and gets out the numbers the computer can run. The source is no longer needed at that point; you can hand the output to someone else and they can run it directly. That means we care a lot more about the time it takes to run a program than the time it takes to compile it. A chip with higher-level instructions makes the opposite tradeoff: a more complex instruction takes longer for the hardware to run, and you pay that cost every time the program runs, not once at compile time.
But wait, you ask: you'd also need fewer instructions, so your program would be shorter! Well, maybe you'd win that tradeoff, maybe not; it depends on what you're doing. This is part of why there are lots of different chips: different applications want different instructions in hardware. There are lots of benchmarks measuring how quickly chips solve particular problems, and sometimes chip A is better at benchmark X than chip B while B beats A at benchmark Y: differences in instruction set mean that one machine's assembly is faster than another's at some things.
Non-electric computers: in general, every machine has its own specific assembly language. The AMD Ryzen I'm typing this on runs different instructions than my Apple M1 laptop, which in turn runs different instructions than the Espressif ESP32-based game system on my desk. The same goes for non-electric computers: in any stored-program computer, the format in which the program is stored is arbitrary and up to the designer of that computer, and presumably a quantum computer or a mechanical one would have its own instructions. I can't really speak for quantum computers (are there any yet?), but I've been designing a toy CPU for fun, and here's how I did it:
- Figure out which instructions I wanted, and how I wanted to represent them as numbers
- Write a program that runs on my normal computer that will translate mnemonics ("add", "store") to those numbers (the assembler)
- Write another program that runs on my computer that will pretend to be the new one, and run the new one's instructions (a virtual machine)
- Write some programs in the new machine's assembly
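Compressed down a lot, steps two through four look something like this. (The instruction set here is a three-op stand-in I'm making up for the comment, not my actual toy CPU's.)

```python
# A toy toolchain in miniature: an assembler plus a virtual machine.
# Made-up instruction set: "load N", "add N", "halt".
OPCODES = {"load": 1, "add": 4, "halt": 0}

def assemble(source):
    """Step 2: translate mnemonics into the machine's numbers."""
    program = []
    for line in source.strip().splitlines():
        parts = line.split()
        program.append(OPCODES[parts[0]])            # the opcode
        program.extend(int(p) for p in parts[1:])    # its operand(s)
    return program

def run(program):
    """Step 3: pretend to be the new machine and run its numbers."""
    acc, pc = 0, 0
    while program[pc] != OPCODES["halt"]:
        if program[pc] == OPCODES["load"]:
            acc = program[pc + 1]
        elif program[pc] == OPCODES["add"]:
            acc += program[pc + 1]
        pc += 2
    return acc

# Step 4: a program written in the new machine's assembly.
source = """
load 40
add 2
halt
"""

machine_code = assemble(source)   # [1, 40, 4, 2, 0]
result = run(machine_code)
print(result)                     # prints 42
```

A real assembler and VM do a lot more (labels, jumps, registers, I/O), but this is the shape of the thing.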
If you add a step after that for "make a piece of hardware that will be the new CPU," then that's pretty much how a new computer is built. I wouldn't need to physically use switches, but only because I'd write things to a ROM chip using my normal computer instead. If I really had to, I could do it with switches, and that was common for homebuilt hobbyist computers of the '70s and '80s.
Edit, probably worth noting: "in general, every machine has a specific assembly language for that machine" is becoming less and less true in practice. It's now common for several different chips to have either entirely identical instructions or large commonalities in instruction set. ARM computers (which covers everything from an Apple M1 / M2 to smartphones to Raspberry Pis to small microcontrollers) share an instruction set; there are tiers of ARM chips (an M1 will do things an STM32 microcontroller won't), but it's mostly the same assembly language, and that's part of the value of the architecture. Same with x86 chips: the 80386 supported all the 80286 instructions plus some extra; the 80486 supported the 80386's plus some extra; and so on, right up to the current-generation i7 / i9 / whatever. And the AMD ones, Ryzen / Threadripper / etc., support the same instructions as the Intel ones. The reason why is obvious: Intel wants to sell you a new chip, and that's easier if you can still run all the programs you ran on your old chip; AMD wants to sell you their chips, and that's easier if they can run all the programs you currently use on your Intel chip.