Just wondering: do you think you got tinnitus, or was it already there and you suddenly started noticing it? I got it around 20 years ago, but I'm honestly unsure which it was, because it became worse and worse the more I focused on it. Eventually it subsided. I can still hear it if I listen for it (as I just did now, and I can hear a distinct 'bruising' kind of sound), but literally months go by between the times I even think of it or notice it. There have been studies showing that lots of 'normal' people notice tinnitus when they enter a sound-proof room.
What helped me was just taking long showers - I literally couldn't hear a thing during the shower and for some time after. And it seemed the 'drown out' period would last longer each time. Just knowing that something would stop it somehow made me ease into it more, and maybe reduced the fear that had been programmed into my brain. I also took omega 3 and ginkgo biloba (just low doses) and felt like it had some effect.
Was there any trigger and how 'loud' do you perceive it?
I've always had tinnitus, but it used to be that I could only hear it in absolute silence, but it was a medication that triggered mine to go from barely there to screaming banshees in my ears 24x7x365. It sucks to know that I will never truly experience silence again, but my brain does tune it out most of the time. But it's mostly noticeable at night. Mostly.
While I can see your point I also think it is not directly relevant to OP. Firstly, I don't think OP meant that people are idiots for using LLM's, it was just a way of saying that skill is no longer required so even idiots can do it whereas it used to be something that required high skill.
As for the comparisons - some are partly comparable to the current situation, but there are some differences as well. Sure, books and online content enabled others to join, thereby reducing the "moat" for those who built careers on esoteric knowledge. But it didn't make things _that_ easy - it still required years of invested time to become a good developer. Also, it happened very gradually, while the developer pie and the range of tech were both growing, so developers who kept on top of technology (like OP did) could still be valuable. Of course, no one knows fully how it will play out this time around; maybe the pie will get even bigger, maybe there's still room for lots of developers and the only difference is that the tedious work is done. Sure, then it is comparable. But let's be honest, this has a very real chance of being different (humans inventing AI surely is something special!) and could result in skill-sets collapsing in value in record time. And perhaps worse, without opening new doors. Sure, new types of jobs may appear, but they may be so different that they are essentially completely different careers. It is not like in the past, when you just needed to learn a new programming language.
I started at 16, 44M now, but also remember all that COM stuff, writing shell extensions for Windows 95 and stuff. And reading about it in the press (MSDN Magazine?). It was the new AI then ;)
I think you really hit the jackpot because you got a full career out of it, saw an amazing evolution etc. So you can hopefully enjoy the ride now being more as a spectator without the fear of being personally affected by job displacement. Enjoy the retirement!
Same boat (though 44M) - I don't think it has become less fun, on the contrary it can help with the stuff that was trivial but could still take time to get right. Now it can crank out that stuff often correctly on first try. Of course I have the same fear of job security as everyone else and it is sad to see something you were good at being taken over by machines, but it is not because I enjoy the work itself less, quite the contrary.
I didn't read/hear it as reducing human life to 'training energy', but I don't like the comparison at the technical level.
Firstly, the math isn't even close. A human being consumes maybe 15 MWh of food energy from years 0 to 20. Modern frontier models take on the order of 100,000 MWh to train. That's a difference on the order of 10,000x. Furthermore, the human is actively doing 'inference' (living, acting, producing) during those 20 years of training and is also doing lots of non-brain stuff.
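For anyone who wants to check the arithmetic, here is a rough sketch. The 2,000 kcal/day intake is my assumption (children eat less, so it slightly overestimates), and the 100,000 MWh training figure is just the order-of-magnitude number quoted above, not a measured value.

```python
# Back-of-the-envelope check of the food-energy vs. training-energy comparison.
# ASSUMPTIONS: 2,000 kcal/day average intake; 100,000 MWh per training run
# (the order-of-magnitude figure from the comment, not a measured value).

KWH_PER_KCAL = 4184 / 3.6e6      # 1 kcal = 4184 J, 1 kWh = 3.6e6 J
KCAL_PER_DAY = 2000              # assumed average intake
YEARS = 20
TRAINING_MWH = 100_000           # assumed frontier-model training energy

def human_food_energy_mwh(kcal_per_day: float, years: float) -> float:
    """Total food energy over the given number of years, in MWh."""
    days = years * 365.25
    return kcal_per_day * KWH_PER_KCAL * days / 1000  # kWh -> MWh

human_mwh = human_food_energy_mwh(KCAL_PER_DAY, YEARS)
ratio = TRAINING_MWH / human_mwh

print(f"human, 0-20 years: {human_mwh:.1f} MWh")
print(f"training / human:  {ratio:,.0f}x")
```

With these assumptions the human figure lands in the high teens of MWh and the ratio in the thousands, so "about four orders of magnitude" holds either way.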
Besides the energy math, it's comparing apples-to-oranges. A human brain doesn't start out as a blank slate; it has billions of years of evolutionary priors for language and spatial reasoning that LLMs have to teach themselves from scratch, so this could explain why a human can do some things cheaper. Also, the learning material available to a human is inherently created to be easily ingested by a human brain, whereas a blank LLM needs to build the capacity to process that data.
Altman seems to hint at a comparison to the whole human evolution, but that seems unfair in the other direction, because humans and human evolution had to make discoveries from scratch and trial and error whereas LLMs get to ingest the final "good stuff". But either way you slice it, it's just not a good comparison, though not an 'inhuman' or immoral one.
A US resident consumes 76 MWh per year [0], so 1.52 GWh over 20 years. A single model can be trained once and used by millions. Therefore LLMs are ~10000x more energy efficient than humans.
Your numbers cover total energy use, including transport etc. Sam's numbers were about what the human body itself uses for training, hence why I used the caloric consumption.
If such a simplistic explanation were true, LLM's would only be able to answer things that had been asked before, and where at least a 'fuzzy' textual question/answer match was available. This is clearly not the case. In practice you can prompt the LLM with so many constraints that the combinatorial explosion ensures no one has asked that before. And you will still get a relevant answer combining all of them. Think combinations of features in a software request - including making some module that fits into your existing system (for which you have provided source) along with a list of requested features. Or questions you form based on a number of life experiences and interests that combined are unique to you. You can switch programming language, human language, writing styles, levels as you wish and discuss it in super esoteric languages or Morse code. So are we to believe these answers appear just because there happened to be similar questions in the training data where a suitable answer followed? Even if for the sake of argument we accept this explanation by "proximity of question/answer", it is immediately clear that this would have to rely on extreme levels of abstraction and mixing and matching going on inside the LLM. And it is then this process whose workings we need to explain, whereas the textual proximity you invoke relies on it rather than explaining it.
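To put a rough number on the combinatorial explosion: if a prompt combines several independent choices, the number of distinct prompts is the product of the options per choice. The counts below are purely hypothetical, picked only to show how fast this grows.

```python
import math

def distinct_prompts(options_per_constraint):
    """Number of distinct prompts when each constraint is chosen independently."""
    return math.prod(options_per_constraint)

# HYPOTHETICAL example: 10 independent constraints (language, framework,
# feature A on/off, style, ...) with 20 options each.
n_prompts = distinct_prompts([20] * 10)
print(f"{n_prompts:,} distinct combinations")
```

Even with these modest made-up counts there are far more combinations than people on Earth, so most of them cannot have appeared verbatim in any training set.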
I think you're confusing OP for the people who claim that there is zero functional difference between an LLM and a search engine that just parrots stuff already in it. But they never made such a claim. Here, let me try: the simplest explanation for how next token estimation leads to a model that often produces true answers is that for most inputs, the most likely next token is true. Given their size and the way they're trained, LLMs obviously don't just ingest training data like a big archive, they contain something like an abstract representation of tokens and concepts. While not exactly like human knowledge, the network is large and deep enough that LLMs are capable of predicting true statements based on preceding text. This also enables them to answer questions not in their training dataset, although accuracy obviously suffers the further you deviate from known topics. The most likely next token to any question is the true answer, so they essentially ended up being trained to estimate truth.
I'm not saying this is bad or underwhelming, by the way. It's incredible how far people were able to push machine learning with just the knowledge we have now, and how they're still making progress. I'm just saying it's not magic. It's not something like an unsolved problem in mathematics.
No one ever made the claim it was magic, not even remotely. Regarding the rest of your commentary: a) The original claim was that LLM's were not understood and are a black box. b) Then someone claims that this is not true, and they know well how LLM's work, it is simply due to questions & answers being in close textual proximity in training data. c) I then claim this is a shallow explanation because you then need to invoke additionally a huge abstraction network - that is a black box, d) you seem to agree with this while at the same time saying I misrepresented "b" - which I don't think I did. They really claimed they understood it and only offered this textual proximity thing.
In general, every attempt at explaining LLM's that appeals to "[just] predicting next token" is thought-terminating and automatically invalid as an explanation. Why? Because it is confusing the objective function with the result. It adds exactly zero over saying "I know how a chess engine works, it just predicts the next move and has been trained to predict the next move" or "A talking human just predicts the next word, as it was trained to do". It says zero about how this is done internally in the model. You could have a physical black box predicting the next token, and inside you could have simple frequentist tables or you could have a human brain or you could have an LLM. In all cases you could say the box is predicting the next token and if any training was involved you could say it was trained to predict the next token.
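To make the black-box point concrete, here is a minimal sketch of the dumbest possible interior: a bigram frequency table. The function names and toy corpus are made up for illustration. From the outside it satisfies the same description ("trained to predict the next token, then predicts the next token"), which is exactly why that description explains nothing about what happens inside an LLM.

```python
from collections import Counter, defaultdict

def train_bigram_table(tokens):
    """'Training': count how often each token follows each other token."""
    table = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev][nxt] += 1
    return table

def predict_next(table, prev):
    """'Inference': return the most frequent follower, or None if unseen."""
    counts = table.get(prev)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat and the cat slept".split()
table = train_bigram_table(corpus)

print(predict_next(table, "the"))  # most frequent follower of "the"
print(predict_next(table, "dog"))  # never seen in training
```

Unlike an LLM, this box has nothing to say about any context it has not literally seen, yet "it predicts the next token" describes both equally well.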
I have a weather station that takes two 1.2 V cells. The LCD screen is a bit dim compared to when it's used with fresh 1.5 V alkalines. Other than that, most things take 1.2 V well. But they'd better, because alkalines reach 1.2 V with >50% capacity left.
At our high school we each had to buy a TI-83 calculator kit, and it came with one of those Rayovac alkaline chargers.
I also had a Saitek Eco charger that could charge "normal" alkalines. But you had to be careful not to discharge them too deeply. It seemed kind of pointless compared to rechargeables, though the capacity of NiCd/NiMH was way lower back then (I remember when NiMH AA batteries at 700 mAh were considered really high!). And perhaps in some devices it was great that they held 1.5 V.
About 15 years ago I was writing software for an embedded device made by another company, and they sent us a unit for testing. It had a small rectangular rechargeable lithium battery that was charged via a DC jack.
At one point I hadn’t kept it charged, the battery went completely flat, and after that it would no longer charge at all. When I called the company, they said the battery was now too deeply discharged and required an “intelligent” charger to revive it. They sent a charger with a slot for the bare battery; some LEDs blinked in various patterns for a while, and eventually normal charging resumed.
I’ve always wondered what that charger actually did, that the built-in charger was not capable of. Was it performing some kind of analysis to decide whether the battery was safe to recover (e.g. after deep discharge), or was it simply applying some initial charge ignoring the battery’s protection circuitry (and at what risk)?
Actually, the voltages had to be raised due to the shadow mask, and this rise in voltage meant you were now in x-ray territory, which wasn't the case before. The infamous problems with TV's emitting x-rays, and the associated recalls, involved the early color TV's. And it wasn't so much from the tube, but from the shunt regulators etc. in the power supply that were themselves vacuum tubes. If you removed the protection cans around those you would be exposed to strong radiation. Most of that went away when the TV's were transistorized so the high-voltage circuits didn't involve vacuum tubes.
Most of those old TVs were not Faraday Caged either, nor were they grounded to earth, so all that radiation and energy was one hardware failure away from seriously unfunny events. Their chassis grounding always gave a tingle to the touch.