Hacker News

The significance of GPT-3 is that the scaling isn't slowing down. With every increase in the number of parameters, the doubters say, "Oh, you'll hit diminishing returns," or "Oh, the curve will go sigmoid," but it hasn't happened.

If OpenAI develops GPT-4, with 1T parameters, I wouldn't be surprised to see a performance gain larger than the jump between GPT-2 and GPT-3.
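For what it's worth, the jump sizes can be sketched with the power-law fit from OpenAI's own "Scaling Laws for Neural Language Models" paper (Kaplan et al., 2020). The constants below come from that paper; the parameter counts are the published ones for GPT-2 and GPT-3 plus a hypothetical 1T model:

```python
# Power-law loss fit from Kaplan et al. (2020), "Scaling Laws for
# Neural Language Models": L(N) = (N_c / N) ** alpha_N.
# alpha_N and N_c are taken from that paper; treat the numbers as
# illustrative, not as measurements of GPT-3 itself.
ALPHA_N = 0.076    # power-law exponent for non-embedding parameters
N_C = 8.8e13       # normalizing constant, in parameters

def predicted_loss(n_params: float) -> float:
    """Test-set cross-entropy (nats/token) predicted from parameter count."""
    return (N_C / n_params) ** ALPHA_N

for name, n in [("GPT-2", 1.5e9), ("GPT-3", 1.75e11), ("1T model", 1e12)]:
    print(f"{name:>8}: {predicted_loss(n):.3f} nats/token")
```

By this fit, each 10x in parameters buys a smaller absolute loss reduction than the last, which is what the "diminishing returns" side points at; whether benchmark performance tracks loss linearly is a separate question.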

What GPT-3 shows us is that we're going to have ML systems that can write at the level of an average human pretty soon now.



I have a much lower opinion of the average human's writing capabilities. Much of what we see online has been written by people who either love to write or are journalists. I think GPT-3 is already at the average human's writing level.


I'm not sure what's being discussed here exactly. If we're talking about vocabulary, spelling and grammar, I agree with you. On the other hand, humans are able to express opinions and ideas and come up with novel things to say, not merely mimic an input.

If you gave me a huge corpus of Chinese texts and a very long time, I might be able to figure out which character goes with which, find the various structures in the text, and then generate a somewhat convincing made-up Chinese text while still not understanding a word of it.
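What's described here is essentially an n-gram (Markov chain) model; a toy bigram sketch, with an invented English corpus standing in for the Chinese texts:

```python
import random
from collections import defaultdict

# Toy bigram ("Markov chain") text model: record which word follows
# which, then sample. The corpus is invented; the point is that the
# model captures surface statistics without understanding a word.
corpus = "the cat sat on the mat the dog sat on the rug".split()

successors = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    successors[current].append(nxt)

random.seed(0)
word = "the"
output = [word]
for _ in range(8):
    # Fall back to a uniform draw if the word has no recorded successor.
    word = random.choice(successors[word] or corpus)
    output.append(word)
print(" ".join(output))
```

Every generated sentence is locally plausible and globally meaningless, which is the analogy being drawn.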

These GPT-3 demos are impressive because they look like real text with proper syntax and grammar, but they still express absolutely nothing. They read like a long ramble that goes nowhere. There's no intent behind it.

It reminds me of these videos of apes imitating humans by using their tools, banging hammers ineffectively. They are able to copy the appearance of the behavior, but not the reasoning behind it. They don't get why we bang hammers or what it achieves.


Have you read any business books? I used to read quite a few. For the most part, they take a central thesis and then repeat variations on the theme over and over again. Sometimes with anecdotes of questionable veracity. I venture that many of them could be generated with GPT-3.

My point is, GPT-3 is operating at human levels for certain contexts. I think it would get passing grades on essays in a lot of schools in the US, for instance, just based on syntax and grammar.


This stuff is so new that HN threads may be the first to mention realistic potential applications - congratulations, I think you just found one. Having GPT-3 render a first draft of books in the archetype you mention (one simple idea stretched out over many pages) seems like a very profitable endeavor.


> Having GPT-3 render a first draft of books in the archetype you mention (one simple idea stretched out over many pages).

Given what I've seen so far with GPT-3, that simple idea would have to have already been discussed at length on forums on the internet and in the corpus.

Usually books have facts and studies that they use as supporting points. Many of the connections they make between the subject material and their thesis are unique, and this forms their supporting argument. GPT-3 is rearranging words and sentences to resemble structures it's seen before, but it does not create novel facts.


So ideally it could work like a meta-study. Meta-studies combine results from multiple separate studies, making correlations and drawing more confident conclusions. Most 'original' human ideas are just reinventions of older ideas, too.
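For reference, the standard fixed-effect meta-analysis combines studies by inverse-variance weighting; a minimal sketch, with invented (effect, standard error) pairs:

```python
# Fixed-effect meta-analysis: pool study estimates with inverse-variance
# weights; the pooled variance is 1 / sum(weights). The three
# (effect, standard_error) pairs are invented for illustration.
studies = [(0.30, 0.10), (0.25, 0.08), (0.40, 0.15)]

weights = [1 / se ** 2 for _, se in studies]
pooled = sum(w * eff for (eff, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5
print(f"pooled effect: {pooled:.3f} +/- {pooled_se:.3f}")
```

Note the pooled standard error comes out smaller than any single study's, which is the "more confident conclusions" part.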

The interesting part is that GPT-3's leap in performance can be attributed to scaling. That's easier to do than inventing completely new approaches. Scale data, scale compute, scale money, then you have something you couldn't have invented directly.


A Chinese room can still be an interesting conversation participant


That's a good point, but I feel like there's still a long way to go before the model has enough data to actually output insightful content. Right now it seems to mostly output grammatically correct Lorem Ipsum.


I completely agree. People have a hugely inflated opinion of themselves. It's already better at writing articles than your average human.


>> The significance of GPT-3 is that the scaling isn't slowing down.

Can I ask: who says this? Is it your personal opinion? Is it the conclusion of the article above, as you understand it? Is it a commonly held opinion of some of the experts in language modelling?

I am asking because every time there is a claim like "X is important because Y" and someone points out that "Y" is not that interesting, if someone else then says "X is important because Z" and Z is not Y, it's very difficult to have a productive conversation, because it's very difficult to know what we are talking about. Of course, this is the internets and not scientific debate (typically carried out in peer-reviewed publications), but if the goalposts keep moving all the time, it's pointless to even try to have a conversation about the merits and flaws of such a complex system. That, with all due respect.

Now, regarding whether GPT-3's scaling is slowing down: it isn't, but it's not going very fast either. Like I say, the curve in the middle of the article that shows accuracy as a function of parameters is quite flat. Depending on how you want to define diminishing returns, the image painted by the accuracy plot is not that far from it, and in any case average accuracy is pretty disappointing.

>> What GPT-3 shows us is that we're going to have ML systems that can write at the level of an average human pretty soon now.

Like I say, there are no good metrics for this kind of task. We have no way to determine what writing "at the level of an average human" is (let alone what an "average human" is), except eyeballing output and expressing a subjective opinion. Anyone might claim that GPT-3 is already capable of writing "at the level of an average human". Anyone might claim that GPT-2 is. Or a Hidden Markov Model, or an n-gram model. Such claims really don't mean anything at all.

It is important to note that this is exactly the task that OpenAI has publicised the most with GPT-3: a poorly defined task with no good metrics. This insistence on promoting an ability that cannot be objectively evaluated as a strong point of the model is strong evidence that the model is not nearly as good as advertised.


But it is slowing down. In computer vision, we had MNIST solved and then waited through more than two decades of exponential growth in compute until ImageNet was solved. And 98% accuracy on ImageNet is nowhere near good enough for applications like self-driving cars. How many decades until we reach a 10^-6 error rate, keeping in mind that exponential growth in compute is over?
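The back-of-envelope arithmetic, under the (optimistic) assumption that error halves with each doubling of compute:

```python
import math

# Back-of-envelope: doublings of compute needed to go from ~2% error
# (roughly ImageNet-level) to a 1e-6 error rate, assuming error halves
# per doubling. Both numbers and the assumption are illustrative.
current_error = 0.02
target_error = 1e-6

doublings = math.log2(current_error / target_error)
years = doublings * 2   # one doubling every ~2 years, Moore's-law style
print(f"{doublings:.1f} doublings, ~{years:.0f} years at historical pace")
```

About 14 doublings, roughly three decades at the historical pace; without exponential compute growth, far longer.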


> Keeping in mind that exponential growth in compute is over.

DL is getting increasingly specialized hardware, there's plenty of growth there. Plus what we're seeing here is GPT scaling up without algorithmic changes. Algorithms are advancing too.


GPT-3 or GPT-4 can give us "convincing liars", but we still need to figure out how to combine them with actual factual databases and do quick fact-checking/validation/inference. GPT-3 is showing us convincing human-like style, but no real substance. It's a massive step forward in any case.

I might try to generate soft-science essays with GPT-3 at one of my universities to see if it passes through TA filters.


It's this, and the fact that few-/one-shot learning seems to just emerge with models of this size.



