Hacker News
A Philosophical Introduction to Language Models (arxiv.org)
83 points by sebg on Jan 11, 2024 | 24 comments


This is the "Anti Stochastic Parrot" paper. I am happy to finally read a philosophical paper on LLMs.

I just have a small quibble.

> While LLMs show promise in various forms of task generalization, their participation in the ratcheting process of cultural learning thus appears contingent on further advancements in these areas, which might lie beyond the reach of current architectures.

I don't think cultural transmission with LLMs is that incipient.

Evolution through Large Models - https://arxiv.org/abs/2206.08896

We have seen papers where LLMs are used as "search heads" or "mutation operators" in evolutionary methods. They can rely on their language proficiency to explore the most promising leads, spanning a vast combinatorial space.
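To make the "mutation operator" idea concrete, here is a minimal sketch of an evolutionary loop where the mutation step is a pluggable function. Everything in it is an assumption for illustration: `llm_mutate` is a hypothetical stand-in that makes a blind random edit where a real system would prompt a language model, and the string-matching fitness function is a toy, not the setup from the linked paper.

```python
import random
import string

TARGET = "cultural ratchet"  # toy objective (assumption, not from the paper)
ALPHABET = string.ascii_lowercase + " "


def fitness(candidate: str) -> int:
    """Count positions where the candidate matches the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def llm_mutate(parent: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM call: replace one random character.

    A real system would prompt a model with the parent and ask for a
    plausible variation instead of a blind random edit.
    """
    i = rng.randrange(len(parent))
    return parent[:i] + rng.choice(ALPHABET) + parent[i + 1:]


def evolve(generations: int = 2000, pop_size: int = 20, seed: int = 0) -> str:
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in TARGET) for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Propose children with the mutation operator, then keep the fittest.
        children = [llm_mutate(rng.choice(population), rng) for _ in range(pop_size)]
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
        if fitness(population[0]) == len(TARGET):
            break
    return population[0]


best = evolve()
print(best, fitness(best))
```

The point of the design is that only `llm_mutate` would change if a model were plugged in: selection stays dumb, and the model's language proficiency is what steers the proposals toward promising regions of the search space.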

And another example of "cultural transmission": GPT-4 has its paws all over Mistral and LLaMA finetunes. Plenty of cultural transmission between big and small LLMs. Not to mention Phi, a model trained purely on synthetic GPT-4 data. Synthetic data has been estimated to be 5x more efficient than human data: you can make a model 1/5 the size with similar performance.


> Synthetic data has been estimated to be 5x more efficient than human data: you can make a model 1/5 the size with similar performance.

Interesting. Can you cite anything to back up that claim?


Yes.

> We follow the “Textbooks Are All You Need” approach, focusing this time on common sense reasoning in natural language, and create a new 1.3 billion parameter model named phi-1.5, with performance on natural language tasks comparable to models 5x larger, and surpassing most non-frontier LLMs on more complex reasoning tasks such as grade-school mathematics and basic coding.

https://arxiv.org/pdf/2309.05463.pdf


Thanks!


See also Self-Play Fine-Tuning (SPIN) https://arxiv.org/abs/2401.01335, which (per the authors) tunes as well as Direct Preference Optimization (DPO).


I was interested to see a paper from a philosophical perspective but it seems this one fails to grapple with some of the more foundational critiques, primarily around reference where it claims that externalism militates for a view where reference is purely situated. I read the cited paper (Mandelkern & Linzen) and the arguments seem very weak to me that an intelligent agent "refers" in the absence of any grounding whatsoever beyond relative semantics between text. What it seems to argue is that because we sometimes are able to refer without absolute grounding (due to lack of direct experience of the referent), then reference overall does not require the referrer to have any world-grounding whatsoever. To me that's a leap. I would like to see more working against phenomenology here, especially Brian Cantwell Smith's so-far-unanswered critique of machine learning as intelligence.

One of their other arguments, which falls under the umbrella of the "re-description fallacy", hinges on the observation that describing (e.g.) a piano as "hammers hitting strings" doesn't preclude it from more complex behavior like harmony, and so a simple description of LLMs as autocomplete, etc., doesn't preclude more advanced understanding. While it's true that any complex process can be inappropriately dismissed by simplifying its description, I think this too neatly sidesteps more complex critiques, which start from simple descriptions of LLM behavior and contrast them with complex behavior from sentient intelligence that cannot be implemented by those simple mechanisms.


Language model understanding could still be reconciled with internalism about meaning if we conceptualize text itself as a form of sensory input. Like sense data, text is ultimately caused by external reality. The language model "perceives" reality through text. Indeed, modern multimodal language models like Gemini treat text similarly to pictures or audio.

The conceptual difference between text and usual sensory input is that the causal story that produces the text is much more complex than the causal story that produces, say, photos. Photons bounce off objects and hit the retina or a CCD sensor, which is a very simple mechanism. For text, the mechanism involves intelligent humans writing it down, so the relation to reality is much more indirect. So inferring an internal world model from text is much harder than from pictures. Then again, language models are trained on a lot of text, much more than a human could ever read.


> their programming proficiency “favorably compares to the average software engineer’s ability”

I don't know how people can believe this, based on a paper, when many average software engineers are still employed.


It’s been less than a year since the release of GPT-4, which seems to have been what prompted that claim. Even if the claim is true, many employers might not know it or believe it yet. Also, it would take time for employers to adapt their workflows to replace human engineers with LLMs.

Anecdotal evidence from another field impacted by LLMs, translation, suggests that work is starting to dry up for many freelancers, especially those who work remotely. Translators who work within companies or other large organizations and whose work involves more human interaction seem to be holding on to their jobs, for the time being at least.


If work dries up for freelance translators, but not for freelance programmers, that's evidence about the relative ability of GPT-4 in both.


Not necessarily. I think translators who work in high-stakes scenarios, such as notarized translation of documents, will remain people, while others, like the ones from Duolingo, can be replaced more easily.


Full stack engineering begins with philosophy and ends with assembly. Here we go.


Or the other way round?


The main philosophical question in my mind is when we allowed people to redefine reality and when the post-truth society became the norm. There should be no debate over sentient software, yet here we are. This is the result of taking people seriously when we shouldn't.


> There should be no debate over sentient software

I honestly don't know if you are implying it's obvious software _can't_ be sentient, or it's obvious software _can_ be sentient. The fact that I can't tell which you mean proves that there is and should continue to be a debate over sentient machines.


And also the topic itself is hardly new. It has been around at least since Descartes, maybe longer.

The solution is anything but obvious. The closely related hard problem of consciousness is not called hard for nothing.


yeah it's called hard to introduce backdoor dualism in modern philosophy

"hard" is the new way of saying "beyond physical", the new transcendent plane


David Chalmers coined the terms "hard problem" and "easy problems" in a 1994 talk at The Science of Consciousness conference held in Tucson, Arizona:

> There is not just one problem of consciousness. “Consciousness” is an ambiguous term, referring to many different phenomena. Each of these phenomena needs to be explained, but some are easier to explain than others. At the start, it is useful to divide the associated problems of consciousness into “hard” and “easy” problems.


The Enlightenment produced free speech and reasoning. Nietzsche said, "god is dead," but a lot of people said it before and after - because reasoning could not fill in the gap of a shared reality. Harari's Sapiens gives a good history; Hoffman's claim that "natural selection does not favor veridical perception" says you're pretty confused about what is actually going on wrt "truth"; Seth's "Being You" might help to understand what conscious beings are actually trying to do in relation to "truth" and survival.


“So long as man remains free he strives for nothing so incessantly and so painfully as to find someone to worship,” said Dostoevsky. In this case, it would appear that some people desperately want to worship software, and are assaulting society with their new religion. Let's stop at Nietzsche. Everything else is a waste of electrons.


Sorry if I'm straying from the topic, and I understand that we are technological beings and want to develop more and more.

But I keep thinking about how much we've lost control in all this. The fact that we need to spend rivers of money on energy and GPUs for LLMs, to automate our super boring daily tasks (which didn't even need to exist in the first place) says a lot about our dysfunctionality. I would trade all this paraphernalia we created for the freedom to have my small farm, grow my own food, and be happy with my family and friends, too bad that this is a very, very distant dream.


First, you can do this (if you have the appropriate skills). I know people who grew up off the grid, cabins built by the family, well water, outhouses, no electricity. There are whole Amish communities and communes if you want to still be around people. However, a lot of this nostalgia for a pre-technological time ignores the realities of a world without antibiotics, painkillers, modern dentistry, indoor plumbing, hot showers. It ignores infant mortality and women regularly dying in childbirth. In the modern world you can still choose to live off the grid while taking advantage of most of these things, but not if everyone does.


> for the freedom to have my small farm, grow my own food, and be happy with my family and friends, too bad that this is a very, very distant dream

Every now and then I bump into people on HN that have seen the light.

Well, let me tell you, it's all doable. The tiny bubble of people that live in a nonexistent alternative reality is just that, a tiny bubble.

I managed to escape all this nonsense, first mentally, then financially, and buy a little house where I can grow my own food - as a hobby, I have plant pots in my house growing tomatoes and spring onions, doesn't work but I am learning - and be happy with family and friends.

Naturally I have a very nice tech room with all the cool stuff. I am not rich, but I am free. Fun thing is you don't need to completely erase the paraphernalia. The two things are not mutually exclusive. All you need to do is have a clear mental separation of the two (tech bubble and reality) and compartmentalize the sane from the insane - an easy thing to achieve when you surround yourself with down-to-earth people that don't run around scared of an LLM and don't regurgitate what a marketing campaign programmed them to say. Those people are the matrix drones that don't want to be saved, and in their mind it's all or nothing - hermit or tech slave. Most normal people are watching in awe how the tech industry turns itself into the subject of ridicule.

To be fair, if you live in the US it's probably easier to achieve what you want. For the money I paid for a little box in the UK, I'd have been able to buy a mansion there.


it was all better when we were living in caves, didn't have most of our current problems; we lost our simple life too long ago



