$1 is an incredibly low price to pay for advertising and an incredibly high price to pay for legitimately interacting with a community. This would have the exact opposite of the intended effect.
The US Supreme Court declined to hear a challenge to the ruling that if you explicitly disclaim any human input to make a point, then the art isn't copyrightable in the US. The copyrightability of "actually I had some design input" is still up in the air in the US, and copyrightability in general is still up in the air in probably the entire rest of the world, as well as in every US court outside of the DC Circuit (because the Supreme Court declining to hear a case does not constitute an endorsement of the lower court's ruling or create precedent).
There's absolutely nothing stopping you from granting a license to public domain work... granting a license is just waiving rights that the author might have to sue for copyright infringement under certain circumstances...
Personally I'd be unwilling to use this work without the license, because I would not be confident that it was public domain.
I've got some serious moral questions about "rewrite your own widely used project from scratch with the same package name"... but I don't think it's fair to call this "someone else's project" when the OP has apparently been the only maintainer working on the project for 13 years...
- Arguing that you owned the copyright on the copied code (the author here has apparently been the sole maintainer of this library since 2013; not all, but a lot of the code that could be copied here probably already belongs to him...)
I'd disagree: the other training on top doesn't alter the fundamental nature of the model, which is that it's predicting the probabilities of the next token (followed by a sampling step, which can roughly be described as picking the most probable one).
It just changes the probability distribution that it is approximating.
To the extent that thinking is making a series of deductions from prior facts, it seems to me that thinking can be reduced to "pick the next most probable token from the correct probability distribution"...
The fundamental nature of the model is that it consumes tokens as input and produces token probabilities as output, but there's nothing inherently "predictive" about it -- that's just perspective hangover from the historical development of how LLMs were trained. It is, fundamentally, I think, a general-purpose thinking machine, operating over the inputs and outputs of tokens.
(With this perspective, I can feel my own brain subtly offering up a panoply of possible responses in a similar way. I can even turn up the temperature on my own brain, making it more likely to decide to say the less-obvious words in response, by having a drink or two.)
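The "probabilities plus a sampling step" picture above, including the temperature knob, can be sketched in a few lines (a toy sketch over raw scores, not any particular model's sampler):

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Pick a next-token id from raw model scores (logits).

    temperature < 1 sharpens the distribution (more 'obvious' words),
    temperature > 1 flattens it (less-obvious words become likelier).
    Real samplers typically add top-k/top-p truncation on top of this.
    """
    if temperature == 0:
        # Greedy decoding: just take the most probable token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature-scaled softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(probs)), weights=probs)[0]
```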
(Similarly, mimicry is in humans too a very good learning technique to get started -- kids learning to speak are little parrots, artists just starting out will often copy existing works, etc. Before going on to develop further into their own style.)
Non sequitur: "perspective hangover" might be my favorite phrase I've ever read. So much of what we deal with is trying to correct-the-record on how we used to think about things. But the inertia that old ideas or modes have is monumental to overcome. If you just came up with that, kudos.
We could argue about whether fine tuning is still about predicting a distribution or not, but really I feel like whether or not that word is accurate misses the point of why the description is useful.
I like the phrasing because it distinguishes it from other things the generative model might be doing including:
- Creating and then refining the whole response simultaneously, like diffusion models do.
- Having hidden state, where it first forms an "opinion" and then outputs it, e.g. seq2seq models. Previously output tokens are treated differently from input tokens at an architectural level.
- Having a hierarchical structure where you first decide what you're going to say, and then how you're going to say it, like wikipedia's hilarious description of how "sophisticated" natural language generation systems work (someone should really update this page): https://en.wikipedia.org/w/index.php?title=Natural_language_...
Welllll, I'm not so sure that phrase is well-suited for your intended meaning, then. (Also, tangentially, I think one could argue thinking models with the elided thought prelude satisfy "having hidden state where it first forms an opinion.")
Put a loop around an LLM and it can trivially be made Turing complete, so it boils down to whether thinking requires exceeding the Turing computable, and we have no evidence to suggest that is even possible.
As typically deployed [1], LLMs are not Turing complete. They're closer to linear bounded automata, but because transformers have a strict maximum input size they're actually a subset of the weaker class of deterministic finite automata [2]. These aren't like Python programs or something that can work on as much memory as you supply them; their architecture works on a fixed maximum amount of memory.
I'm not particularly convinced Turing completeness is the relevant property, though. I'm rather convinced that I'm not Turing complete either... my head is only so big, after all.
[1] i.e. in a loop that appends output tokens to the input and has some form of sliding context window (perhaps with some inserted instructions to "compact" and then sliding the context window right to after those instructions once the LLM emits some special "done compacting" tokens).
[2] Common sampling procedures make them mildly non-deterministic, but I don't believe they do so in a way that changes the theoretical class of these machines from DFAs.
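The deployment loop described in [1] can be sketched as follows (a toy sketch with a placeholder `llm` function standing in for the real model plus sampler; real harnesses add compaction instructions and proper tokenization):

```python
def run_llm_loop(llm, prompt_tokens, max_context=8, max_steps=50, stop_token=None):
    """Append each output token to the context, and slide the window
    when it exceeds the model's fixed maximum input size.

    `llm` is any function from a token list to a next token.
    """
    context = list(prompt_tokens)
    output = []
    for _ in range(max_steps):
        # Transformers accept a bounded input, so keep only the newest tokens.
        window = context[-max_context:]
        token = llm(window)
        if token == stop_token:
            break
        context.append(token)
        output.append(token)
    return output
```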
Context effectively provides an IO port, and so all the loop needs to do is simulate the tape head and provide a single token of state.
You can remain unconvinced that Turing completeness is relevant all you want - we don't know of any more expansive category of computable functions, and so given that an LLM in the setup described is Turing complete, the fact that they aren't typically deployed that way is irrelevant.
They trivially can be, and that is enough to make the shallow dismissal of pointing out they're "just" predicting the next token meaningless.
Turing Machines don't need access to the entire tape all at once, it's sufficient for it to see one cell at a time. You could certainly equip an LLM with a "read cell", "write cell", and "move left/right" tool and now you have a Turing machine. It doesn't need to keep any of its previous writes or reads in context. A sliding context window is more than capacious enough for this.
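That construction is easy to make concrete: a driver that only ever shows the transition function one cell at a time, where the transition function could be an LLM behind read/write/move tools (here a plain lookup table stands in for the LLM):

```python
from collections import defaultdict

def run_turing_machine(transition, tape, state="start", max_steps=1000):
    """Run a Turing machine whose transition function sees only one cell.

    `transition` maps (state, symbol) -> (new_state, symbol_to_write, move),
    with move in {"L", "R"}. The blank symbol is "_".
    """
    cells = defaultdict(lambda: "_", enumerate(tape))
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        state, write, move = transition(state, cells[head])
        cells[head] = write
        head += 1 if move == "R" else -1
    # Return the used portion of the tape, with surrounding blanks stripped.
    lo, hi = min(cells), max(cells)
    return "".join(cells[i] for i in range(lo, hi + 1)).strip("_")

# Tiny demo machine: flip every bit until the first blank, then halt.
flip = {("start", "0"): ("start", "1", "R"),
        ("start", "1"): ("start", "0", "R"),
        ("start", "_"): ("halt", "_", "R")}
```

Swap the lookup in `flip` for a model call and nothing else about the machine changes, which is the whole point of the argument.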
You're right of course, but at the point where you're saying "well we can make a turing machine with the LLM as the transition function by defining some tool calls for the LLM to interact with the tape" it feels like a stretch to call the LLM itself turing complete.
Also, people definitely talk about them as "thinking" in contexts where they haven't put a harness capable of this around them. And in the common contexts where people do put a harness theoretically capable of this around the LLM (e.g. giving the LLM access to bash), the LLM basically never uses that theoretical capability as the extra memory it would need to actually emulate a Turing machine.
And meanwhile I can use external memory myself in a similar way (e.g. writing things down), but I think I'm perfectly capable of thinking without doing so.
So I persist in my stance that Turing completeness is not the relevant property, and isn't really there anyway.
Yeah, humans and LLMs and a TM transition function are all Turing complete in the same way, but it's also basically a useless fact. You could possibly train a sufficiently motivated rat to compute a TM transition function.
That's why I specifically didn't call the LLM itself Turing complete, but stated that if you put a loop around an LLM you can trivially make it Turing complete. Maybe I should have been clearer and written "the combined system" instead of "it".
But the point is that this is irrelevant, because it is proof that unless human brains exceed the Turing computable, LLMs can at least theoretically be made to think. And that makes pushing the "they're just predicting the next token" argument anti-intellectual nonsense.
I am not sure it is proof, at least not in an interesting way. It's also proof that Magic: The Gathering could theoretically be made to think. Which is true but doesn't tell you anything much about MtG other than that it is a slightly complicated ruleset that has a couple of properties that are pretty common.
I think both sides of this end up proving "too much" in their respective directions.
> whether thinking requires exceeding the Turing computable
I've never seen any evidence that thinking requires such a thing.
And honestly I think theoretical computational classes are irrelevant to analysing what AI can or cannot do. Physical computers are only equivalent to finite state machines (ignoring the internet).
But the truth is that if something is equivalent to a finite state machine, with an absurd number of states, it doesn't really matter.
Hence why I finished the sentence "and we have no evidence to suggest that is even possible".
I think it's exceedingly improbable that we're any more than very advanced automatons, but I like to keep the door ajar and point out that the burden is on those claiming otherwise to present even a single example of a function we can compute that is outside the Turing computable if they want to open that door.
> Physical computers are only equivalent to finite state machines (ignoring the internet)
Physical computers are equivalent to Turing machines without the tape as long as they have access to IO.
On the other hand, as someone who has gone on week-long (and longer) hiking, kayaking, and most frequently canoeing trips in Canada, I was completely unaware of this service, and would have been completely uninterested in it if I had known about it.
It's pretty good for backcountry hiking/camping (or offroading in general) where you are potentially hours away from any kind of cellular service. Some of these weather radio stations have (had?) pretty good coverage. A cheapo radio that can receive weather radio frequencies could last weeks on a single battery charge. It's great to know if my planned hike for the next day is possible or if we should make alternate plans, or if a giant storm is due later in the day, that kind of thing. Once you've been out for a day or two, all the forecasts you had ahead of time are obsolete and incorrect, particularly in the mountains.
Yeah, forecasts are definitely pretty worthless past day 2 or 3, and I can see how someone could find it useful... but part of the charm with camping to me is definitely the decision making process being based on "look at the sky" and not "ask the technology". Definitely a personal taste sort of thing.
> Ok, so, it’s the same as before, but the outlet of the spout is now significantly deeper / lower. So the speed of the water should be higher, right?
> Ok, but if the water is faster at the bottom of the long spout… We could view the top part of this system as an exact copy of the short-spout version. At the interface between the tank bottom and the pipe-spout, the velocity of the water should be the same as in the no-pipe version, right? But that means the water inside the pipe is accelerating inside the pipe:
No, it's not an exact copy. In the top part of the long-spout system there's a lack of air pressure holding the water above it back, compared to the short spout, and quite a bit of cohesion in the water pulling the water above it down faster if the lack of air pressure isn't enough. The water in the whole system moves faster as a result.
You'd theoretically get the air (actually vacuum) bubble if you ran the experiment in a vacuum with a liquid that has no cohesion... liquids with no cohesion are otherwise known as gases, though, and they behave differently in other ways as well.
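For the speed question itself, Torricelli's law gives the ideal-fluid answer: the outlet speed is set by the depth of the *outlet* below the free surface, v = sqrt(2gh), so lowering the outlet speeds up the whole column above it (the numbers below are illustrative, not from the article):

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def torricelli_speed(depth_below_surface_m):
    """Ideal outflow speed for an inviscid liquid open to air: v = sqrt(2*g*h)."""
    return math.sqrt(2 * G * depth_below_surface_m)

# Hypothetical setup: 0.3 m of water in the tank, plus an 0.5 m pipe-spout.
short_spout = torricelli_speed(0.3)        # outlet at the tank bottom
long_spout = torricelli_speed(0.3 + 0.5)   # outlet 0.5 m lower
```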
It's a BUSL license - you can self-host so long as your application uses at most a single SpacetimeDB instance (i.e. no replication) and isn't a database service... or you are running a 5+ year old version of the server (and since this DB hasn't existed for 5+ years...).
I agree the wording is a bit strange, but a quick grep of the repo shows that it doesn't imply that.
The only usages of unsafe are in src/ffi, which is only compiled when the ffi feature is enabled. ffi is fundamentally unsafe ("unsafe" meaning "the compiler can't automatically verify this code won't result in undefined behavior") so using it there is reasonable, and the rest of the crate is properly free of unsafe.
Fill the base with concrete and use it as a bookend?