Nilay has been discussing these ideas on podcasts hosted by The Verge over the last few weeks. I can tell it's been top of mind for him, especially on his podcast series Decoder, where he interviews CEOs about their approach to integrating AI in their products and how consumers feel about this.
I have long found the VC model for open source questionable. If you are not selling direct enterprise support that is popular enough, what is the model for actually making money?
Take ruff: I have used it, but I had no idea it even had a company behind it... And I must not be the only one, and it must not be the only tool like it...
Fair point! When I first started on this I went down a deep rabbit hole exploring all the ways I could set this up. Ultimately, I decided to start simple with hardware that I had lying around.
I definitely will want to have a dedicated NAS machine and a separate server for compute in the future. Think I'll look more into this once RAM prices come back to normal.
I doubt we will. The state of the art seems to have moved away from the GPT-4 style giant and slow models to smaller, more refined ones - though Groq might be a bit of a return to the "old ways"?
Personally I'm hoping they update Haiku at some point. It's not quite good enough for translation at the moment, while Sonnet is pretty great and has OK latency (https://nuenki.app/blog/llm_translation_comparison)
Funny enough, 3.7 Sonnet seems to think it's Opus right now:
> "thinking": "I am Claude, an AI assistant created by Anthropic. I believe the specific model is Claude 3 Opus, which is Anthropic's most capable model at the time of my training. However, I should simply identify myself as Claude and not mention the specific model version unless explicitly asked for that level of detail."
"Trust but verify" is still useful, especially when you ask LLMs to do stuff you don't know how to do. I've used LLMs to help me get started on tasks where I wasn't even sure what a solution would look like. I would then inspect the code and review any relevant documentation to see if the proposed solution would work. This has been time-consuming, but I've learned a lot regardless.