What format are the docs being uploaded as? By default, images uploaded into the chat are passed directly through. PDFs are parsed and fed to the LLM as text.
Writing is a really common use case, and something we'd like to explore more. Currently people often use Onyx for "write something combining X, Y, and Z documents", but I feel that's just scratching the surface.
I was mostly ranting about open-webui and hoping Onyx would be better than the current state. My use case involves PDFs with lots of complex figures, OCR'd through Mistral OCR, which gives text plus images for the figures (I have tried multiple others as well). I would really like to keep the figures as images, as OCR captions really struggle to capture the full semantic meaning.
But I'm stoked to see alternatives in this area, and will try it out once I get time.
That's one way people use Onyx! Specifically, the Projects feature (see the left sidebar) works similarly: you can upload an arbitrary number of files, going well beyond the context limit of your model, and then ask questions about them.
How we think about it: the chat product should be completely open-source and free (forever). To that end we've moved features like SSO (that used to be "enterprise") to be MIT licensed. The chat interface is something pretty much every team needs (be it a proprietary or open-source solution). You can think of this like Apache Spark for Databricks or Ray for Anyscale.
Also, as other folks have pointed out in the thread, there are quite a few other open-source options out there. So there's a ton of outside pressure for our open-source-only offering to be very competitive. We hope this reduces the "enshittification" risk that you speak of.
Haha, yea the UIs certainly have similarities (much of the industry converges to standard places to put different components, since users are familiar).
"Agents" is a particular area where we feel like we're better than the alternatives (especially if you want something that effectively calls multiple tools in sequence). Curious to hear your thoughts after trying it out!
Hmm, will have to disagree here. I think "one chat to rule them all" is the way it will end.
It does require having UI components for many different types of interactions (e.g. many ways to collect user input mid-session + display different tool responses like graphs and interactives). With this, people should be able to easily build complex tools/flows on top of that UI, and get a nice, single interface (no siloed tools/swapping) for free. And having this UI be open-source makes this easier.
I agree with an end state something like you describe, but I don't think it will be a chat app. I think you'll have an agent that lives outside your apps and manages them.
Great question! Depends on the specific alternative, but the broad points are:
- "Pure chat" experience. From our community (and personal use), we've observed that most queries don't actually involve enterprise search. They're much more likely to just require the LLM's internal knowledge (or web search / code execution). Compared to all the companies you've mentioned, we've spent a lot more time refining this more common flow.
- Larger connector suite. As soon as one key source isn't connected, the trustworthiness of the system is dramatically decreased. You second-guess "is the info needed to answer this question in there?" for every question. We have a community that builds out connectors for themselves and then contributes them back for everyone to use. This allows us to cover the long tail better than companies like Notion and Slack.
- Customizability. An open-source application is the perfect middle ground between a SaaS offering and building blocks. A SaaS option doesn't allow for any customization (we have many customers who have contributed back UX enhancements, small features like guardrails, or enhanced configurations that their users want). Building blocks demand too much domain expertise (search, frontend/UX, ...) for it to be realistic for companies to build something great.
Hmm, yea that's a great callout. Something we definitely have in our sights longer term (focus for now is to make sure that the desktop chat experience is truly amazing).
Yes absolutely! There's no license restriction on white-labeling, so we've seen lots of companies do that.
In our opinion, it's a bit silly to build completely in-house when you can take something like Onyx as the starting point and be >95% of the way there + have tons of bells and whistles built in.