Hacker News | skeptrune's comments

lmao, i love this

this is awesome. beyond happy to see it

yea, ChromaDB is not the point. multiple data storage solutions would work

I see... so you're not using the vectors at all. Where are the evaluations showing this chromaFS approach performs better than vectors?

Working on publishing those, but benchmarks require a lot of attention to detail, so it will likely be a bit longer.

agreed!

We would also be super interested to see that comparison. I agree that there isn't a specific reason why Chroma would be required to build something like this.

I agree that would have been the way to go given more time and resources. However, setting up a FUSE mount would have taken significantly longer and required additional infrastructure.

100% agree. However, if there were no resource tradeoffs, then a FUSE mount would probably be the way to go.

Modern OCR tooling is quite good. If the knowledge you're adding to your search database can be OCR'd, then I think the approach we took here can be generalized.

Hmmm, the post is an attempt to explain that Mintlify migrated from an embedding-retrieval -> reranker -> LLM pipeline to an agent loop that can call POSIX tools as it sees fit. Perhaps we didn't provide enough detail?
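For readers curious what such an agent loop looks like structurally, here is a minimal sketch. The tool set, the `call_llm` contract, and the transcript format are all hypothetical placeholders, not Mintlify's actual implementation:

```python
import subprocess

# Hypothetical POSIX tools the loop exposes to the model (a sketch only).
TOOLS = {
    "grep": lambda pattern, path: subprocess.run(
        ["grep", "-rn", pattern, path], capture_output=True, text=True
    ).stdout,
    "ls": lambda path: subprocess.run(
        ["ls", path], capture_output=True, text=True
    ).stdout,
    "cat": lambda path: subprocess.run(
        ["cat", path], capture_output=True, text=True
    ).stdout,
}

def agent_loop(question, call_llm, max_steps=5):
    """Let the model decide which POSIX tool to run next until it answers.

    call_llm is assumed to take the transcript so far and return either
    ("answer", text) or ("tool", name, args) -- an invented contract.
    """
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        action = call_llm("\n".join(transcript))
        if action[0] == "answer":
            return action[1]
        _, name, args = action
        result = TOOLS[name](*args)
        transcript.append(f"{name}({args}) -> {result}")
    return "No answer found."
```

The key difference from a fixed retrieval pipeline is that the model chooses the next step each iteration, so it can ls a directory, grep for a term, then cat a promising file, in whatever order the question demands.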

That matches what I'm curious about: where an LLM is doing the bulk of information discovery and tool calling directly. Most simpler RAG setups have an LLM on the frontend mostly just doing query cleanup, subqueries, and taxonomy, then again later to rerank and parse the data. So I'd imagine the prompting and guardrails part is much more complicated in an agent-loop approach, since it's more powerful and open ended.
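For contrast, the simpler pipeline described here can be sketched as a fixed sequence of stages. All function names below are hypothetical placeholders for the cleanup, retrieval, rerank, and generation steps:

```python
def classic_rag(question, clean_query, retrieve, rerank, generate, k=20, top_n=5):
    """Fixed-pipeline RAG: the LLM only touches the edges of the flow.

    clean_query: LLM pass that rewrites the raw question (cleanup, subqueries)
    retrieve:    embedding search returning candidate chunks for a query
    rerank:      cross-encoder or LLM pass ordering candidates by relevance
    generate:    final LLM call that answers from the top chunks
    """
    queries = clean_query(question)            # 1. query cleanup / subqueries
    candidates = []
    for q in queries:
        candidates.extend(retrieve(q, k))      # 2. vector retrieval per subquery
    ranked = rerank(question, candidates)      # 3. rerank the merged pool
    return generate(question, ranked[:top_n])  # 4. answer from the top-n chunks
```

Every step here is hard-wired, which keeps the prompting and guardrails simple but means the model cannot change course mid-search the way an agent loop can.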

Vector search has moved from a "complete solution" to just one tool among many that you should likely provide to an agent.
