> One solution is to ensemble semantic search with keyword search. BM25 is a solid baseline when we expect at least one keyword to match. Nonetheless, it doesn’t do as well on shorter queries where there’s no keyword overlap with the relevant documents—in this case, averaged keyword embeddings may perform better. By combining the best of keyword search and semantic search, we can improve recall for various types of queries.
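The quote doesn't say how the two result lists get combined. One common, parameter-light way to do it is reciprocal rank fusion (RRF), which needs only the ranks, not comparable scores. A minimal sketch (the `k=60` constant is the conventional default, not something from the article):

```python
def rrf_merge(rankings, k=60):
    """Reciprocal Rank Fusion: combine several ranked lists of doc ids.

    Each document's fused score is the sum over lists of 1 / (k + rank),
    so items ranked highly by any retriever float to the top, and k damps
    the influence of any single list's top ranks.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Example: keyword (BM25) and semantic rankings disagree on order.
bm25_hits = ["d1", "d2", "d3"]
dense_hits = ["d3", "d1", "d4"]
print(rrf_merge([bm25_hits, dense_hits]))  # → ['d1', 'd3', 'd2', 'd4']
```

Because RRF ignores raw scores, it sidesteps the problem that BM25 scores and cosine similarities live on different scales.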
I think something akin to a mashup of Engelbart's augmentation, Nelson's Xanadu (r), and Bucky's tensegrity system would make a great accompanying knowledge-management system for branching conversations with AI; after a while, handling the generated content becomes a task in itself. Visualising the created data would be ace.
I only know tensegrity as a structural engineering concept, and although I'm not on a nickname level of familiarity with Buckminster Fuller, I'm still confused as to how it applies here.
Tensegrity structures are, afaik, nature's most stable yet diverse systems. Their strength combined with flexibility gave me the notion that a combination of data storage and vector databases would benefit from struts that can have properties, for example tension and strength. You could then feed semantic information to the struts, and the emerging structure could be mapped into fewer dimensions to be visualised in 3D, or something like that :)
The aim is to combine my 'truths' (belief systems, eg Xanadu (r); irrefutable measurable facts, think Wolfram; creative multimedia content, think TikTok meets Twitter; machine learning; sentiment and content analysis) with something like GPT, to function as an advanced mind-mapping tool wherein I can explore ideas, pulling from all these 'experts' into a coherent chain of information that can be traversed and branched in a Q&A style to extend the system again. Or something like that...
"Most (if not all) embedding-based retrieval use approximate nearest neighbours (ANN). If we use exact nearest neighbours, we would get perfect recall of 1.0 but with higher latency (think seconds). In contrast, ANN offers good-enough recall (~0.95) with millisecond latency. I’ve previously compared several open-source ANNs and most achieved ~0.95 recall at throughput of hundreds to thousands of queries per second."
Can the results of multiple very fast, approximate queries somehow be used to get the equivalent of one very slow, reliable query?
Might be useful to have an LLM generate a few versions of the query in order to account for imperfect recall — it seems like something gpt-3.5-turbo and friends would be pretty good at doing.
I've been thinking about using BM25 as a retrieval method to enhance LLMs, so I'm happy to see it mentioned here. It can complement vector search, but if it turns out to be useful on its own (maybe with query expansion and other tricks), it could serve as an alternative to vector search when running locally, or in environments with less compute.
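For the locally-running case, it helps that BM25 itself is tiny. A minimal, self-contained sketch of Okapi BM25 over whitespace-tokenized documents (no stemming or stopword handling; `k1` and `b` are the usual defaults):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each whitespace-tokenized doc against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # Document frequency: how many docs contain each term.
    df = Counter(t for d in tokenized for t in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (N - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = ["the cat sat on the mat", "dogs chase cats", "quantum field theory"]
print(bm25_scores("cat mat", docs))  # highest score for the first doc
```

Note that without stemming, "cat" in the query does not match "cats" in the second doc, which illustrates exactly the keyword-overlap weakness the article mentions.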
Oh hey I have a demo of that here: https://findsight.ai
For it I wrote a custom search engine and KNN index implementation that ranks and merges results across three stages (labels, full-text, embedding). To speed up retrieval, the OpenAI embeddings are stored as SuperBit signatures instead. Rank merging turned out to be a really hard problem.
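For readers unfamiliar with the trick: SuperBit is a refinement of sign-random-projection LSH (the random hyperplanes are orthogonalized in batches, which lowers the variance of the estimate). The core idea, sketched below in the plain sign-random-projection form without the orthogonalization step, is that each bit records which side of a random hyperplane a vector falls on, and Hamming distance between signatures then estimates the cosine similarity:

```python
import math
import random

def make_hyperplanes(dim, n_bits, seed=0):
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of it the vector falls on."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (dot >= 0)
    return bits

def hamming(a, b):
    return bin(a ^ b).count("1")

def estimated_cosine(sig_a, sig_b, n_bits):
    # P(bits differ) ≈ angle / pi, so cos(pi * hamming / n_bits)
    # estimates the cosine similarity of the original vectors.
    return math.cos(math.pi * hamming(sig_a, sig_b) / n_bits)
```

Comparing two compact integers with XOR and popcount is far cheaper than a full dot product over 1536-dimensional float embeddings, which is presumably the speed-up being described.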