This reminds me somewhat of the iSAX papers from ~2010 [0], which focused on time series but used a pretty cool method to binarize/discretize real-valued data and do search. I wonder how folks building things like FAISS or vector DBs incorporate ideas like this, or if the two worlds don’t overlap very often.
[0]. https://www.cs.ucr.edu/~eamonn/iSAX_2.0.pdf
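
For anyone who hasn't seen it, here is a rough sketch of the plain SAX discretization step that iSAX builds on. This is my own toy version, not code from the paper: the function name `sax_word` and the `segments`/`cardinality` parameters are made up for illustration, and it assumes the series length divides evenly into segments.

```python
from statistics import NormalDist, mean, pstdev

def sax_word(series, segments=4, cardinality=4):
    """Toy SAX: z-normalize, average into segments (PAA), bucket by Gaussian breakpoints."""
    mu, sigma = mean(series), pstdev(series)
    # A constant series has zero variance; treat it as all zeros after normalization.
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise Aggregate Approximation; assumes len(series) is divisible by segments.
    width = len(z) // segments
    paa = [mean(z[i * width:(i + 1) * width]) for i in range(segments)]
    # Breakpoints split the standard normal into `cardinality` equiprobable regions.
    cuts = [NormalDist().inv_cdf(k / cardinality) for k in range(1, cardinality)]
    # Each symbol is the index of the region that the segment mean falls into.
    return [sum(v > c for c in cuts) for v in paa]

word = sax_word([1, 1, 2, 2, 3, 3, 4, 4], segments=4, cardinality=4)
```

As I understand it, iSAX goes further by letting each symbol carry its own power-of-two cardinality, so the symbols act like bit strings that can be refined as the index splits. The sketch above only covers the fixed-cardinality case.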