> NOTE: Due to many WAFs employing JavaScript-level fingerprinting of web browsers, thermoptic also exposes hooks to utilize the browser for key steps of the scraping process. See this section for more information on this.
This reminds me of how Stripe does user tracking for fraud detection (https://mtlynch.io/stripe-update/). I wonder if thermoptic could handle that.
Yes, I definitely want to improve the search. It is currently very text-heavy, and I (only recently) got image similarity indexing working. Hoping to leverage this to do something like you mentioned!
I'd also like to figure out how to turn an image into a description of what's in it. My ML/TensorFlow knowledge is very weak though, so I still have a lot to learn here.
Have you tried something based on deep learning that uses Transformers?
https://github.com/roatienza/deep-text-recognition-benchmark (the available weights are for tasks that seem similar to OCR, so there is a good chance you can use it out of the box). With a good GPU it should process hundreds to thousands of images per second, so you can likely build your index in less than a day. (Maybe you can even port it to your iPhone stack :) )
There are tons of other freely available solutions that you can find by searching for keywords like "image to text ocr", "transformers", "visual transformers"...
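As a rough sanity check on the "less than a day" estimate, here is the back-of-the-envelope arithmetic, taking the quoted throughput at face value. The collection size and the rates plugged in below are made-up illustration values, not measurements of that model.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def images_per_day(images_per_second):
    """How many images a model can process in 24 hours at a steady rate."""
    return images_per_second * SECONDS_PER_DAY


def hours_to_index(n_images, images_per_second):
    """Wall-clock hours needed to run the model over the whole collection."""
    return n_images / images_per_second / 3600


# Hypothetical numbers: the low end of "hundreds per second",
# and a collection of 3.6 million images at 1000 images per second.
print(images_per_day(100))
print(hours_to_index(3_600_000, 1000))
```

Even at the low end of that range, a single day covers several million images, so the index size would have to be very large before this becomes the bottleneck.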
You can do better than a general image-to-text model for reading memes, because they all use the same fonts - so you want something trained on synthetic data made with that font.
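A minimal sketch of what generating that synthetic data could look like, assuming Pillow is installed for the rendering step. The word list, image size, colors, and `font_path` are all placeholders I made up; you would point `font_path` at whatever font your memes actually use (Impact is a common one) and swap in a realistic text source.

```python
import random

# Hypothetical vocabulary; a real run would sample from a large text corpus.
WORDS = ["one", "does", "not", "simply", "such", "wow", "much", "index",
         "when", "you", "finally", "ship", "it", "nobody", "literally"]


def random_caption(rng, min_words=2, max_words=6):
    """Return a short upper-case caption, the usual meme text style."""
    n = rng.randint(min_words, max_words)
    return " ".join(rng.choice(WORDS) for _ in range(n)).upper()


def render_caption(text, font_path, size=(640, 96), font_size=48, bg=(90, 90, 90)):
    """Draw white text with a black outline on a flat background."""
    # Imported here so caption generation still works without Pillow.
    from PIL import Image, ImageDraw, ImageFont

    font = ImageFont.truetype(font_path, font_size)
    img = Image.new("RGB", size, bg)
    draw = ImageDraw.Draw(img)
    draw.text((10, 10), text, font=font, fill="white",
              stroke_width=2, stroke_fill="black")
    return img


def make_dataset(n, font_path, seed=0):
    """Yield (image, label) pairs suitable for training a text recognizer."""
    rng = random.Random(seed)
    for _ in range(n):
        text = random_caption(rng)
        # Vary the background so the model does not overfit to one color.
        bg = tuple(rng.randint(0, 255) for _ in range(3))
        yield render_caption(text, font_path, bg=bg), text
```

Something like `for img, label in make_dataset(10, "/path/to/font.ttf"): ...` would then give you labeled pairs. A more realistic generator would paste the text over real photos and add JPEG compression and scaling noise, since that is what the text looks like in actual memes.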