> This is the garbage in, garbage out principle in action. The utility of a model is bottlenecked by its inputs. The more garbage you have, the more likely hallucinations will occur.
Good read, but I wouldn't fully extend the garbage in, garbage out principle to LLMs. These massive models are trained on internet-scale data, which includes a significant amount of garbage, and still perform pretty well. Hallucinations stem more from missing or misleading context than from noise alone. Tech-debt-heavy codebases, though unstructured, still provide information-rich context.
I vibe coded the shit out of a Chrome extension that does that while waiting on CI/CD. Go read content.js to make sure I'm not hacking your shit, download the repo to your computer, enable developer mode in Chrome, hit "load unpacked", point it at the directory with those files, and enjoy your tooltips.
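For anyone who hasn't loaded an unpacked extension before: the directory you point Chrome at just needs a manifest that wires up the content script. A minimal sketch, assuming a Manifest V3 setup (the name, match pattern, and everything besides content.js are placeholders, not the actual repo's values):

```json
{
  "manifest_version": 3,
  "name": "Tooltip Thing (placeholder name)",
  "version": "0.1.0",
  "content_scripts": [
    {
      "matches": ["<all_urls>"],
      "js": ["content.js"]
    }
  ]
}
```

With a manifest.json like that next to content.js, chrome://extensions → Developer mode → "Load unpacked" → select the folder is all it takes.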