[flagged] The future of code search is not regex – 100x faster than ripgrep (dmtrkovalenko.dev)
60 points by neogoose 15 days ago | 44 comments


I ran across this fascinating tool a few days ago while researching embedding models on Hugging Face.

It's advertised as "ColGREP Semantic code search for your terminal and your coding agents".

I haven't put it in any harness yet but I probably should.

https://github.com/lightonai/next-plaid/tree/main/colgrep

I've also tried ast-grep (also known as sg), but LLMs really mess it up. I think you'd need to fine-tune.

If anyone has cracked that case I'd love to hear about it


The future is lack of scrolling on mobile, and scanning getting stuck, apparently.


You don't miss much, don't worry, it also looks horrible on desktop.


it looks absolutely gorgeous btw, but the idea is that you can try the search speed, not actually use it lmao


To save people the digging, here's the git repo:

https://github.com/dmtrKovalenko/fff.nvim

"FFF stands for freakin fast fuzzy file finder (pick 3) and it is an opinionated fuzzy file picker for your AI agent and Neovim. Just for file search, but we do the file search really fff well.

FFF is a tool for grepping, fuzzy file matching, globbing, and multigrepping with a strong focus on performance and useful search results. For humans - provides an unbelievable typo-resistant experience, for AI agents - implements the fastest file search with additional free memory suggesting the best search results based on various factors like frecency, git status, file size, definition matches, and more."
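For anyone wondering what "frecency" means in that blurb: it is a score that blends how frequently and how recently a file was touched. Below is a minimal sketch, assuming an exponential decay with a one-week half-life. The function names, the half-life, and the file names are all hypothetical; the README quote does not say how FFF actually computes it.

```python
def frecency(access_times, now, half_life_days=7.0):
    """Sum one exponentially decayed weight per past access.

    access_times and now are Unix timestamps in seconds. An access
    made right now counts as 1.0, one made a half-life ago as 0.5.
    The half-life default is an assumption chosen for illustration.
    """
    half_life = half_life_days * 86400.0
    return sum(0.5 ** ((now - t) / half_life) for t in access_times)


def rank(history, now):
    """Order paths from highest to lowest frecency.

    history maps a path to the list of timestamps at which it was opened.
    """
    return sorted(history, key=lambda path: frecency(history[path], now), reverse=True)
```

With this scoring, a file opened once today outranks a file opened three times a month ago, which is roughly the behaviour you want from a file picker.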


So the repo builds:

- C library

- neovim plugin

- MCP server

But not a plain binary, which is the main way ripgrep is directly used (...at least by humans), and compared with.


because it is meant to be used as a long-running SDK, not for one-shot searches (this is where all the optimizations come from)
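To illustrate why a long-running process can beat a one-shot scan: it can build an index once and keep it in memory, so each query only touches the files that could possibly match. Here is a minimal sketch, assuming a trigram inverted index. This is a generic technique swapped in for illustration, not a description of what fff does internally (the thread does not say), and `TrigramIndex` is a hypothetical name.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """In-memory index: built once, then reused for many queries."""

    def __init__(self):
        self.docs = {}                    # file name -> content
        self.postings = defaultdict(set)  # trigram -> names of files containing it

    def add(self, name, content):
        self.docs[name] = content
        for tri in trigrams(content):
            self.postings[tri].add(name)

    def search(self, literal):
        """Return sorted names of files that contain the literal string."""
        tris = trigrams(literal)
        if not tris:
            # Query shorter than 3 characters: the index cannot help, scan everything.
            candidates = set(self.docs)
        else:
            # Only files containing every trigram of the query can match.
            candidates = set.intersection(
                *(self.postings.get(t, set()) for t in tris)
            )
        # Verify the candidates, since shared trigrams do not guarantee a match.
        return sorted(n for n in candidates if literal in self.docs[n])
```

A one-shot tool has to pay the full read cost on every invocation, while a process holding an index like this pays it once at startup. That is also why comparing the two is not quite apples to apples.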


considering that ripgrep has marginal overhead over just reading the files to /dev/null, how exactly does this achieve 100x speedup?

I have a lot of use for something that can search ~1GB of text "instantly", but so far nothing beats rg/ag after the data has been moved into RAM.


The trick to optimization is not "doing faster" but "doing less". I already feel rg is missing a ton of results I want to see because it has a very large ignore list by default.


i see this - complaint? - often, but i use grep for finding text in files in the filesystem, like normal people. But for specific datasets i'll use ag/rg. As an example, i have transcribed all of the "shows" i have access to for a couple of radio programs. When i want to do exploratory searches, i hit the set once with ag/rg, which takes 7-14 seconds to warm up once, then it's <1ms to search all 1500 text files or whatever.

So while i'm sure ag/rg may be frustrating to use in certain circumstances, by default it works great for searching text files, even structured text files, on disk.


alias rg="rg -iuu"


The crate says it uses SIMD, but the crate also says that content search is 20-50 times faster. Maybe the guy is unsure how fast it is or how much speedup he should claim to get recognition.


it very much depends on the platform and the operating system

for example, ripgrep doesn't do any memory mapping on macOS, which makes fff 2-3x faster just because of that


you can try it yourself. a ripgrep search for "MAX_FILE_SIZE" in the chromium repo takes 6-7 seconds, with fff it is 20 milliseconds

so essentially in this specific case it is over 1000x faster, but the repo size is huge (66G, 500k files)


I have open sourced the fastest code search implementation. It is a comprehensive SDK for both file finding and grep-style content search that is over 100x faster than ripgrep


This looks cool!

You should add a link to the GitHub repo for the project itself, at first I wasn't even sure what it was called.

I found this link https://github.com/dmtrKovalenko/fff.nvim


I don't get this submission title. Your tool uses regex but the title claims the future is not about regex.


I think it is about input. Before I had to type regex, now I just type text and fuzzy finds more, regex style. Awkward wording, but code seems cool.


my tool is not using regex, it can use regex but it is not required


k, but what actually are you talking about?


Where can I find the benchmark for the "20-50 times faster than ripgrep" claim from the documentation, or the "100x faster" claim from the HN submission title?

Ripgrep already has optimizations for regexes that are plain literals with no metacharacters (or even regexes that merely contain literal substrings). So "not regex" shouldn't be what makes the difference.
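The kind of literal fast path meant here can be sketched as follows. This is only an illustration of the idea in Python, not ripgrep's actual code (ripgrep is written in Rust and its literal handling is far more sophisticated); `find_lines` and `REGEX_META` are hypothetical names.

```python
import re

# Characters that give a pattern regex meaning. If none are present,
# the pattern can only match itself, so a substring search is enough.
REGEX_META = set(r".^$*+?{}[]\|()")


def find_lines(pattern, text):
    """Return 1-based numbers of the lines in text that match pattern."""
    lines = text.splitlines()
    if not REGEX_META & set(pattern):
        # Fast path: plain literal, skip the regex engine entirely.
        return [i for i, line in enumerate(lines, 1) if pattern in line]
    # Slow path: compile once, then run the regex on every line.
    rx = re.compile(pattern)
    return [i for i, line in enumerate(lines, 1) if rx.search(line)]
```

Both `find_lines("MAX_FILE_SIZE", text)` and `find_lines(r"MAX_\w+", text)` return the same lines for a matching file; only the first one avoids the regex engine. So a tool that accepts regex syntax loses nothing on plain-text queries.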


I've entered "bazel" and got `shellPrefix.ts` which doesn't relate to bazel in any way.

If that's the future then I'll stay in the past with ripgrep.


It's O(1) with a correctness of O(0)


you absolutely missed the point


if you would search in the chromium repo you would see the correct match https://fff.dmtrkovalenko.dev/?repo=2&q=bazel


I don't get it. How can I search anything but the file name?


Is there a write up of the underlying approach? The summary on the repo mentioned SIMD, but not a whole lot else.


Why is it "for neovim"? Surely such a thing would be useful in many applications?


Because it's being dishonest from multiple angles.

- it has regex, so the title is weird

- it definitely wouldn't be 100x faster than rg

- it's an SDK, so it's apples to oranges anyway


It has never been ripgrep for decades for those of us on IDEs.


To be fair, ripgrep is approximately one decade old, would be tricky to have used it for decades.

http://blog.burntsushi.net/ripgrep/

https://news.ycombinator.com/item?id=12564442


However, it's coming up on a decade (8 years) of vscode using ripgrep behind the scenes.


A programmer's editor. However with the right plugins, you get the same IDE capabilities for code searching in Java, C#, C++,...

Which basically runs an IDE headless (Eclipse, Netbeans, VS services,...), the joy of running an IDE + Electron, get to put those cores to use.


Has there been a general gentle decline in IDEs over the past 15 years or is it just me?


Maybe for a generation that has learnt to program with languages that are poorly supported by IDEs.


Zed has made me rethink this opinion.


Why do all vibecoded sites look the same? Same black on neon vibes and button styles


I saw this yesterday. It claims to be faster than ripgrep, and it uses regex and rg: https://github.com/erogol/ngi


Websites that don’t tell me what they’re doing are infuriating. I’m on mobile. This landing page experience is awful.


For desktops it's not different.


it is an absolutely amazing experience on mobile. if you guys do not understand how to use a search bar and a couple of segmented controls, there is nothing much I can do about it


ctags, GNU Global and even "ugrep -Q" would like to have a few words with you ;)


How's it work? Embed tokens and use euclidean distance or something?


what even is this



