Something I wonder about sometimes: how did you know about the existence of this...

sqrt17 · on July 30, 2010

There's several central ideas in CS that most people know. (Not everyone knows every single of them, but most people know many of them). One of them is the idea of finite-state automata. If you have a finite set of strings, or a regular language, you can build a DFA to recognize that.

Once you know know about DFAs, and you realize that there are only finitely many words that have a Levenshtein distance of less than k, it's just a question of (i) practicality and (ii) technical details. But when you have the idea to combine concepts X and Y, you know what terms to search for and what papers you want to look at.

As to practicality, there's potentially many words that have a Levenshtein distance of at most two for some word, but you don't care about the bits that are different, so the automaton is much smaller.

Part of being a successful CS background is to know all the kinds of hammers that are around and having a feel for the right hammer to use given a particular nail.

In this case, the hammers are abstractions. Algorithms are fundamentally driven by abstractions into which you then fill necessary technical detail - getting the technical details wrong may leave you with a correct but inefficient algorithm at worst (but enable you to recognize a more efficient way when you come across it), whereas you're bound to reinvent a new duct-taped makeshift wheel wherever you go when you don't know the right abstractions.

Confusion · on July 30, 2010

Thanks for your explanation. It's a reasonably comforting answer, since it suggests that as long as I spend time picking up concepts (which I have been doing ever since I went into software engineering) and connecting my problems to those concepts, there is a reasonable chance I will stumble upon a beautiful solution like a Levenshtein automaton, if it exists for the problem at hand.

viggity · on July 30, 2010

I read a lot, that is how I know about most of the algorithms in my arsenal. But in this particular instance we dealt with Levenshtein Distance in a cryptography graduate class at Iowa State and I've used it in a couple other places other than Atomiq.

I'd recommend reading as much as you can. Go out to wikipedia and start with the article on Donald Knuth, anything cool worth noting about algorithms isn't more than 2 or 3 links removed from his page.