1. gives you a good introduction to writing parsers from scratch (without lex/yacc'ish parsing frameworks), and is probably a good warm-up before the book you mention.
2. gives you an introduction to state-of-the-art parsing with a framework (ANTLR), plus some material on compilation. Note: ANTLR also has a nice IDE for rapidly developing and prototyping parsers, ANTLRWorks. See http://antlr.org for more info.
"You are adamantly opposed to function/methods over 20 lines of code."
There are (at least) 2 problems with long functions/methods:
1) They typically lead to more duplicated (and less reusable) code, e.g. blocks of code repeated across functions that could have been replaced with short, to-the-point functions. A typical sign of where to extract a function is a long block with a comment above it inside an even longer function: replace the block with a function whose name is inspired by the comment.
2) The methods become harder to test and, possibly even more important, the test code becomes harder to maintain (and untested code of some complexity usually doesn't work). For example, what happens to your existing tests if you add some new conditions at the top of a long method?
A good argument might be chunking: the (average) human brain can keep only about six or seven chunks of information in short-term memory.
A method with 20 lines of code is roughly 20 chunks. Move some parts of those 20 lines into other functions, and you have created fewer, more effective chunks. Presumably, the brain has a much easier time understanding things it can actually keep in memory.
One idea is to create functions instead of comments: instead of writing "compute the rank" followed by some code, create and call a method computeRank().
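A minimal sketch of that idea in Python. Everything here is made up for illustration: the ranking rule (one plus the number of higher scores), the report format, and the names report_before, report_after, compute_rank and format_rank are all assumptions.

```python
def report_before(scores, target):
    """One longer function whose steps are labelled with comments."""
    # compute the rank
    rank = 1
    for score in scores:
        if score > target:
            rank += 1
    # format the result
    return f"{target} is ranked {rank} of {len(scores)}"


def compute_rank(scores, target):
    """Return 1 plus the number of scores strictly greater than target."""
    return 1 + sum(1 for score in scores if score > target)


def format_rank(target, rank, total):
    """Build the human-readable report line."""
    return f"{target} is ranked {rank} of {total}"


def report_after(scores, target):
    """Same behaviour; the comments have become function names."""
    rank = compute_rank(scores, target)
    return format_rank(target, rank, len(scores))
```

Both versions return the same string, but compute_rank can now be tested and reused on its own, and report_after reads as two chunks instead of seven lines.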
But I know a significant number of scientific papers are published in languages other than English, e.g. in Chinese on Wanfang Data (www.wanfangdata.com, an affiliate of the Chinese Ministry of Science and Technology). The same is probably true for many other languages.
2. The Definitive ANTLR Reference http://www.pragprog.com/titles/tpantlr
1. gives you a good introduction to writing parsers from scratch (without lex/yacc'ish parsing frameworks), and is probably a good warm-up before the book you mention.
2. gives you an introduction to state-of-the-art parsing with a framework (ANTLR), plus some material on compilation. Note: ANTLR also has a nice IDE for rapidly developing and prototyping parsers, ANTLRWorks. See http://antlr.org for more info.
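To give a feel for what "from scratch" means, here is a small hand-written recursive-descent parser in Python. It is only a sketch under assumptions of my own: the toy arithmetic grammar, the Parser class and the evaluate helper are not taken from either book.

```python
import re

# A token is an integer or one of the operator/parenthesis characters.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into a list of token strings."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each returns the computed value."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        # term := factor (('*' | '/') factor)*
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        token = self.advance()
        if token == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    """Parse and evaluate an arithmetic expression."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

For example, evaluate("2 + 3 * (4 - 1)") returns 11. Operator precedence falls out of the rule nesting (expr calls term, term calls factor), and the loops make the operators left-associative. A framework like ANTLR generates this kind of code from a grammar file instead.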
A very nice parsing framework for Python is dparser. It allows you to write grammars as docstrings to methods, which makes it very easy to try things out: http://www.ibm.com/developerworks/linux/library/l-cpdpars.ht... http://dparser.sourceforge.net/
Definite Clause Grammars for Prolog are also worth a look (at least for reference).