> The analogy to natural language carried further into practice than Larry Wall could have dreamed.
Over the years I've seen this mantra repeated a million times but I've yet to have anyone actually explain what it means beyond a piece of marketing fluff. What actually is this analogy in concrete terms?
A lot of languages — in fact, a lot of software in general — are designed to make things easy for the computer/programmer: "This is how it works and you just have to adapt; deal with it."
Larry Wall put a lot of effort into making Perl try to adapt to the way the user already thinks instead, including taking advantage of having studied linguistics to figure out how the "language" part of a computer language fits (or doesn't fit) the way our brains process languages.
The result was a language that has a lot of power and flexibility. People who like Perl tend to love it, because it fits like a tailor-made suit instead of all those off-the-rack languages. People who dislike Perl tend to hate it, because all that power and flexibility can be easily abused (even when you're trying not to — there's a reason so many people buy off the rack instead of making DIY suits).
I think there's a greater underlying dichotomy here that's relevant to your tailor-made suit analogy: the people who love Perl tend to be the people who are building and maintaining mostly only their own tooling and (usually small) codebases, and the people who don't love Perl tend to be the ones that are responsible for maintaining large piles of other people's code.
A language that's "tailor-made" for the individual is fantastic when you only have to deal with your own code. You get to write stuff that makes sense to you. It doesn't matter if it's unconventional or doesn't make sense to anyone else -- it fits the way that your brain works.
But when cooperating on larger projects with lots of other people involved, this backfires remarkably. You now need to understand the way that half a dozen or more other brains work.
As software development has gradually shifted away from the published efforts of lone hackers and towards the collaborative efforts of teams of specialized people, Perl's "TMTOWTDI" approach has made it a far more expensive language to develop software in than all of the alternatives.
I was stating a syllogism there: understanding and working with different coding styles is difficult, increased difficulty adds increased expense, and Perl supports many different coding styles.
As far as I know, nobody's done a comprehensive cost analysis of similar software built using different languages, methodologies, or architectures. Certainly nothing recent. I've harped on this before quite loudly and I think it's one of the reasons our industry can't be called "engineering".
People's individual experiences just don't make for good enough data because there are too many confounding variables.
Larry was trained as a missionary and educated as a linguist. He came to language development with the perspective of a linguist. This is why Perl has context and pronouns.
Perl distinguishes scalar and list context, including when calling a subroutine. Some Perl builtins return different things in the different contexts, and your own subroutines can too, for example by checking wantarray() in your code. In list context, returning an array or assigning an array to something copies the array; in scalar context, an array's value is its number of elements. The readline operator <$filehandle> reads from the file behind $filehandle in either context, but in scalar context it reads the "next line" (depending on line-ending settings and seeks and such), while in list context it reads and returns the list of lines in the file. This has the great benefit that a while loop with <> reads one line at a time into memory and sets a variable (either one you specify or, by default, $_) to the line just read, whereas a foreach loop with <> reads all the lines into memory and then operates on them, which is sometimes preferable.
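A minimal sketch of how context plays out (standard Perl 5; the variable and subroutine names are mine):

```perl
use strict;
use warnings;

my @lines = ("a\n", "b\n", "c\n");

my @copy  = @lines;   # list context: copies the three elements
my $count = @lines;   # scalar context: the element count, 3

# A subroutine can ask which context it was called in:
sub which_context { return wantarray ? "list" : "scalar" }

my @as_list   = which_context();   # ("list")
my $as_scalar = which_context();   # "scalar"
```

The same expression, `@lines` or `which_context()`, means something different depending on what the surrounding code asks of it — which is exactly the "context" linguists talk about.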
There are operators for numbers and for strings and a single scalar variable can be accessed in either context. This allows for easy coercion from the string "42" to the numeric value 42 or back. It's possible with some trickery to set these two values independently, but don't do that.
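For instance, a sketch of the operator-driven coercion (standard Perl 5, my own variable names):

```perl
use strict;
use warnings;

my $x = "42";          # a scalar holding the string "42"
my $sum = $x + 1;      # numeric operator: coerces to 42, gives 43
my $cat = $x . "0";    # string operator: stays a string, gives "420"

# Comparison operators also come in numeric and string flavors:
my $num_cmp = ( 9  <=> 10 );   # -1: 9 is numerically less than 10
my $str_cmp = ("9" cmp "10");  #  1: "9" sorts after "10" as a string
```

It's the operator, not the variable, that decides whether you're dealing with a number or a string.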
There are the pronouns $_ and _, which recall the most recent subject in some situations. There's a magic array @_, which is the list of arguments passed to a subroutine or method. There are match variables for regular expressions ($1, $2, and so on), which refer back to the matches. There are the variables $a and $b, reserved "magic" variables used to support user-defined comparator blocks for the built-in sort(). And there's a variable ($.) for which line of an input file you just read.
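Most of those "pronouns" in one short sketch (standard Perl 5):

```perl
use strict;
use warnings;

# $_ is the implicit topic in map, grep, for, and many builtins:
my @lens = map { length } ("foo", "quux");       # (3, 4)

# @_ holds the arguments passed to a subroutine:
sub first_arg { return $_[0] }
my $first = first_arg("a", "b");                 # "a"

# $a and $b are the reserved comparator variables for sort:
my @sorted = sort { $a <=> $b } (10, 2, 33);     # (2, 10, 33)

# $1, $2, ... refer back to the most recent regex captures:
my ($year, $month) = ("2024-05" =~ /(\d+)-(\d+)/);
```

In each case the variable refers back to something the surrounding construct just established, the way "it" refers back to the last noun.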
Ok, the analogy of magic variables as pronouns actually makes them make a little more sense as a concept, even though I still don't think that the ambiguity of meaning is a useful feature in a programming language overall. In a natural language an ambiguous pronoun can be clarified by a later sentence - hell, the appropriate context could come quite a bit later on if that's the intention, and comedy certainly uses this trick sometimes. But that's not true for a programming language, where magic variables aren't ever really ambiguous; they're just context dependent. My guess is that good Perl developers learn all the rules for all the different contexts and then internalise them, forgetting the stretch between where they are now and when they were beginners who only used simple things and so didn't need to know more than "use $_ here, use @_ there".
People looking at it for the first time often look at the variables available, the built-in functions and procedures available, the fact that regularish expressions have their own mini language that nests naked inside the main syntax, and a few oddities like sigils, and sometimes conclude that learning all of this is a huge breadth-first search. People almost immediately go to, "even if I don't need to use all of this, I need to understand all of it to maintain other people's code". If you follow a really unprincipled developer who's showing off, that may be the case.
I've been programming in Perl5 for some part of my job duties since 1998. In high school before that, I was studying two foreign human languages and was taught the basics of four or five programming languages. At university I started a third foreign language and another programming language. All this time I was also learning a bit of programming language here and there on hobby projects. I've learned several more programming languages since, some of which I use alongside Perl pretty much daily.
From my experience the core language of Perl5 contains a subset which could be thought of as its virtual core. Much of what's built in is used more like specialty modules that are only pulled out when necessary. I think this has a lot to do with Perl up through the Perl 4 days not having great module support. There are other huge languages in the wild: Ada, PL/I, and any shell on a system with a big bin or sbin directory come to mind. It really is, in my experience, a depth-first language with lots of side paths to explore one at a time. I don't have hard data on people learning the language, but I've met many people who share this sort of thought about it.
One solid example of a core feature that's really complete but rarely used is formats. You can send output through these sieves, a report-template language built into core Perl5. They are really handy if you're using Perl to match a format from RPG or COBOL, or if you're just generating the sort of reports you might generate from those languages. You don't ever need to worry about them if you're not working on code that does that. I've written Perl formats two or three times and had to deal with them in maintenance maybe half a dozen times, including the ones I wrote. There's a separate manual document on them. They are not part of the core in Perl6/Raku, but someone's older program would break if they were taken out of Perl5. https://perldoc.perl.org/perlform.html
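To give a flavor of what a format looks like, here's a minimal sketch (standard Perl 5; the column widths, field names, and data are mine — see perlform for the real details):

```perl
use strict;
use warnings;

our ($name, $amount);   # format picture lines see package variables

# A picture line (column layout) followed by the values that fill it:
format STDOUT =
@<<<<<<<<<<<<< @>>>>>>
$name,         $amount
.

for ([ "widgets", 120 ], [ "gizmos", 7 ]) {
    ($name, $amount) = @$_;
    write;              # sends one record through the format
}
```

Each `write` emits a line with `$name` left-justified in 14 columns and `$amount` right-justified in 7 — exactly the fixed-column report style of RPG and COBOL shops.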
There's a separate document for pack() and unpack() which are very handy if you're hand-translating character sets or working with a binary protocol or something. Otherwise you'll not use those, but they are in that huge core language.
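A small sketch of pack() and unpack() on a made-up binary record (the template letters are standard; the "protocol" itself is my invention for illustration):

```perl
use strict;
use warnings;

# pack builds a binary string from values; unpack takes one apart.
# Template: "A4" = 4 ASCII chars (space-padded), "N" = 32-bit big-endian
# unsigned integer, "n" = 16-bit big-endian unsigned integer.
my $packet = pack("A4 N n", "PING", 123456, 80);   # 10 bytes total

my ($tag, $seq, $port) = unpack("A4 N n", $packet);
# $tag is "PING", $seq is 123456, $port is 80
```

The same template string drives both directions, which is what makes the pair so handy for hand-rolled wire formats.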
The data structures and references in Perl5 look different than in many other languages. There are docs on syntax, others on references, and then another specifically on data structures. Perl6 uses dot notation similar to other languages, which is one of the syntactical incompatibilities with Perl5.
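For readers who haven't seen the Perl5 style, a short sketch of references and a nested structure (standard syntax; the data is made up):

```perl
use strict;
use warnings;

# A reference is a scalar that points at some other piece of data.
my @nums   = (1, 2, 3);
my $aref   = \@nums;                # take a reference to the array
my $first  = $aref->[0];            # arrow dereference: 1
my $second = ${$aref}[1];           # equivalent sigil dereference: 2

# Nested structures are built out of references (here, anonymous ones):
my %config = (
    hosts => [ "alpha", "beta" ],            # arrayref value
    ports => { http => 80, https => 443 },   # hashref value
);
my $host = $config{hosts}[1];       # "beta"
my $port = $config{ports}{https};   # 443
```

The sigil tells you what you're getting out (`$` for one thing, `@` for a list), not what the container is — which is one of the things newcomers from other languages find alien.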
One place where "all those choices" is a real drawback is the number of ways to make and use objects in Perl5. The initial way Larry gave us is workable and performant but a bit ugly and utilitarian. Many CPAN modules brought their own object systems. Then, after the Perl6 team figured out what they wanted to do for objects, Perl5 got a backport called Moose. Moose is nice to use but is a behemoth and uses memory like one. Someone came along and gave us a lighter version named Moo, then someone gave us Mouse, and another group gave us Mo. In the industry and community, Moo is considered the best practice now. It even upgrades specific objects to full Moose objects if the lightweight version won't do for some reason. So if you can write and maintain Moo code and the original Perl5 blessed-data-structure objects, you're good for most object-oriented Perl code, but far from all. There's also Object::InsideOut, Class::Accessor, and I forget how many others.
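For reference, the original blessed-data-structure style looks like this (core Perl 5, no modules; the class is a toy of my own):

```perl
use strict;
use warnings;

package Counter;

sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} // 0 };
    return bless $self, $class;    # bless ties the hashref to the class
}

sub increment {
    my ($self) = @_;
    return ++$self->{count};
}

package main;

my $c = Counter->new(start => 5);
my $v = $c->increment;             # 6
```

In Moo the same class would declare `has count => (is => 'rw', default => 0);` and get the constructor and accessor generated for it — less boilerplate, at the cost of another layer to learn.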
One place where Perl6/Raku/Camelia really fixes things is in having a really nice default object system. Another is that there's no need to graft a different object system on in place of it, because it includes a full and fully accessible MOP (meta-object protocol).
Writing and reading Perl was meant to feel more like producing and consuming natural language. When it comes to natural language, our ability to use it for practical purposes is a fact, and whether it can be understood rigorously or shoehorned into a logical framework is an academic question. With that in mind, Perl was designed without regard for whether the everyday user of the language could understand the logic beneath it. All that mattered was that people could learn to produce meaningful chunks of code and understand code produced by others who wished to be understood. Unlike any natural language, of course, there was a standard implementation, so there was a logic underlying it all, but being able to pull apart a piece of code and explain how the parts worked together was considered to be of secondary importance to being able to successfully produce working LOC.
You even saw "folk theories" about how the language worked, which were valid and useful to the extent that they helped people produce and understand a certain subset of valid Perl. These folk theories would trip up people like me who expected them to be mathematical truths that could be combined with each other in novel ways, while more Perl-attuned people knew that this kind of analysis was as hazardous as assuming that fingers fing, or that "black big dog" is as valid as "big black dog." There are always new rules to learn and more precise versions of rules you thought you already knew, and ideally you would end up following the rules without knowing them [1]. Ideally, the better you got at reading and writing Perl, the less you thought about those things, just like we study rules of grammar as children while we're learning to stop writing incomprehensible crap like "dog run big I saw," and the better we get at writing the more we realize that the "rules" of grammar are mostly wrong and beside the point anyway. You didn't expect the sysadmin producing reams of working Perl in the next cube to be able to explain what a reference was just like you didn't expect a master advertising copy writer to be able to diagram a sentence for you.
In my attempt to learn Perl the way you would learn math, or the way you would learn a traditional programming language that was designed to be used by logically combining completely understood parts, I discovered that there were some people who understood Perl and could confidently recombine pieces into new idioms and know what they would do without trying them, but in the Perl community it was considered okay that these extraordinary people were a small minority, just like we don't worry that there isn't a professor of linguistics standing by when you order a meal in a restaurant.
Personally, I found it to be a fascinating abomination. An abomination because it flies in the face of how I think good reliable software should be written, and fascinating because it seemed to work surprisingly well.
> or that "black big dog" is as valid as "big black dog."
Tangentially: it is, though; the two phrases are both valid, but have different meaning. “big dog” is an idiomatic phrase that, when used as such, functions as a single noun, which can be modified by “black” as a preceding adjective (if “black” is in the usual place between “big” and “dog”, the words have their individual meaning not that of the idiomatic phrase.)
Not a super common one though, and even if you were using the idiom you wouldn't write "black big dog" because it still sounds incorrect and makes it less clear you're using the idiom.