Hack: How many programming languages are there?
Introduction
Last month, as I have many other times in the past, I ordered a trunk full of books from Amazon.com. This time, I bought Structure and Interpretation of Computer Programs (SICP) and several of the books that the Wikipedia article mentions as having been inspired by that book’s style. Naturally, I plan to eventually work my way through each and every one of those texts.
No guesses as to how far across the floor my grizzled white beard will stretch by that time, though.
I also picked up the rest of a set of books authored or coauthored by Indiana University professor Daniel P. Friedman: The Seasoned Schemer (with Matthias Felleisen, the PLT Scheme founder) and The Reasoned Schemer (with William E. Byrd and Oleg Kiselyov). I already had The Little Schemer (with Felleisen) and … The Little MLer (Felleisen and Friedman). I don’t speak Java and can’t drum up much interest in learning, so I’m going to pass on A Little Java, A Few Patterns, at least for the foreseeable future.
Here are the titles that I’ve got (but haven’t read in their entirety yet), along with a blurb regarding the purpose of each book (gleaned from their prefaces):
- The Little Schemer
- How to think recursively. Revised version of Friedman’s 1974 The Little LISPer.
- The Seasoned Schemer
- Written with the goal of teaching the reader about the nature of computation. Sequel to The Little Schemer.
- The Reasoned Schemer
- The subject here is relational programming. The authors lead the reader through extending Scheme to give it relational tools.
- The Little MLer
- Thinking recursively about types and programs. Also, program construction.
These are largely written in a style that, at least for the parts that I’ve scanned in the books that I’ve peeked into, seems a lot like Socratic Questioning. As far as I know, Friedman and his coauthors are the only compsci people who’ve turned out finished texts architected in this way and their approach has earned a lot of fans. For some of the languages that Friedman et al. haven’t gotten to writing about just yet, third party efforts along similar lines can be found online. Even when no one has gotten around to trying to translate the book into a given language, there are references to a certain book as being “The Little Xer” for programming language X, people adopting that as a handle/username, and lots of instances where that turn of phrase is tossed around. Sometimes, depending on the ending of the language name, people go with an “ist” ending instead (e.g. “The Little Rubyist”).
The Little Schemer and its brethren sit in one of the to-read piles scattered around my home office and, for the last several days, have been at about eye level whenever I’ve walked into the room. Last night, just after stepping back in after a trip to the kitchen to make a mug of hot cocoa, I got an idea.
The Little [X]er/ist (original plan)
How neat would it be to check, for “all” of the programming languages ever invented, whether anyone had ever, at least within the sources visible to the Eye of Sauron (er, Google), used the phrase “The Little [X]er”? Of course it wouldn’t be precise and there’d be no way of knowing, without scanning through SERPs manually, the context in which the reference was made (a port of the book, praise for another existing book, someone using it as a nickname, etc.), but it still struck me as a fun little exercise.
There are lots of ways to approach the problem. With Chickenfoot close at hand and banking that Wikipedia must include some page with a list of all known programming languages (given how many obsessives have found their way into software engineering and compsci, this seemed like a sure bet), I thought of attacking it this way (all done from within Chickenfoot):
- Process the Wikipedia list of all programming languages page using XPath and JavaScript to get the names of the individual languages.
- Produce “The Little [X]er” and/or “The Little [X]ist” strings for each language name. Associate them with the original language name.
- Run one Google search for each string, then use XPath to obtain the number of results and the URL of the top link from the Google SERP.
- Output a tidy table showing, for each search string, the number of results and giving a link to the top result. Maybe I would sort the table by the number of results or generate a bar graph using the CANVAS element.
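Step 2 of the plan above needs no network access, so it is the easiest piece to sketch on its own. What follows is a minimal sketch in plain JavaScript, not code from Chickenfoot. The function name `littlePhrases` and the rule that a name ending in "e" only takes an "r" are assumptions made for illustration.

```javascript
// Hypothetical helper: build both candidate search phrases for one language name.
function littlePhrases(language) {
  // Assumed rule: a name ending in "e" only needs an "r" ("Scheme" -> "Schemer").
  const erForm = /e$/i.test(language) ? language + "r" : language + "er";
  return {
    language: language,
    er: "The Little " + erForm,
    ist: "The Little " + language + "ist",
  };
}

// Keep each phrase associated with the language it came from.
const phrases = ["Scheme", "ML", "Ruby"].map(littlePhrases);
```

Returning an object per language keeps the original name next to both phrases, which is what a results table would later need as its row label.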
Second thoughts
I only fiddled with the idea for a few minutes last night because it dawned on me that what I was thinking of doing would violate Google’s TOS and could be viewed by either Google or my ISP as some sort of microscopic DoS attack. I could address the second concern by adding a (variable-length) delay between queries … but the fun was beginning to evaporate.
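For what it's worth, a variable-length delay is simple to express. The sketch below is a generic illustration in plain JavaScript, and it does nothing about the TOS concern. The names `randomDelayMs`, `sleep`, and `runSpacedOut` are hypothetical, and `tasks` is assumed to be an array of functions that each return a value or a promise.

```javascript
// Hypothetical helper: pick a delay of at least minMs and less than maxMs.
// The rng parameter defaults to Math.random and exists so the choice can be tested.
function randomDelayMs(minMs, maxMs, rng = Math.random) {
  return Math.floor(minMs + rng() * (maxMs - minMs));
}

// Resolve after the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Run the tasks one at a time, pausing for a random interval between them.
async function runSpacedOut(tasks, minMs, maxMs) {
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) await sleep(randomDelayMs(minMs, maxMs));
    results.push(await tasks[i]());
  }
  return results;
}
```

Running the tasks strictly in sequence, rather than all at once, is the point: the pause only spreads the load if no two requests are in flight together.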
How many programming languages (and caveats)
I did come up with a figure for the total number of programming languages ever invented: 548.
The number is not to be taken as an authoritative count. The Wikipedia page (Alphabetical list of programming languages) begins with the following statement:
The aim of this list of programming languages is to include all notable programming languages in existence, both those in current use and historical ones, in alphabetical order.
So, in addition to some languages having simply died out and been forgotten, there’s the notability criterion. And what about dialects/variants of languages? I can see several dialects of LISP listed, but only one entry for Fortran, for example. There’s also a note that the list of dialects of BASIC has been moved to its own page (List of BASIC dialects).
How I got the figure
I started by looking at the page source in Firebug - it’s nicer than View > Page Source because I don’t need to open a new window and Firebug folds the HTML for me so that I can expand and collapse sections as needed while digging down to find the bits that I want. After a moment’s look, I could see that the language names were inside of LI nodes, which were, in turn, inside ULs. The lists were inside of cells of tables with the class attribute “multicol”.
Next, I opened XPather (v 1.3, released on Oct 27 2006) and built up an XPath expression that returned the set of LI nodes:
//table[@class="multicol"]/tbody/tr/td/ul/li
[Screenshot: XPather showing the LI nodes matched by the expression above.]
XPather gives you a matching nodes count (they’re also individually numbered in the display presented in the XPather window), but you could get the figure itself using XPath’s count() function:
count(//table[@class="multicol"]/tbody/tr/td/ul/li)
In XPather, this yields 548, the figure quoted above.
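The same count() expression can also be evaluated from script through the standard DOM method `document.evaluate`. The following is a minimal sketch rather than what was actually run here. The function name `countLanguages` is hypothetical, `doc` is assumed to be a DOM Document for the Wikipedia list page, and the XPath only matches if the page still uses the "multicol" table markup described above.

```javascript
// XPath from the post: count the LI nodes inside the "multicol" tables.
const LANGUAGE_COUNT_XPATH =
  'count(//table[@class="multicol"]/tbody/tr/td/ul/li)';

// Numeric value of XPathResult.NUMBER_TYPE in the DOM XPath API.
const NUMBER_TYPE = 1;

// Hypothetical helper: `doc` is assumed to be a DOM Document, such as
// `document` in a browser console opened on the list page.
function countLanguages(doc) {
  const result = doc.evaluate(LANGUAGE_COUNT_XPATH, doc, null, NUMBER_TYPE, null);
  return result.numberValue;
}
```

In a browser console on the list page, the call would be `countLanguages(document)`. Asking for a number result lets XPath do the counting, so there is no need to pull back the node set and measure its length.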
I still want to do something interesting with Chickenfoot. Next time.

