When recovering from some kind of mental overload, I often fall into my old patterns of thinking about programming languages. Today is no exception!
When we talk about programming languages, we usually mean things like C, C++, Java, Perl, PHP, etc. I think that, over time, this narrow definition of what a programming language is has actually hindered our ability to engineer software.
What is a programming language? The common definition generally only includes the syntax and semantics of, for lack of a better description, the “text” of a program. For instance, we think of languages as defining how variables are declared, how loops are specified, how functions are defined, whether certain mechanisms like continuations or closures are first-class (or close to it), etc.
This definition fails to capture how the languages are really used, though. Just because a handy library is written in C++ doesn’t mean it will integrate nicely into my present C++ program. The reason is that the APIs are the real language - not the obscure rules that define what a loop looks like or which characters are allowed in a variable name.
A good example of this is OpenGL. OpenGL is defined by a bunch of C functions. However, those functions are all named using an OpenGL-defined standard naming scheme, which may not really “fit in” with how you named the functions in your own program. So right away you have a kind of “friction”: while OpenGL is defined via C functions, and your entire program may be in C as well, your naming conventions, handling of state and globals, etc. may all differ from OpenGL’s ways and assumptions.
Now let’s say you combine your C+OpenGL program with another library that uses C, but does so in an object-oriented way, of sorts, using structs almost like class instances, etc. Sure - it’s still C - but it’s going to feel like an entirely different way of doing things than, say, most of OpenGL or, perhaps, most of your program up until that point.
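To make that friction concrete, here is a minimal C sketch. None of these names come from OpenGL or any real library: the xg-prefixed functions are hypothetical stand-ins for a global-state API in the OpenGL style, Canvas is a hypothetical struct-as-object library, and scene_pixel_count stands in for the program's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Style 1: hypothetical state-machine API in the spirit of OpenGL.
   Calls mutate hidden global state; names use a prefix plus camelCase. */
static float xg_current_color[3] = {0.0f, 0.0f, 0.0f};

void xgColor3f(float r, float g, float b) {
    xg_current_color[0] = r;
    xg_current_color[1] = g;
    xg_current_color[2] = b;
}

float xgGetRed(void) {
    return xg_current_color[0];
}

/* Style 2: hypothetical "object-oriented C" library.
   State lives in a struct that every call receives explicitly. */
typedef struct Canvas {
    int width;
    int height;
} Canvas;

void Canvas_init(Canvas *self, int width, int height) {
    assert(self != NULL);
    self->width = width;
    self->height = height;
}

int Canvas_area(const Canvas *self) {
    assert(self != NULL);
    return self->width * self->height;
}

/* Style 3: the program's own code, snake_case with no hidden state. */
int scene_pixel_count(int width, int height) {
    Canvas canvas;
    Canvas_init(&canvas, width, height); /* struct-as-object convention */
    xgColor3f(1.0f, 0.5f, 0.0f);         /* global-state convention */
    return Canvas_area(&canvas);
}
```

All three styles compile as the same language, yet scene_pixel_count has to switch conventions on every line: one call takes its state explicitly, and the next writes to state the reader cannot see.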
The fact that everything in our hypothetical program is C doesn’t mean anything when it comes to being able to understand it in the end. My analogy here is this: What if English had entirely different ways of presenting information for different fields of study or topics of conversation?
Sure, there are a variety of special-purpose words that only experts use - but those words still need to fit within the rules of English sentence and word construction in order to be understood. Imagine if, in order to talk about dogs, you had to reorder your words like so: Jump the black through the hoop labrador. (Or something equally nutty.) I think it’s pretty safe to say that even though the words are recognizable English words and in the dictionary, and the phrase is somewhat understandable after some thought, they weren’t used in a way compatible with what we’d call English. And yet, in software, that’s exactly what we do - especially in languages like C++ where you have a huge variety of possible styles and API designs.
Many languages have tried to avoid this by declaring themselves as being pure OO or pure functional, etc. I think that helps a little bit, but it’s not enough. There’s something else missing. It doesn’t matter if you can only define and call functions, for instance, if you name them something weird: foo(bar(baz(42), 12, "hello world")). What does that do? Who knows! And the kicker is that you may carefully architect your own program to follow a strict naming rule (a bit like that defined by Cocoa, for instance), but it still comes crashing down when you run into a stupid language-defined function named “cons” or some external library with functionality you need that has some oddly-structured naming scheme of its own, because this so-called language doesn’t enforce any semantic sense of consistency in naming.
The real question is, I guess, how do real human languages avoid this problem? How is it that an English sentence that may include several words we don’t recognize still feels and looks like English? This, I think, is what really needs to be solved in the space of programming languages. How can we define APIs that always fit in and feel right within a program? I dislike seeing awesome software libraries out in the world that, when I try to integrate them into my own code, somehow destroy the “purity” and understandability of what I had built to that point, simply because the APIs have drastically different views of how to use the syntax and semantics the “language” defines.
Update: I was pointed to this paper, “Architectural Mismatch” by David Garlan, Robert Allen, and John Ockerbloom, as a starting point. I haven’t worked my way through it all yet, but based on the synopsis, I’d say this paper from 1995 at least confirms I’m not the only person to have noticed this…