Programming languages: There’s a lot to unpack there. If you’re new to the field, you probably have questions like:
- How many programming languages are there (and, more importantly, how many should I know)?
- How do programming languages work?
- How and why are they made?
- How do programming languages make money?
To answer these questions, we sat down with someone with years of experience in the software world: Erik Dietrich. Erik has worked as a developer, architect, manager, CIO, and, eventually, independent management and strategy consultant. This breadth of experience allows him to speak to all industry personas. He’s also written several books and countless blog posts on dozens of sites.
Without further ado, let’s get to the interview.
We’ll start with an easy question: How many programming languages are there?
I’m not entirely sure off the cuff. The last time I looked, it was a surprisingly large number. So, my guess is that it’s well into the hundreds.
There are probably in the neighborhood of 20 major programming languages that people use in a lot of commercial applications. So, ones that lots of people know would be C, C++, C#, Java, Ruby, and Python.
And so if you list through all those, there’s maybe 20 of them that are kind of big ones, if you will, but then things get interesting because there are a lot of pretty niche programming languages out there. There’s one that’s profane, almost. I can’t remember what it’s called, but people come up with these toy languages that are officially programming languages but aren’t used in practice. There are also old ones like COBOL and Fortran. So, I don’t know offhand how many there are, but call it hundreds and then maybe 20 or so that are used largely to drive actual serious applications, and maybe 50 that are broadly in use but have more niche applications.
Editor’s note: How many programming languages are there? Well, it depends on who you ask. According to CodeLani, there are as many as 25,000 active computer languages, up to 2,000 of which are general-purpose programming languages that are actively used. You can see a list of programming languages on Wikipedia.
Well, why are there so many programming languages?
My take is that, philosophically, there’s this idea that engineers like to reinvent wheels, and I say this lovingly as an engineer. The reason you get into writing code and being a programmer is typically that you like to build things. So, if you’re faced with a problem, you’d rather create a solution for yourself than download an existing one.
Let’s say that you’re building a web application, and there’s a widget that asks for a date, so you click on it and a calendar comes up. Most of us in the engineering world would rather build that calendar ourselves than download one that somebody else had made. And this is sometimes called not-invented-here syndrome.
The reason I’m talking about that, in the context of this question of why there are so many programming languages, is because a programming language is just like anything else that people program, which is to say that there are a lot of engineers out there who look at the landscape of programming languages and say, “I don’t like any of these, I’m going to write one of my own.” And that’s where you get weird toy languages.
If we trace roots back to the early days of programming, we can take something like the C programming language as an example. C has been around for about 50 years. And there’s an evolution of it called C++. C++ is C, but it’s evolved into what’s called an object-oriented language. So, in the ’90s, object-oriented languages started to be the thing—Java was one. There are several of them out there. And in the world of C, for C programmers who liked C but were getting into the object-oriented world, C++ was a natural thing for them because it was object oriented, but it was also very much like the language they were already familiar with.
Basically, a good way to say it that encapsulates not-invented-here syndrome and the evolutionary nature of languages is to think of programming languages as being like products. A lot of programmers don’t think of them that way. But they kind of are. And it’d be sort of like asking why there are so many soft drink flavors. It’s because there are a lot of entrants to the market, and they’re all trying to differentiate themselves. Maybe one finds a niche that nobody else quite has, like grape-lime soda or something. And so a new flavor is born.
To continue with that idea of language as a product, how are programming languages made?
Full disclosure, I’ve never built a programming language or worked on a team that has, but I can talk about it a little bit. Without making this super technical, there are components and there’s a certain methodology to a programming language. So, there are a certain set of rules that you have to stick to.
When you speak in a language like English, which you can think of as a higher-order language, its structure is open ended in a certain sense. There’s a lot going on in spoken human language. You can have ambiguity and meanings, and there’s a lot of subtext.
When you get into the world of making first-order programming languages, there’s no room for ambiguity. You just can’t have it, because what you’re doing is you’re trying to take human concepts and translate them into unambiguous machine actions. It requires a lot of precision.
Because of this, when you design a programming language, you have to define a very precise set of rules for what everything in that programming language means.
Let me say it this way: If I were to write a note to my wife asking, “Hey, while you’re out, do you mind picking up some eggs?” she is reading that and she’s processing my instructions, for lack of a better word, in that natural language way. So, if I say, “Hey, would you mind going out and picking up some eggs, and if they’re out of eggs, you can get me eggbeaters,” she’ll understand what I’m talking about. But in the world of the programming language, if you’re not very precise and if you get the sequence of those instructions wrong, the programming language will take things very literally.
So, if you misplace the conditional if, for example, the programming language misinterprets this entirely. So, to get back to the metaphor of me and my wife, when she’s out running errands, you have a human that’s able to fill in a lot of contextual gaps in what I wrote. But when you have a programming language, it’s not my wife reading those instructions, it’s a program. And that program is usually called a compiler. And what the compiler does is it reads the things you’ve typed out in your programming language, like C or Java, and it turns those into instructions that another program, usually the operating system, knows how to execute.
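Editor’s note: The point about a misplaced conditional can be made concrete with a small sketch. This is our illustration, not Erik’s; the function names and the grocery logic are hypothetical, and Python is used only because it reads close to plain English.

```python
# Hypothetical errand written two ways. All names are made up for illustration.

def errand_as_intended(store_has_eggs: bool) -> list:
    """Buy eggs; fall back to the substitute only if eggs are sold out."""
    basket = []
    if store_has_eggs:
        basket.append("eggs")
    else:
        basket.append("egg substitute")
    return basket


def errand_with_misplaced_if(store_has_eggs: bool) -> list:
    """Same idea, but the condition no longer guards the 'eggs' step."""
    basket = []
    basket.append("eggs")  # runs unconditionally, even when the shelf is empty
    if not store_has_eggs:
        basket.append("egg substitute")
    return basket


print(errand_as_intended(False))        # ['egg substitute']
print(errand_with_misplaced_if(False))  # ['eggs', 'egg substitute']
```

A person would notice that you cannot buy eggs that are not on the shelf. The second function does not notice anything; it does exactly what the instructions say, in the order they were written.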
So, basically, to recap, I go into Notepad, I type some stuff. And then I run this program called the compiler on that stuff, and it spits out another file. And then the operating system knows how to run that file. So, it opens a window and starts doing stuff.
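Editor’s note: Here is a rough sketch of that "type text, translate it, run it" flow. It uses Python’s built-in `compile` and `exec`, which is a simplification: Python translates source into bytecode that its own interpreter runs, whereas a C compiler produces a file the operating system can launch directly. The variable names are ours.

```python
# The "stuff typed into Notepad": at this point it is only text.
source_text = "total = 2 + 3\n"

# Step 1: translate the text into a form the runtime knows how to execute.
code_object = compile(source_text, "<example>", "exec")

# Step 2: hand the translated form over to be run.
namespace = {}
exec(code_object, namespace)

print(namespace["total"])  # 5
```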
Those are the mechanics of how programming languages work. So, the way you design a programming language is by defining the rules. You have to think, “OK, I’m going to give these instructions, so what are the rules for my instructions?” What do I do if I see the word is followed by some parentheses? What do I do if I see something that says return zero? So, you come up with a set of rules, and then you design the language in terms of what are called the keywords—the things that are significant in that language. How variables work, how functions work, you define all of that as what’s essentially the user experience of programming in that language.
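Editor’s note: To give a feel for what "defining the rules" means, here is a toy language with exactly three keywords. Everything about it (the keywords `set`, `add`, and `show`, and the `run` function) is invented for this sketch; real languages have far richer rules, but the principle of deciding what each keyword means is the same.

```python
# The complete vocabulary of our made-up language.
KEYWORDS = {"set", "add", "show"}


def run(program: str) -> list:
    """Execute a toy program line by line and return everything it 'shows'."""
    variables = {}
    output = []
    for line_number, line in enumerate(program.splitlines(), start=1):
        words = line.split()
        if not words:
            continue  # rule: blank lines are allowed and ignored
        keyword, *args = words
        if keyword not in KEYWORDS:
            raise SyntaxError(f"line {line_number}: unknown keyword {keyword!r}")
        if keyword == "set":      # rule: set NAME NUMBER creates a variable
            name, value = args
            variables[name] = int(value)
        elif keyword == "add":    # rule: add NAME NUMBER increases a variable
            name, value = args
            variables[name] += int(value)
        elif keyword == "show":   # rule: show NAME records the variable's value
            (name,) = args
            output.append(variables[name])
    return output


print(run("set x 2\nadd x 3\nshow x"))  # [5]
```

Anything outside those three rules is rejected rather than guessed at, which is the lack of ambiguity described earlier in the interview.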
That, at a very high level, is the nature of programming language design.
And how do programming languages make money?
Not all programming languages make money, but some of them do.
There are programming languages out there that are done by convention. C is one. So, there are programming languages that nobody owns—you can think of them as open source. They were created by a community working together in a somewhat ad hoc fashion. There are definitely languages that emerged in that fashion, where nobody really owned them.
And then at the opposite end of the spectrum, you have languages like C# that were developed by a for-profit company. In the case of C#, it was Microsoft. But when you’re talking about a language that was developed by a for-profit company, even the way that that language makes money for its company has evolved.
It was simple back in the early 2000s. When Microsoft released C#, you would pay for a license to be a C# developer and pay for the tools that you would use to develop C# code. And over the years, that’s evolved a little bit.
At some point, Microsoft started making it largely free to develop in C#, in that there was an entry-level version of each tool that was free. And then the way Microsoft would make money from people programming in C# would be to sell add-ons. So, similar to many apps on your phone, the deal is that you can get started for free, but if you want the good version of the product that makes your developers more efficient, you’re going to pay for it all the way up to an enterprise license. So, if you’re some major company like a bank that uses C#, then the bank is paying for hundreds of enterprise licenses for its developers.
That was more common maybe 10 years ago. These days, it’s increasingly common for a programming language to be monetized, not so much by anyone selling the language or the tools around the language directly but by monetizing the ecosystem. And that’s true whether the programming language was made by a company or not.
A good example of this might be Android. Android is written in Java, but Google doesn’t own Java. Nothing in the Android stack there, if you will, is paid for—you don’t pay to develop in Java for Android the way developers used to pay Microsoft to use C#. But if this is your language of choice and your OS of choice is Android, there’s an ecosystem of developers that emerges.
So, if you want to write Android apps, often Java is going to be the language that you have to choose in order to do that. Because of that, what Google could do—and, to be clear, this doesn’t exist—is create its own proprietary flavor of Java. And then people are incentivized to learn that if they want to develop Android apps. So, Google can say, “Well, here’s this proprietary flavor of Java, which you can have, or you can use our Google candidate app development kit for this flavor.” That’s how you can kind of monetize that ecosystem in the way I was talking about.
It’s not very common anymore for somebody to develop a programming language and then sell the language or sell the tools for that language directly. What they’re really looking to do is build up a lot of users of that language and get those users invested in such a way that they won’t choose another programming language. It’s called vendor lock-in sometimes. For example, Apple, the king of this, will get users onto its system, and then they always have to use Apple’s stuff in order to operate. That’s more commonly how a company would monetize a programming language these days.
Why are there different programming languages?
Make Me a Programmer asked: Would you say that that early competition you referenced is the reason why there are so many programming languages? So, for example, if Microsoft has the monopoly on C# and a programmer either wanted their own slice of the pie or didn’t want to pay Microsoft to be a C# developer, then that programmer would just create their own language?
That was certainly a factor in the ’90s and 2000s.
I’ll tell that in a moment, but to answer your question, yes, absolutely. So early on, the reason Microsoft developed C# was that it was, in the very beginning, almost identical to Java. Microsoft tried to create their own version of Java, and it didn’t really work. So, they created C# as a way to lure Java developers out of the ecosystem.
And then there were also things like you mentioned, where the open-source community wouldn’t want to pay for something, so they would develop a free version of it. I’m actually underselling how complicated this is because not only are there different programming languages, but you could say that there are different dialects of programming languages. So, for instance, C++ is a language that’s used across a lot of different platforms—Linux, Windows, you name it. And then Microsoft came up with Visual C++, which only worked on Windows, and different flavors of Linux might have done similar things.
So, there’s what’s called a language standard. And it’s probably comparable to actual spoken and written language where there’s a correct way to do it. And then you get these dialects. In the programming world, you would come up with this programming language that was almost like a superset of the accepted language standard. So, there was an official C++ standard, but then there was the Microsoft version that had other flavors of stuff in it. So, there are different languages being created to kind of jockey for market position, and then there are different dialects of those languages.
So, Microsoft and Netscape were both looking for a way to make their browser richer and the user experience more dynamic with the types of things we take for granted today. And to do that, they needed to create programming languages.
Well, Netscape had gotten behind Microsoft a little bit. Microsoft had this language called VBScript. And VB is Visual Basic, which is more of an application programming language, not a browser programming language. But Microsoft called its browser language VBScript so that it would remind programmers of VB and they would feel comfortable with it and maybe opt to pick Internet Explorer as their browser.
How many programming languages should I know?
My recommendation for somebody who’s just starting out as a software engineer would be to resist the impulse to learn multiple languages right out of the gate because it’ll just become confusing.
I want to draw a distinction here between what you’ll hear referred to as a tech stack and a programming language. The line can blur between what’s considered a programming language and what isn’t.
So, I talked about HTML. I think some people would call HTML a programming language. I’m not sure what I think of that. I don’t think of it as one for reasons that are a little in the weeds. But the reason I’m talking about this is that in order to make a webpage, you would need to know HTML, which is Hypertext Markup Language. It’s the tags like <h1> where, if you put <h1> before something, and then you have the text in there, and then you have an </h1> end tag, that tells the browser this text enclosed in the tags is a big heading. If you put a <b> tag before and a </b> tag after, then the browser knows that text is bold. That’s not really a programming language, per se. It’s almost formatting.
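Editor’s note: To see why HTML feels more like formatting than programming, it helps to look at it the way software does. The sketch below uses Python’s standard-library `html.parser` to list the tags and text in a tiny page; the `TagLogger` class and the sample page are our own illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record each opening tag, piece of text, and closing tag in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


page = "<h1>Programming languages</h1><b>Important</b>"
logger = TagLogger()
logger.feed(page)
logger.close()

for event in logger.events:
    print(event)
```

The page contains no decisions or steps, only labels wrapped around text. Deciding what a "big heading" or "bold" looks like is left to the browser.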
In modern software engineering, the way that you get around all that sort of unimportant debating is you talk about a web tech stack, or you call someone who knows all three of those technologies a front-end developer.
So, to answer the question, how many programming languages should you know really depends on what you’re trying to do. What you want to do instead is look at the tech stack of the thing that you want to do and focus on that.
A lot of bootcamps, if you were to go and join a bootcamp, will take care of that for you. They’re going to tell you the different things you need to learn in order to be a productive web developer when they spit you out the other end of their bootcamps. So, even if you’re not going through a bootcamp, you could go look at a bootcamp’s website and see what they’re going to teach you. And it would be a fairly safe bet that what they include is a pretty good list of the languages/technologies you ought to be familiar with.
What’s the difference between a front-end and back-end programming language?
Let me go back a little bit in time because talking about the front end and back end has become ubiquitous in the way that software engineers talk. But it wasn’t always that way.
Way back when in the pre-web era, there were just programming languages. So, you would write a program on your desktop, and then you would compile it and run it and it would pop up a game or whatever you programmed it to do. The late ’90s and early 2000s saw the rise of web development and web applications. This is the idea that we all take for granted now, which is that you can go to some site like Slack.com, and then you log in and you have this whole application experience right there on Slack.com where you’re talking to people.
This was originally novel 15 years ago or so. And it was called a web app. Now, the way that web apps work is that you have a browser that’s running on your phone, desktop, or laptop. And there’s a component to it running on the server. So, take Slack for instance. Over there on the server, that’s where Slack is keeping track of all the login information, the users, and all kinds of stuff and where they’re storing all the data. You’re not storing all of that on your phone; that gets stored on a bunch of servers with a bunch of information. And those servers facilitate the communication between you and your coworkers.
Then, on your phone or on the desktop, this is what’s called the client side. That’s where you’re seeing the actual text and people sharing their memes and whatnot.
So, front end and back end are the terms that evolved to distinguish those two parts. The part where you’re designing the experience of the user in the browser is called the front end or client side. And then there’s the back end or the server side, which is sort of the database code and everything that goes on behind the scenes.
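Editor’s note: The split between the two sides can be sketched in a few lines. In this illustration, a tiny server holds the data and a client asks for it over HTTP on the same machine. It relies only on Python’s standard library; the `BackEnd` class, the `/messages` path, and the sample messages are made up, and a real web app would have a browser on the client side rather than a script.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Back end: the data lives here, not on the client. (Made-up sample data.)
MESSAGES = ["hello team", "lunch at noon?"]


class BackEnd(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(MESSAGES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example's output quiet


def fetch_messages() -> list:
    """Start the server, act as the client once, then shut the server down."""
    server = HTTPServer(("127.0.0.1", 0), BackEnd)  # port 0: let the OS pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Front end (client side): request the data and display it.
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/messages")
        payload = connection.getresponse().read().decode("utf-8")
        connection.close()
        return json.loads(payload)
    finally:
        server.shutdown()
        server.server_close()


received = fetch_messages()
for message in received:
    print("client sees:", message)
```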
Should developers know a language for each?
These days, you’ll hear a term called “full stack.” So, software engineers will congratulate themselves for being full-stack developers. That’s a shorthand way of saying, “Hey, I’m good doing the front-end browser programming, and I’m good doing the server programming on the back end.” And it’s a way of saying that I, as an engineer, could build you a web application soup to nuts.
I can’t really weigh in on which of those any given person ought to do except maybe to say that if you like user experience and you’re really big on design and that type of thing, you could consider being a front-end developer. If you really don’t like that stuff and you prefer logic, you can think about working only on the back end. I think most bootcamps will teach you enough to be dangerous on both sides of that coin.
What are statistical programming languages?
Not to dive too far into technicalities, in a sense, you could call any programming language a statistical programming language. Because you could use any programming language to automate the compilation and analysis of data. So, I want to throw that out there as a technicality because it’s going to inform the answer, which is that a statistical programming language is one that’s built with the idea of being used by statisticians for the compilation and analysis of data.
I’m not an expert in these types of languages, so I can’t speak to all the nuances of it. But, generally speaking, there are a couple of paradigms of programming language worth knowing here: imperative and mathematical.
Some languages are more what’s known as imperative. And that basically means you can think of this programming language almost as giving somebody a series of instructions. It’s meant to be followed like a cooking recipe.
Then, there are other programming languages that are more mathematical in nature, and the structure there would be more like functions. So, you would have a function that, for example, takes two numbers as input and spits out the sum of those two numbers. It’s hard to go into more detail for a non-technical audience than that.
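Editor’s note: The two styles can be shown side by side on the same small task. This is our sketch, with made-up numbers; the "mathematical" style described above is usually called functional programming, and Python happens to allow both.

```python
from functools import reduce

numbers = [3, 5, 8]

# Imperative style: a recipe of steps that update a running total.
total = 0
for number in numbers:
    total = total + number


# Function-oriented style: a function that takes two numbers and returns their sum,
# applied across the list without any step-by-step bookkeeping of our own.
def sum_of_two(a, b):
    return a + b


functional_total = reduce(sum_of_two, numbers, 0)

print(total, functional_total)  # 16 16
```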
But you can think of it in the same way that we were talking about programming language design earlier: It’s almost the user experience of that programming language. Who are you trying to help? And what are they trying to do? So, a statistical programming language is one that’s designed to be optimized for and very good at munging numbers and datasets, and the language has to make it easy to describe how you go about doing that. For example, R is a very common statistical programming language.
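Editor’s note: As a rough idea of what "making it easy to describe" number crunching looks like, here is a sketch using Python’s standard-library `statistics` module rather than R. The batting averages are invented sample data. A language like R goes further by building this kind of operation, along with tables of data, into the language itself.

```python
import statistics

# Invented sample data: five players' batting averages.
batting_averages = [0.251, 0.302, 0.275, 0.289, 0.263]

average = statistics.mean(batting_averages)    # arithmetic mean
middle = statistics.median(batting_averages)   # middle value when sorted
spread = statistics.stdev(batting_averages)    # sample standard deviation

print(f"mean={average:.3f} median={middle:.3f} stdev={spread:.3f}")
```

Each summary is one line that reads like the statistical term it computes, which is the user experience these languages aim for.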
Why are statistical programming languages important to data scientists?
Data scientists are going to gravitate toward using statistical programming languages because those languages give them a way to do more involved things without seeking help. A data scientist typically hasn’t learned to be a programmer the way somebody who has taken a bootcamp or maybe even gotten a computer science degree has. Their background is typically in disciplines such as advanced statistics, data analysis, and so on.
So, think of somebody who’s really into data science and is gathering a lot of complex sabermetrics for baseball or will be analyzing polling data for politicians or something.
It’s more important for the people who do that to understand the nuances of statistical distribution curves and to really understand statistical theory. Their education probably hasn’t pushed them through application programming the way that, say, mine did. While getting my computer science degree, I learned how to do all kinds of algorithms and data structures to make me a generally useful programmer.
So, absent a statistical programming language, if you’re a data scientist and you haven’t really been trained in programming, you don’t have great options.
One is that you can try to do all of this stuff in Microsoft Excel, which gets really involved and convoluted and falls down quicker than you might think.
The other one would historically have been to enlist a programmer to write a program, and then they’re kind of like a marionette typing for you while you tell them how the program needs to work. You don’t know how to write the program, and the programmer doesn’t know anything about your world. So, it’s an awkward collaboration.
So, the importance of the statistical programming languages is to give those data scientists a way to be autonomous but without going through a whole computer science degree. It’s giving them a programming language that’s tailored to their experience with a minimal learning curve that lets them get as much done as possible.