While I was in college in the 1990s, many of us considered databases a solved problem and, by extension, wrote off SQL as a boring but unavoidable necessity for persisting application data. I mean, Boyce–Codd Normal Form (BCNF) had already been invented, and how hard could it be to read and write data in RDBMS tables?
When data lakes and NoSQL became hip, many of us thought that SQL was the next COBOL. We would be forced to glance at its old, wrinkled face occasionally, of course, without any affection or longing. Then we would preserve its mummy in digital computer science museums for future generations to see what we had suffered and endured. And that would be it.
But SQL had a different idea. It had decided to give us lessons on resilience and longevity, and I dare say youthfulness.
It does not stop there. It boldly challenges our language sensibilities, shaped as they are by modern functional languages. You feel all those SELECT, FROM, and WHERE keywords hurt your eyes? Guess what… it is concise and precise. It is eloquent!
You don’t believe me?
SQL was developed in 1974 (its early history makes for interesting reading). The first SQL standard was published in 1986 (by ANSI, with ISO adopting it in 1987), the latest in 2016, and the next version is under development. Different SQL implementations continue to invent extensions that put more power in developers’ hands.
The subtitle of the IEEE Spectrum Top Programming Languages 2022 report is: Python’s still №1, but employers love to see SQL skills. SQL is the number 1 language in job postings. Apparently a lot of employers “want a given language plus SQL.”
In retrospect, there were signs. Apache Pig’s Pig Latin was born to raise the abstraction of programming “from the Java MapReduce idiom into a notation which makes MapReduce programming high level, similar to that of SQL.” The influence of SQL was unmistakable. Even the pretense of being more or better than SQL was dropped in Spark SQL.
The resurgence of data warehouses in the form of columnar data stores certainly added fuel to the fire. The Empire Strikes Back! But if SQL were not SQL, something else would have been crowned in this resurgence.
So what is the secret of SQL’s resilience?
One word: declarative. SQL is a declarative language. You write what you want to compute, not how to compute it. The how is the query engine’s headache.
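To make the what-versus-how point concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The orders data, the 50.0 threshold, and the function names are all made up for illustration. Both functions answer the same question (which customers spent at least 50 in total, biggest first), but only one of them spells out the steps.

```python
import sqlite3

# Hypothetical sample data: (customer, amount) pairs invented for illustration.
ORDERS = [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 7.5), ("bob", 40.0)]


def totals_imperative(orders):
    """Spell out HOW: loop, accumulate, filter, then sort by hand."""
    totals = {}
    for customer, amount in orders:
        totals[customer] = totals.get(customer, 0.0) + amount
    big_spenders = [(c, t) for c, t in totals.items() if t >= 50.0]
    return sorted(big_spenders, key=lambda row: row[1], reverse=True)


def totals_declarative(orders):
    """State WHAT: the engine decides how to scan, group, and sort."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    rows = conn.execute(
        """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        HAVING SUM(amount) >= 50.0
        ORDER BY total DESC
        """
    ).fetchall()
    conn.close()
    return rows
```

Calling either function with ORDERS should give the same rows. The difference is that the SQL version says nothing about loops, hash maps, or sort algorithms, so the engine is free to choose them.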
Just as it relieves developers from worrying about the how, it also frees SQL implementations to optimize the hell out of compute, storage, and network (as BigQuery has done with Dremel). And the query engine (along with compute, storage, and network) can continue improving without breaking existing code.
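A small way to see this for yourself, again sketched with Python's sqlite3 module: run the same query text before and after adding an index, and ask SQLite for its plan each time. The table, the data, and the index name idx_orders_customer are hypothetical, and the exact wording of the plan differs between SQLite versions, so treat this as an illustration rather than a benchmark.

```python
import sqlite3

# The query text never changes in this sketch; only the storage layout does.
QUERY = "SELECT SUM(amount) FROM orders WHERE customer = ?"


def run_and_explain(conn, customer):
    """Return the query result and the plan details SQLite reports for it."""
    total = conn.execute(QUERY, (customer,)).fetchone()[0]
    plan_rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (customer,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    plan = " | ".join(row[-1] for row in plan_rows)
    return total, plan


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0)],  # made-up rows
)

total_before, plan_before = run_and_explain(conn, "alice")

# Change the physical layout only; the query text stays untouched.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")

total_after, plan_after = run_and_explain(conn, "alice")

print(plan_before)
print(plan_after)
```

The answer stays the same while the plan changes from a table scan to an index lookup. The application code that issues the query never has to know.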
The beauty is that SQL engines in a data/delta lake, like Databricks SQL, are free to pick a very different storage/compute architecture to serve SQL queries. The data lake that many of us thought would make SQL obsolete has actually found salvation in SQL.
So, here we are.
SQL is not going anywhere. So you have got to befriend it and master it. Here are a few free resources (I am not being paid to recommend these):
LearnSQL: SQL Cheat Sheets
SQLite Online to practice
Hope you enjoyed reading this. It is an opinionated piece, and I would love to hear your opinions and counter-arguments.
ML4Devs Newsletter - Issue 17, published on 25 Nov 2022.