In the realm of technology, databases are the bedrock upon which much of our digital world rests. Behind the scenes of every app, website, and software lies a complex structure managing and organizing vast amounts of data. At the heart of this structure lies SQL (Structured Query Language), a powerful programming language used to communicate with and manipulate databases.
For beginners taking their first steps into the world of databases, SQL might seem like a daunting language with its own syntax and intricacies. Fear not! This beginner’s guide aims to demystify SQL, providing a foundational understanding of its principles and usage.
In case you would like to follow along this blog post, it presents a walkthrough using the PostgresSQL environment. You can install PostgresSQL here [PostgreSQL: Downloads].
Understanding Databases and SQL
At its core, a database is a structured collection of data that allows for efficient storage, retrieval, and management. Think of it as a library with neatly organized shelves. SQL acts as the language to interact with these databases, allowing users to perform various operations like retrieving data, inserting new records, updating existing information, and deleting unwanted entries. There are five command categories within the SQL framework: Data Definition Language, Data Manipulation Language, Data Control Language and Transaction Control Language.
For a data analyst, it is crucial to know how to define, manipulate, and query a database. We will now briefly explore the main commands that perform these actions.
1. Data Definition Language (DDL)
Imagine you’re an architect designing a blueprint for a library. You will make use of DDL commands to design your building
CREATE:
In SQL terms, the CREATE command is akin to creating the blueprint itself. You specify the structure, defining the tables, columns, data types, and constraints for a database. For instance, let’s create a table called Books that will have two columns called ‘Title’ and ‘Author’. Observe the Query and the Data Output panes to see the results below. We are using the SELECT statement to show our results in the Data Output window. This is a very useful SQL statement and we will cover it more deeply in the Data Query Language section.
ALTER:
Just as the librarians might decide to add new study rooms or renovate the library’s main hall, the ALTER command in SQL allows you to modify the structure of an existing table. For example, our librarians want to add the publishing year for each record in the Books table:
DROP:
Don’t get me wrong, I’m all for conserving buildings, but say we need to build a new library with better facilities that will accommodate more people and services to the people. When it’s time to tear down an old building, the DROP command is used in SQL to remove an entire table or database objects. It’s like demolishing a structure:
2. Data Manipulation Language (DML):
Statements belonging to the DML framework are used to insert, delete and update data. As new books and editions are published, it is important our library updates its catalogue as well. DML statements will be crucial for that.
INSERT:
New books keep coming up and libraries need to keep updated on the newest editions and titles, the INSERT command in SQL adds new records or rows into a table. Let’s add to our table the first books of our catalogue:
UPDATE:
When inserting the books, we missed the fact that some of the books also have subtitles, so now we need to update those records to reflect the most accurate information. In SQL, the UPDATE command modifies existing records within a table:
DELETE:
From time to time, libraries might want to give away some of the old editions and keep the new ones, the DELETE command in SQL removes specific records or rows from a table. In this example, we want to remove from the library catalogue all books published before the year 2000.
3. Data Query Language (DQL)
Now that we have built our library and stored all our records in it, we want to pull out the data we might need. DQL provides the most used statement in the SQL, the SELECT statement.
SELECT:
Think of SQL’s SELECT command as a search engine. You’re asking the database to retrieve specific information, much like entering keywords into a library search bar. SELECT is used with a number of other statements such as WHERE, used to limit the results by a condition, and ORDER BY, to sort rows according to a specific column. Another important part of the SELECT statement is aliases, determined by AS, which give the column or table a temporary name.
For instance, in this first query we want to get all the book titles from the ‘Books’ table ordered by year, whereas on the second query we want to get only the book titles and have a temporary name for the column ‘title’ and the table ‘Books’.
In summary, understanding these three SQL categories (DFL, DML, DQL) of commands forms the foundation for effectively interacting with databases using SQL. Each command serves a distinct purpose, empowering users to manage and manipulate data efficiently within a database system.