Even more SQL basics

09 Oct 2019

Notes based on Head First SQL

Table design

SQL is known as a Relational Database Management System (RDBMS). The key word here is 'Relational'. To design a table well, it's important to think about how the columns in a table relate to each other.

  1. What is the one thing you want your table to describe?
  2. Make a list of the information you need to know about your one thing when you're using the table
  3. Using the list, break down the information about your thing into pieces that you can use for organising your table.

Atomic Data

An atom is a little piece of information that can't or shouldn't be divided. Atomic data has been broken down into the smallest pieces of data that can't or shouldn't be divided.

A pizza delivery guy only needs the order number and the address the order belongs to. He never needs to look for a single street number on it's own, so the entire address should be stored in a single column. Whereas a real estate agent may need the street number and street name to search all the houses for sale by a specific street name or number, so it makes sense to seperate these out. In both examples, the address data is atomic, even though it is seperated in one use case and not in the other.

  1. What is the ONE thing your table describes?
  2. How will you USE the table to get at the one thing?
  3. Do your columns contain atomic data to make your queries short and to the point?

Making your data atomic means breaking it down into the smallest pieces that you need to create an efficient table, not just the smallest possible pieces you can.

Atomic table rules

Making your data atomic is the first step in creating a normal table.

Normal tables

There are two rules (in addition to the atomic rules) to make a table normal:

  1. Each row of data must contain atomic values
  2. Each row of data must have a unique identifier, known as a primary key.

Primary keys

A primary key is a column in your table that makes each record unique. They can't be NULL, they must be given a value when the record is inserted into the database, they must contain only the information they need to be unique and nothing more, and the primary key values can't be changed. If you have a table where none of the information in it is unique, you can make a column called 'id' that auto increments every time a new record is added.

An 'id' primary key is called a 'synthetic' primary key, whilst columns that are naturally unique in your table are called 'natural' primary keys. There is a debate in the SQL community about which one should be used, but neither side is conclusive.

See the original table creation statement

SHOW CREATE TABLE my_table;

The above code shows you the original create table statement that was used to create your table. You can copy it to make a new version of the table without any data in it, a blank slate.

Create a table with a primary key

To create a primary key, add the following line to the end of your create table statement, just before the closing brackets:

PRIMARY KEY (column_name)

If you want your primary key column to auto increment from the integer value 1 onwards, add the 'AUTO_INCREMENT' clause to your id column as follows:

column_name INT NOT NULL AUTO_INCREMENT

If you add a different value to the auto_incrementing column using the INSERT statement, then it will be ignored and auto_increment as usual.

Modify an existing table structure with ALTER

The ALTER command allows you to change almost everything in your table without having to reinsert your data. But if you change a column of one data type to a different one, you risk losing your data.


ALTER TABLE my_contacts
ADD COLUMN contact_id INT NOT NULL AUTO_INCREMENT FIRST,
ADD PRIMARY KEY (contact_id);

To add a primary column to an existing table, we can write the query above. The FIRST keyword at the end of the add column line adds the column as the first in the table before all of the other columns. As we used AUTO_INCREMENT, the column was automatically filled with auto incrementing numbers

There are a few more keywords besides FIRST that you can use to specify where you want to add new columns to your table:


ALTER TABLE my_contacts
ADD COLUMN phone VARCHAR(10)
AFTER first_name;

More ways to alter data

Renaming your tables


ALTER TABLE projekts
RENAME TO project_list;

Change data types


ALTER TABLE project_list
CHANGE COLUMN number proj_id INT NOT NULL AUTO_INCREMENT,
ADD PRIMARY KEY ('proj_id');

Change two columns with one SQL statement


ALTER TABLE project_list
CHANGE COLUMN descriptionofproj proj_dest VARCHAR(100),
CHANGE COLUMN contractoronjob con_name VARCHAR(30);

If the data you are changing is not compatible with the old data type, SQL will throw an error. If it is compatible, your data might be truncated. For example, going from VARCHAR(10) to char(1), your data will change from 'Unicorn' to just 'B'.

Change data type but not the name of the column


ALTER TABLE project_list
MODIFY COLUMN proj_desc VARCHAR(120);

The MODIFY keyword changes only the datatype of a column and leaves the name alone.

Deleting a column

It is good practice to have only the columns you need in your table. If you are not using a column, you can drop it and use ALTER to add it in again (you'll have to repopulate the data) if you need it in the future.


ALTER TABLE project_table
DROP COLUMN start_date;

Once you drop a column, everything that was stored in it will be removed too. You can use a SELECT statement to 'try' out and see what the DELETE statement will target to make sure it's going to delete the right things before you make it permanent.

Select a specified number of characters from a column

SELECT RIGHT(location, 2) FROM my_contacts;

The RIGHT keyword above accepts two parameters, a column name and a specified number of characters from the right of the column. In this case, we are selecting the last two characters in the column. We can use the LEFT keyword in the same way, to select characters starting from the left of the column.

Select everything in front of the comma

SELECT SUBSTRING_INDEX(location, ',', 1) FROM my_contacts;

The SUBSTRING_INDEX accepts three parameters, the name of the column, a character or a string, in the example above the character we have specified is a comma, and the number which specifies the number of matches to start from. In this case, we are grabbing everything that comes after the first comma. If the number in the last argument was a 2, then we'd be grabbing everything after the second comma.

A few more functions

NOTE: string functions do NOT change the data stored in your table; They only return the altered strings as a result of your query

Change the value of EVERY row in a column


UPDATE table_name
SET column_name = newvalue;

Each row in our table is set, one at a time to this new value.


UPDATE my_contacts
SET state = RIGHT(location, 2);

The statement above grabs the last two characters from the old location column and puts them into the new state column