Structured Query Language/Relational Databases

← Introduction to SQL | Data Query Language → Before learning SQL, relational databases have several concepts that are important to learn first. Databases store the data of an information system. We regroup data by groups of comparable data (all the employees, all the projects, all the offices...). For each group of comparable data, we create a table. This table is specially designed to suit this type of data (its attributes). For instance, a table named employee which stores all the employees would be designed like this:

employee the table
id_employee the primary key an integer
firstname a column a string of characters a column type
lastname a string of characters
phone 10 numbers
mail a string of characters

And the company employees would be stored like this:

employee
id_employee firstname lastname phone mail
1 a column value Big BOSS 936854270 big.boss@company.com
2 John DOE 936854271 john.doe@company.com
3 Linus TORVALDS 936854272 linus.torvalds@company.com
4 Jimmy WALES 936854273 jimmy.wales@company.com
5 Larry PAGE 936854274 larry.page@company.com


The data stored in a table is called entities. As a table is usually represented as an array, the data attributes (first name, last name...) are called columns and the records (the employees) are called rows. id_employee is a database specific technical identifier called a primary key. It is used to link the entities from a table to another. To do so, it must be unique for each row. A primary key is usually underlined. Any unique attribute (for instance, the mail) or group of attributes (for instance, the first name and last name) can be the table primary key but it is recommended to use an additional technical id (id_employee) for primary key.

Let's create a second table called project which stores the company projects:

employee
id_employee an integer
firstname a string of characters
lastname a string of characters
phone 10 numbers
mail a string of characters
project
id_project an integer
name a string of characters
created_on a date
ended_on a date
# manager an integer

And the company projects would be stored like this:

employee
id_employee firstname lastname phone mail
1 Big BOSS 936854270 big.boss@company.com
2 John DOE 936854271 john.doe@company.com
3 Linus TORVALDS 936854272 linus.torvalds@company.com
4 Jimmy WALES 936854273 jimmy.wales@company.com
5 Larry PAGE 936854274 larry.page@company.com
project
id_project name created_on ended_on # manager
1 Google 1998-09-08 NULL 5
2 Linux 1991-01-01 NULL 3
3 Wikipedia 2001-01-01 NULL 4

id_project is the primary key of the project table and manager is a foreign key. A foreign key is a technical id which is equal to one of the primary keys stored in another table (here, the employee table). Doing this, the Google project is linked to the employee Larry PAGE. This link is called a relationship. A foreign key is usually preceded by a sharp. Note that several projects can point to a common manager so an employee can be the manager of several projects.

Now, we want to create, not a single link, but multiple links. So we create a junction table. A junction table is a table that isn't used to store data but links the entities of other tables. Let's create a table called members which links employees to project:

employee
id_employee an integer
firstname a string of characters
lastname a string of characters
phone 10 numbers
mail a string of characters
members
# id_employee an integer
# id_project an integer
project
id_project an integer
name a string of characters
created_on a date
ended_on a date
# manager an integer

And the employees and the projects can be linked like this:

employee
id_employee firstname lastname phone mail
1 Big BOSS 936854270 big.boss@company.com
2 John DOE 936854271 john.doe@company.com
3 Linus TORVALDS 936854272 linus.torvalds@company.com
4 Jimmy WALES 936854273 jimmy.wales@company.com
5 Larry PAGE 936854274 larry.page@company.com
6 Max THE GOOGLER 936854275 max.the-googler@company.com
7 Jenny THE WIKIPEDIAN 936854276 jenny.the-wikipedian@company.com
project
id_project name created_on ended_on # manager
1 Google 1998-09-08 NULL 5
2 Linux 1991-01-01 NULL 3
3 Wikipedia 2001-01-01 NULL 4
members
# id_employee # id_project
3 2
2 1
4 3
5 1
2 3
6 1
7 3

An employee can be associated to several projects (John DOE with Google and Wikipedia) and a project can be associated to several employees (Wikipedia with Jimmy, John and Jenny), which is impossible with just a foreign key. A junction table hasn't its own primary key. Its primary key is the couple of foreign keys, as this couple is unique. A junction table can link more than two entity tables by containing more columns.

Relationships edit

So let's list the different types of relationships:

  • One to one,
  • One to many (for instance, the manager of a project),
  • Many to many (for instance, the members of the projects).

For each type of relationships, there is a way to link the entities :

  • One to many relationship: create a foreign key from an entity table to the other,
  • Many to many relationship: create a junction table,
  • One to one relationship: just merge the two tables.

Now you know how to design a database schema and to put the data of your information system into it.