← Introduction to SQL | Data Query Language →
Before learning SQL, relational databases have several concepts that are important to learn first. Databases store the data of an information system. We regroup data by groups of comparable data (all the employees, all the projects, all the offices...). For each group of comparable data, we create a table. This table is specially designed to suit this type of data (its attributes). For instance, a table named employee
which stores all the employees would be designed like this:
employee the table
|
id_employee the primary key |
an integer
|
firstname a column |
a string of characters a column type
|
lastname |
a string of characters
|
phone |
10 numbers
|
mail |
a string of characters
|
And the company employees would be stored like this:
employee
|
id_employee |
firstname |
lastname |
phone |
mail
|
1 a column value |
Big |
BOSS |
936854270 |
big.boss@company.com
|
2 |
John |
DOE |
936854271 |
john.doe@company.com
|
3 |
Linus |
TORVALDS |
936854272 |
linus.torvalds@company.com
|
4 |
Jimmy |
WALES |
936854273 |
jimmy.wales@company.com
|
5 |
Larry |
PAGE |
936854274 |
larry.page@company.com
|
The data stored in a table is called entities. As a table is usually represented as an array, the data attributes (first name, last name...) are called columns and the records (the employees) are called rows. id_employee
is a database specific technical identifier called a primary key. It is used to link the entities from a table to another. To do so, it must be unique for each row. A primary key is usually underlined. Any unique attribute (for instance, the mail) or group of attributes (for instance, the first name and last name) can be the table primary key but it is recommended to use an additional technical id (id_employee) for primary key.
Let's create a second table called project
which stores the company projects:
employee
|
id_employee |
an integer
|
firstname |
a string of characters
|
lastname |
a string of characters
|
phone |
10 numbers
|
mail |
a string of characters
|
|
project
|
id_project |
an integer
|
name |
a string of characters
|
created_on |
a date
|
ended_on |
a date
|
# manager |
an integer
|
|
And the company projects would be stored like this:
employee
|
id_employee |
firstname |
lastname |
phone |
mail
|
1 |
Big |
BOSS |
936854270 |
big.boss@company.com
|
2 |
John |
DOE |
936854271 |
john.doe@company.com
|
3 |
Linus |
TORVALDS |
936854272 |
linus.torvalds@company.com
|
4 |
Jimmy |
WALES |
936854273 |
jimmy.wales@company.com
|
5 |
Larry |
PAGE |
936854274 |
larry.page@company.com
|
|
project
|
id_project |
name |
created_on |
ended_on |
# manager
|
1 |
Google |
1998-09-08 |
NULL |
5
|
2 |
Linux |
1991-01-01 |
NULL |
3
|
3 |
Wikipedia |
2001-01-01 |
NULL |
4
|
|
id_project
is the primary key of the project table and manager
is a foreign key. A foreign key is a technical id which is equal to one of the primary keys stored in another table (here, the employee table). Doing this, the Google project is linked to the employee Larry PAGE. This link is called a relationship. A foreign key is usually preceded by a sharp. Note that several projects can point to a common manager so an employee can be the manager of several projects.
Now, we want to create, not a single link, but multiple links. So we create a junction table. A junction table is a table that isn't used to store data but links the entities of other tables. Let's create a table called members
which links employees to project:
employee
|
id_employee |
an integer
|
firstname |
a string of characters
|
lastname |
a string of characters
|
phone |
10 numbers
|
mail |
a string of characters
|
|
members
|
# id_employee |
an integer
|
# id_project |
an integer
|
|
project
|
id_project |
an integer
|
name |
a string of characters
|
created_on |
a date
|
ended_on |
a date
|
# manager |
an integer
|
|
And the employees and the projects can be linked like this:
employee
|
id_employee |
firstname |
lastname |
phone |
mail
|
1 |
Big |
BOSS |
936854270 |
big.boss@company.com
|
2 |
John |
DOE |
936854271 |
john.doe@company.com
|
3 |
Linus |
TORVALDS |
936854272 |
linus.torvalds@company.com
|
4 |
Jimmy |
WALES |
936854273 |
jimmy.wales@company.com
|
5 |
Larry |
PAGE |
936854274 |
larry.page@company.com
|
6 |
Max |
THE GOOGLER |
936854275 |
max.the-googler@company.com
|
7 |
Jenny |
THE WIKIPEDIAN |
936854276 |
jenny.the-wikipedian@company.com
|
|
project
|
id_project |
name |
created_on |
ended_on |
# manager
|
1 |
Google |
1998-09-08 |
NULL |
5
|
2 |
Linux |
1991-01-01 |
NULL |
3
|
3 |
Wikipedia |
2001-01-01 |
NULL |
4
|
|
members
|
# id_employee |
# id_project
|
3 |
2
|
2 |
1
|
4 |
3
|
5 |
1
|
2 |
3
|
6 |
1
|
7 |
3
|
|
An employee can be associated to several projects (John DOE with Google and Wikipedia) and a project can be associated to several employees (Wikipedia with Jimmy, John and Jenny), which is impossible with just a foreign key. A junction table hasn't its own primary key. Its primary key is the couple of foreign keys, as this couple is unique. A junction table can link more than two entity tables by containing more columns.
So let's list the different types of relationships:
- One to one,
- One to many (for instance, the manager of a project),
- Many to many (for instance, the members of the projects).
For each type of relationships, there is a way to link the entities :
- One to many relationship: create a foreign key from an entity table to the other,
- Many to many relationship: create a junction table,
- One to one relationship: just merge the two tables.
Now you know how to design a database schema and to put the data of your information system into it.