Structured Query Language/Create a simple Table


More than a Spreadsheet

edit

Let's start with a simple example. Suppose we want to collect information about people - their name, place of birth and some more items. In the beginning we might consider to collect this data in a simple spreadsheet. But what if we grow to a successful company and have to handle millions of those data items? Could a spreadsheet deal with this huge amount of information? Could several employees or programs simultaneously insert new data, delete or change it? Of course not. And this is one of the noteworthy advantages of a Database Management System (DBMS) over a spreadsheet program: we can imagine the structure of a table as a simple spreadsheet - but the access to it is internally organized in a way that huge amounts of data can be accessed by a lot of users at the same time.

In summary, it can be said that one can imagine a table as a spreadsheet optimized for bulk data and concurrent access.

Conceive the Structure

edit

To keep control and to ensure good performance, tables are subject to a few strict rules. Every table column has a fixed name, and the values ​​of each column must be of the same data type. Furthermore, it is highly recommended - though not compulsory - that each row can be identified by a unique value. The column in which this identifying value resides is called the Primary Key. In this Wikibook, we always name it id. But everybody is free to choose a different name. Furthermore, we may use the concatenation of more than one column as the Primary Key.

At the beginning we have to decide the following questions:

  1. What data about persons (in this first example) do we want to save? Of course, there is a lot of information about persons (e.g., eye color, zodiacal sign, ...), but every application needs only some of them. We have to decide which ones are of interest in our concrete context.
  2. What names do we assign to the selected data? Each identified datum goes into a column of the table, which needs to have a name.
  3. Of what type are the data? All data values within one column must be of the same kind. We cannot put an arbitrary string into a column of data type DATE.

In our example, we decide to save the first name, last name, date, and place of birth, social security number, and the person's weight. Obviously date of birth is of data type DATE, the weight is a number, and all others are some kind of strings. For strings, there is a distinction between those that have a fixed length and those in which the length usually varies greatly from row to row. The former is named CHAR(<n>), where <n> is the fixed length, and the others VARCHAR(<n>), where <n> is the maximum length.

Fasten Decisions

edit

The decisions previously taken must be expressed in a machine-understandable language. This language is SQL, which acts as the interface between end-users - or special programs - and the DBMS.

-- comment lines start with two consecutive minus signs '--'
CREATE TABLE person (
  -- define columns (name / type / default value / nullable)
  id             DECIMAL      NOT NULL,
  firstname      VARCHAR(50)  NOT NULL,
  lastname       VARCHAR(50)  NOT NULL,
  date_of_birth  DATE,
  place_of_birth VARCHAR(50),
  ssn            CHAR(11),
  weight         DECIMAL DEFAULT 0 NOT NULL,
  -- select one of the defined columns as the Primary Key and
  -- guess a meaningful name for the Primary Key constraint: 'person_pk' may be a good choice 
  CONSTRAINT person_pk PRIMARY KEY (id)
);

We choose person as the name of the table, which consists of seven columns. The id column is assigned the role of the Primary Key. We can store exclusively digits in the columns id and weight, strings of a length up to 50 characters in firstname, lastname and place_of_birth, dates in date_of_birth and a string of exactly eleven characters in ssn. The phrase NOT NULL is part of the definition of id, firstname, lastname and weight. This means that in every row, there must be a value for those four columns. Storing no value in any of those columns is not possible - but the 8-character-string 'no value' or the digit '0' are allowed because they are values. Or to say it the other way round: it is possible to omit the values of date_of_birth, place_of_birth and ssn.

The definition of a Primary Key is called a 'constraint' (later on, we will get to know more kinds of constraints). Every constraint should have a name - it's person_pk in this example.

The Result

edit

After execution of the above 'CREATE TABLE' command, the DBMS has created an object that one can imagine similar to the following Wiki-table:

id firstname lastname date_of_birth place_of_birth ssn weight

This Wiki-table shows 4 lines. The first line represents the names of the columns - not values! The following 3 lines are for demonstration purposes only. But in the database table, there is currently no single row! It is completely empty, no rows at all, no values at all! The only thing that exists in the database is the structure of the table.

Back to Start

edit

Maybe we want to delete the table one day. To do so, we can use the DROP command. It removes the table totally: all data and the complete structure are thrown away.

DROP TABLE person;

Don't confuse the DROP command with the DELETE command, which we present on the next page. The DELETE command removes only rows - possibly all of them. However, the table itself, which holds the definition of the structure, is retained.