MySQL/Language/Queries

SELECT

edit

select syntax is as follows:

 SELECT *
 FROM a_table_name
 WHERE condition
 GROUP BY grouped_field
 HAVING group_name condition
 ORDER BY ordered_field
 LIMIT limit_number, offset

List of fields

edit

You must specify what data you're going to retrieve in the SELECT clause:

 SELECT DATABASE() -- returns the current db's name
 SELECT CURRENT_USER() -- returns your username
 SELECT 1+1 -- returns 2

Any SQL expression is allowed here.

You can also retrieve all fields from a table:

 SELECT * FROM `stats`

If you SELECT only the necessary fields, the query will be faster.

The table's name

edit

If you are retrieving results from a table or a view, usually you specify the table's name in the FROM clause:

 SELECT id FROM `stats` -- retrieve a field called id from a table called stats

Or:

 SELECT MAX(id) FROM `stats`
 SELECT id*2 FROM `stats`

You can also use the `db_name`.`table_name` syntax:

 SELECT id FROM `sitedb`.`stats`

But you can also specify the table's name in the SELECT clause:

 SELECT `stats`.`id` -- retrieve a field called id from a table
 SELECT `sitedb`.`stats`.`id`

WHERE

edit

You can set a filter to decide what records must be retrieved.

For example, you can retrieve only the record which has an id of 42:

 SELECT * FROM `stats` WHERE `id`=42

Or you can read more than one record:

 SELECT * FROM `antiques` WHERE buyerid IS NOT NULL

GROUP BY

edit

You can group all records by one or more fields. The record which have the same value for that field will be grouped in one computed record. You can only select the grouped record and the result of some aggregate functions, which will be computed on all records of each group.

For example, the following will group all records in the table `users` by the field `city`. For each group of users living in the same city, the maximum age, the minimum age and the average age will be returned:

 SELECT city, MAX(age), MIN(age), AVG(age) GROUP BY `city`

In the following example, the users are grouped by city and sex, so that we'll know the max, min and avg age of male/female users in each city:

 SELECT city, sex, MAX(age), MIN(age), AVG(age) GROUP BY `city`, `sex`

HAVING

edit

The HAVING clause declares a filter for the records which are computed by the GROUP BY clause. It's different from the WHERE clause, that operates before the GROUP BY. Here's what happens:

  1. The records which match to the WHERE clause are retrieved
  2. Those records are used to compute new records as defined in the GROUP BY clause
  3. The new records that match to the HAVING conditions are returned

This means which WHERE decides what record are used to compose the new computed records.

HAVING decides what computed records are returned, so it can operate on the results of aggregate functions. HAVING is not optimized and can't use indexes.

Incorrect use of HAVING:

 SELECT city, sex, MAX(age), MIN(age), AVG(age) GROUP BY `city` HAVING sex='m'

This probably gives a wrong results. MAX(age) and other aggregate calculations are made using all values, even if the record's sex value is 'f'. This is hardly the expected result.

Incorrect use of HAVING:

 SELECT city, sex, MAX(age), MIN(age), AVG(age) GROUP BY `city`, `sex` HAVING sex='m'

This is correct and returns the expected results, but the execution of this query is not optimized. The WHERE clause can and should be used, because, so that MySQL doesn't computes records which are excluded later.

Correct use of HAVING:

 SELECT city, sex, MAX(age), MIN(age), AVG(age) GROUP BY `city` HAVING MAX(age) > 80

It must group all records, because can't decide the max age of each city before the GROUP BY clause is execute. Later, it returns only the record with a MAX(age)>80.

ORDER BY

edit

You can set an arbitrary order for the records you retrieve. The order may be alphabetical or numeric.

 SELECT * FROM `stats` ORDER BY `id`

By default, the order is ASCENDING. You can also specify that the order must be DESCENDING:

 SELECT * FROM `stats` ORDER BY `id` ASC -- default
 SELECT * FROM `stats` ORDER BY `id` DESC -- inverted

NULLs values are considered as minor than any other value.

You can also specify the field position, in place of the field name:

 SELECT `name`, `buyerid` FROM `antiques` ORDER BY 1 -- name
 SELECT `name`, `buyerid` FROM `antiques` ORDER BY 2 -- buyerid
 SELECT `name`, `buyerid` FROM `antiques` ORDER BY 1 DESC

SQL expressions are allowed:

 SELECT `name` FROM `antiques` ORDER BY REVERSE(`name`)

You can retrieve records in a random order:

 SELECT `name` FROM `antiques` ORDER BY RAND()

If a GROUP BY clause is specified, the results are ordered by the fields named in GROUP BY, unless an ORDER BY clause is present. You can even specify in the GROUP BY clause if the order must be ascending or descending:

 SELECT city, sex, MAX(age) GROUP BY `city` ASC, `sex` DESC

If you have a GROUP BY but you don't want the records to be ordered, you can use ORDER BY NULL:

 SELECT city, sex, MAX(age) GROUP BY `city`, `sex` ORDER BY NULL

LIMIT

edit

You can specify the maximum of rows that you want to read:

 SELECT * FROM `antiques` ORDER BY id LIMIT 10

This statement returns a maximum of 10 rows. If there are less than 10 rows, it returns the number of rows found. The limit clause is usually used with ORDER BY.

You can get a given number of random records:

 SELECT * FROM `antiques` ORDER BY rand() LIMIT 1 -- one random record
 SELECT * FROM `antiques` ORDER BY rand() LIMIT 3

You can specify how many rows should be skipped before starting to return the records found. The first record is 0, not one:

 SELECT * FROM `antiques` ORDER BY id LIMIT 10
 SELECT * FROM `antiques` ORDER BY id LIMIT 0, 10 -- synonym

You can use the LIMIT clause to get the pagination of results:

 SELECT * FROM `antiques` ORDER BY id LIMIT 0, 10 -- first page
 SELECT * FROM `antiques` ORDER BY id LIMIT 10, 10 -- second page
 SELECT * FROM `antiques` ORDER BY id LIMIT 20, 10 -- third page

Also, the following syntax is acceptable:

 SELECT * FROM `antiques` ORDER BY id LIMIT 10 OFFSET 10

You can use the LIMIT clause to check the syntax of a query without waiting for it to return the results:

 SELECT ... LIMIT 0

Optimization tips:

  • SQL_CALC_FOUND_ROWS may speed up a query [1][2]
  • LIMIT is particularly useful for SELECTs which use ORDER BY, DISTINCT and GROUP BY, because their calculations don't have to involve all the rows.
  • If the query is resolved by the server copying internally the results into a temporary table, LIMIT helps MySQL to calculate how much memory is required by the table.

DISTINCT

edit

The DISTINCT keyword can be used to remove all duplicate rows from the resultset:

 SELECT DISTINCT * FROM `stats` -- no duplicate rows
 SELECT DISTINCTROW * FROM `stats` -- synonym
 SELECT ALL * FROM `stats` -- duplicate rows returned (default)

You can use it to get the list of all values contained in one field:

 SELECT DISTINCT `type` FROM `antiques` ORDER BY `type`

Or you can use it to get the existing combinations of some values:

 SELECT DISTINCT `type`, `age` FROM `antiques` ORDER BY `type`

If one of the fields you are SELECTing is the PRIMARY KEY or has a UNIQUE index, DISTINCT is useless. Also, it's useless to use DISTINCT in conjunction with the GROUP BY clause.

IN and NOT IN

edit
 SELECT id
 FROM stats
 WHERE position IN ('Manager', 'Staff')

 SELECT ownerid, 'is in both orders & antiques'
 FROM orders, antiques WHERE ownerid = buyerid
 UNION
 SELECT buyerid, 'is in antiques only'
 FROM antiques WHERE buyerid NOT IN (SELECT ownerid FROM orders)

EXISTS and ALL

edit

(Compatible: Mysql 4+)

 SELECT ownerfirstname, ownerlastname
 FROM owner 
 WHERE EXISTS (SELECT * FROM antiques WHERE item = 'chair')

 SELECT buyerid, item 
 FROM antiques
 WHERE price = ALL (SELECT price FROM antiques)

Optimization hints

edit

There are some hints you may want to give to the server to better optimize the SELECTs. If you give more than one hints, the order of the keywords is important:

 SELECT [ALL | DISTINCT | DISTINCTROW ]
    [HIGH_PRIORITY] [STRAIGHT_JOIN]
    [SQL_SMALL_RESULT | SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
    [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    ...

HIGH_PRIORITY

Usually, DML commands (INSERT, DELETE, UPDATE) have higher priority than SELECTs. If you specify HIGH_PRIORITY though, the SELECT will have higher priority than DML statements.

STRAIGHT_JOIN Force MySQL to evaluate the tables of a JOIN in the same order they are named, from the leftmost.

SQL_SMALL_RESULT It's useful only while using DISTINCT or GROUP BY. Tells the optimizer that the query will return few rows.

SQL_BIG_RESULT It's useful only while using DISTINCT or GROUP BY. Tells the optimizer that the query will return a many rows.

SQL_BUFFER_RESULT Force MySQL to copy the result into a temporary table. This is useful to remove LOCKs as soon as possible.

SQL_CACHE Forces MySQL to copy the result into the query cache. Only works if the value of query_cache_type is DEMAND or 2.

SQL_NO_CACHE Tells MySQL not to cache the result. Useful if the query occurs very seldom or if the result often change.

SQL_CALC_FOUND_ROWS Useful if you are using the LIMIT clause. Tells the server to calculate how many rows would have been returned if there were no LIMIT. You can retrieve that number with another query:

 SELECT SQL_CALC_FOUND_ROWS * FROM `stats` LIMIT 10 OFFSET 100;
 SELECT FOUND_ROWS();

Index hints

edit
  • USE INDEX: specifies to research some records preferably by browsing the tables indexes[3].
  • FORCE INDEX: idem in more restrictive. A table will be browsed without index only if the optimizer doesn't have the choice.
  • IGNORE INDEX: request to not favor the indexes.

Example:

SELECT *
FROM table1 USE INDEX (date)
WHERE date between '20150101' and '20150131'
SELECT *
FROM table1 IGNORE INDEX (date)
WHERE id between 100 and 200

UNION and UNION All

edit

(Compatible: Mysql 4+)

Following query will return all the records from both tables.

 SELECT * FROM english 
 UNION ALL
 SELECT * FROM hindi

UNION is the same as UNION DISTINCT.
If you type only UNION, then it is considered that you are asking for distinct records. If you want all records, you have to use UNION ALL.

 SELECT word FROM word_table WHERE id = 1
 UNION
 SELECT word FROM word_table WHERE id = 2

 (SELECT magazine FROM pages)
 UNION DISTINCT
 (SELECT magazine FROM pdflog)
 ORDER BY magazine

 (SELECT ID_ENTRY FROM table WHERE ID_AGE = 1)
 UNION DISTINCT 
 (SELECT ID_ENTRY FROM table WHERE ID_AGE=2)

Joins

edit

 


The Most important aspect of SQL is its relational features. You can query, compare and calculate two different tables having entirely different structure. Joins and subselects are the two methods to join tables. Both methods of joining tables should give the same results. The natural join is faster on most SQL platforms.

In the following example a student is trying to learn what the numbers are called in Hindi.

 CREATE TABLE english (Tag int, Inenglish varchar(255));
 CREATE TABLE hindi (Tag int, Inhindi varchar(255));

 INSERT INTO english (Tag, Inenglish) VALUES (1, 'One');
 INSERT INTO english (Tag, Inenglish) VALUES (2, 'Two');
 INSERT INTO english (Tag, Inenglish) VALUES (3, 'Three');

 INSERT INTO hindi (Tag, Inhindi) VALUES (2, 'Do');
 INSERT INTO hindi (Tag, Inhindi) VALUES (3, 'Teen');
 INSERT INTO hindi (Tag, Inhindi) VALUES (4, 'Char');
 select * from english select * from hindi
TagInenglish Tag Inhindi
1 One 2 Do
2 Two 3 Teen
3 Three 4 Char

Inner Join

edit
 SELECT hindi.Tag, english.Inenglish, hindi.Inhindi
 FROM english, hindi
 WHERE english.Tag = hindi.Tag
 -- equal
 SELECT hindi.Tag, english.Inenglish, hindi.Inhindi
 FROM english INNER JOIN hindi ON english.Tag = hindi.Tag
Tag Inenglish Inhindi
2 Two Do
3 Three Teen

You can also write the same query as

 SELECT hindi.Tag, english.Inenglish, hindi.Inhindi
 FROM english INNER JOIN hindi
 ON english.Tag = hindi.Tag

Remark: in MySQL, JOIN is equivalent to INNER JOIN and CROSS JOIN (Cartesian join)[4].

A Cartesian join is when you join every row of one table to every row of another table.

 SELECT * FROM english, hindi

It is also called Cross Join and may be written in this way:

 SELECT * FROM english CROSS JOIN hindi
TagInenglishTag Inhindi
1 One 2 Do
2 Two 2 Do
3 Three 2 Do
1 One 3 Teen
2 Two 3 Teen
3 Three 3 Teen
1 One 4 Char
2 Two 4 Char
3 Three 4 Char

Natural Join

edit

Natural Joins give the same result as an INNER JOIN on all the two tables common columns.

The following statement using "USING" method will display the same results as the previous INNER JOIN (compatible: MySQL 4+; but changed in MySQL 5).

 SELECT hindi.tag, hindi.Inhindi, english.Inenglish
 FROM hindi NATURAL JOIN english
 USING (Tag)

Outer Joins

edit
Tag Inenglish Tag Inhindi
1 One   
2 Two 2 Do
3 Three 3 Teen
    4 Char

Left Join / Left Outer Join

edit

The syntax is as follows:

 SELECT field1, field2 FROM table1 LEFT JOIN table2 ON field1=field2

 SELECT e.Inenglish as English, e.Tag, '--no row--' as Hindi
 FROM english AS e LEFT JOIN hindi AS h
 ON e.Tag=h.Tag 
 WHERE h.Inhindi IS NULL
English  tag   Hindi
One      1     --no row-

Right Outer Join

edit
 SELECT '--no row--' AS English, h.tag, h.Inhindi AS Hindi
 FROM english AS e RIGHT JOIN hindi AS h
 ON e.Tag=h.Tag
 WHERE e.Inenglish IS NULL

English tag Hindi --no row-- 4 Char

  • Make sure that you have the same name and same data type in both tables.
  • The keywords LEFT and RIGHT are not absolute, they only operate within the context of the given statement: we can reverse the order of the tables and reverse the keywords, and the result would be the same.
  • If the type of join is not specified as inner or outer then it will be executed as an INNER JOIN.

Full Outer Join

edit

As for v5.1, MySQL does not provide FULL OUTER JOIN. You may emulate it this way:

     (SELECT a.*, b* 
         FROM tab1 a LEFT JOIN tab2 b
         ON a.id = b.id)
 UNION
     (SELECT a.*, b* 
         FROM tab1 a RIGHT JOIN tab2 b
         ON a.id = b.id)

Multiple joins

edit

It is possible to join more than just two tables:

 SELECT ... FROM a JOIN (b JOIN c on b.id=c.id) ON a.id=b.id

Here is an example from Savane:

 mysql> SELECT group_type.type_id, group_type.name, COUNT(people_job.job_id) AS count 
        FROM group_type
         JOIN (groups JOIN people_job ON groups.group_id = people_job.group_id) 
         ON group_type.type_id = groups.type
        GROUP BY type_id ORDER BY type_id
 +---------+--------------------------------------+-------+
 | type_id | name                                 | count |
 +---------+--------------------------------------+-------+
 |       1 | Official GNU software                |   148 |
 |       2 | non-GNU software and documentation   |   268 |
 |       3 | www.gnu.org portion                  |     4 |
 |       6 | www.gnu.org translation team         |     5 |
 +---------+--------------------------------------+-------+
 4 rows in set (0.02 sec)

Subqueries

edit

(Compatible: MySQL 4.1 and later)

  • SQL subqueries let you use the results of one query as part of another query.
  • Subqueries are often natural ways of writing a statement.
  • Let you break a query into pieces and assemble it.
  • Allow some queries that otherwise can't be constructed. Without using a subquery, you have to do it in two steps.
  • Subqueries always appear as part of the WHERE (or HAVING) clause.
  • Only one field can be in the subquery SELECT. It means Subquery can only produce a single column of data as its result.
  • ORDER BY is not allowed; it would not make sense.
  • Usually refer to name of a main table column in the subquery.
  • This defines the current row of the main table for which the subquery is being run. This is called an outer reference.

For e.g. If RepOffice= OfficeNbr from Offices table, list the offices where the sales quota for the office exceeds the sum of individual salespersons' quotas

 SELECT City FROM Offices WHERE Target > ???

??? is the sum of the quotas of the salespeople, i.e.

 SELECT SUM(Quota)
 FROM SalesReps 
 WHERE RepOffice = OfficeNbr

We combine these to get

 SELECT City FROM Offices 
 WHERE Target > (SELECT SUM(Quota) FROM SalesReps 
 WHERE RepOffice = OfficeNbr)

Display all customers with orders or credit limits > $50,000. Use the DISTINCT word to list the customer just once.

 SELECT DISTINCT CustNbr 
 FROM Customers, Orders 
 WHERE CustNbr = Cust AND (CreditLimit>50000 OR Amt>50000);

References

edit

Resources

edit