mysql> explain select distinct ticker from messages where date>'2001-09-23';Which index is used is shown by the key column. Which value from another table will be looked up in this index is shown by the ref column.
+----------+-------+---------------+------+---------+------+-------+-------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+----------+-------+---------------+------+---------+------+-------+-------+
| messages | range | date | date | NULL | NULL | 11321 | |
+----------+-------+---------------+------+---------+------+-------+-------+
Here's how an index is used for an equijoin:
mysql> explain select distinct ticker from messages, sorted where messages.id=sorted.k;The product of rows from each table is an estimate of the number of combinations to be processed. How the index is used is shown by the type column. The least efficient value for type is ALL, then index, then range, ref, eq_ref.
+----------+--------+---------------+---------+---------+----------+--------+-------------+
| table | type | possible_keys | key | key_len | ref | rows | Extra |
+----------+--------+---------------+---------+---------+----------+--------+-------------+
| sorted | index | k | k | 4 | NULL | 124113 | Using index |
| messages | eq_ref | PRIMARY | PRIMARY | 4 | sorted.k | 1 | |
+----------+--------+---------------+---------+---------+----------+--------+-------------+
If only index is shown under Extra, then the the actual
table does not need to be accessed at all, which is the major efficiency
bonus mentioned on Monday.
The most important guideline is to avoid redundancy. Consider this schema:
messages: first name, last name, address, zip, title, bodyTo reduce redundancy, we should have two tables. We still have redundancy in the names, so we should introduce unique ids for authors:
messages: aid, title, bodyNote that we will often want unique ids for many different entities, e.g. for authors and separately for messages. That's why the new field is named aid and not just id.
person: aid, first name, last name, address, zip
Why is redundancy bad? There are at least four reasons:
Different database servers can enforce more or fewer types of constraint. For example, MySQL can enforce the first constraint by saying that the id field is not null in the messages table. The fourth constraint can be enforced with a primary key declaration on the pair of fields (name,address).
The second constraint is accommodated by not declaring id to be unique for the messages table. The third constraint is accommodated by making the messages and person tables separate. This constraint would be violated if we used one big table.
The last constraint is an example of a functional dependency. Unlike more sophisticated database systems, MySQL has no features to enforce functional dependencies.
Guideline: Before choosing a database design, write down what constraints
you know should be true. Then select a design that enforces these
constraints.