Index
    Preface
      What This Book Is About
      What You Need to Know
      How This Book Is Organized
      How to Use This Book
      Conventions Used in This Book
      Using Code Examples
      How to Contact Us
      Web Site and Code Examples
      Acknowledgments
      Chapter 1.  Database Applications and the Web
      Section 1.1.  The Web
      Section 1.2.  Three-Tier Architectures
      Chapter 2.  The PHP Scripting Language
      Section 2.1.  Introducing PHP
      Section 2.2.  Conditions and Branches
      Section 2.3.  Loops
      Section 2.4.  Functions
      Section 2.5.  Working with Types
      Section 2.6.  User-Defined Functions
      Section 2.7.  A Working Example
      Chapter 3.  Arrays, Strings, and Advanced Data Manipulation in PHP
      Section 3.1.  Arrays
      Section 3.2.  Strings
      Section 3.3.  Regular Expressions
      Section 3.4.  Dates and Times
      Section 3.5.  Integers and Floats
      Chapter 4.  Introduction to Object-Oriented Programming with PHP 5
      Section 4.1.  Classes and Objects
      Section 4.2.  Inheritance
      Section 4.3.  Throwing and Catching Exceptions
      Chapter 5.  SQL and MySQL
      Section 5.1.  Database Basics
      Section 5.2.  MySQL Command Interpreter
      Section 5.3.  Managing Databases and Tables
      Section 5.4.  Inserting, Updating, and Deleting Data
      Section 5.5.  Querying with SQL SELECT
      Section 5.6.  Join Queries
      Section 5.7.  Case Study: Adding a New Wine
      Chapter 6.  Querying Web Databases
      Section 6.1.  Querying a MySQL Database Using PHP
      Section 6.2.  Processing User Input
      Section 6.3.  MySQL Function Reference
      Chapter 7.  PEAR
      Section 7.1.  Overview
      Section 7.2.  Core Components
      Section 7.3.  Packages
      Chapter 8.  Writing to Web Databases
      Section 8.1.  Database Inserts, Updates, and Deletes
      Section 8.2.  Issues in Writing Data to Databases
      Chapter 9.  Validation with PHP and JavaScript
      Section 9.1.  Validation and Error Reporting Principles
      Section 9.2.  Server-Side Validation with PHP
      Section 9.3.  JavaScript and Client-Side Validation
      Chapter 10.  Sessions
      Section 10.1.  Introducing Session Management
      Section 10.2.  PHP Session Management
      Section 10.3.  Case Study: Using Sessions in Validation
      Section 10.4.  When to Use Sessions
      Section 10.5.  PHP Session API and Configuration
      Chapter 11.  Authentication and Security
      Section 11.1.  HTTP Authentication
      Section 11.2.  HTTP Authentication with PHP
      Section 11.3.  Form-Based Authentication
      Section 11.4.  Protecting Data on the Web
      Chapter 12.  Errors, Debugging, and Deployment
      Section 12.1.  Errors
      Section 12.2.  Common Programming Errors
      Section 12.3.  Custom Error Handlers
      Chapter 13.  Reporting
      Section 13.1.  Creating a Report
      Section 13.2.  Producing PDF
      Section 13.3.  PDF-PHP Reference
      Chapter 14.  Advanced Features of Object-Oriented Programming in PHP 5
      Section 14.1.  Working with Class Hierarchies
      Section 14.2.  Class Type Hints
      Section 14.3.  Abstract Classes and Interfaces
      Section 14.4.  Freight Calculator Example
      Chapter 15.  Advanced SQL
      Section 15.1.  Exploring with SHOW
      Section 15.2.  Advanced Querying
      Section 15.3.  Manipulating Data and Databases
      Section 15.4.  Functions
      Section 15.5.  Automating Querying
      Section 15.6.  Table Types
      Section 15.7.  Backup and Recovery
      Section 15.8.  Managing Users and Privileges
      Section 15.9.  Tuning MySQL
      Chapter 16.  Hugh and Dave's Online Wines:A Case Study
      Section 16.1.  Functional and System Requirements
      Section 16.2.  Application Overview
      Section 16.3.  Common Components
      Chapter 17.  Managing Customers
      Section 17.1.  Code Overview
      Section 17.2.  Customer Validation
      Section 17.3.  The Customer Form
      Chapter 18.  The Shopping Cart
      Section 18.1.  Code Overview
      Section 18.2.  The Winestore Home Page
      Section 18.3.  The Shopping Cart Implementation
      Chapter 19.  Ordering and Shipping at the Online Winestore
      Section 19.1.  Code Overview
      Section 19.2.  Credit Card and Shipping Instructions
      Section 19.3.  Finalizing Orders
      Section 19.4.  HTML and Email Receipts
      Chapter 20.  Searching and Authentication in the Online Winestore
      Section 20.1.  Code Overview
      Section 20.2.  Searching and Browsing
      Section 20.3.  Authentication
      Appendix A.  Linux Installation Guide
      Section A.1.  Finding Out What's Installed
      Section A.2.  Installation Overview
      Section A.3.  Installing MySQL
      Section A.4.  Installing Apache
      Section A.5.  Installing PHP
      Section A.6.  What's Needed for This Book
      Appendix B.  Microsoft Windows Installation Guide
      Section B.1.  Installation Overview
      Section B.2.  Installing with EasyPHP
      Section B.3.  What's Needed for This Book
      Appendix C.  Mac OS X Installation Guide
      Section C.1.  Getting Started
      Section C.2.  Installing MySQL
      Section C.3.  Setting Up Apache and PHP
      Section C.4.  What's Needed for This Book
      Appendix D.  Web Protocols
      Section D.1.  Network Basics
      Section D.2.  Hypertext Transfer Protocol
      Appendix E.  Modeling and Designing Relational Databases
      Section E.1.  The Relational Model
      Section E.2.  Entity-Relationship Modeling
      Appendix F.  Managing Sessions in theDatabase Tier
      Section F.1.  Using a Database to Keep State
      Section F.2.  PHP Session Management
      Section F.3.  MySQL Session Store
      Appendix G.  Resources
      Section G.1.  Client Tier Resources
      Section G.2.  Middle-Tier Resources
      Section G.3.  Database Tier Resources
      Section G.4.  Security and Cryptography Resources
      Appendix H.  The Improved MySQL Library
      Section H.1.  New Features
      Section H.2.  Getting Started
      Section H.3.  Using the New Features
    Colophon
    Copyright



 

Previous Section  < Day Day Up >  Next Section

15.2 Advanced Querying

In Chapter 5, we covered most of the querying techniques you'll need to develop web database applications. In this section, we show you selected advanced techniques including shortcuts for joins, other join types, how to use aliases, using MySQL's new nested query support, working with user variables, and obtaining subtotals using WITH ROLLUP. This section concludes with a list of what we've omitted, and what MySQL doesn't yet include.

15.2.1 Advanced Join Types

This section introduces you to the INNER JOIN, LEFT JOIN, RIGHT JOIN, and UNION statements. The INNER JOIN statement is a shortcut that can save you some typing (and we use it throughout many examples in this chapter), LEFT JOIN and RIGHT JOIN add new functionality to find rows that don't have a match in another table, and UNION brings together the results from two separate queries.

15.2.1.1 Natural and inner joins

In the Chapter 5, we showed you how to perform a join between two or more tables. For example, to join the customer and orders tables, with the goal of displaying customers who've placed orders, you would type:

SELECT DISTINCT surname, firstname, customer.cust_id FROM customer, orders

  WHERE customer.cust_id = orders.cust_id;

The join condition in the WHERE clause limits the output to only those rows where there's a matching customer and order (and the DISTINCT clause presents each customer's details once).

We've referred to our example query as a natural join, but this isn't strictly correct. A natural join (which is also introduced in the Chapter 5) produces the same results, but doesn't actually require you to specify what the join condition is. Consider an example:

SELECT DISTINCT surname, firstname, customer.cust_id 

  FROM customer NATURAL JOIN orders;

The MySQL server determines what attributes have the same name in the tables orders and customer, and creates the WHERE clause behind the scenes to join those attributes. For readable queries, we recommend you make your joins explicit by adding the WHERE clause and listing the attributes.

Just to make querying more confusing, the previous examples are also an example of an inner join. You can express the same query using the INNER JOIN syntax and the USING clause:

SELECT DISTINCT surname, firstname, customer.cust_id 

  FROM customer INNER JOIN orders USING (cust_id);

This query matches rows between the customer and orders tables using the cust_id attribute that's common to both tables. It's required that the attribute (or comma-separated) attributes listed in the USING clause are enclosed in brackets. If you leave out the USING clause, you'll get a Cartesian product and that's not what you want.

The join in the previous example is an inner join because only the rows that match between the two tables are output. Customers that haven't placed orders aren't output, and nor are orders that don't have a matching customer. The INNER JOIN with a USING clause can be used interchangeably with the comma-based syntax of the first example in this section (and, for that matter, all joins of two or more tables in Chapter 5). We use the INNER JOIN syntax frequently throughout this chapter.

The USING clause is a handy shortcut when two tables share a join attribute with the same name. When they don't, you can use the ON clause to achieve the same result. Consider an example that joins the wine and wine_type tables to discover the type of wine #100:

SELECT wine_type.wine_type 

  FROM wine INNER JOIN wine_type ON wine.wine_type=wine_type.wine_type_id

  WHERE wine.wine_id=100;

In general, you should use the ON clause only to specify a join condition. You should use a WHERE clause to specify which rows should be output.

You can have several conditions in an ON clause. For example, to find all of the wines in customer #20's first order, use:

SELECT wine_id FROM orders INNER JOIN items 

  ON orders.order_id=items.order_id AND orders.cust_id=items.cust_id

  WHERE orders.cust_id=20 AND orders.order_id=1;

In this case, since the attributes have the same name in the two tables, the shortcut with USING works too:

SELECT wine_id FROM orders INNER JOIN items USING (cust_id,order_id)

  WHERE orders.cust_id=20 AND orders.order_id=1;

15.2.1.2 Left and right joins

The queries in the previous section output rows that match between tables. But what if you want to output data from a table, even if it doesn't have a matching row in the other table? For example, suppose you want to output a list of all countries and the customers who live in that country, and you want to see a country listed even if it has no customers. You can do this with a LEFT JOIN query:

SELECT country, surname, firstname, cust_id

  FROM countries LEFT JOIN customer USING (country_id);

In part, this outputs the results:

| Australia                | Stribling   | Michelle  |     646 |

| Australia                | Skerry      | Samantha  |     647 |

| Australia                | Cassisi     | Betty     |     648 |

| Australia                | Krennan     | Jim       |     649 |

| Australia                | Woodburne   | Lynette   |     650 |

| Austria                  | NULL        | NULL      |    NULL |

| Azerbaijan               | NULL        | NULL      |    NULL |

| Bahamas                  | NULL        | NULL      |    NULL |

| Bahrain                  | NULL        | NULL      |    NULL |

| Bangladesh               | NULL        | NULL      |    NULL |

The LEFT JOIN clause outputs all rows from the table listed to the left of the clause. In this example, all countries are listed because the countries is on the left in the clause countries LEFT JOIN customer. When there are no matching rows in the customer table then NULL values are output for the customer attributes. So, for example, none of our customers live in Austria. The syntax of the LEFT JOIN is the same as the INNER JOIN clause, except that a USING or ON clause is required.

The RIGHT JOIN clause is identical, except that it outputs all rows from the table listed to the right of the clause, and NULL values are shown for the table on the left of the clause when there's no matching data. It's included in MySQL for convenience, so that you can write joins with the tables in the order you want in a query. However, we use only LEFT JOIN in our queries to keep things simple.

There's also a variation of NATURAL JOIN that does the same thing as LEFT JOIN:

SELECT country, surname, firstname, cust_id

  FROM countries NATURAL LEFT JOIN customer;

This just allows you to omit the USING and ON clauses, and to rely on the MySQL server figuring it out instead. Of course, there's NATURAL RIGHT JOIN too. Again, we recommend not using either and instead including an ON or USING clause to make the join condition explicit.

As we've seen, the LEFT JOIN clause outputs NULL values when there's no matching row in the table listed to the right of the clause. You can use this to limit your output to only those rows in the left table that don't have matching rows in the right table. For example, suppose you want to find all customers who've never placed an order. You can do this with the query:

SELECT surname, firstname, orders.cust_id

  FROM customer LEFT JOIN orders USING (cust_id)

  WHERE orders.cust_id IS NULL;

The query performs a left join, and then only outputs those rows where the cust_id in the orders table has been set to NULL in the join process. In part, the output is:

+------------+-----------+---------+

| surname    | firstname | cust_id |

+------------+-----------+---------+

| Sorrenti   | Caitlyn   |    NULL |

| Mockridge  | Megan     |    NULL |

| Krennan    | Samantha  |    NULL |

| Dimitria   | Melissa   |    NULL |

| Oaton      | Mark      |    NULL |

| Cassisi    | Joshua    |    NULL |

15.2.1.3 Unions

The UNION clause allows you to combine the results of two or more queries. Most of the time you won't need it because a WHERE clause, GROUP BY, or HAVING clause provides the features you need to extract rows. However, there are occasions where it's not possible to write one query that'll do a task, and UNION sometimes saves you merging results manually after two queries have been executed.

To use UNION, you need to have attributes of the same type listed in the same order in the SELECT statement. Consider a simple example where we want to list the three oldest and three newest customers from the customer table:

(SELECT cust_id, surname, firstname 

  FROM customer ORDER BY cust_id LIMIT 3)

UNION

(SELECT cust_id, surname, firstname 

  FROM customer ORDER BY cust_id DESC LIMIT 3);

The query produces the following results:

+---------+-----------+-----------+

| cust_id | surname   | firstname |

+---------+-----------+-----------+

|       1 | Rosenthal | Joshua    |

|       2 | Serrong   | Martin    |

|       3 | Leramonth | Jacob     |

|     650 | Woodburne | Lynette   |

|     649 | Krennan   | Jim       |

|     648 | Cassisi   | Betty     |

+---------+-----------+-----------+

6 rows in set (0.01 sec)

You can also combine queries from different tables, with different attributes of the same type. When you do this, the output is labeled with the attribute names from the first query. As an example, suppose you want to produce a list of regions and wineries. You could do this with:

(SELECT winery_name FROM winery)

UNION

(SELECT region_name FROM region);

The first and last four rows from the output are wineries and regions respectively:

+---------------------------------+

| winery_name                     |

+---------------------------------+

| Anderson and Sons Premium Wines |

| Anderson and Sons Wines         |

| Anderson Brothers Group         |

| Anderson Creek Group            |

...

| Riverland                       |

| Rutherglen                      |

| Swan Valley                     |

| Upper Hunter Valley             |

+---------------------------------+

310 rows in set (0.01 sec)

15.2.2 Aliases

To save typing, add additional functionality, or just improve the labeling of columns, aliases are sometimes used for attribute and table names in querying. Attribute aliases are particularly useful in PHP as they can help you rename duplicate attribute names as discussed in Chapter 6. Table aliases add functionality when you want to join a table with itself and they're essential in some aspects of nested queries as discussed later in Section 15.2.3.

Consider an example query that uses table aliases:

SELECT * FROM inventory i, wine w

  WHERE i.wine_id = 183 AND i.wine_id = w.wine_id;

In this query, the FROM clause specifies aliases for the table names. The alias inventory i means than the inventory table can be referred to as i elsewhere in the query. For example, i.wine_id is the same as inventory.wine_id. This just saves typing in this example.

Aliases are very useful for complex queries that need to use the same table twice but in different ways. For example, to find any two customers with the same surname, you can use:

SELECT c1.cust_id, c2.cust_id FROM customer c1, customer c2 

  WHERE c1.surname = c2.surname AND c1.cust_id != c2.cust_id;

Here we used the customer table twice but gave it two aliases (c1 and c2) so we can compare two customers. The final clause, c1.cust_id != c2.cust_id, is essential because, without it, all customers are reported as answers; this would occur because all customers are rows in tables c1 and c2, and each customer row would match itself.

Attribute aliases are similar to table aliases. Consider an example:

SELECT surname AS s, firstname AS f FROM customer

  WHERE surname = "Krennan" ORDER BY s, f;

In part, this outputs:

+---------+----------+

| s       | f        |

+---------+----------+

| Krennan | Andrew   |

| Krennan | Betty    |

| Krennan | Caitlyn  |

| Krennan | Caitlyn  |

| Krennan | Dimitria |

An attribute alias can be used in the ORDER BY, GROUP BY, and HAVING clauses, but not in the WHERE clause; it can't be used in a WHERE clause (or USING or ON) because an attribute may not be known when the WHERE clause is executed. The alias is also used for the column headings in the output (and, as discussed in Chapter 6, you'll find this useful when you're working with PHP's mysql_fetch_array( ) function or PEAR DB's DB::fetchRow( )).

Attribute aliases can also be used with functions. In the next example, we're finding out how many customers are resident in each city (but only for cities that have more the five customers):

SELECT count(*) AS residents, city FROM customer

  GROUP BY city HAVING residents>5 ORDER by residents DESC;

Here, residents is an alias that refers to the count function. In part, the query outputs:

+-----------+---------------+

| residents | city          |

+-----------+---------------+

|        16 | Portsea       |

|        14 | Alexandra     |

|        13 | Kidman        |

|        13 | Montague      |

|        13 | Doveton       |

|        13 | Mohogany      |

15.2.3 Nested Queries

MySQL 4.1 supports nested queries, solving MySQL's most frequently discussed weakness. Nested queries are those that contain another query—they are both elegant and powerful but, unfortunately, can be difficult to learn to use. This section presents an overview of nested queries, but you'll find much longer discussions in the relational database texts listed in Appendix G.

15.2.3.1 Introduction

Consider an example nested query that finds the names of the wineries that are in the Margaret River region:

SELECT winery_name FROM winery WHERE region_id 

  = (SELECT region_id FROM region WHERE region_name = "Margaret River");

The inner query (the one in brackets) returns the region_id value of the Margaret River region. The outer query (the one listed first) finds the winery_name values from the winery table where the region_id matches the result of the inner query.

You can nest to any level, as long as you get the brackets right. Here's another example that finds the name of the region that makes wine #17:

SELECT region_name FROM region WHERE region_id = 

  (SELECT region_id FROM winery WHERE winery_id = 

    (SELECT winery_id FROM wine WHERE wine_id = 17));

Both of our previous examples can be easily rewritten as a single query with a WHERE clause and an AND operator. Indeed, you should always try to write join queries where possible and avoid nesting unless you need it; MySQL isn't good at optimizing nested queries and they are therefore usually slower to run. However, sometimes, you need a nested query.

Here's an example where a nested query is the only practical solution. Suppose you want to find which customers have made the largest single purchase of a wine. You can find which wine was sold for the highest total price using:

SELECT MAX(price) FROM items;

This reports the maximum price:

+------------+

| MAX(price) |

+------------+

|     329.12 |

+------------+

1 row in set (0.01 sec)

You could then write a second query to find the customers who bought the wine:

SELECT customer.cust_id FROM customer INNER JOIN items USING (cust_id)

  WHERE price = 329.12;

However, with nesting you can put the queries together into a single step:

SELECT DISTINCT customer.cust_id FROM customer 

  INNER JOIN items USING (cust_id)

  WHERE price = (SELECT MAX(price) FROM items);

It's not possible to write this query in one step without nesting. As we discussed in Chapter 8, using the output of a SELECT query as the input to an UPDATE, INSERT, or DELETE can cause concurrency problems and, therefore, nested queries allow you to avoid locking for many (but not all) queries.

Nesting can also be used in the HAVING clause. We don't discuss this in detail here.

In the examples so far, we've used the equals = operator. You can also use other comparison operators, including <, >, <=, >=, and !=. These operators are discussed in more detail later in "Functions." Also, all of our examples return single values from the inner query, and only one attribute or aggregate. If the inner query returns more than one value or attribute, MySQL reports an error; this can be solved using the IN clause we discuss next.

15.2.3.2 The IN clause

Suppose you want to find the wines that have been purchased by customers who've placed at least six orders. You can't use the techniques we've discussed in the previous section because the inner query (which finds the customers who've made more than six purchases) is likely to return more than one cust_id value. However, you can still use a nested query for the task by using the IN clause.

Let's consider how you'd find customers who've placed six or more orders. You'd use a query such as this:

SELECT customer.cust_id FROM customer 

  INNER JOIN orders USING (cust_id)

  GROUP BY cust_id HAVING count(order_id) >= 6;

When you test this query on the winestore database, you'll find there are 107 customers returned as answers. However, to make things easy, let's look only for the three customers returned with cust_id values of 7, 14, and 107.

You could find all of the wines purchased by those three customers using the following query:

SELECT DISTINCT wine_id FROM items 

  WHERE cust_id = 7 OR cust_id = 14 OR cust_id = 107;

Of course, you could extend this to find all wines for all 107 customers, but that requires a lot of typing!

Here's how you can do it with a nested query and the IN clause:

SELECT DISTINCT wine_id FROM items WHERE cust_id IN 

  (SELECT customer.cust_id FROM customer

   INNER JOIN orders USING (cust_id)

   GROUP BY cust_id HAVING count(order_id) >= 6);

The outer query finds all wine_id values from the items table where the cust_id is in the set of values returned from the inner query. The inner query finds all customers who've made at least six orders. If you run this query on your MySQL installation, you'll find it's very slow to execute because MySQL isn't yet that good at optimizing nested queries. However, the result is exactly what you want.

You can compare several attributes in the nesting condition by listing more than one attribute before an IN clause, as long as the attributes are of the same type and order as those listed in the nested query. This isn't a common requirement, because most of the time you can do this with a WHERE clause and a join query. But to illustrate the syntax, suppose we had a table that contained a list of contacts, and we wanted to find out which of our contacts had the same name as a customer. A possible query would be:

SELECT * FROM contacts WHERE (surname, firstname) IN

  (SELECT surname, firstname FROM customer);

The query would return the set of people whose surnames and firstnames are the same in the contacts and customer tables. Of course, this query could be rewritten as:

SELECT * FROM contacts INNER JOIN 

  customer USING (surname, firstname);

With MySQL, the join query runs much faster, and should be used in preference.

Nested queries can also use the NOT IN clause. This has the opposite effect to IN, and is analogous to the != operator but is applied to more than one row. Here's an example, where we want to find those customers who've not made at least five orders:

SELECT customer.cust_id, surname, firstname FROM customer 

  WHERE customer.cust_id NOT IN

    (SELECT customer.cust_id FROM customer 

     INNER JOIN orders USING (cust_id)

     GROUP BY cust_id HAVING count(*) >= 5);

15.2.3.3 The EXISTS clause

Perhaps the least intuitive (and most complicated) of the nested querying tools is the EXISTS clause. However, it's very useful. The EXISTS clause is used to return output from the outer query if the inner query returns any results. Consider an example:

SELECT region_name FROM region WHERE EXISTS

  (SELECT region_id FROM winery GROUP BY region_id HAVING count(*) > 35);

MySQL first runs the inner query which, if you run it yourself, returns the following regions that contain at least 35 wineries:

+-----------+

| region_id |

+-----------+

|         4 |

|         5 |

|         9 |

+-----------+

3 rows in set (0.00 sec)

Because the inner query returns a result, the outer query is executed, and so the overall output of the nested query is:

+---------------------+

| region_name         |

+---------------------+

| All                 |

| Barossa Valley      |

| Coonawarra          |

| Goulburn Valley     |

| Lower Hunter Valley |

| Margaret River      |

| Riverland           |

| Rutherglen          |

| Swan Valley         |

| Upper Hunter Valley |

+---------------------+

10 rows in set (0.01 sec)

This perhaps isn't what you expected: it's a list of all regions, and it has nothing to do with those regions that have at least 35 wineries!

You're probably wondering now whether EXISTS is actually useful. It is, but only when the inner query contains an outer reference. An outer reference creates a relationship between the inner query and the outer query, in the same way as IN or a comparison operator such as = does in the previous sections. Consider an example that corrects our previous one:

SELECT region_name FROM region WHERE EXISTS 

  (SELECT * FROM winery WHERE region.region_id = winery.region_id 

   GROUP BY region_id HAVING count(*) > 35);

The query returns the following results:

+---------------------+

| region_name         |

+---------------------+

| Coonawarra          |

| Upper Hunter Valley |

| Margaret River      |

+---------------------+

3 rows in set (0.00 sec)

The query is now returning the results we expected: a list of regions that contain more than 35 wineries. You'll notice that the region table's region_id attribute is referenced in the inner query but the region table isn't listed in its FROM clause. This is the outer reference, and it causes MySQL to run the inner query for every value returned from the outer query and output is only produced when the inner query returns a result. The region table's region_id is used in the inner query in the WHERE and GROUP BY clauses, and the count(*) in the HAVING clause therefore refers to the number of wineries in a region.

Figure 15-1 shows how MySQL evaluates the previous query using the EXISTS clause. For each region_name that's in the region table, MySQL runs an inner query. If the inner query produces results, the region_name from the outer query is added to the results; if it doesn't produce results, the outer result isn't shown. As the figure illustrates, MySQL evaluates an inner query for every outer result, and this can be slow.

Figure 15-1. Evaluating a nested query that uses EXISTS
figs/wda2_1501.gif


Whether you use the IN clause or the EXISTS clause is almost always a personal preference. You'll recall this query from the previous section:

SELECT DISTINCT wine_id FROM items WHERE cust_id IN 

  (SELECT customer.cust_id FROM customer 

   INNER JOIN orders USING (cust_id)

   GROUP BY cust_id HAVING count(order_id) >= 6);

This query revolves around the use of cust_id in both the inner and outer queries. The inner query joins tables using cust_id and groups results by cust_id. Therefore, the outer query finds the cust_id associated with six or more orders.

It can be rewritten using the EXISTS clause as follows:

SELECT DISTINCT wine_id FROM items WHERE EXISTS 

  (SELECT * FROM orders WHERE orders.cust_id = items.cust_id

   GROUP BY cust_id HAVING count(order_id) >= 6);

This query revolves around orders in the inner query and items in the outer query. We group all orders by cust_id to find the count of six or greater. The items are irrelevant to this inner SELECT, but we refer to the items attribute so that we can search for items in the outer query (items is the outer reference).

Interestingly, while the EXISTS version is much slower than a typical join, the IN version is more than fives times slower again in our test environment. This again illustrates how poor MySQL is at optimizing nested queries, and this is hopefully something that will change soon. However, when you can, you should always write a join query in preference to a nested query.

One final note: there's also a NOT EXISTS clause that has the opposite function to EXISTS. Using NOT EXISTS, the outer query is executed if the inner query doesn't return any results.

15.2.3.4 Nested queries in the FROM clause

Nested queries can also be used in the FROM clause of a query to create an artificial table. Consider a (contrived) example where you want to find the alphabetically last customer. The following query outputs consolidated customer names:

SELECT concat(surname, " ", firstname) AS name FROM customer;

The concat( ) function joins together strings, and is discussed later in Section 15.4. The first few lines of output from the query are:

+----------------------+

| name                 |

+----------------------+

| Rosenthal Joshua     |

| Serrong Martin       |

| Leramonth Jacob      |

| Keisling Perry       |

Adding the nesting to the query allows you to find the maximum (alphabetically last) customer:

SELECT max(cust.name) FROM 

  (SELECT concat(surname, " ", firstname) AS name FROM customer) AS cust;

This outputs:

+-------------------+

| max(cust.name)    |

+-------------------+

| Woodestock Sandra |

+-------------------+

1 row in set (0.01 sec)

You'll notice that we've aliased concat(surname, " ", firstname) as name so that it's easily referenced in the outer query. Also, you'll notice we've aliased the inner query using a table alias as cust. It's mandatory to alias the results returned from the inner query using a table alias, otherwise MySQL doesn't know how to reference the results in the outer query.

Of course, more simply, the previous query could have been accomplished using an unnested query and MySQL's proprietary LIMIT clause:

SELECT surname, firstname FROM customer

  ORDER BY surname DESC, firstname DESC LIMIT 1;

15.2.4 User Variables

User variables are used to store intermediate results and use them in later statements. We explain how to use them and discuss their advantages and limitations in this section.

Consider an example. In Chapter 8, we showed you the following sequence of statements as an example of when to use locking:

mysql> LOCK TABLES items READ, temp_report WRITE;



mysql> SELECT sum(price) FROM items WHERE cust_id=1;

+------------+

| sum(price) |

+------------+

|     438.65 |

+------------+

1 row in set (0.04 sec)



mysql> UPDATE temp_report SET purchases=438.65

       WHERE cust_id=1;

mysql> UNLOCK TABLES;

The example is a little clumsy. It requires that you write down or copy the value 438.65, and then use type it in or paste it into the UPDATE statement. In contrast, if you were executing the statements using PHP, you would retrieve the row produced by the SELECT statement using mysql_fetch_array( ), save the value in a PHP variable, and then execute the UPDATE statement and include the value of the PHP variable in its WHERE clause.

A better approach than recording the value or using PHP variables is to save the value in a MySQL user variable. MySQL user variables allow you to save results for a connection without using PHP and, therefore, without transferring results to the web server. Here's the previous example rewritten to use this approach:

mysql> LOCK TABLES items READ, temp_report WRITE;



mysql> SELECT @total:=sum(price) FROM items WHERE cust_id=1;

+--------------------+

| @total:=sum(price) |

+--------------------+

|             438.65 |

+--------------------+

1 row in set (0.14 sec)



mysql> UPDATE temp_report SET purchases=@total

       WHERE cust_id=1;

mysql> UNLOCK TABLES;

User variables are prefixed with an @ character and the assignment operator is :=. In this example, the result of the SELECT statement is saved in the MySQL variable @total. In the UPDATE statement, the value of the variable @total is assigned to the attribute purchases. The benefit is that you don't have to remember (or cut and paste) the result of the SELECT statement.

Consider another example. Suppose you want to find which customers bought the most expensive wine (or wines). First, you run a query that finds the price of the most expensive wine and save the result in a MySQL variable @max_cost:

mysql> SELECT @max_cost:=max(cost) FROM inventory;

+----------------------+

| @max_cost:=max(cost) |

+----------------------+

|                29.92 |

+----------------------+

1 row in set (0.01 sec)

Now that the maximum cost is saved, you can use it in the WHERE clause of a query to find the names of the customers who've bought the most expensive wine. To do this, you join together the customer, items, and inventory tables in the following query:

mysql> SELECT customer.cust_id, surname, firstname FROM

    -> customer INNER JOIN items USING (cust_id) 

    -> INNER JOIN inventory USING (wine_id)

    -> WHERE cost = @max_cost;

+---------+------------+-----------+

| cust_id | surname    | firstname |

+---------+------------+-----------+

|      32 | Archibald  | Joshua    |

|      33 | Galti      | Lynette   |

|      44 | Mellili    | Michelle  |

|      54 | Woodestock | George    |

|      71 | Mellaseca  | Lynette   |

|     144 | Nancarral  | Joshua    |

|     156 | Cassisi    | Joshua    |

|     236 | Mockridge  | Megan     |

|     274 | Eggelston  | Melissa   |

|     320 | Mellaseca  | Craig     |

|     334 | Serrong    | Caitlyn   |

|     408 | Patton     | Joshua    |

|     510 | Sorrenti   | Joel      |

|     531 | Nancarral  | Michelle  |

|     551 | Skerry     | Joel      |

|     622 | Serrong    | Peter     |

+---------+------------+-----------+

16 rows in set (0.08 sec)

The WHERE clause uses the MySQL user variable saved from the previous SELECT query.

There are three issues you need to remember with MySQL user variables:

  • They only work for a connection. You can't see or use them from other connections, and they're lost when the connection closes.

  • They can only contain alphanumeric characters, the underscore character, the dollar sign, or a period.

  • They usually only work when you assign the variable in one statement and use its value in another. To avoid unexpected behavior, do not assign and use the variable in the same statement.

In general, we recommend using MySQL variables where possible in preference to saving intermediate values in your PHP scripts.

15.2.5 ROLLUP with GROUP BY

MySQL 4.1.1 and later versions support the WITH ROLLUP modifier that provides subtotaling of grouped columns in output. To show you how it works, first consider a simple example without WITH ROLLUP, where we want to find the sales of wines made in each year:

SELECT year, sum(price) FROM wine 

  INNER JOIN items USING (wine_id)

  GROUP BY year;

In part, this reports:

+------+------------+

| year | sum(price) |

+------+------------+

| 1970 |   20562.89 |

| 1971 |   16273.73 |

...

| 1997 |   18009.39 |

| 1998 |   20739.53 |

| 1999 |   18890.10 |

+------+------------+

30 rows in set (0.13 sec)

Now, consider what happens if you add WITH ROLLUP to the query:

SELECT year, sum(price) FROM wine INNER JOIN items USING (wine_id)

  GROUP BY year WITH ROLLUP;

You get one extra row in the results:

| 1997 |   18009.39 |

| 1998 |   20739.53 |

| 1999 |   18890.10 |

| NULL |..577975.66 |

+------+------------+

31 rows in set (0.13 sec)

The extra row has a NULL value for the year, and the sum(price) column is the sum of all sales in all years. It's a shortcut that saves you running the following extra query:

SELECT sum(price) FROM items INNER JOIN wine USING (wine_id);

Now, consider a more sophisticated example that finds the total sales of each wine. In the example, we've included the region and winery tables so that we can use WITH ROLLUP to get a subtotal of wines sold by each winery and region. The key to obtaining the subtotal is to use unique values from the region and winery tables (in addition to the wine_id) in the GROUP BY clause. Here's the query:

SELECT region_name, winery_name, wine.wine_id, sum(price) FROM region

  INNER JOIN winery USING (region_id) 

  INNER JOIN wine USING (winery_id) 

  INNER JOIN items USING (wine_id) 

  GROUP BY region_name, winery_name, wine.wine_id WITH ROLLUP;

The output reports, in part, the following:

+----------------+---------------------+---------+------------+

| region_name    | winery_name         | wine_id | sum(price) |

+----------------+---------------------+---------+------------+

| Barossa Valley | Anderson Daze Wines |     214 |     978.25 |

| Barossa Valley | Anderson Daze Wines |     215 |      31.62 |

| Barossa Valley | Anderson Daze Wines |     216 |     576.25 |

| Barossa Valley | Anderson Daze Wines |     217 |     225.39 |

| Barossa Valley | Anderson Daze Wines |     218 |     190.26 |

| Barossa Valley | Anderson Daze Wines |    NULL |    2001.77 |

...

| Barossa Valley | NULL                |    NULL |   68403.90 |

...

| NULL           | NULL                |    NULL |  577975.66 |

+----------------+---------------------+---------+------------+

The sixth row shows that the total sales of the Anderson Daze Wines winery (which equals the sum of rows one to five) is $2001.77. Similarly, a total for each winery is listed immediately after that winery. After all wines for all wineries in the Barossa Valley are listed, the total for the Barossa Valley region of $68403.90 is shown using NULL values for winery_name and wine_id. Again, a region subtotal is shown immediately after each region. The last row in the table has NULL values for all attributes, and is the total sales of $577,975.66 for all regions, wineries, and wines.

The WITH ROLLUP modifier has a few peculiarities. First, the ORDER BY clause cannot be used with WITH ROLLUP. Second, the LIMIT clause is applied after the output is produced, and so it includes the subtotal rows that have NULL values. Other limitations are discussed in the MySQL manual.

15.2.6 Other MySQL Topics

We've gone as far as we're going with querying, and further than you'll need for most web database applications that you'll develop. There are topics that we've left out, including optimizing queries, using procedures, and full-text searching. Some of these topics are discussed in books dedicated solely to MySQL that are listed in Appendix E.

You'll find more about optimizing queries by diagnosis with the EXPLAIN statement in Section 5.2.1 of the MySQL manual, and we briefly discuss it in the section Tuning MySQL. Section 5 of the MySQL also includes an explanation of how MySQL optimizes most join queries, but it doesn't yet discuss nested queries. Forcing MySQL to use or ignore an index is discussed in Section 6.4.1 of the MySQL manual. Other general ideas about improving the performance of MySQL are also discussed in Section 15.9.

MySQL procedures are C++ code that can be called within a SELECT statement to post-process data. Writing procedures is discussed in Section 12 of the MySQL manual. If you're familiar with other database servers, you might also be familiar with stored procedures, which aren't the same thing. Stored procedures are SQL statements that are precompiled and stored in the server so that the client application can call the procedure instead of re-running the query, with the result that performance is substantially improved. Stored procedure support is planned for MySQL 5. Triggers are another common database server component; they are similar to stored procedures but triggers are invoked by the server when a condition is met. Triggers support is also planned for MySQL 5.

Views aren't supported in MySQL, but support is planned in MySQL 5.1. Views consolidate read-only access to several tables based on a join condition. For example, a view might allow a user to browse the sales made up to April without the need to create a temporary table. Other limitations that we don't discuss here include the lack of support for foreign keys in some table types and the lack of cursor support. Both are planned for MySQL 5.

    Previous Section  < Day Day Up >  Next Section







    Copyright © 2010 | Domen maybe sale - bye this domen