Index
    Preface
      What This Book Is About
      What You Need to Know
      How This Book Is Organized
      How to Use This Book
      Conventions Used in This Book
      Using Code Examples
      How to Contact Us
      Web Site and Code Examples
      Acknowledgments
      Chapter 1.  Database Applications and the Web
      Section 1.1.  The Web
      Section 1.2.  Three-Tier Architectures
      Chapter 2.  The PHP Scripting Language
      Section 2.1.  Introducing PHP
      Section 2.2.  Conditions and Branches
      Section 2.3.  Loops
      Section 2.4.  Functions
      Section 2.5.  Working with Types
      Section 2.6.  User-Defined Functions
      Section 2.7.  A Working Example
      Chapter 3.  Arrays, Strings, and Advanced Data Manipulation in PHP
      Section 3.1.  Arrays
      Section 3.2.  Strings
      Section 3.3.  Regular Expressions
      Section 3.4.  Dates and Times
      Section 3.5.  Integers and Floats
      Chapter 4.  Introduction to Object-Oriented Programming with PHP 5
      Section 4.1.  Classes and Objects
      Section 4.2.  Inheritance
      Section 4.3.  Throwing and Catching Exceptions
      Chapter 5.  SQL and MySQL
      Section 5.1.  Database Basics
      Section 5.2.  MySQL Command Interpreter
      Section 5.3.  Managing Databases and Tables
      Section 5.4.  Inserting, Updating, and Deleting Data
      Section 5.5.  Querying with SQL SELECT
      Section 5.6.  Join Queries
      Section 5.7.  Case Study: Adding a New Wine
      Chapter 6.  Querying Web Databases
      Section 6.1.  Querying a MySQL Database Using PHP
      Section 6.2.  Processing User Input
      Section 6.3.  MySQL Function Reference
      Chapter 7.  PEAR
      Section 7.1.  Overview
      Section 7.2.  Core Components
      Section 7.3.  Packages
      Chapter 8.  Writing to Web Databases
      Section 8.1.  Database Inserts, Updates, and Deletes
      Section 8.2.  Issues in Writing Data to Databases
      Chapter 9.  Validation with PHP and JavaScript
      Section 9.1.  Validation and Error Reporting Principles
      Section 9.2.  Server-Side Validation with PHP
      Section 9.3.  JavaScript and Client-Side Validation
      Chapter 10.  Sessions
      Section 10.1.  Introducing Session Management
      Section 10.2.  PHP Session Management
      Section 10.3.  Case Study: Using Sessions in Validation
      Section 10.4.  When to Use Sessions
      Section 10.5.  PHP Session API and Configuration
      Chapter 11.  Authentication and Security
      Section 11.1.  HTTP Authentication
      Section 11.2.  HTTP Authentication with PHP
      Section 11.3.  Form-Based Authentication
      Section 11.4.  Protecting Data on the Web
      Chapter 12.  Errors, Debugging, and Deployment
      Section 12.1.  Errors
      Section 12.2.  Common Programming Errors
      Section 12.3.  Custom Error Handlers
      Chapter 13.  Reporting
      Section 13.1.  Creating a Report
      Section 13.2.  Producing PDF
      Section 13.3.  PDF-PHP Reference
      Chapter 14.  Advanced Features of Object-Oriented Programming in PHP 5
      Section 14.1.  Working with Class Hierarchies
      Section 14.2.  Class Type Hints
      Section 14.3.  Abstract Classes and Interfaces
      Section 14.4.  Freight Calculator Example
      Chapter 15.  Advanced SQL
      Section 15.1.  Exploring with SHOW
      Section 15.2.  Advanced Querying
      Section 15.3.  Manipulating Data and Databases
      Section 15.4.  Functions
      Section 15.5.  Automating Querying
      Section 15.6.  Table Types
      Section 15.7.  Backup and Recovery
      Section 15.8.  Managing Users and Privileges
      Section 15.9.  Tuning MySQL
      Chapter 16.  Hugh and Dave's Online Wines:A Case Study
      Section 16.1.  Functional and System Requirements
      Section 16.2.  Application Overview
      Section 16.3.  Common Components
      Chapter 17.  Managing Customers
      Section 17.1.  Code Overview
      Section 17.2.  Customer Validation
      Section 17.3.  The Customer Form
      Chapter 18.  The Shopping Cart
      Section 18.1.  Code Overview
      Section 18.2.  The Winestore Home Page
      Section 18.3.  The Shopping Cart Implementation
      Chapter 19.  Ordering and Shipping at the Online Winestore
      Section 19.1.  Code Overview
      Section 19.2.  Credit Card and Shipping Instructions
      Section 19.3.  Finalizing Orders
      Section 19.4.  HTML and Email Receipts
      Chapter 20.  Searching and Authentication in the Online Winestore
      Section 20.1.  Code Overview
      Section 20.2.  Searching and Browsing
      Section 20.3.  Authentication
      Appendix A.  Linux Installation Guide
      Section A.1.  Finding Out What's Installed
      Section A.2.  Installation Overview
      Section A.3.  Installing MySQL
      Section A.4.  Installing Apache
      Section A.5.  Installing PHP
      Section A.6.  What's Needed for This Book
      Appendix B.  Microsoft Windows Installation Guide
      Section B.1.  Installation Overview
      Section B.2.  Installing with EasyPHP
      Section B.3.  What's Needed for This Book
      Appendix C.  Mac OS X Installation Guide
      Section C.1.  Getting Started
      Section C.2.  Installing MySQL
      Section C.3.  Setting Up Apache and PHP
      Section C.4.  What's Needed for This Book
      Appendix D.  Web Protocols
      Section D.1.  Network Basics
      Section D.2.  Hypertext Transfer Protocol
      Appendix E.  Modeling and Designing Relational Databases
      Section E.1.  The Relational Model
      Section E.2.  Entity-Relationship Modeling
      Appendix F.  Managing Sessions in theDatabase Tier
      Section F.1.  Using a Database to Keep State
      Section F.2.  PHP Session Management
      Section F.3.  MySQL Session Store
      Appendix G.  Resources
      Section G.1.  Client Tier Resources
      Section G.2.  Middle-Tier Resources
      Section G.3.  Database Tier Resources
      Section G.4.  Security and Cryptography Resources
      Appendix H.  The Improved MySQL Library
      Section H.1.  New Features
      Section H.2.  Getting Started
      Section H.3.  Using the New Features
    Colophon
    Copyright



 

Previous Section  < Day Day Up >  Next Section

5.6 Join Queries

You'll often want to output data that's based on relationships between two or more tables. For example, in the winestore database, you might want to know which customers have placed orders, which customers live in Australia, or how many bottles of wine Lucy Williams has bought. These are examples of join queries, queries that match rows between tables based (usually) on primary key values. In SQL, a join query matches rows from two or more tables based on a condition in a WHERE clause and outputs only those rows that meet the condition.

As part of the process of converting the winestore entity-relationship model to SQL statements, we've included the attributes required in any practical join condition. To understand which tables can be joined in the winestore database, and how the joins are processed, it's helpful to have a copy of the ER model at hand as you work your way through this section.

5.6.1 Beware of the Cartesian Product

Suppose you want to find out the names of the wineries in the winestore database and, for each winery, the name of the region that it's located in. To do this, you examine the ER model and discover that the region and winery tables are related, and that they both contain attributes that you need in the answer to your query. Specifically, you need to retrieve the winery_name attribute from the winery table and the region_name attribute from the region table, and you need to join the two tables together to find the result.

Consider this query, which we might intuitively, but wrongly, use to find all the wineries in a region:

SELECT winery_name, region_name FROM winery, region;

This query produces (in part) the following results:

+-------------------------------+-------------+

| winery_name                   | region_name |

+-------------------------------+-------------+

| Durham and Sons Premium Wines | Coonawarra  |

| Durham Brook Group            | Coonawarra  |

| Durham Creek                  | Coonawarra  |

| Durham Estates                | Coonawarra  |

| Durham Hill Vineyard          | Coonawarra  |

The impression here is that, for example, Durham Creek winery is located in the Coonawarra region. This might not be the case. Why? First, you can use the techniques covered so far in this chapter to check which region the Durham Creek winery is located in:

SELECT region_id FROM winery WHERE winery_name='Durham Creek';

The result is:

+-----------+

| region_id |

+-----------+

|         9 |

+-----------+

1 row in set (0.01 sec)

Now, you can query the region table to find the name of the region using:

mysql> SELECT region_name FROM region WHERE region_id=9;

+----------------+

| region_name    |

+----------------+

| Margaret River |

+----------------+

1 row in set (0.00 sec)

So, Durham Creek winery isn't in Coonawarra at all!

What happened in the first attempt at a join query? The technical answer is that you just evaluated a Cartesian product: you produced as output all the possible combinations of wineries and regions, most of which don't make any sense. These odd results can be seen if you add an ORDER BY clause to the original query:

SELECT winery_name, region_name FROM winery, region 

  ORDER BY winery_name, region_name;

Recall that the ORDER BY clause sorts the results after the query has been evaluated and that it has no effect on which rows are returned from the query. Here is the first part of the output:

+---------------------------------+---------------------+

| winery_name                     | region_name         |

+---------------------------------+---------------------+

| Anderson and Sons Premium Wines | All                 |

| Anderson and Sons Premium Wines | Barossa Valley      |

| Anderson and Sons Premium Wines | Coonawarra          |

| Anderson and Sons Premium Wines | Goulburn Valley     |

| Anderson and Sons Premium Wines | Lower Hunter Valley |

| Anderson and Sons Premium Wines | Margaret River      |

| Anderson and Sons Premium Wines | Riverland           |

| Anderson and Sons Premium Wines | Rutherglen          |

| Anderson and Sons Premium Wines | Swan Valley         |

| Anderson and Sons Premium Wines | Upper Hunter Valley |

The query produces all possible combinations of the 10 region names and 300 wineries in the sample database! In fact, the number of rows output is the total number of rows in the first table multiplied by the total rows in the second table. In this case, the output is 10 x 300 = 3,000 rows.

5.6.2 Elementary Natural Joins

A cartesian product isn't the join we want. Instead, we want to limit the results to only the sensible rows, where the winery is actually located in the region. To do this, you need to understand how the relationship between the region and winery tables is maintained. If you examine the ER model, you'll see that many wineries are located in a region.

In the database tables, the relationship between the winery and region tables is maintained using the primary key of the region table, the attribute region_id that's also an attribute in the winery table. To understand this, consider the first three rows from the winery table:

mysql> SELECT * FROM winery LIMIT 3;

+-----------+--------------------------+-----------+

| winery_id | winery_name              | region_id |

+-----------+--------------------------+-----------+

|         1 | Hanshaw Estates Winery   |         2 |

|         2 | De Morton and Sons Wines |         5 |

|         3 | Jones's Premium Wines    |         3 |

+-----------+--------------------------+-----------+

3 rows in set (0.04 sec)

The first winery has a region_id of 2, the second a region_id of 5, and the third a region_id of 3. Consider now the first five rows of the region table:

mysql> SELECT * FROM region LIMIT 5;

+-----------+---------------------+

| region_id | region_name         |

+-----------+---------------------+

|         1 | All                 |

|         2 | Goulburn Valley     |

|         3 | Rutherglen          |

|         4 | Coonawarra          |

|         5 | Upper Hunter Valley |

+-----------+---------------------+

5 rows in set (0.04 sec)

If you match up each winery's region_id value with a region's region_id value, you can determine the relationship and answer the query. For example, you can now see that the first winery (Hanshaw Estates Winery) is located in region 2, the Goulburn Valley.

From a querying perspective, we want to output winery_name and region_name values where the region_id in the winery table matches the corresponding region_id in the region table. This is a natural join.

You can perform a natural join on the winery and region tables using:

SELECT winery_name, region_name FROM winery NATURAL JOIN region

  ORDER BY winery_name;

The query produces (in part) the following sensible results:

+---------------------------------+---------------------+

| winery_name                     | region_name         |

+---------------------------------+---------------------+

| Anderson and Sons Premium Wines | Coonawarra          |

| Anderson and Sons Wines         | Coonawarra          |

| Anderson Brothers Group         | Rutherglen          |

| Anderson Creek Group            | Riverland           |

| Anderson Daze Group             | Rutherglen          |

| Anderson Daze Vineyard          | Margaret River      |

| Anderson Daze Wines             | Barossa Valley      |

| Anderson Ridge Wines            | Lower Hunter Valley |

A natural join query relies on the DBMS matching attributes with the same name across the two tables. In this example, MySQL discovers that there's a region_id attribute in the winery and region tables, and it only outputs combinations where the region_id in both tables is the same.

You can write a join query that explicitly specifies which attributes should be matched to produce the correct result. The following query uses a WHERE clause to produce identical results to our previous example:

SELECT winery_name, region_name FROM winery, region 

  WHERE winery.region_id = region.region_id 

  ORDER BY winery_name;

We recommend writing out your joins so that they include the join condition in the WHERE clause. This is safer and clearer than relying on the NATURAL JOIN operator to discover common attribute names across tables and allowing the DBMS to figure out how the join is done.

Several features are shown in this second example:

  • The FROM clause contains the two table names winery and region, and so retrieves rows from both tables.

  • Attributes in the WHERE clause are specified using both the table name and attribute name, separated by a period. This is useful because the same attribute name is often used in different tables, and the query can't figure out which table is meant unless you include it. When an attribute name occurs in only one table, you can omit the table name.

  • In this example, region_id in the region table and region_id in the winery table have to be specified unambiguously as region.region_id and winery.region_id. In contrast, winery_name and region_name don't need the table name because they occur only in the winery and region tables respectively.

  • The use of both the table and attribute name can also be used for clarity in queries, even if it isn't required. So, for example, you could write winery.winery_name in the example query. It can also be used in all parts of the query, not just the WHERE clause.

  • The WHERE clause includes a join clause that matches rows between the multiple tables. In this example, the output is reduced to those rows where wineries and regions have matching region_id attributes, resulting in a list of all wineries and which region they are located in. This is the key to joining two or more tables to produce sensible results.

5.6.2.1 Examples

A join can be used to find lots of useful information from the winestore database. Suppose we want to find the names of wineries and the wines they make. Again, after examining the ER model, you'll see that you need to join together the related wine and winery tables to get the required names. Here's the query you'd need to write to get the correct result:

SELECT winery_name, wine_name FROM winery, wine 

  WHERE wine.winery_id = winery.winery_id;

This query joins the winery and wine tables by matching the winery_id attributes. The result is the names and wineries of the 1,048 wines stocked at the winestore.

You can extend this query to produce a list of wines made by a specific winery or group of wineries. For example, to find all wines made by wineries with a name beginning with Borg, use:

SELECT winery_name, wine_name FROM winery, wine 

  WHERE wine.winery_id = winery.winery_id 

  AND winery.winery_name LIKE 'Borg%';

The LIKE clause is discussed in detail in Chapter 15.

Here are two more example join queries:

  • To find the name of the region that the Ryan Ridge Winery is situated in:

    SELECT region_name FROM region, winery 
    
      WHERE winery.region_id=region.region_id 
    
      AND winery_name='Ryan Ridge Winery';

  • To find which wineries make Tonnibrook wines:

    SELECT winery_name FROM winery, wine 
    
      WHERE wine.winery_id=winery.winery_id 
    
      AND wine_name='Tonnibrook';

5.6.2.2 Using DISTINCT in joins

The next example uses the DISTINCT operator to find wines that cost less than $10:

SELECT DISTINCT wine.wine_id FROM wine, inventory 

  WHERE wine.wine_id=inventory.wine_id AND cost<10;

Wines can have more than one inventory row, and the DISTINCT operator shows each wine_id once by removing any duplicates.

Here are two examples that use DISTINCT to show only one matching answer:

  • To find which countries the customers live in:

    SELECT DISTINCT country FROM customer, countries 
    
      WHERE customer.country_id = countries.country_id;

  • To find which customers have ordered wines:

    SELECT DISTINCT surname,firstname FROM customer,orders 
    
      WHERE customer.cust_id = orders.cust_id
    
      ORDER BY surname,firstname;

5.6.3 Joins with More than Two Tables

Queries can join more than two tables. Suppose you want to find the details of the wine purchases made by a customer, including the customer's details, the dates they made an order, and the quantity and price of the items purchased. You examine the ER model, and see that the customer table that contains the customer information is related to the orders table that contains the date, and the orders table is related to the items table that contains the quantities and prices. So, to get the information you need, you have to join all three tables together.

By examining the database structure or the CREATE TABLE statements, you can see that the cust_id attribute can be used to join together the customer and the orders table. Joining the orders table and items table is a little trickier: the primary key of the orders table isn't just the order_id, it's both the cust_id and the order_id. So, for example there are many rows with an order_id of 1, but what makes a row unique is the combination of the cust_id for a customer and the order_id. These two attributes together are used to join the orders and items tables.

Suppose now that we want run this query for customer #2. Here's the query you'd use:

SELECT * FROM customer, orders, items

  WHERE customer.cust_id = orders.cust_id AND 

  orders.order_id = items.order_id AND

  orders.cust_id = items.cust_id AND customer.cust_id = 2;

The WHERE clause contains the join condition between the three tables, customer, orders, and items, and the rows selected are those in which the cust_id is the same for all three tables, the cust_id is 2, and the order_id is the same in the orders and items tables. The example illustrates how frequently the Boolean operators AND and OR are used.

If you remove the cust_id=2 clause, the query outputs all items from all orders by all customers. This is a large result set, but still a sensible one that is much smaller than the cartesian product!

Here are two more examples that join three tables:

  • To find which wines are made in the Margaret River region:

    SELECT wine_id FROM wine, winery, region
    
      WHERE wine.winery_id=winery.winery_id AND
    
      winery.region_id=region.region_id AND 
    
      region.region_name='Margaret River';

  • To find which region contains the winery that makes wine #28:

    SELECT region_name FROM wine, winery, region
    
      WHERE wine.winery_id=winery.winery_id AND
    
      winery.region_id=region.region_id AND
    
      wine.wine_id=28;

Extending to four or more tables generalizes the approach further. To find the details of customers who have purchased wines from the Ryan Estates Group winery, use:

SELECT DISTINCT customer.cust_id, surname, firstname 

  FROM customer, winery, wine, items

  WHERE customer.cust_id=items.cust_id AND

    items.wine_id=wine.wine_id AND

    wine.winery_id=winery.winery_id AND

    winery.winery_name='Ryan Estates Group'

  ORDER BY surname, firstname;

This query is the most complex so far and has four parts. The easiest way to understand a query is usually to start at the end of the WHERE clause and work toward the SELECT clause:

  1. The WHERE clause restricts the winery rows to the Ryan Estates Group (which, in this case, only matches one winery).

  2. The resultant winery row is joined with the wine table to find all wines made by the Ryan Estates Group.

  3. The wines made by Ryan Estates Group are joined with the items that have been purchased by joining to the items table.

  4. The purchased wines are then joined with the customer rows to find the purchasers. You can leave out the orders table, because the items table contains a cust_id for the join; if you need the order number or credit card number (or another orders attribute), the orders table needs to be included in the query.

  5. The result is the details of customers who have purchased Ryan Estates Group wines. The DISTINCT clause is used to show each customer only once. ORDER BY sorts the customer rows into telephone directory order.

Designing a query like this is a step-by-step process. We began by testing a query to find the winery_id of wineries with the name Ryan Estates Group. Then, after testing the query and checking the result, we progressively added additional tables to the FROM clause and the join conditions. Finally, we added the ORDER BY clause.

The next example uses three tables. It queries the complex many-to-many relationship that exists between the wines and grape_variety tables via the wine_variety table. A wine can have one or more grape varieties and these are listed in a specific order (e.g., Cabernet, then Sauvignon). From the other perspective, a grape variety such as Cabernet can be in hundreds of different wines. The many-to-many relationship is managed by creating an intermediate table between grape_variety and wine called wine_variety. The id attribute value stored in that table represents the order in which the grape varieties should appear for the wine. You can find a longer discussion of how these tables were designed and how they're used in Appendix E.

Here is the example query that joins the three tables to find what grape varieties are in wine #1004:

SELECT variety FROM grape_variety, wine_variety, wine

  WHERE wine.wine_id=wine_variety.wine_id AND

  wine_variety.variety_id=grape_variety.variety_id AND

  wine.wine_id=1004

  ORDER BY wine_variety.id;

The result of the query is:

+-----------+

| variety   |

+-----------+

| Cabernet  |

| Sauvignon |

+-----------+

2 rows in set (0.00 sec)

The join condition is the same as any three-table query. The only significant difference is the ORDER BY clause that presents the results in id order (the first listed variety was stored with ID=1, the second ID=2, and so on).

    Previous Section  < Day Day Up >  Next Section

    We provide debt help advice and Cheap.
    We search the market for car finance online
    Guaranteed Low Rate debt help
    We search the market for mortgages online





    Copyright © 2010 | Domen maybe sale - bye this domen