SQL Reports

All Things Being SQL – SQL Reports and SQL Reporting Resources

SELECTING CONDITIONALLY PART – 6.1 SQL Database

This SQL Database topic would deal with the advantages, disadvantages and the actual intricacies that are involved when two or more tables are joined in an SQL database query. But before we actually try to join a table with some another, there are a few concepts about tables that should be made clear. Here are a few of them.

SQL Database – PRIMARY KEYS IN A TABLE

So what exactly is a primary key? A simple definition of the same would be as follows. A primary key is a column (or one or more column combination(s)) that has the following properties.

  1. No value in that column(s) is NULL.
  2. Every value is unique. I.e. there are no repetitions of values in the same column (or column combinations).

Any column that satisfies the above properties can be considered to be a potential primary key. A primary key is almost always specified explicitly to the system by the database designer. It is then the systems responsibility to ensure that the above two conditions are not violated in any case.

If you remember the Invoices table, you would realize that the only column that satisfies the above criteria is the “INVOICE_NO” column. Similarly the SUPPLIER_ID column is the primary key for the suppliers table. The importance of the primary key can be understood from the analogy of a spreadsheet. Every cell in a spreadsheet can be uniquely identified by a row – column number combination. Similarly every “cell” in a table can also be identified by a primary key value – column name combination.

However, the greatest advantage of a primary key is reflected when we actually join a table, the primary key forms the basis of the join itself in most cases. Here is a technical diagram that illustrates the relationship between the suppliers and the invoices tables.

FIGURE 40: A relationship diagram

This diagram is technically called an “Entity Relationship Diagram”. For the time being, just focus on the part highlighted in green. It tells you about the relation between the two tables we have used so far. It tells us that the SUPPLIER_ID column in the Invoices column actually references to the primary key column SUPPLIER_ID in another table called suppliers. Though the names of the columns are the same for both the tables, this may not be the case always.

This diagram also conveys a lot of other information to a database designer, but for the time being we will satisfy ourselves with the above fact. This also leads us to another concept that is essential when joining a table.

FOREIGN KEYS IN A TABLE

Referring to FIGURE 40, it is clear that although the SUPPLIER_ID column in the Invoices table is not a primary key for that table, it is a primary key in another table, namely the Suppliers table. Such a column is termed as a “Foreign key“. A foreign key must satisfy the following properties:

  1. It must be a primary key in the table which it references.
  2. A foreign key value in a table must always reference some row in another table. This can be made clear with an example. There is no Invoice No. “7” that refers to “SUPPLIER_ID” 5 anywhere (we have only 4 suppliers – FIGURE 19). This is simply because there is no supplier with SUPPLIER_ID 5. Had such a value existed in the Invoices table, the SUPPLIER_ID column would have ceased to be a foreign key at all.

Once again, it is the Database designer’s job to explicitly specify a foreign key column to the system and it is then the systems responsibility to ensure that a “SUPPLIER_ID” value in the Invoices table always references some supplier in the Suppliers table.

The fact that system ensures that the above rules are not violated is what defines the “integrity” of a database system. The below mentioned join example will make this and many more concepts clear.

  1. LIST THE INVOICE_NO, SUPPLIER NAME, INVOICE DATE AND INVOICE AMOUNT FOR EVERY INVOICE ISSUED TO US.

SELECT INVOICES.INVOICE_NO, SUPPLIERS.NAME,

INVOICES.INVOICE_DATE, INVOICES.AMOUNT FROM INVOICES, SUPPLIERS

WHERE INVOICES.SUPPLIER_ID = SUPPLIERS.SUPPLIER_ID;

OR

SELECT INVOICE_NO, NAME, INVOICE_DATE, INVOICES.AMOUNT FROM

INVOICES, SUPPLIERS

WHERE INVOICES.SUPPLIER_ID = SUPPLIERS.SUPPLIER_ID;

INVOICE_NO

NAME

INVOICE_DATE

AMOUNT

1

ABC Corp

24-Feb-2011

1234

2

DEF Inc

24-Feb-2011

2345

3

GHI Associates

24-Feb-2011

3455

4

GHI Associates

25-Feb-2011

2314

5

DEF Inc

26-Feb-2011

9876

6

ABC Corp

27-Feb-2011

4500

FIGURE 41: THE JOINED TABLE

Both queries would return the same result and the queries themselves are relatively straightforward. Here are a few points that would clarify a few things about how this query works:

  1. The DOT (“.”) operator: In the first query we have referenced each column in the following manner: “TABLE NAME”.”COLUMN NAME”. This is the correct format for referencing any table. We have never used this format till now because SQL gives the user some freedom in this regard.
  2. The second query illustrates (1) above and shows that the DOT operator is not really necessary.
  3. Also note how the two tables are referenced after the FROM clause.
  4. The JOIN condition: The clause “WHERE INVOICES.SUPPLIER_ID = SUPPLIERS.SUPPLIER_ID” is what forms the basis of this join. The system matches the “SUPPLIER_ID” column in both tables and then inputs the corresponding supplier name from the “SUPPLIER.NAME” column in the result.
  5. The DOT operator becomes essential when referencing columns from two tables that have identical names. Hence, in the WHERE clause, we have written: “INVOICES.SUPPLIER_ID = SUPPLIERS.SUPPLIER_ID”. This tells the system, which columns to choose without any ambiguity. In general, it is a good habit to always qualify the table name to a column with a dot operator while writing any query, especially when the query involves two or more tables.
  6. Also note the column names in the result. Instead of INVOICES.INVOICE_NO only INVOICE_NO is displayed. Analogous remarks apply to all other columns.

AN IMPORTANT POINT

Why this join was made possible in the first place? The answer to this lies in the “integrity” that a database system enforces. As said earlier, it is the systems responsibility to ensure that every value in a foreign key column references a unique row in the table which is being referenced. Because this was ensured, we knew that every value in the INVOICES.SUPPLIER_ID column would reference to a corresponding SUPPLIERS.SUPPLIER_ID, and therefore a corresponding SUPPLIERS.NAME column.

In the next section we would learn more in depth things about method of table joining and its effects and uses

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>