How to Design Your Dictionary and Database

Top  Previous  Next

This topic provides a quick review of relational database theory. Planning and organizing your application's database design up front can result in a more efficient application for the end user, not to mention saving hours of redesign later.

The relational model concerns itself with three aspects of data management: structure, integrity, and manipulation. For our purposes, we will discuss the three practical requirements of these aspects: data normalization, keys, and relational operations.

Normalization

At its simplest, data normalization means that a data item should be stored at only one location. There are two benefits to this: lowered disk space requirements, and easier data maintenance. To achieve this end, a relational database design splits the data into separate, related files. For example, assume a very simple order-entry system which needs to store the following data:

 

     Customer Name

     Customer Address

     ShipTo Address

     Order Date

     Product Ordered

     Quantity Ordered

     Unit Price

 

This data could all be stored in each record of one file, but that would be very inefficient. The Customer Name, Address, ShipTo Address, and Order Date would be duplicated for every item ordered on every order. To eliminate the duplication, you split the data into separate files.

 

Customer File:  Customer Name

         Customer Address

 

Order File:    ShipTo Address

         Order Date

 

Item File:    Product Ordered

         Quantity Ordered

         Unit Price

 

With this file configuration, the Customer File contains all the customer information, the Order File contains all the information that is pertinent to one order, and the Item File contains all the information for each item in the order. This certainly eliminates duplicate data. However, how do you tell which record in what file relates to what other records in which other files? This is the purpose of the relational terms "Primary Key" and "Foreign Key."

A Primary Key is an index into a file based on a field (or fields) that cannot contain duplicate or null values. To translate this to Clarion language terms: a Primary Key would be a unique KEY (no DUP attribute) with key components that are all REQuired fields for data entry. In strict relational database design, one Primary Key is required for every file.

A Foreign Key is an index into a file based on a field (or fields) which contain values that duplicate the values contained in the Primary Key fields of another, related, file. To re-state this, a Foreign Key contains a "reference" to the Primary Key of another file.

Primary Keys and Foreign Keys form the basis of file relationships in Relational Database. The matching values contained in the Primary and Foreign Keys are the "pointers" to the related records. The Foreign Key records in "File A" point back to the Primary Key record in "File B", and the Primary Key in "File B" points to the Foreign Key records in "File A."

Defining the Primary and Foreign Keys for the above example requires that you add some fields to the files to fulfill the relational requirements.

 

Customer File:  Customer Number - Primary Key

         Customer Name

         Customer Address

 

Order File:    Order Number - Primary Key

         Customer Number - Foreign Key

         ShipTo Address

         Order Date

 

Item File:    Order Number - 1st Primary Key Component and Foreign Key

         Product Ordered - 2nd Primary Key Component

         Quantity Ordered

         Unit Price

 

In the Customer File, there is no guarantee that there could not be duplicate Customer Names. Therefore, the Customer Number field is added to become the Primary Key. The Order Number has been added to the Order File as the Primary Key because there is no other field that is absolutely unique in that file. The Customer Number was also added as a Foreign Key to relate the Order File to the Customer File. The Item File now contains the Order Number as a Foreign Key to relate to the Order File. It also becomes the first component of the multiple component (Order Number, Product Ordered) Primary Key.

The Relational definitions of Primary Key and Foreign Key do not necessarily require the declaration of a Clarion KEY based on the Primary or Foreign Key. This means that, despite the fact that these Keys exist in theory, you will only declare a Clarion KEY if your application actually needs it for some specific file access. Generally speaking, most all Primary Keys will have a Clarion KEY, but fewer Foreign Keys need have Clarion KEYs declared.

 

File Relationship

There are three types of relationships that may be defined between any two files in a relational database: One-to-One; One-to-Many (also called Parent-Child) and its reverse view, Many-to-One; and Many-to-Many. These relationships refer to the number of records in one file that are related to some number of records in the second file.

In the previous example, the relationship between the Customer File and the Order File is One-to-Many. One Customer File record may be related to multiple Order File records. The Order File and the Item File also have a One-to-Many relationship, since one Order may have multiple Items. In business database applications, One-to-Many (Parent-Child) is the most common relationship between files.

A One-to-One relationship means that exactly one record in one file may be related to exactly one record in another file. This is useful in situations where a particular file may, or may not, need to have data in some fields. If all the fields are contained in one file, you can waste a lot of disk space with empty fields in those records that don't need the extra information. Therefore, you create a second file with a One-to-One relationship to the first file, to hold the possibly unnecessary fields.

To expand the previous example, an Order may, or may not, need to have a separate ShipTo Address. So, you could add a ShipTo File to the database design.

 

Order File: Order Number - Primary Key

           Customer Number - Foreign Key

           Order Date

 

ShipTo File:  Order Number - Primary Key and Foreign Key

             ShipTo Address

 

In this example, a record would be added to the ShipTo File only if an Order has to be shipped to some address other than the address in the Customer File. The ShipTo File has a One-to-One relationship with the Order File.

Many-to-Many is the most difficult file relationship with which to deal. It means that multiple records in one file are related to multiple records in another file. Expand the previous example to fit a manufacturing concern which buys Parts and makes Products. One Part may be used in many different Products, and one Product could use many Parts.

 

Parts File:   Part Number - Primary Key

        Part Description

 

Product File:  Product Number - Primary Key

        Product Description

 

Without going into the theory, let me simply state that this situation is handled by defining a third file, commonly referred to as a "Join" file. This Join file creates two One-to-Many relationships, as in this example:

 

Parts File:   Part Number - Primary Key

        Part Description

 

Parts2Prod File: Part Number - 1st Primary Key Component and Foreign Key

         Product Number - 2nd Primary Key Component and Foreign Key

         Quantity Used

 

Product File:  Product Number - Primary Key

        Product Description

 

The Parts2Prod File has a multiple component Primary Key and two Foreign Keys. The relationship between Parts and Parts2Prod is One-to-Many, and the relationship between Product and Parts2Prod is also One-to-Many. This makes the Join file the "middle-man" between two files with a Many-to-Many relationship.

An advantage of using a Join file is that there is usually some more information that logically should be stored there. In this case, the Quantity Used (of a Part in a Product) logically only belongs in the Parts2Prod file.

 

Translating the Theory to Clarion

In practical relational database design, a Clarion KEY may not need to be declared for the Primary Key on some files. If there is never a need to directly access individual records from that file, then a KEY definition based on the Primary Key is not necessary. Usually, this would be the Child file (of a Parent-Child relationship) whose records are only needed in conjunction with the Parent record.

A Clarion KEY also may not need to be declared for a Foreign Key. The determination to declare a KEY is dependent upon how you are going to access the file containing the Foreign Key. If you need to access the Foreign Key records from the Primary Key, a Clarion KEY is necessary. However, if the only purpose of the Foreign Key is to ensure that the value in the Foreign Key field value is valid, no Clarion KEY is needed. Take the previous theoretical examples and create Clarion file definitions:

 

Customer  FILE,DRIVER('Clarion'),PRE(Cus)

CustKey   KEY(Cus:CustNo)  !Primary KEY

Record     RECORD

CustNo      LONG   !Customer Number - Primary Key

Name        STRING(30)   !Customer Name

Address     STRING(30)   !Customer Address

          END

         END

 

Order    FILE,DRIVER('Clarion'),PRE(Ord)

OrderKey KEY(Ord:OrderNo)  !Primary KEY

CustKey  KEY(Ord:CustNo),DUP  !Foreign KEY

Record    RECORD

OrderNo    LONG   !Order Number - Primary Key

CustNo     LONG   !Customer Number - Foreign Key

Date       LONG    !Order Date

         END

        END

 

ShipTo  FILE,DRIVER('Clarion'),PRE(Shp)

OrderKey KEY(Shp:OrderNo)  !Primary KEY

Record    RECORD

OrderNo    LONG   !Order Number - Primary Key and Foreign Key

Address    STRING(30)   !ShipTo Address

         END

       END

 

Item    FILE,DRIVER('Clarion'),PRE(Itm)

OrderKey KEY(Itm:OrderNo,Itm:ProdNo) !Primary KEY

Record    RECORD

OrderNo    LONG   !Order - Primary Component and Foreign Key

ProdNo     LONG   !Prod. - Primary Component and Foreign Key

Quantity   SHORT   !Quantity Ordered

Price      DECIMAL(7,2)  !Unit Price

        END

       END

 

Product  FILE,DRIVER('Clarion'),PRE(Pro)

ProdKey   KEY(Pro:ProdNo)  !Primary KEY

Record    RECORD

ProdNo      LONG   !Product Number - Primary Key

Description STRING(30)   !Product Description

         END

        END

 

Parts2Prod  FILE,DRIVER('Clarion'),PRE(P2P)

ProdPartKey KEY(P2P:ProdNo,P2P:PartNo) !Primary KEY

PartProdKey KEY(P2P:PartNo,P2P:ProdNo) !Alternate KEY

Record      RECORD

PartNo       LONG   !Part - Primary Component and Foreign Key

ProdNo       LONG   !Prod. - Primary Component and Foreign Key

Quantity     SHORT

           END

           END

 

Parts     FILE,DRIVER('Clarion'),PRE(Par)

PartKey   KEY(Par:PartNo)  !Primary KEY

Record     RECORD

PartNo      LONG   !Part Number - Primary Key

Description STRING(30)   !Part Description

          END

         END

 

Notice that only one Foreign Key (in the Order file) was explicitly declared as a Clarion KEY. A number of Foreign Keys were included as part of Primary Key declarations, but this was simply good fortune.

The Primary Key (Itm:OrderKey) defined on the Item file is there to ensure that an order does not contain duplicate Products Ordered. If this were not a consideration, Itm:OrderKey would only contain Itm:OrderNo, and would have the DUP attribute to allow duplicate KEY values. This would make it a Foreign Key instead of a Primary Key, and the file would not have a KEY defined for the Primary Key.

The Item file and the Product file have a Many-to-One relationship, which is One-to-Many looked at from the reverse perspective. This reverse view is most often used for data entry verification look-up. This means the Product Number entered into the Item file's data entry procedure can look-up and verify the Product Number against the records in the Product file.

 

Referential Integrity

There is one more fundamental issue in the Relational Model which should be addressed: "Referential Integrity." This is an issue which must be resolved in the executable source code for an application, because it involves the active, run-time inter-relationship of the data within the database.

Referential Integrity means that no Foreign Key can contain a value that is not matched by some Primary Key value. Maintaining Referential Integrity in your database begets two questions which must be resolved:

.2.

What do you do when the user wants to delete the Primary Key record?

.3.

What do you do when the user wants to change the Primary Key value?

The three most common answers to each of these questions are: Restrict the action, Cascade the action, or (less commonly) Nullify the Foreign Key values. Of course, there may also be application-specific answers, such as copying all information to history files before performing the action, which should be implemented as required in individual programs.

Restrict the action means that when the user attempts to delete the Primary Key record, or change the Primary Key value, the action is only allowed if there are no Foreign Keys that reference that Primary Key. If related Foreign Keys do exist, the action is not allowed.

Cascade the action means that when the user attempts to delete the Primary Key record, or change the Primary Key value, the action cascades to include any Foreign Keys that reference that Primary Key. If related Foreign Keys do exist, the delete action also deletes those records, and the change action also changes the values in the Foreign Keys that reference that Primary Key.

There is one consideration that should be noted when you Cascade the action. What if the file you Cascade to (the Child file) is also the Parent of another Child file? This is a situation which you must detect and handle, because the Cascade action should affect all the dependent file records. When you are writing source code to handle this situation, you need to be aware of the file relationships and write code that Cascades the action as far it needs to go to ensure that nothing is "left hanging."

Nullify the Foreign Key means that when the user attempts to delete the Primary Key record, or change the Primary Key value, the Foreign Keys that reference that Primary Key are changed to null values (if the Foreign Key fields allow null values).

The Nullify option does not require as many changes as the Cascade option. This is because the Cascade has to delete all the related records in as many files as are related. Nullify only needs to null out the individual Foreign Keys that reference the Primary Key being changed or deleted.

 

Summary

·

Each data item should be stored once.

·

Separate files are used to eliminate data duplication.

·

Files are related by Primary and Foreign Keys.

·

A Primary Key is a unique (and non-null) index into a file which provides for individual record access.

·

A Foreign Key contains a reference to the Primary Key of some other file.

·

One-to-Many file relationships are the most common. They are also referred to as Parent-Child and Many-to-One (same relationship, reverse view).

·

One-to-One file relationships are most commonly created to hold data that is not always needed in every record.

·

Many-to-Many relationships require a "Join" file which acts as a broker between the two files. The Join file inserts two One-to-Many relationships between the Many-to-Many relationship.

·

Only those Primary and Foreign Keys that the application needs (as a practical consideration) for specific access to the files need to have Clarion KEYs declared.

·

Referential Integrity means that all Foreign Keys contain valid references to Primary Keys.

·

Maintaining Referential Integrity requires executable code that tests for Update or Delete of the Primary Key values.

·

The three common solutions to maintaining Referential Integrity are: Restricting (update/delete not allowed), Cascading (also update/delete the Foreign Key), or Nullifying the Foreign Key (assign null values to the Foreign Key).