Reduction to Relational Schemas

We can represent a database that conforms to an E-R database schema by a col- lection of relation schemas. For each entity set and for each relationship set in the database design, there is a unique relation schema to which we assign the name of the corresponding entity set or relationship set.

Both the E-R model and the relational database model are abstract, logical representations of real-world enterprises. Because the two models employ similar design principles, we can convert an E-R design into a relational design.

In this section, we describe how an E-R schema can be represented by relation schemas and how constraints arising from the E-R design can be mapped to constraints on relation schemas.

Representation of Strong Entity Sets with Simple Attributes

Let E be a strong entity set with only simple descriptive attributes a1, a2, . . . , an. We represent this entity by a schema called E with n distinct attributes. Each tuple in a relation on this schema corresponds to one entity of the entity set E.

For schemas derived from strong entity sets, the primary key of the entity set serves as the primary key of the resulting schema. This follows directly from the fact that each tuple corresponds to a specific entity in the entity set.

As an illustration, consider the entity set student of the E-R diagram in Fig- ure 7.15. This entity set has three attributes: ID, name, tot cred. We represent this entity set by a schema called student with three attributes:

student (ID, name, tot_cred)

Note that since student ID is the primary key of the entity set, it is also the primary key of the relation schema.

Continuing with our example, for the E-R diagram in Figure 7.15, all the strong entity sets, except time slot, have only simple attributes. The schemas derived from these strong entity sets are:

classroom (building, room number, capacity) department (dept name, building, budget) course (course id, title, credits) instructor (ID, name, salary) student (ID, name, tot cred)

As you can see, both the instructor and student schemas are different from the schemas we have used in the previous chapters (they do not contain the attribute dept name). We shall revisit this issue shortly.

7.6.2 Representation of Strong Entity Sets with Complex Attributes

When a strong entity set has nonsimple attributes, things are a bit more complex. We handle composite attributes by creating a separate attribute for each of the component attributes; we do not create a separate attribute for the composite attribute itself. To illustrate, consider the version of the instructor entity set de- picted in Figure 7.11. For the composite attribute name, the schema generated for instructor contains the attributes first name, middle name, and last name; there is no separate attribute or schema for name. Similarly, for the composite attribute address, the schema generated contains the attributes street, city, state, and zip code. Since street is a composite attribute it is replaced by street number, street name, and apt number. We revisit this matter in Section 8.2.

Multivalued attributes are treated differently from other attributes. We have seen that attributes in an E-R diagram generally map directly into attributes for the appropriate relation schemas. Multivalued attributes, however, are an exception; new relation schemas are created for these attributes, as we shall see shortly.

Derived attributes are not explicitly represented in the relational data model. However, they can be represented as “methods” in other data models such as the object-relational data model, which is described later in Chapter 22.

The relational schema derived from the version of entity set instructor with complex attributes, without including the multivalued attribute, is thus:

instructor (ID, first_name, middle_name, last_name,street_number,street_name, apt_number,city,state, zip_code, date_of_birth)

For a multivalued attribute M, we create a relation schema R with an attribute A that corresponds to M and attributes corresponding to the primary key of the entity set or relationship set of which M is an attribute.

As an illustration, consider the E-R diagram in Figure 7.11 that depicts the entity set instructor, which includes the multivalued attribute phone number. The primary key of instructor is ID. For this multivalued attribute, we create a relation schema instructor phone (ID, phone number)

Each phone number of an instructor is represented as a unique tuple in the relation on this schema. Thus, if we had an instructor with ID 22222, and phone numbers 555-1234 and 555-4321, the relation instructor phone would have two tuples (22222, 555-1234) and (22222, 555-4321).

We create a primary key of the relation schema consisting of all attributes of the schema. In the above example, the primary key consists of both attributes of the relation instructor phone.

In addition, we create a foreign-key constraint on the relation schema created from the multivalued attribute, with the attribute generated from the primary key of the entity set referencing the relation generated from the entity set. In the above example, the foreign-key constraint on the instructor phone relation would be that attribute ID references the instructor relation.

In the case that an entity set consists of only two attributes — a single primary- key attribute B and a single multivalued attribute M — the relation schema for the entity set would contain only one attribute, namely the primary-key attribute B. We can drop this relation, while retaining the relation schema with the attribute B and attribute A that corresponds to M.

To illustrate, consider the entity set time slot depicted in Figure 7.15. Here, time slot id is the primary key of the time slot entity set and there is a single multivalued

attribute that happens also to be composite. The entity set can be represented by just the following schema created from the multivalued composite attribute:

time slot (time slot id, day, start time, end time)

Although not represented as a constraint on the E-R diagram, we know that there cannot be two meetings of a class that start at the same time of the same day-of- the-week but end at different times; based on this constraint, end time has been omitted from the primary key of the time slot schema.

The relation created from the entity set would have only a single attribute time slot id; the optimization of dropping this relation has the benefit of simplifying

the resultant database schema, although it has a drawback related to foreign keys, which we briefly discuss in Section 7.6.4.

7.6.3 Representation of Weak Entity Sets

Let A be a weak entity set with attributes a_1, a_2_, . . . , am_. Let B be the strong entity set on which A depends. Let the primary key of B consist of attributes b_1, b_2_, . . . , bn_. We represent the entity set A by a relation schema called A with one attribute for each member of the set:

{a 1, a 2, . . . , am}{b 1, b 2, . . . , bn}

For schemas derived from a weak entity set, the combination of the pri- mary key of the strong entity set and the discriminator of the weak entity set serves as the primary key of the schema. In addition to creating a primary key, we also create a foreign-key constraint on the relation A, specifying that the attributes b 1, b 2, . . . , bn reference the primary key of the relation B. The foreign- key constraint ensures that for each tuple representing a weak entity, there is a corresponding tuple representing the corresponding strong entity.

As an illustration, consider the weak entity set section in the E-R diagram of Figure 7.15. This entity set has the attributes: sec id, semester, and year. The primary key of the course entity set, on which section depends, is course id. Thus, we represent section by a schema with the following attributes:

section (course id, sec id, semester, year)

The primary key consists of the primary key of the entity set course, along with the discriminator of section, which is sec id, semester, and year. We also create a foreign-key constraint on the section schema, with the attribute course id refer- encing the primary key of the course schema, and the integrity constraint “on delete cascade”.7 Because of the “on delete cascade” specification on the foreign key constraint, if a course entity is deleted, then so are all the associated section entities.

7.6.4 Representation of Relationship Sets

Let R be a relationship set, let a_1, a_2_, . . . , am_ be the set of attributes formed by the union of the primary keys of each of the entity sets participating in R, and let the descriptive attributes (if any) of R be b_1, b_2_, . . . , bn_. We represent this relationship set by a relation schema called R with one attribute for each member of the set:

{a 1, a 2, . . . , am}{b 1, b 2, . . . , bn}

We described earlier, in Section 7.3.3, how to choose a primary key for a binary relationship set. As we saw in that section, taking all the primary-key attributes from all the related entity sets serves to identify a particular tuple, but for one-to- one, many-to-one, and one-to-many relationship sets, this turns out to be a larger set of attributes than we need in the primary key. The primary key is instead chosen as follows:

• For a binary many-to-many relationship, the union of the primary-key at- tributes from the participating entity sets becomes the primary key.

• For a binary one-to-one relationship set, the primary key of either entity set can be chosen as the primary key. The choice can be made arbitrarily.

• For a binary many-to-one or one-to-many relationship set, the primary key of the entity set on the “many” side of the relationship set serves as the primary key.

• For an n-ary relationship set without any arrows on its edges, the union of the primary key-attributes from the participating entity sets becomes the primary key.

• For an n-ary relationship set with an arrow on one of its edges, the primary keys of the entity sets not on the “arrow” side of the relationship set serve as the primary key for the schema. Recall that we allowed only one arrow out of a relationship set.

We also create foreign-key constraints on the relation schema R as follows: For each entity set Ei related to relationship set R, we create a foreign-key con- straint from relation schema R, with the attributes of R that were derived from primary-key attributes of Ei referencing the primary key of the relation schema representing Ei .

As an illustration, consider the relationship set advisor in the E-R diagram of Figure 7.15. This relationship set involves the following two entity sets:

instructor with the primary key ID.

student with the primary key ID.

Since the relationship set has no attributes, the advisor schema has two attributes, the primary keys of instructor and student. Since both attributes have the same name, we rename them i ID and s ID. Since the advisor relationship set is many- to-one from student to instructor the primary key for the advisor relation schema is s ID.

We also create two foreign-key constraints on the advisor relation, with at- tribute i ID referencing the primary key of instructor and attribute s ID referencing the primary key of student.

Continuing with our example, for the E-R diagram in Figure 7.15, the schemas derived from a relationship set are depicted in Figure 7.16.

Observe that for the case of the relationship set prereq, the role indicators associated with the relationship are used as attribute names, since both roles refer to the same relation course.

Similar to the case of advisor, the primary key for each of the relations sec course, sec time slot, sec class, inst dept, stud dept and course dept consists of the primary key of only one of the two related entity sets, since each of the corresponding relationships is many-to-one.

Foreign keys are not shown in Figure 7.16, but for each of the relations in the figure there are two foreign-key constraints, referencing the two relations created from the two related entity sets. Thus, for example, sec course has foreign keys referencing section and classroom, teaches has foreign keys referencing instructor and section, and takes has foreign keys referencing student and section.

The optimization that allowed us to create only a single relation schema from the entity set time slot, which had a multivalued attribute, prevents the creation of a foreign key from the relation schema sec time slot to the relation created from entity set time slot, since we dropped the relation created from the entity set time

teaches (ID, course id, sec id, semester, year) takes (ID, course id, sec id, semester, year, grade) prereq (course id, prereq id) advisor (s ID, i ID) sec course (course id, sec id, semester, year) sec time slot (course id, sec id, semester, year, time slot id) sec class (course id, sec id, semester, year, building, room number) inst dept (ID, dept name) stud dept (ID, dept name) course dept (course id, dept name)

Figure 7.16 Schemas derived from relationship sets in the E-R diagram in Figure 7.15.

slot. We retained the relation created from the multivalued attribute, and named it time slot, but this relation may potentially have no tuples corresponding to a time slot id, or may have multiple tuples corresponding to a time slot id; thus, time slot id in sec time slot cannot reference this relation.

The astute reader may wonder why we have not seen the schemas sec course, sec time slot, sec class, inst dept, stud dept, and course dept in the previous chapters. The reason is that the algorithm we have presented thus far results in some schemas that can be either eliminated or combined with other schemas. We ex- plore this issue next.

7.6.4.1 Redundancy of Schemas

A relationship set linking a weak entity set to the corresponding strong entity set is treated specially. As we noted in Section 7.5.6, these relationships are many-to- one and have no descriptive attributes. Furthermore, the primary key of a weak entity set includes the primary key of the strong entity set. In the E-R diagram of Figure 7.14, the weak entity set section is dependent on the strong entity set course via the relationship set sec course. The primary key of section is {course id, sec id, semester, year} and the primary key of course is course id. Since sec course has no descriptive attributes, the sec course schema has attributes course id, sec id, semester, and year. The schema for the entity set section includes the attributes course id, sec id, semester, and year (among others). Every (course id, sec id, semester, year) combination in a sec course relation would also be present in the relation on schema section, and vice versa. Thus, the sec course schema is redundant.

In general, the schema for the relationship set linking a weak entity set to its corresponding strong entity set is redundant and does not need to be present in a relational database design based upon an E-R diagram.

7.6.4.2 Combination of Schemas

Consider a many-to-one relationship set AB from entity set A to entity set B. Using our relational-schema construction algorithm outlined previously, we get three schemas: A, B, and AB. Suppose further that the participation of A in the relationship is total; that is, every entity a in the entity set B must participate in the relationship AB. Then we can combine the schemas A and AB to form a single schema consisting of the union of attributes of both schemas. The primary key of the combined schema is the primary key of the entity set into whose schema the relationship set schema was merged.

To illustrate, let’s examine the various relations in the E-R diagram of Fig- ure 7.15 that satisfy the above criteria:

inst dept. The schemas instructor and department correspond to the entity sets A and B, respectively. Thus, the schema inst dept can be combined with the instructor schema. The resulting instructor schema consists of the attributes {ID, name, dept name, salary}.

stud dept. The schemas student and department correspond to the entity sets A and B, respectively. Thus, the schema stud dept can be combined with the student schema. The resulting student schema consists of the attributes {ID, name, dept name, tot cred}.

course dept. The schemas course and department correspond to the entity sets A and B, respectively. Thus, the schema course dept can be combined with the course schema. The resulting course schema consists of the attributes {course id, title, dept name, credits}.

sec class. The schemas section and classroom correspond to the entity sets A and B, respectively. Thus, the schema sec class can be combined with the section schema. The resulting section schema consists of the attributes {course id, sec id, semester, year, building, room number}.

sec time slot. The schemas section and time slot correspond to the entity sets A and B respectively, Thus, the schema sec time slot can be combined with the section schema obtained in the previous step. The resulting section schema consists of the attributes {course id, sec id, semester, year, building, room number, time slot id}.

In the case of one-to-one relationships, the relation schema for the relationship set can be combined with the schemas for either of the entity sets.

We can combine schemas even if the participation is partial by using null values. In the above example, if inst dept were partial, then we would store null values for the dept name attribute for those instructors who have no associated department.

Finally, we consider the foreign-key constraints that would have appeared in the schema representing the relationship set. There would have been foreign-key constraints referencing each of the entity sets participating in the relationship set. We drop the constraint referencing the entity set into whose schema the relationship set schema is merged, and add the other foreign-key constraints to the combined schema. For example, inst dept has a foreign key constraint of the attribute dept name referencing the department relation. This foreign constraint is added to the instructor relation when the schema for inst dept is merged into instructor.


Classes
Quiz
Videos
References
Books