Removing Redundant Attributes in Entity Sets
When we design a database using the E-R model, we usually start by identifying those entity sets that should be included. For example, in the university organiza- tion we have discussed thus far, we decided to include such entity sets as student, instructor, etc. Once the entity sets are decided upon, we must choose the appro- priate attributes. These attributes are supposed to represent the various values we want to capture in the database. In the university organization, we decided that for the instructor entity set, we will include the attributes ID, name, dept name, and salary. We could have added the attributes: phone number, office number, home page, etc. The choice of what attributes to include is up to the designer, who has a good understanding of the structure of the enterprise. Once the entities and their corresponding attributes are chosen, the relationship sets among the various entities are formed. These relationship sets may result in a situation where attributes in the various entity sets are redundant and need to be removed from the original entity sets. To illustrate, consider the entity sets instructor and department:
• The entity set instructor includes the attributes ID, name, dept name, and salary, with ID forming the primary key.
• The entity set department includes the attributes dept name, building, and bud- get, with dept name forming the primary key.
We model the fact that each instructor has an associated department using a relationship set inst dept relating instructor and department.
The attribute dept name appears in both entity sets. Since it is the primary key for the entity set department, it is redundant in the entity set instructor and needs to be removed.
Removing the attribute dept name from the instructor entity set may appear rather unintuitive, since the relation instructor that we used in the earlier chap- ters had an attribute dept name. As we shall see later, when we create a relational schema from the E-R diagram, the attribute dept name in fact gets added to the relation instructor, but only if each instructor has at most one associated depart- ment. If an instructor has more than one associated department, the relationship between instructors and departments is recorded in a separate relation inst dept.
Treating the connection between instructors and departments uniformly as a relationship, rather than as an attribute of instructor, makes the logical relationship explicit, and helps avoid a premature assumption that each instructor is associated with only one department.
Similarly, the student entity set is related to the department entity set through the relationship set student dept and thus there is no need for a dept name attribute in student.
As another example, consider course offerings (sections) along with the time slots of the offerings. Each time slot is identified by a time slot id, and has associated with it a set of weekly meetings, each identified by a day of the week, start time, and end time. We decide to model the set of weekly meeting times as a multivalued composite attribute. Suppose we model entity sets section and time slot as follows:
• The entity set section includes the attributes course id, sec id, semester, year, building, room number, and time slot id, with (course id, sec id, year, semester) forming the primary key.
• The entity set time slot includes the attributes time slot id, which is the primary key,4 and a multivalued composite attribute {(day, start time, end time)}.5
These entities are related through the relationship set sec time slot. The attribute time slot id appears in both entity sets. Since it is the primarykey for the entity set time slot, it is redundant in the entity set section and needs to be removed.
As a final example, suppose we have an entity set classroom, with attributes building, room number, and capacity, with building and room number forming the primary key. Suppose also that we have a relationship set sec class that relates section to classroom. Then the attributes {building, room number} are redundant in the entity set section.
A good entity-relationship design does not contain redundant attributes. For our university example, we list the entity sets and their attributes below, with primary keys underlined:
• classroom: with attributes (building, room number, capacity).
• department: with attributes (dept name, building, budget).
• course: with attributes (course id, title, credits).
• instructor: with attributes (ID, name, salary).
• section: with attributes (course id, sec id, semester, year).
• student: with attributes (ID, name, tot cred).
• time slot: with attributes (time slot id, {(day, start time, end time) }).
The relationship sets in our design are listed below:
• inst dept: relating instructors with departments.
• stud dept: relating students with departments.
• teaches: relating instructors with sections.
• takes: relating students with sections, with a descriptive attribute grade.
• course dept: relating courses with departments.
• sec course: relating sections with courses.
• sec class: relating sections with classrooms.
• sec time slot: relating sections with time slots.
• advisor: relating students with instructors.
• prereq: relating courses with prerequisite courses.
You can verify that none of the entity sets has any attribute that is made redundant by one of the relationship sets. Further, you can verify that all the information (other than constraints) in the relational schema for our university database, which we saw earlier in Figure 2.8 in Chapter 2, has been captured by the above design, but with several attributes in the relational design replaced by relationships in the E-R design.