Wednesday, October 28, 2009

What is Database Modeling?

Database Modeling is the act of planning a database. Obviously, so far the relationships seem really confusing. This is because quite a bit of planning usually goes into database construction. The diagram below is an example of an old-fashioned drawn ER-Diagram (or an Entity-Relationship Diagram, which is a diagram relating the tables of a database to each other with symbols expressing whether the relationship is one-to-many, one-to-one, or many-to-many).


Each circle here is an "Entity," which is just a fancy way of saying table in this case. Each line represents a relationship, and each ratio represents the type of relationship. This appears to be a database of road law violators.

If you see the line between "Citation Notice" and "Location," you will see "1:1." This means that each citation notice will only have one location. This makes sense... right? You aren't going to have the same violation instance in two places, or else it would be two different instances. However, it seems to me that this relationship should be "M:1" as per the notation here, because you can have many different violation instances in the same location. Relationships that are 1:1 typically are pointless in database construction, because this says that each time you have a violation you are going to have a different location, and if that were the case you would just put that as a field in "Citation Notice" rather than as a separate table. However, this is just my opinion, and it wasn't my database. That's what I get for googling "ER diagram."

Now look at the line between "Citation Notice" and "Violation." That says "M:M." This makes sense, because the "Violation" table would be just the types of violations that are possible, and the "Citation Notice" table is all of the actual occurrences of the violations. At one instance a person could be committing several violations, and obviously violations are going to be committed in more than one instance by more than one person. Many-to-many relationships in databases pose a funny problem. This may seem confusing, but you are basically saying that we should put several violations connecting with several citation notices. This is impossible. In each table you only have one field to make the connection. As a result, with many-to-many relationships one has to create a child table, or a table that only exists to connect two many-to-many relationships. This table would typically be called "Violation to Citation Notice" or vise versa, and would simply have one field for violationID and one field for citationNoticeID. The uniqueness wouldn't occur with a primary key (mentioned "General Database Rules..."), but rather with both of the fields in combination. So the data would still be unique (which is the point of a database), it just wouldn't be in one column.

Below there is a more advanced ER Diagram.


The interesting things about this diagram is that (1.) this diagram is created automatically from access when you create relationships via linking keys, and (2.) this includes the field values of each table and illustrates which fields are linking with other tables, and which fields are receiving the link.

In the recent past, database modeling and diagramming has lost emphasis, because with more complex data structures, like hospital data needs, planning is only going to get you so far. It is also usually a tedious process. More emphasis has been placed on experimenting with data structures to see what works. Also, data structures are available for common data needs, like hospitals, banks, and small businesses.

The following is an excellent video on how to take a business' data needs and convert it into a database diagram:


No comments:

Post a Comment