Tuesday, October 27, 2009

What is a Database? Why Do We Use Them?

Before we get into what a database is, we should probably clarify what data are. Data are stored representations of objects and events that have meaning and importance in the user’s environment. Yeah, it’s hard to define concepts like data, and when you try, it ends up sounding so general that it could include anything. Basically, data are any collections of information that we find relevant or valuable. Your name, address, and social security number are data.

Now that we know what data are, how do we organize and collect data? This is the purpose of a database. A database is an organized collection of logically related data. This definition is much easier to grasp: it's a collection of data that relate to each other. This isn't as complicated as some people may think. A teacher's hard copy gradebook is technically a database. It holds the students' grades and how they factor into the final grade. A wedding guest book is a database. Folders of receipts could be considered a database. Any organized collection that you consult to look up the data you've gathered can be considered a database.

So… what makes this type of database different from the databases we always hear about? "Our database crashed, but thankfully we had a couple backups." Or the articles citing "databases of phone numbers" used in tracking crime. Most of the databases we hear about today are computer-based. These databases can be accessed automatically by programs and websites that show information to the user. So while with a hard copy gradebook the teacher would have to open up the book and consult the data directly, with a computer-based database another program typically accesses it and displays the results in a nice fashion. For example, Facebook has a (pretty extensive) database with all of its users, their information, and the relationships between users. Obviously if we looked at the database itself we would just see lines of information, but Facebook's programs access the database and serve it up nice and pretty.
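To make that "another program accesses it and displays it" idea concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The gradebook table, the student names, and the scores are all made up for illustration; a real gradebook program would look different.

```python
import sqlite3


def build_gradebook():
    """Create a tiny in-memory gradebook database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE grades (student TEXT, assignment TEXT, score REAL)")
    conn.executemany(
        "INSERT INTO grades VALUES (?, ?, ?)",
        [
            ("Ana", "Quiz 1", 90.0),
            ("Ana", "Quiz 2", 80.0),
            ("Ben", "Quiz 1", 70.0),
            ("Ben", "Quiz 2", 100.0),
        ],
    )
    return conn


def report(conn):
    """Ask the database for each student's average and format it for display."""
    rows = conn.execute(
        "SELECT student, AVG(score) FROM grades GROUP BY student ORDER BY student"
    ).fetchall()
    return [f"{student}: {avg:.1f}" for student, avg in rows]


if __name__ == "__main__":
    for line in report(build_gradebook()):
        print(line)
```

The database only holds the raw rows; the report function is the "program" that reads them and turns them into something pleasant to look at.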

What are your options for creating a database, you ask (or even if you didn't)? Well, one easy way to create a database is with Microsoft Excel. Many people create databases with Excel, putting information like bookkeeping figures, class information, hour logs, inventory, etc. into spreadsheets. This is obviously data, correct? And it's being collected in one spot that relates it, correct? Therefore, we consider it one type of database. However, there is another type of database that most people have in mind when they use the term. A relational database, the type of database created in Microsoft Access, establishes the relationships between entities by means of common fields included in a file, called a “relation.” WHOA! We just skipped over a lot of information. Basically, the difference between a relational database and the databases we've been talking about is the following…

While “traditional” databases simply have columns of information and look more like spreadsheets, relational databases have what we call tables. These tables are basically like mini-spreadsheets. Instead of having one huge spreadsheet with all of the information in it, much of it repeated, you have many smaller tables that all relate to each other. So instead of saying "food" three dozen times and "clothing" two dozen times in the same column, a personal finance table will say "1" or "2" in that column. This table then relates to another table, called... say... "Expenditure Type," that has the code "1" in one column and what it means, "food," in the second column. The same idea is applied to every piece of data in a table that might be repeated, in order to eliminate the repetition (this process of eliminating repeated data is called data normalization). So a different table would be created called "Restaurants," with a code given to each restaurant, and perhaps another table would be created listing each member of the family who might spend money. While this process may seem to cause more trouble for the creator, a relational database is MUCH more useful than a spreadsheet-like database, and will be much less of a hassle in the long run. (To see why, read the later post, "What's so Great About a Relational Database?")
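Here is a rough sketch of the personal finance example in Python, using the built-in sqlite3 module. The table names, column names, and dollar amounts are assumptions made up for illustration; the point is only that the expense table stores the codes "1" and "2" while a second table says what each code means.

```python
import sqlite3


def build_finance_db():
    """Build a small in-memory database with two related tables (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Lookup table: each code is spelled out exactly once.
        CREATE TABLE expenditure_type (
            code INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        -- Expense table: stores the code instead of repeating the word.
        CREATE TABLE expense (
            id        INTEGER PRIMARY KEY,
            amount    REAL NOT NULL,
            type_code INTEGER NOT NULL REFERENCES expenditure_type(code)
        );
        INSERT INTO expenditure_type (code, name) VALUES (1, 'food'), (2, 'clothing');
        INSERT INTO expense (amount, type_code) VALUES (12.50, 1), (40.00, 2), (7.25, 1);
        """
    )
    return conn


def totals_by_type(conn):
    """Join the two tables on the shared code to get readable totals."""
    return conn.execute(
        """
        SELECT t.name, SUM(e.amount)
        FROM expense AS e
        JOIN expenditure_type AS t ON t.code = e.type_code
        GROUP BY t.name
        ORDER BY t.name
        """
    ).fetchall()


if __name__ == "__main__":
    for name, total in totals_by_type(build_finance_db()):
        print(f"{name}: {total:.2f}")
```

The word "food" is stored once, no matter how many food expenses get recorded, and the join on the shared code is what links the two tables back together.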

To conclude, a database is basically a collection of information that you want to store. The kinds of databases that are used on computers are typically relational databases. A relational database is a database that includes several tables of data and links them together rather than repeating the same data over and over.
