0:01 In this video, we'll learn about the Relational Model. 0:03 The Relational Model is more than 0:05 35 years old, and it's 0:07 really the foundation of database management systems. 0:09 It's spawned a many billion dollar industry. 0:13 The relational model underlies all 0:15 commercial database systems at this point in time. 0:18 It's actually an extremely simple model and that's one of its benefits. 0:22 Furthermore, it can be queried. 0:24 By that I mean we can ask 0:25 questions of databases in the 0:26 model using High Level Languages. 0:29 High Level Languages are simple, yet 0:31 extremely expressive for asking questions over the database. 0:35 And finally, very importantly there 0:37 are extremely efficient implementations of 0:39 the relational model and of 0:41 the query languages on that model. 0:43 So let's move ahead and 0:45 understand the basic constructs in the relational model. 0:48 So, the primary construct is in fact, the relation. 0:51 A database consists of a 0:53 set of relations or sometimes 0:55 referred to as "tables", each of which has a name. 0:58 So, we're gonna use two relations in our example. 1:01 Our example is gonna be a fictitious database 1:03 about students applying to colleges. 1:05 For now we're just gonna look at the students and colleges themselves. 1:08 So we're gonna have two tables, 1:09 and let's call those tables 1:11 the Student table and the College table. 1:15 Now, as an aside, there's 1:16 a healthy debate in the 1:18 database world about whether tables 1:20 relations ought to be named using the singular or the plural. 1:23 I personally don't have a stake in that debate. 1:25 I'm going to use the singular. 1:27 Next, we have the concept of attributes. 1:30 So every relation and relational 1:32 database has a predefined set 1:34 of columns or attributes each of which has a name. 1:37 So, for our student table, 1:39 let's say that each student 1:41 is gonna have an ID, a 1:42 name, a GPA and a photo. 1:46 And for our college 1:48 table, let's say that every 1:50 college is going to have 1:51 a name, a state, and an enrollment. 1:55 We'll just abbreviate that ENR. 1:57 So those are the labeled columns. 1:59 Now the actual data itself 2:01 is stored in what are 2:02 called the tuples (or the rows) in the tables. 2:05 So let's put a couple 2:06 of the data tables, data tuples in our tables. 2:09 So let's start with the 2:12 students and lets say that 2:14 our first student has ID 123. 2:15 Name is Amy - 2:18 GPA 3.9, and she's happy with that. 2:21 So she has a smiley photo. 2:23 And our second student is 234, 2:26 Bob - his GPA 2:29 is 3.4. 2:30 He's not quite as happy. 2:32 And typically of course a 2:33 table will have thousands, maybe 2:35 millions, even sometimes billions of 2:37 rows, each row containing a value for each attribute. 2:41 In our college table, let's suppose 2:43 we have .... well of course 2:44 we're going to start with Stanford in 2:46 the state of California and Stanford's enrollment is 15,000. 2:52 We'll include our cross-bay rival Berkeley 2:54 again, in the state of California. 2:56 Berkeley's enrollment is a whopping 36,000. 2:58 And last of all, 3:01 we are going to not be West Coast biased. 3:04 We'll include MIT in the 3:05 state of Massachusetts with an enrollment of 10,000. 3:08 Now of course there's 3:09 gonna be again many more 3:11 tuples in the college table, and 3:12 many more tuples in the student table. 3:15 Okay, next, let me 3:16 mention that in a relational 3:18 database, typically each attribute or 3:19 column has a type 3:21 sometimes referred to as a domain. 3:23 For example, the ID might 3:25 be an integer, the name 3:26 might be a string, GPA might 3:28 be a float, photo might 3:29 be a jpeg file. 3:32 We do also in most relational 3:34 databases have a concept of enumerated domain. 3:37 So for example, the state might 3:38 be an enumerated domain for 3:40 the 50 abbreviations for states. 3:43 Now, it's typical for relational 3:45 databases to have just 3:46 atomic types in their 3:48 attributes as we have 3:50 here, but many database 3:52 systems do also support structured 3:54 types inside attributes. 3:58 Okay, a little bit more terminology. 4:00 The schema of a 4:01 database is the structure of the relation. 4:04 So the schema includes the name 4:07 of the relation and the attributes 4:09 of the relation and the types of those attributes. 4:12 Where the instance is the 4:13 actual contents of the table at a given point in time. 4:16 So, typically you set up 4:17 a schema in advance, then the 4:19 instances of the data will change over time. 4:23 Now, I mentioned that most columns have types. 4:26 But, there's also a special value 4:28 that's in any type of 4:29 any column and that's a 4:31 special value known as null, and nulls 4:32 are actually quite important in relational databases. 4:35 Null values are used to 4:36 denote that a particular 4:38 value is maybe unknown or undefined. 4:42 And, so let's suppose, let's add 4:44 another tuple to our database. 4:47 Let's say 345, another student 4:50 named Craig, and for, 4:51 whatever reason Craig doesn't have a GPA. 4:53 Maybe Craig is home schooled, 4:55 maybe Craig doesn't want to reveal his GPA. 4:57 So then the database would contain 4:59 a null value for Craig 5:01 and we'll just put a neutral face there. 5:04 Or, for example, maybe Bob doesn't 5:07 want to have his photo in 5:08 the database, so then Bob 5:10 would have a null value 5:12 for his photo, again nulls can go anywhere. 5:15 Now null values are useful 5:17 but one has to be very 5:18 careful in a database system 5:19 when you run queries 5:21 over relations that have null values. 5:23 In a later video we'll go 5:24 into this in more detail, but I 5:25 just wanted to give a just 5:26 sort of example of what can happen. 5:29 So, let's suppose we're 5:30 asking a query over our 5:31 student table of all students 5:33 whose GPA is greater than 3.5. 5:36 So when we run 5:37 that query on our database 5:38 obviously we'll get Amy out, 5:40 obviously we won't get Bob out, but should we get Craig? 5:43 The answer is No. 5:44 We don't know for a fact 5:46 that Craig's GPA is greater 5:47 than 3.5, so we'll only get 5:48 one student out from that query. 5:50 Now let's suppose we 5:52 had another query, where we 5:53 were gonna ask for the GPA 5:55 less than or equal to 3.5. 5:57 So, similarly where we 5:59 would not have Amy in result 6:01 and we would certainly have Bob in 6:02 the result and similarly would 6:04 not have Craig in the 6:05 result because we don't know that 6:06 his GPA is less than or equal to 3.5. 6:08 So far so good, but 6:10 it gets a little weird is 6:11 when we add an or here 6:13 in our query, we say I 6:14 want everyone who's GPA is greater 6:16 than 3.5 or who's GPA is less than or equal to 3.5. 6:18 And even though it 6:20 looks like every tuple should 6:22 satisfy this condition, that it's 6:23 always true, that's not the case when we have null values. 6:27 So, that's why one has to 6:28 be careful when one uses 6:29 null values in relational databases. 6:32 Let me erase this now and 6:33 let's move on to our next concept 6:35 which is the concept of Key. 6:38 Key is again another important concept in relational databases. 6:42 And, a key is an 6:43 attribute in of a relation 6:45 where every value for that attribute is unique. 6:49 So if we look at the 6:50 student relation, we can feel 6:52 pretty confident that the ID is going to be a key. 6:54 In other words, every tuple is 6:56 going to have a unique for ID. 6:59 Thinking about the college relation, it's a little less clear. 7:02 We might be tempted to say 7:03 that the name of the college is 7:04 an ID, that actually college 7:06 names probably are not unique across the country. 7:10 There's probably a lot of or 7:10 several colleges named Washington college for example. 7:14 You know what, we're allowed to 7:15 have sets of attributes that 7:16 are unique and that makes sense in the college relation. 7:19 Most likely the combination of 7:21 the name and state of 7:22 a college is unique, and that's 7:24 what we would identify as the key for the college relation. 7:27 Now, you might wonder why it's 7:28 even important to have attributes that are identified as keys. 7:32 There's actually several uses for them. 7:34 One of them is just to identify specific tuples. 7:37 So if you want to run 7:38 a query to get a specific 7:40 tuple out of the database you 7:42 would do that by asking for that tuple by its key. 7:45 And related to that 7:46 database systems for efficiency 7:48 tend to build special index 7:50 structures or store the database in a particular way. 7:53 So it's very fast to find a tuple based on its key. 7:56 And lastly, if one relation 7:58 in a relational database wants 8:00 to refer to tuples of another, 8:02 there 's no concept of pointer in relational databases. 8:05 Therefore, the first relation 8:07 will typically refer to a 8:08 tuple in the second relation by its unique key. 8:11 As our videos develop we'll see the importance of keys. 8:15 Okay, just to wrap up, 8:16 I'll mention how one creates relations 8:19 or tables in the SQL language. 8:21 It's very simple, you just 8:22 say "create table," give the 8:24 name of the relation and a list of the attributes. 8:27 And if you want to give types for the attributes. 8:29 It's similar except you follow 8:31 each attribute name with its type. 8:33 So to wrap up, the relational 8:35 model has been around a long time. 8:37 Has started a huge industry. 8:39 It's used by all database systems. 8:41 As you've seen it's a very 8:42 simple model and will shortly 8:43 see that it can be queried with very nice languages. 8:46 And, finally, it's been implemented very efficiently.