0:00 This video covers functional dependencies. 0:02 First a quick recap of relational design by decomposition. 0:06 The idea is that the application 0:08 designer writes mega relations 0:10 that contain all the information that 0:12 we want to have and properties of the data that we're storing. 0:15 And then the system will automatically 0:17 decompose those based on the properties that are specified. 0:20 The final set of decomposed relations 0:22 will satisfy what's called the 0:23 normal form, and normal forms 0:25 are good relations in the sense 0:26 that they have no anomalies and they 0:28 don't lose information from what 0:29 was specified in the original mega relations. 0:32 Now the properties themselves are defined 0:34 either as functional dependencies in 0:36 which case the system will 0:37 generate Boyce-Codd Normal Form relations 0:39 or multi-value dependencies, which 0:42 will then yield fourth normal form relations. 0:46 So this video, as you 0:47 can tell, is about functional dependencies themselves. 0:50 And let me say that functional 0:52 dependencies are actually a generally 0:53 useful concept in databases not only for relational design. 0:57 So for functional dependencies, 0:59 as we'll see soon, are a generalization 1:01 of the notion of keys and they 1:03 allow the system to, for 1:05 example, store the data more 1:06 efficiently when the system knows about functional dependencies. 1:09 Compression schemes can be 1:10 used based on functional dependencies for 1:12 storage, and also functional 1:14 dependencies, as a generalization of 1:15 keys, can be used to 1:17 reason about queries and for query optimization. 1:20 Which is a reminder of a very important aspect of database systems. 1:23 Which allows declarative queries to be executed by the system efficiently. 1:28 By the way a third use of 1:29 functional dependencies is for exam 1:32 questions in database courses because 1:34 there is a very nice theory functional dependencies as you'll see. 1:37 It's quite easy to write questions about them. 1:39 The remainder of the video will 1:40 cover functional dependencies in general 1:42 as a general concept and not 1:44 specifically to relational design and 1:46 then later videos will tie functional 1:48 dependencies back to design by decomposition. 1:51 As always we'll be using as 1:53 a sample a college application 1:54 database and this case I've 1:56 expanded the information that we're including quite a bit. 1:59 We'll be using these same two 2:00 relations as examples throughout 2:02 this video and subsequent videos on relational design. 2:05 In this case we're gonna 2:06 look at two relations, one with 2:07 information about students and then 2:09 a separate one with information about where they're applying. 2:11 The student information will have social 2:13 security number, the student's name, 2:15 their address and then three 2:16 attributes about their high school. We'll assume 2:18 there are unique codes for high 2:20 schools, but they also 2:21 have a name and are in 2:22 a city, finally the student's 2:24 GPA, and a priority 2:25 field for admissions that we'll see in a moment. 2:28 For applications, we'll have the 2:30 student's social security number, the 2:31 college name they're applying to, 2:33 the state of the college, the date of application and the major. 2:36 Not all of these attributes 2:37 will even be used in this 2:38 video, but like I said, 2:39 this will permeate several videos as our running example. 2:43 To motivate functional dependencies, let's 2:45 focus on the student relation 2:46 and specifically on the GPA and priority attributes. 2:49 Let's suppose that a student's 2:50 priority is determined by their GPA. 2:53 For example, we might have 2:55 a rule that says if GPA 2:57 is 3.8, greater than 3.8 then priority is 1. 3:01 If the GPA is 3:03 say, in between, let's say, 3.3 and 3.8. 3:06 Then we'll set priority 3:10 to be equal 2, and let's 3:12 say if the GPA, then, is 3:13 less than 3.3, then the priority value is three. 3:17 So, if this relationship is guaranteed 3:19 in our data then what we 3:21 can say is that any two 3:22 tuples that have the same 3:23 priority are guaranteed to have the same GPA. 3:26 And let's formalize that concept. 3:28 So I'm going to write 3:29 a little logical statement here to formalize the concept. 3:32 I'm going to use the for all 3:33 symbol from predicate logic and 3:35 I'm going to say if we have any pair of tuples. 3:37 So for all T or U, 3:39 those are tuples in the 3:40 student relation, then if 3:43 the student, if the T 3:44 and U have the same priorities 3:47 I'm sorry the same, let me 3:49 fix that, they have the the same GPA. 3:52 So if T.GPA equals U.GPA, 3:56 then and this is the 3:57 logical implication symbol, then 3:59 T.priority will equal U.priority. 4:03 So this logical statement is in 4:05 fact a definition of a functional 4:07 dependency, and we would 4:08 write that functional dependency as 4:10 GPA arrow priority, so 4:13 that says the GPA determines 4:15 the priority, any 2 tuples 4:17 with the same GPA must have the same priority. 4:19 So that was a specific example. 4:22 Now let's generalize our definition. 4:24 So, let me replace GPA 4:26 and priority here with just 4:27 two attributes A and B 4:29 E of say a relation 4:31 R. And then we'll also need to modify our definition. 4:35 So you can see, I've erased the specific attributes and relation. 4:38 And I'll just say for every. 4:39 T and U in our 4:41 relation, R. If T 4:43 dot A equals U 4:45 dot A then T dot 4:47 B equals U dot B 4:48 and that's the definition of 4:50 the functional dependency A determines B 4:52 for a relation R. Actually 4:54 I'm gonna generalize this definition even further 4:56 because functional dependencies don't 4:57 always have to have one 4:58 attribute on each side, they can actually have a set of attributes. 5:02 So now I write A1 A2 5:04 dot dot dot AN on 5:06 the left hand side, these will 5:07 all be attributes of relation 5:08 R. And on the right 5:10 hand side, B1 B2 comma 5:11 BM, again, attributes 5:14 of R. Modifying the formal 5:16 definition in red, now I 5:18 can't use the dot notation anymore, 5:20 so I'll use a square bracket and 5:21 I'll write A1 through An 5:24 equals U square bracket A1 5:26 through AN, so what 5:28 I'm saying here in this 5:29 case is that the two 5:31 tuples T and U 5:33 have the same values for 5:34 all of the attributes A-1 through 5:36 A-N, and if they do, 5:38 then they will also though have 5:39 the same values for B-1 through B-M. 5:40 We'll be getting to 5:43 some concrete examples soon. 5:45 Just one last bit of notation before we move on. 5:48 For simplicity I'm going 5:49 to often in the video 5:51 abbreviate a list of 5:52 attributes or set of attributes by using a bar. 5:54 So I'll write A bar to 5:56 indicate a set of attributes A 5:58 and B bar to indicate 6:00 a set of attributes B. And again, this is just for convenience. 6:03 So we've seen the motivation for a functional dependency in a relation. 6:06 A functional dependency for a 6:08 relation is based on knowledge of 6:10 the real world data that's being captured. 6:12 And when we specify one, just 6:13 like specifying keys, all instances 6:15 of the relation must adhere to the functional dependency. 6:19 So just to summarize functional dependencies, 6:21 we say that attribute, a 6:23 set of attributes A functionally 6:26 determines a set of attributes 6:27 B, if again any 6:29 time tuples agree in their 6:31 A values, they also agree in their B values. 6:33 And let's say that our relation 6:35 here R has [xx] tuples 6:36 A the tuples and B 6:38 and also a few more 6:39 attributes we'll call those C. 6:41 So, let me draw a picture 6:42 of a relation now here that 6:43 has those attributes in it. 6:45 So, we'll have here.. 6:47 lets just three columns, but again these are multiple attributes. 6:51 And these are the attributes 6:52 A, these are the attributes B, 6:54 and these are the attributes C 6:56 And if we put in a couple 6:57 of tuples, then what we'll 6:59 say is, if we have 7:00 two tuples here, that have 7:02 and I'm gonna use a bar even for the values in the tuples. 7:05 If we have two tuples whose A 7:07 values are the same then their 7:09 B values must also be the same. 7:10 And we're going to be using this type of template for some reasoning later on. 7:14 But we're not saying their C values 7:15 have to be the same so we 7:16 could have C1 and and different C values here as well. 7:19 But again, if we specify this 7:21 functional dependency, we are saying 7:22 that every instance of our 7:24 relation must satisfy the 7:26 condition that if the A 7:27 values are the same, then the B values are also the same. 7:30 Finally, let's come back to our example. 7:32 And I think when we start writing 7:33 functional dependencies for our actual 7:35 relations it'll give you a good idea of what they're really capturing. 7:38 So let's write a few 7:39 functional dependencies for our student 7:41 relation based on what we 7:42 expect to be true in 7:43 the real world in the data that we're capturing in the relation. 7:46 So here's a first example, Social 7:49 Security number functionally determines S-name, the student's name. 7:53 So what we say, we have 7:54 multiple tuples about a 7:56 particular student and they have 7:57 the same social security number, say 7:59 two tuples about student 8:02 123, we're expecting them to 8:03 have the same name in fact, 8:04 we're requiring them to have the same name. 8:06 And presumably because 1 to 8:08 3 is sort of identifying the 8:09 student that would be 8:10 a natural functional dependency that 8:12 would hold in this case and 8:14 similarly we would expect social security 8:16 number to determine address, although 8:17 we're already making an assumption 8:19 about the real world here, if 8:21 we have this particular functional dependency, 8:23 then we're saying a student doesn't move. 8:25 They don't have multiple addresses. 8:26 Every tuple that describes that 8:28 student by their social security 8:30 number will have the same address. 8:32 Let's go to the high school and see what might be going on there. 8:35 So I mentioned that the 8:36 high school code, what I'm 8:38 trying to capture there is 8:39 a unique code for each 8:40 high school that might be filled 8:42 in college application then we 8:44 would expect the high school code 8:45 to determine the high school name. 8:47 Every time we have the particular 8:49 high school code, maybe for different 8:50 students, it would have the same 8:51 name and also it would have the same city. 8:54 So that's an example of a 8:56 functional dependency with two attributes on the right hand side. 8:59 Now, let's look at one 9:00 that's a little more complicated, which 9:02 is one that has two attributes 9:04 on the left hand side instead. 9:06 That actually turns out to be a more interesting case. 9:08 In fact, in this particular case, 9:10 we can probably reverse the arrow 9:12 and have a functional dependency in the other direction. 9:14 If we have a combination of 9:16 high school name and high school 9:17 city, I'm going to 9:19 assume that's unique, that there aren't, 9:20 there's never two high 9:22 schools with the same name in the same city. 9:24 And if that's the case, 9:25 if that's unique, then we would 9:27 expect a functional dependency to the high school code. 9:29 Any time we have the same 9:31 name and city, we're talking about 9:32 the same high school so we should have the same code. 9:35 What other examples do we have? 9:37 If we assume that there's one 9:38 GPA for each student then 9:40 we'd have the social security number 9:41 determines the GPA and we 9:44 already talked about GPA determines 9:46 priority and another 9:48 example, actually if we 9:50 put these two together we should 9:51 see well if we have the 9:52 same social security number twice we should have the same priority. 9:56 And you may be thinking 9:56 well, that's kind of a 9:58 transitive rule if it takes these two and produces that one. 10:00 And indeed it is. 10:01 And we'll talk about rules for functional dependencies later. 10:03 And there may be more in this case. 10:06 Now let's take a look at functional dependencies for our apply relation. 10:10 Actually, this one is a 10:11 little trickier, it's even possible 10:13 there are no functional dependencies at all. 10:15 It really depends on the real world data, the real world constraints. 10:19 One possibility for example is 10:21 that every college has a 10:22 particular single date on which it receives its application. 10:25 So if that were the case, then 10:27 we'd have the college name determines the date. 10:30 In other words, every application 10:32 for a particular college must have the same date. 10:35 Another constraint might be that 10:36 students are only allowed to 10:38 apply to a single major at each college they apply to. 10:42 So if that were the case, this 10:43 is another one with two attributes 10:45 on the left hand sid, we'd 10:46 say that the social security number, 10:48 together with the college, implies the major. 10:51 In other words, we cannot have 10:53 a student and college combination 10:55 with two different majors and that captured that constraint. 10:58 Maybe we have a constraint 11:00 that students are only allowed to apply to colleges in one state. 11:04 That seems rather unlikely, but I 11:05 was struggling to find functional 11:07 dependencies for this case. 11:08 In that case, we'd have this 11:10 function dependency, again, saying a 11:11 student could only apply to colleges in a single state. 11:14 For the apply relation specifically, again, 11:16 it's really the real world constraints 11:18 that drive which functional dependencies hold for the relation. 11:21 But it's important to understand those 11:23 constraints so they can be 11:24 translated to functional dependencies, which 11:26 then can drive a good relational design. 11:29 So I've alluded a few 11:30 times to the fact that functional 11:31 dependencies generalize the notion of keys. 11:33 And let's make that connection explicit now. 11:36 Let's suppose we have a relation 11:37 R and R has no duplicate tuples. 11:40 R has some attributes A and it has some other attributes. 11:43 Let's call those B. And let's 11:45 suppose that we have a functional dependency that 11:47 A determines all of the attributes in the relation. 11:50 Now let me draw a picture of that. 11:52 So here's our relation R with 11:54 attributes A and attributes B. 11:56 And now let's suppose that 11:57 we have two tuples with the 11:59 same values for A. Now 12:00 we'll just write those as little a bar. 12:03 And now let's see what happens with the B values. 12:05 We'll make them B1 bar and B2 bar. 12:08 Because we have the functional 12:10 dependency, the equal values 12:12 here for A say that B1 and B2 have to be equal. 12:15 So B1 equals B2. 12:16 Let's just erase the little one and two. 12:19 But now we've generated duplicate tuples. 12:21 So what the functional 12:22 dependency tells us in 12:24 that case is that these two tuples 12:25 are actually the same, or rather 12:26 we cannot have two separate 12:28 tuples with the same A values. 12:30 So we cannot have two separate 12:32 tuples with the same A values 12:33 is exactly what it means 12:35 for A to be a key. 12:38 Now again, this is only the case when we have 12:39 no duplicates in R. But 12:41 if we have no duplicates, if 12:43 a set of attributes determines 12:44 all the other attributes, then those attributes are a key. 12:49 So here are a few other definitions related to functional dependencies. 12:52 We have a notion of a trivial functional dependency. 12:55 A functional dependency is trivial 12:57 A to B if B 12:59 is a subset of A. And 13:00 let's draw a little picture of what that means. 13:03 So here we have our attributes 13:05 A, and then we're saying 13:06 and that's all of the attributes 13:08 here, and what we're saying is 13:09 that B is a subset 13:11 of A. So in other words, 13:12 some portion of these attributes 13:14 here, we'll just mark that in 13:15 purple here are attributes 13:17 B. Well, it's pretty obvious 13:19 that if two tuples have 13:20 the same values across their 13:22 entire expanse here for A's, 13:24 then obviously they're also going 13:26 to have the same values for 13:27 just this portion here, the B portion. 13:29 So that's why it's called the trivial functional dependency. 13:33 So a nontrivial functional dependency 13:35 is a functional dependency that's not a trivial one. 13:37 By the way, FD is a common abbreviation for functional dependency. 13:41 So if we have A determines 13:42 B, then that is 13:44 non-trivial if it's not 13:45 the case that B is a 13:46 subset of A. Going to 13:48 our picture, let's have here 13:50 our attributes A. And now 13:52 we're saying there are some 13:53 attributes in B that are 13:55 not part of A. So 13:56 we can say, well maybe B 13:57 is partially part of A, 13:59 but there's some portion that is 14:00 not part of A. So 14:02 let's say that these are our B attributes here. 14:04 So now our functional dependency 14:06 is actually saying something. 14:08 It's saying we have two 14:09 attributes that agree in these 14:11 values, then they're also 14:13 going to agree in these values over here. 14:15 And the last definition is 14:17 a completely nontrivial functional dependency, 14:20 and that's A determines B 14:22 where A and B have no intersection at all. 14:25 So in that case, again, going 14:26 back to our picture, we'll have 14:28 our A attributes here and 14:30 then our B attributes are going 14:32 to be some portion of the remaining attributes. 14:34 And here we're saying a lot. 14:36 We're saying if these two have 14:37 the same value, then these two have the same value as well. 14:40 And the reality is that completely 14:42 nontrivial functional dependencies are the 14:44 ones that we're most interested in specifying. 14:47 I mentioned that there are some 14:48 rules that apply to all 14:49 functional dependencies and I'll give a couple of those right now. 14:52 The first one is the splitting rule. 14:54 The splitting rules says that if 14:55 we have a set of attributes 14:56 that determine another set of 14:58 attributes - and this time I'm 14:59 going to write it out instead of using 15:00 the bar notation - then 15:02 we also have...this implies that 15:05 we have A determines B-1 15:07 and A determines B-2 and so on. 15:11 In other words, we can split the right side of the functional dependency. 15:14 And if you think about it, 15:15 this is pretty If we say 15:17 that the when the A value 15:18 are the same all of the 15:19 B values have to be the 15:20 same then certainly when the 15:22 A values are the same the B values have to be the same independently. 15:25 Now you might wonder if the splitting rule also goes the other way. 15:28 So let's say we have, I'll 15:30 put the left hand side, I'll 15:31 write out the attributes explicitly so 15:33 let's say we have A-1 15:35 through A-N determines B then is 15:37 it from that the case 15:38 that A-1 determines B and 15:41 A-2 determines B on its own, and so on. 15:44 Well, the answer to that is no. 15:46 And I'll give a simple example from our college application database. 15:50 So let's say that we 15:51 have the functional dependency high school 15:53 name and high school 15:55 city determines high school code. 15:58 We talked about that one before. 16:00 Oops, here, that's an arrow there. 16:02 So that says that when we 16:03 have a particular name and 16:05 city combination for a high 16:06 school, that's identifying a single 16:08 high school, and so it will always have the same code. 16:11 So that makes a lot of sense 16:13 but it is not the case, 16:14 I'll put a big X here, 16:16 necessarily, that if we 16:17 split the left hand side 16:19 that high school name alone will determine a high school code. 16:22 So for example, I would 16:24 venture to guess that there 16:25 are a lot of Jefferson High Schools 16:26 all over the country and they 16:28 won't all will be the same high school. 16:30 So it's really the combination of attributes 16:32 on the left that together functionally 16:34 determine the right hand side 16:36 and so we do not then have 16:37 the splitting rule as a general principle. 16:39 The combining rule is the inverse of the splitting rule. 16:42 It says if we have A 16:44 determines B-1 and we have 16:46 A determines B-2 and so 16:47 on up to A determines 16:49 B-N, then we also 16:51 have A determines B-1 through B-N together. 16:57 Next, we have two trivial dependency rules. 16:59 Let me remind remind you what a trivial dependency is. 17:01 It's A determines B where 17:03 B is a subset of A. 17:05 So in other words, every left hand 17:06 side determines itself or any 17:08 subset of itself, and that's 17:10 what drives the two trivial dependency rules. 17:12 One of them says that if 17:13 we have A determines B, 17:16 then we also have A determines 17:18 A union B, so in 17:19 other words we can add to 17:21 the right hand side of every 17:22 dependency what's already on the left hand side. 17:24 Sort of as a converse, 17:26 we can also say that if 17:27 A determines B then A 17:30 determines A intersect B. 17:33 Actually this one is also 17:35 implied by the splitting rule. 17:36 So we have two different rules in this case that are doing the same thing. 17:40 And finally the transitive rule, which is the one we alluded to earlier. 17:43 It says if we have A 17:44 determines B and and 17:46 we have B determines C, then 17:48 we have A determines C. 17:50 Now all of these rules 17:51 can actually be proven formally 17:53 and I'm going to go through 17:54 a sketch of this particular one. 17:56 So here's my relation R and 17:59 I'll have attributes A, B, 18:01 C and then let's let D be the left of our attributes. 18:04 And my goal is to prove 18:06 that A determines C. And 18:08 the information I have to 18:09 prove that is these two functional dependencies here. 18:12 So to prove that A determines 18:14 C, I have to 18:15 prove that if two tupples 18:16 have the same A values--we'll put 18:18 little bars there--then they also have the same C values. 18:22 So I need to show that these two C values here are going to be the same. 18:25 So you can see what's going to happen. 18:27 Using the first functional dependency, 18:29 because these two A values 18:31 are the same, I know their B values must be the same. 18:34 And then I just use the second 18:35 functional dependency and because 18:37 the two B values are the 18:38 same, I then know that 18:39 the two C values are the 18:41 same, and that has 18:42 shown that this functional dependency holds. 18:44 And you can do a 18:45 similar thing to prove the other rules to yourself. 18:48 Now, I'm going to introduce the concept of closure of attributes. 18:51 Let's suppose we're given a 18:52 relation, a set of 18:54 functional dependencies for that 18:56 relation, and then a set 18:57 of attributes A bar that are part of that relation. 19:01 I'm interested in finding all attributes 19:03 B of the relation such 19:04 that A bar functionally determines 19:07 B. And this is what's 19:08 called the closure, and I'll 19:09 show in a bit why we might want to compute that. 19:12 Notationally the closure is written using the + sign. 19:15 So the closure of A bar is A bar +. 19:17 Let me be a little 19:19 more explicit, let me write 19:20 out A bar, because remember 19:22 whenever we write bar, we're actually talking about a list of attributes. 19:25 So we're going to write it 19:26 as A-1 through A-N, and I'm interested 19:28 in computing the closure of that set of attributes. 19:31 In other words, the entire set 19:33 of attributes that are functionally 19:34 determined by the attributes A-1 through A-N. 19:36 And I'm going to give an algorithm for doing that. 19:40 My algorithm says start with the set itself. 19:43 So I'm going to start with 19:45 A1 through AN, except I'm not going 19:48 to close that. 19:49 I'm going to leave a little space there. 19:50 And then I'm going to repeat until there's no change. 19:54 I'm going to add attributes to 19:56 that set until I get to the closure. 19:59 So I'm going to repeat. 20:00 If A determines B 20:03 - and now we'll put bars 20:04 in here - and all 20:06 of A are in the set, 20:09 then add B to the set. 20:12 So I might have my 20:13 attributes here, A1 through AN, 20:15 and it might be the case that 20:17 A, you know, 4 determines 20:19 attributes C and D, 20:20 so I'll add C and D to the set. 20:22 I repeat. 20:23 Maybe there's a C goes 20:24 to E and I'll add E to the set, and so on. 20:26 And when there's no more change, then I've computed the complete closure. 20:30 Now if you happen to be someone 20:31 who likes to think in terms of 20:32 rules instead, what we're effectively 20:34 doing is applying the combining 20:36 and transitive rules to the 20:38 attributes in the set until there's no change. 20:41 So let's run an example of the closure algorithm. 20:43 Let's go to our complete student 20:45 table and let's add three functional dependencies. 20:48 One that says that student's social 20:50 security number determines their name, 20:51 address and GPA and that 20:53 would be a normal... GPA determines 20:55 priority, and high school 20:57 code determines high school name and high school city. 20:59 Again, these are all examples we 21:00 gave earlier that would be 21:01 natural functional dependencies for this particular relation. 21:05 Let's suppose that we're interested 21:07 in computing the closure of 21:08 the two attributes - social security 21:10 number and high school code. 21:12 So in other words, we want 21:13 to find all attributes in 21:15 the student relation that are functionally 21:17 determined by these two attributes. 21:20 So the algorithm says start with 21:22 the two attributes themselves, social security 21:24 number and high school code, 21:26 and then add attributes that 21:27 are functionally determined by ones in the set. 21:30 So if we start with our 21:31 first functional dependency here, because 21:33 we have social security number, that 21:34 allows us to add the 21:35 student name, the address, 21:40 the GPA... and that's it for that one. 21:44 Now let's repeat again. 21:46 Because we have the GPA, our 21:48 second functional dependency tells 21:49 us we can add the priorityAnd that's it for that one. 21:54 And then since we have the 21:55 high school code in this set, 21:56 our third one tells us 21:57 that we can add the high school 21:59 name and the high school 22:00 city and at that 22:01 point we certainly know we're done 22:03 because we've actually got the entire relation at this point. 22:06 Now I didn't pick this particular example at random. 22:09 We took two attributes, social security 22:11 number and high school code, we 22:13 computed their closure, and we 22:14 discovered that they together functionally 22:16 determine all attributes of the student relation. 22:20 Now remember just a few slides 22:21 ago, when a set 22:23 of attributes functionally determine all the 22:24 attributes, then those attributes together form a key for the relation. 22:28 And in fact if you think 22:29 about it, social security number and 22:31 high school code together are a natural key for the relation. 22:34 A student might go to 22:35 multiple high schools, but there's 22:37 no reason to have more 22:38 than one tupple with the 22:40 combination of a particular 22:41 student and the high school they attended. 22:44 So let's formalize a bit this 22:45 relationship between the notion of closure and keys. 22:48 Let's suppose we're interested in 22:49 answering the question... is a 22:51 particular set of attributes 22:52 a key for a relation we can use closure to do that. 22:55 Remember, we have the relation, 22:57 and we have a set of 22:58 functional dependencies for the relation, 23:00 so what we do is we 23:01 compute the closure of 23:04 A that's going to give 23:05 us a set of attributes and 23:07 if that equals all attributes, then 23:11 A is the key. 23:12 As a more general question, let's 23:14 suppose we're given a set 23:15 of functional dependencies for a 23:16 relation, how can we use 23:18 it to find all of the keys for the relation? 23:20 So use the same basic idea 23:22 of computing closure, but this 23:23 time we have to do it for 23:24 every subset of the attributes. 23:27 So let's call each subset A 23:30 bar and we just 23:31 check if the closure of A bar determines all attributes. 23:35 And if it does, then it's a key. 23:37 And by considering every subset of 23:39 the attributes in R, then 23:40 we're considering every possible key 23:42 and we'll just check for 23:43 each one whether it's actually a key. 23:45 Now that seems fairly inefficient 23:47 and actually we can be a little more efficient than that. 23:50 We can consider these subsets in 23:52 increasing size order. 23:54 So for example, if 23:55 we determine that A and 23:58 B together determine all 24:00 attributes functionally determine all attributes in the relation. 24:03 That tells us AB is 24:04 a key, and that actually 24:06 also tells us that every 24:07 superset of AB is also a key. 24:09 And if you think about it, that fills out naturally. 24:12 So the real algorithm would 24:13 say let's start with single attributes and determine if they are key. 24:16 If a single attribute say 24:18 C, is a key then so is 24:19 every super set of C, and 24:21 then we go to pairs of 24:22 attributes and so on. 24:24 Finally let's talk about how we 24:25 specify the set of functional dependencies for a relation. 24:28 First I'll define a notion of 24:29 one set of functional dependencies following from another one. 24:32 So let's suppose we have a 24:33 relation R and we have 24:34 two sets of functional 24:36 dependencies that aren't identical, S-1 and S-2. 24:39 We see that S2 follows from 24:41 S1 if every relation instance 24:43 satisfying S1 also satisfies S2. 24:47 As a simple example, suppose S-2 24:50 is social security number determines 24:52 priority, and suppose S-1 24:55 is the two functional dependencies 24:57 - social security number determines GPA, 25:00 and GPA determines priority. 25:02 Then it's certainly the case that 25:04 in an... for this example, S-2 follows from S-1. 25:08 Every time we have a 25:09 relation that satisfies social security 25:11 number determines GPA and GPA 25:13 determines priority, then that 25:15 relation will also satisfy social 25:17 security number determines priority and 25:18 we kind of proved that actually in an earlier part of this video. 25:22 So one question you might 25:23 have is how do we test whether 25:25 one set of functional dependencies follows from another. 25:28 That really boils down to 25:29 testing whether one functional dependency follows from a set. 25:32 So and let me just make 25:33 this A bar, B bar here to make clear they can be sets of attributes. 25:36 Because if we have S-1 and 25:37 S-2, then we just check whether 25:40 every functional dependency in S-2 25:42 follows from the functional dependencies in S-1. 25:44 There's actually two ways of going about this test. 25:48 One of the ways is to use the closure. 25:50 We'll compute A+ based on 25:53 the functional dependencies that are 25:54 in S and then we'll 25:56 check if B is in the set. 26:00 Reminder - computing the 26:01 closure tells us every attribute 26:04 that's functionally determined by the 26:05 attributes in A based on 26:07 the functional dependencies in S. 26:09 If those include B, then 26:11 A determines B does indeed 26:13 follow from S. The 26:14 second oneway to check is based 26:16 on a set of axioms, a 26:18 set of rules called Armstrong's axioms. 26:20 We saw some rules for functional 26:22 dependencies earlier, but Armstrong's 26:24 Axioms are a specific set 26:25 of rules that are what's called complete. 26:28 It's guaranteed that if one 26:30 thing about functional dependencies can be 26:31 proved from another, then it 26:33 can be proved using the Armstrong's Axioms. 26:36 I'm not going to cover Armstrong's Axioms 26:38 in the videos, but you can 26:39 look at any of the recommended readings and find them there. 26:42 So you might wonder why did 26:43 I introduce this notion of one 26:45 set of functional dependencies following from 26:47 another and for that matter, 26:48 why did I introduce trivial and 26:49 non-trivial functional dependencies. 26:52 Well, I'm going to sum 26:53 up in one sentence what 26:55 we're looking for when we specify 26:57 the set of functional dependencies for a relation. 26:59 So we have a notion of the 27:01 real world data, we have 27:02 our attributes but we need 27:04 to specify the functional dependencies in 27:06 order to get a good designer for some of the reasons that I mentioned. 27:08 What we would like to find 27:10 is a minimal set of 27:12 completely non-trivial functional dependencies 27:15 such that all functional dependencies 27:16 that hold on the relation follow 27:18 from using the technical definition I 27:20 gave, the dependencies in this set. 27:22 Wow, that seems like some 27:24 very complicated thing, but the 27:26 fact is when you start 27:27 specifying functional dependencies, you'll discover 27:29 that you will actually get this definition pretty naturally. 27:33 So to conclude, functional dependencies are 27:35 a generally useful concept in database systems. 27:37 They 're a key ingredient 27:39 of doing relational design by decomposition, 27:41 because we use the functional dependencies 27:43 to get Boyce-Codd Normal Form which 27:45 is what we'll cover in the next 27:46 video, but they're also 27:47 useful for the system to 27:48 determine how to store data, 27:50 to compress data and also 27:51 to reason about query processing.