0:00 This video covers functional dependencies. 0:02 First a quick recap of relational design by decomposition. 0:06 The idea is that the application 0:08 designer writes mega relations 0:10 that contain all the information that 0:12 we want to have and properties of the data that we're storing. 0:15 And then the system will automatically 0:17 decompose those based on the properties that are specified. 0:20 The final set of decomposed relations 0:22 will satisfy what's called the 0:23 normal form, and normal forms 0:25 are good relations in the sense 0:26 that they have no anomalies and they 0:28 don't lose information from what 0:29 was specified in the original mega relations. 0:32 Now the properties themselves are defined 0:34 either as functional dependencies in 0:36 which case the system will 0:37 generate Boyce-Codd Normal Form relations 0:39 or multi-value dependencies, which 0:42 will then yield fourth normal form relations. 0:46 So this video, as you 0:47 can tell, is about functional dependencies themselves. 0:50 And let me say that functional 0:52 dependencies are actually a generally 0:53 useful concept in databases not only for relational design. 0:57 So for functional dependencies, 0:59 as we'll see soon, are a generalization 1:01 of the notion of keys and they 1:03 allow the system to, for 1:05 example, store the data more 1:06 efficiently when the system knows about functional dependencies. 1:09 Compression schemes can be 1:10 used based on functional dependencies for 1:12 storage, and also functional 1:14 dependencies, as a generalization of 1:15 keys, can be used to 1:17 reason about queries and for query optimization. 1:20 Which is a reminder of a very important aspect of database systems. 1:23 Which allows declarative queries to be executed by the system efficiently. 1:28 By the way a third use of 1:29 functional dependencies is for exam 1:32 questions in database courses because 1:34 there is a very nice theory functional dependencies as you'll see. 1:37 It's quite easy to write questions about them. 1:39 The remainder of the video will 1:40 cover functional dependencies in general 1:42 as a general concept and not 1:44 specifically to relational design and 1:46 then later videos will tie functional 1:48 dependencies back to design by decomposition. 1:51 As always we'll be using as 1:53 a sample a college application 1:54 database and this case I've 1:56 expanded the information that we're including quite a bit. 1:59 We'll be using these same two 2:00 relations as examples throughout 2:02 this video and subsequent videos on relational design. 2:05 In this case we're gonna 2:06 look at two relations, one with 2:07 information about students and then 2:09 a separate one with information about where they're applying. 2:11 The student information will have social 2:13 security number, the student's name, 2:15 their address and then three 2:16 attributes about their high school. We'll assume 2:18 there are unique codes for high 2:20 schools, but they also 2:21 have a name and are in 2:22 a city, finally the student's 2:24 GPA, and a priority 2:25 field for admissions that we'll see in a moment. 2:28 For applications, we'll have the 2:30 student's social security number, the 2:31 college name they're applying to, 2:33 the state of the college, the date of application and the major. 2:36 Not all of these attributes 2:37 will even be used in this 2:38 video, but like I said, 2:39 this will permeate several videos as our running example. 2:43 To motivate functional dependencies, let's 2:45 focus on the student relation 2:46 and specifically on the GPA and priority attributes. 2:49 Let's suppose that a student's 2:50 priority is determined by their GPA. 2:53 For example, we might have 2:55 a rule that says if GPA 2:57 is 3.8, greater than 3.8 then priority is 1. 3:01 If the GPA is 3:03 say, in between, let's say, 3.3 and 3.8. 3:06 Then we'll set priority 3:10 to be equal 2, and let's 3:12 say if the GPA, then, is 3:13 less than 3.3, then the priority value is three. 3:17 So, if this relationship is guaranteed 3:19 in our data then what we 3:21 can say is that any two 3:22 tuples that have the same 3:23 priority are guaranteed to have the same GPA. 3:26 And let's formalize that concept. 3:28 So I'm going to write 3:29 a little logical statement here to formalize the concept. 3:32 I'm going to use the for all 3:33 symbol from predicate logic and 3:35 I'm going to say if we have any pair of tuples. 3:37 So for all T or U, 3:39 those are tuples in the 3:40 student relation, then if 3:43 the student, if the T 3:44 and U have the same priorities 3:47 I'm sorry the same, let me 3:49 fix that, they have the the same GPA. 3:52 So if T.GPA equals U.GPA, 3:56 then and this is the 3:57 logical implication symbol, then 3:59 T.priority will equal U.priority. 4:03 So this logical statement is in 4:05 fact a definition of a functional 4:07 dependency, and we would 4:08 write that functional dependency as 4:10 GPA arrow priority, so 4:13 that says the GPA determines 4:15 the priority, any 2 tuples 4:17 with the same GPA must have the same priority. 4:19 So that was a specific example. 4:22 Now let's generalize our definition. 4:24 So, let me replace GPA 4:26 and priority here with just 4:27 two attributes A and B 4:29 E of say a relation 4:31 R. And then we'll also need to modify our definition. 4:35 So you can see, I've erased the specific attributes and relation. 4:38 And I'll just say for every. 4:39 T and U in our 4:41 relation, R. If T 4:43 dot A equals U 4:45 dot A then T dot 4:47 B equals U dot B 4:48 and that's the definition of 4:50 the functional dependency A determines B 4:52 for a relation R. Actually 4:54 I'm gonna generalize this definition even further 4:56 because functional dependencies don't 4:57 always have to have one 4:58 attribute on each side, they can actually have a set of attributes. 5:02 So now I write A1 A2 5:04 dot dot dot AN on 5:06 the left hand side, these will 5:07 all be attributes of relation 5:08 R. And on the right 5:10 hand side, B1 B2 comma 5:11 BM, again, attributes 5:14 of R. Modifying the formal 5:16 definition in red, now I 5:18 can't use the dot notation anymore, 5:20 so I'll use a square bracket and 5:21 I'll write A1 through An 5:24 equals U square bracket A1 5:26 through AN, so what 5:28 I'm saying here in this 5:29 case is that the two 5:31 tuples T and U 5:33 have the same values for 5:34 all of the attributes A-1 through 5:36 A-N, and if they do, 5:38 then they will also though have 5:39 the same values for B-1 through B-M. 5:40 We'll be getting to 5:43 some concrete examples soon. 5:45 Just one last bit of notation before we move on. 5:48 For simplicity I'm going 5:49 to often in the video 5:51 abbreviate a list of 5:52 attributes or set of attributes by using a bar. 5:54 So I'll write A bar to 5:56 indicate a set of attributes A 5:58 and B bar to indicate 6:00 a set of attributes B. And again, this is just for convenience. 6:03 So we've seen the motivation for a functional dependency in a relation. 6:06 A functional dependency for a 6:08 relation is based on knowledge of 6:10 the real world data that's being captured. 6:12 And when we specify one, just 6:13 like specifying keys, all instances 6:15 of the relation must adhere to the functional dependency. 6:19 So just to summarize functional dependencies, 6:21 we say that attribute, a 6:23 set of attributes A functionally 6:26 determines a set of attributes 6:27 B, if again any 6:29 time tuples agree in their 6:31 A values, they also agree in their B values. 6:33 And let's say that our relation 6:35 here R has the tuples 6:36 A the tuples and B 6:38 and also a few more 6:39 attributes we'll call those C. 6:41 So, let me draw a picture 6:42 of a relation now here that 6:43 has those attributes in it. 6:45 So, we'll have here.. 6:47 lets just three columns, but again these are multiple attributes. 6:51 And these are the attributes 6:52 A, these are the attributes B, 6:54 and these are the attributes C 6:56 And if we put in a couple 6:57 of tuples, then what we'll 6:59 say is, if we have 7:00 two tuples here, that have 7:02 and I'm gonna use a bar even for the values in the tuples. 7:05 If we have two tuples whose A 7:07 values are the same then their 7:09 B values must also be the same. 7:10 And we're going to be using this type of template for some reasoning later on. 7:14 But we're not saying their C values 7:15 have to be the same so we 7:16 could have C1 and and different C values here as well. 7:19 But again, if we specify this 7:21 functional dependency, we are saying 7:22 that every instance of our 7:24 relation must satisfy the 7:26 condition that if the A 7:27 values are the same, then the B values are also the same.