0:00 So the basic of idea of 0:01 what we're going to do is 0:02 design by decomposition, specifically, 0:05 we're going to do what we 0:06 did at the very beginning of this 0:07 example, which is start 0:09 by creating mega-relations that just 0:10 contain attributes for everything 0:12 that we want to represent in our 0:14 database, then we're going 0:15 to decompose those mega relations 0:17 into smaller ones that are 0:18 better, but still capture the same information. 0:21 And most importantly we can do this decomposition automatically. 0:25 So how does automatic decomposition work? 0:28 In addition to the mega 0:29 relations, we're going to specify 0:31 formally, properties of the data.The 0:33 system is going to use 0:35 the properties to decompose the 0:36 relations, and then it's 0:38 going to guarantee that the final 0:40 set of relations satisfy what's called a normal form. 0:43 And we'll be formalizing all of this. 0:45 But the basic idea behind normal 0:46 forms is that they don't 0:48 have any of those anomalies that 0:49 I showed and they don't lose any information. 0:52 So specifically for specification of 0:54 properties, we're going to begin 0:55 by looking at something called functional dependencies. 0:58 And once we specify functional dependencies, 0:59 the system will generate 1:01 relations that are in 1:03 what's called Boyse Codd normal form. 1:05 And Boyse and Codd by the 1:06 way were two early pioneers in relational databases in general. 1:10 Then we're going to 1:11 look at another type of 1:13 specification called multi valued 1:15 dependencies which will add to 1:16 functional dependencies and when we 1:18 have both functional and multi 1:19 valued dependencies, then we 1:21 can have what's called fourth 1:23 normal form, and again, that 1:24 would be relations that are generated 1:26 by the system that satisfy the normal form. 1:29 Boyce-Codd normal form is stricter than fourth normal form. 1:33 Specifically if we make 1:34 a big Venn diagram here of 1:36 all the relational designs 1:38 that satisfied Boyce-Codd Normal Form, 1:40 which by the way is very 1:42 often abbreviated BCNF, then that 1:45 contains all of the 1:46 relations that satisfy fourth normal form, 1:49 normally abbreviated 4NF. 1:51 So every relation that's in 1:53 fourth normal form is also 1:54 in Boyce-Codd normal form, but not vice versa. 1:57 You might be wondering what happened 1:59 to first, second and third, normal forms. 2:01 So first normal form is 2:03 pretty much just a specification 2:05 that relations are real 2:06 relations with atomic values in each cell. 2:09 Second normal form is specifying 2:12 something about the way relations are structured with respect to their keys. 2:16 Neither of those is discussed very much anymore. 2:18 Third normal form is a 2:20 slight weakening of Boyce-Codd 2:22 normal form and sometimes people 2:23 do like to talk about third normal form. 2:25 So you can think of third normal form as a little bit of a even bigger circle here. 2:29 We're not going to cover third normal 2:30 form in this video because 2:32 Boyce-Codd normal form is the 2:33 most common normal form used 2:35 if we have functional dependencies only, and 2:37 fourth normal form if we 2:38 have functional and multivalued dependencies. 2:41 So what's going to happen next is 2:43 I'm going to give some examples 2:44 to motivate these four concepts: 2:46 functional dependencies, Boyce-Codd normal form, 2:48 multivalued dependencies normal form, 2:50 and then later videos will 2:52 go into each one in much greater depth. 2:55 So let me just give 2:56 the general idea of functional dependencies 2:58 and Boyce-Codd Normal Form. 2:59 And we'll use a very 3:00 simple for example, an abbreviated version 3:02 of our apply relation that has 3:04 students' social security numbers, the 3:06 student's name and their colleges 3:07 that the student is applying to. 3:09 Even this small relation actually 3:11 has redundancy and update and deletion anomalies. 3:14 Specifically, let's say that our 3:16 student, 123Ann, applies to 7 colleges. 3:19 Then there will be 7 tuples and 3:21 there will be 7 instances where we 3:23 know that a student with the 3:25 social security number 123 is named Ann. 3:28 Specifically, we're going to store 3:29 for every student the name 3:31 and social security number pair once 3:33 for each college that they apply to. 3:36 So now let me explain what 3:37 a functional dependency is and then 3:38 we'll see how functional dependencies are 3:40 used to recognize when we 3:42 have a bad design like this 3:43 one, and to see how we can fix it. 3:45 A functional dependency, in this 3:47 case from social security number 3:48 to name, and we're saying 3:50 social security number functionally determines 3:52 the student name says that 3:54 the same social security number always has the same name. 3:58 In other words, every time we see 123, we're going to see Ann. 4:01 Now it doesn't necessarily go in the other direction. 4:03 It might not be that whenever 4:04 we see Ann, it's 123, 4:06 but whenever we see 123, it is Ann. 4:09 And so what we'd like to 4:10 do is store that relationship just one time. 4:13 One time say that for 123, the name is Ann. 4:16 Now what Boyce Codd Normal Form 4:18 says is that whenever we have 4:19 one of these functional dependencies, then 4:22 the left hand side of that functional dependency must be a key. 4:25 And think about what that's saying. 4:27 Remember a key says 4:29 that we have just one tupple with each value for that attribute. 4:32 So if we have say 4:34 social security number to name 4:36 as a functional dependency and we 4:37 satisfy Boyce-Codd Normal Form, 4:39 then we're going to say that 4:41 social security number has to 4:42 be a key in our relation, 4:44 and we'll only have one 4:45 tupple for each social security number. 4:48 Specifically, we can go back to our original relation. 4:51 We have this functional dependency social 4:53 security number here is not a key, right? 4:56 So then we know that this is not in Boyce-Codd Normal Form. 5:00 So we're going to use functional 5:01 dependencies to help us 5:03 decompose our relation so 5:04 that the decomposed relations are in Boyce-Codd Normal Form. 5:08 And here's what would happen in this example. 5:10 Our functional dependency would tell 5:11 us to pull out the social 5:13 security number and student name 5:15 into its own relation where 5:16 the social security number is a 5:18 key and then we have 5:19 just one time for each 5:21 social security number that students 5:22 name, and then separately we'll 5:24 have the information about the 5:26 students and which colleges they applied to. 5:28 Again, we'll completely formalize this 5:31 whole idea, the definition of 5:32 functional dependencies, their properties, the 5:34 normal form, and how we 5:35 do the decomposition in a later video.