0:00 This is the first of two 0:01 videos where we learn about relational algebra. 0:04 Relational Algebra is a formal language. 0:07 It's an algebra that forms 0:09 the underpinnings of implemented languages like SQL. 0:12 In this video we're going to 0:14 learn the basics of the 0:15 Relational Algebra Query Language and a few of the most popular operators. 0:19 In the second video we'll learn 0:20 some additional operators and some 0:22 alternate notations for relational algebra. 0:26 Now, let's just review first from 0:28 our previous video on relational 0:29 querying that queries over 0:31 relational databases operate on 0:33 relations and they also produce relations as a result. 0:36 So if we write a query 0:38 that operates say on the three 0:39 relations depicted here, the 0:41 result of that query is going to be a new relation. 0:44 And, in fact, we can 0:46 post queries on that 0:47 new relation or combine that 0:49 new relation with our previous relations. 0:52 So let's start out with Relational Algebra. 0:54 For the examples in 0:56 this video we're going to be 0:57 using a simple college admission relations database with three relations. 1:00 The first relation, the college 1:02 relation, contains information about the 1:03 college name, state, and enrollment of the college. 1:06 The second relation, the student 1:08 relation, contains an ID 1:10 for each student, the student's name, 1:11 GPA and the size of the high school they attended. 1:14 And, finally, the third relation contains 1:16 information about students applying to colleges. 1:18 Specifically, the student's ID, the 1:20 college name where they're 1:21 applying, the major they're 1:22 applying for and the decision of that application. 1:25 I've underlined the keys for these three relations. 1:28 As a reminder, a key is 1:30 an attribute or a set of 1:31 attributes whose value is 1:33 guaranteed to be unique. 1:35 So, for example, we're going 1:36 to assume the college names are unique, 1:38 student IDs are unique and that 1:40 students will only apply to 1:41 each college for a particular major one time. 1:46 So, we're going to have a 1:47 picture of these three relations at 1:48 the bottom of the slides throughout the video. 1:52 The simplest query in relational 1:54 algebra is a query 1:55 that is simply the name of a relation. 1:57 So, for example, we can 1:58 write a query, "student" and that's 2:01 a valid expression in relational algebra. 2:04 If we run that query on 2:05 our database we'll get as 2:07 a result a copy of the student relation. 2:09 Pretty straightforward . 2:11 Now what happens next 2:12 is that we're going to use 2:13 operators of the relational algebra 2:15 to filter relations, slice relations, and combine relations. 2:19 So, let's through those operators. 2:23 The first operator is the select operator. 2:26 So, the select operator is used 2:27 to pick certain rows out of a relation. 2:30 The select operator is denoted by 2:31 a Sigma with a subscript--that's 2:33 the condition that's used to 2:35 filter the rows that we extract from the relations. 2:37 So, we're just going through three examples here. 2:40 The first example says that we 2:41 want to find the students 2:42 whose GPA is greater than 3.7. 2:44 So to write that 2:45 expression in relational algebra, we write 2:47 the sigma which is the 2:48 selection operator as a 2:50 subscript the condition that we're 2:51 filtering for--GPA greater than 2:54 3.7--and the relation over which we're finding that selection predicate. 2:58 So, this expression will return 3:00 a subset of the student 3:02 table containing those rows 3:04 where the GPA is greater 3.7. 3:07 If we want to filter for two 3:08 conditions, we just do 3:09 an "and" of the conditions in the subscript of the sigma. 3:12 So if we want, say, students 3:13 whose GPA is greater than 3.7 3:15 and whose high school size 3:16 is less than a thousand, we'll 3:18 write select GPA greater than 3.7. 3:23 We used a logical 3:24 and operator--a caret, high school 3:27 size is less than a 3:28 thousand, and again we'll apply that to the student relation. 3:32 And once again, the result of 3:33 that will be a subset of 3:34 the student relation containing the rows that satisfy the condition. 3:37 If we want to find 3:39 the applications to Stanford for 3:41 a CS major, then we'll be 3:42 applying a selection condition to the apply relation. 3:45 Again, we write the sigma 3:46 and now the subscript is 3:48 going to say that the college 3:49 name is Stanford and the major is CS. 3:54 Again, the and operator, and 3:57 that will be applied 3:59 to the apply relation and 4:01 it will return as a 4:02 result, a subset of the apply relation. 4:06 So the general case of the 4:07 select operator is that we have the sigma. 4:10 We have a condition as a 4:12 subscript and then we have a relation name. 4:14 And we return as a result the subset of the relation. 4:17 Our next operator is the Project Operator. 4:21 So the select operator picks certain 4:23 rows, and the project operator picks certain columns. 4:26 So let's say we're 4:28 interested in the applications, but all 4:29 we wanted to know was the 4:30 list of ID's and the decisions for those applications. 4:34 The project operator is written 4:35 using the Greek pi symbol, 4:37 and now the subscript is 4:39 a list of the column names that we would like to extract. 4:42 So we write ID, sorry, 4:43 student ID and decision, and 4:46 we apply that to the apply relation again. 4:49 And now what we'll get 4:51 back is a relation that has just two rows. 4:54 It's going to have all the 4:55 tuples of apply, but it's 4:56 only going to have the 4:57 student ID and the decision columns. 5:00 So the general case of 5:02 a project operator is the 5:04 projection, and then a 5:05 list of attributes, can be 5:07 any number, and then a relation name. 5:13 Now, what if we're interested in 5:14 picking both rows and columns at the same time. 5:16 So we want only some of 5:18 the rows, and we want only some of the columns. 5:20 Now we're going to compose operators. 5:23 Remember that relational queries produce relations . 5:26 So we can write a 5:27 query, say, with the 5:29 select operator of the 5:31 students whose GPA is greater than 3.7. 5:33 And this is how we do that. 5:35 And now, we can take 5:37 that whole expression which produces a 5:39 relation, and we can 5:40 apply the project operator to that, and 5:42 we can get out the student 5:43 ID and the student name. 5:49 Okay. 5:50 So, what we actually see 5:51 now is that the general 5:54 case of the selection and 5:55 projection operators weren't quite what I told you at first. 5:58 I was deceiving you slightly. 5:59 When we write the select 6:01 operator, it's a select 6:03 with the condition on any 6:06 expression of the relational 6:07 algebra, and if it's 6:08 a big one we might want 6:09 to put parenthesis on it, 6:11 and similarly the project operator is 6:13 a list of attributes from 6:16 any expression of the relational algebra. 6:18 And we can compose these as much as we want. 6:20 We can have select over project, over 6:22 select, select, project, and so on.