0:00 Good day, everyone. This is Dr. Soper here and today I 0:05 have the great privilege of giving you a crash course in the Structured Query Language. 0:13 We have many different topics for this lecture, but generally speaking all of the topics that we will address 0:21 are concerned with providing you with a basic set of skills that 0:26 will allow you to begin working with databases using the Structured Query Language. 0:32 So along the way today we will learn basic structures for performing our four basic database operations. 0:41 That is creating new data, reading existing data from the database, updating data, and deleting data. 0:50 And we'll learn about some additional interesting topics, including how to extract data from multiple tables 0:57 and how to use the Structured Query Language to create, modify, and delete various database objects 1:05 and constraints. As I mentioned, the primary focus of today's lecture 1:11 is on the Structured Query Language. The acronym for this is SQL and it's 1:17 pronouncd as Sequel or S-Q-L depending upon your preference. 1:24 I will use both of these throughout the rest of this lecture. The Structured Query Language was originally 1:30 developed by IBM in the 1970s. 1:35 So this has been around for a long time. It was originally called SEQUEL and it 1:40 was designed to support the relational database model which 1:46 was first proposed by Edgar Codd in his 1970 paper in Communications of the ACM. 1:53 As we move through our course together and we learn more about the relational model and relational databases, I hope that you 2:02 will come to appreciate just how amazing it is that no one's solved this relational data 2:08 problem until 1970. For all of human history, until just a few decades ago, 2:15 people were not able to think about data and relationships in this formal way that we are now all so familiar with today. 2:25 A few additional things that you should know about the Structured Query Language before we proceed 2:30 is that the ideas and capabilities of the Structured Query Language are grounded in relational algebra. 2:40 And the Structured Query Language, itself, is a standard. 2:46 So this is an ANSI standard and an ISO standard. One of the great things about the Structured Query Language 2:53 is that if you can learn to use SQL well, 2:58 then you will be able to sit down and use almost any database in the world with very little learning 3:07 curve, since virtually all major database vendors support the Structured Query 3:13 Language in their products. If you can learn this language well then you'll be able to work with databases regardless 3:21 of the vendor, regardless of the operating system, or regardless of the device. 3:27 So you could work with a SQL Server database, or an Oracle database, MySQL, DB2. 3:36 You can work on Linux, Unix, Mac, PC. You can even work on mobile devices. 3:44 So learning the Structured Query Language is an extremely valuable skill for anyone 3:49 who wants to work with databases. Another important characteristic of the Structured Query 3:55 Language to remember is that it is not in and of itself a programming language, 4:02 but rather it is a data sub language. You will come to appreciate, I hope, 4:08 that the Structured Query Language is simple to use and is fairly easy to understand. 4:16 Even if you've had no experience using SQL in the past, by the end of the lecture today, you 4:22 should be able to understand SQL statements and what they are intended to do and even write some simple SQL 4:29 statements of your own. The Structured Query Language itself can be subdivided into three major components. 4:38 These are the data definition language which we can refer to as DDL, the data manipulation language 4:46 which we can refer to as DML, and the data control language which we can refer to as DCL. 4:54 The data definition language provides us with a series of commands that we can use to create database objects and constraints. 5:04 The data manipulation language, by contrast, provides us with a series of statements 5:09 that we can use for creating new data, reading existing data from the database, 5:16 updating existing data in the database, or deleting data within the database. 5:22 Finally, the data control language is used for establishing things like permissions, user 5:30 accounts, security, and so forth. Our focus here today is going to be 5:36 on the first two of these three components. So we will be learning about some 5:41 of the elements of the data definition language and the data manipulation language. 5:47 Beginning with the data definition language, major statements within the DDL include CREATE, ALTER, 5:56 and DROP. CREATE is used for the purpose of creating new databases 6:02 or database objects. So if I want to create a new table, for example, 6:07 I can write a CREATE statement in SQL. ALTER is used for the purpose of modifying existing database 6:14 objects. So if we need to modify the structure of a table or if we need to add a new constraint 6:22 we can use the ALTER statement within SQL. Finally, we have the DROP statement. 6:29 And this component of the DDL allows us to delete existing database objects. 6:35 So for example, if we need to delete a table or if we need to delete a view we 6:41 can use the DROP statement to accomplish that task. Let's begin by taking a closer look at the CREATE statement. 6:48 In this case, we are looking at a CREATE TABLE statement, the purpose of which is to establish a new table 6:54 in the database named Employee. Let's examine some of the characteristics of the CREATE 7:00 TABLE statement. First, note that we begin just by using the statement 7:05 itself, CREATE TABLE. And then whatever comes after that 7:11 will be the name of the table. Remember that each table within the database 7:18 must have a unique name. In this case, we're creating an Employee table. 7:23 The next component is an open parenthesis. Everything that will be a part of the CREATE TABLE statement 7:31 must fall between the open and close parentheses. Also note that the SQL statement ends with a semicolon. 7:41 It is always good practice to end all of your SQL statements 7:47 with a semicolon. What we see here is a very simple CREATE TABLE 7:52 statement in which all we are doing is establishing a table named Employee 7:59 which will contain two columns. The first of these columns is called Employee ID, which 8:05 I've abbreviated here as empId, and the second column is named Employee Name. 8:12 Let's begin by looking at the Employee ID column. You will notice that the first component is 8:18 the name of the column and then we have a number of spaces after which the word Integer appears. 8:27 This is referring to the data type of the column. So we are saying here that every employee 8:34 ID that is added into this new Employee table must be an integer. 8:39 Next we see a number of spaces and then the words NOT NULL. This indicates to the database that every row which 8:48 is added into the Employee table must have a value for the employee ID. 8:55 That is we do not allow null values to exist for the employee ID. And then we see a comma. 9:00 A comma tells the database that we have finished defining this attribute 9:06 and we are now ready to define the next attribute. A few notes here. I have made the CREATE TABLE statement extend over 9:14 four lines and this is just for convenience. It is for us, as human beings, to be 9:20 able to read the statement more easily. Similarly, I have added spaces between the names 9:28 of the attributes and their data types and whether or not the attribute allows null values. 9:35 Again, this is just for convenience and ease of interpretation by human eyes. 9:41 The database engine itself does not care about all of these extra spaces or line feeds. 9:47 It will treat this as one full and complete statement regardless of whether we have split it across multiple lines 9:56 or have added extra spaces in between the various elements of each statement. 10:02 Next, let's see how we can expand upon our CREATE TABLE statements in order to create tables 10:08 that are a little more interesting and a little more useful. We'll begin by learning how to define a primary key using 10:19 the Structured Query Language. And for this purpose we will use the CONSTRAINT keyword. 10:25 So within the Structured Query Language keys, such as primary keys and foreign keys, 10:31 are considered to be constraints. There are other types of constraints, as well, 10:37 and we will talk about those as we progress through this lecture. 10:42 Note that the CONSTRAINT keyword is used inside the parentheses 10:48 because it is a part of the CREATE TABLE statement. And note that we give the constraint a name. 10:55 In this case, we're defining a primary key for our Employee table and we are a naming that constraint Employee PK, empPk. 11:05 An important point to remember is that within a database 11:14 every constraint must have a unique name. After the name of the constraint, 11:20 we see the words PRIMARY KEY and then an open and close parenthesis. 11:27 This tells the database that we want to use our empId attribute, which 11:34 we defined a little earlier in the statement, as the primary key. 11:39 And as we know, every value of a primary key must be unique so we are placing this primary key constraint 11:48 onto the database. We are telling the database that each value within the Employee 11:55 ID column must be unique for the Employee table. 12:00 And then note that we, just as before, close the CREATE TABLE statement with a close parenthesis 12:07 and a semicolon. Here we see a variation of creating a primary key where, 12:15 instead of creating a simple primary key, we want to create a composite primary key. 12:21 In this case, the primary key consists of two attributes 12:26 named empId and skillId. And to create the composite primary key, 12:34 we simply need to separate these two attributes by a comma and put them both inside the PRIMARY KEY statement. 12:43 Now that we know how to establish both simple and composite primary keys, 12:48 let's take a look at how to create foreign keys. Now, just as a primary key is considered a constraint, 12:55 a foreign key is also considered a constraint. So we implement a foreign key relationship 13:02 using the CONSTRAINT keyword. Here we are establishing two foreign keys, one of which 13:10 is named empFk and the other of which is named skillFk. 13:16 And we are going to use these foreign keys to link our EmployeeSkill table to an Employee table 13:26 and to a Skill table. So conceptually speaking, we can imagine 13:31 that we have two parent tables, Employee and Skill, and then we have this third table named EmployeeSkill which 13:41 sits between the other tables and allows us to connect them together. 13:47 In this way, each employee will be allowed to have multiple skills and each skill 13:54 will be able to apply to multiple employees. To establish these relationships, 14:01 we need to add foreign keys to our EmployeeSkill table. 14:06 And as we can see, we have these two foreign key constraints, 14:11 empFk and skillFk. And to establish the foreign keys, we use the keywords FOREIGN KEY and then an open parenthesis 14:22 and inside those parentheses we specify the name of the attribute within this table 14:30 that we want to be a part of the primary key/foreign key relationship. 14:37 So we're saying FOREIGN KEY empId in the first example. 14:43 That's telling the database that we want to use the empId attribute in this table as a foreign key. 14:50 Following this we see the keyword REFERENCES and then Employee open parenthesis 14:58 empId close parenthesis. This tells the database where the other end 15:03 of the relationship should be. So we want this foreign key to point to the empId attribute 15:11 within the Employee table. We then see a comma and after that, we 15:17 provide our next constraint which, in this case, is the foreign key which links our EmployeeSkill 15:23 table to the Skill table. And we are linking these together 15:28 according to the skillID attribute in each table. So creating these primary keys and foreign keys 15:37 is actually quite easy in the Structured Query Language. Now, let's extend our knowledge of these constraints 15:44 even further by considering cascading deletes and cascading 15:49 updates. First, let's consider the employee foreign key 15:55 where here, after defining the foreign key, we see the keywords ON DELETE CASCADE. 16:03 This tells the database that we want to establish a cascading delete relationship between this table 16:10 and the Employee table. A cascading delete is used to help us maintain the integrity and the quality 16:17 of the data in the database. And here's how it works. In this case, we have an Employee table 16:23 and EmployeeSkill table. So let's imagine that we have an employee number one 16:32 and this employee has skills 101 and 102. 16:39 Now, if our employee number one leaves the organization-- maybe they find a better job or we decide to get rid of them-- 16:47 then we may want to delete the employee from the database. What a cascading delete will do for us 16:54 is when we delete our employee from the database, the database will follow the relationship line 17:03 into the EmployeeSkill table and will look for any records that were associated with the employee 17:11 that we want to delete. If it finds any of those records, it will delete them from the EmployeeSkill table 17:19 automatically for us. In this way, the database maintains the quality 17:24 of the data by eliminating any records in the EmployeeSkill 17:29 table which would no longer be associated with an existing 17:35 record in the Employee table. Next we will consider a cascading update, 17:40 which is very similar to a cascading delete in that the purpose of a cascading update 17:48 is to maintain the quality or the integrity of the data in the database. 17:54 In this case, we are establishing a cascading update between the Skill table and the EmployeeSkill table. 18:02 And what this means is that if we were to update one of the primary key values 18:09 in the Skill table, let's say that we change skill 101 to 105, then the database 18:18 will follow the relationship from the Skill table to the EmployeeSkill table and will look in the EmployeeSkill 18:25 table for any rows that contained skill number 101. 18:33 If it finds any of those rows, it will automatically update 101 to the new value 105 so 18:42 as to maintain the link between records in the Skill table 18:49 and records in the EmployeeSkill table. I hope that you can appreciate just 18:55 how useful these cascading deletes and cascading updates can be. 19:02 Next, let's take a look at the SQL ALTER statement. As I mentioned earlier, the ALTER statement 19:09 is used for the purpose of changing the structure of a database object. 19:14 For example, let's imagine that we created our Employee table, but after creating the table we realize 19:22 that we forgot to establish a primary key. Well, we could easily use the ALTER TABLE statement 19:30 in order to add a primary key to the table. The basic syntax for the ALTER statement is shown here. 19:38 In this case, we want to alter the Employee table and we use the keywords ALTER TABLE and then 19:44 the name of the table that we want to change. And then we specify the change that we want to make. 19:51 In this case, we want to add a new constraint, specifically a primary key constraint named empPk, 19:59 and we're telling the database that we want the empId attribute within this Employee table 20:06 to serve as our primary key. And by running that SQL statement 20:12 our Employee table would then have empId marked as the primary key. 20:19 Just as with our CREATE TABLE example, we see here that we use a comma separated list of attributes 20:27 inside parentheses to tell the database that we want to create a composite primary key, 20:33 in this case consisting of the Employee ID and Skill ID 20:39 columns. Adding a foreign key constraint to an existing table 20:44 is just as easy as adding a primary key. We use the ALTER TABLE statement, 20:50 specify the name of the table, and then we add a new constraint using the same syntax that we learned 20:58 for the CREATE TABLE statement. In this case, we are telling the database that we want to link the deptId attribute in this Employee 21:08 table to the deptId attribute in the Department table. 21:15 With respect to changing the data within the tables in our database, 21:21 there are really just three things that we can do. We can either add new data into the table, 21:27 we can change data that already exists in the table, or we can delete existing data. 21:35 And we accomplish these three tasks by using the INSERT INTO, 21:40 UPDATE, and DELETE FROM commands respectively. 21:46 Let's begin by considering the situation in which we want to add new data into the database table. 21:53 For this purpose, we can use the SQL INSERT INTO statement. 21:58 Note here that if we want to specify non numeric data, 22:04 if we want to add non numeric data such as text into one of the columns in the table, then 22:11 we must enclose that value in single quotes. Let's look at the syntax for the INSERT INTO statement. 22:19 Note that it begins with the keywords INSERT INTO followed by the name of the table in which we want 22:27 to add this new row of data. And then we have an open parenthesis 22:33 and a comma separated list of attribute names followed 22:38 by a close parenthesis. So what I'm telling the database here 22:43 is I want to add a new record into the Employee table and I'm going to be specifying values 22:50 for the employee ID, the salary code, and the last name 22:55 of the employee in that order. 23:01 Next, we see the keyword VALUES followed by an open parenthesis, a comma separated list of values, 23:09 and a close parenthesis. The values which appear inside this second set of parentheses 23:17 are the actual data values which will be added into the table. 23:22 In this case, the values are 62, 11, and Halpert. 23:29 Note that the order matters here. We have specified that we will be providing an employee ID, 23:38 a salary code, and the last name in that order. 23:43 The values that I provide inside the second set of parentheses then must match the same order of the values 23:52 as they appeared in the first set of parentheses. In this case, we're saying that the empId should be 62, 24:02 the value of salaryCode should be 11, and the value of lastName should be Halpert. 24:09 Now let's look at the UPDATE statement. UPDATE allows us to alter or modify 24:17 the values of an existing row or set of rows within a table. 24:25 Let's consider the syntax for the UPDATE statement. To begin, we use the keyword UPDATE 24:31 followed by the name of the table that we wish to update. Next, we use the keyword SET and then the name of the attribute 24:41 that we want to change followed by the new value for that attribute. 24:48 After which, we see a new keyword called WHERE which specifies which row or rows we 24:57 want to change. So in this first example, we're telling the database I want to change a value in the Employee table. 25:06 I want to change the phone number to 657-278-1234. 25:13 And I want to make that change only for the row in the table which has empId equal to 29. 25:23 So in plain language we're saying change the phone number for employee 25:28 ID number 29 to the value shown here. In our next example, we are setting the Employee ID to four 25:38 every employee in the table who's employee name begins with the letters D-A and we will learn a little bit more 25:46 about these wild card searches later. But the point here is to illustrate 25:52 that an UPDATE statement can change values for more than one row at a time. 26:00 And this point is further shown with our final example, 26:05 which just says UPDATE Employee SET deptID equals 3. And what this command will do is it 26:11 will change the department ID for every employee in the table 26:17 to the value 3. One of the things that you will learn as you become more skilled with the Structured Query Language 26:24 is that often the shortest statements can cause the most 26:30 change in the database. So here, if we had a table, an Employee table, 26:38 which contained one million rows, this statements UPDATE Employees SET 26:43 deptId equals 3 would make one million changes to that database. 26:49 It would set the department ID to 3 for every employee. 26:55 And what if we wanted to delete one or more rows from an existing table? 27:01 In that case, we would use the DELETE FROM statement. And the syntax is simple. 27:07 We begin with the keywords DELETE FROM and then the name of the table from which we want to delete data. 27:13 In this case, we're telling the database that we want to delete some data from the Employee table. 27:19 Next we see WHERE clause and that specifies which employee we want to delete. 27:26 So in this case, we're telling the database delete the record in the Employee table who's 27:32 Employee ID has a value of 29. 27:37 Just as with the UPDATE statement, we can use the DELETE FROM statement 27:43 to delete more than one row at a time within the table. And this is what we see in our second example. 27:51 Here we're telling the database to delete all of the rows in the Employee table 27:57 where the name of the employee begins with the letters D-A. So any employees named Dan or Daniel or David or Danica 28:09 or Danielle or Dana they would all be deleted from the employee table 28:16 if we were to run this query. And our last example is the simplest, DELETE FROM Employee. 28:25 Again, this is a short command which has the potential to do the most damage or cause the most 28:32 change within the table. In this case, if we were to run that query 28:40 it would delete every employee in the Employee table. 28:45 Now we can turn our attention to SQL SELECT queries. A SELECT query is used to extract information 28:54 from the database, or perhaps a more interesting way of thinking about this is we can use the SELECT query to get 29:03 answers from the database. So for example, if I want to know which employees work 29:10 in the Marketing Department, that's a question that I can ask the database using a SQL SELECT statement 29:17 and it will give me the answer. Similarly, I might ask questions like what 29:23 is the salary of employee number two or how many employees work in the Sales Department 29:31 or what was the total amount of sales for last month? 29:36 These are all questions that we can ask the database using a SQL SELECT query 29:42 and it can provide us with the answer. To begin, we specify the SELECT keyword followed 29:49 by the names of one or more columns that we want to appear in the results. 29:56 So values for the specified columns which appear after the keyword SELECT 30:02 will become a part of our results. In this first example, you can see that we are using SELECT empName. 30:10 This means that the only column which will appear in our results will be the empName column. 30:18 Next, we see the FROM keyword and after the FROM keyword we specify one or more tables from which we 30:26 want our data to be extracted. In this example, we're just saying we want data 30:32 from the Employee table. And just as with the UPDATE and DELETE FROM statements, 30:39 we can, if we want to, use a WHERE clause which will filter the results. 30:47 So in this first example, in simple language we're telling the database give me 30:52 the name of the employee in the Employee table whose ID is equal to 33. 30:59 In our second example, you will notice that we do not 31:04 have a WHERE clause. And in simple language, this query is telling the database to give us 31:11 a list of all of the employees in the Employee table. One of the most important concepts for you 31:18 to understand about these SQL SELECT queries is that the results of the query are a relation. 31:28 That is, as long as there are some data to be retrieved, the result will be a two dimensional table of data. 31:36 It may contain just a single row or a single column or it may contain many rows or many columns 31:44 but the results can be thought of as a two dimensional table of data. Because the result of a query is a two dimensional 31:51 table of data, we can then use that result, if we want, 31:57 to perform other tasks. Such as, we can use the result to create a new relation 32:03 or we can use the result of one query as input into another query. 32:09 We'll take a look at that situation a little later. Just remember that our result set, that 32:16 is the results that we get back, the table of data that we get back in response to our query, 32:22 may not in third normal form. So it is a relation, but it may not be in third normal form. 32:29 And this is especially common when we perform something called a JOIN operation which 32:35 is where we are simultaneously pulling data from more than one table. 32:41 In our previous examples of using the SELECT statement, 32:48 we we're extracting just a single column of data from the database. 32:53 But, what if we want to get values for more than one column from the table. 32:59 Well in order to accomplish that task, we simply need to use a comma separated list of column names. 33:08 In this example, we're telling the database to give us a list of employee IDs and employee names 33:15 from the Employee table. So it will return a result set which contains two columns, empId and empName, and then 33:25 all of the employee IDs and employee names which appear in the Employee table. 33:32 We also have the option of retrieving all of the columns within a specific table or set of tables. 33:39 And we can do this by using the asterisk character. 33:44 Just as a brief note, I will commonly refer to this as a star instead of an asterisk. 33:52 So in this example here, we see SELECT star FROM Employee. 33:59 This tells the database to retrieve all of the data from the Employee table, every column and every row. 34:08 Remember the star represents the concept of all columns. 34:14 Not all rows, but all columns. So this example says select all columns 34:21 from the Employee table. And because we are not specifying any filtering 34:27 criteria using the WHERE clause, it will also return all rows. 34:34 What if we have a situation where the result of our query contains many duplicate rows? 34:42 Well we can eliminate that problem by using the SQL DISTINCT keyword. 34:48 And if we place the DISTINCT keyword in front of one of our column names 34:54 then the database will remove any duplicate values 35:00 of that attribute when generating the results. In this example, we're telling the database 35:07 to give us a list of department IDs from the Employee table, 35:12 but we don't want any duplicates in the list. That is, we just want all of the unique department IDs. 35:19 In a few of our previous examples, 35:25 we used the WHERE clause and so at this point I think it would be a good idea to explore the WHERE clause 35:32 in a little more detail. Basically, what the WHERE clause does 35:37 is it allows us to tell the database to constrain its actions according 35:45 to some kind of filtering criteria. In this example, we're using a WHERE clause with just a simple 35:53 equality check. So we're telling the database to give us the names of all of the employees in the Employee table 36:02 that have a value for deptId of 15. 36:08 In simple language, we're saying give us a list of all of the employees who work in department 15. 36:15 The WHERE clause is being used here to restrict the full set of employee names 36:21 to just to those that work in department ID number 15. 36:27 So we have an equality check here, a simple equals sign, but there are many other types of comparisons that can 36:34 be done in the WHERE clause. So aside from equals, which is just a simple equal sign 36:40 as we saw, we can do not equals using open and close pointy brackets or in some databases 36:49 you can use an exclamation point and an equal sign for not equal to. We could also represent the concepts of greater 36:57 than or less than, greater than or equal to, and less than 37:03 or equal to. And using these various comparisons gives us a great deal of flexibility 37:10 in the types of filtering that we can do in the WHERE clause. What's more is that we can create compound conditions. 37:20 We can filter items based upon more than one criterion. 37:26 One of the ways that we can do this is by using the AND keyword. So if we want to check if two conditions are true, 37:35 we can use the AND keyword. And in terms of relational algebra, this represents an intersection of the data sets. 37:44 We also have access to an OR keyword. And in terms of relational algebra, 37:51 this represents a union of the data sets. One way to think about AND and OR 37:59 is in terms of Venn diagrams. Imagining that we have two sets, A and B. 38:07 The AND operation is the intersection of those two sets. It is the area of overlap between A and B. 38:16 The OR operation, however, is saying if either A is true 38:22 or B is true or both A and B are true, 38:28 then the entire function returns as true. Now, let's look at a few quick examples of these AND and OR 38:37 operators being used. In this first example, we are asking the database 38:45 to give us a list of employee names from the Employee table 38:50 where the employees' department ID is less than seven 38:56 or the employees' department ID is greater than 12. So any employee record in that table 39:05 that has a value of the deptId of less than 7 or a value of the deptId of greater than 12 39:12 will become a part of the results. In our next example, we are asking the database 39:19 to give us a list of employee names from the Employee table 39:24 where the employee works in department ID number 9 and where the employee also has a salary code 39:33 of less than or equal to 3. So in this example both of these conditions 39:38 must be true in order for a row to be included in the results set. 39:44 That is, the employee's department ID must be 9 and the employee's salary code must be less than 39:51 or equal to 3. Whenever I'm teaching the Structured Query Language, 39:57 I always like to provide people with shortcuts or easy ways of doing things. 40:04 One example of that is the use of the IN keyword. 40:09 Now the IN keyword can be used to provide a list of values 40:15 for a WHERE clause. And perhaps the best way to learn this is by looking at an example. 40:23 Here we see a SELECT query which is asking the database to give us a list of employee names from the Employee table 40:33 where the department ID for the employee is either 4, 8, or 9. 40:43 So as the database looks through this table and is attempting to assemble the set of results, 40:50 if the deptId is equal to the values 4, 8, or 9, 40:56 that row will be included in the results. Otherwise, they will not become a part of the result set. 41:03 Now we can compare this method to a method which uses a series of OR operators. 41:11 So instead of saying WHERE deptId IN 4, 8, 9, 41:16 we could instead say WHERE deptId equals 4 OR deptId equals 8 OR deptId equals 9. 41:26 Hopefully you can see that using the IN keyword can save you a few key strokes, especially 41:32 if you have a long list of acceptable values that you would like to check for in the WHERE clause. 41:39 We also have access to the NOT operator 41:45 which allows us to invert the filtering criteria specified in the WHERE clause. 41:51 Consider this example. Here we are asking the database to give us 41:56 a list of employee names from the Employee table where the department ID is not 4, 8, or 9. 42:07 So the results here will be a list of employee names from the Employee table for every employee except for those employees who 42:17 work in departments 4, 8, or 9. 42:23 Another useful shortcut is to use the BETWEEN key word. 42:28 BETWEEN allows us to specify a range of values in our WHERE clause such that any value 42:35 between a minimum and a maximum value is considered acceptable. 42:41 Let's see an example. Here we are asking the database to give us a list of employee names from the Employee table 42:50 where the salary code falls between 10 and 45. 42:55 So as the database is scanning through the Employee table trying to assemble our results, 43:03 it will check to see if the value of the salary code for each row falls somewhere between 10 and 45. 43:11 And that is an inclusive range, meaning that 10 and 45 fall within the range. 43:17 If a row has a salary code that falls between 10 and 45, it will be included in the result set. 43:24 If not, that row will not be included in the result set. 43:29 This is equivalent using a compound WHERE statement 43:36 with an AND operator. So instead of saying WHERE salaryCode BETWEEN 10 AND 45, 43:44 we could instead say WHERE salaryCode is greater than or equal to 10 AND salaryCode is less than 43:52 or equal to 45. The results will be the same. 43:58 Next, I would like to talk about the SQL LIKE keyword. And the LIKE keyword is very useful 44:05 because it allows us to do partial searches. So for example, if I would like to generate a list of employees 44:15 whose names start with the letter D or if I would like to retrieve a list of customer email 44:22 addresses that end in gmail.com I 44:28 can use the LIKE keyword in order to achieve that result. 44:33 SQL provides us with a few different types of wild card characters that we can use for these types 44:40 of partial searches. First, is the percent sign. 44:45 Now a percent sign means match any number of characters. 44:50 So this is any number of wild card characters. To whereas the second type of wild card character 44:57 is a single character wild card. And for this purpose we use the underscore symbol. 45:05 This allows us to match values against a single, unknown character. 45:11 Let's see some examples of how these wild card searches work. In this first example, we are asking the database 45:18 to give us a list of employees from the empId table whose 45:24 names begin with D- A. So we see our percent sign 45:33 wild card character shown here. We're saying we want the employee name to look like D- A and then any set of characters 45:42 thereafter. So, zero or more characters thereafter. This means that values of Dan or Daniel 45:50 or Danica or Danielle or David or any other employee name that 45:58 starts with the letters D- A would be a part of that result set. In our second example, we see the use of the single wild card 46:07 character. And in this case, we are asking the database to give us a list of employee IDs for employees 46:15 whose phone numbers begin with 6-5-7 dash 2-7-8 dash 46:22 and then contain exactly four unknown characters. 46:27 So here we're using the single character wildcard in order to tell the database that we want 46:33 to match a very specific pattern of telephone numbers. 46:39 This wild card search capability that is built into the Structured Query Language 46:45 can be easily used to build something like a simple search engine. 46:52 Thus far, with all of our SQL SELECT statements we've retrieved a list of results 46:58 but we've said nothing about the order in which the rows contained within those results 47:05 would appear. I, for example, have asked many times for a list of employees, 47:10 but they can be in whatever order the database chooses to put them in. If I want to instruct the database 47:18 to return the results in a specific order, then I can use something called an ORDER BY statement. 47:27 In this example, we are asking the database to give us all of the columns out of the Employee table 47:36 and we want the results to be ordered by the employee name. 47:43 In this case, we can assume that empName is an alphanumeric column that contains text values, 47:51 so they will be ordered alphabetically. Note that SQL supports two different types of sorting. 48:00 You can do and an ascending sort or a descending sort. 48:06 By default, if we do not specify a sort type, SQL will assume that we want an ascending sort. 48:15 If we like, we can tell SQL specifically what type of sort to perform and we can do this by using the keyword ASC which 48:26 stands for ascending or the keyword DESC which 48:31 stands for descending. And all we need to do is add these characters 48:38 after the name of the attribute by which we want to sort and the database engine will sort the results according 48:46 to our desires. A few interesting things that I need to point out about sorting. 48:53 Most people are aware that an ascending sort on numeric values means that the resulting sorted 49:04 values will go from small numbers to large numbers. 49:09 Similarly, an ascending sort on text would be an alphabetic sort. 49:15 So in the English language we would expect the results to be sorted A through Z. 49:22 And of course, we can do a descending sort which inverts the sort operations. 49:27 So a descending numeric sort will have numbers that go from high to low, or large to small 49:36 if you prefer, whereas a descending sort on a text 49:42 column will be a reverse alphabetic sort. 49:48 Here's a question for you. If I perform a sort on a series of dates, 49:53 what do the results look like? Well, if we're sorting on a date or time column 50:03 and we're using an ascending sort, the results will be sorted from oldest at the top 50:11 to the newest at the bottom. Similarly, if we do a descending sort on a date or time field 50:20 then the newest records will appear at the top and the oldest records will appear 50:25 at the bottom of the list. SQL provides us with several additional 50:33 built in functions that can be very, very useful for our querying purposes. 50:40 For example, we can use the COUNT function to get a count of the number of rows 50:45 that match some sort of specified criteria. Or if we want to find the minimum or maximum value 50:54 for a column within a set of rows, we can use the MIN and MAX functions. 51:01 We also have access to a SUM function if we would like to compute a total for a specific column. 51:07 So for example, I may be interested in knowing what my total sales were yesterday. 51:13 Or we might want to determine an average. 51:19 As an example, I might say what were the average daily sales last week? 51:25 Or a standard deviation. So taken together we have access to some very useful 51:30 built in functions within the Structured Query Language. Let's see some examples. 51:38 In this first case, we're saying simply SELECT COUNT star FROM Employee. 51:44 This is just asking the database to give us the number of records in the Employee table. 51:50 Or in simpler language, how many employees are in the Employee table. 51:56 In our second example, let's imagine that we are interested in learning something 52:01 about the total number of hours that are required for the various projects within our company. 52:10 In this case, we're using the MIN, MAX, and AVG functions in order to determine the minimum number of hours 52:18 involved in a project, the maximum number of hours involved, and the average number of hours involved. 52:25 So when processing this query, the database will first scan through the entire project table 52:32 and extract the set of rows whose projID values are greater 52:37 than 7 and then it will look within that set of rows 52:42 at values of the hours column and it will use those values to determine 52:48 the minimum number of hours worked, the maximum number of hours worked, and then it will also compute the average number of hours 52:56 for all of those projects. And then return the results to us. Very convenient. 53:02 Another extremely useful statement 53:08 within the Structured Query Language is the GROUP BY clause. And what the GROUP BY clause does 53:15 is it allows us to combine results into some sort of categorical or categorized output. 53:24 Let's look at an example. In this case, we want the department ID 53:34 and the number of employees that work in each department. 53:39 And we can do this by using the GROUP BY clause. So we tell the database to give us deptId and the count 53:47 or the number of rows within the Employee table, but we want the results to be grouped together by deptId. 53:57 So in this case, the result of this query would be a table which contains two columns, the deptId 54:05 and another column which will be named numOfEmployees, 54:10 and each row then will contain the department ID and its associated number of employees. 54:18 Note that I'm using something here called an alias to refer to the result of the count operation. 54:26 And this is implemented in this case using the AS keyword. 54:32 So I'm telling the database that I want the result of the COUNT function 54:37 to be called numberOfEmployees. The use of aliases can be very convenient 54:44 and we will see some more examples later in our lecture. 54:50 If we're doing a GROUP BY in order to categorize our results in some way, 54:57 we may also want to use the HAVING statement. And what the HAVING statement does 55:03 is it allows us to filter categorized results. Let's look at this example. 55:10 In this case, I'm asking the database to give me a list of salesperson IDs, salesperson last names, 55:19 and the total sales, which in this case is computed as the sum of the sale amount, 55:28 out of the Sales table. And I'm grouping by the salespersonId 55:33 and the salespersonLastName which means I want the results to be combined by the sales person. 55:41 And I'm also using the HAVING clause to constrain the set of results only to those salespeople who 55:52 have a total amount of sales that is greater than or equal to 10,000 55:57 in whatever units our sale amount column is measured. So perhaps units sold or total dollars sold, et cetera. 56:06 So HAVING can be very, very useful for filtering a set of categorized results 56:13 that has been generated using a GROUP BY clause. Another way of thinking about the HAVING clause 56:20 is that it serves the same purpose as a WHERE clause but for a GROUP BY statement. 56:28 In all of the examples of using the SQL SELECT 56:34 statement that we've seen thus far, we have been extracting data from just one table. 56:42 But oftentimes the kinds of questions that we will want the database to answer for us 56:50 will require that we extract data from more than one table. 56:55 And the Structured Query Language provides us with several different ways of doing this. 57:02 The first approach that we will consider is the use of a subquery. 57:08 Now remember as I said earlier that the result of a SQL query 57:15 is a relation in to two dimensional table of data. 57:20 And because of that, it's possible to use the results of one query as input into another query. 57:27 And this is what we mean by the term subquery. Note there are two different types of subqueries. 57:35 They are non correlated and correlated subqueries. 57:42 I know these are very technical terms but I will try to explain them using examples 57:47 that are easy to understand. To begin, let's consider a non correlated subquery, 57:53 which I think is the easier of the two to understand. 57:58 So here we see an example of a non correlated subquery. 58:04 What characterizes a non correlated subquery is that the inner query, that is the query that 58:12 is contained within parentheses, only needs to be run once in order for the database engine 58:21 to answer the question. Let's look at this a little more carefully. 58:27 Our outer query begins with SELECT empName FROM 58:33 the Employee table WHERE the department ID is equal to some set of values. 58:41 That is our outer query. The set of values that will be used for answering that outer query will be 58:48 the result of the inner query. And in this case, the inner query says give me a list 58:55 of department IDs from the Department table where the department name begins with the letters A-C-C-O-U-N-T. 59:04 So any department that begins with the word Account, which might include Accounting or Accounts payable or Accounts 59:12 receivable, would have its department ID included in the results. 59:18 That list of department IDs then becomes input into the outer query, which would use those department 59:27 IDs in order to produce the list of employee names. 59:32 So in plain language, if our company has an Accounts Receivable and Accounts Payable department, 59:43 then what this query would say is give me a list of employee names for any employee that 59:49 works in the Accounts Receivable or Accounts Payable department. 59:55 Note here that the inner query, that is the query that is contained within the parentheses, 1:00:02 only needs to be run once in order for the database to solve the problem. 1:00:09 That is we only need to get the list of department IDs a single time in order for the outer query to be solved. 1:00:19 This is a non correlated subquery. Now let's look at a correlated subquery. 1:00:27 In contrast to a non correlated subquery, in a correlated subquery the inner query, 1:00:35 that is the query inside the parentheses, will need to be run many times in order for the database 1:00:43 engine to solve the problem. The reason for this is that the inner query 1:00:49 is going to need a value from the outer query in order to do its job. 1:00:56 Let's see an example of this. Here we're asking the database for a list of employee names 1:01:04 from the Employee table and know that I'm using a lowercase letter e here as an alias for this Employee 1:01:14 table. So this allows me to refer to the Employee table in the outer query just by using the letter e. 1:01:24 So I want a list of employee names from the Employee table where the employee salary is greater 1:01:31 than some value and that value is determined by the inner query. 1:01:38 In this case, the inner query is computing the average employee 1:01:44 salary for the department in which the employee in the outer query works. 1:01:53 So imagine that the database is trying to answer this question. It's going to begin scanning through the Employee table. 1:02:02 And let's say that we arrive first at employee number one. And employee number one works in department five. 1:02:12 So the database is going to say, should I include this employee in the results. 1:02:18 And in order to answer that question it needs to compare employee number one's salary 1:02:25 to the average salary of the employees who work in the same department as employee number one. 1:02:34 So that means that we need the department ID of the employee that we are currently considering in order 1:02:42 to answer the inner query. In this case if our department ID is five, 1:02:47 the inner query will say give me the average employee salary from the employee table for department number five. 1:02:55 And that five is associated with the employee that we are currently considering in the outer query. 1:03:03 This is called a correlated subquery. As I mentioned earlier, the SQL provides us 1:03:10 with several different ways of extracting data simultaneously from more than one table. 1:03:16 Subqueries that we just examined were one of those ways. 1:03:22 The second way is through the use of something called a join. And a joint allows us to merge data 1:03:29 from more than one table together into a single result set. 1:03:36 There are several different types of joins within the Structured Query Language and I've organized the most important types here 1:03:45 into what I call Dan's Typology of Database Joins. So all database joins can be divided into inner joins 1:03:53 or outer joins. There are three different types of outer joins that we will consider. 1:03:59 They are a left outer join, a right outer join, and a full outer join. 1:04:05 And we will also take a look at inner joins. Now technically speaking, there are several different types 1:04:13 of inner joins, such as an equi join or a natural join, 1:04:18 but for our purposes we will just need to focus on an inner join as a more generic concept. 1:04:25 To begin, let's see how we can implicitly join two tables together using an inner join in the WHERE 1:04:34 clause. In this case, I'm asking the database to give me a list of employee names and their associated 1:04:42 department names. And note that we are now including more than one table 1:04:50 in our FROM clause. So we're saying give me data from the Employee table, which 1:04:55 we will refer to as E, and the Department table, which we will refer to as D, where the department ID in the Employee 1:05:06 table matches the department ID in the Department table. 1:05:12 What this means then is a database will take the department ID in the Employee table 1:05:17 and will look for a matching department ID in the department table. 1:05:23 When a match is found it will extract the department name from the Department table and the employee name 1:05:31 from the Employee table and will then include those value as those in the row in the results set. 1:05:38 An important thing to note here is that a result will only be included in the result set 1:05:45 if a match occurs. Let's see another example of how we can implement an inner join. 1:05:54 In this case, we're using the INNER JOIN key words in order 1:05:59 to move the join operation out of the WHERE clause 1:06:04 and into the FROM clause in our SQL SELECT query. 1:06:09 This allows us to remove some of the clutter out of the WHERE clause so that it can be used exclusively for other filtering 1:06:17 purposes. Let's take a look at the syntax. In this case, we're asking the database 1:06:22 to give us a list of employee names and department names out of the Employee table, which 1:06:28 we are referring to here as e. And then we see these new keywords INNER JOIN the Department table, which we 1:06:35 will refer to as d ON e.deptId equals d.deptId. 1:06:43 So we're telling the database look in the Employee 1:06:49 and Department tables, link those tables together by their department IDs, and then constrain the results only 1:07:00 to those departments that do not begin with the word Account. 1:07:06 So in plain language, we are asking the database here to give us a list of employee names and department names 1:07:13 for every employee except for those that work in any department which begins with Account, like Accounts Receivable or Accounts 1:07:22 Payable. Any employee that does not work in those departments will be a part of the result along with the name 1:07:29 of the department in which they work. Earlier I said that in order for a record 1:07:36 to be included in the results of an inner query, 1:07:41 it must have matching values from both of the tables that are involved in the join. 1:07:49 The major distinguishing characteristic between an inner join and an outer join is just that point. 1:07:58 In an inner join, records can only be included in the results set if there 1:08:05 are matched pairs of values in the two tables that are involved in the join. 1:08:11 In an outer join, the matched pairs of values will be included in the results, but so, too, 1:08:19 might be additional records that have no matching value in the other table involved in the join. 1:08:28 Let's see an example of how this works. In this case, we are asking the database 1:08:35 for a list of employee names and department names from the Employee table, which we will refer to as e, 1:08:44 and the department table, which we will refer to as d. We are joining these tables together using a LEFT OUTER 1:08:52 JOIN operation. Where the department ID in the Employee table 1:08:58 matches the department ID in the Department table. Now what the results of this query will be 1:09:05 is a list of employee names and department names. 1:09:12 The results will contain every employee name in the Employee 1:09:17 table along with the department names of those employees which have a department ID assigned to them. 1:09:28 If an employee in the Employee table does not have a department ID, then they 1:09:36 will still be included in the results it's just that there will be no department name associated 1:09:43 with that employee. It will be a null value in the result set. 1:09:50 Now the term LEFT OUTER JOIN here is being used to tell the database from which 1:09:58 table all of the results should appear regardless 1:10:03 of whether there is a matching value in the other table. If you will notice in the FROM statement 1:10:11 we have two tables listed, Employee and Department. 1:10:17 The Employee table is named on the left side of the join statement, so to the database that is 1:10:25 the left table in this query. To whereas the Department is the right table. 1:10:30 It appears on the right side of the join statement. 1:10:36 So when we do a LEFT OUTER JOIN then, in this example, we're saying give me a list of all of the employees 1:10:44 regardless of whether they have a matching department. But if they have a matching department, 1:10:50 include the department, as well. A RIGHT OUTER JOIN, by contrast, tells the database 1:10:58 to include all of the records in the right table regardless of whether there is a matching 1:11:05 record in the left table. In this case, we are instructing the database 1:11:12 to include all of the departments in the results regardless of whether there is an employee assigned 1:11:21 to those departments. If an employee is assigned to a department, 1:11:27 then that information will be included in the results. But if not, if there is a department that currently 1:11:35 has no employees in it, it will appear in the table, as well. 1:11:40 It will just have an empty or null value in the results 1:11:47 where the employee name would otherwise be. Finally, we can consider a FULL OUTER JOIN 1:11:54 which instructs the database to include all of the records from both the left table and the right table 1:12:01 regardless of whether a match exists. But if a match exists then those records 1:12:07 will be matched up in the results. In this case, we're instructing the database 1:12:12 to include all of the employees and all of the departments in the results 1:12:17 regardless of whether they have any matching values in the other table that is involved in the join operation. 1:12:26 At this point, we have a basic level of understanding on how to use the Structured Query 1:12:32 Language to create, read, update, and delete data 1:12:37 within tables. So let's return briefly to the data definition language sub 1:12:45 component of the broader Structured Query Language and learn how to perform a few additional tasks. 1:12:53 First among these is dropping an object from the database. 1:12:59 For example, if we wanted to entirely get rid of a table, 1:13:04 not the data within the table but the entire table itself, we could use the DROP statement. 1:13:12 Specifically, if we were trying to get rid of the Employee table, we could say DROP TABLE Employee. 1:13:21 Very short set of commands and that will not only delete all of the data within the table, 1:13:29 but it will remove the table itself from the database. Remember, there's an important distinction 1:13:35 between deleting the data in a table and deleting the table itself. 1:13:40 Removing the table itself means there's no longer, in this case, any Employee 1:13:46 table in the database at all. Recalling the use of our ALTER statements and CONSTRAINT 1:13:54 statements, here we can see how to get rid of a constraint 1:13:59 if we need to. So in this example, we want to remove the employee foreign key 1:14:08 from the Employee table. And we accomplish this by using an ALTER TABLE statement. 1:14:15 So we say ALTER TABLE we specify the name of the table that we want to alter in this case, the Employee table, 1:14:21 and then we tell the database to DROP the constraint that is named employee foreign key. 1:14:29 Remember that in the context of the ALTER TABLE statement, we used ADD CONSTRAINT to add a new constraint to a table 1:14:39 and here we used DROP CONSTRAINT to remove that constraint. 1:14:45 An additional type of constraint that we can create is called a CHECK constraint. 1:14:52 CHECK constraint can be used to specify an acceptable set 1:14:58 of values that are allowed to appear within a column in a table. 1:15:05 Let's look at this example. Here we are altering the project table 1:15:10 to include a CHECK constraint, which we are naming projectCheckDates, and we 1:15:18 are instructing the database to check whether the start date is less than the end date. 1:15:25 If that condition is true, everything is acceptable. 1:15:31 If that condition is not true, that is if the proposed end date for the project 1:15:37 is less than or equal to the proposed start date, then the database will not allow this row of data 1:15:47 to be added to the Project table. This is another mechanism that we 1:15:52 have available in our toolbox for helping to protect the quality and integrity of the data 1:16:00 in the database. Finally, I just want to briefly mention 1:16:06 the concept of a SQL View. Now a SQL View is a virtual table 1:16:13 which is constructed by using a SELECT statement that 1:16:18 is stored in the database. Views can be useful in many different situations, 1:16:25 especially when we have different types of users who use our database. 1:16:30 To create a view, we simply use a CREATE VIEW statement 1:16:35 and then we specify the SELECT statement which forms the foundation of that view. 1:16:42 Let's look at this example. Here we are creating a view named SalesDepartment 1:16:49 which will include all of the columns from the Employee table, but will include only those employees that 1:16:56 work in the Sales Department. So we are using this SELECT statement 1:17:02 to construct a virtual table, which includes just a subset of all of our employees. 1:17:09 We only want to see the employees who work in the Sales Department. And then once this view is established, 1:17:16 I can write queries against that view just as if it were a table. 1:17:23 So for example, I could say SELECT employee name from the sales department view. 1:17:30 And in this case, the database would look in the View for employee names 1:17:35 and the results, by extension, would be a list all of the names of the employees 1:17:43 who work in the Sales Department. Well, my friends, thus ends our brief introduction 1:17:51 to the Structured Query Language. There is certainly much more to be learned about the Structured 1:17:57 Query Language than I presented here in this lecture. However, you should now have a basic set of skills in SQL 1:18:06 that will allow you to start working with databases using the Structured Query 1:18:12 Language immediately. I hope you learned something interesting. 1:18:20 Until next time. Have a great day.