Good day, everyone. This is Dr. Soper here. Today we will be exploring the sixth topic in our series of database lectures, with today's lecture focusing on database administration.

Although we have many different objectives for today's lecture, there are a few primary areas on which I will focus the majority of my comments. These are concurrent access, which is a situation in which more than one user wants to access a given database at the same time; database security; and the model that has been put in place for backup and recovery of these relational databases. We will of course address several additional topics along the way, but the primary focus of our lecture will be those three areas. In addition, toward the end of the lecture we will take a quick look at the administrative and managerial tasks that a database administrator must complete. We'll also examine distributed database processing, which is a situation in which our database might be replicated or partitioned such that it can exist on more than one machine. And we will conclude our lecture today with a brief overview of object-oriented databases and object-relational databases.

To begin, consider this conceptual representation of the database processing environment. This diagram is intended to show the complex environment in which modern database technologies are used. We can see here that the database management system exists at the center of this environment, and it is exclusively the database management system that has access to the data themselves. However, many different types of entities can make requests of the database management system. Aside from internal database triggers and stored procedures, we have application programs that can utilize our DBMS; web requests, perhaps in the form of Active Server Pages or JavaServer Pages, that similarly can utilize our DBMS; and any number of additional sources of requests that the DBMS may need to process, such as ad hoc queries, forms, reports, et cetera. So as we can see here, the possibility exists for a single database to have to serve many different types of requests simultaneously.

Because of this complexity, we need to view a database processing environment as a dynamic environment. In order to ensure that the database can meet the needs and demands of all of the different parties and software applications that may rely upon it, a database administrator must configure and manage the database such that it provides for concurrency control, security, and backup and recovery in case something goes wrong.

To begin, let's consider concurrency control. Broadly speaking, the idea of concurrency is that more than one user may be trying to access the same data in the database at the same time. A user in this case might be a person, a human being, or a software application, such as a standalone application, a web service, or a data-driven website. This need to provide for concurrent access leads to a problem known as interdependency, wherein modifications to the database that are made by one user may impact other users who are simultaneously trying to use the same information.
So the objective of concurrency control mechanisms in database administration is to ensure that the actions of the various users, who may be trying to access the same data at the same time, do not adversely impact one another. Another way of saying that is, we want to put a mechanism in place that ensures that the quality, reliability, and accuracy of the data remain intact, despite the fact that many users might be trying to work with or modify those data at the same time.

The primary challenge with concurrency control is balancing the need to maintain the quality of the data in the database against the need to provide accessibility to the data, so that many users can efficiently use the database at the same time. Thus, we have a spectrum. On one end of the spectrum, we can design our concurrency control framework such that when one user accesses or touches a particular set of data within the database, we block all other users from being able to use those data until the first user has finished whichever tasks he or she was trying to perform. The other end of this spectrum would be that we impose no constraints at all, and a user is allowed to read and work with data regardless of whether any other users are simultaneously using or working with those data. In a high-volume, high-user database processing environment, we must often try to find some balance between these two extremes.

Next, I would like to introduce the concept of a database transaction. Simply put, a database transaction is a series of operations that need to be performed against the database. One way to think about this is that a transaction is a series of SQL commands, or a series of instructions. Despite the fact that a transaction typically involves more than one operation, it is critical in the database processing environment that all of the operations or instructions involved in the transaction be treated as a single atomic unit, that is, a logical unit of work. Under this approach, every step in the transaction must be completed successfully, or the entire transaction will fail. If any of the steps fails, then the whole transaction fails, and we will need to undo, or roll back, any changes that were made to the database during the transaction prior to the point when the transaction failed.

Here we see an example of a transaction which involves three separate tables in a database: a Customer table, a Salesperson table, and an Orders table. On the left we see the state of these three tables before the transaction begins, while on the right we see their state after the transaction has completed. In this example, we're illustrating how a transaction can introduce anomalies into the database if one of the steps in the transaction fails. So let's move through this transaction one step at a time. Prior to running the transaction, we see that our Customer table contains one row for customer number 123. The first step in our transaction is to record new order information in the Customer table: order number 8,000, which is an order for 250 basketballs valued at $6,500.
So that's the first step in our transaction: to record that new information in the Customer table. The next step is to update the Salesperson table to reflect the fact that our salesperson Jones has made this sale of 250 basketballs valued at $6,500. Prior to the transaction running, the record for salesperson Jones in our Salesperson table showed that he or she had accumulated $3,200 worth of sales during this period. After the transaction, we have updated our salesperson's record such that he or she now has $9,700 worth of sales; the update reflects the additional sale of $6,500 worth of basketballs.

Finally, the third step in our transaction is to insert the new order into the Orders table. However, at this point in the transaction an error occurs. The error in this case is that the Orders table is full: perhaps we've run out of storage space, and the database is therefore not able to add any additional information to the Orders table.

Now, if we stop at this point, we have introduced several anomalies into our database. Namely, we have a new record in the Customer table which shows that a customer wanted to purchase 250 basketballs, and we've updated our Salesperson table to reflect the fact that salesperson Jones made that sale. However, the order information itself does not exist in the Orders table. If we were to stop at this point without undoing the changes that were made during the earlier stages of the transaction, then these anomalies would almost certainly cause problems for the organization.

By contrast, when all of the steps in a transaction are treated as atomic, as a single logical unit of work, any changes that have been made to the database tables will be undone if an error occurs during the execution of the transaction. So in this case, we can follow the execution of the transaction: the first step changes the Customer data to reflect the fact that 250 basketballs were sold; we then change the Salesperson data; we then try to insert the Order data. In this case an error occurs, so we must roll back the transaction in order to restore the database tables to the state that they had prior to the transaction. Following the rollback, it is as if the transaction never occurred.
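To make this concrete, here is a minimal sketch of how such an atomic transaction might be written in T-SQL, the SQL dialect used by SQL Server. The table and column names here are assumptions for illustration, not the exact schema from the slides:

    BEGIN TRY
        BEGIN TRANSACTION;

        -- Step 1: record the new order information for customer 123
        UPDATE Customer
        SET LastOrderNumber = 8000
        WHERE CustomerNumber = 123;

        -- Step 2: credit salesperson Jones with the $6,500 sale
        UPDATE Salesperson
        SET TotalSales = TotalSales + 6500
        WHERE SalespersonName = 'Jones';

        -- Step 3: insert the order itself; if this fails (e.g., the
        -- table is full), control jumps to the CATCH block below
        INSERT INTO Orders (OrderNumber, Item, Quantity, Amount)
        VALUES (8000, 'Basketballs', 250, 6500);

        COMMIT TRANSACTION;  -- all three steps succeeded
    END TRY
    BEGIN CATCH
        -- A failure at any step undoes the earlier steps as well,
        -- so no anomalies remain in the database
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
    END CATCH;

Because all three statements sit inside a single transaction, a failure at step three rolls back steps one and two too, exactly as in the rollback scenario just described.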
Next, let's see an example of how a database server might handle a concurrent processing situation. Imagine that we have two users, A and B, both of whom are trying to access information simultaneously in our Item table. User A makes a request to read information related to item 100, while at the same time User B makes a request to read information related to item 200. User A then makes a request to change a value associated with item 100, while User B makes a request at the same point in time to change a value associated with item 200. The resulting change for A is then written to the Item table, while at the same time User B is requesting that his or her changes be written to the Item table. Despite the fact that many of these requests are arriving simultaneously, the database server must process them in a sequential fashion.

As an example of how the database server might actually process these simultaneous requests, we can see that the DBMS has arbitrarily chosen to handle the requests from User A first. Although the requests from User A and User B to read items 100 and 200, respectively, arrive at the same time, the database first reads the value associated with item 100 for User A, and then reads the value associated with item 200 for User B. The proposed change for item 100 for User A is then handled, after which that change is written to the database for User A. Only after User A's modifications to the Item table have been completed does the DBMS handle the requests from User B to make changes to the Item table for item 200.

Without an intelligent concurrent processing strategy in place, the possibility exists for us to encounter something called a lost update problem. The lost update problem occurs when two or more users are simultaneously attempting to access the same piece of information in the database, and one or more of those users' changes are overwritten by the changes made to the database by another user. Let's see an example of how this can happen.

In this example, imagine once again that we have User A and User B, and both A and B are attempting to use information in the Item table. In this case, both users are interested in item number 100. Let's assume for this example that the current count value, say the quantity on hand for item number 100, is 10 units: we have 10 units of item 100 in our inventory. Now, both User A and User B simultaneously request that the database read how many units of item 100 are in inventory; again, the answer to that question is 10. User A, after receiving that information, wants to instruct the database to reduce the number of units of item 100 by five, and then write the result back to the database. Simultaneously, User B wants to reduce the number of units of item 100 in inventory by three, and then write the result back to the database.

Now, we as intelligent human beings can look at this problem and say: OK, we begin with 10 units. User A, perhaps a salesperson, is attempting to sell five units of item 100, while User B, another salesperson, is attempting to sell three units of item 100. So in total eight units are being sold, and the result after these two operations should be that two units of item 100 remain in inventory. We began with 10 units; User A sells five units, User B sells three units; the result should be two units.

Unfortunately, because these two users' requests are arriving simultaneously at the database, and in this example we do not have an intelligent strategy in place for handling this type of concurrent access, we are going to encounter some problems. Let's see how the database server might process these two requests. First, it might read the quantity of item 100 available for User A, returning a value of 10 to User A. It would then perform the same operation for User B, returning a value of 10 to User B. User A would then attempt to set the number of units of item 100 in inventory to five: our salesperson has sold five units, so we are subtracting 5 from 10, and 10 minus 5 is 5. Therefore User A is attempting to set the new quantity of item 100 to five units, and that value is then written to the database. So by the time we reach step four, the number of units available inside the database is now five.

The database then proceeds to handle the requests from User B. Recall that the database has told User B that there are 10 units in inventory. User B then attempts to reduce the number of units in inventory by three, the result of which would be 7, since 10 minus 3 is 7. That value is then written to the database. So by the time we have completed step six, the number of units of item 100 that our database shows in inventory is seven, when in fact it should be two. Another way of saying that is, because we did not have an intelligent strategy in place for handling this concurrent access, the changes that were made to the database by User A have been overwritten by the changes made by User B. This is an example of a lost update problem.
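As an aside, in a simple case like this one the lost update can be sidestepped by letting the DBMS perform the read-modify-write itself as a single atomic statement, rather than reading the value into the application first. A hedged sketch, assuming an Item table with a QuantityOnHand column:

    -- Each UPDATE reads, modifies, and writes the row as one atomic
    -- statement, so neither user works from a stale value
    UPDATE Item SET QuantityOnHand = QuantityOnHand - 5 WHERE ItemNumber = 100;  -- User A
    UPDATE Item SET QuantityOnHand = QuantityOnHand - 3 WHERE ItemNumber = 100;  -- User B
    -- In whichever order these are processed, the result is 10 - 5 - 3 = 2

Of course, not every transaction can be collapsed into one statement, which is why the locking strategies discussed below are needed.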
Speaking more broadly, there are three concurrency issues which might emerge during the execution of a transaction in a database processing environment that supports multiple simultaneous users.

The first of these is called a dirty read. In a dirty read, a transaction reads a record that has been modified by another user, but whose changes have not yet been committed to the database. By committed to the database, I mean that the database has been informed that a change should be made, but that change has not yet been made permanent; it has been proposed, but the database just has not yet made it permanent. So, when a transaction reads a record whose value has been modified by another user but not yet made permanent within the database, that is called a dirty read.

Our second concurrency issue is called an inconsistent read. In an inconsistent read, the transaction re-reads a data set that it has previously examined and discovers that the values of the data have changed since the last time it examined them. So imagine a transaction in which one of the early steps is to examine the values in a data set, after which it performs some additional operations on some other data. When it comes back and re-examines those original values, they've changed, meaning that in the interim some other user or some other process has altered the data upon which this transaction is relying. This is called an inconsistent read.

And finally, we have a concurrency issue known as a phantom read. In a phantom read, when the transaction re-reads the data set, it discovers that a new record has been added since the last time it examined the data set. So again, imagine a transaction that reads a data set somewhere in its early steps and then performs some operations against some different data values in the database. When it comes back, it finds that a new record or row has been added to the original data set, one which was not there during the initial read. This is called a phantom read.

In order to address these three different types of possible concurrency control issues, the general strategy employed by databases that support concurrent access is known as resource locking.
The primary thing to understand about this strategy is that when resource locking is in place, data that are being used by one user are temporarily locked, such that other transactions which want to use those data are not allowed to do so until the transaction which has locked the data has finished all of its tasks.

Database locks come in two different varieties: implicit and explicit locks. Implicit locks are issued automatically by the database management system, whereas explicit locks are issued by users who are requesting exclusive access to certain data within the database. So a user initiating a transaction may lock certain database objects such that all of the steps in the transaction can be completed without having to worry about any ill effects being introduced into the database, while the transaction is under way, by other users who may be trying to use the same data simultaneously. Any other users who attempt to use those data while they are locked are simply put into a wait state. That is, they are added to a queue, and the database will service their requests after the resources have been unlocked.

Note that explicit locks can be issued against several different types of database objects, at different levels of granularity. A transaction, for example, might request a table lock, in which an entire table of data is locked. At a finer level of granularity, the database might issue a row lock or a column lock, wherein a certain set of rows or columns within a table is locked while the transaction is unfolding. And at the finest level of granularity, we may issue a cell lock; that is, we may temporarily lock a single cell within the table while our transaction is using or manipulating its value.
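As an illustration of lock granularity, SQL Server lets a transaction request locks at different levels using table hints. A minimal sketch, with assumed table and column names:

    BEGIN TRANSACTION;

        -- Row-level lock: exclusively lock just the row for item 100
        SELECT QuantityOnHand
        FROM Item WITH (ROWLOCK, XLOCK)
        WHERE ItemNumber = 100;

        -- A table-level lock instead: WITH (TABLOCKX) would exclusively
        -- lock the entire Item table

        UPDATE Item
        SET QuantityOnHand = QuantityOnHand - 5
        WHERE ItemNumber = 100;

    COMMIT TRANSACTION;  -- all locks acquired here are released at commit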
Let's see an example of a concurrent processing situation when an explicit locking strategy is put in place. Once again we have Users A and B, and we recall our previous example in which both users are trying to access item 100 at the same time. We currently have 10 units of item 100 in inventory. User A wishes to reduce the count of item 100 by five units; User B wishes to reduce the count by three units. So the result, after both of these transactions have been processed, should be that we have two units remaining in inventory.

Let's see what happens when an explicit lock is put in place. The database server begins processing these two requests. It first locks item 100 for User A, then reads the value of the inventory count for item 100; that is, it returns a value of 10 units back to User A. At this point, the database server may receive the request from User B to lock item 100. However, since item 100 is already locked by User A, the database cannot service that request at this point in time, so it places B in a queue. It says: OK, User B, I will get to you eventually, but right now you have to wait. The database then continues processing User A's transaction. It sets the item count to five for User A, because recall that User A wanted to reduce the original item count by five units, and 10 minus 5 is 5. It then writes the result back to the table, after which the lock is released.

So by the time we have completed step six, the inventory count for item 100 in the database is five units. Now that User A has released the lock on the data, the database can honor User B's request to lock the data. It then reads the value for B; at this point it returns a value of 5 units back to User B. User B wishes to debit that value by three units, and 5 minus 3 is 2 units, so the new inventory count is set to 2. That value is stored to the database, and the database releases the lock that had been established for User B, such that other users now have access to item 100 should they need it. So by imposing this locking strategy, we no longer have the lost update problem that we saw in our previous example. Using the locking strategy, User B had to wait temporarily, but the result is that our inventory count is correct.

An important concept to understand at this point in our discussion is the notion of a serializable transaction. This is a philosophical orientation which says that if two or more transactions are processed concurrently, then after all of the transactions have been completed, the results in the database should be logically consistent with the results that would be in the database if the transactions had been processed serially, in arbitrary order. Let's revisit our previous example to see if we can understand this concept. When we went through this example the first time, User A's requests were serviced by the database before User B's requests. Let's now reverse that process, so that we handle User B's requests first, after which User A's requests will be handled by the database.

Again, recall that we are beginning with 10 units of item 100 in inventory. If we handle User B's requests first, we lock item 100 and read the number of units of item 100 in inventory for User B: that value is 10. We then reduce the count by three; that is, we set the value to 7, since 10 minus 3 is 7. We then write the result to the database, after which the lock is released. Next, a lock is issued for User A, and the number of units in inventory is read for User A; the database returns a value of 7. User A then makes a request to reduce the number of units in inventory by 5, and 7 minus 5 is 2, so the resulting value of 2 units is written back to the database.

Thus the final number of units in inventory is 2, regardless of whether we process User A's transactions first, as we did during our first trip through this example, or User B's transactions first. In either case, the result, using our explicit locking strategy, is an inventory count of two, and that is logically consistent with what we would expect.

So a locking strategy seems very logical and can have great benefits, but using a locking strategy can also introduce certain new problems into the database processing environment. One of those problems is known as deadlock. Deadlock occurs when two transactions have each locked resources in the database that the other transaction needs in order to complete its tasks. In this situation, without some mechanism for avoiding deadlock, the two transactions involved will have to wait on each other indefinitely.
Let's see an example that illustrates this deadlock phenomenon. Here we have a conceptual example for User A and User B. We're not going to worry about the technical details of the database; we'll just say that we're working with simple objects, paper and pencils. Imagine that User A initiates the following sequence of events: lock paper, then take some paper (that is, modify the value of paper), and then try to lock the pencils. Simultaneously, User B locks the pencils, tries to take or modify the value of the pencils, and then asks the database to lock paper.

Let's see what happens at the database server when these requests arrive. The database first locks the paper resource for User A. It then locks the pencils resource for User B. The database next processes User A's request to modify the value of paper and writes those results back to the database. It then handles B's request to modify the value of pencils and writes those results back to the database. And next, this is where the deadlock occurs: A has requested to lock pencils, but pencils have already been locked for User B. At the same time, B has requested to lock paper, but paper has already been locked for User A. Thus the database would put User A in a wait state until the pencils are released, while User B would be put in a wait state until the paper is released. These two transactions would then wait on each other indefinitely, and neither would be able to complete its tasks, because the resources each needs are locked by the other user. Database management systems must therefore incorporate a mechanism for detecting or avoiding this deadlock situation.
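Expressed as interleaved SQL, the same deadly embrace might look like the following sketch, assuming a hypothetical Supplies table:

    -- Session 1 (User A)
    BEGIN TRANSACTION;
    UPDATE Supplies SET Quantity = Quantity - 1
    WHERE Product = 'Paper';      -- Session 1 locks the paper row

    --   Meanwhile, Session 2 (User B) runs:
    --   BEGIN TRANSACTION;
    --   UPDATE Supplies SET Quantity = Quantity - 1
    --   WHERE Product = 'Pencils';   -- Session 2 locks the pencils row

    UPDATE Supplies SET Quantity = Quantity - 1
    WHERE Product = 'Pencils';    -- Session 1 now waits on Session 2

    --   UPDATE Supplies SET Quantity = Quantity - 1
    --   WHERE Product = 'Paper';     -- Session 2 now waits on Session 1

    -- Deadlock: SQL Server detects the cycle, rolls one transaction back
    -- as the "deadlock victim" (error 1205), and the other completes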
Next, let's talk about two major strategies toward locking, one of which we can call optimistic locking and the second of which we can call pessimistic locking. Conceptually speaking, in optimistic locking we are optimistic that the changes we make to the database during our transaction will succeed, and that no concurrency control problems will arise during the execution of the transaction. Conversely, in pessimistic locking we assume that a concurrency control problem might arise during the execution of our transaction, and so we proactively seek to avoid it.

Optimistic locking strategies are useful in database environments in which the majority of the transactions involve just reading data; that is, we're not making a lot of updates to the data. In this case, we can be optimistic that no concurrency control problems will occur, because it is unlikely that more than one user will be trying to update the same data at the same time. We simply don't have a lot of updates in our database, so the likelihood of having two simultaneous updates against the same data is small. By contrast, if we are in an environment where data are frequently updated, and the chances that two users might want to simultaneously update the same data are reasonably high, then we would want to implement a pessimistic locking strategy.

So let's consider the differences between these two locking strategies at a slightly lower level, beginning with optimistic locking. The basic process employed under optimistic locking is this: we first read the data requested by the transaction. We process the transaction, that is, we prepare any modifications or updates to the data that we need, and that update is recorded. We then check the database to see whether any sort of conflict or error occurred; is there anything unexpected? If no conflict has occurred, we commit the transaction and make our requested changes permanent; they then become a permanent part of the database. Otherwise, if some sort of conflict did occur, we initiate a rollback: we undo all of the steps of our transaction and try to repeat the transaction again, until it ultimately succeeds without a conflict.

Under the pessimistic locking strategy, we begin by locking the resources and reading the data that we need. We then process the transaction by making any updates or modifications to the database that are necessary, commit those changes, and release the locks. Note that in a pessimistic locking strategy we do not need to check whether a conflict occurred; we simply commit the changes directly. The reason is that all of the required resources for the transaction were locked prior to any updates or modifications being made to the data used by the transaction.

Let's see some intuitive examples of these optimistic and pessimistic locking strategies. Here we see an example of an optimistic locking strategy written in pseudo-code. To begin, we request some values from the product table. Specifically, we are interested in the quantity of pencils available in our database. So we request the number of pencils available and record that value in a variable which we have named "old quantity". Let's say that our number of pencils is 10, so we store the value 10 in our old quantity variable. We then set the new quantity value equal to the product quantity minus 5; that is, we set the value equal to 10 minus 5, or five units. Thus the value of new quantity should be equal to 5. We then continue processing our transaction, looking for any sort of errors, et cetera. At this point we issue a lock, and we then attempt to carry out our update. We instruct the database to update the value of pencils, setting it to the value stored in our new quantity variable, where the quantity of pencils in the database is equal to the old quantity; that is, equal to 10.

If the quantity of pencils has changed since we first retrieved it at the beginning of our transaction, that is, if the quantity of pencils is no longer equal to 10 when we attempt to issue our update, then we know that some other user or process has modified the quantity of pencils while we were busy processing other parts of the transaction. So when we check and determine that our update was not successful, we need to roll back this transaction; that is, we need to undo any changes that were made along the way, and try to repeat the transaction until ultimately it is successful.
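Here is a hedged sketch of how that optimistic pattern might be expressed in T-SQL; the Product table and its column names are assumptions for illustration:

    DECLARE @OldQuantity INT, @NewQuantity INT;

    BEGIN TRANSACTION;

        -- Read the current quantity; no lock is held on the row afterward
        SELECT @OldQuantity = Quantity
        FROM Product
        WHERE ProductName = 'Pencil';

        SET @NewQuantity = @OldQuantity - 5;

        -- ... other processing takes place here ...

        -- Attempt the update only if nobody else changed the value
        UPDATE Product
        SET Quantity = @NewQuantity
        WHERE ProductName = 'Pencil'
          AND Quantity = @OldQuantity;

        IF @@ROWCOUNT = 1
            COMMIT TRANSACTION;    -- no conflict occurred
        ELSE
            ROLLBACK TRANSACTION;  -- conflict: undo and repeat the transaction

The WHERE clause's extra condition is the optimistic "check": if another user changed the quantity in the interim, the update touches zero rows, and the transaction is rolled back and retried.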
Next, let's look at a similar example, except this time we will use a pessimistic locking strategy. Remember, in a pessimistic locking strategy we assume from the outset that one of our concurrency control problems might occur, and so to defend against that we simply begin by locking all of the resources that we will need to complete our transaction.

So we lock all of the resources that we need. In this case, we extract the product quantity for pencils, which is 10 units. We set the new quantity equal to the previous quantity minus 5, so the new quantity is set to 5. We process any additional steps in our transaction, we issue our update, and we then unlock the resources that were previously locked. In this case there's no need to check whether the update was successful. We know it was successful, because we had locked all of the required resources prior to executing the transaction.

At this point you may be asking yourself: why would anyone ever want to use an optimistic locking strategy if a pessimistic locking strategy can always guarantee that the update is successful? The answer is that when a pessimistic locking strategy is used, it is much more likely that other users will be put into a wait state; that is, they'll have to wait in line for access to the resources in the database that are required to complete their transactions, and that can slow the overall performance of the database. So ultimately, the decision about whether to use pessimistic or optimistic locking must come down to the nature of the queries that are being run against the particular database. If the database is one in which many, many updates are performed, then a pessimistic locking strategy would probably be preferred. If most of the requests made of the database are reads, that is, we're just selecting data out of the database, then an optimistic locking strategy would probably be preferable.
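And here is the corresponding pessimistic sketch in T-SQL, using SQL Server's UPDLOCK and HOLDLOCK hints to acquire the lock up front and hold it for the duration of the transaction; again, the names are assumed:

    DECLARE @OldQuantity INT, @NewQuantity INT;

    BEGIN TRANSACTION;

        -- Lock the row first and hold the lock until the transaction ends;
        -- competing transactions are placed in a wait state
        SELECT @OldQuantity = Quantity
        FROM Product WITH (UPDLOCK, HOLDLOCK)
        WHERE ProductName = 'Pencil';

        SET @NewQuantity = @OldQuantity - 5;

        -- ... other processing takes place here ...

        UPDATE Product
        SET Quantity = @NewQuantity
        WHERE ProductName = 'Pencil';

    COMMIT TRANSACTION;  -- no conflict check needed; the lock guaranteed success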
Next, let's talk about the notion of a consistent transaction. A consistent transaction is commonly referred to in the industry as an ACID transaction; that is, one which is atomic, consistent, isolated, and durable. Atomic transactions are treated by the database as a single logical unit of work. As we said earlier, an atomic transaction is one in which all of the steps must be completed successfully in order for the transaction to be considered successful; if any of the steps fails, then the entire transaction fails. Treating a transaction as a single logical unit of work helps us to avoid data anomalies, as we saw earlier in this lecture. Consistent transactions are those in which other transactions are disallowed while the transaction is under way, and this sort of consistency can be achieved through the artful use of database locks.

Next, let's consider the extent to which a transaction is isolated. We now know that in multi-user databases, it's possible for different transactions to be using or working with the same data at the same time, and because of this, the data in the database can be continuously changing. We therefore need to make some decisions, as database administrators, regarding the extent to which our transactions will be isolated in our concurrent database processing environment.

The ANSI SQL standard defines four different isolation levels, and these four isolation levels specify which of the three concurrency control problems are allowed to occur in the database. Remember, our three concurrency control problems were dirty reads, inconsistent reads, and phantom reads. The least restrictive isolation level is Read Uncommitted, in which many concurrency control problems are considered acceptable; that is, transactions are not particularly isolated. As we move from Read Uncommitted toward Serializable, our transactions become increasingly isolated as they are processed by the database. A Serializable isolation level prevents all of these concurrency control problems, but comes at the expense of accessibility. That is, if we choose to implement a Serializable transaction isolation level, then although our transactions will be very well protected, and the likelihood of concurrency control problems occurring is essentially zero, the cost is accessibility: it is much more likely that users will have to wait for longer periods of time in order to have their requests serviced by the database.

Finally, we have durability. A durable transaction is simply one in which all of the changes committed during the course of the transaction are considered permanent within the database.
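As a concrete illustration of the isolation levels just discussed, in SQL Server a session might choose one of the four ANSI levels like this (a sketch; the query itself is arbitrary):

    -- The default in SQL Server is READ COMMITTED; the other ANSI levels
    -- are READ UNCOMMITTED, REPEATABLE READ, and SERIALIZABLE
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    BEGIN TRANSACTION;
        -- Within this transaction, dirty, inconsistent, and phantom reads
        -- are all prevented, at the cost of reduced accessibility
        SELECT SUM(QuantityOnHand) FROM Item;
    COMMIT TRANSACTION;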
Another topic that can be very important for database administrators to understand is that of cursors. Recall that when we use a SQL SELECT statement, the result of that statement is a set of rows returned by the database management system, and we call that set of rows a result set. A cursor is simply a pointer into that set of rows. One of the reasons that we need to use cursors in application programs or data-driven websites is that these result sets can often be very large. It doesn't make sense, and it might even be beyond the capability of these applications or websites, to work with the entire result set as a single unit. Using a cursor allows an application or a website to process a result set one row at a time. Cursors are commonly defined using SQL SELECT statements, such as the example here, which relies upon the DECLARE CURSOR syntax.

Conceptually speaking, we have two major types of cursors: forward-only cursors, also called non-scrollable cursors, and scrollable cursors. As the name suggests, in a forward-only cursor, rows are fetched one at a time from the start to the end of the cursor. We iterate through the cursor one row at a time, and we can only move in one direction; we cannot, for example, stop halfway through and look at the previous row in the cursor. We start at the first row in the result set, then move to the second row, the third row, the fourth row, and so forth until we reach the end of the result set, and at no time are we allowed to move backward. Because we cannot scroll the cursor backward, any changes that other users make to rows in our result set that we have already examined will not be visible through the cursor.

So if I am reading through a result set using a forward-only cursor, and I read row one, then row two, then row three, and at that point in time some other user makes a change to row one, I will not be able to see that change through the cursor, because I cannot look backward. If, however, I am scrolling through the result set, and through some sort of insert, update, or delete statement values in the result set are changed for rows that I have not yet examined, then I will see those changes at the point in time when I examine the affected rows. For example, if I am scrolling through a result set using a forward-only cursor, and I examine row one, then row two, then row three, and at that point in time someone makes a change which affects the values of row five, then by the time I reach row five I will be able to see that changed result, because I am fetching rows from the database serially, one at a time, as I move forward through the cursor.

By contrast to these forward-only cursors, we also have scrollable cursors. A scrollable cursor allows us to move both forward and backward through a result set, as needed by the application program or data-driven website examining the rows. There are three major types of scrollable cursors, each of which has different properties: static cursors, keyset cursors, and dynamic cursors. Let's see how they differ from one another.

Of the three types of scrollable cursors, static cursors are probably the simplest to understand. The way a static cursor works is that when the cursor is opened, a snapshot of the results is taken: the complete result set is fetched from the database and stored in a temporary table. The user of the cursor can then scroll forward or backward through the result set, and the rows which are visible to the cursor are those which were copied into the temporary table at the moment in time when the cursor was opened. Because of the use of this temporary table, any changes that are made to the underlying data, that is, to the real tables from which the temporary table was constructed, are not visible through the cursor. So if I open a cursor using a SELECT statement, and that statement fetches a number of rows into this temporary table for use by my static cursor, and at that point in time other users are changing the underlying values from which my temporary table was constructed, I will not be able to see any of those changes through the cursor, because I am no longer actually looking at the data in the real table. My static cursor is simply using the data from the temporary table to move forward or backward through the rows as necessary. So the disadvantage of these static cursors is that changes made to the underlying data are not visible, but the major advantage is that they are simple and they consume comparatively few resources on the database server.
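Here is a minimal sketch of declaring, opening, and scrolling a static cursor in T-SQL; the table and column names are assumed:

    DECLARE @ItemName VARCHAR(50);

    -- STATIC takes the snapshot described above; substituting KEYSET or
    -- DYNAMIC here yields the other scrollable behaviors discussed next
    DECLARE item_cursor CURSOR STATIC FOR
        SELECT ItemName FROM Item;

    OPEN item_cursor;                             -- snapshot is taken here

    FETCH NEXT FROM item_cursor INTO @ItemName;   -- scroll forward...
    FETCH PRIOR FROM item_cursor INTO @ItemName;  -- ...and backward

    CLOSE item_cursor;
    DEALLOCATE item_cursor;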
The next scrollable cursor type is called a keyset cursor. The way a keyset cursor works is that when the cursor is opened, rather than fetching the entire result set, the database fetches all of the key values for the rows in the result set and stores those key values in a temporary table. For our purposes, let's consider these to be primary key values. So if I open a cursor on a table, and the result set from that SELECT statement would retrieve 1,000 rows, the database will fetch the primary key values of the 1,000 rows of data that comprise the result set and store those primary key values in a temporary table. Then, as I scroll through the rows using the cursor, moving either forward or backward, every time I scroll forward to the next row or backward to the previous row, the database will use that primary key value to fetch the values in the underlying table at that moment in time.

Because of this, updates that are made to a row after the cursor was opened are going to be visible through the cursor. Let's say that my keyset cursor fetches a number of key values when the cursor is opened; these are primary key values, say 1, 2, 3, 4, and 5. I start moving through the cursor and examine row one. When I scroll to that row, the database fetches the values for primary key ID one out of the database and returns them to me. I then scroll to row two, and the database fetches the values for row two. Let's say at that moment in time, some other user changes the values for row one. Well, if I scroll backward using a keyset cursor, I will use the primary key value, in this case one, to fetch the current values of the data in the underlying table, and I will therefore be able to see the changes to row one that were made by the other user. Updates that are made to a row after the cursor was opened are thus visible.

It is also possible to identify deleted rows using a keyset cursor, because as I scroll forward or backward through the result set, the database uses the primary key values that it stored in the temporary table to attempt to fetch the true values of the data from the underlying table. If at that moment in time a row no longer exists, then the database will fetch a null result for me through the cursor, and I can then infer that the associated row has been deleted.

Although updates and deletes are visible through a keyset cursor, new rows, that is, rows added using an INSERT statement, are not visible through the cursor. The reason for this is that the key values for all of the rows in the result set are fetched and stored in the temporary table at the point in time when the cursor is opened. Therefore, any new rows added after that point in time will not have their primary key values included in my temporary table, and as I scroll forward or backward through the result set, I will be unable to identify any new rows that were added after the point in time when the cursor was opened.
Finally, we have dynamic cursors, and a dynamic cursor is essentially the opposite of a static cursor. Dynamic cursors use a comparatively large quantity of system resources, because they allow all changes that are made to the rows of a result set to be visible while the cursor is scrolling. So as I scroll through my dynamic cursor, forward or backward, any changes that have been made to the result set are going to be visible to me through the cursor: all changes made through updates, all new rows added through insert statements, and any rows deleted through delete statements. I will be able to identify those changes as I scroll forward or backward using a dynamic cursor. Although this capability certainly is convenient, I hope you can understand that it consumes quite a large quantity of system resources to enable this level of visibility through the cursor.

Toward the beginning of our lecture we talked about three major functions that a database administrator must perform. The first was managing concurrency control, and we have talked about that, as well as the effects of concurrent access on different types of cursors. The second of the major functions that a database administrator must perform is to manage database security. In a multi-user database environment, database security is critically important. The objective of creating a database security model is to ensure that only authenticated users are allowed to perform only authorized activities. So we want potential users to authenticate themselves to the database. Those users can then attempt to carry out activities, but only the activities for which they have been authorized will be allowed to be performed against the database.

To enable database security, modern databases implement a model which is based on processing rights. Database processing rights define two separate things: first, who, by which I mean which users, is permitted to perform which actions; and second, when those actions are allowed to be performed. After processing rights have been defined and assigned to groups or individual users, the individuals who are performing activities against the database have full responsibility for the implications of their actions. That is to say, if the database administrator has granted you the power to, say, delete customers from the database, and you choose to exercise that power, then it's not the database administrator's fault if you decide to delete all of the customers. It's your fault.

Typically in modern databases, individuals authenticate themselves to the database management system through the use of a username and a password, and the database administrator can establish password rules, timeout periods, and so forth, just as if we were managing user accounts on a different type of server. In the realm of database permissions, one critical concept to understand is that a user can be known to the database both as an individual and as a member of one or more roles. Because of this, it is possible for database privileges to be assigned both to individuals, on a user-by-user basis, and to the roles to which different individuals may belong.
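As a sketch of how this looks in SQL Server, a DBA might create a custom role, add a user to it, and then grant permissions at both levels; the role, user, and table names here are assumptions:

    CREATE ROLE SalesStaff;                  -- a custom (flexible) role
    ALTER ROLE SalesStaff ADD MEMBER Jones;  -- Jones is now known via this role

    GRANT SELECT, INSERT ON Customer TO SalesStaff;  -- role-level permissions
    GRANT UPDATE ON Customer TO Jones;               -- individual-level permission

Jones's effective rights on the Customer table are then the union of the two: SELECT, INSERT, and UPDATE.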
Another critical concept to understand with respect to database permissions is that permissions are associated with database objects, such as tables. Permissions can therefore be granted, denied, or revoked with respect to a user's ability to interact with one of those database objects. Because users can be assigned privileges as individuals and as members of one or more roles, a given user possesses the combination of all of the permissions that have been granted to him or herself and to all of the roles to which he or she belongs. More formally, a user receives the union of all of the permissions that have been granted, denied, or revoked through their individual permissions and their role permissions.

As an example, you may be assigned to a role which has permissions to read, modify, or delete rows in the Customer table. However, as an individual, the database administrator may deny your right to delete rows within that table. In that case, your role privileges would ordinarily allow you to perform operations such as SELECT, INSERT, UPDATE, and DELETE; however, because the database administrator has additionally denied you, as an individual, the right to delete rows in that table, the database will not allow you to delete any rows.

So what happens if a conflict exists between individual permissions and role permissions? The way that databases handle this is, first, a deny permission always takes precedence. That is, if the database administrator has denied you as an individual access to one or more database objects, or if your role denies you access to one or more objects, then that deny will always be honored by the database, even if your other individual rights or role rights would otherwise grant you access to the database object. The deny always takes precedence. Aside from those denies, role permissions are given precedence over individual permissions. This is a critical point for a database administrator to understand.

In this diagram we're using the set of symbols that we learned about in entity-relationship modeling to illustrate the database security model. Beginning on the left and working toward the right, we can see that a user can have zero to many roles, while a role can have zero to many users. Both users and roles are assigned permissions, which are associated with different objects in the database. So a user can have zero to many permissions, a role can have zero to many permissions, and an object might have zero to many permissions associated with it.
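A hedged sketch of that precedence rule in T-SQL, reusing the assumed names from above:

    -- The role grants full access to the Customer table...
    GRANT SELECT, INSERT, UPDATE, DELETE ON Customer TO SalesStaff;

    -- ...but an individual deny overrides the role's grant
    DENY DELETE ON Customer TO Jones;

    -- Jones can still select, insert, and update rows in Customer,
    -- but any DELETE that Jones attempts will be refused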
In SQL Server, database roles can either be fixed or flexible. Flexible roles are custom roles that are established and defined by the database administrator to support the specific needs of his or her organization. By contrast, fixed roles are predefined roles to which database users can be assigned. Here we see a list of the fixed roles that come with SQL Server. A user who is assigned to the DB Owner role, for example, is allowed to do essentially anything they want with the given database, including dropping the database. A user who has been assigned to the DB Data Writer role is allowed to insert, delete, or update rows in all of the user tables within a database, while a user who has been assigned to the DB Data Reader role is allowed to read all of the data from the user tables. SQL Server also comes with two fixed deny database roles: a user assigned to the DB Deny Data Writer role is not allowed to add, modify, or delete data within the database, whereas a user assigned to the DB Deny Data Reader role is not allowed to read any data within the database. And there are, of course, several other fixed roles listed here that we won't talk about.

Just a few final points to finish up our overview of database security. Data are absolutely critical to modern organizations, and as such, the DBMS should always be run behind a firewall. This may seem obvious, but nevertheless I hope the point is well taken. It's important for the database administrator to ensure that the latest operating system and DBMS service packs and patches are always applied; this helps to ensure the security of the database. Philosophically speaking, the functionality of the DBMS should be intentionally constrained to only those functions and features that are needed by the organization. Finally, the database administrator should ensure the physical protection of the database servers by placing those servers in a secure location and using physical barriers like locked doors. And the user accounts and passwords which allow access into the DBMS should be managed very carefully, following established organizational policy.

The third of our three major database administration functions was backup and recovery. A backup and recovery strategy is necessary to hedge against the possibility of database failures. Databases can fail for many different reasons, including failures of hardware, bugs in software, human error, or malicious actions, like efforts by hackers, denial of service attacks, et cetera. Because it is not possible to completely avoid all of these issues, it is absolutely essential that a database administrator establish recovery procedures for the database. The foundation of these recovery procedures is making regular backups of the database. A database administrator should back up the database regularly, but not so often as to negatively impact the accessibility of the database.

With backup procedures in place, the simplest method of recovering a database that has failed is a process called reprocessing. In reprocessing, the strategy is to restore the database using the most recent backup copy, and then have all of the database activities which were executed since that backup performed again by the human beings who originally performed them. So if our organization, for example, has a database failure, and we restore a backup copy that was made three days ago, then we need to recover all of the sales that were added to the database in the interim. Using the reprocessing strategy, we would ask our sales force to go back through their records for the past three days and manually add each sale that they made into the database again. This approach, of course, is costly in terms of the effort involved by the human beings who must re-enter all of the data.
It is also risky, insofar as human error is likely. This strategy is best used only in very small database implementations, such as personal databases or very small corporate databases. For a database that contains any reasonable amount of data, or has more than just a very few users, a better strategy is required.

The strategy that is used by most enterprise-level relational database systems is called rollback and rollforward, and again, this is based on the notion of having a database backup. The second mechanism that is required to enable rollback and rollforward is something called a log file. A log file, as the name suggests, is simply a file which records a log of all of the activities carried out against the database. With a backup copy of the database and the log file, it is possible for a database administrator to recover the database using either rollback or rollforward, as necessary.

The rollback strategy is used to undo transactions. So, for example, perhaps a hacker was able to access our database and add a large number of malicious transactions, and we want to undo those transactions. We can undo those transactions using the rollback strategy, and the process is quite straightforward. Essentially, rollback requires that the log file contain a copy of the values of any changed records before they were changed, as well as the order in which changes were made to the database. These types of records are called before images; they save the values in a given cell, for example, before the change was made to the cell. To undo a transaction, then, all we need to do is work backward through the log file and apply the before images of all of the changes in the log file, one by one. If, by contrast, we want to redo a transaction, then the log file needs to contain a record of the values of all of the changes after each change was made. These records are called after images. To redo a transaction, we simply move forward through the log file one record at a time and apply the after images of the changes to the database.

Here we see an example of a transaction log file. Each of the three transactions in this log file is color coded: one of the transactions is in green, another in red, and the third in light yellow. My apologies if you happen to be red-green colorblind. Note first that records in the log file are recorded in chronological order; that is, they are recorded in the order in which they were executed by the database. The next critical thing to note about the transaction log file is that it contains pointers to the next and previous rows involved in the transaction. Taking row two as an example, we can see that the previous row in the transaction is row one, and the next row is row four within the log file. These pointers tell us where to look for the next part of the transaction, or where to look for the previous part.

Now let's work through a few of these transactions so that we have a solid understanding of how these transaction log files work. Beginning with transaction OT1, you can see that the first record of the transaction is the transaction starting.
1:19:57 Beginning with transaction OT1, you can see that the first record of the transaction 1:20:03 is the transaction starting. The next thing that we do is we modify customer number 100. 1:20:11 And you can see we record both a before image and an after image. That is, the previous value prior to the modification 1:20:20 for customer 100 and the new value after the modification for customer 100. 1:20:27 The next step in this transaction is modifying salesperson AA. 1:20:33 And we, again, record the previous value for this record and the new value for this record. 1:20:41 The next part of the transaction is to insert a new record into the Orders table, and in this case there is no before image, 1:20:50 because it is a new record, so we only record the new value. 1:20:56 We then finish the transaction by issuing a commit. And thus ends our first transaction. 1:21:04 Let's look at another example. This time, let's look at CT1. 1:21:09 We see the transaction begins. We then skip down to row nine for the next step 1:21:15 of the transaction, where we modify salesperson BB. 1:21:20 We record the previous value and the new value, and then we move to row 10, where we 1:21:27 commit the changes made by our transaction to the database. 1:21:32 Now let's examine this rollback procedure in a little more detail. 1:21:38 Remember that the log file saves its activities in the order 1:21:45 in which they were carried out. So they are saved in sequence within the log file. 1:21:50 We can therefore undo any changes that were made to the database by using the before 1:21:56 images in the log file and moving through the log file in reverse order. 1:22:03 Let's see an example of how this works using our log file. Consider transaction CT1. 1:22:11 In this transaction we modified salesperson BB, 1:22:17 and we recorded the before image, which was the previous value, and the after image, which was the new value. 1:22:25 If we want to undo this transaction, all we need to do is take the old value, that is, the before image, 1:22:32 and apply it to the database. So if the old value was, say, five, and the new value was 10, 1:22:41 then by setting the value back to five, it is as if this transaction never occurred. 1:22:48 That is, it is as if we never modified salesperson BB to begin with. 1:22:54 Extending this concept, and continuing to move backward through the log file, you can see 1:23:00 how we can undo one transaction after another, until ultimately our database will 1:23:07 be rolled back to a specific point in the past. 1:23:13 Graphically, this rollback process can be understood using this sort of a diagram. 1:23:20 We begin with a database that has undesirable changes that 1:23:25 have been made to it. We apply our before images to that database 1:23:30 in an undo operation, and the result is the database without the undesirable changes. 1:23:38 Next, let's consider rollforward. Remember, the activities in the log file 1:23:44 are recorded in sequence. That is, in the order in which they were applied to the database. 1:23:50 This means that if we begin with the most recent backup copy 1:23:58 of our database and we start at the top of the log file 1:24:03 and move downward, we can essentially redo, or replay, 1:24:09 all of the transactions that occurred since the last backup. By the time we reach the end of this process, 1:24:17 our database will be restored to the point at which it was when the failure occurred. 1:24:24 This process is called rollforward. And again, it relies upon the backed-up 1:24:30 version of the database and the log file.
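In SQL Server, for example, this combination of a backup and a replayable log surfaces as a two-step restore sequence, roughly as sketched below. The database name, file paths, and the existence of a transaction log backup are all assumptions for illustration:

    -- Step 1: restore the most recent full backup, leaving the database
    -- in a restoring state so that log records can still be applied.
    RESTORE DATABASE Sales
        FROM DISK = N'D:\Backups\Sales_Full.bak'
        WITH NORECOVERY;

    -- Step 2: roll forward by replaying the backed-up transaction log
    -- (the after images), then bring the database back online.
    RESTORE LOG Sales
        FROM DISK = N'D:\Backups\Sales_Log.trn'
        WITH RECOVERY;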
Let's see an example using our sample log file. 1:24:39 Let's imagine that this transaction log file contains all of the transactions that were 1:24:46 applied to the database since the last backup. Our database has failed, we have restored the database using 1:24:55 the last backup, and now we want to recover all of the changes that were made in the time that has passed since that backup. 1:25:04 So what we do is, we take this transaction log file and we simply replay it, beginning at the top 1:25:10 and working our way toward the bottom. As we go, we are simply 1:25:16 reapplying all of the changes that were previously made. By the time we arrive at the bottom of the file, 1:25:23 our database will be fully recovered, up until the point when it failed. 1:25:29 Graphically, this rollforward process can be understood using a diagram such as the one 1:25:35 shown here. We begin with a backup copy of the database, that is, 1:25:41 the database without the desired changes. We take the after images from our log 1:25:48 file and apply those to the backup copy of the database in a redo process, 1:25:54 and the result is that the database has been restored and all of the desired changes have been reapplied. 1:26:02 In addition to these three principal tasks of handling concurrent access, handling database security, 1:26:12 and establishing a framework for backup and recovery, 1:26:18 a database administrator has several additional responsibilities as well. 1:26:24 First, the DBA needs to ensure that there is a mechanism in place through which user-reported 1:26:32 errors and other problems can be gathered and stored. This information can be used to tune the database, 1:26:40 modify the database, et cetera; we'll sketch a simple example of such a mechanism in a moment. Additionally, the DBA also needs to create 1:26:46 and manage a process for controlling the database configuration. 1:26:51 Organizations are living entities, and we cannot expect our initial database configuration to continue to be 1:27:00 satisfactory for the organization as time passes. As the needs of the organization change, 1:27:07 it's very likely that we will also need to alter the configuration of the database. 1:27:12 And for this reason, we need to establish a process for controlling the database configuration. 1:27:20 Finally, the DBA is also responsible for documentation. The DBA needs to continually document 1:27:29 the structure of the database, the concurrency control strategy that has been employed, the security model, the backup 1:27:37 and recovery strategy, and the various applications and websites that rely upon the database. 1:27:44 When taken together, all of this documentation helps to provide a holistic understanding 1:27:51 of the role of the database in the broader database processing environment within the organization.
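As promised, here is a simple example of that first responsibility: the error-gathering mechanism might be nothing more than a table into which user-reported problems are recorded. The design below is a hypothetical sketch in SQL Server syntax, not something prescribed by any particular DBMS:

    -- Hypothetical table for gathering and storing user-reported problems.
    CREATE TABLE UserReportedIssue (
        IssueID     INT           IDENTITY(1,1) PRIMARY KEY,
        ReportedBy  VARCHAR(60)   NOT NULL,                   -- user or application reporting the problem
        ReportedAt  DATETIME      NOT NULL DEFAULT GETDATE(), -- when the problem was reported
        Description VARCHAR(1000) NOT NULL,                   -- what went wrong
        Status      VARCHAR(20)   NOT NULL DEFAULT 'Open'     -- e.g., Open, In Progress, Resolved
    );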
1:27:58 There are just a few remaining topics that I would like to touch on in this lecture. 1:28:04 The first of these is distributed database processing. Although conceptually it's convenient to think 1:28:11 of a database as a single entity located in a single physical position, perhaps on a single server, 1:28:19 it is now very common for real-world databases to be partitioned, replicated, or both. 1:28:28 A given conceptual database might be split across many different servers, 1:28:36 perhaps in different geographic locations. We may also have replications of the database, which 1:28:44 are continuously cloning all of the actions that 1:28:49 are occurring, as part of a backup and recovery strategy. 1:28:55 And so we need to start thinking of these databases in broader terms. 1:29:00 Let's look at a few examples of how databases can be partitioned or replicated. 1:29:07 In this first example, we see a version of a database which aligns with our simple conceptual model. 1:29:16 That is, the database is nonpartitioned and nonreplicated. 1:29:22 All of the tables, in this case labeled W, X, Y, and Z, 1:29:28 exist on a single database server. In our next example, we see a partitioned database. 1:29:35 In this case, the database has been subdivided such that it runs on two separate database 1:29:43 servers. Tables W and X are operating on server number one, 1:29:50 whereas tables Y and Z are operating on server number two. And the two servers are interconnected 1:29:57 via a communications channel of some sort. In our third example, we see a replicated database. 1:30:06 In this case, we have two database servers, and each server is maintaining a full and complete copy 1:30:14 of the entire database. So they're interconnected with a communications line, 1:30:19 and we keep track of tables W, X, Y, and Z on both servers, 1:30:24 such that we have a full real-time backup strategy in place. 1:30:31 And in our final example, we see the most complicated of these four designs, and that is a partitioned and replicated 1:30:40 database, where we have two different database servers interconnected with a communications line. 1:30:48 On the first server we store tables W, X, and Y, whereas on the second server we store tables Y 1:30:59 and Z. So in this case, we are replicating table Y 1:31:05 between servers one and two. And we are then storing tables W and X 1:31:13 additionally on server one and table Z additionally on server two. 1:31:19 So there are many different ways in which databases can be subdivided according to our partitioning 1:31:27 and replication needs. Our final topic for today's lecture will be the object oriented paradigm and its role 1:31:38 in the database world. Over the past several decades, object oriented programming 1:31:45 has come to be the most common paradigm for software 1:31:52 development. And object oriented programming is the basis of most of the world's most popular and widely used 1:32:01 computer programming languages. Just like entities in a database, 1:32:08 objects in the object oriented programming world have identifiers and properties. 1:32:16 Objects, however, can also have methods. And this is what differentiates them 1:32:22 from entity classes, such as those that we've studied in the entity relationship model. 1:32:31 A method can be thought of as an algorithm which allows the object to perform certain tasks. 1:32:38 Let's say, for example, that we have an employee object, 1:32:44 and one of its properties is the employee's date of birth. 1:32:50 We might then have a method which allows us to calculate the employee's age in years. 1:32:59 So we could call that age method, and it would return a value which 1:33:06 would reflect how old the employee is in years, based upon the current time and the employee's 1:33:13 date of birth.
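As an aside, this sort of employee object, complete with an age method, can even be expressed in SQL itself using the standard's structured types, a feature we will return to in a moment. The sketch below loosely follows SQL:1999 syntax; every name is hypothetical, vendor dialects differ considerably, and the age calculation is deliberately simplified:

    -- A structured type with attributes and a method, loosely following SQL:1999.
    CREATE TYPE EmployeeType AS (
        EmployeeID  INTEGER,
        Name        VARCHAR(60),
        DateOfBirth DATE
    )
    NOT FINAL
    METHOD Age() RETURNS INTEGER;

    -- The method body: a simplified age calculation that subtracts the
    -- birth year from the current year (it ignores whether the employee's
    -- birthday has occurred yet this year).
    CREATE INSTANCE METHOD Age() RETURNS INTEGER
        FOR EmployeeType
        RETURN EXTRACT(YEAR FROM CURRENT_DATE)
             - EXTRACT(YEAR FROM SELF.DateOfBirth);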
Since most of the world's widely used programming languages 1:33:19 are based on the object oriented model, software developers commonly have the need 1:33:26 to save objects on a storage device, such 1:33:32 that they can be retrieved for future use. This is called object persistence. 1:33:39 Object persistence can be accomplished using several different approaches. 1:33:45 One approach, for example, would be to serialize the objects 1:33:50 and store them as a binary file on the file system. 1:33:56 But another common technique is to use database technology for storing objects. 1:34:02 For this purpose, normal relational databases can be used, but doing so requires 1:34:08 substantial additional programming. To address this problem, several vendors 1:34:14 have, over the past few decades, developed a type of database management system known 1:34:20 as an object oriented database management system. The purpose of these object oriented database management 1:34:28 systems is to provide the ability to easily store and retrieve objects 1:34:36 for use by software programs that are built using the object 1:34:42 oriented paradigm. Unfortunately, these object oriented 1:34:48 databases have had very little commercial success, 1:34:53 largely due to the fact that transferring existing data from relational and other organizational legacy databases 1:35:02 would simply be too cumbersome and too time consuming. That is, most managers have decided 1:35:09 that the costs of transitioning to an object oriented database 1:35:16 are simply not worth the benefits. The good news is that the current SQL 1:35:24 standard includes several object oriented features, 1:35:30 the two most prominent being user defined types and inheritance. 1:35:39 So if you are familiar with object oriented programming, you will immediately see the connection 1:35:45 between these features of the SQL standard and the object oriented paradigm. 1:35:50 Because these features are a part of the current SQL 1:35:57 standard, any database that fully supports the current standard can therefore 1:36:04 be considered an object relational database. That is, such databases not only incorporate 1:36:10 all of the features and capabilities of the relational model, but they additionally incorporate certain object oriented features 1:36:19 which enhance their viability and value for use in conjunction with object oriented software 1:36:26 programs. Most major database vendors currently support at least some 1:36:34 of the object oriented features of the SQL standard, including Oracle, DB2, and SQL Server, among others. 1:36:45 Well, my friends, thus ends our exploration of database administration. 1:36:52 I hope you learned something interesting in this lecture. And until next time, have a great day. 1:37:00