We will focus on part 2 of the database system architecture. In the last presentation we focused on part 1: the storage manager and the various data structures used by the database management system. In this presentation we will focus on the transaction manager. Last time we saw only the idea of the transaction manager; this time we will see more about it, and then we will also focus on the query processor.

So let's see the transaction manager first. What do we mean by transaction management? In this chapter, when we compared the file system with the database management system, we saw that one important requirement a database must satisfy is the atomicity requirement. What do we mean by this? It means the all-or-none property. If you are watching this lecture directly, I request you to watch my previous lecture titled "Database Management System versus File System", where I explained the various advantages of databases over file systems and what the atomicity requirement is. In simple terms, all operations should be executed or none should be executed.

Why are we enforcing this requirement? Because it ensures the consistency of the database. We want our databases to be consistent, because transactions are going to be made on them, and when transactions happen on the database they should not leave it in an inconsistent state. I know this will be unclear at this moment because I am using the term "transaction" here. What is a transaction? No worries, I will explain that shortly. For now, just understand that when transactions happen on the database, it should not leave the database in an
inconsistent state, and that is why we are enforcing the all-or-none property. When we enforce the all-or-none property, if a transaction starts it should run to completion; otherwise none of the statements in the transaction should take effect. This ensures that the database is consistent.

When we talk about transactions, they should also be durable. What do we mean by durable? Durability is the persistence requirement. We know any system is vulnerable to hardware or software failures; whether it is a hardware failure or a software failure, I will refer to it simply as a system failure. Let's assume there are some transactions happening on the database, and let's take fund transfer as an example. What do we mean by fund transfer? Funds are transferred from one account to another. Let A and B be the accounts: the amount should be debited from A's account and credited to B's account. After the successful execution of the fund transfer transaction, the new balances of accounts A and B must be stored persistently, despite the possibility of a system failure. This persistence requirement is referred to as durability. So a transaction should also be durable.

Now, we are hearing the term "transaction" repeatedly, so let's see what a transaction is. A transaction is one logical function, for example a fund transfer, and it consists of a collection of operations. We have taken fund transfer as the example, so what logical operations does it contain? Obviously, it is going to transfer funds from one
account to another, I mean from A's account to B's account. In this case, the existing balance of the source, that is A's account, should be read. Then the amount is debited from A's account, and after debiting, the new value should be stored in A's account. Then the debited amount should be credited to B's account: the balance of B's account should be read first, the amount that was debited from A's account is added to it, and the final value is stored in B's account. So in this fund transfer transaction, a lot of logical operations, like reading the existing balance of A, writing the new balance to A, reading the existing balance of B, and writing the new balance to B, are grouped under a single term, the fund transfer transaction. This is exactly what a transaction is: a logical function which consists of a collection of operations. If you still need more clarity on this, I request you to view my previous lecture on databases versus file systems, where I explained it with an example.

When we talk about transactions, we know a transaction is a collection of operations, and in the midst of executing these operations we may encounter a system failure. It can be a hardware failure or a software failure, and that is why we are enforcing the all-or-none property. Let's assume a transaction has 1000 lines, and it is now executing the 25th line. If there is a system failure at the 25th line, we know the transaction cannot complete because it has encountered a failure. So we are enforcing the property
all or none. We know 25 lines have already been executed, and we don't want the outcome of those 25 lines to be reflected in the database. I mean, we should ensure that it is as if none of these 25 lines were executed, and the database should be restored to the state that existed before the failure occurred. So here the transaction is a failed one, but the values of A and B are still preserved, and this all-or-none property gives us consistency. If you want some real-life examples, I request you to watch my previous lecture, where I compared database systems with file systems.

Let's continue with the theoretical aspects of transaction management. Let's say the transaction has failed. What's the next step? This is where the recovery manager comes into the picture. What is the job of this recovery manager? Its job is to ensure that these two properties are satisfied, I mean the atomicity property and the durability property. If there is no failure during the execution of a transaction, obviously the transaction will complete successfully, so atomicity is achieved. The recovery manager comes into the picture only when a transaction encounters a failure, because then we need to recover from the failure.

So what is failure recovery? It is restoring the database to the state that existed before the failure occurred. Failure recovery should also detect whether there is a system failure or not; it could be a software or a hardware failure. Let's take the same fund transfer example. We know the account balances we had for A and B before the failure. After a failure, we need to restore the values of A and B to the balances that existed
before the occurrence of the failure. So failure recovery is very much needed in this scenario.

At the same time, we know transactions can happen concurrently, since databases can be accessed by multiple users at the same time. In that case, the concurrency control manager takes care of the consistency of the database even when there are concurrent executions. It ensures that concurrent operations carried out on the database do not lead to any inconsistency or conflicts. In simple terms, there is a transaction manager, and this transaction manager covers both the recovery management aspect and the concurrency control aspect. So the transaction manager deals with failure recovery when there are failures, and it also makes sure that transactions run concurrently without any conflicts.

So this is about transactions. Let's now focus on the query evaluation engine, which is very important as far as the query processor is concerned. What have we dealt with so far? We have seen the transaction manager, and we have completed the working of the storage manager. Now let's turn our attention to the query processor. What is this query processor? It handles all the queries that interact with the database. Our data is actually stored here, and here we have the database software, which responds to database queries. Who supplies the queries? You see, the database administrator can supply queries, and sophisticated users or analysts can use query tools to supply queries to the database. So these two kinds of users normally supply queries directly to the database. But what about the
application programmers? These application programmers are not database administrators, so they do not supply queries directly to the database. Instead, they use application programs to supply queries to the database; the application programs they write are what actually supply the queries. Coming to the naive users, obviously they will also not supply queries to the database directly. They use the application interfaces created by the application programmers to interact with the database. These application interfaces go through the application program object code, which is sent to the query evaluation engine. We will see this query evaluation engine now.

And what about the application programmers? They write application programs that need to be compiled and linked. The compiler generates the object code, and this object code then interacts with the query evaluation engine. And what about these two, the sophisticated users and the database administrators? The sophisticated users use query tools to generate DML queries, which are supplied to the DML compiler. We know DML queries are used for selecting, updating, inserting, or even deleting data from the database, so they are data related. The administrator can also use DML, and that is why there is a link here, and he also uses DDL commands. DDL deals directly with the schema of the tables. So he is the one who has complete privilege over the database. Who is that? The database administrator. The database administrator can create a table, create databases, modify an existing database, add columns, or delete columns from the
table; anything of that kind is DDL. He will also use DML as well. DML is the data manipulation language, and DDL is the data definition language. When we see the last lecture of this chapter, we will understand the role of database languages, and at that time I will explain what DDL and DML are. When DML queries are supplied, the DML compiler and organizer takes the DML queries and generates some plans, and these plans are executed by the query evaluation engine. No worries, I will explain that now.

So we are now in the query processor part, which has the DDL interpreter, then the DML compiler, and also the query evaluation engine. We will see them one by one. What is this DDL interpreter? Let's go to the diagram and see. The DDL interpreter interprets the DDL statements issued by the database administrator and records the definitions in the data dictionary, because the data dictionary stores the metadata. And please note, the database administrator has complete privilege over the databases.

Coming to the DML compiler: it translates the DML statements in the query language into an evaluation plan. So the output here will be an evaluation plan, to be precise, the query evaluation plan. A query evaluation plan consists of low-level instructions that the query evaluation engine can understand. The query evaluation engine understands only query evaluation plans. These plans are given by the DML compiler, and this query
evaluation plan also comes from the application program object code. Anyway, whether they come from the naive users, the application programmers, the database administrators, or some sophisticated users, all of them will be queries. For a query, a lot of alternative evaluation plans, which we call query evaluation plans, are generated, and the best evaluation plan is chosen. So the DML compiler also does query optimization, which means picking the lowest-cost evaluation plan from the alternatives, and these query evaluation plans are executed by the query evaluation engine. Whether the plan comes from this side or from this side, the query evaluation engine executes these low-level instructions, which are the query evaluation plan.

So various plans are generated, and the best one is picked to execute the query. On what basis is the best plan picked? There may be multiple query evaluation plans, all giving the same result, but the best plan has to be chosen. The DML compiler helps the query evaluation engine here by choosing the evaluation plan which has the lowest cost. Say, for example, from a source to a destination there may be multiple routes. Which route will people normally follow? It is based on some cost factor: though there exist multiple paths from the source to the destination, the one with the lower cost is preferred. Likewise, multiple plans are there, and the best plan is chosen based on the cost factor. The lower the cost, the higher the chance of that particular query evaluation plan
to be selected. Then the query is executed, and thus the operation is carried out on the database. Anyway, in the coming lectures we are going to have a separate chapter called query evaluation and optimization, where we will see how query evaluation plans are generated and how the best plan is picked for execution. We will also focus on optimizing queries.

So what have we seen here? The DDL interpreter, the DML compiler, and the query evaluation engine. I hope you now understand the overall architecture of the database, which is also referred to as the database system structure. I hope the session was informative, and thank you for watching.
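
As a supplement to the fund transfer example from this lecture, here is a minimal sketch of the all-or-none property in Python, using the standard sqlite3 module. The table name accounts, the function names transfer and balances, and the starting balances are assumptions made only for illustration; they do not come from the lecture. The "failure" is simulated by raising an error after the debit from A, so it only stands in for a real system failure.

```python
import sqlite3


def transfer(conn, source, target, amount):
    """Move `amount` from `source` to `target` as one all-or-none unit."""
    try:
        # Using the connection as a context manager commits if the block
        # finishes normally and rolls back if an exception is raised.
        with conn:
            # Debit the source account.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (source,)
            ).fetchone()
            if balance < 0:
                # Simulated failure in the middle of the transaction:
                # the debit above has run, the credit below has not.
                raise ValueError("insufficient funds")
            # Credit the target account.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
        return True
    except ValueError:
        # The rollback has already undone the debit at this point.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical starting state: A has 100, B has 50.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()
```

Calling transfer(conn, "A", "B", 30) completes both steps, while transfer(conn, "A", "B", 500) fails after the debit and leaves both balances exactly as they were before the call. That is the behaviour the recovery manager is meant to guarantee. An in-memory database is used here only to keep the sketch self-contained, so it does not demonstrate durability.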