0:00 This video introduces JSON. 0:02 Let's start by talking about its pronunciation. 0:04 Some people call it Jason, and some call it J-sahn. 0:07 I did a little bit of 0:09 investigation and discovered that the 0:10 original developer of JSON calls 0:12 it Jason, so I'll do that too. 0:15 Like XML, JSON can be thought of as a data model, 0:18 an alternative to the relational data 0:20 model that is more 0:21 appropriate for semi-structured data. 0:24 In this video I'll introduce the 0:26 basics of JSON and I'll 0:27 actually compare JSON to the 0:29 relational data model and I'll compare it to XML. 0:32 But it's not crucial to have 0:34 watched those videos to get something out of this one. 0:37 Now among the three models 0:38 - the relational model, XML, and 0:40 JSON - JSON is by 0:41 a large margin the newest, 0:43 and it does show: there aren't 0:44 as many tools for JSON 0:46 as we have for XML, and 0:48 certainly not as many as we have for relational. 0:51 JSON stands for JavaScript Object Notation, 0:54 although it's evolved to become pretty 0:56 much independent of JavaScript at this point. 0:59 The little snippet of JSON in the corner is right now mostly for decoration. 1:03 We'll talk about the details in just a minute. 1:05 Now JSON was designed 1:07 originally for what's called 1:09 serializing data objects. 1:11 That is, taking the objects that 1:13 are in a program and sort 1:14 of writing them down in a 1:15 serial fashion, typically in files. 1:18 One thing about JSON 1:19 is that it is human readable, 1:21 similar to the way XML 1:23 is human readable, and it is 1:25 often used for data interchange. 1:27 So, for writing out, say, 1:28 the objects in a program so that 1:30 they can be exchanged with another 1:32 program and read into that one.
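To make the idea of serializing objects for interchange concrete, here is a minimal sketch using Python's standard `json` module. The `book` object and its labels are invented purely for illustration; they are not the data shown in the video.

```python
import json

# Hypothetical in-program object; labels and values are illustrative only.
book = {
    "title": "Example Title",
    "price": 85,
    "authors": [{"first_name": "Jane", "last_name": "Doe"}],
}

# Serialize: write the object down as human-readable JSON text.
text = json.dumps(book, indent=2)

# Another program (simulated here in the same script) reads the text back in.
restored = json.loads(text)

print(text)
```

The text produced by `json.dumps` is what would typically be written to a file or sent to another program, which then calls its own parser to rebuild the objects.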
1:34 Also, just more generally, because 1:36 JSON is not as rigid 1:38 as the relational model, it's generally 1:40 useful for representing and for 1:42 storing data that doesn't 1:43 have rigid structure, what we've been calling semi-structured data. 1:47 As I mentioned, JSON is 1:49 no longer closely tied to JavaScript. 1:51 Many different programming languages do 1:54 have parsers for reading JSON 1:56 data into the program and 1:57 for writing out JSON data as well. 2:00 Now, let's talk about the basic 2:01 constructs in JSON, and as 2:03 we will see, these constructs are recursively defined. 2:06 We'll use the example JSON 2:08 data shown on the screen, 2:09 and that data is also available 2:11 in a file for download from the website. 2:14 The basic atomic values in JSON are fairly typical. 2:18 We have numbers, we have strings. 2:21 We also have Boolean values, 2:23 although there are none of those 2:24 in this example, that's true and false, and null values. 2:29 There are two types of composite 2:30 values in JSON: objects and arrays. 2:34 Objects are enclosed in curly 2:36 braces and they consist 2:37 of sets of label-value pairs. 2:40 For example, we have an 2:41 object here that has a first name and a last name. 2:44 We have a 2:46 bigger, let's say, object here 2:48 that has ISBN, price, edition, and so on. 2:51 When we do our JSON demo, 2:53 we'll go into these constructs in more detail. 2:55 At this point, we're just introducing them. 2:57 The second type of composite 2:59 value in JSON is arrays, 3:01 and arrays are enclosed in square 3:03 brackets with commas between the array elements. 3:06 Actually we have commas in the objects 3:07 as well, and arrays are lists of values. 3:10 For example, we can see 3:11 here that authors is a 3:13 list of author objects.
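The constructs just introduced can be seen together in a small sketch. The document below borrows the labels mentioned in the video (ISBN, price, edition, authors, first name, last name), but every value, and the `in_print` and `remark` entries used to show a Boolean and a null, are made up for illustration. Python's standard `json` module is assumed as the parser.

```python
import json

# Illustrative JSON text: an object containing an array of book objects,
# each of which contains an array of author objects.
doc = """
{
  "books": [
    {
      "isbn": "ISBN-0-00-000000-0",
      "price": 85,
      "edition": 3,
      "in_print": true,
      "remark": null,
      "authors": [
        {"first_name": "Jane", "last_name": "Doe"},
        {"first_name": "John", "last_name": "Roe"}
      ]
    }
  ]
}
"""

data = json.loads(doc)
book = data["books"][0]

# Show which Python type each JSON value was parsed into.
type_names = {label: type(value).__name__ for label, value in book.items()}
for label, name in type_names.items():
    print(label, "->", name)
```

In this parser, JSON objects become dictionaries, arrays become lists, and the atomic values become strings, numbers, Booleans, and `None`.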
3:16 Now I mentioned that the constructs 3:18 are recursive. Specifically, the values 3:20 inside arrays can be anything: 3:22 they can be other arrays or objects, 3:23 or base values. And the values 3:26 making up the label-value 3:27 pairs in objects can also 3:29 be any composite value or a base value. 3:32 And I did want to 3:33 mention, by the way, that sometimes 3:34 the label in 3:36 label-value pairs is called a "property". 3:39 So, just like XML, JSON 3:41 has some basic structural requirements in 3:44 its format, but it doesn't 3:45 have a lot of requirements in terms of uniformity. 3:47 We have a couple of examples 3:49 of heterogeneity in here. For 3:51 example, this book has an 3:52 edition and the other one 3:53 doesn't; this book has a remark and the other one doesn't. 3:57 But we'll see many more examples 3:59 of heterogeneity when we do 4:00 the demo and look into JSON data in more detail. 4:03 Now let's compare JSON and the relational model. 4:06 We will see that many of 4:07 the comparisons are fairly similar 4:09 to when we compared XML to the relational model. 4:12 Let's start with the basic structures underlying the data model. 4:15 So, the relational model is based on tables. 4:18 We set up the structure of a 4:20 table, a set of columns, and 4:22 then the data becomes rows in those tables. 4:25 JSON is based instead on 4:27 sets, the sets of label-value 4:29 pairs, and arrays, and as we saw, they can be nested. 4:34 One of the big differences between 4:35 the two models, of course, is the schema. 4:38 So the relational model has a 4:39 schema fixed in advance: 4:41 you set it up before you 4:43 have any data loaded, and then 4:44 all data needs to conform to that schema. 4:47 JSON, on the other 4:48 hand, typically does not require a schema in advance.
4:52 In fact, the schema and the 4:53 data are kind of mixed together, 4:55 just like in XML, and 4:56 this is often referred to as 4:58 self-describing data, where the 5:00 schema elements are within the data itself. 5:04 And this is of course typically 5:05 more flexible than the relational model. 5:08 But there are advantages to having schemas 5:10 as well, definitely. 5:12 As far as queries go, one 5:13 of the nice features of the 5:15 relational model is that there 5:16 are simple, expressive languages for querying the database. 5:21 In terms of JSON, although a 5:23 few new things have been proposed, 5:25 at this point there's nothing widely 5:27 used for querying JSON data. 5:29 Typically JSON data is 5:31 read into a program and it's manipulated programmatically. 5:34 Now let me interject that this 5:35 video is being made in February 2012. 5:38 So it is possible 5:40 that some JSON query languages 5:42 will emerge and become 5:44 widely used; there is just 5:45 nothing widely used at this point. 5:46 There are some proposals. 5:47 There's a JSONPath language, 5:49 JSON Query, and a language called Jaql. 5:52 It may be that, just like 5:53 XML, the query languages are 5:55 gonna follow the prevalent use 5:57 of the data format or the data model. 5:59 But that has not happened yet, as of February 2012. 6:01 How about ordering? 6:04 One aspect of the relational model is that it's an unordered model. 6:07 It's based on sets, and 6:08 if we want to see relational 6:10 data in sorted order then we put that inside a query. 6:14 In JSON, we have arrays as 6:16 one of the basic data structures, and arrays are ordered.
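As a rough illustration of what "read into a program and manipulated programmatically" can look like, here is a small sketch in Python. The three book records are invented for this example; the missing `edition` and `remark` labels stand in for the kind of heterogeneity mentioned earlier.

```python
import json

# Illustrative data: the records do not all carry the same labels.
books = json.loads("""
[
  {"title": "A", "price": 85, "edition": 3},
  {"title": "B", "price": 100, "remark": "a made-up remark"},
  {"title": "C", "price": 40}
]
""")

# A "query" written as ordinary code: titles of books priced under 90.
# The result follows the order of the array in the JSON text.
cheap_titles = [b["title"] for b in books if b["price"] < 90]

# Labels that may be absent are read with a default instead of failing.
editions = [b.get("edition") for b in books]

print(cheap_titles)
print(editions)
```

Nothing here is a query language; the filtering is just a list comprehension over the parsed objects, and the program itself has to cope with labels that some records lack.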
6:19 Of course, there's also the fact that, like 6:20 XML, JSON data is 6:22 usually written to files, 6:24 and files themselves are naturally ordered, 6:26 but the ordering of the data 6:27 in files usually isn't relevant. 6:30 Sometimes it is, but 6:31 typically not. Finally, in 6:33 terms of implementation, for the 6:35 relational model, there are 6:37 systems that implement the relational model natively. 6:39 They're generally quite 6:42 efficient and powerful systems. 6:44 For JSON, we haven't yet 6:46 seen standalone database systems 6:48 that use JSON as their data 6:49 model; instead JSON is 6:51 more typically coupled with programming languages. 6:54 One thing I should add, however: 6:56 JSON is used in NoSQL systems. 7:00 We do have videos about NoSQL 7:02 systems; you may or may not have watched those yet. 7:05 There's a couple of different ways that JSON is used in those systems. 7:08 One of them is just as 7:10 a format for reading data 7:11 into the systems and writing data out from the systems. 7:14 The other way that it is 7:15 used is that some of the 7:17 NoSQL systems are what are 7:18 called "Document Management Systems", where 7:20 the documents themselves may contain 7:22 JSON data, and then the systems 7:24 will have special features for manipulating 7:26 the JSON in the documents that are stored by the system. 7:29 Now let's compare JSON and XML. 7:32 This is actually a hotly debated comparison right now. 7:35 There is significant overlap in 7:37 the usage of JSON and XML. 7:40 Both of them are very 7:41 good for putting semi-structured data 7:43 into a file format 7:46 and using it for data interchange. 7:48 And so because there's so 7:49 much overlap in what they're used 7:50 for, it's not surprising that there's significant debate. 7:54 I'm not gonna take sides. 7:55 I'm just going to try to give you a comparison. 7:57 Let's start by looking at the 7:58 verbosity of expressing data in the two languages.
8:02 So it is the case 8:03 that XML is, in general, 8:05 a little more verbose than JSON. 8:08 So the same data expressed in 8:09 the two formats will tend to 8:11 have more characters in XML than in JSON, 8:12 and you can see that 8:14 in our examples, because our big 8:16 JSON example was actually pretty 8:18 much the same data that we used when we showed XML. 8:20 And the reason for 8:22 XML being a bit more 8:23 verbose largely has to 8:24 do actually with closing tags, 8:26 and some other features. 8:29 But I'll let you judge 8:30 for yourself whether the somewhat 8:32 longer expression of XML is a problem. 8:35 Second is complexity, and here, 8:37 too, most people would say 8:39 that XML is a bit more complex than JSON. 8:42 I'm not sure I entirely agree with that comparison. 8:45 If you look at the subset 8:47 of XML that people really 8:49 use, you've got attributes, 8:51 sub-elements and text, and 8:52 that's more or less it. 8:54 If you look at JSON, you've got 8:55 your basic values and you've got your objects and your arrays. 8:58 I think the issue is that 8:59 XML has a lot of 9:01 extra stuff that goes along with it. 9:03 So if you read the entire XML specification, 9:06 it will take you a long time. 9:08 JSON, you can grasp the 9:10 entire specification a little bit more quickly. 9:12 Now let's turn to validity. 9:14 And by validity, I mean the 9:16 ability to specify constraints or 9:18 restrictions or a schema on 9:20 the structure of data 9:22 in one of these models, and 9:24 have it enforced by tools or by a system. 9:27 Specifically in XML we 9:28 have the notion of document type 9:30 definitions, or DTDs. We also 9:32 have XML Schema, which 9:34 gives us XSDs, XML Schema Definitions. 9:38 And these are schema-like 9:39 things that we can specify, and 9:41 we can have our data checked to 9:42 make sure it conforms to the 9:43 schema, and these are, I would say, 9:45 fairly widely used at this point for XML. 9:49 For JSON, there's something called JSON Schema.
9:51 And, you know, similar to 9:53 XML Schema, it's a way 9:55 to specify the structure, and then 9:57 we can check that JSON conforms 9:58 to that, and we will see some of that in our demo. 10:02 The current status, February 10:04 2012, is that this is 10:07 not widely used at this point. 10:09 But again, it could really just be evolution. 10:11 If we look back 10:14 at XML, as it was originally 10:15 proposed, probably we didn't 10:17 see a whole lot of use 10:18 of DTDs, and in fact not 10:20 of XSDs for sure, until later on. 10:22 So we'll just have to see whether JSON evolves in a similar way. 10:26 Now the programming interface is where JSON really shines. 10:31 The programming interface for XML can be fairly clunky. 10:34 The XML model, the attributes 10:37 and sub-elements and so on, 10:39 don't typically match the model 10:41 of data inside a programming language. 10:43 In fact, that's something called the impedance mismatch. 10:47 The impedance mismatch 10:48 has been discussed in database 10:50 systems actually for decades, 10:52 because one of the original 10:54 criticisms of relational database 10:56 systems is that the data 10:57 structures used in the database, 10:59 specifically tables, didn't match 11:01 directly with the data structures in programming languages. 11:04 So there had to be some manipulation 11:05 at the interface between programming languages and the database system, and that's the mismatch. 11:09 So that same impedance mismatch 11:13 is pretty much present 11:15 in XML, whereas in JSON there is 11:17 really a more direct mapping 11:19 between many programming languages and the structures of JSON. 11:23 Finally, let's talk about querying. 11:25 I've already touched on this 11:27 a bit, but JSON does not 11:28 have any mature, widely 11:31 used query languages at this point. 11:33 For XML we do have 11:34 XPath, we have XQuery, 11:36 we have XSLT.
11:39 Maybe not all of 11:41 them are widely used, but there's 11:42 no question that XPath at least and 11:44 XSLT are used quite a bit. 11:46 As far as JSON goes, there 11:48 is a proposal called JSONPath. 11:50 It looks actually quite a lot 11:52 like XPath; maybe it'll catch on. 11:55 There's something called JSON Query. 11:56 Doesn't look so much like 11:58 XML Query, I mean, XQuery. 12:01 And finally, there has been a 12:02 proposal called the Jaql language, but 12:07 again, as of February 2012 12:08 all of these are still very 12:10 early, so we just don't know what's going to catch on. 12:13 So now let's talk about the validity of JSON data. 12:16 So JSON data that's 12:17 syntactically valid simply needs 12:19 to adhere to the basic structural requirements. 12:22 As a reminder, that would be 12:24 that we have sets of label-value 12:25 pairs, we have arrays 12:27 of values, and our values 12:29 are from predefined types. 12:31 And again, these values here are defined recursively. 12:34 So we start with a JSON 12:35 file and we send 12:37 it to a parser. The parser 12:39 may determine that the file 12:40 has syntactic errors, or if 12:42 the file is syntactically correct then 12:44 it can be parsed into objects in a programming language. 12:47 Now if we're interested in semantically 12:49 valid JSON, that is, 12:51 JSON that conforms to 12:52 some constraints or a schema, 12:54 then in addition to checking the 12:55 basic structural requirements, we check 12:57 whether JSON conforms to the specified schema.
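The syntactic check described above, where a parser either reports errors or produces objects, might be sketched like this with Python's standard `json` module. The helper name `parse_or_report` and the two sample strings are made up for this illustration.

```python
import json

def parse_or_report(text):
    """Return (value, None) if text is syntactically valid JSON,
    otherwise (None, a short description of the syntax error)."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, f"line {err.lineno}, column {err.colno}: {err.msg}"

# Well-formed: braces and brackets are balanced.
good, good_error = parse_or_report('{"title": "Example", "authors": ["A", "B"]}')

# Malformed: the array is closed with a curly brace instead of a square bracket.
bad, bad_error = parse_or_report('{"title": "Example", "authors": ["A", "B"}')

print(good)
print(bad_error)
```

Note that this only checks structure. A document can pass this step and still fail to match whatever constraints an application expects.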
13:00 If we use a language like JSON 13:02 Schema, for example, we put 13:03 a specification in as a 13:05 separate file, and in 13:07 fact JSON Schema is expressed in 13:09 JSON itself, as we'll see 13:11 in our demo. We send it 13:12 to a validator, and that 13:13 validator might find that there 13:15 are some syntactic errors, or 13:16 it may find that there are 13:17 some semantic errors, so the 13:19 data could be correct syntactically 13:21 but not conform to the schema. 13:23 If it's both syntactically and semantically 13:25 correct then it can move 13:26 on to the parser, where 13:28 it will be parsed, again, into 13:30 objects in a programming language. 13:32 So to summarize, JSON stands for JavaScript Object Notation. 13:36 It's a standard for taking data 13:38 objects and serializing them into a format that's human readable. 13:41 It's also very useful for 13:43 exchanging data between programs, 13:46 and for representing and storing 13:48 semi-structured data in a flexible fashion. 13:51 In the next video we'll go 13:52 live with a demonstration of JSON. 13:55 We'll use a couple of JSON 13:56 editors, we'll take a 13:57 look at the structure of JSON 13:59 data when it's syntactically correct. 14:01 We'll demonstrate how it's very 14:03 flexible when our data might 14:05 be irregular, and we'll also 14:06 demonstrate schema checking using 14:09 an example of JSON Schema.
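The two kinds of checking described in this video, syntactic and then semantic, can be sketched in a few lines of Python. This is not JSON Schema and not how any particular validator works; `REQUIRED_FIELDS` and `validate` are hypothetical, hand-written stand-ins that only check a few labels and their types.

```python
import json

# Hypothetical constraints, a simplified stand-in for a real schema language:
# each required label maps to the Python type(s) its parsed value must have.
REQUIRED_FIELDS = {"title": str, "price": (int, float), "authors": list}

def validate(text):
    """Return a list of error strings; an empty list means both checks passed."""
    try:
        data = json.loads(text)  # stage 1: syntactic check
    except json.JSONDecodeError as err:
        return [f"syntax error: {err.msg}"]
    if not isinstance(data, dict):
        return ["semantic error: top-level value must be an object"]
    errors = []
    for label, expected in REQUIRED_FIELDS.items():  # stage 2: semantic check
        if label not in data:
            errors.append(f"semantic error: missing label '{label}'")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[label], bool) or not isinstance(data[label], expected):
            errors.append(f"semantic error: '{label}' has the wrong type")
    return errors

print(validate('{"title": "T", "price": 10, "authors": []}'))
print(validate('{"title": "T", "price": "ten", "authors": []}'))
print(validate('{"title": "T"'))
```

The second call shows data that is correct syntactically but does not conform to the constraints, while the third fails at the syntactic stage and never reaches the semantic one.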