Maker Mind Meld Summit
Benoît Blanchon – Serialization and JSON
Benoît’s presentation is an introduction to JSON for Arduino users. He begins by implementing various serialization techniques to explain why JSON represents the best compromise for most projects. He continues with the implementation of JSON serialization from scratch, and finally, with a demonstration of how a library can be of help.
This presentation is for beginners; you only need to know the basics of Arduino programming and to be able to read simple C++ code. It is a practical course with many actionable code samples and is recommended for anyone who has a little experience with Arduino but has never used JSON.
Peter Dalmaris: Hi everyone, and welcome to this Special Maker Session. In this session, Benoit Blanchon will show us how to use the JSON data exchange format with the Arduino. There are many applications where JSON is an excellent fit, including in the Internet of Things, so stay tuned.
Peter Dalmaris: I'm Peter Dalmaris, an online educator and maker, author of Maker Education Revolution, and Founder at Tech Explorations. My mission is to help people learn electronics, programming, printed circuit board design, and lots more. Most importantly, I want to help as many people as possible to enjoy their technology education adventures.
Peter Dalmaris: In this session, I'm excited to introduce Benoit Blanchon. Benoit is a programmer with a Master's Degree in Electronics and 20 years of professional experience in software development. He's the creator of ArduinoJson, the most popular JSON library for the Arduino. He's also the author of The Ultimate Guide to Master ArduinoJson, one of my favourite programming books for the Arduino.
Peter Dalmaris: Writing a program for a constrained device, like a microcontroller, is challenging because of the limitations in storage and the programming interface. That's why it's essential to be efficient and clean. This is the approach that Benoit takes in his book and, as you'll see, in his presentation too. Benoit is also a Test-Driven Development practitioner. TDD is a programming approach that helps to write reliable code.
Peter Dalmaris: In this presentation, Benoit will introduce us to JSON for Arduino users. He'll demonstrate various serialization techniques to understand why JSON represents the best option for many projects. Then, Benoit will show us how to implement JSON serialization from scratch. And then, add the ArduinoJson library in the mix to make our life as a programmer much easier. Benoit, thank you for joining me today. How are you?
Benoit Blanchon: Hi, Peter. Thank you for having me. I'm very glad to be here. And I pray this will come out great.
Peter Dalmaris: I'm sure it will. I've got a question before you start, because I know you have a lot in store for us. It's going to be an epic presentation. I'd like to ask you, why did you decide to create ArduinoJson? Like, you could have done many other things or written many other libraries, why this particular one?
Benoit Blanchon: Well, I told the story on my blog. It was around 2014, I guess, maybe 2013. I was making a payment terminal for my company, so that employees could buy drinks with their badge, the badge they use to enter the building. And I was using an RFID reader and an Arduino Uno, and [inaudible].
Benoit Blanchon: And none of the options that I tried, like aJson and some other libraries, were able to fit in the Arduino Uno. All of them were making too many dynamic memory allocations. That's why I started this library, which was able to work with a fixed amount of memory, and especially with memory from the stack. And that solved all the problems. That was the main goal.
Peter Dalmaris: And with your library, you were able to do a lot of the manual work that you're going to show us later with just a few lines of code. So, that made it a lot more efficient and productive with the limited resources in the Arduino.
Benoit Blanchon: Right. This came over time. In the beginning, it was quite rough, but now it's version 6 of the library and it's got a much better syntax.
Peter Dalmaris: And you even wrote a book about it, so the library has been used by a lot of people. It's been tested over time. And your book doesn't just talk about the library, you also have a very interesting, I think, two or three chapters with programming techniques and principles. Like, to better use the limited memory resources of the Arduino, which was a great component of your book.
Benoit Blanchon: Yeah. Yeah. Well, it's actually one chapter and I call it The Missing C++ Course. It covers the things that the other books or tutorials don't tell you: what the stack is, what a pointer is, how memory allocation works, and why it matters. That's an important part of the book.
Peter Dalmaris: Definitely it is. And we'll hear more about it in your presentation. So, take it away. The virtual floor, as I say, is yours.
Benoit Blanchon: Well, welcome to this course on serialization and JSON. Before we jump in, let's check that this course is really for you. The goal of this course is to discover JSON through Arduino. It's intended for beginners and requires very little prior knowledge. You need to be able to read some very simple C++ code. You need to know the basics of Arduino. In particular, you need to be able to [inaudible].
Benoit Blanchon: Let's review the content of the presentation. First, I will talk a little about me. Then, we'll have a large part dedicated to serialization in general. We'll start by seeing what it is. Then we'll implement serialization from scratch, and that will lead us to talk about JSON. Then we'll have another large part dedicated to JSON: again, we'll see what it is and what the syntax is. We'll implement JSON from scratch and then use a library, ArduinoJson. And then, I will conclude by showing you some alternatives to JSON.
Benoit Blanchon: So, who am I? I've been a professional programmer for 20 years. I'm the creator of ArduinoJson, and ArduinoJson is the most popular library for Arduino today. I'm also the author of several other libraries, like ArduinoTrace and StreamUtils.
Benoit Blanchon: Okay. Let's begin. Let's begin our journey through serialization, and I call this a rite of passage to fully appreciate JSON. And I'm sure you will understand why when we are done. And everything that we see today is based on this true story. My story.
Benoit Blanchon: So, what is serialization? Simply put, serialization is just the act of transforming data into a series of bytes. Why do we do that? We do that to send the data or to store the data. And we call the reverse operation the deserialization.
Benoit Blanchon: Let's see an example. Let's say you have two Arduinos. One has some sensors and the other one has an LCD screen, and you want to display the value of the sensor on the screen. So, you are using the serial connection to transmit information over the wire. In this case, the data is the temperature, the humidity, and a timestamp that is added by the first Arduino. The first Arduino serializes the information, and the second one deserializes it.
Benoit Blanchon: Okay. I will show some code in a moment, but to be sure that everyone is on the same page, I'm going to show you the conventions that I will use in the next slides.
Benoit Blanchon: First, we'll transmit binary data, and not text as we usually do [inaudible]. So, we cannot use Serial.print as usual. Instead, we have to use Serial.write. Similarly, we cannot use Serial.readString. We must use Serial.readBytes. These two functions take a pointer and a size. In other words, we will transmit the bytes that are present in memory.
Peter Dalmaris: Benoit, can I ask you here, why can't we use Serial.print to send the data?
Benoit Blanchon: Because it will transform the information and, also, because it's typed. If you call, for example, print with an integer, it will send the integer as text. So, it will send the decimal representation instead of the binary, instead of the actual bytes that are in your memory. Did I make it clear?
Peter Dalmaris: Yes. If I understand right, the second example, with the integer, for example, when you do a Serial.print, what the print function does is to change the decimal into something that can be printed in the serial monitor into characters. So, if you do it like that, essentially the receiver will receive something else, not what you intended. You intend to send the integer, not the characters.
Benoit Blanchon: Exactly. So, we are not using the characters now. In the next few slides, I'm going to use binary. And as you will see in the next slide, we'll implement a serialization code incrementally. I will start with a very simple version that I call Generation 1, then improve it with Generation 2, et cetera. You get the idea?
Peter Dalmaris: Yeah.
Benoit Blanchon: Let's start with Generation 1. For example, we have three values we want to send over the wire. The first version of our serialization code just sends the values one after the other. So, I start by sending the temperature. As you can see, I'm calling Serial.write. I pass the address of the variable and the size of the variable.
Benoit Blanchon: So, if my variable is 4 bytes, Serial.write will send the 4 bytes that are present in memory at this address.
Peter Dalmaris: Sorry to interrupt you. That little sign there is an operator, right?
Benoit Blanchon: The ampersand. Yes. The ampersand operator is saying get the address of this variable. So, that's going to be the location in memory of that variable.
Peter Dalmaris: Right. So, that means the write function is looking for a memory location to do something with whatever is stored there, right?
Benoit Blanchon: Exactly.
Peter Dalmaris: Where print looks for the value that exists in a variable.
Benoit Blanchon: Exactly. The value.
Peter Dalmaris: That's another reason to use write instead of print. Thanks.
Benoit Blanchon: Then, do the same for the last value I want to send. And let's review quickly what is sent over the wire. So, we have 4 bytes for the temperature, 4 bytes for humidity, 4 bytes for the timestamp. So, that's 12 bytes that are transmitted over the serial path.
Benoit Blanchon: Here is how we can deserialize that. We just use Serial.readBytes, this time passing the address of the variable we want to set and its size. We do the same for humidity and the same for timestamp.
Benoit Blanchon: These are the pros of this approach. First, it's quite simple. I mean, I think it is. It's very performant because your processor doesn't have any transformation to do, and it has no overhead.
Benoit Blanchon: But the main problem of this technique is that it's really repetitive. And, therefore, it's error prone. And in particular, you have to take good care of writing and reading the values in the same order and also to match the variable with the size and type.
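The slide code is not part of this transcript, so what follows is only a minimal sketch of the Generation 1 idea, not Benoit's code. It is written for a desktop C++ compiler: the hypothetical `Wire` type stands in for the serial link, and its `write` and `readBytes` members merely mimic the pointer-and-size shape of `Serial.write` and `Serial.readBytes`. The function names and the use of `uint32_t` for the timestamp (the talk uses `long` at this stage) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the serial link between the two boards.
struct Wire {
  std::vector<uint8_t> bytes;  // everything written so far
  std::size_t pos = 0;         // current read position

  // Same shape as Serial.write(buffer, size): send `size` raw bytes.
  void write(const void* data, std::size_t size) {
    const uint8_t* p = static_cast<const uint8_t*>(data);
    bytes.insert(bytes.end(), p, p + size);
  }

  // Same shape as Serial.readBytes(buffer, size): returns the bytes read.
  std::size_t readBytes(void* data, std::size_t size) {
    std::size_t available = bytes.size() - pos;
    std::size_t n = size < available ? size : available;
    if (n > 0) std::memcpy(data, bytes.data() + pos, n);
    pos += n;
    return n;
  }
};

// Generation 1: send each value one after the other, as raw bytes.
void sendGen1(Wire& wire, float temperature, float humidity, uint32_t timestamp) {
  wire.write(&temperature, sizeof(temperature));
  wire.write(&humidity, sizeof(humidity));
  wire.write(&timestamp, sizeof(timestamp));
}

// The reader must use the same order, sizes, and types as the writer.
bool receiveGen1(Wire& wire, float& temperature, float& humidity, uint32_t& timestamp) {
  return wire.readBytes(&temperature, sizeof(temperature)) == sizeof(temperature) &&
         wire.readBytes(&humidity, sizeof(humidity)) == sizeof(humidity) &&
         wire.readBytes(&timestamp, sizeof(timestamp)) == sizeof(timestamp);
}
```

Swapping two of the `readBytes` calls in `receiveGen1` would still compile, which is the kind of mistake the "error prone" remark refers to.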
Benoit Blanchon: Let's improve this code with Generation 2. So, we have all three values that we want to transmit. Let's bundle them in a structure. That allows us to reduce our serialization code. So, instead of saying these are three values independently, we'll just send the whole structure in one shot. And we can do the same on the deserialization part. We can reduce this to just one line. And it doesn't affect what is sent over the wire. It's still 12 bytes with all three values.
Benoit Blanchon: The pros of this approach: it's even simpler than the previous one. It's still very performant. It's foolproof. But the main problems are that it's not portable, and I will show you why in a moment, and it's really rigid. We are limited to the values that are present in the structure.
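As a rough illustration of Generation 2, here is a minimal sketch that bundles the three values in a structure and copies the whole structure in one shot. It targets a desktop compiler and uses a byte vector in place of the serial port; the names `SensorData`, `serialize`, and `deserialize` are hypothetical, and the timestamp is declared as `uint32_t` so that the structure is 12 bytes on a desktop as well (the talk still uses `long` at this point).

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// The three values bundled together (hypothetical name).
struct SensorData {
  float temperature;
  float humidity;
  uint32_t timestamp;
};

// Copy the raw bytes of the whole structure in one shot, which is what a
// single Serial.write of the structure's address and size would send.
std::vector<uint8_t> serialize(const SensorData& data) {
  std::vector<uint8_t> out(sizeof(data));
  std::memcpy(out.data(), &data, sizeof(data));
  return out;
}

// One copy in the other direction; the sizes must match exactly.
bool deserialize(const std::vector<uint8_t>& in, SensorData& data) {
  if (in.size() != sizeof(data)) return false;
  std::memcpy(&data, in.data(), sizeof(data));
  return true;
}
```

Both sides must agree on the exact memory layout of `SensorData`, which is where the portability concerns discussed next come from.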
Benoit Blanchon: Let's see the portability issue.
Peter Dalmaris: Just a question with Gen 2, what do you mean by portability?
Benoit Blanchon: That's what I'm going to address in Generation 3. There are two portability issues. First, the size of long is platform dependent. For example, on the Arduino, it's 4 bytes, and on 64-bit Linux, it's going to be 8 bytes. So, if we want to write a serialization format that can work across different hardware, we must address this problem.
Benoit Blanchon: And we will address it very simply by replacing the type long with uint32_t. This type comes from the stdint.h header. We call that a portable integer type. What is sent over the wire is still exactly the same as before. We just made it clear that we are sending a 32-bit integer.
Benoit Blanchon: And then, there is another portability issue, which is endianness. Not every machine stores the bytes of a word in the same order. For example, an int32_t can be stored with the least significant byte first, which we call little endian, or with the most significant byte first, which we call big endian. We are not going to address this in this presentation, but you get the idea.
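The talk leaves endianness unaddressed. For readers who want to see what the two byte orders mean in practice, here is a small sketch of one common way to pin the wire format to little endian regardless of the machine: build the bytes with shifts instead of copying memory. The helper names `putU32LE` and `getU32LE` are hypothetical and are not part of the presentation.

```cpp
#include <cstdint>

// Write a 32-bit value with the least significant byte first (little endian),
// whatever the byte order of the machine running this code.
void putU32LE(uint8_t out[4], uint32_t value) {
  out[0] = static_cast<uint8_t>(value);
  out[1] = static_cast<uint8_t>(value >> 8);
  out[2] = static_cast<uint8_t>(value >> 16);
  out[3] = static_cast<uint8_t>(value >> 24);
}

// Rebuild the value from four little-endian bytes.
uint32_t getU32LE(const uint8_t in[4]) {
  return static_cast<uint32_t>(in[0]) |
         (static_cast<uint32_t>(in[1]) << 8) |
         (static_cast<uint32_t>(in[2]) << 16) |
         (static_cast<uint32_t>(in[3]) << 24);
}
```

For the value 0x11223344, the first byte on the wire is 0x44 and the last is 0x11; a big-endian format would send them in the opposite order.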
Benoit Blanchon: Now that we've clarified how we can make our code more portable, let's go to Generation 4 and introduce a string in our data. This is our initial data structure; let's add a string.
Benoit Blanchon: The code from Generation 2 will still work; the only difference is that we now need to initialize this value. But you cannot write this. You cannot assign a string to a char array. Instead, you must use strcpy. Even better, you should use strlcpy, which protects you against buffer overruns because the last parameter specifies the size of the destination buffer. So, that's for Generation 4.
Benoit Blanchon: And let's see what is sent over the wire. We still have 4 bytes for the temperature, 4 for the humidity, and 4 for the timestamp. And now, we send 16 bytes for the location. So, even if we set a shorter string, we will still need to send 16 bytes.
Benoit Blanchon: These are the pros of this approach: it's still simple, and it's still portable. But the main problem is that it's really rigid. As we saw, it's 16 characters, not more, not less. And there is significant overhead if you are just sending two or three letters for the location.
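To make Generation 4 concrete, here is a minimal sketch of the structure with a fixed 16-byte location field. Because `strlcpy` is not available in every desktop C library, the sketch uses a hypothetical helper, `copyString`, with the same intent: it never writes more than the destination size and always terminates the string. The structure and helper names are assumptions, not the code from the slides.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Generation 4 structure: three numbers plus a fixed-size string.
struct SensorData {
  float temperature;
  float humidity;
  uint32_t timestamp;
  char location[16];  // always 16 bytes on the wire, even for "Hall"
};

// Hypothetical helper with the same intent as strlcpy: copy at most
// size - 1 characters and always write the null terminator.
void copyString(char* dst, const char* src, std::size_t size) {
  if (size == 0) return;
  std::size_t n = std::strlen(src);
  if (n >= size) n = size - 1;  // truncate instead of overrunning
  std::memcpy(dst, src, n);
  dst[n] = '\0';
}
```

With this layout the whole structure is 28 bytes, and a call such as `copyString(data.location, "Living room", sizeof(data.location))` fills the field safely; a name longer than 15 characters is silently truncated.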
Benoit Blanchon: Let's address this problem with Generation 5. We'll introduce variable-length strings. This is our current data format. We'll extract the string and, instead, in the struct, we will just store the length of the string. Let's see how we can serialize this and send it over the wire.
Benoit Blanchon: So, the first step is, of course, to set the location length by calling strlen. Then, we can send the structure as we did before with Serial.write. And just after that, we send the string.
Benoit Blanchon: So, let's see what this sends over the wire. It's still 4 bytes each for the temperature, the humidity, and the timestamp. Now, we add 1 byte to specify the length of the string, and then 11 bytes for the actual string, assuming that we are still using the living room example.
Benoit Blanchon: Let's see how we can deserialize this data. First, as usual, we read the data structure, and then we read the string that follows it, passing the length of the string to the read call. After that, we insert the null terminator so we are sure that the string is actually terminated, in case it was not in the initial string.
Benoit Blanchon: Let's see the pros of this approach. There is no overhead compared to the previous one. We just send the bytes that we need. It's still performant.
Benoit Blanchon: But the main problem is that it's really error prone. In particular, you have to remember to set the length of the string. More importantly, you need to insert the null terminator.
Benoit Blanchon: But the biggest problem to me is: what is going to happen if someone sends you data with a location length above 16? In that case, the second readBytes will read more than 16 bytes, and this will cause a buffer overrun. Your microcontroller is going to crash.
Benoit Blanchon: And another problem is that it's still pretty rigid. We are still very limited in the data we are going to send. And we'll address that in Generation 6, where we will introduce versioning.
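The following sketch illustrates the Generation 5 layout together with the bounds check that the buffer overrun scenario calls for. It is an illustration under assumptions, not the slide code: a byte vector replaces the serial port, the names `Reading`, `kMaxLocation`, `serialize`, and `deserialize` are hypothetical, and the length byte is written separately from the structure so that compiler padding cannot change the 13-byte header on a desktop machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

const std::size_t kMaxLocation = 16;  // size of the receiving buffer

struct Reading {
  float temperature;
  float humidity;
  uint32_t timestamp;
};

// Wire layout: 12 bytes of Reading, 1 length byte, then the string bytes
// without a null terminator.
std::vector<uint8_t> serialize(const Reading& r, const char* location) {
  std::size_t len = std::strlen(location);
  if (len > 255) len = 255;  // the length must fit in one byte
  std::vector<uint8_t> out(sizeof(r) + 1 + len);
  std::memcpy(out.data(), &r, sizeof(r));
  out[sizeof(r)] = static_cast<uint8_t>(len);
  std::memcpy(out.data() + sizeof(r) + 1, location, len);
  return out;
}

bool deserialize(const std::vector<uint8_t>& in, Reading& r,
                 char (&location)[kMaxLocation]) {
  if (in.size() < sizeof(r) + 1) return false;
  std::memcpy(&r, in.data(), sizeof(r));
  std::size_t len = in[sizeof(r)];
  // Never trust the length from the wire: keep room for the terminator.
  if (len >= kMaxLocation) return false;
  if (in.size() < sizeof(r) + 1 + len) return false;
  std::memcpy(location, in.data() + sizeof(r) + 1, len);
  location[len] = '\0';  // the terminator is not sent, so add it here
  return true;
}
```

With "Living room" as the location, the message is 24 bytes: 12 for the numbers, 1 for the length, and 11 for the string. A message that claims a length of 200 is rejected instead of overrunning the buffer.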
Benoit Blanchon: This is our structure. And let's say, now we have two versions of the structure. We have the original, Version 1, and a new one that I call Version 2, which adds the pressure measurement.
Benoit Blanchon: So, to write code that will be able to work with both versions at the same time, we'll bundle them in another data structure, and we'll use a union.
Benoit Blanchon: And if you don't know what the union is, it's like a struct, except that the values overlap. So, it means that you cannot use both at the same time because they're both in the same location in memory. And you need to know which one is going to be used. So, that's why I added a byte, a version identifier, at the beginning of the structure.
Benoit Blanchon: And as we are going to see in the next slide, it's very important that this byte comes first. What is sent over the wire? We still have our initial structure. We just added 1 byte to the front; that's for V1. If we are sending V2, it's the same, except that at the end we also added the pressure. So, these are our two versions of the message.
Benoit Blanchon: Let's see how we can implement the serialization code. The first step is to send the version byte. Then, if the version is 1, we send Version 1. Otherwise, we send Version 2. We're still using the same syntax, except that this time I'm passing the right field from the union.
Benoit Blanchon: To deserialize this, we need to read the version. If the version is 1, then we read the Version 1. Otherwise, we read the Version 2.
Benoit Blanchon: Pros of this approach. It's very flexible. You can add as many versions as you want. They can be totally different if you want. So, it means you can send very different messages over the wire, not only the temperature and humidity. You can do whatever you want. It's still very performant.
Benoit Blanchon: But the main problems are it's very error prone. I think because it's quite complicated. And my sentiment about that is, I think it's too much effort for the code that I'm writing every day.
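Here is a minimal sketch of the Generation 6 idea: a version byte followed by whichever member of the union is in use. As before, a byte vector stands in for the serial port, and all names (`DataV1`, `DataV2`, `Message`, `serialize`, `deserialize`) are hypothetical. One deliberate difference from the description in the talk: this sketch rejects unknown version numbers instead of treating everything that is not Version 1 as Version 2.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct DataV1 {
  float temperature;
  float humidity;
  uint32_t timestamp;
};

struct DataV2 {
  float temperature;
  float humidity;
  uint32_t timestamp;
  float pressure;  // added in version 2
};

struct Message {
  uint8_t version;  // tells which member of the union is in use
  union {
    DataV1 v1;
    DataV2 v2;
  };
};

// Send the version byte first, then only the matching member.
std::vector<uint8_t> serialize(const Message& msg) {
  const void* payload = (msg.version == 1)
                            ? static_cast<const void*>(&msg.v1)
                            : static_cast<const void*>(&msg.v2);
  std::size_t size = (msg.version == 1) ? sizeof(msg.v1) : sizeof(msg.v2);
  std::vector<uint8_t> out(1 + size);
  out[0] = msg.version;
  std::memcpy(out.data() + 1, payload, size);
  return out;
}

// Read the version byte first; it decides how many bytes follow.
bool deserialize(const std::vector<uint8_t>& in, Message& msg) {
  if (in.empty()) return false;
  if (in[0] == 1 && in.size() == 1 + sizeof(DataV1)) {
    msg.version = 1;
    std::memcpy(&msg.v1, in.data() + 1, sizeof(DataV1));
    return true;
  }
  if (in[0] == 2 && in.size() == 1 + sizeof(DataV2)) {
    msg.version = 2;
    std::memcpy(&msg.v2, in.data() + 1, sizeof(DataV2));
    return true;
  }
  return false;  // unknown version or truncated message
}
```

A Version 1 message is 13 bytes on the wire and a Version 2 message is 17 bytes. The reader can only know which one to expect because the version byte arrives first.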
Benoit Blanchon: So, let's summarize. We wrote serialization code that can store or send multiple values efficiently, that supports variable-length strings, that can work with different platforms and different hardware, and that supports multiple versions. But, unfortunately, it got quite complicated over time.
Benoit Blanchon: So, let's try a different approach. Let's try XML, Generation 7. So, these are two versions or two messages that we want to send over the wire. Let's say, now, we change our rule and we say these are XML documents.
Benoit Blanchon: So, immediately we can see the pros of this approach. Flexible, I can insert or remove fields as I want without breaking the compatibility. It's portable. By design, I don't have to care about the size of the integer because they are sent as text. I don't have to think about the endianness for the same reason.
Benoit Blanchon: And a new feature: it's human readable, so I can monitor the serial link and debug my [inaudible]. I love it. Let's give it a shot.
Benoit Blanchon: So, Generation 7, let's see how we can implement the serialization code. We'll use a very simple technique. We send the first string segment, with the opening tags for data and temperature, then send the value. You'll notice that this time I'm using Serial.print and not Serial.write, because I want to send text. Then, we close the tag, do the same for humidity, the same for the timestamp, and the same for the pressure, and that's all.
Peter Dalmaris: Sorry to interrupt you, Benoit. Here you're using prints because, exactly, you want to send text out as a text, so print is okay.
Benoit Blanchon: Yeah. Yeah. XML is a pure text format, so instead of sending the raw bytes that are present in memory, you say, no, I'm going to send a text version of the structure. XML is just a way of laying out the information as you see on the left. It's just a writing convention, a format for laying out objects as text.
Benoit Blanchon: Let's see what is sent over the wire now. So, we have 6 bytes to send the initial opening tag, 31 bytes for the temperature, including the opening and closing tags, 23 for humidity, 28 for timestamp, 25 for the pressure, and, finally, 7 bytes to close the data tag. That's a total of 120 bytes. That's roughly six times bigger than Generation 6.
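The byte counts above can be checked with a small sketch that builds the XML text segment by segment, the way the serialization code is described. It builds a `std::string` on a desktop compiler instead of calling `Serial.print`. The function names and the sample values (21.2, 48, 12345, 1013) are assumptions chosen so that each value has the same number of characters as in the byte counts quoted in the talk; the actual values on the slide are not in the transcript.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a float in its shortest form ("21.2", "48"); this stands in for
// printing the value as text.
std::string number(float value) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%g", static_cast<double>(value));
  return buf;
}

// Build the document one segment at a time: open tag, value, close tag.
std::string toXml(float temperature, float humidity, uint32_t timestamp,
                  float pressure) {
  std::string out = "<data>";                                         // 6 bytes
  out += "<temperature>" + number(temperature) + "</temperature>";
  out += "<humidity>" + number(humidity) + "</humidity>";
  out += "<timestamp>" + std::to_string(timestamp) + "</timestamp>";
  out += "<pressure>" + number(pressure) + "</pressure>";
  out += "</data>";                                                   // 7 bytes
  return out;
}
```

Calling `toXml(21.2f, 48.0f, 12345, 1013.0f)` yields a 120-byte string: 6 + 31 + 23 + 28 + 25 + 7.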
Benoit Blanchon: My reaction to that is, okay, it's probably worth it. I mean, now it's human readable and it's portable. Let's see the deserialization code now. So, we are deserializing the same document.
Benoit Blanchon: And now, I'm going to use a technique that probably few people know: Serial.find. You pass a string to Serial.find, and it will read every byte from the serial [inaudible] and stop as soon as it has read the string you specified.
Benoit Blanchon: So, in our case, I'm going to look for the temperature opening tag, and it stops reading right there. This allows me to call Serial.parseFloat just after that to read the text representation of the temperature. I do the same for humidity and the same for the timestamp. And if I find the pressure tag, I can extract the pressure.
Benoit Blanchon: But one significant problem to this approach, Serial.find only reads in one direction. So, for example, if pressure comes before the timestamp, you'll not see that pressure was present in the string. So, that's a huge problem.
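The ordering problem can be reproduced on a desktop with a small forward-only reader. The `ForwardReader` class below is a hypothetical stand-in for the serial stream: its `find` and `parseFloat` members only approximate the behaviour described for `Serial.find` and `Serial.parseFloat` (consume input, never go back), and they are not the Arduino implementations.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical stand-in for a serial stream: it can only move forward.
class ForwardReader {
 public:
  explicit ForwardReader(std::string input) : input_(std::move(input)) {}

  // Consume input until `target` has been read. If the target never
  // shows up, everything has been consumed and the call fails.
  bool find(const std::string& target) {
    std::size_t at = input_.find(target, pos_);
    if (at == std::string::npos) {
      pos_ = input_.size();
      return false;
    }
    pos_ = at + target.size();
    return true;
  }

  // Read the number that starts at the current position.
  float parseFloat() {
    const char* start = input_.c_str() + pos_;
    char* end = nullptr;
    float value = std::strtof(start, &end);
    pos_ += static_cast<std::size_t>(end - start);
    return value;
  }

 private:
  std::string input_;
  std::size_t pos_ = 0;
};
```

If the document carries the pressure before the timestamp, looking for the timestamp tag first skips over the pressure, and a later `find("<pressure>")` fails even though the value was in the message.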
Benoit Blanchon: Now, how can we fix that? Let's use the library. How hard could it be, right? So, I'm going to demonstrate how we can use libxml, which is the most popular XML library, as far as I know. So, we just need to include two additional headers. Let's give this a try.
Benoit Blanchon: So, I'm going to read the whole XML document into a buffer, this time using Serial.readString. It will read until it reaches the timeout, so it's going to wait for a timeout. That may be a problem that you want to address, but that's not the problem today. Then, I call xmlReadMemory, passing the address of the buffer so it can deserialize the XML string in memory. The next step is to extract the root element. In our case, it's the data tag.
Benoit Blanchon: After that, we need to walk through each child of the root node. We need to get the name of the tag from it. And don't forget to cast it, because it's not a plain char. After that, we need to extract the content of the tag.
Peter Dalmaris: I'm getting a headache now.
Benoit Blanchon: Yeah. Me too. But stay focused. Stay with me a bit.
Peter Dalmaris: I'm trying.
Benoit Blanchon: So, if the tag matches temperature, I'm going to call atof to parse the string, and this is going to return the float. Now, do the same for humidity, the same for pressure, and the same for timestamp. After that, don't forget to free the value, I mean, the content of the child node. And after that, don't forget to free the document to avoid memory leaks.
Benoit Blanchon: My sentiment about that is this was horrible.
Peter Dalmaris: And my feeling too.
Benoit Blanchon: So, I could pass this as a string to eval. The string is the content of my object, and eval will return me the value. This way, I don't have to write any deserialization code. It's already written for me. And I will read that string from a text file, so that is how I'm going to save and restore my configuration.
Benoit Blanchon: By the way, if you're doing that nowadays, don't use eval. Instead, use JSON.parse, it's much safer and probably a little faster.
Benoit Blanchon: It turned out somebody already discovered JSON. I'm not the inventor of JSON. And this someone is Douglas Crockford. He created json.org around 2002. So, if you wonder why I was not aware of that in 2006, I'm going to remind you that this was three years before Stack Overflow. This was a completely different era.
Benoit Blanchon: So, Douglas Crockford described JSON as the fat-free alternative to XML. There's a whole page that compares the two approaches. It's a lightweight data interchange format, easy for humans to read and write, easy for machines to parse and generate. That pretty much summarizes the spirit of this format.
Benoit Blanchon: So, let's see the syntax in detail. There are several kinds of values you can store in JSON. The first group, which I call simple values, has the strings, the numbers, and the booleans. Then, we have the complex values, like arrays, which are sequences of values, and objects, which are dictionaries of values.
Benoit Blanchon: Let's see strings in more detail now. So, strings in JSON are delimited by double quotes. Single quotes are excluded. And symbols without any quotes are excluded too. JSON supports escape sequences, like this line break (\n) between the words Hello and World. It supports Unicode escape sequences, like this fancy î in my first name. And here you can see the syntax is very similar to C++ with one exception, this little guy.
Benoit Blanchon: Numbers in JSON are only written in decimal, so you cannot have hexadecimal values in JSON. You can write integers and you can also write floats, including in scientific notation using e. Again, this is very similar to C++.
Benoit Blanchon: Arrays in JSON are delimited by square brackets. You can mix types in an array. You can nest arrays, so an array can contain an array in an array [inaudible]. The syntax is very similar to Python this time, except that you cannot have a trailing comma. You cannot have a comma after the last value.
Benoit Blanchon: Objects in JSON are delimited by curly braces. The keys are strings. The values are anything you want. You can nest arrays in objects. Of course, you can nest objects in objects. Again, the syntax is very similar to Python, except that you cannot have a trailing comma. You cannot have a comma after the last value.
Benoit Blanchon: Various facts about JSON: you can have null inside a JSON document. This can be quite useful. And JSON allows you to put any number of spaces, line breaks, or tabs between values. It doesn't change the content of the document. It allows you to lay it out to make it readable.
Benoit Blanchon: Let's write our Generation 8 now, where we use JSON. These are the two messages we want to send over the wire, transformed into JSON documents. These are the pros of this format compared to the previous one: we still have something very flexible. We can insert and remove values, we can nest, we can do anything we want with this format without breaking the compatibility.
Benoit Blanchon: It's portable by design, like XML. Since it's text, we don't have to care about the size of the word or the order of the bytes, the endianness. It's human readable. And more importantly, it's concise. My reaction to that is, wow. This is going to be so cool.
Benoit Blanchon: Let's start. Let's implement that. So, this is the message we want to send. For serialization, we can write it exactly the same way we did for the XML code. So, first we send the first string segment, then send the value of the temperature. And, again, we're using print because this is text. And then, we do the same for the humidity, the same for the timestamp, and the same for the pressure. And let's see what is sent over the wire.
Benoit Blanchon: We have 5 bytes for the top-level object: the two curly braces and the three commas. Then, we have 18 bytes for the temperature, including the key, the quotes, and the colon, 13 bytes for humidity, 17 for timestamp, et cetera. So, that's a total of 68 bytes, almost half the size of the XML document.
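The same segment-by-segment technique can be sketched for JSON. As with the other examples, this is a desktop illustration that builds a `std::string` rather than calling `Serial.print`, and the function names and sample values (21.2, 48, 12345, 1013) are assumptions picked to match the character counts quoted in the talk, not the values shown on the slide.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a float in its shortest form ("21.2", "48"); this stands in for
// printing the value as text.
std::string number(float value) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%g", static_cast<double>(value));
  return buf;
}

// Build the document one segment at a time: key, colon, value, comma.
std::string toJson(float temperature, float humidity, uint32_t timestamp,
                   float pressure) {
  std::string out = "{";
  out += "\"temperature\":" + number(temperature) + ",";
  out += "\"humidity\":" + number(humidity) + ",";
  out += "\"timestamp\":" + std::to_string(timestamp) + ",";
  out += "\"pressure\":" + number(pressure);  // no trailing comma in JSON
  out += "}";
  return out;
}
```

Calling `toJson(21.2f, 48.0f, 12345, 1013.0f)` yields a 68-byte string: 5 bytes of braces and commas, then 18 + 13 + 17 + 15 bytes for the four key-value pairs.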
Benoit Blanchon: Let's see how we can write the deserialization code now. Let's try the simple approach as we did before, with Serial.find. So, we look for the temperature key and read the temperature, then do the same for humidity, the same for the timestamp, and the same for the pressure. And, of course, we have the same problem as with XML. So, yeah, buddy. We'll have to try with another library. But I promise it's going to be better this time.
Benoit Blanchon: Let's introduce ArduinoJson, a JSON library for Arduino and every embedded C++ project. Two words about ArduinoJson. It's currently the most popular Arduino library. As I said, I've been continuously developing and improving this library since 2014.
Benoit Blanchon: It works, as I said, with any C++ project. And this is a huge win if you want to write unit tests and run them on your computer. It's optimized for embedded. And as you will see, this is optimized for low memory and low CPU hardware. It's well-tested with over 98 percent coverage. And it's well documented; there is a whole website for that, arduinojson.org.
Benoit Blanchon: Let's see how we can deserialize with ArduinoJson. This is our message and this is going to be our code. With ArduinoJson, everything starts with a JsonDocument. In this case, I'm going to use a StaticJsonDocument with a capacity of 200 bytes. I will talk about the JsonDocument in a few slides.
Benoit Blanchon: Then, after that, we are going to call deserializeJson. This function, as the name says, will deserialize what comes from the serial path and put the information in the document.
Benoit Blanchon: Then, we'll extract the values one by one. We'll extract the temperature with this fancy dictionary syntax. We don't have to care about parsing the float. This is taken care of by ArduinoJson because it sees that we are assigning to a float variable, so it knows it has to parse a float. The same goes for humidity, timestamp, and pressure.
Benoit Blanchon: We don't have to verify that the pressure is present in the document. If it's not, ArduinoJson will just return zero, and we know that a pressure of zero means it was not present. It's that simple.
Peter Dalmaris: Just a quick question, Benoit. So, I guess these variables, temperature, humidity, et cetera, these are declared somewhere else with the correct types. So, you'll say something in the program float temperature.
Benoit Blanchon: Exactly. I'm assuming that we preserve the four variables that I introduced in Generation 1. So, it's still -
Peter Dalmaris: And the library will know that when you say doc temperature, whatever it is in the temperature field in the JSON message must be converted into the right data type and then put into the temperature variable. You don't have to worry.
Benoit Blanchon: Yeah. Yeah. It can deduce that from the assignment, because you assigned it to a float. There are alternative syntaxes if you are in a situation where this doesn't work, for example, if you are using the auto keyword. But I'd invite you to check the syntax on arduinojson.org if you want to know the details.
Peter Dalmaris: Just [inaudible] on this. Let's say that in your JSON, in temperature, instead of 21.2, you have just 21, which is an integer. Will the library convert that to 21.0 and then store it in the floating-point type variable?
Benoit Blanchon: Yeah. In that case, ArduinoJson will store 21 as an integer, exactly as written in the document. But as soon as you extract it, it's cast to a float.
Peter Dalmaris: Yeah. Exactly. So, you got a bit of leeway there if the values in the JSON document itself are not precisely typed in, perhaps like missing decimal there. It's okay, the library can deal with that.
Benoit Blanchon: Yes. The library can deal with that. Exactly. And this is particularly true of ArduinoJson, even if you use quotes around 21.2, making it a string instead. If you try to extract a float from a string value, ArduinoJson knows it has to parse the float. So, that's really cool. You didn't know that, did you?
Peter Dalmaris: No. I did not know.
Benoit Blanchon: You're welcome.
Peter Dalmaris: You obviously thought about it.
Benoit Blanchon: No. I realized that this feature was needed just because many people were opening issues on GitHub saying that this doesn't work, without realizing that this was actually a string that was written by the [inaudible]. So, yeah, I had already implemented the parsing code, so it was basically free to do this in ArduinoJSON. And by free, I mean it doesn't make the code larger, which is another key feature of ArduinoJSON: it's tiny code.
Peter Dalmaris: Yeah, you're right. Because it's still text, whether it's got the double quotes or not, JSON is still text. So, it's not a big leap to detect that condition and then deal with it.
Benoit Blanchon: Yeah. Yeah. It wasn't a big deal.
Peter Dalmaris: Right. Thanks.
Benoit Blanchon: Where was I? Let's compare this solution with our previous code. So, this is our previous code. As you can notice, the code with ArduinoJSON is a little shorter, which is good. It's not sensitive to ordering. Now, we can send the values in a different order, and that's okay. It supports escape sequences, in case you're transmitting a string. And it supports nesting. What I mean by that is that you could extract a value that is inside an object. In this case, it could be an ssid inside a wifi object. You could chain this dictionary syntax like this.
Benoit Blanchon: And now, we can serialize with ArduinoJSON. This is our document. This is our code. Again, with ArduinoJSON, everything starts with a JsonDocument, still using the same JsonDocument as before. We first need to insert the values in the document before serializing it. So, we again use this dictionary-like syntax, and do the same for humidity, timestamp, and pressure.
Benoit Blanchon: And then, we just need to call serializeJson, and as the name suggests, it will serialize the document passed as the first parameter and send it to the second parameter. Those are your [inaudible].
Benoit Blanchon: Let's compare to the old code that we wrote, and it's about the same length. I think it's less error prone because it's simpler. It supports escape sequences as well. And it supports nesting. Indeed, we could chain the dictionary syntax and the subscript operator to insert a value in an object like this.
Benoit Blanchon: Let's talk about the JsonDocument for a minute. So, the JsonDocument, as I said, contains the memory representation of the actual document. So, it's a memory structure that contains the object and all the values. It's optimized for low memory and low CPU hardware. And it contains a fixed-size memory pool, which is a very important feature of ArduinoJSON.
Benoit Blanchon: So, when you create a JsonDocument, you have to specify its size at the beginning. And that's the amount of memory that we reserve for the parsing and for the serialization and deserialization.
Benoit Blanchon: And the advantages of this approach are, first, it's extremely fast. The code is tiny, and it prevents fragmentation of the heap. And that's very important. That's, in my opinion, the number one problem in embedded development, the fragmentation. And I have an article on my blog, C++ for Arduino. I invite you to check this article if you want to know more about heap fragmentation.
Benoit Blanchon: So, thanks to this fixed-size memory pool, the JsonDocument contains a blazing fast allocator, which is necessary when you have a lot of nested values in your document. It's very important to have a very fast memory allocator.
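To make the "fixed-size memory pool with a very fast allocator" idea concrete, here is a minimal sketch of a so-called bump allocator. The class name FixedPool and its members are hypothetical and chosen for illustration; this is an assumption about how such a pool can work in general, not ArduinoJSON's actual code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-size pool (illustration only, NOT the library's code).
// Allocation is a bounds check plus an addition, and blocks are never freed
// one by one, so the pool itself cannot fragment.
template <std::size_t Capacity>
class FixedPool {
 public:
  // Returns a block of at least `size` bytes, or nullptr when the pool is full.
  void* allocate(std::size_t size) {
    if (size > Capacity) return nullptr;
    // Round up so that every block stays aligned for any basic type.
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t rounded = (size + align - 1) / align * align;
    if (rounded > Capacity - used_) return nullptr;
    void* block = buffer_ + used_;
    used_ += rounded;
    return block;
  }

  // Releases everything at once.
  void clear() { used_ = 0; }

  std::size_t used() const { return used_; }

 private:
  alignas(std::max_align_t) std::uint8_t buffer_[Capacity];
  std::size_t used_ = 0;
};
```

The design choice to note is that nothing is released individually: the whole pool is reset in one step, which is what keeps allocation cheap and avoids holes in memory.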
Benoit Blanchon: The JsonDocument comes in two flavours, the StaticJsonDocument that we used in the previous slide and the DynamicJsonDocument. Let me talk about these two in this slide.
Benoit Blanchon: So, they both have a fixed capacity, a fixed-size memory pool. You set the size of StaticJsonDocument in a template parameter, just as we did: the value 200 is passed as a template parameter. Whereas, if you use a DynamicJsonDocument, you pass the value as a constructor parameter. This allows StaticJsonDocument to know its size at compile-time, that's very important. Whereas, DynamicJsonDocument knows its size only at run-time.
Benoit Blanchon: StaticJsonDocument allocates its memory pool as a regular byte array, just as you would do in your code. Whereas, DynamicJsonDocument calls malloc and free to do dynamic memory allocation in the heap. This allows StaticJsonDocument to be stored in the stack. Whereas, DynamicJsonDocument is stored in the heap.
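The difference between the two flavours can be sketched with two small stand-in types. StaticPool and DynamicPool are hypothetical names used only for this illustration; they are not the real StaticJsonDocument and DynamicJsonDocument classes, only a sketch of "capacity as a template parameter with a plain array" versus "capacity as a constructor parameter with malloc and free".

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in: the capacity is a template parameter, so the buffer
// is a plain array member and the size is known at compile-time.
template <std::size_t Capacity>
struct StaticPool {
  std::uint8_t buffer[Capacity];
  std::size_t capacity() const { return Capacity; }
};

// Hypothetical stand-in: the capacity is a constructor parameter, so the
// buffer comes from the heap and the size is known only at run-time.
class DynamicPool {
 public:
  explicit DynamicPool(std::size_t capacity)
      : buffer_(static_cast<std::uint8_t*>(std::malloc(capacity))),
        capacity_(buffer_ ? capacity : 0) {}  // malloc may fail
  ~DynamicPool() { std::free(buffer_); }
  DynamicPool(const DynamicPool&) = delete;
  DynamicPool& operator=(const DynamicPool&) = delete;
  std::size_t capacity() const { return capacity_; }

 private:
  std::uint8_t* buffer_;
  std::size_t capacity_;
};

// The compiler can check the static flavour's footprint before the program runs.
static_assert(sizeof(StaticPool<200>) == 200, "size known at compile-time");
```

A local variable of type StaticPool<200> lives entirely in the stack, whereas a local DynamicPool keeps only a pointer and a size there and places its buffer in the heap.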
Benoit Blanchon: And I realize maybe not all of you know what the stack and the heap are. Quickly put, the stack is the location where you store all your automatic variables: all the local variables that you declare in a function's code. For example, it's stored in the stack if you are using an integer [inaudible].
Benoit Blanchon: And this memory is not managed at run-time. It's managed at compile-time by the compiler. So, everything that you put in there costs absolutely zero instructions, or maybe one instruction, at run-time. So, the allocation here is the fastest you can do.
Benoit Blanchon: Whereas the heap, or the free store, as it is also called, is more elastic. You can, at run-time, allocate a block, release a block, et cetera. That's the memory you use when you create an instance of the String class. And it's a little slower and a little less reliable because the call to malloc may fail.
Benoit Blanchon: That's why I prefer using StaticJsonDocument when I can. And my recommendation is to use it for small documents, below 1 kilobyte, and reserve DynamicJsonDocument for larger documents. More on this topic on arduinojson.org. Any questions, Peter?
Peter Dalmaris: No. Awesome. That difference between the stack and the heap, I think, at least beginner Arduino makers don't really understand because most of us did not come from a C or C++ programming environment. We start with just making simple electronic gadgets and the programming is just an add-on, so we don't really get to that level of understanding. So, thank you for explaining, that's a really important point.
Benoit Blanchon: Yeah. That's why I cover this part in The Missing C++ Course in Mastering ArduinoJson.
Peter Dalmaris: I saw the chapter. I highly recommend it.
Benoit Blanchon: Many people told me that this was eye opening. They finally understand what's going on behind the scenes.
Peter Dalmaris: All right. Thanks.
Benoit Blanchon: Well, let's see how we can choose the capacity for the document because, as I said, when you create a JsonDocument, you have to give it a size. So, on the one hand, you need something that is large enough to store any valid document. And what I mean by valid is: you are running on an Arduino Uno with 2 KB of RAM.
Benoit Blanchon: It doesn't make sense to try to parse any possible JSON document of 1 megabyte, honestly. So, only think about your largest use case, what can be the biggest valid document, and exclude all the other ones with crazy long strings or a crazy number of values.
Peter Dalmaris: I think you have a calculator for that on your website, right?
Benoit Blanchon: Yeah. I'm going to talk about that.
Peter Dalmaris: On the other hand, your JsonDocument must be small enough to fit in the little RAM of your microcontroller. So, ArduinoJSON provides macros to compute precisely the required capacity: we have JSON_OBJECT_SIZE and JSON_ARRAY_SIZE. And on top of that, you have to remember to include the size of the strings.
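The arithmetic behind such a capacity can be sketched as "one slot per object member, plus room for the strings". The sketch below is illustrative only: the function estimateCapacity and the constant kAssumedSlotSize are hypothetical names, the 8-byte slot is an assumption for an 8-bit board such as the Uno (other architectures use larger slots), and whether strings need to be counted at all depends on how the input is provided. For real projects, use the library's own macros or its online assistant.

```cpp
#include <cstddef>
#include <cstring>
#include <initializer_list>

// ASSUMPTION: each object member occupies one 8-byte slot (8-bit AVR boards).
constexpr std::size_t kAssumedSlotSize = 8;

// Hypothetical helper: estimates the capacity for a flat object whose values
// are numbers, counting one slot per member plus a copy of each key
// (including its terminating NUL character).
std::size_t estimateCapacity(std::initializer_list<const char*> keys) {
  std::size_t total = keys.size() * kAssumedSlotSize;
  for (const char* key : keys) {
    total += std::strlen(key) + 1;
  }
  return total;
}
```

Under these assumptions, an object with the four keys used in this session (temperature, humidity, timestamp, pressure) comes to 4 × 8 = 32 bytes of slots plus 12 + 9 + 10 + 9 = 40 bytes of keys, that is 72 bytes.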
Benoit Blanchon: Yeah, you're right, buddy. That's too complicated. Can we avoid that? Yes and no. Again, we cannot avoid passing a capacity to the JsonDocument. But we can avoid [inaudible] computation. As Peter said, you can go to arduinojson.org and click on the Assistant tab, and it's going to bring you to this ArduinoJSON Assistant. On the left side, you can copy and paste your JSON document. And, automatically, on the right side, it will display the right expression using the macros I just presented.
Benoit Blanchon: But I do prefer using the values on the bottom here. And what does this say? It says that for an Arduino Uno, which runs on the AVR architecture, our document requires a JsonDocument of 72 bytes. So, that's the minimum required capacity to parse this document, so 200 bytes are more than enough.
Benoit Blanchon: As a bonus, the assistant also generates two programs for you. On the top is the serializing program and on the bottom is the deserializing program. And as you can see, it's very similar to the code that we wrote in the previous slides.
Peter Dalmaris: Another code generation.
Benoit Blanchon: Yeah. In a sense, yes. And then, you could copy-paste this, and this is just a starting block for your code. Of course, you are going to customize it.
Benoit Blanchon: And we are reaching the end of the presentation, so as I promised, I'm going to say a few last words about other formats. So, other generic formats. At the time we branched off to try XML and went the text format route, we could have taken another route and continued with a pure binary format.
Benoit Blanchon: And so, there are three popular formats in this area: Protocol Buffers, Cap'n Proto, and Apache Thrift. And they are all basically the same, so I will only present one. In the next slide, I will talk to you about Protocol Buffers.
Benoit Blanchon: And then, we have another family of formats, the Binary JSON. So, we have MessagePack, BSON, CBOR. I'm not a big fan of these formats because while they do reduce the payload a little and they do reduce the work for the microcontroller, they also destroy the human-readable aspect of JSON.
Benoit Blanchon: And I think the gain in size and in performance doesn't outweigh the loss of the human readability and portability of JSON. So, I don't think Binary JSON makes a good compromise.
Benoit Blanchon: Then, we have a third family, which is the extensions to JSON. We have JSON5 and HanSON. They are both basically the same. They [inaudible] more, trailing commas, comments, and stuff like that. I don't recommend using them because as soon as you do that you are writing code that is [inaudible] with other JSON code.
Benoit Blanchon: Then, in this family, we also have Amazon Ion. And this format is different from the two previous ones because it adds new formats like date, for example, and sequence of binary, but I need to check that.
Benoit Blanchon: As promised, let's talk a little about Protocol Buffers because it's a very important player in the serialization game. As I told you, it's a flexible binary format. It offers a decent speed. Of course, it cannot be as fast as a binary format from Generation 6 because it does some transformation. But it has a very low overhead.
Benoit Blanchon: And I'm pretty sure that the Protocol Buffers version of the serialization will be even smaller than what we did in Generation 6 because Protocol Buffers can store integers with a variable length. If you're [inaudible], we just send 1 byte, maybe less, instead of the four mandatory bytes [inaudible].
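The variable-length integers mentioned here follow the base-128 "varint" scheme: each byte carries 7 bits of the value, least-significant group first, and the high bit signals that more bytes follow. The sketch below is a standalone illustration of that encoding; the function names encodeVarint and decodeVarint are chosen for this example and do not come from any Protocol Buffers library.

```cpp
#include <cstdint>
#include <vector>

// Encodes an unsigned 32-bit integer as a base-128 varint (1 to 5 bytes).
std::vector<std::uint8_t> encodeVarint(std::uint32_t value) {
  std::vector<std::uint8_t> out;
  while (value >= 0x80) {
    // Low 7 bits, with the high bit set to say "more bytes follow".
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));  // last byte, high bit clear
  return out;
}

// Decodes a varint produced by encodeVarint.
std::uint32_t decodeVarint(const std::vector<std::uint8_t>& bytes) {
  std::uint32_t value = 0;
  int shift = 0;
  for (std::uint8_t b : bytes) {
    if (shift >= 32) break;  // ignore malformed input longer than 5 bytes
    value |= static_cast<std::uint32_t>(b & 0x7F) << shift;
    if ((b & 0x80) == 0) break;  // high bit clear: this was the last byte
    shift += 7;
  }
  return value;
}
```

Values below 128 take a single byte instead of the four bytes of a fixed-width 32-bit integer, while very large values take up to five bytes, so the saving depends on the data being mostly small numbers.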
Benoit Blanchon: To do this, Protocol Buffer requires that you write in advance the format of the message. So, this is our example, the four values that you should recognize. And as you can see, I put three values as required and one as optional.
Benoit Blanchon: This file is called a proto file; it must be compiled. So, that's what I dislike with Protocol Buffer that it uses code generation. Let me explain that.
Benoit Blanchon: So, we have a proto file, this file that describes our message. And let's say that it's called myformat.proto. You have to pass this file through the protoc compiler. This compiler, this code generator, will generate a .pb.cc file, which is just another extension for C++.
Benoit Blanchon: But the problem is our code, our leader [inaudible] already in another cpp file. So, we need to invoke the compiler, compile both files, which will generate two object files, and then invoke the linker, and that will generate our executable.
Benoit Blanchon: And I know some people will say, "Okay. That's right. That's okay. I'm just going to write [inaudible]." But other people will say, "What the fuck is a [inaudible]?" That's why I think Protocol Buffer is a very good solution, but it's absolutely not appropriate for Arduino sketches. I think that there's a lot of pain involved in the build process, that's why I don't recommend that for small Arduino sketches.
Benoit Blanchon: Well, that's all I have to say today. I hope it was okay.
Peter Dalmaris: Wow. That was quite a masterclass, I have to say.
Benoit Blanchon: Yeah. Intensive maybe talking about that.
Peter Dalmaris: It was intense and, like, it filled my brain to capacity with new stuff. I've read your books, some of them I knew. But having the explanation from the actual master just adds a whole new dimension. The examples of taking us from the beginning, the first iteration, walking us through, it's an excellent way to understand why you got from one to the other that's why I call it a masterclass.
Benoit Blanchon: This is a true story. This is my journey as a junior programmer trying to save things or transmit things, and step-by-step discovering the technique until I realize, "Okay. Maybe JSON is the solution for most people."
Peter Dalmaris: Yeah. That was the actual thinking process, right? That's how your brain worked to take you eventually to ArduinoJSON. Is that right?
Benoit Blanchon: I think so, yeah.
Peter Dalmaris: So, what I want to ask you is to try and decompose a little bit your presentation - distil it, I should say - that's a better word - to distil it to the keys or to the key learning items that are the most important. Because, again, for most of us, I think at this moment we are blown away from everything that we've learned. And it's hard to make sense of all of it. So, can you distil it, like, maybe to three or four points.
Benoit Blanchon: Yeah. I knew this was going to be a little intense, maybe a little too much, so I prepared a chart to compare the various solutions that we saw today.
Peter Dalmaris: Let's check it out.
Benoit Blanchon: So, I prepared this little chart that summarizes what we saw today with the various formats. So, first, we started with the custom binary format, and we saw that the payload was really small, that this format required absolutely no work from the CPU. We also saw that this was not flexible. And as soon as we wanted to add fields or versions, it became painful. And the pain began as soon as we wanted to add strings to the mix.
Benoit Blanchon: So, my recommendation is: use this technique for small projects, for flat data structures - I mean, avoid strings and nested stuff - and for isolated projects because it's hard to maintain the compatibility. But for the example I gave at the beginning, just two Arduinos communicating, this is a very good solution.
Benoit Blanchon: Then, we saw - maybe not in that order - JSON. We saw that the payload is significantly larger. Of course, the work from the CPU is significantly bigger too. But this format is very flexible. Not as flexible as XML, of course, but plenty enough for projects. And we saw that this was the easiest solution. As soon as you use ArduinoJSON, the code was really simple. So, I recommend using JSON for any complex structure or anything where you want to have backward compatibility, especially for APIs.
Benoit Blanchon: We talked about XML. We saw that the payload was extremely fat. The CPU had a lot of work to do to deserialize that. XML is very flexible, and we only scratched the surface because, compared to JSON, you can add another layer of flexibility, that is, the attributes on the tags. But we also saw that this solution was very painful.
Benoit Blanchon: Maybe I wasn't very fair with XML. The library I used is probably not the easiest to use. But, nevertheless, XML is [inaudible] and too fat to be used on embedded projects. My recommendation is: keep XML to store textual documents like web pages or books, and don't use XML for transmission.
Benoit Blanchon: We talked about Protocol Buffers. The payload is minimal because the format compresses even the integers, not only the strings. The work from the CPU is, of course, a little higher than what we do with our custom binary format. The flexibility is amazing.
Benoit Blanchon: But I think it's quite difficult to use, especially because (1) you lose the human-readable aspect. I know there are tools to decode it into text, but it's still more complicated than JSON. And (2) the build process, I think, is too complicated for most [inaudible] projects. So, my recommendation for Protocol Buffers is: use it for complex messages when performance matters. I mean, really matters. For example, RPC calls where you want to have a very low latency.
Benoit Blanchon: And, finally, we have the Binary JSON, like MessagePack. The payload is a little lower than JSON. The CPU work is a little below JSON. It's as flexible as JSON because it's roughly the same thing. And we saw that the pain from the programmer side is all right because you can use ArduinoJSON; ArduinoJSON supports MessagePack. But you lose the human-readable aspect. So, it's harder to develop.
Benoit Blanchon: And as I told you, I don't consider MessagePack and the other Binary JSON formats an interesting compromise for the Arduino platform. So, I don't recommend using them.
Benoit Blanchon: I prepared a little chart because it's always easier when you see a drawing. So, it's a chart with one axis, the flexibility, from very rigid to very agile. And the other axis is the pain, from super happy developer to "Oh, my god. What is that?" And this is the chart. And the bubbles represent the performance.
Benoit Blanchon: As you can see, the binary format and Protocol Buffers represent the best performance, but they are both on the edges. And what I want you to remember of this presentation is that JSON, right here, represents the best compromise between flexibility, ease of use, and performance. That's all.
Peter Dalmaris: Right. Wow. Just that part of your presentation could be its own presentation, I think, that comparison of the different data interchange formats. I'm saying that because, like in the Internet of Things, data interchange is core, no matter what application you build, somehow it will have to involve exchanging data. So, these are some of the best options around. Thank you. This is amazing.
Peter Dalmaris: I've got another one while you are in the screen sharing, my next question just before we wrap it up is about resources. So, at this point, our viewers will probably want to learn more, what are some places or resources that you can recommend for learning some more about JSON, your library, or everything that you talked about, even the programming aspect of it.
Benoit Blanchon: Yeah. Well, if you want to learn more about ArduinoJSON, of course, you can go to arduinojson.org. We saw the assistant today, but there is also the API documentation, the examples, and the frequently asked questions.
Benoit Blanchon: If you want to learn more or support the project, you can purchase my book, Mastering ArduinoJSON. As we said, the book starts with a quick C++ refresher where I talk about pointers, variable allocation, stack, heap, fragmentation, stuff like that. And it also contains a tutorial on serialization and deserialization. And it's really different from what we saw today.
Benoit Blanchon: And the book closes with several case studies where you see ArduinoJSON in situation. For example, there is a project that communicates with GitHub. Another is Weather Underground and stuff like that. And [inaudible] also. So, I think this is a very important part of the book because you can see how the library is intended to be used. It's much better to start from these examples than to copy-paste anything you can find on the web, because I can see some huge mistakes.
Peter Dalmaris: I think you updated the book recently, didn't you, to ArduinoJSON 6.
Benoit Blanchon: Yeah. This picture is from the original version. But the new one is called Mastering ArduinoJSON 6. I updated the new revision, so it's up to date. And if you purchase the e-book, you will obviously receive the updates. If you purchased the paperback version, it doesn't update.
Peter Dalmaris: Go e-book.
Benoit Blanchon: Yeah. There is a discount for the viewers of this summit, use the coupon code MMM19, and this will give you a discount.
Peter Dalmaris: There's a discount. We're going to have all that in the presentation page so that it's just a click away. All of these resources are a click away. Final question, Benoit, how can people get in touch with you if they want to ask questions or discuss anything with you?
Benoit Blanchon: Okay. If you have any questions about ArduinoJSON, the best is to open an issue on GitHub. If you don't remember that, you will see contact links on arduinojson.org, and you'll see the various ways to contact me. Most of the time, it's just through GitHub.
Benoit Blanchon: I have another blog that is called C++ for Arduino, I talked about this in the presentation. This has not been updated in a while, but I promise it will come back soon. Also, I have two YouTube channels, people can probably find me easily. And, also, my personal blog that is not about Arduino, that is anything technical. This blog is called Good Code Smell, and the address is blog.benoitblanchon.fr. That's it.
Peter Dalmaris: Great. All right. So, we are going to have all this information in your presentation page. Benoit, thank you for sharing all that. You've got some really nice resources up there. And your blog articles about the Arduino and programming are really clearly [inaudible]. I think they are like little tutorials in themselves, so you can learn a lot by reading a page or two of your blog posts. So, I highly recommend that people check them out.
Peter Dalmaris: So, thank you so much for your presentation. It's been really amazing. Thank you for putting the time and effort to produce it.
Benoit Blanchon: Thank you for inviting me.