Introduction to PHP’s Serialize Function

I was tinkering with a WordPress database the other day and was nosing around through some tables, looking for some statistical data which I never found, and noticed that a few of the tables were storing data in an unusual format (to me at least). Specifically one record I found looked like this:

a:6:{s:5:"width";i:250;s:6:"height";i:323;s:14:"hwstring_small";s:22:"height='96' width='74'";s:4:"file";s:72:"/uploads/2008/04/wireframes7.jpg";s:5:"sizes";a:2:{s:9:"thumbnail";a:3:{s:4:"file";s:23:"wireframes7-150x150.jpg";s:5:"width";i:150;s:6:"height";i:150;}s:6:"medium";a:3:{s:4:"file";s:23:"wireframes7-232x300.jpg";s:5:"width";i:232;s:6:"height";i:300;}}s:10:"image_meta";a:10:{s:8:"aperture";i:0;s:6:"credit";s:0:"";s:6:"camera";s:0:"";s:7:"caption";s:0:"";s:17:"created_timestamp";i:0;s:9:"copyright";s:0:"";s:12:"focal_length";i:0;s:3:"iso";i:0;s:13:"shutter_speed";i:0;s:5:"title";s:0:"";}}

Whoa. Looked like some sort of delimited data (it is) but not in a form I recognized. Now over the years I thought I had seen my share of input that is stored in a database. Like most PHP developers I didn’t learn PHP through a class, I learned it by reading and trial and error, but I don’t remember seeing something like this before. Now some of you already know what kind of data this is while others are probably scratching your head and wondering what kind of strange delimited data this is. Isn’t it obvious, that’s an array.

Serialization

cerealAccording to Wikipedia, serialization “is the process of converting an object into a sequence of bits so that it can be persisted on a storage medium (such as a file, or a memory buffer) or transmitted across a network connection link. When the resulting series of bits is reread according to the serialization format, it can be used to create a semantically identical clone of the original object.

Now that I’ve gotten the obligatory definition out of the way, let’s humanize exactly what we’re talking about with a real world example.

Let’s say that you have a feedback form on your website and you want to store all the feedback in a database so that you can analyze it later. Along with the feedback that visitors provide you’re also interested in storing the date and time of their visit, their IP address, and some other information. One way to do this would be to create a table where you had a column for each field in your form. Every record entered in the table would need to insert each field of your feedback form, and other data you wanted to store (e.g. IP address, time submitted), into the table with a long SQL statement. Later if you needed to change your form, say add or remove a field, then you would not only need to change your table schema, but you’d also have to alter your SQL statement. Managing the feedback data this way would also result in the database table growing quite large over time. Since you can’t insert an object or array into a MySQL database (you’ll see an error like Unknown column ‘Array’ in ‘field list’ ) it would be nice if there was another way to easily store all the values we receive, especially if the number of fields could change. So what’s the alternative? PHP’s serialize() function.

PHP’s serialize function was designed for just this sort of job. What it does is basically flatten an array or object into a string (according to php.net, it “Generates a storable representation of a value”). Using our feedback form example, we could pass the $_POST array through the serialize function, and end up with a string which we can store in our database. When the serialize function is done doing its magic with whatever you throw at it, the output is a whole bunch of curly braces and semicolons which looks similar to a:1:{i:0;s:7:"slugdiv";} and the example in the first paragraph. (Note that whenever you enter user submitted data its a good idea to run it through the PHP mysql_real_escape_string function first.) So serializing an array converts it from the array to a string:

serialize

All the array keys and values are preserved by the serialize function so that they can be reconstituted later. If we use serialize() to store our feedback form data, we can eliminate all the fields in our table and just keep the one field for our serialized string. That’s a big space saver. Plus, by storing just the one serialized string in our database we can greatly simplify our SQL statements and no matter how many times we change our form, all the array contents will neatly fit in our table without it or the SQL statements needing to be modified.

Right about now you might be asking, well that’s pretty cool but how do I get my data back out of the database when I need it? No need to fear, just use unserialize() to reverse the process. If you assign the output of unserialize() to a variable then that variable will be a clone of what the original input was. Unserialize will keep all the keys and values of our feedback form’s $_POST array so that they’re instantly available just as they were when they were first created. For example, if you retrieved a serialized row from our table, we could run it through the unserialize function and assign the output to a variable: $reconstituted = unserialize($row['feedback']);.

Sounds great, so what’s the down side? Well by storing our feedback form data in serialized form in the database we lose the ability to use SQL to manipulate the data. So if we wanted to run a query that returned only the email addresses for all of our feedback, then we’d be out of luck. We’d have to use PHP to pull all the data out of the table, unserialize it, and then search through it using PHP instead of MySQL. If our data was stored in unserialized form, where each column corresponded to a field in our feedback form, then searching through and manipulating the data would be far faster and easier with MySQL. So knowing when to use serialization is important. Storing user preferences, or say attributes for an image (like WordPress does), is a perfect use for serialization. If you’ll have the need to search, update or delete parts of serialized data, or otherwise manipulate parts of the data you’re storing, then serialization may not be the way to go. However if you simply need to store arrays or objects in a database so that they can be pulled out in their entirety later, it may be the perfect solution.

5 thoughts on “Introduction to PHP’s Serialize Function”

  1. Well serialize() doesn’t obviate the need for a table; you’ll need to store the comment form data in a table or a file. The question is what format is better to store your data in? Should you store the comment form data in a table using serialize() or using separate fields, or should you store the data in a flat text file using serialize() or a delimiter?
    If your comment form has a fixed set of fields that probably won’t change over time, then I’d use defined fields (table or flat file) instead of serializing. If the data you’re collecting will change over time, then serialize may be a better option.

  2. One other thing I’m experiencing – although very infrequenty – is that serialize() creates a wrong string with some $_POST data. I’m not 100% sure about the validity of this issue yet; however there is a hint about its existence so I wanted to note it.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>