In the second part of this two-part blog, Principal Analyst Alex Caithness follows up his earlier look at Local Storage and Session Storage in Chrome with a contrasting examination of how these mechanisms work in Mozilla Firefox.
In part one of this two-part series we introduced the concepts behind the Web Storage API and how web developers use it to store data related to a website or web application on a user’s machine. We then went on to look at how Mozilla stores Local Storage data in an SQLite database (with some caveats about encoding and compression of that data).
In this part we’re going to tackle the more “ephemeral” Session Storage data (and find out how ephemeral it actually is), but this task will require us to first learn a bit more about how Firefox handles storing and restoring a browsing session.
The “sessionstore.jsonlz4” file in each Mozilla profile folder is involved in maintaining the current state of the browser: what windows are open, what tabs are open in each window, what the back-forwards list for each of those tabs is, scroll positions, cookies, and yes: the Session Storage which is isolated within each of those tabs. If you are using Firefox right now to view this post and you go looking for this file, you will likely not find it; that’s because it usually only gets created when the browser shuts down (and gets removed when the browser starts up). There will be, however, backups of this file that you can access – but we’ll get to those in a little while.
This file contains JSON data that has been compressed using the LZ4 compression algorithm. Mozilla has its own framing format for LZ4 which comprises a header “6D 6F 7A 4C 7A 34 30 00” (or “mozLz40\0” in ASCII), followed by the length of the decompressed data as a little-endian 32-bit integer. The compressed data follows; in this case, once decompressed, it is UTF-8 encoded text containing JSON data. Our data-exploration tool RabbitHole can handily deal with both the decompression and the decoding and display of the JSON data.
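To make the framing concrete, here is a minimal Python sketch of reading such a file by hand. It assumes that the payload after the 12-byte header is a single raw LZ4 block, and it uses a small pure-Python block decoder so that nothing outside the standard library is needed. The function names are our own for illustration, and the decoder is not hardened against malformed input.

```python
import json
import struct

MOZLZ4_MAGIC = b"mozLz40\0"  # 6D 6F 7A 4C 7A 34 30 00


def lz4_block_decompress(src: bytes, expected_size: int) -> bytes:
    """Decode one raw LZ4 block (no LZ4 frame header)."""
    out = bytearray()
    i = 0
    while i < len(src):
        token = src[i]
        i += 1
        # High nibble: literal length, extended by 255-valued bytes.
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                extra = src[i]
                i += 1
                lit_len += extra
                if extra != 255:
                    break
        out += src[i:i + lit_len]
        i += lit_len
        if i >= len(src):
            break  # the final sequence carries literals only
        # Two-byte little-endian offset back into the output so far.
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        if offset == 0 or offset > len(out):
            raise ValueError("invalid match offset")
        # Low nibble: match length minus 4, extended the same way.
        match_len = token & 0x0F
        if match_len == 15:
            while True:
                extra = src[i]
                i += 1
                match_len += extra
                if extra != 255:
                    break
        match_len += 4
        start = len(out) - offset
        # Copy byte by byte because the match may overlap its own output.
        for k in range(match_len):
            out.append(out[start + k])
    if len(out) != expected_size:
        raise ValueError("decompressed size does not match header")
    return bytes(out)


def read_mozlz4(data: bytes):
    """Check the header, decompress, then decode UTF-8 and parse JSON."""
    if data[:8] != MOZLZ4_MAGIC:
        raise ValueError("missing mozLz40 header")
    (size,) = struct.unpack_from("<I", data, 8)
    raw = lz4_block_decompress(data[12:], size)
    return json.loads(raw.decode("utf-8"))
```

Passing the bytes of a session store file to `read_mozlz4` should return the root JSON object as a Python dictionary. A pure-Python decoder like this will be slow on large files, so a compiled LZ4 implementation is the better choice for real casework.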
At a high level, the structure of the JSON data in the decoded object comprises a root object which contains a “windows” property, the value of which is a list of objects, each representing the state of a window. Each of those objects contains a “tabs” property, the value of which is a list of objects, each representing a tab open in that window. These tab objects have a number of useful properties (key amongst them: the “entries” property, which contains the back-forwards list of pages that have been visited in that tab; although it’s not our focus in this article, this is all useful stuff!). For our purposes, it’s the “storage” property where we find the session storage data for each tab.
The value of the “storage” property is an object, the properties of which are named after the sites that have stored session storage data in that tab, the values of which are objects with properties for each key stored.
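As a rough illustration of that layout, the following Python sketch walks an already-decoded session object and yields every session storage item held by an open tab. The function name and the sample data are invented for the example; only the “windows”, “tabs” and “storage” property names come from the structure described above.

```python
def iter_open_tab_storage(session: dict):
    """Yield (window index, tab index, site, key, value) for open tabs."""
    for w_idx, window in enumerate(session.get("windows", [])):
        for t_idx, tab in enumerate(window.get("tabs", [])):
            # "storage" may be absent if the tab never stored anything.
            storage = tab.get("storage") or {}
            for site, items in storage.items():
                for key, value in items.items():
                    yield w_idx, t_idx, site, key, value


# Hypothetical, hand-built data in the shape described above.
sample_session = {
    "windows": [
        {
            "tabs": [
                {
                    "entries": [{"url": "https://example.com/"}],
                    "storage": {"https://example.com": {"basket": "3"}},
                },
                {"entries": [{"url": "about:blank"}]},
            ]
        }
    ]
}

for record in iter_open_tab_storage(sample_session):
    print(record)
```

Flattening the nested objects into one record per stored key keeps the window and tab indexes alongside each value, which makes it easier to tie an item back to the tab's “entries” list later.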
So, that’s relatively straightforward… but it isn’t the end of the story! Session storage is isolated to a single tab in the browser, and when that tab is closed, the data should go away, right? Well, that’s the theory, but let’s imagine a scenario where you accidentally close a tab: annoying, it’s OK though, because there’s a feature where you can re-open a closed tab for a while after it was closed. If that re-opened tab is to continue operating in the way it was before closing, it needs to maintain any session storage data that it previously had.
Enter the “_closedTabs” property! As well as the tab objects in the “tabs” property, this file maintains a list of closed tab objects for each window. These items can stick around for quite some time – including surviving the browser being shut down altogether. The objects in this list almost mirror the structure of those in the main “tabs” property, although to get to most of the data we had seen previously we first need to enter the “state” property; once there though we’ll find the “storage” key and more session storage data.
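The closed-tab variant can be sketched in Python in much the same way, with the extra hop through the “state” property. As before, the function name and sample data are made up for illustration, and the sketch assumes the session object has already been decompressed and parsed into a dictionary.

```python
def iter_closed_tab_storage(session: dict):
    """Yield (window index, closed-tab index, site, key, value)."""
    for w_idx, window in enumerate(session.get("windows", [])):
        for c_idx, closed in enumerate(window.get("_closedTabs", [])):
            # Closed tabs keep their tab data one level down, in "state".
            state = closed.get("state") or {}
            storage = state.get("storage") or {}
            for site, items in storage.items():
                for key, value in items.items():
                    yield w_idx, c_idx, site, key, value


# Hypothetical, hand-built data in the shape described above.
sample_closed = {
    "windows": [
        {
            "tabs": [],
            "_closedTabs": [
                {
                    "state": {
                        "entries": [{"url": "https://example.org/"}],
                        "storage": {"https://example.org": {"token": "abc"}},
                    }
                }
            ],
        }
    ]
}

for record in iter_closed_tab_storage(sample_closed):
    print(record)
```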
So, that’s it, right? Live tabs, closed tabs: that’s everything? Well, actually, no, not quite.
In the profile folder you will also find a folder named “sessionstore-backups”. Exactly what files you’ll find in there depends on the state of the browser, but all of the files found here have the same structure as the “sessionstore.jsonlz4” file we have already explored and can all contain previous session storage data.
If the browser has ever been used, you should expect to see a file called “previous.jsonlz4”, which should mirror the “sessionstore.jsonlz4” file that was last removed from the main profile folder when the browser was run. If the browser is currently running, or wasn’t closed down cleanly before this folder is examined, you should also see “recovery.jsonlz4” and “recovery.baklz4”, which conform to the current (or very recent) state of the browser. The files with names that begin “upgrade.jsonlz4-” are backups of the session store file made when Firefox is updated to a new version. In our testing, there were only ever three of these files, with the oldest being deleted when a fourth file appears, but the old files may be recoverable from the file system and, even if not, with the usual cadence of updates in Firefox you will often have a set of data which is around a month old. Remember that these files can all contain closed tab data from the point that they were “backed-up” too.
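Pulling those file names together, a small Python sketch along these lines can list every session store candidate present in a profile folder. The helper name is our own, and the sketch only locates files; each one would still need decompressing and parsing before the session storage can be read.

```python
from pathlib import Path


def find_session_files(profile_dir) -> list:
    """Return the session store files that exist under a profile folder."""
    profile = Path(profile_dir)
    backups = profile / "sessionstore-backups"
    candidates = [
        profile / "sessionstore.jsonlz4",
        backups / "previous.jsonlz4",
        backups / "recovery.jsonlz4",
        backups / "recovery.baklz4",
    ]
    found = [path for path in candidates if path.is_file()]
    if backups.is_dir():
        # Upgrade backups carry a suffix after the fixed name prefix.
        found.extend(sorted(backups.glob("upgrade.jsonlz4-*")))
    return found
```

Which of these files exist depends on whether the browser was running, shut down cleanly, or recently updated, so an empty or partial result is not in itself a sign that anything is missing.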
It turns out that session storage in Mozilla isn’t as ephemeral as might be expected, although, as with all good things, it does require a little digging. Luckily, our data exploration tool RabbitHole supports all of the various data types, encodings and compression algorithms involved in decoding this data. If you’re Pythonically inclined, our open source library can help you extract this data programmatically and our fantastic, free tool Mister Skinnylegs can already extract many interesting artefacts from this data source.