

VirtualBox Live State File Format
source link: https://parsiya.net/blog/2018-01-29-virtualbox-live-state-file-format/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.

Jan 29, 2018 - 13 minute read - Comments - Forensics
VirtualBox Live State File Format
In Mounting Live Snapshots of Encrypted VMs in VirtualBox we mounted a live snapshot and logged into the machine. We also got a sneak peek of what kind of information we can extract from the live snapshot (sav
file).
In this post I will talk about parts of the live state file format and show some data that can be extracted from these files. The format is not formally documented but we have access to commented source.
To make it easier, I have uploaded the live state from our previous MysteryVM separately. You can download it from Google Drive (45 MB).
Dive into sav
According to this thread:
A .sav file is created using the SSM (saved state manager) code. You find the API in include/VBox/vmm/ssm.h and src/VBox/VMM/VMMR3/SSM.cpp.
These are the relevant files:
Sav File Header - SSMFILHDR
First 64 bytes are the file header.
00000000 7f 56 69 72 74 75 61 6c 42 6f 78 20 53 61 76 65 |.VirtualBox Save|
00000010 64 53 74 61 74 65 20 56 32 2e 30 0a 00 00 00 00 |dState V2.0.....|
00000020 05 00 01 00 1c 00 00 00 d0 cc 01 00 40 08 08 00 |........ÐÌ..@...|
00000030 2a 00 00 00 01 00 00 00 00 10 00 00 08 df 3b 9e |*............ß;.|
Inside SSM.cpp search for struct SSMFILEHDR
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
typedef struct SSMFILEHDR
{
/** Magic string which identifies this file as a version of VBox saved state
* file format (SSMFILEHDR_MAGIC_V2_0). */
char szMagic[32];
/** The major version number. */
uint16_t u16VerMajor;
/** The minor version number. */
uint16_t u16VerMinor;
/** The build number. */
uint32_t u32VerBuild;
/** The SVN revision. */
uint32_t u32SvnRev;
/** 32 or 64 depending on the host. */
uint8_t cHostBits;
/** The size of RTGCPHYS. */
uint8_t cbGCPhys;
/** The size of RTGCPTR. */
uint8_t cbGCPtr;
/** Reserved header space - must be zero. */
uint8_t u8Reserved;
/** The number of units that (may) have stored data in the file. */
uint32_t cUnits;
/** Flags, see SSMFILEHDR_FLAGS_XXX. */
uint32_t fFlags;
/** The maximum size of decompressed data. */
uint32_t cbMaxDecompr;
/** The checksum of this header.
* This field is set to zero when calculating the checksum. */
uint32_t u32CRC;
} SSMFILEHDR;
First 32 bytes contain the magic string
.
1
2
/** Saved state file v2.0 magic. */
#define SSMFILEHDR_MAGIC_V2_0 "\177VirtualBox SavedState V2.0\n\0\0\0"
\177
is octal (base 8) or 0x7F
.
00000020 05 00 01 00 1c 00 00 00 d0 cc 01 00 40 08 08 00 |........ÐÌ..@...|
Then the rest (in little-endian):
05 00
: Major version number ==5
.10 00
: Minor version number ==1
.1c 00 00 00
: Build number ==28
.d0 cc 01 00
: SVN revision ==117968
.40
: Host bits ==64
(32 bit or 64 bit host).08
: Size of RTGCPHYS (Guest physical memory address) ==08
.08
: Size of RTGCPTR (Guest context pointer) ==08
. I have no idea what this does.00
: Reserved header space - must be zero.
Thanks to the reserved header space, these fields start at a new line:
00000030 2a 00 00 00 01 00 00 00 00 10 00 00 08 df 3b 9e |*............ß;.|
2a 00 00 00
: Number of data units in the file ==42
.01 00 00 00
: Flags, discussed below.01
== it's a live save state file.00 10 00 00
: Maximum size of decompressed data ==4096
.08 df 3b 9e
: CRC-32 checksum of the header. Set to zero when calculating the checksum.
You can use the following code snippet to calculate CRC-32 in Go (remember to zero-out the last four bytes first):
Calculate CRC-32 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
package main
import (
"encoding/hex"
"fmt"
"hash/crc32"
)
var str1 = "7F5669727475616C426F7820536176656453746174652056322E300A00000000
050001001C000000D0CC0100400808002A000000010000000010000000000000"
func main() {
bytes1, err := hex.DecodeString(str1)
if err != nil {
panic(err)
}
fmt.Printf("%x", crc32.ChecksumIEEE(bytes1))
}
// Returns: 9e3bdf08
Try it at play.golang.org.
For some reason CyberChef does not show the correct CRC-32 checksum while other online calculators do. Some info might be lost when convert hex bytes to chars and calculating the checksum.
Flags
In short, flag is 01
if it's a live save state (which was in our case) and 00
if the stream is checksummed up to the footer (not sure what this flag value is used for).
Flags are defined as follows:
SSMFILEHDR::fFlags - SSM.cpp/** @name SSMFILEHDR::fFlags
* @{ */
/** The stream is checksummed up to the footer using CRC-32. */
#define SSMFILEHDR_FLAGS_STREAM_CRC32 RT_BIT_32(0)
/** Indicates that the file was produced by a live save. */
#define SSMFILEHDR_FLAGS_STREAM_LIVE_SAVE RT_BIT_32(1)
/** @} */
Saved State Units
After the header we can see a number of saved state units that conveniently start with the text Unit
.
Search for typedef struct SSMUNIT
in SSMInternal.h for more info.
We're going to look at the SSM unit, but all units have the same structure.
SSM Data Unit
First data unit after the file header is the "Saved State" unit or SSM.
SSM Unit Header - SSMFILEUNITHDRV2
Each unit has its own header that is defined as follows:
SSMFILEUNITHDRV2 - SSM.cpp 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
typedef struct SSMFILEUNITHDRV2
{
/** Magic (SSMFILEUNITHDR_MAGIC or SSMFILEUNITHDR_END). */
char szMagic[8];
/** The offset in the saved state stream of the start of this unit.
* This is mainly intended for sanity checking. */
uint64_t offStream;
/** The CRC-in-progress value this unit starts at. */
uint32_t u32CurStreamCRC;
/** The checksum of this structure, including the whole name.
* Calculated with this field set to zero. */
uint32_t u32CRC;
/** Data version. */
uint32_t u32Version;
/** Instance number. */
uint32_t u32Instance;
/** Data pass number. */
uint32_t u32Pass;
/** Flags reserved for future extensions. Must be zero. */
uint32_t fFlags;
/** Size of the data unit name including the terminator. (bytes) */
uint32_t cbName;
/** Data unit name, variable size. */
char szName[SSM_MAX_NAME_SIZE];
} SSMFILEUNITHDRV2;
THE SSM unit in our file is:
00000040 0a 55 6e 69 74 0a 00 00 40 00 00 00 00 00 00 00 |.Unit...@.......|
00000050 91 d4 5f f6 13 12 1d eb 01 00 00 00 00 00 00 00 |.Ô_ö...ë........|
00000060 ff ff ff ff 00 00 00 00 04 00 00 00 53 53 4d 00 |ÿÿÿÿ........SSM.|
00000070 92 39 0a 00 00 00 42 75 69 6c 64 20 54 79 70 65 |.9....Build Type|
00000080 07 00 00 00 72 65 6c 65 61 73 65 07 00 00 00 48 |....release....H|
00000090 6f 73 74 20 4f 53 09 00 00 00 77 69 6e 2e 61 6d |ost OS....win.am|
000000a0 64 36 34 00 00 00 00 00 00 00 00 |d64........|
First 8 bytes contain the data unit magic header. It can have two values:
Magics - SSM.cpp/** Data unit magic. */
#define SSMFILEUNITHDR_MAGIC "\nUnit\n\0"
/** Data end marker magic. */
#define SSMFILEUNITHDR_END "\nTheEnd"
TheEnd
appears in the the block after all data units.
In this data unit we have 0a 55 6e 69 74 0a 00 00
or \nUnit\n\0
.
Starting from offset 0x40
we have:
40 00 00 00
: Offset of this unit in the stream. For this unit it's0x40
. In other words, if we go this offset in the sav file, we will see this data unit.00 00 00 00
: "The CRC-in-progress value this unit starts at." No idea what this is. It could be the default value of the CRC bytes when we are creating the CRC checksum (see above). It has been zero for this data unit in all live states I have seen.91 d4 5f f6
: Checksum of the data unit with the checksum bytes set to zero.13 12 1d eb
: Data version ==eb 1d 12 13
. (?)01 00 00 00
: Instance number ==01
. (?)00 00 00 00
: Data pass number ==00
. (?)ff ff ff ff
: These 4 bytes do not appear in the spec. Could be alignment for the data unit header? These appear in every live state.00 00 00 00
: Flags reserved for future extensions. Must be zero.04 00 00 00
: Size of null-terminated data unit name in bytes,4
in this data unit. In other words, read this many bytes to get the data unit name with the null terminator.53 53 4d 00
: Data unit name and null terminator,SMM
.
SSM Unit Data
Going back up the SSM.cpp file we can see the specifications for the data in this unit:
00000070 92 39 0a 00 00 00 42 75 69 6c 64 20 54 79 70 65 |.9....Build Type|
00000080 07 00 00 00 72 65 6c 65 61 73 65 07 00 00 00 48 |....release....H|
00000090 6f 73 74 20 4f 53 09 00 00 00 77 69 6e 2e 61 6d |ost OS....win.am|
000000a0 64 36 34 00 00 00 00 00 00 00 00 |d64........|
The first byte in the record header indicates the type and flags.
We have 0x92
or 1001 0010
:
- bits 0..3 - Record type
0010
: type = Raw data record. - bit 4:
1
if important and0
if it can be skipped. This is an important data unit. - bits 5, 6: Must be
0
. - bit 7: Always
1
.
Record header byte 2 (optionally thru 7) is the size of the following data encoded in UTF-8 style.
We have 0x39
or 57
decimal. So we will read 57 bytes.
This part is easy. Read a little-endian uint32, that's field length. Then read that many bytes to get the field. This is a typical Pascal style string (as seen in ASN.1 format) which do not have null/string terminators.
0a 00 00 00 - 42 75 69 6c 64 20 54 79 70 65
: 10 bytes -Build Type
.07 00 00 00 - 72 65 6c 65 61 73 65
: 7 bytes -release
.07 00 00 00 - 48 6f 73 74 20 4f 53
: 7 bytes -Host OS
.09 00 00 00 - 77 69 6e 2e 61 6d 64 36 34
: 9 bytes -win.amd64
.00 00 00 00 - 00 00 00 00
: 0 bytes - empty (?).
End Data Unit
This is the last data unit. It starts with the magic header SSMFILEUNITHDR_END
:
1
2
/** Data end marker magic. */
#define SSMFILEUNITHDR_END "\nTheEnd"
We can see it at offset 0x4C7A712
:

00000000 0a 54 68 65 45 6e 64 00 12 a7 c7 04 00 00 00 00 |.TheEnd..§Ç.....|
00000010 94 e5 7d 35 47 2b 45 32 00 00 00 00 00 00 00 00 |.å}5G+E2........|
00000020 ff ff ff ff 00 00 00 00 00 00 00 00 |ÿÿÿÿ........|
End data unit does not have any data, it only has a header and does not have a name. Using the SSMFILEUNITHDRV2
struct we can dissect it:
0a 54 68 65 45 6e 64 00
: Header"\nTheEnd"
.12 a7 c7 04
: Offset of this unit ==04 c7 a7 12
.00 00 00 00
: CRC-in-progress.94 e5 7d 35
: Checksum of the data unit.47 2b 45 32
: Data version. (?)00 00 00 00
: Instance number.00 00 00 00
: Data pass number.ff ff ff ff
: Not in spec, but appears in every data unit.00 00 00 00
: Flags reserved for future extensions. Must be zero.00 00 00 00
: Size of data unit name in bytes. Zero here.
Directory
In our file, directory is just right after the end unit.

It has a header followed by some directory entries.
Directory Header - SSMFILEDIR
Inside SSM.cpp search for SSMFILEDIR
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
/**
* Directory for the data units from the final pass.
*
* This is used to speed up SSMR3Seek (it would have to decompress and parse the
* whole stream otherwise).
*/
typedef struct SSMFILEDIR
{
/** Magic string (SSMFILEDIR_MAGIC). */
char szMagic[8];
/** The CRC-32 for the whole directory.
* Calculated with this field set to zero. */
uint32_t u32CRC;
/** The number of directory entries. */
uint32_t cEntries;
/** The directory entries (variable size). */
SSMFILEDIRENTRY aEntries[1];
} SSMFILEDIR;
/** The directory magic. */
#define SSMFILEDIR_MAGIC "\nDir\n\0\0"
00000000 0a 44 69 72 0a 00 00 00 ef 86 a6 b5 22 00 00 00 |.Dir....ï.¦µ"...|
0a 44 69 72 0a 00 00 00
: Magic header.ef 86 a6 b5
: CRC-32 of the whole directory.22 00 00 00
: Number of directories0x22
==34
decimal.
Then we have 34 directories (each directory is 16 bytes) of type SSMFILEDIRENTRY
.
Directories - SSMFILEDIRENTRY
The directories have a simple structure.
SSMFILEDIRENTRY - SSM.cpp 1
2
3
4
5
6
7
8
9
10
11
12
/**
* Directory entry.
*/
typedef struct SSMFILEDIRENTRY
{
/** The offset of the data unit. */
uint64_t off;
/** The instance number. */
uint32_t u32Instance;
/** The CRC-32 of the name excluding the terminator. (lazy bird) */
uint32_t u32NameCRC;
} SSMFILEDIRENTRY;
We will look at the first one:
00000010 40 00 00 00 00 00 00 00 00 00 00 00 41 7a 40 08 |@...........Az@.|
40 00 00 00 00 00 00 00
: Data unit offset. We have already seen this data unit. This is the SSM unit.00 00 00 00
: Instance number ==0x00
.41 7a 40 08
: CRC-32 of the name excluding the terminator.
File Footer - SSMFILEFTR
Finally we have the file footer.
SSMFILEFTR_MAGIC and SSMFILEFTR - SSM.cpp 1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
/** Saved state file v2.0 magic. */
#define SSMFILEFTR_MAGIC "\nFooter"
/**
* Footer structure
*/
typedef struct SSMFILEFTR
{
/** Magic string (SSMFILEFTR_MAGIC). */
char szMagic[8];
/** The offset of this record in the stream. */
uint64_t offStream;
/** The CRC for the stream.
* This is set to zero if SSMFILEHDR_FLAGS_STREAM_CRC32 is clear. */
uint32_t u32StreamCRC;
/** Number directory entries. */
uint32_t cDirEntries;
/** Reserved footer space - must be zero. */
uint32_t u32Reserved;
/** The CRC-32 for this structure.
* Calculated with this field set to zero. */
uint32_t u32CRC;
} SSMFILEFTR;
Which is:
00000000 0a 46 6f 6f 74 65 72 00 6e a9 c7 04 00 00 00 00 |.Footer.n©Ç.....|
00000010 c7 2f 92 29 22 00 00 00 00 00 00 00 bc d2 96 8d |Ç/.)".......¼Ò..|
Based on the struct we can dissect it:
0a 46 6f 6f 74 65 72 00
:SSMFILEFTR_MAGIC
==\nFooter
.6e a9 c7 04
: Offset of this record in the stream ==04 c7 a9 6e
.00 00 00 00
: CRC-32 of the stream. It's set to zero.c7 2f 92 29
: This is not in the struct.22 00 00 00
: Number of directory entries ==34
decimal.00 00 00 00
: Reserved, must be zero.bc d2 96 8d
: CRC-32 for this structure. Set to zero during calculation.
There's one extra uint32
in the footer. I do not know what 29 92 2f c7
is.
Conclusion
We learned how to extract some information from VBox sav files with a practical example. Although the product is opensource, the format does not have public documentation.
Recommend
-
35
SQLite As An Application File Format Executive Summary An SQLite database file with a defined schema often makes an excellent application file format. Here are a dozen reasons why this is so:
-
12
Reverse Engineering Instruments’ File Format Have you ever wondered how applications store their data? Plenty of file formats like MP3 and JP...
-
17
Flash 6 File Format Spec Bug Reports Thursday, November 7, 2002 If you find any bugs or documentation errors in the Flash 6 File Format Specification,...
-
8
Macromedia Flash 6 File Format Specification is Available Friday, October 25, 2002 We have released the file format specification for Flash 6. This contains everything you need to know to create your own SWF file to pla...
-
8
Jan 23, 2018 - 11 minute read - Comments - Forensics
-
9
Reverse engineering the Notability file format I spend a fair amount of time drawing comics about programming. (I have a new zine called “profiling & tracing with perf”...
-
8
Virtualization Install FreeBSD 10.0 using memstick IMG file on Virtualbox ...
-
9
File format decision treeThis is a small decision tree to help with the best file format for each occasion. It is half serious and half tongue-in-cheek, and it minimizes the importance of SVG, but it's a good approach on how...
-
5
What is the best file format for web shortcuts Links primarily exist on the web, but they can also exist as files in your local file system. There are several formats for storing links as files that open in your web browser....
-
14
How To Convert VirtualBox Disk Image (VDI) and img to Qcow2 formatThis simple guide will show you the easiest way to convert VDI disk image to use on your Openstack Cloud environ...
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK