The files written to external media by rekordbox for use in player
hardware contain a wealth of information that can be used in place of
queries to the
remotedb server on the players, which is important
because they can be obtained from the players’ NFS servers, even if
there are four players in use sharing the same media. Under those
circumstances, remotedb queries are impossible. This document shares
what has been learned so far about the files, and how to interpret them.
The starting point for finding track metadata from a player is the database export file, which can be found within rekordbox media at the following path:
(If you are using the FileFetcher to request this file, use that path as the filePath argument, and use a mountPath value of /B/ if you want to read it from the SD slot, or /C/ to obtain it from the USB slot.)
The file is a relational database format designed to be efficiently used by very low power devices (there were deployments on 16 bit devices with 32K of RAM). Today you are most likely to encounter it within the Pioneer Professional DJ ecosystem, because it is the format that their rekordbox software uses to write USB and SD media which can be mounted in DJ controllers and used to play and mix music.
The file consists of a series of fixed size pages. The first page contains a file header which defines the page size and the locations of database tables of different types, by the index of their first page. The rest of the pages consist of the data pages for all of the tables identified in the header.
Each table is made up of a series of rows which may be spread across any number of pages. The pages start with a header describing the page and linking to the next page. The rest of the page is used as a heap: rows are scattered around it, and located using an index structure that builds backwards from the end of the page. Each row of a given type has a fixed size structure which links to any variable-sized strings by their offsets within the page.
As changes are made to the table, some records may become unused, and there may be gaps within the heap that are too small to be used by other data. There is a bit map in the row index that identifies which rows are actually present. Rows that are not present must be ignored: they do not contain valid (or even necessarily well-formed) data.
Unless otherwise stated, all multi-byte numbers in the file are stored in little-endian byte order. Field names used in the byte field diagrams match the IDs assigned to them in the Kaitai Struct specification, unless that is too long to fit, in which case a subscripted abbreviation is used, and the text will mention the actual struct field name.
The first page begins with the file header, shown below. The header
starts with four zero bytes, followed by a four-byte integer,
len_page at byte
04, that establishes the size of each page
(including this first one), in bytes. This is followed by another
four-byte integer, num_tables at byte
08, which reports the
number of different tables that are present in the file. Each table
will have a table pointer entry in the “Table pointers” section of the
file header, described below, that identifies and locates the table.
The four-byte integer nextu at byte
0c has an unknown purpose,
but Mr. Flesniak named it
next_unused_page and said “Not used as any
empty_candidate, points past the end of the file.” The four-byte
integer sequence, at byte
14, was described as “Always incremented by
at least one, sometimes by two or three,” and I assume this means it
reflects a version number that rekordbox updates when synchronizing
to the exported media.
Finally, there is another series of four zero bytes, and then the
header ends with the list of table pointers, which begins immediately
after them. There are as many of these as specified by num_tables, and each has
the following structure:
Each Table Pointer is a series of four four-byte integers. The first,
type, identifies the type of table being defined. The known table
types are shown below. The second value, at byte
04 of the table
pointer, was called empty_candidate by Mr. Lesniak. It may link to
a chain of empty pages if the database is ever garbage collected, but
this is speculation on my part.
Track metadata: title, artist, genre, artwork ID, playing time, etc.
Musical genres, for reference by tracks and searching.
Artists, for reference by tracks and searching.
Albums, for reference by tracks and searching.
Music labels, for reference by tracks and searching.
Musical keys, for reference by tracks, searching, and key matching.
Color labels, for reference by tracks and searching.
Holds the hierarchical tree structure of playlists and folders grouping them.
Links tracks to playlists, in the right order.
File paths of album artwork images.
Details not yet confirmed.
Holds the list of history playlists in the History menu.
Links tracks to history playlists entries, in the right order.
Data used by rekordbox to synchronize history playlists (not yet studied).
Other than the type, the two important values are first_page at byte
08 and last_page at byte
0c. These tell us how to find
the table. They are page indices, where the page containing the file
header has index 0, the page with index 1 begins at byte len_page,
and so on. In other words, the first page of the table identified by
the current table pointer can be found within the file starting at the
byte len_page × first_page.
The table is a linked list of pages: each page contains the index of the next page after it. However, you need to keep track of the last_page value for the table, because it tells you not to try to follow the next page link once you reach the page with that index. (If you do keep going, you will start reading pages of some different table.) The structure of the table pages themselves is described in the next section.
As far as we know, the remainder of the first page after the table pointers is unused.
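The header layout described above can be sketched in Python as follows. This is only an illustration, not Crate Digger's implementation: the names parse_file_header, TablePointer and page_offset are invented for this example, and the starting offset of the table pointers (1c) is an assumption derived from the text, namely that they directly follow the four-byte sequence value at byte 14 and the four zero bytes after it.

```python
import struct
from typing import NamedTuple


class TablePointer(NamedTuple):
    """One sixteen-byte table pointer entry from the file header."""
    type: int
    empty_candidate: int
    first_page: int
    last_page: int


def parse_file_header(data: bytes):
    """Return (len_page, num_tables, table_pointers) read from the first page."""
    # Bytes 00-03 are zero; len_page is at byte 04 and num_tables at byte 08.
    len_page, num_tables = struct.unpack_from("<II", data, 0x04)
    # ASSUMPTION: the pointers start at 1c, right after the sequence value
    # at byte 14 and the four zero bytes that follow it.
    pointers_start = 0x1C
    pointers = [
        TablePointer(*struct.unpack_from("<IIII", data, pointers_start + 16 * i))
        for i in range(num_tables)
    ]
    return len_page, num_tables, pointers


def page_offset(len_page: int, page_index: int) -> int:
    """Byte offset within the file of the page with the given index."""
    return len_page * page_index
```

With these helpers, the first page of a table would be found at page_offset(len_page, pointer.first_page).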
The table header is followed by the table pages themselves. These each have the size specified by len_page in the above diagram, and the following structure:
Data pages all seem to have the header structure described here, but not all of them actually store data. Some of them are “strange” and we have not yet figured out why. The discussion below describes how to recognize a strange page, and avoid trying to read it as a data page.
The first four bytes of a table page always seem to be zero. This is followed by a four-byte value page_index which identifies the index of this page within the list of table pages (the header has index 0, the first actual data page the index 1, and so on). This value seems to be redundant, because it can be calculated by dividing the offset of the start of the page by len_page, but perhaps it serves as a sanity check.
This is followed by another four-byte value, type, which identifies the type of the page, using the values shown in the preceding table. This again seems redundant because the table header which was followed to reach this page also identified the table type, but perhaps it is another sanity check, or an alternate way to tell, when following page links, that you have reached the end of the table you are interested in. Speaking of which, the next four-byte value, next_page, is that link: it identifies the index at which the next page of this table can be found, as long as we have not already reached the final page of the table, as described in File Header.
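Following the page links of one table might look like the sketch below. The function name table_page_indices is hypothetical, and the position of next_page at byte 0c of the page is inferred from the field order just described (four zero bytes, page_index, type, next_page). The loop guard against revisiting a page is a defensive addition, not something the format is known to require.

```python
import struct


def table_page_indices(data: bytes, len_page: int, first_page: int, last_page: int):
    """Yield the index of each page of one table, in linked-list order."""
    index = first_page
    seen = set()
    while index not in seen:  # defensive: stop if the links ever form a loop
        seen.add(index)
        yield index
        if index == last_page:
            return  # never follow next_page past the table's last page
        # next_page is the fourth four-byte value in the page header (byte 0c)
        (index,) = struct.unpack_from("<I", data, len_page * index + 0x0C)
```

The first_page and last_page arguments are the values from the table pointer for the table of interest.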
The exact meaning of unknown1 is unclear. Mr. Flesniak said
“sequence number (0→1: 8→13, 1→2: 22, 2→3: 27)” but I don’t know how
to interpret that. Even less is known about unknown2. But
num_rows_small at byte
18 within the page (abbreviated nrs in
the byte field diagram above) holds the number of rows that are
present in the page, unless num_rows_large (below) holds a value
that is larger than it (but not equal to
1fff). This seems like a
strange mechanism for dealing with the fact that some tables (like
playlist entries) have a lot of very small rows, too many to count
with a single byte. But then why not just always use
num_rows_large?
The purpose of the next two bytes is also unclear. Of u3 Mr. Flesniak said “a bitmask (first track: 32)”, and he described u4 as “often 0, sometimes larger, especially for pages with a high number of rows (e.g. 12 for 101 rows)”.
Byte 1b is called page_flags (abbreviated pf in the
diagram). According to Mr. Flesniak, “strange” (non-data) pages will
have the value
64, and other pages have had the values
34. Crate Digger considers a page to be a data page if
page_flags & 40 = 0.
1d are called free_size (abbreviated frees
in the diagram), and store the amount of unused space in the page heap
(excluding the row index which is built backwards from the end of the
page); used_size at bytes
1d (abbreviated useds)
stores the number of bytes that are in use in the page heap.
21, u5 , are of unclear purpose. Mr. Flesniak
labeled them “(0→1: 2).”
23, num_rows_large (abbreviated numrl in
the diagram) hold the number of entries in the row index at the end of
the page when that value is too large to fit into num_rows_small
(as mentioned above), and that situation seems to be indicated when
this value is larger than num_rows_small, but not equal to 1fff.
u6 at bytes
25 seems to have the value
strange pages, and
0000 for data pages. And Mr. Flesniak describes
u7 at bytes
27 as “always 0 except 1 for history
pages, num entries for strange pages?”
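The two decisions described in this section, which row count to trust and whether a page holds data at all, can be expressed as small helper functions. The function names are made up for this sketch, and the data-page test assumes that the check is a bit mask of page_flags against 40 (hex), which is consistent with the observed values 64 for strange pages and 34 for data pages.

```python
def effective_row_count(num_rows_small: int, num_rows_large: int) -> int:
    """Pick the row count for a page from its two candidate header fields."""
    # num_rows_large wins only when it is larger and is not the value 1fff.
    if num_rows_large > num_rows_small and num_rows_large != 0x1FFF:
        return num_rows_large
    return num_rows_small


def is_data_page(page_flags: int) -> bool:
    """ASSUMPTION: a page is a data page when bit 40 (hex) of page_flags is clear."""
    return (page_flags & 0x40) == 0
```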
After these header fields comes the page heap. Rows are allocated
within this heap starting at byte
28. Since rows can be different
sizes, there needs to be a way to locate them. This takes the form of
a row index, which is built from the end of the page backwards, in
groups of up to sixteen row pointers along with a bitmask saying which
of those rows are still part of the table (they might have been
deleted). The number of row index entries is determined, as described
above, by the value of either num_rows_small or num_rows_large.
The bit mask for the first group of up to sixteen rows, labeled
rowpf0 in the diagram (meaning “row presence flags group 0”), is
found near the end of the page. The last two bytes after each row
bitmask (for example pad0 after rowpf0) have an unknown
purpose and may always be zero, and the rowpf0 bitmask takes up
the two bytes that precede them. The low order bit of this value will
be set if row 0 is really present, the next bit if row 1 is really
present, and so on. The two bytes before these flags, labeled
ofs0, store the offset of the first row in the page. This offset
is the number of bytes past the end of the page header at which the
row itself can be found. So if row 0 begins at the very beginning of
the heap, at byte
28 in the page, ofs0 would have the value 0.
As more rows are added to the page, space is allocated for them in the heap, and additional index entries are added at the end of the heap, growing backwards. Once there have been sixteen rows added, all of the bits in rowpf0 are accounted for, and when another row is added, before its offset entry ofs16 can be added, another row bit-mask entry rowpf1 needs to be allocated, followed by its corresponding pad1. And so the row index grows backwards towards the rows that are being added forwards, and once they are too close for a new row to fit, the page is full, and another page gets allocated to the table.
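To make the backwards-growing row index concrete, here is a sketch that walks it and returns where each present row starts within a page. The function name present_row_positions is hypothetical. It relies on the layout described above: each group ends with two padding bytes, preceded by a two-byte presence bitmask, preceded by up to sixteen two-byte offsets, so a full group occupies 36 bytes.

```python
import struct

HEAP_START = 0x28  # rows are allocated starting at this byte of the page
GROUP_SIZE = 16    # row offsets covered by each presence bitmask


def present_row_positions(page: bytes, num_rows: int):
    """Return the position within the page of every row that is still present."""
    positions = []
    # A full group is 16 two-byte offsets, a two-byte bitmask and two pad bytes.
    group_stride = 2 * GROUP_SIZE + 4
    for row in range(num_rows):
        group, slot = divmod(row, GROUP_SIZE)
        group_end = len(page) - group * group_stride
        (flags,) = struct.unpack_from("<H", page, group_end - 4)
        if not (flags >> slot) & 1:
            continue  # row is not present, so its data must be ignored
        (ofs,) = struct.unpack_from("<H", page, group_end - 6 - 2 * slot)
        positions.append(HEAP_START + ofs)  # offsets are relative to the heap
    return positions
```

The num_rows argument is the row count chosen from num_rows_small and num_rows_large as described earlier.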
The structure of the rows themselves is determined by the type of the table, using the values shown in Table types.
Album rows hold an album name and ID along with an artist association,
with the structure shown below. The unknown value at
01 seems to usually have the values
80 00. It is
followed by a two-byte value Mr. Flesniak called index_shift,
although I don’t know what that means, and another four bytes of
unknown purpose. But at bytes
0b we finally find a value
we have a use for: artist_id holds the ID of an artist row
associated with this album row. This is followed by id, the ID of
this album row itself, at bytes
0f. We assume that there
are index tables somewhere that would let us locate the page and row
index of a record given its table type and ID, but we have not yet
found and figured them out.
This is followed by five more bytes with unknown meaning, and the final byte in the row, ofs_name, is a pointer to the album name (labeled on in the byte field diagram). To find the location of the name, add ofs_name bytes to the address of the start of the album row itself. The name itself is encoded in a surprisingly baroque way, explained in DeviceSQL Strings.
Artist rows hold an Artist name and ID, with the structure shown in
Artist row with nearby name or
Artist row with far name. The subtype value at
01 determines which variant is used. If the artist
name was allocated close enough to the row to be reached by a single
byte offset, subtype has the value
0060, and the row has
the structure in Artist row with nearby name. If
the name is too far away for that, subtype has the value
0064, and the row has the structure in Artist row with far name.
In either case, subtype is followed by the unexplained two-byte
value found in many row types that Mr. Flesniak called
index_shift, and then by id, the ID of this artist row itself,
07, an unknown value at byte
ofs_name_near at byte
09 (labeled on), the one-byte
name offset used only in the first variant.
If subtype is
0064, the value of ofs_name_near is ignored, and
the two-byte value ofs_name_far (labeled ofar) is used instead.
Whichever name offset is used, it is a pointer to the artist name. To find the location of the name, add the value of the offset to the address of the start of the artist row itself. This gives the address of a DeviceSQL string holding the name, with the structure explained in DeviceSQL Strings.
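Choosing between the two artist name offsets could be sketched like this. The function name artist_name_position is invented for the example, and the location of ofs_name_far at byte 0a of the row is an assumption (that it directly follows ofs_name_near at byte 09); treat it as unverified.

```python
import struct


def artist_name_position(page: bytes, row_start: int) -> int:
    """Return the position in the page of an artist row's name string."""
    (subtype,) = struct.unpack_from("<H", page, row_start)
    if subtype == 0x0064:
        # Far variant. ASSUMPTION: ofs_name_far is the two bytes at 0a.
        (offset,) = struct.unpack_from("<H", page, row_start + 0x0A)
    else:
        # Near variant (subtype 0060): one-byte ofs_name_near at byte 09.
        offset = page[row_start + 0x09]
    # Name offsets are relative to the start of the row itself.
    return row_start + offset
```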
Artwork rows hold an id (which tracks refer to) and the path at which the corresponding album art image file can be found, with the structure shown below. Note that in this case, the DeviceSQL string path is embedded directly into the row itself, rather than being located elsewhere in the heap through an offset. The structure of the string itself is still as described in DeviceSQL Strings.
The art file pointed to by this path will be the original
resolution 80x80 pixel image. Recent versions of rekordbox will also
add a higher resolution image, at 240x240 pixels. Its path can be
found by adding the string
_m right before the file extension. So
for example if the original resolution path is
high resolution file can be found at
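Deriving the high resolution path from the original one is a simple string operation, sketched below. The function name high_res_art_path and the sample path in the usage note are purely illustrative and are not taken from a real export.

```python
import posixpath


def high_res_art_path(path: str) -> str:
    """Insert _m right before the file extension of an artwork path."""
    stem, extension = posixpath.splitext(path)
    return f"{stem}_m{extension}"
```

For a made-up path such as /art/a1.jpg this would yield /art/a1_m.jpg. Since only recent versions of rekordbox write the larger image, a caller should be prepared for that file to be missing.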
Color rows hold a numeric color id (which controls the actual color
displayed on the player interface) at bytes
06 and a
text label or name starting at byte
08 which is a
DeviceSQL string shown in the information panel
for tracks that are assigned the color. The rows have the structure
shown below. There are several bytes in the row that are not yet known
to have any meaning.
Regardless of the names assigned to the colors by the user, the row id values map to the following colors in the user interface of rekordbox and on CDJs:
Genre rows hold a numeric genre id (which tracks can be assigned) at
03 and a text name starting at byte
which is a DeviceSQL string. The rows
have the structure shown below:
The History menu automatically records playlists of the tracks performed off a particular USB or SD card in a new, numbered playlist each time the media is mounted in a player. These playlists have names like "HISTORY 001". This table lists all the history playlists which have been created for the current database, tying their name to an ID which is used to match the History Entry Rows that make up the playlist for the corresponding performance.
History entry rows list the tracks that belong to a particular history
playlist, and also establish the order in which they were played. They
have a very simple structure, shown below, containing only three
values. The track_id at bytes
03 identifies the
track that was played at this position in the playlist, by
corresponding to the id of a row in the Track table.
The playlist_id at bytes
07 identifies the history
playlist to which it belongs, by corresponding to the id of a row
in the History Playlist list. The
entry_index at bytes
0b specifies the position
within the playlist at which this entry belongs.
Key rows represent musical keys. They hold a numeric id (which tracks
can be assigned) at bytes
03 and a text name starting
08 which is a DeviceSQL string.
(There seems to be a second copy of the ID at bytes
The rows have the structure shown below:
Playlist tree rows are used to organize the hierarchical structure of the playlist menu. There is probably an index somewhere that makes it possible to find the right rows directly when loading a playlist, but we have not yet figured out how indices work in DeviceSQL databases, so Crate Digger simply reads all the rows and builds its own in-memory index of the tree.
Playlist tree rows can either represent a playlist “folder” which
contains other folders and playlists, or a regular playlist which
holds only tracks. The rows are identified by an id at
0f, and also contain a parent_id at
03 which is how the hierarchical structure is
represented: the contents of a folder are the other rows in this table
whose parent_id is equal to the id of the folder.
Similarly, the tracks that make up a regular playlist are the Playlist Entry Rows whose playlist_id is equal to this row’s id.
Each playlist tree row also has a text name starting at
14 which is a DeviceSQL string
displayed when navigating the hierarchy, a sort_order indicator at
0b (this may be the same value used to select sort
orders when requesting menus using the dbserver protocol, shown in the
analysis, but this has not yet been confirmed), and a value that
specifies whether the row defines a folder or a playlist. In the
Kaitai Struct, this value is called raw_is_folder, is found at
13, and has a non-zero value for folders. For
convenience, the struct also defines a derived value, is_folder,
which is a boolean.
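An in-memory index of the playlist tree, in the spirit of what is described above, might be built as in the following sketch. It is not Crate Digger's code: the PlaylistTreeRow tuple and the function names are invented here, the rows are assumed to have been parsed already, and treating a parent_id of 0 as the top level of the menu is an assumption.

```python
from collections import defaultdict
from typing import NamedTuple


class PlaylistTreeRow(NamedTuple):
    """Already-parsed fields of one playlist tree row (hypothetical type)."""
    id: int
    parent_id: int
    sort_order: int
    is_folder: bool
    name: str


def index_by_parent(rows):
    """Group rows by parent_id so a folder's contents can be looked up directly."""
    children = defaultdict(list)
    for row in rows:
        children[row.parent_id].append(row)
    return children


def walk(children, folder_id=0, depth=0):
    """Return (depth, name) pairs for everything below folder_id.

    ASSUMPTION: the top level of the tree uses a parent_id of 0.
    """
    result = []
    for row in children.get(folder_id, []):
        result.append((depth, row.name))
        if row.is_folder:
            result.extend(walk(children, row.id, depth + 1))
    return result
```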
The rows have the following structure:
Playlist entry rows list the tracks that belong to a particular
playlist, and also establish the order in which they should be played.
They have a very simple structure, shown below, containing only three
values. The entry_index at bytes
03 specifies the
position within the playlist at which this entry belongs. The
track_id at bytes
07 identifies the track to be
played at this position in the playlist, by corresponding to the id
of a row in the Track table, and the playlist_id at
0b identifies the playlist to which it belongs, by
corresponding to the id of a row in the Playlist Tree table.
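Assembling the ordered track list of one playlist from these rows can be sketched as follows. The function name playlist_track_ids is hypothetical, and the entries are assumed to have been parsed into (entry_index, track_id, playlist_id) tuples, matching the field order in the row.

```python
def playlist_track_ids(entries, playlist_id):
    """Return the track ids of one playlist, ordered by entry_index.

    entries is an iterable of (entry_index, track_id, playlist_id) tuples.
    """
    matching = [entry for entry in entries if entry[2] == playlist_id]
    # Tuples sort by their first element, which is entry_index.
    return [track_id for _, track_id, _ in sorted(matching)]
```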
Track rows describe audio tracks that can be played from the media export, and provide many details about the music including links to other tables like artists, albums, keys, and others. They have the structure shown below:
The first two bytes, labeled u1, have an unknown purpose; they
24 followed by
00. They are followed by the
unexplained two-byte value found in many row types that Mr. Flesniak
called index_shift, and a four-byte value he called bitmask,
although we do not know what the bits mean. The value at
0b, sample_rate, is the first one we have a
solid understanding of: it holds the playback sample rate of the audio
file, in samples per second (this will be 0 if it is unknown or
0f hold the value composer_id which identifies
the composer of the track, if known, as a non-zero id value of an
Artist row. The size of the audio file, in bytes, is
found in file_size at bytes
13. This is followed by
an unknown four-byte value, u2, which may be another ID, and two
unknown two-byte values, u3 (about which Mr. Flesniak says “always
19048?”) and u4 (“always 30967?”).
If there is cover art for the track, there will be a non-zero value in
1f), identifying the id of an
If a dominant musical key was identified for the track there will be a
non-zero value in key_id (bytes
represents the id of a Key row. If the track is known
to be a remake, the non-zero Artist row id of the
original performer will be found at bytes
original_artist_id. If there is a known record label for the
track, the non-zero value in label_id (bytes
will link to the id of a Label row id. Similarly,
if there is a known remixer, there will be a non-zero value in
2f) linking to the id of an
The field bitrate at bytes
33 stores the playback bit
rate of the track, and track_number at bytes
holds the position of the track within its album. tempo at
3b holds the playback tempo of the start of the
track in beats per minute, multiplied by 100 (in order to support a
precision of 0.01 BPM). If there is a known genre for
the track, there will be a non-zero value in genre_id at
3f, representing the id of a Genre
If the track is part of an album, there will be a non-zero value in
album_id at bytes
43, and this will be the id of
an Album row. The Artist row id of
the primary performer associated with the track is found in
artist_id at bytes
47. And the id of the track
itself is found in id at bytes
4b. If the album is
known to consist of multiple discs, the disc number on which this
track is found will be in disc_number at bytes
And the number of times the track has been played is found in
The year in which the track was recorded, if known, is in year at
51. The sample depth of the track audio file (bits
per sample) is in sample_depth at bytes
playback time of the track (in seconds, at normal speed) is in
duration at bytes
55. The purpose of the next two
bytes, labeled u5, is unknown; they seem to always hold the value
58, color_id (labeled cid in the diagram), holds
the color assigned to the track in rekordbox, as the id of a
Color row, or zero if no color has been assigned.
59, rating (labeled r in the diagram) holds the
rating (0 to 5 stars) assigned to the track. The next two bytes, labeled
u6, have an unknown purpose, and seem to always have the value 1.
The two bytes after them, labeled u7, are also unknown; Mr.
Flesniak said “alternating 2 and 3”.
The rest of the track row is an array of 21 two-byte offsets that point to DeviceSQL strings. To find the start of the string, add the address of the start of the track row to the offset. The purpose of each string is described in the following table. For convenience, the strings can be accessed as Kaitai Struct instance values with the names shown in the table:
Unknown, named by @flesniak.
Unknown, “thought track number, wrong”.
Unknown, “strange things”.
Unknown, “strange things” (as above).
Unknown, named by @flesniak.
Unknown, usually empty.
When the track was added to the rekordbox collection.
When the track was released.
Name of the track remix, if any.
Unknown, usually empty.
File path of the track analysis.
When track analysis was performed.
Track comment assigned by the DJ.
Unknown, usually empty.
Name of track audio file.
File path of track audio.
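Locating the track strings from the offset array could look like the sketch below. The function name is invented, and the starting position of the array within the row (5e) is an assumption derived from the text: rating is at byte 59 and is followed by the two-byte values u6 and u7, after which the array begins.

```python
import struct

# ASSUMPTION: the array starts at 5e, right after the two-byte u6 and u7
# values that follow rating at byte 59.
STRING_OFFSETS_START = 0x5E
NUM_STRINGS = 21


def track_string_positions(page: bytes, row_start: int):
    """Return the position in the page of each of a track row's 21 strings."""
    offsets = struct.unpack_from(
        f"<{NUM_STRINGS}H", page, row_start + STRING_OFFSETS_START
    )
    # Each offset is relative to the start of the track row.
    return [row_start + ofs for ofs in offsets]
```

Each returned position is where a DeviceSQL string begins, and can be handed to whatever string decoder is in use.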
Many row types store string values, sometimes by directly embedding them, but more often by storing an offset to a location elsewhere in the heap. In either case the string itself uses the structure described in this section. Strings can be stored in a variety of formats. The first byte of the structure seems to be a bunch of flags from which the format can be determined. We are not certain of the details because not all formats are present in the export files we have seen, so this represents our best guess so far.
Our best guess as to the interpretation of these bits follows:
If set, the string seems to be little-endian
The string following is encoded in ASCII, endianness does not apply
The contained string is encoded in UTF-8, endianness does not apply (not yet seen in practice though DeviceSQL claims to support encoding in ASCII, UTF-8, and UTF-16)
The contained string is encoded in UTF-16, endianness is determined by E
If this bit is set, then the string is a Short ASCII string, and the other "flag" bits in this byte actually store its length (see below)
The details of this analysis are somewhat speculative because the only
bit patterns we have seen in practice when S is zero are
for long-form ASCII strings and
0b10010000 for long-form UTF-16LE
strings (rekordbox probably just does not use the other supported formats).
As described above, when S is
1, we are dealing with a short ASCII
string, and other flags are replaced by the seven-bit length field for
the string field (including this type-and-length byte, so the string
itself can be up to 126 characters long).
The flag byte described above is labeled lk (lengthAndKind) below. If S (the low-order bit of lk) is set, it means the string field holds a short ASCII string. The length of such a field can be extracted by right-shifting lk once (or, equivalently, dividing it by two). This length is for the entire string field, including lk itself, so the maximum length of actual string data is 126 bytes.
DeviceSQL strings do not have terminator bytes, so attempting to read more bytes than present can lead to garbage characters being present or crashing the parser for the more complex Unicode strings. ISRC Strings are the only exception.
Again the flag byte described above is labeled lk (lengthAndKind) below. If S (the low-order bit of lk) is zero, it means the string field holds a long or wide string, whose format is specified by the other flag bits described above, and whose length is determined by a two-byte length field which follows the flag byte:
As always, length represents the length of the entire field including the header bytes, so the length of the actual string data is length - 4. We have only ever seen zero values for the pad byte. The encoding of the string data is determined by the flag bits in lk as described above.
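Putting the short and long forms together, a decoder might be sketched as below. This is a best-guess illustration and not the Crate Digger parser: the function name is made up, only the two long-form kinds seen in practice are handled, and the lk value 40 (hex) used here for long-form ASCII is an assumption, while 90 (hex) corresponds to the 0b10010000 pattern for UTF-16LE. It does not handle the ISRC special case.

```python
import struct


def decode_device_sql_string(data: bytes, position: int) -> str:
    """Decode the DeviceSQL string field that starts at position."""
    lk = data[position]
    if lk & 0x01:
        # Short ASCII: the other seven bits hold the length of the whole
        # field, including the lk byte itself.
        length = lk >> 1
        return data[position + 1 : position + length].decode("ascii")
    # Long form: lk, a two-byte length, a pad byte, then the string data.
    (length,) = struct.unpack_from("<H", data, position + 1)
    body = data[position + 4 : position + length]
    if lk == 0x40:  # ASSUMPTION: the long-form ASCII bit pattern
        return body.decode("ascii")
    if lk == 0x90:  # 0b10010000, long-form UTF-16LE
        return body.decode("utf-16-le")
    raise ValueError(f"unsupported DeviceSQL string kind 0x{lk:02x}")
```

Raising an error for unrecognized kinds is a deliberate choice in this sketch, since guessing an encoding for a format that has not been observed would only hide problems.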
When an International Standard Recording Code is present as the
first string pointer in a track row, it is marked with kind
does not actually hold a UTF-16-LE string. Instead, the first byte
pad value following the length is the value
03 and then there are
bytes of ASCII, followed by a null byte. Crate
Digger does not yet attempt to cope with this.
00 byte in DeviceSQL UTF strings; we previously believed they were big-endian.
02 as content.