| author | Nico Weber <nicolasweber@gmx.de> | 2019-05-01 19:15:05 +0000 |
|---|---|---|
| committer | Nico Weber <nicolasweber@gmx.de> | 2019-05-01 19:15:05 +0000 |
| commit | 0a4aeec16eda1ebbaff4539c174926f43305142b (patch) | |
| tree | dac5f07566aa52aa0adaf7c25b11504547a29f60 /llvm/docs/PDB/MsfFile.rst | |
| parent | a0df4d37b0f6bccbd0ce0793f229bd8b130eeb8b (diff) | |
Convert PDB docs to unix line endings. No other changes.
llvm-svn: 359712
Diffstat (limited to 'llvm/docs/PDB/MsfFile.rst')
| -rw-r--r-- | llvm/docs/PDB/MsfFile.rst | 358 |
1 files changed, 179 insertions, 179 deletions
diff --git a/llvm/docs/PDB/MsfFile.rst b/llvm/docs/PDB/MsfFile.rst
index dfbbf9ded7f..a53ebe3e884 100644
--- a/llvm/docs/PDB/MsfFile.rst
+++ b/llvm/docs/PDB/MsfFile.rst
@@ -1,179 +1,179 @@
-=====================================
-The MSF File Format
-=====================================
-
-.. contents::
-   :local:
-
-.. _msf_layout:
-
-File Layout
-===========
-
-The MSF file format consists of the following components:
-
-1. :ref:`msf_superblock`
-2. :ref:`msf_freeblockmap` (also know as Free Page Map, or FPM)
-3. Data
-
-Each component is stored as an indexed block, the length of which is specified
-in ``SuperBlock::BlockSize``. The file consists of 1 or more iterations of the
-following pattern (sometimes referred to as an "interval"):
-
-1. 1 block of data
-2. Free Block Map 1 (corresponds to ``SuperBlock::FreeBlockMapBlock`` 1)
-3. Free Block Map 2 (corresponds to ``SuperBlock::FreeBlockMapBlock`` 2)
-4. ``SuperBlock::BlockSize - 3`` blocks of data
-
-In the first interval, the first data block is used to store
-:ref:`msf_superblock`.
-
-The following diagram demonstrates the general layout of the file (\| denotes
-the end of an interval, and is for visualization purposes only):
-
-+-------------+-----------------------+------------------+------------------+----------+----+------+------+------+-------------+----+-----+
-| Block Index | 0                     | 1                | 2                | 3 - 4095 | \| | 4096 | 4097 | 4098 | 4099 - 8191 | \| | ... |
-+=============+=======================+==================+==================+==========+====+======+======+======+=============+====+=====+
-| Meaning     | :ref:`msf_superblock` | Free Block Map 1 | Free Block Map 2 | Data     | \| | Data | FPM1 | FPM2 | Data        | \| | ... |
-+-------------+-----------------------+------------------+------------------+----------+----+------+------+------+-------------+----+-----+
-
-The file may end after any block, including immediately after a FPM1.
-
-.. note::
-  LLVM only supports 4096 byte blocks (sometimes referred to as the "BigMsf"
-  variant), so the rest of this document will assume a block size of 4096.
-
-.. _msf_superblock:
-
-The Superblock
-==============
-At file offset 0 in an MSF file is the MSF *SuperBlock*, which is laid out as
-follows:
-
-.. code-block:: c++
-
-  struct SuperBlock {
-    char FileMagic[sizeof(Magic)];
-    ulittle32_t BlockSize;
-    ulittle32_t FreeBlockMapBlock;
-    ulittle32_t NumBlocks;
-    ulittle32_t NumDirectoryBytes;
-    ulittle32_t Unknown;
-    ulittle32_t BlockMapAddr;
-  };
-
-- **FileMagic** - Must be equal to ``"Microsoft C / C++ MSF 7.00\\r\\n"``
-  followed by the bytes ``1A 44 53 00 00 00``.
-- **BlockSize** - The block size of the internal file system. Valid values are
-  512, 1024, 2048, and 4096 bytes. Certain aspects of the MSF file layout vary
-  depending on the block sizes. For the purposes of LLVM, we handle only block
-  sizes of 4KiB, and all further discussion assumes a block size of 4KiB.
-- **FreeBlockMapBlock** - The index of a block within the file, at which begins
-  a bitfield representing the set of all blocks within the file which are "free"
-  (i.e. the data within that block is not used). See :ref:`msf_freeblockmap` for
-  more information.
-  **Important**: ``FreeBlockMapBlock`` can only be ``1`` or ``2``!
-- **NumBlocks** - The total number of blocks in the file. ``NumBlocks * BlockSize``
-  should equal the size of the file on disk.
-- **NumDirectoryBytes** - The size of the stream directory, in bytes. The stream
-  directory contains information about each stream's size and the set of blocks
-  that it occupies. It will be described in more detail later.
-- **BlockMapAddr** - The index of a block within the MSF file. At this block is
-  an array of ``ulittle32_t``'s listing the blocks that the stream directory
-  resides on. For large MSF files, the stream directory (which describes the
-  block layout of each stream) may not fit entirely on a single block. As a
-  result, this extra layer of indirection is introduced, whereby this block
-  contains the list of blocks that the stream directory occupies, and the stream
-  directory itself can be stitched together accordingly. The number of
-  ``ulittle32_t``'s in this array is given by ``ceil(NumDirectoryBytes / BlockSize)``.
-
-.. _msf_freeblockmap:
-
-The Free Block Map
-==================
-
-The Free Block Map (sometimes referred to as the Free Page Map, or FPM) is a
-series of blocks which contains a bit flag for every block in the file. The
-flag will be set to 0 if the block is in use, and 1 if the block is unused.
-
-Each file contains two FPMs, one of which is active at any given time. This
-feature is designed to support incremental and atomic updates of the underlying
-MSF file. While writing to an MSF file, if the active FPM is FPM1, you can
-write your new modified bitfield to FPM2, and vice versa. Only when you commit
-the file to disk do you need to swap the value in the SuperBlock to point to
-the new ``FreeBlockMapBlock``.
-
-The Free Block Maps are stored as a series of single blocks thoughout the file
-at intervals of BlockSize. Because each FPM block is of size ``BlockSize``
-bytes, it contains 8 times as many bits as an interval has blocks. This means
-that the first block of each FPM refers to the first 8 intervals of the file
-(the first 32768 blocks), the second block of each FPM refers to the next 8
-blocks, and so on. This results in far more FPM blocks being present than are
-required, but in order to maintain backwards compatibility the format must stay
-this way.
-
-The Stream Directory
-====================
-The Stream Directory is the root of all access to the other streams in an MSF
-file. Beginning at byte 0 of the stream directory is the following structure:
-
-.. code-block:: c++
-
-  struct StreamDirectory {
-    ulittle32_t NumStreams;
-    ulittle32_t StreamSizes[NumStreams];
-    ulittle32_t StreamBlocks[NumStreams][];
-  };
-
-And this structure occupies exactly ``SuperBlock->NumDirectoryBytes`` bytes.
-Note that each of the last two arrays is of variable length, and in particular
-that the second array is jagged.
-
-**Example:** Suppose a hypothetical PDB file with a 4KiB block size, and 4
-streams of lengths {1000 bytes, 8000 bytes, 16000 bytes, 9000 bytes}.
-
-Stream 0: ceil(1000 / 4096) = 1 block
-
-Stream 1: ceil(8000 / 4096) = 2 blocks
-
-Stream 2: ceil(16000 / 4096) = 4 blocks
-
-Stream 3: ceil(9000 / 4096) = 3 blocks
-
-In total, 10 blocks are used. Let's see what the stream directory might look
-like:
-
-.. code-block:: c++
-
-  struct StreamDirectory {
-    ulittle32_t NumStreams = 4;
-    ulittle32_t StreamSizes[] = {1000, 8000, 16000, 9000};
-    ulittle32_t StreamBlocks[][] = {
-      {4},
-      {5, 6},
-      {11, 9, 7, 8},
-      {10, 15, 12}
-    };
-  };
-
-In total, this occupies ``15 * 4 = 60`` bytes, so ``SuperBlock->NumDirectoryBytes``
-would equal ``60``, and ``SuperBlock->BlockMapAddr`` would be an array of one
-``ulittle32_t``, since ``60 <= SuperBlock->BlockSize``.
-
-Note also that the streams are discontiguous, and that part of stream 3 is in the
-middle of part of stream 2. You cannot assume anything about the layout of the
-blocks!
-
-Alignment and Block Boundaries
-==============================
-As may be clear by now, it is possible for a single field (whether it be a high
-level record, a long string field, or even a single ``uint16``) to begin and
-end in separate blocks. For example, if the block size is 4096 bytes, and a
-``uint16`` field begins at the last byte of the current block, then it would
-need to end on the first byte of the next block. Since blocks are not
-necessarily contiguously laid out in the file, this means that both the consumer
-and the producer of an MSF file must be prepared to split data apart
-accordingly. In the aforementioned example, the high byte of the ``uint16``
-would be written to the last byte of block N, and the low byte would be written
-to the first byte of block N+1, which could be tens of thousands of bytes later
-(or even earlier!) in the file, depending on what the stream directory says.
+=====================================
+The MSF File Format
+=====================================
+
+.. contents::
+   :local:
+
+.. _msf_layout:
+
+File Layout
+===========
+
+The MSF file format consists of the following components:
+
+1. :ref:`msf_superblock`
+2. :ref:`msf_freeblockmap` (also know as Free Page Map, or FPM)
+3. Data
+
+Each component is stored as an indexed block, the length of which is specified
+in ``SuperBlock::BlockSize``. The file consists of 1 or more iterations of the
+following pattern (sometimes referred to as an "interval"):
+
+1. 1 block of data
+2. Free Block Map 1 (corresponds to ``SuperBlock::FreeBlockMapBlock`` 1)
+3. Free Block Map 2 (corresponds to ``SuperBlock::FreeBlockMapBlock`` 2)
+4. ``SuperBlock::BlockSize - 3`` blocks of data
+
+In the first interval, the first data block is used to store
+:ref:`msf_superblock`.
+
+The following diagram demonstrates the general layout of the file (\| denotes
+the end of an interval, and is for visualization purposes only):
+
++-------------+-----------------------+------------------+------------------+----------+----+------+------+------+-------------+----+-----+
+| Block Index | 0                     | 1                | 2                | 3 - 4095 | \| | 4096 | 4097 | 4098 | 4099 - 8191 | \| | ... |
++=============+=======================+==================+==================+==========+====+======+======+======+=============+====+=====+
+| Meaning     | :ref:`msf_superblock` | Free Block Map 1 | Free Block Map 2 | Data     | \| | Data | FPM1 | FPM2 | Data        | \| | ... |
++-------------+-----------------------+------------------+------------------+----------+----+------+------+------+-------------+----+-----+
+
+The file may end after any block, including immediately after a FPM1.
+
+.. note::
+  LLVM only supports 4096 byte blocks (sometimes referred to as the "BigMsf"
+  variant), so the rest of this document will assume a block size of 4096.
+
+.. _msf_superblock:
+
+The Superblock
+==============
+At file offset 0 in an MSF file is the MSF *SuperBlock*, which is laid out as
+follows:
+
+.. code-block:: c++
+
+  struct SuperBlock {
+    char FileMagic[sizeof(Magic)];
+    ulittle32_t BlockSize;
+    ulittle32_t FreeBlockMapBlock;
+    ulittle32_t NumBlocks;
+    ulittle32_t NumDirectoryBytes;
+    ulittle32_t Unknown;
+    ulittle32_t BlockMapAddr;
+  };
+
+- **FileMagic** - Must be equal to ``"Microsoft C / C++ MSF 7.00\\r\\n"``
+  followed by the bytes ``1A 44 53 00 00 00``.
+- **BlockSize** - The block size of the internal file system. Valid values are
+  512, 1024, 2048, and 4096 bytes. Certain aspects of the MSF file layout vary
+  depending on the block sizes. For the purposes of LLVM, we handle only block
+  sizes of 4KiB, and all further discussion assumes a block size of 4KiB.
+- **FreeBlockMapBlock** - The index of a block within the file, at which begins
+  a bitfield representing the set of all blocks within the file which are "free"
+  (i.e. the data within that block is not used). See :ref:`msf_freeblockmap` for
+  more information.
+  **Important**: ``FreeBlockMapBlock`` can only be ``1`` or ``2``!
+- **NumBlocks** - The total number of blocks in the file. ``NumBlocks * BlockSize``
+  should equal the size of the file on disk.
+- **NumDirectoryBytes** - The size of the stream directory, in bytes. The stream
+  directory contains information about each stream's size and the set of blocks
+  that it occupies. It will be described in more detail later.
+- **BlockMapAddr** - The index of a block within the MSF file. At this block is
+  an array of ``ulittle32_t``'s listing the blocks that the stream directory
+  resides on. For large MSF files, the stream directory (which describes the
+  block layout of each stream) may not fit entirely on a single block. As a
+  result, this extra layer of indirection is introduced, whereby this block
+  contains the list of blocks that the stream directory occupies, and the stream
+  directory itself can be stitched together accordingly. The number of
+  ``ulittle32_t``'s in this array is given by ``ceil(NumDirectoryBytes / BlockSize)``.
+
+.. _msf_freeblockmap:
+
+The Free Block Map
+==================
+
+The Free Block Map (sometimes referred to as the Free Page Map, or FPM) is a
+series of blocks which contains a bit flag for every block in the file. The
+flag will be set to 0 if the block is in use, and 1 if the block is unused.
+
+Each file contains two FPMs, one of which is active at any given time. This
+feature is designed to support incremental and atomic updates of the underlying
+MSF file. While writing to an MSF file, if the active FPM is FPM1, you can
+write your new modified bitfield to FPM2, and vice versa. Only when you commit
+the file to disk do you need to swap the value in the SuperBlock to point to
+the new ``FreeBlockMapBlock``.
+
+The Free Block Maps are stored as a series of single blocks thoughout the file
+at intervals of BlockSize. Because each FPM block is of size ``BlockSize``
+bytes, it contains 8 times as many bits as an interval has blocks. This means
+that the first block of each FPM refers to the first 8 intervals of the file
+(the first 32768 blocks), the second block of each FPM refers to the next 8
+blocks, and so on. This results in far more FPM blocks being present than are
+required, but in order to maintain backwards compatibility the format must stay
+this way.
+
+The Stream Directory
+====================
+The Stream Directory is the root of all access to the other streams in an MSF
+file. Beginning at byte 0 of the stream directory is the following structure:
+
+.. code-block:: c++
+
+  struct StreamDirectory {
+    ulittle32_t NumStreams;
+    ulittle32_t StreamSizes[NumStreams];
+    ulittle32_t StreamBlocks[NumStreams][];
+  };
+
+And this structure occupies exactly ``SuperBlock->NumDirectoryBytes`` bytes.
+Note that each of the last two arrays is of variable length, and in particular
+that the second array is jagged.
+
+**Example:** Suppose a hypothetical PDB file with a 4KiB block size, and 4
+streams of lengths {1000 bytes, 8000 bytes, 16000 bytes, 9000 bytes}.
+
+Stream 0: ceil(1000 / 4096) = 1 block
+
+Stream 1: ceil(8000 / 4096) = 2 blocks
+
+Stream 2: ceil(16000 / 4096) = 4 blocks
+
+Stream 3: ceil(9000 / 4096) = 3 blocks
+
+In total, 10 blocks are used. Let's see what the stream directory might look
+like:
+
+.. code-block:: c++
+
+  struct StreamDirectory {
+    ulittle32_t NumStreams = 4;
+    ulittle32_t StreamSizes[] = {1000, 8000, 16000, 9000};
+    ulittle32_t StreamBlocks[][] = {
+      {4},
+      {5, 6},
+      {11, 9, 7, 8},
+      {10, 15, 12}
+    };
+  };
+
+In total, this occupies ``15 * 4 = 60`` bytes, so ``SuperBlock->NumDirectoryBytes``
+would equal ``60``, and ``SuperBlock->BlockMapAddr`` would be an array of one
+``ulittle32_t``, since ``60 <= SuperBlock->BlockSize``.
+
+Note also that the streams are discontiguous, and that part of stream 3 is in the
+middle of part of stream 2. You cannot assume anything about the layout of the
+blocks!
+
+Alignment and Block Boundaries
+==============================
+As may be clear by now, it is possible for a single field (whether it be a high
+level record, a long string field, or even a single ``uint16``) to begin and
+end in separate blocks. For example, if the block size is 4096 bytes, and a
+``uint16`` field begins at the last byte of the current block, then it would
+need to end on the first byte of the next block. Since blocks are not
+necessarily contiguously laid out in the file, this means that both the consumer
+and the producer of an MSF file must be prepared to split data apart
+accordingly. In the aforementioned example, the high byte of the ``uint16``
+would be written to the last byte of block N, and the low byte would be written
+to the first byte of block N+1, which could be tens of thousands of bytes later
+(or even earlier!) in the file, depending on what the stream directory says.
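
The Stream Directory example in the diff above can be checked with a short sketch. This is not LLVM's implementation: the names `kBlockSize`, `blocksFor`, `directoryBytes`, and `fileOffset` are hypothetical helpers introduced here for illustration, and the code assumes the 4096-byte block size that the document says LLVM supports. It reproduces the per-stream block counts, the 60-byte directory size, and the way a stream offset maps onto a discontiguous block list.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; the only names taken from the
// documented format are the concepts NumStreams, StreamSizes, StreamBlocks.
constexpr std::uint32_t kBlockSize = 4096; // assumed: the 4 KiB "BigMsf" block size

// Blocks needed to hold `bytes` bytes, i.e. ceil(bytes / kBlockSize).
// Sketch only: does not guard against overflow for sizes near UINT32_MAX.
constexpr std::uint32_t blocksFor(std::uint32_t bytes) {
  return (bytes + kBlockSize - 1) / kBlockSize;
}

// Size in bytes of a stream directory: one 32-bit word for NumStreams, one per
// entry in StreamSizes, and one per block index listed in StreamBlocks.
std::uint32_t directoryBytes(const std::vector<std::uint32_t> &streamSizes) {
  std::uint32_t words = 1 + static_cast<std::uint32_t>(streamSizes.size());
  for (std::uint32_t size : streamSizes)
    words += blocksFor(size);
  return words * 4;
}

// File offset of byte `offset` of a stream whose block list is `blocks`.
// Assumes `offset` lies within the stream, so the block index is in range.
std::uint64_t fileOffset(const std::vector<std::uint32_t> &blocks,
                         std::uint32_t offset) {
  const std::size_t blockInStream = offset / kBlockSize;
  return std::uint64_t{blocks[blockInStream]} * kBlockSize + offset % kBlockSize;
}
```

With the example's stream 2, whose block list is `{11, 9, 7, 8}`, stream offset 4095 maps to file offset 49151 (the last byte of block 11), while stream offset 4096 maps to file offset 36864 (the first byte of block 9). A two-byte field starting at stream offset 4095 therefore has its second byte earlier in the file than its first, which is the situation the "Alignment and Block Boundaries" section warns about.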

