OpenCore
1.0.4
OpenCore Bootloader
Macros
#define IS_COMPRESSED_BLOCK 0x8000
#define BLOCK_LENGTH_BITS 0xFFF
#define UNIT_MASK 0xF
Functions
STATIC EFI_STATUS GetNextCluster (IN OUT COMPRESSED *Clusters)
STATIC EFI_STATUS GetDataRunByte (IN COMPRESSED *Clusters, OUT UINT8 *Result)
STATIC EFI_STATUS GetTwoDataRunBytes (IN COMPRESSED *Clusters, OUT UINT16 *Result)
STATIC EFI_STATUS DecompressBlock (IN COMPRESSED *Clusters, OUT UINT8 *Dest OPTIONAL)
STATIC EFI_STATUS ReadCompressedBlock (IN RUNLIST *Runlist, OUT UINT8 *Dest OPTIONAL, IN UINTN BlocksTotal)
EFI_STATUS Decompress (IN RUNLIST *Runlist, IN UINT64 Offset, IN UINTN Length, OUT UINT8 *Dest)
Variables
UINT64 mUnitSize
STATIC UINT64 mBufferSize
Copyright (c) 2022, Mikhail Krichanov. All rights reserved. SPDX-License-Identifier: BSD-3-Clause
Functional and structural descriptions follow NTFS Documentation by Richard Russon and Yuval Fledel
Definition in file Compression.c.
#define BLOCK_LENGTH_BITS 0xFFF
Definition at line 13 of file Compression.c.
#define IS_COMPRESSED_BLOCK 0x8000
Definition at line 12 of file Compression.c.
#define UNIT_MASK 0xF
Definition at line 14 of file Compression.c.
EFI_STATUS Decompress (IN RUNLIST *Runlist, IN UINT64 Offset, IN UINTN Length, OUT UINT8 *Dest)
Definition at line 549 of file Compression.c.
STATIC EFI_STATUS DecompressBlock (IN COMPRESSED *Clusters, OUT UINT8 *Dest OPTIONAL)
The basic idea is that substrings of the block which have been seen before are compressed by referencing the string rather than mentioning it again.
#include <ntfs.h>
#include <stdio.h>
This is compressed to: #include <ntfs.h>\n(-18,10)stdio(-17,4)
Pairs like (-18,10) are recorded in two bytes. The shortest possible substring is 3 bytes long, so one can subtract 3 from the length before encoding it.
The references are always backward and never 0, so one can store them as positive numbers and subtract one: (-18,10) -> (17,7); (-17,4) -> (16,1).
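The two adjustments described above can be sketched in C. This is only an illustration under stated assumptions: the PAIR and STORED_PAIR types and the BiasPair/UnbiasPair helpers are hypothetical names that do not appear in Compression.c, and plain C types are used instead of the EDK II ones.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration type: a back-reference as written in the text. */
typedef struct {
  int32_t  Offset;  /* negative distance back into the block, e.g. -18 */
  uint32_t Length;  /* number of bytes to copy, at least 3             */
} PAIR;

/* Hypothetical stored form: distance minus one, length minus three. */
typedef struct {
  uint16_t Delta;
  uint16_t Length;
} STORED_PAIR;

static STORED_PAIR BiasPair (PAIR Pair)
{
  STORED_PAIR Stored;

  assert (Pair.Offset < 0);   /* references are always backward   */
  assert (Pair.Length >= 3);  /* shortest substring is 3 bytes    */

  Stored.Delta  = (uint16_t)(-Pair.Offset - 1);
  Stored.Length = (uint16_t)(Pair.Length - 3);
  return Stored;
}

static PAIR UnbiasPair (STORED_PAIR Stored)
{
  PAIR Pair;

  Pair.Offset = -((int32_t)Stored.Delta + 1);
  Pair.Length = (uint32_t)Stored.Length + 3;
  return Pair;
}
```

With these helpers, BiasPair applied to (-18,10) yields (17,7), and UnbiasPair restores the original pair.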
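Given that a block is 4096 bytes in size, you might need 12 bits to encode the back reference. This means that you have only 4 bits left to encode the length.
To avoid this, bits are allocated to the back-reference dynamically: the further the current position (clear_pos) is into the uncompressed block, the more bits go to the back-reference and the fewer to the length.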
for (i = clear_pos - 1, lmask = 0xFFF, dshift = 12; i >= 0x10; i >>= 1) {
  lmask >>= 1; // bit mask for length
  dshift--;    // shift width for delta
}
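As a rough sketch, the loop can be wrapped in a helper and used to split a 2-byte token. GetSplit and UnpackToken are hypothetical names, not functions from Compression.c, and ClearPos is assumed to be the number of bytes of the block already decompressed (between 1 and 4096).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: derive the length mask and delta shift for a token
   decoded when ClearPos bytes of the block are already decompressed. */
static void GetSplit (uint32_t ClearPos, uint16_t *LengthMask, uint16_t *DeltaShift)
{
  uint32_t I;

  assert (ClearPos >= 1 && ClearPos <= 4096);

  *LengthMask = 0xFFF;
  *DeltaShift = 12;
  for (I = ClearPos - 1; I >= 0x10; I >>= 1) {
    *LengthMask >>= 1;   /* one bit fewer for the length   */
    (*DeltaShift)--;     /* one bit more for the delta     */
  }
}

/* Hypothetical helper: unpack a token into the real distance and length,
   undoing the "minus one" and "minus three" adjustments. */
static void UnpackToken (uint16_t Token, uint32_t ClearPos, uint32_t *Distance, uint32_t *Length)
{
  uint16_t LengthMask;
  uint16_t DeltaShift;

  GetSplit (ClearPos, &LengthMask, &DeltaShift);
  *Distance = (uint32_t)(Token >> DeltaShift) + 1;
  *Length   = (uint32_t)(Token & LengthMask) + 3;
}
```

In this sketch the first 16 positions use 4 bits for the delta and 12 for the length, and the last positions of a 4096-byte block use 12 bits for the delta and 4 for the length.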
Now that we can encode an (offset, length) pair as 2 bytes, we still have to know whether a token is a back-reference or plain text. This takes 1 bit per token. 8 tokens are grouped together and preceded by a tags byte; the first token of a group corresponds to the least significant bit.
">\n(-18,10)stdio" would be encoded as "00000100" (the 1 bit indicates the back reference).
"00000000"#include"00000000" <ntfs.h"00000100">\n(17,7)stdio"00000001"(16,1)
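Putting tags bytes and tokens together, a decoder for the token stream of one block might look like the sketch below. ExpandTokens is a hypothetical function, not the DecompressBlock implementation; it assumes 2-byte tokens are stored little-endian and that the input holds only the token stream, without the 2-byte block length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: expand the token stream of one compressed block.
   Returns the number of bytes written to Dst, or 0 on malformed input. */
static size_t ExpandTokens (const uint8_t *Src, size_t SrcSize, uint8_t *Dst, size_t DstSize)
{
  size_t In  = 0;
  size_t Out = 0;

  assert (Src != NULL && Dst != NULL);

  while (In < SrcSize) {
    uint8_t Tags = Src[In++];  /* one tags byte precedes up to 8 tokens */
    int     Bit;

    for (Bit = 0; Bit < 8 && In < SrcSize; Bit++) {
      if ((Tags & (1u << Bit)) == 0) {
        /* Plain-text token: one literal byte. */
        if (Out >= DstSize) return 0;
        Dst[Out++] = Src[In++];
      } else {
        uint16_t Token, LengthMask = 0xFFF, DeltaShift = 12;
        size_t   I, Distance, Length;

        if (In + 2 > SrcSize || Out == 0) return 0;
        Token = (uint16_t)(Src[In] | (Src[In + 1] << 8));  /* assumed little-endian */
        In   += 2;

        /* Dynamic split between delta bits and length bits. */
        for (I = Out - 1; I >= 0x10; I >>= 1) {
          LengthMask >>= 1;
          DeltaShift--;
        }
        Distance = (size_t)(Token >> DeltaShift) + 1;
        Length   = (size_t)(Token & LengthMask) + 3;
        if (Distance > Out || Out + Length > DstSize) return 0;

        /* Copy byte by byte: source and destination may overlap. */
        while (Length-- > 0) {
          Dst[Out] = Dst[Out - Distance];
          Out++;
        }
      }
    }
  }
  return Out;
}
```

For instance, the six input bytes 0x08 'a' 'b' 'c' 0x03 0x20 describe three literals followed by a back-reference with distance 3 and length 6, which expands to "abcabcabc".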
As a compression unit consists of 16 clusters (default), it usually contains more than one of these blocks. If you want to access the second block, it would be a waste of time to decompress the first one. Instead, each block is preceded by a 2-byte length. The lower 12 bits hold the length; more precisely, (n-1) is stored there. The higher 4 bits are of unknown purpose, apart from bit 0x8000 described below.
The compression method is based on independently compressing blocks of X clusters, where X = 2 ^ ATTR_HEADER_NONRES.CompressionUnitSize.
If compression would make the block grow in size, it is stored uncompressed. A stored length of exactly 4095 (0xFFF, that is, 4096 bytes) is used to indicate this case.
Bit 0x8000 is the flag specifying that the block is compressed.
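A minimal sketch of how the 2-byte block length could be interpreted with the two macros from this file. The BLOCK_HEADER type and ParseBlockHeader are hypothetical names introduced only for illustration, and the header is assumed to have already been read into a 16-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IS_COMPRESSED_BLOCK  0x8000
#define BLOCK_LENGTH_BITS    0xFFF

/* Hypothetical view of a decoded 2-byte block header. */
typedef struct {
  bool     IsCompressed;  /* bit 0x8000 is set                */
  uint32_t DataLength;    /* bytes that follow the header (n) */
} BLOCK_HEADER;

static BLOCK_HEADER ParseBlockHeader (uint16_t Raw)
{
  BLOCK_HEADER Header;

  Header.IsCompressed = (Raw & IS_COMPRESSED_BLOCK) != 0;
  /* (n-1) is stored in the low 12 bits, so add one back. */
  Header.DataLength = (uint32_t)(Raw & BLOCK_LENGTH_BITS) + 1;

  assert (Header.DataLength >= 1 && Header.DataLength <= 4096);
  return Header;
}
```

A reader that wants the second block of a unit can then skip DataLength bytes after the first header instead of decompressing the first block.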
Definition at line 162 of file Compression.c.
STATIC EFI_STATUS GetDataRunByte (IN COMPRESSED *Clusters, OUT UINT8 *Result)
Definition at line 58 of file Compression.c.
STATIC EFI_STATUS GetNextCluster (IN OUT COMPRESSED *Clusters)
Definition at line 21 of file Compression.c.
STATIC EFI_STATUS GetTwoDataRunBytes (IN COMPRESSED *Clusters, OUT UINT16 *Result)
Definition at line 82 of file Compression.c.
STATIC EFI_STATUS ReadCompressedBlock (IN RUNLIST *Runlist, OUT UINT8 *Dest OPTIONAL, IN UINTN BlocksTotal)
The set of VCNs containing the stream of a compressed file attribute is divided into compression units (also called chunks) of 16 clusters.
The alpha stage: if all 16 clusters of a compression unit are full of zeroes, the unit is called a Sparse unit and is not physically stored. Instead, an element with no Offset field (F=0, the Offset is assumed to be 0 too) and a Length of 16 clusters is put in the Runlist.
The beta stage: if compression of the unit is possible, N (< 16) clusters are physically stored, and an element with a Length of N is put in the Runlist, followed by another element with no Offset field (F=0, the Offset is assumed to be 0 too) and a Length of 16 - N.
If the unit is not compressed, 16 clusters are physically stored, and an element with a Length of 16 is put in the Runlist.
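The three cases can be told apart from the cluster counts alone. The sketch below is a hypothetical classifier, not code from Compression.c; it assumes the runlist elements of one unit have already been summed into a count of physically stored clusters and a count of clusters covered by elements without an Offset field.

```c
#include <stdint.h>

#define UNIT_CLUSTERS  16  /* default compression unit size, in clusters */

/* Hypothetical classification of one compression unit. */
typedef enum {
  UNIT_SPARSE,      /* nothing stored, 16 clusters without Offset      */
  UNIT_COMPRESSED,  /* N < 16 stored, followed by 16 - N without Offset */
  UNIT_PLAIN,       /* all 16 clusters stored                           */
  UNIT_INVALID      /* counts do not add up to one unit                 */
} UNIT_KIND;

static UNIT_KIND ClassifyUnit (uint32_t StoredClusters, uint32_t SparseClusters)
{
  if (StoredClusters + SparseClusters != UNIT_CLUSTERS) {
    return UNIT_INVALID;
  }
  if (StoredClusters == 0) {
    return UNIT_SPARSE;
  }
  if (SparseClusters == 0) {
    return UNIT_PLAIN;
  }
  return UNIT_COMPRESSED;
}
```

For example, a unit with 9 stored clusters and 7 clusters without an Offset is classified as compressed.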
Example
Runlist: 21 14 00 01 11 10 18 11 05 15 01 27 11 20 05
Decode:
  0x14 at 0x100
  0x10 at + 0x18
  0x05 at + 0x15
  0x27 at none
  0x20 at + 0x05
Absolute LCNs:
  0x14 at 0x100
  0x10 at 0x118
  0x05 at 0x12D
  0x27 at none
  0x20 at 0x132
Regroup:
  0x10 at 0x100   Unit not compressed
  0x04 at 0x110   Unit not compressed
  0x0C at 0x118
  0x04 at 0x124   Compressed unit
  0x05 at 0x12D
  0x07 at none
  0x10 at none    Sparse unit
  0x10 at none    Sparse unit
  0x10 at 0x132   Unit not compressed
  0x10 at 0x142   Unit not compressed
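The "Decode" and "Absolute LCNs" steps of the example can be reproduced with a short sketch. RUN and DecodeRunlist are hypothetical names, not the RUNLIST handling of this driver. The sketch assumes the usual NTFS data run layout, which the example bytes are consistent with: the low nibble of the header byte gives the size of the Length field, the high nibble (F) gives the size of the Offset field, both fields are little-endian, and the Offset is a signed delta from the previous LCN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded run. */
typedef struct {
  uint64_t Length;    /* run length in clusters                      */
  int      IsSparse;  /* non-zero when the element has no Offset     */
  uint64_t Lcn;       /* absolute LCN, meaningful only if not sparse */
} RUN;

/* Decode up to MaxRuns runs; stops at a 0x00 header or at end of data.
   Returns the number of runs decoded. */
static size_t DecodeRunlist (const uint8_t *Data, size_t Size, RUN *Runs, size_t MaxRuns)
{
  size_t  Pos        = 0;
  size_t  Count      = 0;
  int64_t CurrentLcn = 0;

  assert (Data != NULL && Runs != NULL);

  while (Pos < Size && Data[Pos] != 0 && Count < MaxRuns) {
    unsigned LengthSize = Data[Pos] & 0x0F;
    unsigned OffsetSize = Data[Pos] >> 4;
    uint64_t Length     = 0;
    unsigned I;

    Pos++;
    if (LengthSize > 8 || OffsetSize > 8 || Pos + LengthSize + OffsetSize > Size) {
      break;  /* malformed or truncated element */
    }

    for (I = 0; I < LengthSize; I++) {
      Length |= (uint64_t)Data[Pos++] << (8 * I);
    }

    if (OffsetSize > 0) {
      uint64_t Raw = 0;

      for (I = 0; I < OffsetSize; I++) {
        Raw |= (uint64_t)Data[Pos++] << (8 * I);
      }
      /* Sign-extend the delta when its top bit is set. */
      if (OffsetSize < 8 && (Raw & ((uint64_t)1 << (8 * OffsetSize - 1))) != 0) {
        Raw |= ~(uint64_t)0 << (8 * OffsetSize);
      }
      CurrentLcn += (int64_t)Raw;
    }

    Runs[Count].Length   = Length;
    Runs[Count].IsSparse = (OffsetSize == 0);
    Runs[Count].Lcn      = (OffsetSize == 0) ? 0 : (uint64_t)CurrentLcn;
    Count++;
  }

  return Count;
}
```

Fed the 15 example bytes, this sketch yields five runs whose lengths and absolute LCNs match the "Absolute LCNs" list, with the 0x27-cluster run marked as sparse.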
Definition at line 376 of file Compression.c.
STATIC UINT64 mBufferSize
Definition at line 17 of file Compression.c.