OpenCore  1.0.4
OpenCore Bootloader
Compression.c File Reference
#include "NTFS.h"
#include "Helper.h"


Macros

#define IS_COMPRESSED_BLOCK   0x8000
 
#define BLOCK_LENGTH_BITS   0xFFF
 
#define UNIT_MASK   0xF
 

Functions

STATIC EFI_STATUS GetNextCluster (IN OUT COMPRESSED *Clusters)
 
STATIC EFI_STATUS GetDataRunByte (IN COMPRESSED *Clusters, OUT UINT8 *Result)
 
STATIC EFI_STATUS GetTwoDataRunBytes (IN COMPRESSED *Clusters, OUT UINT16 *Result)
 
STATIC EFI_STATUS DecompressBlock (IN COMPRESSED *Clusters, OUT UINT8 *Dest OPTIONAL)
 
STATIC EFI_STATUS ReadCompressedBlock (IN RUNLIST *Runlist, OUT UINT8 *Dest OPTIONAL, IN UINTN BlocksTotal)
 
EFI_STATUS Decompress (IN RUNLIST *Runlist, IN UINT64 Offset, IN UINTN Length, OUT UINT8 *Dest)
 

Variables

UINT64 mUnitSize
 
STATIC UINT64 mBufferSize
 

Detailed Description

Copyright (c) 2022, Mikhail Krichanov. All rights reserved. SPDX-License-Identifier: BSD-3-Clause

Functional and structural descriptions follow NTFS Documentation by Richard Russon and Yuval Fledel

Definition in file Compression.c.

Macro Definition Documentation

◆ BLOCK_LENGTH_BITS

#define BLOCK_LENGTH_BITS   0xFFF

Definition at line 13 of file Compression.c.

◆ IS_COMPRESSED_BLOCK

#define IS_COMPRESSED_BLOCK   0x8000

Definition at line 12 of file Compression.c.

◆ UNIT_MASK

#define UNIT_MASK   0xF

Definition at line 14 of file Compression.c.

Function Documentation

◆ Decompress()

EFI_STATUS Decompress ( IN RUNLIST * Runlist,
IN UINT64 Offset,
IN UINTN Length,
OUT UINT8 * Dest )

Definition at line 549 of file Compression.c.

◆ DecompressBlock()

STATIC EFI_STATUS DecompressBlock ( IN COMPRESSED * Clusters,
OUT UINT8 *Dest OPTIONAL )

The basic idea is that substrings of the block which have been seen before are compressed by referencing the string rather than mentioning it again.

For example, the text

#include <ntfs.h>\n
#include <stdio.h>\n

is compressed to

#include <ntfs.h>\n(-18,10)stdio(-17,4)

Pairs like (-18,10) are recorded in two bytes, so the shortest substring worth referencing is 3 bytes long, and one can subtract 3 from the length before encoding it.

The references are always backward and never 0, so one can store them as positive numbers and subtract one: (-18,10) -> (17,7); (-17,4) -> (16,1).

Given that a block is 4096 bytes in size, you might need 12 bits to encode the back reference. This means that you have only 4 bits left to encode the length.

Dynamic allocation of bits for the back-reference.

for (i = clear_pos - 1, lmask = 0xFFF, dshift = 12; i >= 0x10; i >>= 1) {
  lmask >>= 1;  // bit mask for length
  dshift--;     // shift width for delta
}
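The bit split above can be applied to a 16-bit token as in the following minimal sketch. The names BACK_REF and DecodeBackRef are hypothetical and do not appear in Compression.c; the sketch also assumes that clear_pos is the number of bytes already written to the uncompressed block, and that the stored values are (delta - 1) and (length - 3) as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, not part of Compression.c. */
typedef struct {
  uint16_t Delta;   /* distance back from the current position, >= 1 */
  uint16_t Length;  /* number of bytes to copy, >= 3 */
} BACK_REF;

/*
 * Splits a 2-byte token into (delta, length).
 * ClearPos is assumed to be the count of bytes already decompressed
 * in the current 4096-byte block.
 */
static BACK_REF DecodeBackRef(uint16_t Token, uint32_t ClearPos) {
  uint16_t lmask  = 0xFFF;  /* bit mask for length */
  unsigned dshift = 12;     /* shift width for delta */
  uint32_t i;
  BACK_REF Ref;

  /* A back-reference needs at least one byte of history. */
  assert(ClearPos >= 1 && ClearPos <= 4096);

  /* The further into the block, the more bits the delta needs. */
  for (i = ClearPos - 1; i >= 0x10; i >>= 1) {
    lmask >>= 1;
    dshift--;
  }

  Ref.Delta  = (uint16_t)((Token >> dshift) + 1);  /* stored as delta - 1 */
  Ref.Length = (uint16_t)((Token & lmask) + 3);    /* stored as length - 3 */
  return Ref;
}
```

At position 18 the loop leaves 11 bits for the length and 5 for the delta, so the pair (17,7) is the token 0x8807 and DecodeBackRef(0x8807, 18) yields a delta of 18 and a length of 10. At position 4096 the split becomes 12 bits of delta and 4 bits of length.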

Now that we can encode an (offset, length) pair as 2 bytes, we still have to know whether a token is a back-reference or plain text. This takes 1 bit per token. Tokens are grouped in sets of 8, and each group is preceded by a tags byte.

">\n(-18,10)stdio" would be encoded as "00000100" (the 1 bit indicates the back reference).

"00000000"#include"00000000" <ntfs.h"00000100">
(17,7)stdio"00000001"(16,1)
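A token stream laid out this way could be decoded roughly as follows. This is a sketch under stated assumptions, not the DecompressBlock() implementation: the function name DecodeTokens is hypothetical, the tags byte is assumed to be consumed from its least significant bit (which matches the "00000100" example, where the third token is the back-reference), and the 2-byte tokens are assumed to be stored little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decodes a stream of tag groups (hypothetical helper).
 * Returns the number of bytes written to Dst, or -1 on malformed input.
 */
static long DecodeTokens(const uint8_t *Src, size_t SrcLen,
                         uint8_t *Dst, size_t DstCap) {
  size_t in = 0, out = 0;

  assert(Src != NULL && Dst != NULL);

  while (in < SrcLen) {
    uint8_t tags = Src[in++];  /* one flag bit per following token */
    int bit;

    for (bit = 0; bit < 8 && in < SrcLen; bit++) {
      if ((tags >> bit) & 1) {
        /* Back-reference: 2-byte token, split depends on position. */
        uint16_t token, lmask = 0xFFF;
        unsigned dshift = 12;
        size_t i, delta, length;

        if (in + 2 > SrcLen || out == 0) return -1;
        token = (uint16_t)(Src[in] | (Src[in + 1] << 8));
        in += 2;

        for (i = out - 1; i >= 0x10; i >>= 1) {
          lmask >>= 1;
          dshift--;
        }
        delta  = (size_t)(token >> dshift) + 1;
        length = (size_t)(token & lmask) + 3;
        if (delta > out || length > DstCap - out) return -1;

        /* Copy byte by byte: source and destination may overlap. */
        while (length-- > 0) {
          Dst[out] = Dst[out - delta];
          out++;
        }
      } else {
        /* Plain text: copy one literal byte. */
        if (out >= DstCap) return -1;
        Dst[out++] = Src[in++];
      }
    }
  }
  return (long)out;
}
```

Feeding it the first three groups of the example (tags 0x00 with "#include", tags 0x00 with " <ntfs.h", tags 0x04 with ">", "\n", the token bytes 07 88 and "stdio") produces the 33 bytes "#include <ntfs.h>\n#include <stdio".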

As a compression unit consists of 16 clusters (default), it usually contains more than one of these blocks. If you want to access the second block, it would be a waste of time to decompress the first one. Instead, each block is preceded by a 2-byte length. The lower 12 bits are the length, the higher 4 bits are of unknown purpose. Actually, (n-1) is stored in the low 12 bits.

The compression method is based on independently compressing blocks of X clusters, where X = 2 ^ ATTR_HEADER_NONRES.CompressionUnitSize.

If the block grows in size, it will be stored uncompressed. A length of exactly 4095 is used to indicate this case.

Bit 0x8000 is the flag specifying that the block is compressed.

Definition at line 162 of file Compression.c.

◆ GetDataRunByte()

STATIC EFI_STATUS GetDataRunByte ( IN COMPRESSED * Clusters,
OUT UINT8 * Result )

Definition at line 58 of file Compression.c.

◆ GetNextCluster()

STATIC EFI_STATUS GetNextCluster ( IN OUT COMPRESSED * Clusters)

Definition at line 21 of file Compression.c.

◆ GetTwoDataRunBytes()

STATIC EFI_STATUS GetTwoDataRunBytes ( IN COMPRESSED * Clusters,
OUT UINT16 * Result )

Definition at line 82 of file Compression.c.

◆ ReadCompressedBlock()

STATIC EFI_STATUS ReadCompressedBlock ( IN RUNLIST * Runlist,
OUT UINT8 *Dest OPTIONAL,
IN UINTN BlocksTotal )
The set of VCNs containing the stream of a compressed file attribute is divided
into compression units (also called chunks) of 16 clusters.

The alpha stage: if all the 16 clusters of a compression unit are full of zeroes,
this compression unit is called a Sparse unit and is not physically stored.
Instead, an element with no Offset field (F=0, the Offset is assumed to be 0 too)
and a Length of 16 clusters is put in the Runlist.

The beta stage: if the compression of the unit is possible,
N (< 16) clusters are physically stored, and an element with a Length of N
is put in the Runlist, followed by another element with no Offset field
(F=0, the Offset is assumed to be 0 too) and a Length of 16 - N.

If the unit is not compressed, 16 clusters are physically stored,
and an element with a Length of 16 is put in the Runlist.
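The three cases above can be told apart by counting, for each 16-cluster unit, how many clusters are physically stored and how many come from elements with no Offset field. The sketch below illustrates that idea only; RUN, UNIT_KIND and ClassifyUnits are hypothetical names, and it does not check that the sparse part of a compressed unit comes last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNIT_CLUSTERS 16u

/* Hypothetical decoded runlist element. */
typedef struct {
  uint64_t Length;    /* run length in clusters */
  int      IsSparse;  /* non-zero when the element has no Offset field */
} RUN;

typedef enum { UNIT_SPARSE, UNIT_COMPRESSED, UNIT_UNCOMPRESSED } UNIT_KIND;

/* Classifies consecutive units; returns the number of units written. */
static size_t ClassifyUnits(const RUN *Runs, size_t RunCount,
                            UNIT_KIND *Kinds, size_t MaxUnits) {
  size_t   run = 0, units = 0;
  uint64_t left;

  assert(Runs != NULL && Kinds != NULL);
  left = RunCount > 0 ? Runs[0].Length : 0;

  while (run < RunCount && units < MaxUnits) {
    uint64_t stored = 0, sparse = 0;

    /* Take clusters from successive runs until the unit is full. */
    while (stored + sparse < UNIT_CLUSTERS && run < RunCount) {
      uint64_t need = UNIT_CLUSTERS - (stored + sparse);
      uint64_t take = left < need ? left : need;

      if (Runs[run].IsSparse) sparse += take; else stored += take;
      left -= take;
      if (left == 0 && ++run < RunCount) left = Runs[run].Length;
    }
    if (stored + sparse < UNIT_CLUSTERS) break;  /* trailing partial unit */

    Kinds[units++] = stored == 0 ? UNIT_SPARSE
                   : sparse == 0 ? UNIT_UNCOMPRESSED
                                 : UNIT_COMPRESSED;
  }
  return units;
}
```

For runs of 0x14, 0x10 and 0x05 stored clusters, followed by 0x27 sparse clusters and 0x20 stored clusters, this yields seven units: two uncompressed, one compressed, two sparse and two uncompressed.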

Example

Runlist: 21 14 00 01 11 10 18 11 05 15 01 27 11 20 05

Decode
  0x14 at 0x100
  0x10 at + 0x18
  0x05 at + 0x15
  0x27 at none
  0x20 at + 0x05

Absolute LCNs
  0x14 at 0x100
  0x10 at 0x118
  0x05 at 0x12D
  0x27 at none
  0x20 at 0x132

Regroup
  0x10 at 0x100   Unit not compressed

  0x04 at 0x110   Unit not compressed
  0x0C at 0x118

  0x04 at 0x124   Compressed unit
  0x05 at 0x12D
  0x07 at none

  0x10 at none    Sparse unit

  0x10 at none    Sparse unit

  0x10 at 0x132   Unit not compressed

  0x10 at 0x142   Unit not compressed
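The Decode and Absolute LCNs steps of the example can be reproduced with a small parser such as the one below. It is a sketch, not the code used by ReadCompressedBlock(): DATA_RUN and DecodeRunlist are hypothetical names, and it assumes the usual NTFS data run layout, in which the low nibble of each header byte gives the size of the Length field, the high nibble gives the size of the Offset field (F), both fields are little-endian, and the Offset is a signed delta from the previous LCN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded element with an absolute LCN. */
typedef struct {
  uint64_t Length;    /* run length in clusters */
  uint64_t Lcn;       /* absolute LCN, 0 when IsSparse is set */
  int      IsSparse;  /* non-zero when the Offset field is absent (F=0) */
} DATA_RUN;

/* Returns the number of runs decoded, or -1 on malformed input. */
static long DecodeRunlist(const uint8_t *Bytes, size_t Size,
                          DATA_RUN *Runs, size_t MaxRuns) {
  size_t  pos = 0, count = 0;
  int64_t lcn = 0;

  assert(Bytes != NULL && Runs != NULL);

  /* A zero header byte terminates the runlist. */
  while (pos < Size && Bytes[pos] != 0 && count < MaxRuns) {
    unsigned lsize  = Bytes[pos] & 0x0F;  /* size of the Length field */
    unsigned fsize  = Bytes[pos] >> 4;    /* size of the Offset field */
    uint64_t length = 0, raw = 0;
    unsigned i;

    pos++;
    if (lsize == 0 || lsize > 8 || fsize > 8 || pos + lsize + fsize > Size) {
      return -1;
    }
    for (i = 0; i < lsize; i++) length |= (uint64_t)Bytes[pos + i] << (8 * i);
    pos += lsize;

    if (fsize > 0) {
      for (i = 0; i < fsize; i++) raw |= (uint64_t)Bytes[pos + i] << (8 * i);
      /* Sign-extend the delta when its top bit is set. */
      if (fsize < 8 && (raw & ((uint64_t)1 << (8 * fsize - 1)))) {
        raw |= ~(uint64_t)0 << (8 * fsize);
      }
      lcn += (int64_t)raw;
      pos += fsize;
    }
    Runs[count].Length   = length;
    Runs[count].IsSparse = (fsize == 0);
    Runs[count].Lcn      = (fsize == 0) ? 0 : (uint64_t)lcn;
    count++;
  }
  return (long)count;
}
```

Applied to the 15 bytes of the example runlist, this returns five runs whose lengths and absolute LCNs match the Absolute LCNs list, with the 0x27-cluster run marked as sparse.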

Definition at line 376 of file Compression.c.

Variable Documentation

◆ mBufferSize

STATIC UINT64 mBufferSize

Definition at line 17 of file Compression.c.

◆ mUnitSize

UINT64 mUnitSize
extern

Definition at line 12 of file Data.c.