LZMA SDK 18.05↩
--------------↩
LZMA SDK provides the documentation, samples, header files,↩
libraries, and tools you need to develop applications that ↩
use 7z / LZMA / LZMA2 / XZ compression.↩
LZMA is an improved version of the well-known LZ77 compression algorithm.↩
It was improved to maximize the compression ratio while keeping↩
decompression fast and the memory requirements for decompression low.↩
LZMA2 is an LZMA-based compression method. LZMA2 provides better↩
multithreading support for compression than LZMA, plus some other improvements.↩
7z is a file format for data compression and file archiving.↩
7z is the main file format of the 7-Zip compression program (www.7-zip.org).↩
The 7z format supports different compression methods: LZMA, LZMA2 and others.↩
7z also supports AES-256 based encryption.↩
XZ is a file format for data compression that uses LZMA2 compression.↩
The XZ format provides additional features: SHA/CRC check, filters for↩
improved compression ratio, and splitting into blocks and streams.↩
LICENSE↩
-------↩
LZMA SDK is written and placed in the public domain by Igor Pavlov.↩
Some code in LZMA SDK is based on public domain code from other developers:↩
1) PPMd var.H (2001): Dmitry Shkarin↩
2) SHA-256: Wei Dai (Crypto++ library)↩
Anyone is free to copy, modify, publish, use, compile, sell, or distribute the ↩
original LZMA SDK code, either in source code form or as a compiled binary, for ↩
any purpose, commercial or non-commercial, and by any means.↩
LZMA SDK code is compatible with open source licenses; for example, you can↩
include it in GNU GPL or GNU LGPL code.↩
LZMA SDK Contents↩
-----------------↩
Source code:↩
- C / C++ / C# / Java - LZMA compression and decompression↩
- C / C++ - LZMA2 compression and decompression↩
- C / C++ - XZ compression and decompression↩
- C - 7z decompression↩
- C++ - 7z compression and decompression↩
- C - small SFXs for installers (7z decompression)↩
- C++ - SFXs and SFXs for installers (7z decompression)↩
Precompiled binaries:↩
- console programs for lzma / 7z / xz compression and decompression↩
- SFX modules for installers.↩
UNIX/Linux version ↩
------------------↩
To compile the C++ version of the file->file LZMA encoder, go to the directory↩
  CPP/7zip/Bundles/LzmaCon↩
and call make to recompile it:↩
  make -f makefile.gcc clean all↩
On some UNIX/Linux versions you must compile LZMA with static libraries.↩
To compile with static libraries, you can use↩
  LIB = -lm -static↩
You can also use p7zip (a port of 7-Zip for POSIX systems such as Unix or Linux):↩
Files↩
-----↩
DOC/7zC.txt - 7z ANSI-C Decoder description↩
DOC/7zFormat.txt - 7z Format description↩
DOC/installer.txt - information about 7-Zip for installers↩
DOC/lzma.txt - LZMA compression description↩
DOC/lzma-sdk.txt - LZMA SDK description (this file)↩
DOC/lzma-history.txt - history of LZMA SDK↩
DOC/lzma-specification.txt - Specification of LZMA↩
DOC/Methods.txt - Compression method IDs for .7z↩
bin/installer/ - example script to create an installer that uses an SFX module↩
bin/7zdec.exe - simplified 7z archive decoder↩
bin/7zr.exe - 7-Zip console program (reduced version)↩
bin/x64/7zr.exe - 7-Zip console program (reduced version) (x64 version)↩
bin/lzma.exe - file->file LZMA encoder/decoder for Windows↩
bin/7zS2.sfx - small SFX module for installers (GUI version)↩
bin/7zS2con.sfx - small SFX module for installers (Console version)↩
bin/7zSD.sfx - SFX module for installers.↩
7zDec.exe↩
---------↩
7zDec.exe is a simplified 7z archive decoder.↩
It supports only the LZMA, LZMA2, and PPMd methods.↩
7zDec decodes a whole solid block from a 7z archive to RAM,↩
so the RAM consumption can be high.↩
Source code structure↩
---------------------↩
Asm/ - asm files (optimized code for CRC calculation and Intel-AES encryption)↩
C/ - C files (compression / decompression and other)↩
  Util/↩
    7z - 7z decoder program (decoding 7z files)↩
    Lzma - LZMA program (file->file LZMA encoder/decoder)↩
    LzmaLib - LZMA library (.DLL for Windows)↩
    SfxSetup - small SFX module for installers↩
CPP/ - C++ files↩
  Common - common files for C++ projects↩
  Windows - common files for Windows related code↩
  7zip - files related to 7-Zip↩
    Archive - files related to archiving↩
      Common - common files for archive handling↩
      7z - 7z C++ Encoder/Decoder↩
    Bundles - Modules that are bundles of other modules (files)↩
      Alone7z - 7zr.exe: Standalone 7-Zip console program (reduced version)↩
      Format7zExtractR - 7zxr.dll: Reduced version of 7z DLL: extracting from 7z/LZMA/BCJ/BCJ2↩
      Format7zR - 7zr.dll: Reduced version of 7z DLL: extracting/compressing to 7z/LZMA/BCJ/BCJ2↩
      LzmaCon - lzma.exe: LZMA compression/decompression↩
      LzmaSpec - example code for LZMA Specification↩
      SFXCon - 7zCon.sfx: Console 7z SFX module↩
      SFXSetup - 7zS.sfx: 7z SFX module for installers↩
      SFXWin - 7z.sfx: GUI 7z SFX module↩
    Common - common files for 7-Zip↩
    Compress - files for compression/decompression↩
    Crypto - files for encryption / decryption↩
    UI - User Interface files↩
      Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll↩
      Common - Common UI files↩
      Console - Code for console program (7z.exe)↩
      Explorer - Some code from 7-Zip Shell extension↩
      FileManager - Some GUI code from 7-Zip File Manager↩
      GUI - Some GUI code from 7-Zip↩
CS/ - C# files↩
  7zip↩
    Common - some common files for 7-Zip↩
    Compress - files related to compression/decompression↩
      LZ - files related to LZ (Lempel-Ziv) compression algorithm↩
      LZMA - LZMA compression/decompression↩
      LzmaAlone - file->file LZMA compression/decompression↩
      RangeCoder - Range Coder (special code of compression/decompression)↩
Java/ - Java files↩
  SevenZip↩
    Compression - files related to compression/decompression↩
      LZ - files related to LZ (Lempel-Ziv) compression algorithm↩
      LZMA - LZMA compression/decompression↩
      RangeCoder - Range Coder (special code of compression/decompression)↩
Note: ↩
Asm / C / C++ source code of LZMA SDK is part of 7-Zip's source code.↩
7-Zip's source code can be downloaded from 7-Zip's SourceForge page:↩
LZMA features↩
-------------↩
- Variable dictionary size (up to 1 GB)↩
- Estimated compressing speed: about 2 MB/s on 2 GHz CPU↩
- Estimated decompressing speed:↩
    - 20-30 MB/s on modern 2 GHz CPU↩
    - 1-2 MB/s on 200 MHz simple RISC CPU (ARM, MIPS, PowerPC)↩
- Small memory requirements for decompressing (16 KB + DictionarySize)↩
- Small code size for decompressing: 5-8 KB↩
The LZMA decoder uses only integer operations and can be↩
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).↩
Some critical operations that affect the speed of LZMA decompression:↩
  1) 32*16 bit integer multiply↩
  2) Mispredicted branches (penalty mostly depends on pipeline length)↩
  3) 32-bit shift and arithmetic operations↩
The speed of LZMA decompression mostly depends on CPU speed.↩
Memory speed matters little. But if your CPU has a small data cache,↩
the overall weight of memory speed will increase slightly.↩
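The three operations listed above all occur in the innermost step of the
range decoder, which decodes one bit at a time. The Python sketch below
follows the bit-decoding routine described in DOC/lzma-specification.txt.
The class and constant names are illustrative and are not part of the SDK,
and padding a short input with zero bytes is a simplification made here.
It only shows where the multiply, the branch and the shifts happen; it is
not a usable decoder.

```python
# Sketch of LZMA range-decoder bit decoding (illustrative names only).
NUM_BIT_MODEL_TOTAL_BITS = 11   # probabilities are 11-bit values
NUM_MOVE_BITS = 5               # adaptation speed of a probability
TOP_VALUE = 1 << 24             # renormalize when range drops below this
PROB_INIT = (1 << NUM_BIT_MODEL_TOTAL_BITS) // 2


class RangeDecoder:
    def __init__(self, data: bytes):
        # The first byte of the range-coded stream is 0; the next four
        # bytes initialise the 32-bit code value (big-endian).
        self.data = data
        self.pos = 5
        self.range = 0xFFFFFFFF
        self.code = int.from_bytes(data[1:5], "big")

    def _next_byte(self) -> int:
        # Simplification for this sketch: pad with zeros past the end.
        byte = self.data[self.pos] if self.pos < len(self.data) else 0
        self.pos += 1
        return byte

    def decode_bit(self, prob: int):
        """Decode one bit; return (bit, updated_probability)."""
        # 1) 32*16-bit multiply: 21-bit value times 11-bit probability.
        bound = (self.range >> NUM_BIT_MODEL_TOTAL_BITS) * prob
        # 2) Data-dependent branch, which the CPU may mispredict.
        if self.code < bound:
            self.range = bound
            prob += ((1 << NUM_BIT_MODEL_TOTAL_BITS) - prob) >> NUM_MOVE_BITS
            bit = 0
        else:
            self.range -= bound
            self.code -= bound
            prob -= prob >> NUM_MOVE_BITS
            bit = 1
        # 3) 32-bit shifts during normalization.
        if self.range < TOP_VALUE:
            self.range = (self.range << 8) & 0xFFFFFFFF
            self.code = ((self.code << 8) | self._next_byte()) & 0xFFFFFFFF
        return bit, prob


rc = RangeDecoder(b"\x00\x80\x00\x00\x00")
print(rc.decode_bit(PROB_INIT))   # prints (1, 992)
```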
How To Use↩
----------↩
Using LZMA encoder/decoder executable↩
--------------------------------------↩
Usage: LZMA <e|d> inputFile outputFile [<switches>...]↩
  e: encode file↩
  d: decode file↩
  b: Benchmark. There are two tests: compressing and decompressing↩
     with the LZMA method. The benchmark shows a rating in MIPS (million↩
     instructions per second). The rating value is calculated from the↩
     measured speed and is normalized with Intel's Core 2 results.↩
     The benchmark also checks for possible hardware errors (RAM↩
     errors in most cases). The benchmark uses these settings:↩
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.↩
     You can also change the number of iterations. Example for 30 iterations:↩
       LZMA b 30↩
     The default number of iterations is 10.↩
<Switches>↩
  -a{N}:  set compression mode: 0 = fast, 1 = normal↩
          default: 1 (normal)↩
  -d{N}:  set dictionary size - [0, 30], default: 23 (8 MB)↩
          The maximum value for dictionary size is 1 GB = 2^30 bytes.↩
          Dictionary size is calculated as DictionarySize = 2^N bytes.↩
          To decompress a file compressed by the LZMA method with dictionary↩
          size D = 2^N you need about D bytes of memory (RAM).↩
  -fb{N}: set number of fast bytes - [5, 273], default: 128↩
          Usually a bigger number gives a slightly better compression ratio↩
          and a slower compression process.↩
  -lc{N}: set number of literal context bits - [0, 8], default: 3↩
          Sometimes lc=4 gives a gain for big files.↩
  -lp{N}: set number of literal pos bits - [0, 4], default: 0↩
          The lp switch is intended for periodical data when the period is↩
          equal to 2^N. For example, for 32-bit (4 bytes)↩
          periodical data you can use lp=2. It is often better to set lc0↩
          if you change the lp switch.↩
  -pb{N}: set number of pos bits - [0, 4], default: 2↩
          The pb switch is intended for periodical data↩
          when the period is equal to 2^N.↩
  -mf{MF_ID}: set Match Finder. Default: bt4.↩
          Algorithms from the hc* group don't provide a good compression↩
          ratio, but they often work pretty fast in combination with↩
          fast mode (-a0).↩
          Memory requirements depend on the dictionary size↩
          (parameter "d" in the table below).↩
            MF_ID   Memory            Description↩
            bt2     d *  9.5 + 4MB    Binary Tree with 2 bytes hashing.↩
            bt3     d * 11.5 + 4MB    Binary Tree with 3 bytes hashing.↩
            bt4     d * 11.5 + 4MB    Binary Tree with 4 bytes hashing.↩
            hc4     d *  7.5 + 4MB    Hash Chain with 4 bytes hashing.↩
  -eos:   write End Of Stream marker. By default LZMA doesn't write the↩
          eos marker, since the LZMA decoder knows the uncompressed size↩
          stored in the .lzma file header.↩
  -si:    Read data from stdin (it will write End Of Stream marker).↩
  -so:    Write data to stdout↩
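The memory figures quoted in this file can be turned into a quick estimate
before choosing -d and -mf. The Python sketch below only restates the
formulas given above (DictionarySize = 2^N, the match finder table, and
the "16 KB + DictionarySize" decoder figure from the feature list). The
function names are made up for this example and the results are rough
estimates, not measurements.

```python
# Rough memory estimates from the formulas in this document.
KB = 1 << 10
MB = 1 << 20

# Multiplier of the dictionary size "d" for each match finder (-mf table).
MATCH_FINDER_FACTOR = {"bt2": 9.5, "bt3": 11.5, "bt4": 11.5, "hc4": 7.5}


def dictionary_size(n: int) -> int:
    """DictionarySize = 2^N bytes for the -d{N} switch, N in [0, 30]."""
    if not 0 <= n <= 30:
        raise ValueError("-d{N} accepts N in [0, 30]")
    return 1 << n


def encoder_memory(mf_id: str, n: int) -> int:
    """Estimated encoder memory: d * factor + 4 MB."""
    return int(dictionary_size(n) * MATCH_FINDER_FACTOR[mf_id]) + 4 * MB


def decoder_memory(n: int) -> int:
    """Estimated decoder memory: 16 KB + DictionarySize."""
    return 16 * KB + dictionary_size(n)


for mf in ("bt4", "hc4"):
    print(mf, "-d23 encoder:", encoder_memory(mf, 23) // MB, "MB")
print("-d16 decoder:", decoder_memory(16) // KB, "KB")
```

With the default settings (-d23, bt4) this gives about 96 MB for the
encoder, while a decoder for a -d16 stream needs about 80 KB.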
Examples:↩
1) LZMA e file.bin file.lzma -d16 -lc0↩
   compresses file.bin to file.lzma with a 64 KB dictionary (2^16 = 64K)↩
   and 0 literal context bits. -lc0 reduces the memory requirements↩
   for decompression.↩
2) LZMA e file.bin file.lzma -lc0 -lp2↩
   compresses file.bin to file.lzma with settings suitable↩
   for 32-bit periodical data (for example, ARM or MIPS code).↩
3) LZMA d file.lzma file.bin↩
   decompresses file.lzma to file.bin.↩
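The same settings can be tried without the lzma executable. The sketch
below uses Python's standard lzma module, which is built on liblzma from
XZ Utils and not on the LZMA SDK, so treat it as an approximation of the
commands above and not as the SDK's own API. It assumes liblzma's limits
(dictionary of at least 4 KB, lc + lp <= 4), which are narrower than the
ranges listed for the switches. The helper names are invented for this
example. It also reads back the 13-byte .lzma header (properties byte,
dictionary size, uncompressed size) to show what the decoder learns
before decoding starts.

```python
import lzma
import struct


def compress_alone(data: bytes, dict_bits: int,
                   lc: int = 3, lp: int = 0, pb: int = 2) -> bytes:
    """Roughly: LZMA e in out -d{dict_bits} -lc{lc} -lp{lp} -pb{pb}"""
    lzma1 = {
        "id": lzma.FILTER_LZMA1,
        "dict_size": 1 << dict_bits,   # DictionarySize = 2^N bytes
        "lc": lc,
        "lp": lp,
        "pb": pb,
    }
    # FORMAT_ALONE is the legacy .lzma container.
    return lzma.compress(data, format=lzma.FORMAT_ALONE, filters=[lzma1])


def parse_lzma_header(blob: bytes) -> dict:
    """Decode the 13-byte .lzma header."""
    props, dict_size, size = struct.unpack("<BIQ", blob[:13])
    lc, rest = props % 9, props // 9   # props = (pb * 5 + lp) * 9 + lc
    lp, pb = rest % 5, rest // 5
    # All bits set means "size unknown": the stream then ends with an
    # End Of Stream marker, as with the -eos and -si switches.
    unknown = size == 0xFFFFFFFFFFFFFFFF
    return {"lc": lc, "lp": lp, "pb": pb, "dict_size": dict_size,
            "unpacked_size": None if unknown else size}


# 32-bit periodical sample data: a counter stored as 4-byte words.
sample = b"".join(struct.pack("<I", i * 4) for i in range(20000))

blob = compress_alone(sample, dict_bits=16, lc=0, lp=2)  # -d16 -lc0 -lp2
header = parse_lzma_header(blob)
restored = lzma.decompress(blob)       # like: LZMA d file.lzma file.bin
print(header, len(sample), "->", len(blob))
```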
Compression ratio hints↩
-----------------------↩
Recommendations↩
---------------↩
To increase the compression ratio for LZMA compression, it's desirable↩
to have aligned data (if possible), and also to arrange the data so that↩
code is grouped in one place and data is grouped in another↩
(this is better than mixing them: code, data, code, data, ...).↩
Filters↩
-------↩
You can increase the compression ratio for some data types by using↩
special filters before compressing. For example, it's possible to↩
increase the compression ratio by 5-10% for code for these CPU ISAs:↩
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.↩
You can find the C source code of these filters in the C/Bra*.* files.↩
You can check the compression ratio gain of these filters with the following↩
7-Zip commands (example for ARM code):↩
No filter:↩
  7z a a1.7z a.bin -m0=lzma↩
With filter for little-endian ARM code:↩
  7z a a2.7z a.bin -m0=arm -m1=lzma↩
It works in this manner:↩
  Compressing   = Filter_encoding + LZMA_encoding↩
  Decompressing = LZMA_decoding + Filter_decoding↩
The compressing and decompressing speed of these filters is very high,↩
so they will not increase decompressing time much.↩
Moreover, filtering reduces the time spent in LZMA_decoding,↩
since the compression ratio with filtering is higher.↩
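A comparable experiment can be run from Python as a rough check of the
"filter, then LZMA" chain. This sketch is an assumption-laden stand-in
for the 7z commands: it uses Python's standard lzma module (liblzma, XZ
container, LZMA2) and not 7-Zip, and it compresses synthetic data made
of ARM BL instructions that call a handful of targets, so the sizes it
prints say nothing about real programs. The generator function is
invented for this example.

```python
import lzma
import random
import struct


def synthetic_arm_code(n_words: int = 8192, seed: int = 0) -> bytes:
    """Fake little-endian ARM code: every other word is a BL (0xEB......)
    to one of 8 targets; the rest come from 16 fixed non-call words."""
    rng = random.Random(seed)
    targets = [rng.randrange(n_words) * 4 for _ in range(8)]
    others = [0xE1A00000 | rng.getrandbits(16) for _ in range(16)]
    words = []
    for i in range(n_words):
        pos = 4 * i
        if i % 2 == 0:
            # BL stores a word offset relative to the instruction + 8.
            rel = ((rng.choice(targets) - pos - 8) >> 2) & 0xFFFFFF
            words.append(0xEB000000 | rel)
        else:
            words.append(rng.choice(others))
    return struct.pack("<%dI" % n_words, *words)


LZMA2 = {"id": lzma.FILTER_LZMA2, "preset": 6}
ARM = {"id": lzma.FILTER_ARM}

code = synthetic_arm_code()
# Compressing = Filter_encoding + LZMA_encoding (filters applied in order).
plain = lzma.compress(code, format=lzma.FORMAT_XZ, filters=[LZMA2])
filtered = lzma.compress(code, format=lzma.FORMAT_XZ, filters=[ARM, LZMA2])
# Decompressing = LZMA_decoding + Filter_decoding (handled by the library).
restored = lzma.decompress(filtered)

print("input:", len(code), "no filter:", len(plain),
      "ARM filter:", len(filtered))
```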
These filters convert CALL (calling procedure) instructions↩
from relative offsets to absolute addresses, so such data becomes more↩
compressible.↩
For some ISAs (for example, MIPS) it's impossible to get a gain from such a filter.↩
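To see why the conversion helps, consider two calls to the same procedure
from different places: the relative offsets differ, but the absolute
target is the same. The Python sketch below applies this idea to
little-endian ARM BL instructions. It is a simplified illustration
written for this document, assuming that only words whose top byte is
0xEB (BL with condition "always") are converted; the real filters are in
the C/Bra*.* files and may differ in detail. The function names are
hypothetical.

```python
import struct

NOP = struct.pack("<I", 0xE1A00000)    # mov r0, r0


def bl(pos: int, target: int) -> bytes:
    """Encode an ARM 'BL target' instruction located at byte offset pos."""
    rel = ((target - pos - 8) >> 2) & 0xFFFFFF
    return struct.pack("<I", 0xEB000000 | rel)


def arm_filter(code: bytes, encode: bool, start: int = 0) -> bytes:
    """Convert BL operands relative -> absolute (encode) or back (decode)."""
    out = bytearray(code)
    for i in range(0, len(out) - 3, 4):
        if out[i + 3] != 0xEB:          # not a BL instruction: leave as is
            continue
        value = int.from_bytes(out[i:i + 3], "little") << 2
        pc = start + i + 8              # ARM branches are relative to pc + 8
        value = value + pc if encode else value - pc
        out[i:i + 3] = ((value >> 2) & 0xFFFFFF).to_bytes(3, "little")
    return bytes(out)


# Two calls to the same procedure at address 0x1000.
original = bl(0x00, 0x1000) + NOP + bl(0x08, 0x1000) + NOP
encoded = arm_filter(original, encode=True)
decoded = arm_filter(encoded, encode=False)

print(original[0:4].hex(), original[8:12].hex())   # relative: different
print(encoded[0:4].hex(), encoded[8:12].hex())     # absolute: identical
```

After encoding, both call instructions contain the same four bytes, which
gives the LZMA match finder a repeat to exploit.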
---↩