|
Lempel–Ziv–Storer–Szymanski (LZSS) is a lossless data compression algorithm, a derivative of LZ77, that was created in 1982 by James Storer and Thomas Szymanski. LZSS was described in the article "Data compression via textual substitution", published in ''Journal of the ACM'' (pp. 928–951).

LZSS is a dictionary encoding technique. It attempts to replace a string of symbols with a reference to a dictionary location of the same string.

The main difference between LZ77 and LZSS is that in LZ77 the dictionary reference could actually be longer than the string it was replacing. In LZSS, such a reference is omitted if the matched string is shorter than the "break even" point, and the symbols are emitted as literals instead. Furthermore, LZSS uses one-bit flags to indicate whether the next chunk of data is a literal (byte) or a reference to an offset/length pair.

== Example ==
Here is the beginning of Dr. Seuss's ''Green Eggs and Ham'', with character numbers at the beginning of lines for convenience.
This text takes 177 bytes in uncompressed form. Assuming a break-even point of 2 bytes (and thus 2-byte offset/length pairs) and one-byte newlines, this text compressed with LZSS becomes 94 bytes long:
Note: this does not include the 12 bytes of flags indicating whether the next chunk of text is a pointer or a literal. Adding it, the text becomes 106 bytes long, which is still shorter than the original 177 bytes.
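
The scheme described above (literals, offset/length references, a break-even threshold, and one flag bit per token) can be sketched in Python. This is a minimal illustration, not the method from the original paper: the names <code>lzss_encode</code>, <code>lzss_decode</code> and <code>encoded_size</code>, the window size, the maximum match length, and the brute-force greedy search are all assumptions chosen for readability. The break-even point of 2 and the 2-byte references follow the example's stated assumptions.

```python
from typing import List, Tuple, Union

# Illustrative parameters. WINDOW_SIZE and MAX_MATCH are assumptions;
# BREAK_EVEN = 2 mirrors the break-even point assumed in the example.
WINDOW_SIZE = 4096  # how far back a reference may point
MAX_MATCH = 18      # longest match a single reference may cover
BREAK_EVEN = 2      # matches shorter than this are emitted as literals

# A token is either a literal byte value or an (offset, length) reference.
Token = Union[int, Tuple[int, int]]


def lzss_encode(data: bytes) -> List[Token]:
    """Greedy tokenizer using a brute-force search of the sliding window."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(data):
        best_len, best_off = 0, 0
        for cand in range(max(0, pos - WINDOW_SIZE), pos):
            length = 0
            # A match may run past `pos` and overlap the text being encoded.
            while (length < MAX_MATCH
                   and pos + length < len(data)
                   and data[cand + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - cand
        if best_len >= BREAK_EVEN:
            tokens.append((best_off, best_len))  # flag bit: reference
            pos += best_len
        else:
            tokens.append(data[pos])             # flag bit: literal
            pos += 1
    return tokens


def lzss_decode(tokens: List[Token]) -> bytes:
    """Rebuild the original bytes from literals and references."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            offset, length = tok
            # Copy byte by byte so overlapping references work.
            for _ in range(length):
                out.append(out[-offset])
        else:
            out.append(tok)
    return bytes(out)


def encoded_size(tokens: List[Token]) -> Tuple[int, int]:
    """Return (payload bytes, flag bytes): 1 byte per literal,
    2 bytes per reference, and 1 flag bit per token rounded up to bytes."""
    payload = sum(2 if isinstance(t, tuple) else 1 for t in tokens)
    flag_bytes = (len(tokens) + 7) // 8
    return payload, flag_bytes


if __name__ == "__main__":
    sample = b"abcabcabcabc"
    toks = lzss_encode(sample)
    print(toks, encoded_size(toks))
    assert lzss_decode(toks) == sample
```

The size accounting mirrors the note above: the payload (literals and references) and the flag bits are counted separately, so the example's 94 payload bytes plus 12 flag bytes give 106 bytes in total. A practical encoder would normally replace the quadratic window search with a hash table or tree, and would pack the offset and length into fixed-width bit fields.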
|