A Matrix Data Structure to Organize Cumulative Frequencies for Efficiency of Adaptive Arithmetic Compression

Jyotika Doshi

A Matrix Data Structure to Organize Cumulative Frequencies for Efficiency of Adaptive Arithmetic Compression

Jyotika Doshi

Abstract

A novel data structure, namely Cumulative Frequency
Matrix (CFM), is proposed here for maintaining cumulative
frequencies used in adaptive arithmetic data compression method. Arithmetic coding is based on cumulative probability interval of symbols. It computes subintervals using cumulative frequency of a symbol while encoding and decoding. While decoding, it also needs to find subinterval in which a given value falls. Adaptive methods need to compute cumulative frequencies on fly. CFM is suggested here for efficiency of algorithms to operate on cumulative frequency at runtime. CFM is a 2-D array of 16 rows and 16 columns for order-0
model. In order-0 model, a symbol is of single byte and it has two nibbles; say L for left and R for right. In CFM, row corresponds to left nibble and column corresponds to right nibble. As each nibble is of four bits, row index and column index varies from 0 to 15. Within row L, it stores partial cumulative frequency of only those symbols whose
left nibble is L. Thus matrix element (L, R) contains the partial cumulative frequency of a symbol with right nibble as R among all the symbols with left nibble as L. This paper analyses algorithm forvarious operations needed in adaptive arithmetic coding. Operations are discussed here for initializing data structure, maintainin cumulative frequencies, computing subinterval, finding subinterval. CFM data structure is compared with Heap as described by Solomon
and Fenwick’s Binary Indexed Tree data structure. As compared with these related data structures in use, proposed data structure CFM is much simpler and efficient as well.

Keywords

Adaptive Arithmetic Coding, Algorithm Analysis, Cumulative Frequency Maintenance, Data Compression, Data Structure.

Full Text:

PDF

References

I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data

compression,” Commun. ACM, vol. 30, no. 6, pp. 520–540, 1987.

A.Moffat, “Linear time adaptive coding”, IEEE Trans Info. Theory, 36,

(2), 401-406 (1990)

D.W.Jones, Application of splay trees to data compression”, Comm ACM,

, (8), 996-1007 (1988)

Peter M. Fenwick, “A New Data Structure for Cumulative Frequency

Tables”, Software – Practice and Experience, Vol 24(3), 327-336 (March

W. Weaver and C.E. Shannon. The Mathematical Theory of

Communication. University of Illinois Press, Urbana, Illinois, 1949.

Republished in paperback 1963.

Eric Bodden, Malte Clasen, Joachim Kneis, “Arithmetic Coding

revealed-A guided tour from theory to praxis”, Sable Technical Report No.

-5, May 2007, available at

http://www.bodden.de/legacy/arithmetic-coding/

http://www.stringology.org/DataCompression/lexicon.html#adaptive_co

mpress_method

Paul G. Howard, Jeffrey S. Vitter, “Analysis of Arithmetic Coding for

Data Compression”, Information Processing and Management, Vol. 28,

No. 6, pp. 749-764 (1992)

David Solomon, “Data Compression – The Complete Reference”, 3rd

edition, Springer, 2004

Ian H. Witten, Alistair Moffat, Timothy C. Bell, “Managing

Gigabytes-Compressing and Indexing Documents and Images”, 2nd

edition, Morgan Kaufmann Publishers

Ida Mengyi Pu, Fundamental Data Compression,

Butterworth-Heinemann, 2006

P. G. Howard and J. S. Vitter, “Practical implementation of arithmetic

coding,” in Image and Text Compression, J. A. Storer, Ed. Norwell, MA:

Kluwer Academic, 1992, pp. 85–112.

A. Moffat, R. M. Neal, and I. H. Witten, “Arithmetic coding revisited,”

ACM Trans. Inf. Syst., vol. 16, no. 3, pp. 256–294, 1998.

Ranjan Bose, Saumitr Pathak, “A Novel Compression and Encryption

Scheme Using Variable Model Arithmetic Coding and Coupled Chaotic

System”, IEEE Transaction on Circuits and Systems – I: Regular Papers,

Vol. 53, no. 4, 2006, pp 848-857

Algorithm Tutorials available at http://community.topcoder.com

Amir Said, “Introduction to Arithmetic Coding - Theory and Practice”,

Hewlett-Packard Laboratories Report, HPL-2004-76, Palo Alto, CA,

April 2004 available at http://www.hpl.hp.com/techreports

Bin Zhu, En-hui Yang, Ahmed H. Tewfik, “Arithmetic Coding with Dual

Symbol Sets and Its Performance Analysis”, IEEE transactions on Image

Processing, Vol. 8, No. 12, 1999, pp 1667

Refbacks

There are currently no refbacks.

This work is licensed under a Creative Commons Attribution 3.0 License.

Username
Password
Remember me