
Assessment of Clustering Approaches for Gene Expression Data: A Survey

Monika Parmar, K.P. Merry


Bioinformatics is a frontier, interdisciplinary area of research. Its primary goal is to explore and interpret biological processes. Gene expression is the most fundamental level at which the genotype of an organism, the internal encoding of genetic information, gives rise to the phenotype, the outward physical manifestation of that information. The quantitative analysis of gene expression has become an integral part of most modern biological studies, ranging from pure academic research to drug discovery and healthcare. Clustering gene expression data can help identify natural structures and discover useful patterns in the data, and it is one of the most widely used approaches for exploring and analyzing such data. Clustering algorithms help in understanding gene function, gene regulation, cell subtypes, and other cellular functions. This survey covers various clustering algorithms for grouping gene expression data.


Bioinformatics, Gene Expression, Genotype, Phenotype, Clustering.







This work is licensed under a Creative Commons Attribution 3.0 License.