
Analysis of Biological Sequence by Data Mining

N. Senthil Vel Murugan, V. Vallinayagam, K. Senthamarai Kannan

Abstract


Data mining allows users to discover novel patterns in large volumes of data. Recent studies have examined individual structures, whereas this study focuses on sequential pattern mining, specifically sequential patterns extracted from gene data. The data were collected from GenBank and consist of DNA sequences from samples affected by liver cancer. The analysis suggests that an increase or decrease in protein and hormone levels contributes to liver cancer. The aim of this paper is to analyze these liver cancer DNA sequence data, to reduce the number of variables using Principal Component Analysis and Singular Value Decomposition, and to identify as quickly as possible, using similarity techniques, which proteins are affected. The results support the validity of the method.
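The dimension-reduction step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-mer count featurization, the cosine similarity measure, the function names, and the toy sequences are all assumptions made for illustration, since the abstract does not specify how sequences are encoded or which similarity technique is used. The sketch computes Principal Component Analysis through the Singular Value Decomposition of the centered feature matrix.

```python
import itertools
import numpy as np

def kmer_counts(seq, k=2):
    """Count overlapping k-mers over the alphabet ACGT (assumed featurization)."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip k-mers containing symbols outside ACGT
            counts[j] += 1
    return counts

def pca_via_svd(X, n_components):
    """Project the rows of X onto the leading principal components using SVD."""
    Xc = X - X.mean(axis=0)                      # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)        # share of variance per component
    return scores, explained[:n_components], Vt[:n_components]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (assumed similarity measure)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up toy sequences, not GenBank data.
sequences = ["ACGTACGTGG", "ACGTTCGTGA", "GGGCCCGGGC", "TTATATATAA"]
X = np.array([kmer_counts(s) for s in sequences])   # 4 samples x 16 variables
scores, explained, components = pca_via_svd(X, n_components=2)

print("reduced shape:", scores.shape)
print("similarity of samples 0 and 1:", cosine_similarity(X[0], X[1]))
```

Here each sequence is reduced from 16 count variables to 2 component scores. On real data the number of retained components would normally be chosen from the explained-variance shares.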

Keywords


Data Mining; Liver Cancer; Principal Component Analysis; Singular Value Decomposition; DNA






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.