
Data Mining Techniques for Library Professionals

L. Santhi, Dr. N. Radhakrishnan, Dr. B.S. Swaroop Rani


Data mining, or knowledge discovery, refers to the process of finding information in large repositories of data. The term data mining also refers to the step in the knowledge discovery process in which special algorithms are employed in the hope of identifying interesting patterns in the data. These patterns are then analyzed to yield knowledge. The desired outcome of data mining activities is to discover knowledge that is not explicit in the data, and to put that knowledge to use.

Librarians involved in digital libraries are already benefiting from data mining techniques as they explore ways to automatically classify information and explore new approaches for subject clustering. As the field grows, new applications for libraries are likely to evolve and it will be important for library administrators to have a basic understanding of the technology.

A wide variety of data mining techniques are also employed by industry and government. Many of these activities pose threats to personal privacy. Because librarians are ethically bound to ensure that individual privacy is safeguarded, they should monitor data mining activities and keep them firmly on their radar.

This paper is for library professionals who would like a better understanding of knowledge discovery and data mining techniques. It traces the historical development of this new discipline, explains specific data mining methods, and concludes that future work should focus on developing tools and techniques.


Keywords: Classification, Data Mining, Dependency, Knowledge Discovery, Patterns.







This work is licensed under a Creative Commons Attribution 3.0 License.