
Rainstorm Prediction using Support Vector Machine in Hadoop Cluster

C.P Shabariram

Abstract


Rainfall data are collected from hydrological records in order to predict storm warnings. This is a research problem because it involves processing a huge number of records held in a distributed system. This paper describes a novel solution that manages the data according to their spatio-temporal characteristics using the MapReduce framework. The workload is classified using a support vector machine (SVM). Various rainstorm prediction concepts are realized on the big raw rainfall data. The parameters that affect the dataset are classified into local, hourly, and overall storms. The proposed system serves as a tool for efficiently predicting rainstorms from a large amount of rainfall data. The results indicate that the proposed system improves performance in terms of both accuracy and efficiency.
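As an illustration of grouping rainfall records by their spatial and temporal keys in a map/reduce style, the following is a minimal single-process sketch in plain Python. It is not the paper's implementation and does not use Hadoop; the record layout (site identifier, timestamp string, rainfall depth in millimetres), the sample values, and all function names are assumptions made only for this example.

```python
from collections import defaultdict

# Hypothetical record layout (an assumption, not taken from the paper):
# (site_id, "YYYY-MM-DD HH:MM", rainfall_depth_mm)
RECORDS = [
    ("site_a", "2012-05-02 13:05", 1.5),
    ("site_a", "2012-05-02 13:40", 2.0),
    ("site_a", "2012-05-02 14:10", 0.5),
    ("site_b", "2012-05-02 13:20", 4.0),
]


def map_record(record):
    """Map step: emit ((site, hour), depth) for one raw record."""
    site_id, timestamp, depth_mm = record
    hour = timestamp[:13]  # keep "YYYY-MM-DD HH", dropping the minutes
    yield (site_id, hour), depth_mm


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: total rainfall depth for one (site, hour) key."""
    return key, sum(values)


def hourly_totals(records):
    """Run map, shuffle, and reduce in sequence over an iterable of records."""
    pairs = (pair for record in records for pair in map_record(record))
    return dict(reduce_group(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    for (site, hour), total in sorted(hourly_totals(RECORDS).items()):
        print(site, hour, total)
```

Keying on the pair (site, hour) is one simple way to express spatio-temporal grouping; on a real cluster the map and reduce functions would run in parallel over partitions of the input, and the per-key totals could then serve as input features for a classifier such as an SVM.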


Keywords


Storm Analysis, Map Reduce, Rainfall, Hydrological Data, Support Vector Machine.






Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.