Data Channel Integration in Hadoop Environments
Abstract
The field of distributed computing is growing and rapidly becoming a natural part of the IT processes of large as well as smaller enterprises. Driving this progress are the cost efficiency of distributed systems compared to centralized options, the physical limitations of single machines, and reliability concerns. There are frameworks within the field which aim to create a standardized platform to facilitate the development and implementation of distributed services and applications. Apache Hadoop is one of those projects. Hadoop is a framework for distributed processing and data storage. It supports many modules for different purposes, such as distributed database management, security, data streaming, and processing. In addition to offering storage much cheaper than traditional centralized relational databases, Hadoop supports powerful methods of handling very large amounts of data as it streams through and is stored on the system. These methods are widely used for all kinds of big data processing in large IT companies with a need for low-latency, high-throughput processing of the data. More and more companies are looking towards implementing Hadoop in their IT processes; one of them is Unomaly, a company which offers agnostic, proactive anomaly detection. The anomaly detection system analyses system logs to detect discrepancies, and it relies on large amounts of data to build an accurate image of the target system. Integration with Hadoop would make it possible to consume very large amounts of data as it is streamed to the Hadoop storage or other parts of the system.
In this degree paper, an integration layer application has been developed to allow Hadoop integration with Unomaly's system. Research has been conducted throughout the paper in order to determine the best way to implement the integration. The first part of the result is a proof-of-concept (PoC) application for a real-time data channel between Hadoop clusters and the Unomaly system. The second part is a recommendation on how the integration should be designed, based on the studies conducted in the paper.
This work is licensed under a Creative Commons Attribution 3.0 License.