
Benchmark Test on Cassandra Using YCSB

Sandeep Kaur, Amneet Kaur

Abstract


Big data systems make it possible to capture, store and analyze large volumes of data, but such systems face many challenges. Benchmarks are therefore needed to compare and evaluate them. Big data benchmarks generate system workloads and tests that evaluate a system's performance and provide meaningful results. In this paper, a benchmark test is performed on the Cassandra database using the Yahoo! Cloud Serving Benchmark (YCSB). Python is used to establish the connection with YCSB on the Windows operating system. First, a keyspace is created in Cassandra; then data is loaded and the workload is run in YCSB. The final results are reported as throughput, latency and run time.
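
The workflow described in the abstract (create a keyspace, load data, run the workload, then read off throughput, latency and run time) can be sketched roughly as follows. This is a minimal illustration and not the authors' scripts. It assumes the YCSB launcher is the Python script `bin/ycsb`, the `cassandra-cql` binding, the stock `workloads/workloada` file, a single local node, and the `ycsb` keyspace name that the binding uses by default. The helper names `ycsb_command` and `parse_ycsb_summary` are hypothetical, and the sample output contains made-up numbers, not results from the paper.

```python
# CQL for the keyspace step (assumption: single node, so replication_factor 1).
CREATE_KEYSPACE_CQL = (
    "CREATE KEYSPACE IF NOT EXISTS ycsb WITH REPLICATION = "
    "{'class': 'SimpleStrategy', 'replication_factor': 1};"
)


def ycsb_command(phase, binding="cassandra-cql",
                 workload="workloads/workloada", hosts="localhost"):
    """Build the argument list for one YCSB phase ('load' or 'run').

    Hypothetical helper: the list can be passed to subprocess.run() from
    the YCSB installation directory. All defaults are assumptions.
    """
    if phase not in ("load", "run"):
        raise ValueError("phase must be 'load' or 'run'")
    # The launcher is invoked through the Python interpreter, which also
    # works on Windows where the script cannot be executed directly.
    return ["python", "bin/ycsb", phase, binding,
            "-P", workload, "-p", f"hosts={hosts}"]


def parse_ycsb_summary(text):
    """Collect YCSB summary lines of the form '[SECTION], Metric, value'.

    Returns a dict keyed by (section, metric) with float values.
    Lines that do not match this three-field shape are skipped.
    """
    metrics = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue
        section, name, value = parts
        try:
            metrics[(section.strip("[]"), name)] = float(value)
        except ValueError:
            continue  # non-numeric value, not a summary metric
    return metrics


# Illustrative output only: the numbers below are invented for the example.
SAMPLE_OUTPUT = """\
[OVERALL], RunTime(ms), 2000
[OVERALL], Throughput(ops/sec), 500.0
[READ], Operations, 500
[READ], AverageLatency(us), 850.5
[UPDATE], AverageLatency(us), 910.25
"""

if __name__ == "__main__":
    print(CREATE_KEYSPACE_CQL)
    print(" ".join(ycsb_command("load")))
    print(" ".join(ycsb_command("run")))
    for key, value in parse_ycsb_summary(SAMPLE_OUTPUT).items():
        print(key, value)
```

Keeping the command construction separate from the output parsing means the parsing step can be checked against saved YCSB logs without a running Cassandra instance.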


Keywords


Big Data; YCSB; Maven; Cassandra; Database; Git; Benchmark; Throughput; Workload.





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.