
A Survey on Vectorization on Intel Xeon Phi Coprocessor: Technique for Performance Optimization

Niraj J. Tiwari, Pujashree S. Vidap, Pallavi G. Gavali

Abstract


In computer science, vectorization is the process of converting an algorithm from a scalar implementation, which processes one pair of operands at a time, to a vector implementation, which performs a single operation on all the operand pairs held in SIMD registers at once. This differs from task parallelism with MPI, OpenMP, or other parallel libraries, where additional cores or nodes are used to process data belonging to separate tasks placed on different cores or nodes. Vectorization adds a form of data parallelism to software. It is a well-known technique for performance optimization that makes full use of the parallel features provided by the hardware. In this paper we discuss two vectorization techniques: loop unrolling and two-way vectorization. The aim of this work is to survey the work done so far on vectorization and related techniques for performance optimization.
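As an illustration of the scalar-to-vector idea described in the abstract, the following is a minimal sketch in C, not code from the paper. The function names `add_scalar` and `add_unrolled4` and the unroll factor of 4 are assumptions chosen for the example. The first function is the scalar baseline. The second unrolls the loop so that each iteration contains four independent additions, which is the shape a compiler can map onto one SIMD instruction when the target supports it. Whether that actually happens depends on the compiler, its flags, and the hardware.

```c
#include <stddef.h>

/* Scalar baseline: one addition per loop iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/*
 * Loop unrolled by a factor of 4 (the factor is an assumption for this
 * sketch). The four statements in the main loop body do not depend on
 * each other, so they can be executed as one 4-wide vector addition.
 */
void add_unrolled4(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: handles elements in groups of four. */
    for (; i + 4 <= n; i += 4) {
        out[i]     = a[i]     + b[i];
        out[i + 1] = a[i + 1] + b[i + 1];
        out[i + 2] = a[i + 2] + b[i + 2];
        out[i + 3] = a[i + 3] + b[i + 3];
    }

    /* Remainder loop: handles the last n % 4 elements one at a time. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Both functions produce the same results for any length `n`; the remainder loop covers lengths that are not a multiple of the unroll factor. On hardware with wider registers, such as the 512-bit vector registers of the Xeon Phi coprocessor, which hold sixteen single-precision values, a larger unroll factor would be the natural choice.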

Keywords


Vectorization, SIMD, Parallelism





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.