Publication detail

Multi–GPU Implementation of Machine Learning Algorithm using CUDA and OpenCL

MAŠEK, J. BURGET, R. POVODA, L. DUTTA, M.

Type

journal article - other

Language

en

Original Abstract

Modern graphics processing units (GPUs) are well suited to complex, time-consuming computations and offer high-performance computing at a good price. This paper presents multi-GPU OpenCL and CUDA implementations of the k-Nearest Neighbor (k-NN) algorithm and compares their performance; each implementation is better suited to a different number of attributes. The proposed CUDA implementation achieves a speedup of up to 880x over a single-threaded CPU version. The standard k-NN algorithm was modified to run faster when a small number of neighbors k is set. Performance was verified on two dual-GPU NVIDIA GeForce GTX 690 cards and an Intel Core i7 3770 CPU running at 4.1 GHz. Speedup was measured for one, two, three, and four GPUs. We performed several tests on data sets containing up to 4 million elements with varying numbers of attributes.
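The abstract does not say how k-NN was modified for small k, so the following is only an illustrative CPU-side sketch of one common idea, not the authors' method: compute brute-force distances, then keep just the k smallest instead of sorting all of them. The function name `knn_classify` and the toy data are assumptions made for this example.

```python
import heapq
from collections import Counter


def knn_classify(train, labels, query, k):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Brute-force pass: squared Euclidean distance from the query to every point.
    dists = (
        (sum((a - b) ** 2 for a, b in zip(point, query)), label)
        for point, label in zip(train, labels)
    )
    # Partial selection: keep only the k smallest distances rather than
    # sorting all n of them, which is cheaper when k is much smaller than n.
    nearest = heapq.nsmallest(k, dists, key=lambda pair: pair[0])
    # Majority vote over the labels of the selected neighbors.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
    labels = ["a", "a", "b", "b"]
    print(knn_classify(train, labels, (0.2, 0.1), k=3))
```

On a GPU the distance pass is the part that parallelizes naturally, since each training point can be handled by its own thread; the selection step is where a small k can reduce work.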

Keywords

Artificial intelligence, big data, comparison, CUDA, GPU, high performance computing, k-NN, multi–GPU, OpenCL.

Released

10.06.2016

Publisher

International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems

Pages from

101

Pages to

107

Pages count

7

BibTeX


@article{BUT125826,
  author="Jan {Mašek} and Radim {Burget} and Lukáš {Povoda} and Malay Kishore {Dutta}",
  title="Multi–GPU Implementation of Machine Learning Algorithm using CUDA and OpenCL",
  annote="Modern graphics processing units (GPUs) are well suited to complex, time-consuming computations and offer high-performance computing at a good price. This paper presents multi-GPU OpenCL and CUDA implementations of the k-Nearest Neighbor (k-NN) algorithm and compares their performance; each implementation is better suited to a different number of attributes. The proposed CUDA implementation achieves a speedup of up to 880x over a single-threaded CPU version. The standard k-NN algorithm was modified to run faster when a small number of neighbors k is set. Performance was verified on two dual-GPU NVIDIA GeForce GTX 690 cards and an Intel Core i7 3770 CPU running at 4.1 GHz. Speedup was measured for one, two, three, and four GPUs. We performed several tests on data sets containing up to 4 million elements with varying numbers of attributes.",
  journal="International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems",
  chapter="125826",
  doi="10.11601/ijates.v5i2.142",
  howpublished="online",
  institution="International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems",
  number="2",
  volume="5",
  year="2016",
  month="june",
  pages="101--107",
  publisher="International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems",
  type="journal article - other"
}