Publication detail

Can the performance of GPGPU really beat CPU in evolutionary design task?

Original title

Can the performance of GPGPU really beat CPU in evolutionary design task?

English title

Can the performance of GPGPU really beat CPU in evolutionary design task?

Language

en

Original abstract

With the appearance of modern general-purpose graphics processing units (GPUs), a powerful and cheap architecture has entered the field of scientific computation. This highly parallel architecture, originally designed to accelerate floating-point graphics operations, is now being used to accelerate various algorithms. During the past few years, various papers dealing with the use of GPUs in general-purpose computing have been published. Even evolutionary algorithms have been accelerated [1, 3], among them genetic programming and its variants. To maximize the performance of genome evaluation, various approaches to candidate solution evaluation have been proposed. The genome can be evaluated as a program that is downloaded directly into the GPU [1], or it can be interpreted by an interpreter program running on the GPU [2]. Due to architectural limitations, the second method appears more promising than the first. GPUs are accessible via special frameworks that provide an interface between the GPU and the CPU. The purpose of these frameworks is to provide a comfortable programming interface for rapid application development at different levels of abstraction. The chosen framework therefore has a serious impact on the application's performance, since the higher the abstraction, the lower the performance. In this work [4] we focus on the acceleration of Cartesian genetic programming (CGP), which will be used for the evolutionary design of image filters. The application is written using the NVIDIA CUDA framework, which allows low-level access to the GPU resources. Several ways of implementing candidate solution evaluation, each with a different performance impact, are discussed. The results obtained are compared with a CPU-based implementation. The experimental results show that the accelerated application does not exhibit the desired performance and in some cases is even outperformed by a CPU-based application.

BibTeX


@inproceedings{BUT33441,
  author="Václav {Šimek} and Zdeněk {Vašíček} and Karel {Slaný}",
  title="Can the performance of GPGPU really beat CPU in evolutionary design task?",
  annote="With the appearance of modern general-purpose graphics processing units
(GPUs), a powerful and cheap architecture has entered the field of scientific
computation. This highly parallel architecture, originally designed to accelerate
floating-point graphics operations, is now being used to accelerate various
algorithms.

During the past few years, various papers dealing with the use of GPUs in
general-purpose computing have been published. Even evolutionary algorithms have
been accelerated [1, 3], among them genetic programming and its variants. To
maximize the performance of genome evaluation, various approaches to candidate
solution evaluation have been proposed. The genome can be evaluated as a program
that is downloaded directly into the GPU [1], or it can be interpreted by an
interpreter program running on the GPU [2]. Due to architectural limitations, the
second method appears more promising than the first.

GPUs are accessible via special frameworks that provide an interface between the
GPU and the CPU. The purpose of these frameworks is to provide a comfortable
programming interface for rapid application development at different levels of
abstraction. The chosen framework therefore has a serious impact on the
application's performance, since the higher the abstraction, the lower the
performance.

In this work [4] we focus on the acceleration of Cartesian genetic programming
(CGP), which will be used for the evolutionary design of image filters. The
application is written using the NVIDIA CUDA framework, which allows low-level
access to the GPU resources. Several ways of implementing candidate solution
evaluation, each with a different performance impact, are discussed. The results
obtained are compared with a CPU-based implementation. The experimental results
show that the accelerated application does not exhibit the desired performance
and in some cases is even outperformed by a CPU-based application.",
  address="Masaryk University",
  booktitle="4th Doctoral Workshop on Mathematical and Engineering Methods in Computer Science",
  chapter="33441",
  edition="not specified",
  howpublished="print",
  institution="Masaryk University",
  year="2008",
  month="october",
  pages="264--264",
  publisher="Masaryk University",
  type="conference paper"
}