Publication detail

GPU Accelerated Solver of Time-Dependent Air Pollutant Transport Equations

Original title

GPU Accelerated Solver of Time-Dependent Air Pollutant Transport Equations

Language

en

Original abstract

The main objective of this paper is to outline possible ways to substantially accelerate the calculation of the advection-diffusion equation (A-DE), which is commonly used to describe pollutant behavior in the atmosphere. The A-DE is a partial differential equation (PDE), and in the general case it is usually solved by numerical integration because of its high complexity. Such calculations are time-consuming, so the main idea of our work is to use the CUDA platform and a commodity GPU card to perform them faster. The solution is based on the method of lines, with a 4th-order Runge-Kutta scheme handling the integration. The selected approach involves a number of auxiliary variables, so memory management is critical for achieving the desired performance. From a technical point of view, we have implemented a particular variant of the A-DE system in which the pollutant concentration is time-dependent. Efficient data handling relies primarily on exploiting shared memory blocks and texture caches inside the GPU chip. The paper gives a detailed evaluation of the obtained results and demonstrates a substantial execution speedup of the GPU-based solution over a standard CPU.
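
As a rough illustration of the numerical approach named in the abstract, the following is a minimal CPU sketch of the method of lines with a 4th-order Runge-Kutta step for a one-dimensional advection-diffusion equation. It is not the authors' CUDA implementation: the 1D periodic domain, central differences, constant coefficients `u` and `d`, the Gaussian initial profile, and all function names are assumptions made for this example. The stage arrays `k1` to `k4` hint at why such a scheme needs several auxiliary buffers.

```python
import math


def initial_profile(n):
    """Hypothetical initial condition: a Gaussian puff centred in a unit domain."""
    dx = 1.0 / n
    return [math.exp(-((i * dx - 0.5) ** 2) / 0.005) for i in range(n)]


def rhs(c, u, d, dx):
    """Spatial discretisation of dc/dt = -u * dc/dx + d * d2c/dx2.

    Central differences on a periodic grid (assumed for simplicity).
    """
    n = len(c)
    out = [0.0] * n
    for i in range(n):
        left, right = c[i - 1], c[(i + 1) % n]
        advection = -u * (right - left) / (2.0 * dx)
        diffusion = d * (right - 2.0 * c[i] + left) / (dx * dx)
        out[i] = advection + diffusion
    return out


def rk4_step(c, dt, u, d, dx):
    """Advance the semi-discrete system by one classical RK4 step."""
    k1 = rhs(c, u, d, dx)
    k2 = rhs([ci + 0.5 * dt * k for ci, k in zip(c, k1)], u, d, dx)
    k3 = rhs([ci + 0.5 * dt * k for ci, k in zip(c, k2)], u, d, dx)
    k4 = rhs([ci + dt * k for ci, k in zip(c, k3)], u, d, dx)
    return [
        ci + dt / 6.0 * (a + 2.0 * b + 2.0 * e + f)
        for ci, a, b, e, f in zip(c, k1, k2, k3, k4)
    ]


def solve(n=64, steps=200, dt=0.002, u=1.0, d=0.01):
    """Integrate the concentration field for `steps` time steps."""
    dx = 1.0 / n
    c = initial_profile(n)
    for _ in range(steps):
        c = rk4_step(c, dt, u, d, dx)
    return c


if __name__ == "__main__":
    final = solve()
    print("total mass:", sum(final), "peak:", max(final))
```

In a GPU version, each grid point would typically map to one thread, and the neighbour reads in `rhs` are the kind of access that shared memory or texture caches can serve efficiently.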

BibTeX


@inproceedings{BUT33783,
  author="Radim {Dvořák} and Václav {Šimek} and František {Zbořil} and Vladimír {Drábek}",
  title="GPU Accelerated Solver of Time-Dependent Air Pollutant Transport Equations",
  annote="The main objective of this paper is to outline possible ways to
substantially accelerate the calculation of the advection-diffusion equation
(A-DE), which is commonly used to describe pollutant behavior in the atmosphere.
The A-DE is a partial differential equation (PDE), and in the general case it is
usually solved by numerical integration because of its high complexity. Such
calculations are time-consuming, so the main idea of our work is to use the CUDA
platform and a commodity GPU card to perform them faster. The solution is based
on the method of lines, with a 4th-order Runge-Kutta scheme handling the
integration. The selected approach involves a number of auxiliary variables, so
memory management is critical for achieving the desired performance. From a
technical point of view, we have implemented a particular variant of the A-DE
system in which the pollutant concentration is time-dependent. Efficient data
handling relies primarily on exploiting shared memory blocks and texture caches
inside the GPU chip. The paper gives a detailed evaluation of the obtained
results and demonstrates a substantial execution speedup of the GPU-based
solution over a standard CPU.",
  address="IEEE Computer Society",
  booktitle="12th EUROMICRO Conference on Digital System Design DSD 2009",
  chapter="33783",
  doi="10.1109/DSD.2009.146",
  edition="not specified",
  howpublished="print",
  institution="IEEE Computer Society",
  year="2009",
  month="may",
  pages="1--7",
  publisher="IEEE Computer Society",
  type="conference paper"
}