Course detail
Parallel Computations on GPU
FIT-PCG
Acad. year: 2019/2020
The course covers the architecture and programming of graphics processing units (GPUs) from NVidia and, in part, AMD. First, the GPU architecture is studied in detail. Then, the program execution model, based on hierarchical thread organisation and the SIMT model, is discussed. Next, the memory hierarchy and synchronization techniques are described. After that, the course explains the novel techniques of dynamic parallelism and data-flow processing, and concludes with the practical use of multi-GPU systems in environments with shared (NVLink) and distributed (MPI) memory. The second part of the course is devoted to high-level programming techniques and libraries based on the OpenACC technology.
Supervisor
Department
Learning outcomes of the course unit
Knowledge of parallel programming on GPUs for general-purpose computing; orientation in accelerated systems, libraries, and tools.
Understanding of the hardware limitations that affect the efficiency of software solutions.
Prerequisites
Knowledge gained in the AVS course and, in part, in the PRL and PPP courses.
Co-requisites
Not applicable.
Recommended optional programme components
Not applicable.
Recommended or required reading
Current PPT lecture slides
Nvidia documentation: https://docs.nvidia.com/cuda/
OpenACC documentation: https://www.openacc.org/
Kirk, D., and Hwu, W.: Programming Massively Parallel Processors: A Hands-on Approach, Elsevier, 2010, 256 p., ISBN 978-0-12-381472-2
Sanders, J., and Kandrot, E.: CUDA by Example: An Introduction to General-Purpose GPU Programming, Addison-Wesley, 2010.
Storti, D., and Yurtoglu, M.: CUDA for Engineers: An Introduction to High-Performance Parallel Computing, Addison-Wesley Professional, 1st edition, 2015, ISBN 978-0134177410
Chandrasekaran, S., and Juckeland, G.: OpenACC for Programmers: Concepts and Strategies, Addison-Wesley Professional, 2017, ISBN 978-0134694283
Planned learning activities and teaching methods
Not applicable.
Assessment methods and criteria linked to learning outcomes
Assessment of two projects (14 hours in total), computer laboratories, and a midterm examination.
Exam prerequisites:
Language of instruction
Czech
Work placements
Not applicable.
Aims
To become familiar with the architecture and programming of graphics processing units for general-purpose computing using the NVidia libraries and the OpenACC standard. To learn how to design and implement accelerated programs that exploit the potential of GPUs. To gain knowledge of the available libraries for GPU programming.
Specification of controlled education, way of implementation and compensation for absences
- Missed labs can be made up on alternative dates.
- A make-up slot for missed labs will be available in the last week of the semester.
Classification of course in study plans
- Programme MITAI Master's
specialization NADE, any year of study, winter semester, 5 credits, elective
specialization NBIO, any year of study, winter semester, 5 credits, elective
specialization NGRI, any year of study, winter semester, 5 credits, elective
specialization NNET, any year of study, winter semester, 5 credits, elective
specialization NVIZ, any year of study, winter semester, 5 credits, elective
specialization NCPS, any year of study, winter semester, 5 credits, elective
specialization NSEC, any year of study, winter semester, 5 credits, elective
specialization NEMB, any year of study, winter semester, 5 credits, elective
specialization NHPC, any year of study, winter semester, 5 credits, compulsory
specialization NISD, any year of study, winter semester, 5 credits, elective
specialization NIDE, any year of study, winter semester, 5 credits, elective
specialization NISY, any year of study, winter semester, 5 credits, elective
specialization NMAL, any year of study, winter semester, 5 credits, elective
specialization NMAT, any year of study, winter semester, 5 credits, elective
specialization NSEN, any year of study, winter semester, 5 credits, elective
specialization NVER, any year of study, winter semester, 5 credits, elective
specialization NSPE, any year of study, winter semester, 5 credits, elective
Type of course unit
Lecture
26 hours, optional
Teacher / Lecturer
Syllabus
- Architecture of graphics processing units.
- CUDA programming model, thread execution.
- CUDA memory hierarchy.
- Synchronization and reduction.
- Dynamic parallelism and unified memory.
- Design and optimization of GPU algorithms.
- Stream processing, computation-communication overlapping.
- Multi-GPU systems.
- Nvidia Thrust library.
- OpenACC basics.
- OpenACC memory management.
- Code optimization with OpenACC.
- Libraries and tools for GPU programming.
Exercise in computer lab
12 hours, compulsory
Teacher / Lecturer
Syllabus
- CUDA: Memory transfers, simple kernels
- CUDA: Shared memory
- CUDA: Texture and constant memory
- CUDA: Dynamic parallelism and unified memory
- OpenACC: Basic techniques
- OpenACC: Advanced techniques
Project
14 hours, compulsory
Teacher / Lecturer
Syllabus
- Development of an application in Nvidia CUDA
- Development of an application in OpenACC