
Title Verification of Algorithms for Numerical Processors using a GPU
Author Varin, Olivier
Supervisor Nannarelli, Alberto (Embedded Systems Engineering, Department of Informatics and Mathematical Modeling, Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark)
Institution Technical University of Denmark, DTU, DK-2800 Kgs. Lyngby, Denmark
Thesis level Master's thesis
Year 2010
Abstract In hardware design, engineers are limited by current technology when it comes to certain heavy computations, such as the verification of a double-precision floating-point numerical processor. This type of verification demands a very large amount of computation, more than a standard CPU can currently complete in a practical time. A solution to this problem could be to run the verification on a massively parallel, more powerful device: the graphics processing unit (GPU). Originally created to carry out graphics calculations for video games, the GPU is a powerful and highly parallel processor that has recently become more programmable. GPUs are increasingly used as accelerators for complex general-purpose computing applications. This thesis focuses on creating a method that addresses this problem by taking a high-level hardware description of a double-precision floating-point numerical processor, converting it into a format that can take advantage of the parallelism of the GPU, and running the verification of the processor. The GPUs used for this thesis are the NVIDIA Tesla C1060 and the NVIDIA Tesla C2050, both based on a similar architecture. A Python high-level hardware description language is used as the input for describing the processors; a converter written in Python then generates a test bench that can be run on the NVIDIA GPUs. The test bench is written in CUDA, a parallel programming language specific to those GPUs. In this thesis, a selection of commonly used components has been translated into CUDA. To illustrate the method, a processor calculating the Newton-Raphson reciprocal approximation has been created and tested on different processors. In the end, for the same verification, the two GPUs showed on average a latency 100 times (for the C1060) and 220 times (for the C2050) lower than an Intel Core2 Duo 2.80 GHz processor.
Imprint Technical University of Denmark (DTU) : Kgs. Lyngby, Denmark
Series IMM-M.Sc.-2010-100
Original PDF ep10_100.pdf (0.70 MB)
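Note The abstract names the Newton-Raphson reciprocal approximation as its test case. The following is a minimal software sketch of that iteration in plain Python, checked against ordinary division on random inputs. It is not the thesis's hardware description or its CUDA test bench; the function names, the linear initial estimate, the iteration count, and the tolerance are all assumptions chosen for illustration.

```python
import random


def nr_reciprocal(d, iterations=4):
    """Approximate 1/d for d in [0.5, 1] with the Newton-Raphson iteration.

    Hypothetical software model: x_{n+1} = x_n * (2 - d * x_n).
    """
    if not 0.5 <= d <= 1.0:
        raise ValueError("d must lie in [0.5, 1]")
    # Assumed initial estimate: a common linear approximation of 1/d
    # on [0.5, 1], with relative error of at most 1/17.
    x = 48.0 / 17.0 - (32.0 / 17.0) * d
    for _ in range(iterations):
        # Each step roughly squares the relative error.
        x = x * (2.0 - d * x)
    return x


def count_mismatches(samples=10000, rel_tol=1e-14, seed=0):
    """Compare the model against plain division on random inputs.

    Returns the number of inputs whose relative error exceeds rel_tol.
    """
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(samples):
        d = rng.uniform(0.5, 1.0)
        expected = 1.0 / d
        if abs(nr_reciprocal(d) - expected) > rel_tol * expected:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("mismatches:", count_mismatches())
```

Each input is checked independently of the others, which is the property that lets a verification of this kind be spread across many GPU threads.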