In recent years, graphics processing units (GPUs) have thoroughly permeated consumer processor designs. It is now essentially impossible to find a smartphone, tablet, or laptop without a substantial integrated GPU on the processor die. Utilizing these omnipresent GPUs, however, remains a challenge. Writing correct and performant parallel code is notoriously difficult, and the difficulty is exacerbated by the high degrees of parallelism that GPUs demand to reach their full potential. GPU programming models have also grown more expressive over time to support increasingly general-purpose GPU (GPGPU) programming. This extra expressiveness, unfortunately, allows many kinds of subtle performance and correctness bugs to arise, several of which are particular to GPGPU programming.
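
As a concrete illustration (purely a sketch, not code from any of these systems), the hypothetical CUDA kernel below shows one kind of bug peculiar to GPGPU programming: removing a single __syncthreads() barrier turns a correct block-wide reduction into a shared-memory data race among hundreds of threads.

    // Hypothetical block-wide sum illustrating a GPU-specific concurrency bug:
    // without the __syncthreads() barriers, threads read partial sums that other
    // threads in the same block are still writing (a shared-memory data race).
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void blockSum(const int *in, int *out) {
        __shared__ int buf[256];
        int tid = threadIdx.x;
        buf[tid] = in[blockIdx.x * blockDim.x + tid];
        __syncthreads();                     // omit this and the kernel races

        for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
            if (tid < stride)
                buf[tid] += buf[tid + stride];
            __syncthreads();                 // a barrier is needed every iteration
        }
        if (tid == 0)
            out[blockIdx.x] = buf[0];
    }

    int main() {
        const int n = 256;
        int h_in[n], h_out;
        for (int i = 0; i < n; ++i) h_in[i] = 1;

        int *d_in, *d_out;
        cudaMalloc(&d_in, n * sizeof(int));
        cudaMalloc(&d_out, sizeof(int));
        cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

        blockSum<<<1, n>>>(d_in, d_out);
        cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
        printf("sum = %d (expected %d)\n", h_out, n);

        cudaFree(d_in);
        cudaFree(d_out);
        return 0;
    }

Races like this involve hundreds of threads and often go unnoticed because the wrong answer appears only intermittently; they are exactly the class of error that GPU-aware concurrency bug detectors aim to catch.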

We have built a series of systems to help programmers grapple with the complexity of GPU programming, including a concurrency bug detector (Barracuda) that scales to millions of threads, a compiler analysis (GPU Drano) that can detect performance bottlenecks without executing the code, and compiler transformations that can automatically repair some kinds of performance bugs by leveraging novel aspects of GPU scheduling. We are building out a suite of software tools that can boost GPU programmer productivity by automatically resolving many of the correctness and performance issues that plague GPU code.
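
The sketch below (again illustrative, not drawn from GPU Drano itself) shows the kind of performance bug that a static analysis can flag without executing the code: the strided index expression in the first kernel means each warp's memory accesses are scattered across memory (uncoalesced), while the consecutive indexing in the second lets the hardware combine a warp's accesses into a few wide transactions.

    #include <cstdio>
    #include <cuda_runtime.h>

    // Uncoalesced: consecutive threads in a warp touch addresses 32 floats apart,
    // so each warp's load and store are split into many separate transactions.
    // (Only every 32nd element is touched; the kernel exists to show the pattern.)
    __global__ void scaleStrided(float *data, int n) {
        int i = (blockIdx.x * blockDim.x + threadIdx.x) * 32;
        if (i < n) data[i] *= 2.0f;
    }

    // Coalesced: consecutive threads touch consecutive addresses, so a warp's
    // accesses combine into a small number of wide memory transactions.
    __global__ void scaleCoalesced(float *data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= 2.0f;
    }

    int main() {
        const int n = 1 << 20;
        float *d;
        cudaMalloc(&d, n * sizeof(float));
        cudaMemset(d, 0, n * sizeof(float));

        scaleStrided<<<(n + 255) / 256, 256>>>(d, n);
        scaleCoalesced<<<(n + 255) / 256, 256>>>(d, n);
        cudaDeviceSynchronize();
        printf("kernels finished: %s\n", cudaGetErrorString(cudaGetLastError()));

        cudaFree(d);
        return 0;
    }

Because the problematic access pattern is evident in the index expression itself, a compiler analysis can report it, and a compiler transformation can sometimes repair it, before the program ever runs on a GPU.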

For more information, see the Computer Architecture Research page.

Collaborators

Support

NSF, Nvidia

Students and Postdocs

  • Nimit Singhania
  • Yuanfeng Peng
  • Omar Navarro Leija