!til &compilers To improve the ease of GPU programming, this dissertation presents a system for fully-automatic parallelization for C and C++ codes for GPUs. The system consists of a compiler and a run-time system. The compiler generates pipeline parallelizations for GPUs and the run-time system provides software-only shared memory. The main contributions are: the first automatic data management and communication optimization framework for GPUs and the first automatic pipeline parallelization system for GPUs.

Reference: https://liberty.princeton.edu/Publications/phdthesis_tjablin.pdf