Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems
Anatomy of GPU Memory System for Multi-Application Execution
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
21st century digital design tools
Proceedings of the 50th Annual Design Automation Conference - DAC ’13
Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-X) - ASPLOS ’02
Proceedings of the 2015 International Symposium on Memory Systems - MEMSYS ’15
2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA)
Niladrish Chatterjee
Nandita Vijaykumar
Eiman Ebrahimi
Kevin Hsieh
Chris Malachowsky
Mike O’Connor
Synchoricity and NOCs could make Billion Gate Custom Hardware Centric SOCs Affordable
Data movement aware computation partitioning