Anatomy of GPU Memory System for Multi-Application Execution
Exploiting Core Criticality for Enhanced GPU Performance
μC-States
Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities
Race-to-sleep + content caching + display caching
Modeling and synthesizing task placement constraints in Google compute clusters
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation - PACT ’16
Proceedings of the 2nd ACM Symposium on Cloud Computing - SOCC ’11
Proceedings of the 2016 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems - SIGMETRICS ’16
Proceedings of the 2015 International Symposium on Memory Systems - MEMSYS ’15
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture - MICRO-50 ’17
Mahmut T. Kandemir
Ashutosh Pattnaik
Onur Kayiran
Adwait Jog
Onur Mutlu
Xulong Tang
TetriSched: Global Rescheduling with Adaptive Plan-ahead in Dynamic Heterogeneous Clusters
Data movement aware computation partitioning