PACT 2014: Edmonton, AB, Canada
- José Nelson Amaral, Josep Torrellas:
International Conference on Parallel Architectures and Compilation, PACT '14, Edmonton, AB, Canada, August 24-27, 2014. ACM 2014, ISBN 978-1-4503-2809-8
Keynote I
- Klara Nahrstedt:
Internet of mobile things: challenges and opportunities. 1-2
Best papers
- Nuno Diegues, Paolo Romano, Luís E. T. Rodrigues:
Virtues and limitations of commodity hardware transactional memory. 3-14
- Jennifer B. Sartor, Wim Heirman, Stephen M. Blackburn, Lieven Eeckhout, Kathryn S. McKinley:
Cooperative cache scrubbing. 15-26
- Harshvardhan, Adam Fidel, Nancy M. Amato, Lawrence Rauchwerger:
KLA: a new algorithmic paradigm for parallel graph computations. 27-38
- Uday Bondhugula, Vinayaka Bandishti, Albert Cohen, Guillain Potron, Nicolas Vasilache:
Tiling and optimizing time-iterated computations on periodic domains. 39-50
Session 2A: cache hierarchies (I)
- Cheng-Chieh Huang, Vijay Nagarajan:
ATCache: reducing DRAM cache latency via a small SRAM tag cache. 51-60
- Lunkai Zhang, Dmitri B. Strukov, Hebatallah Saadeldeen, Dongrui Fan, Mingzhe Zhang, Diana Franklin:
SpongeDirectory: flexible sparse directories utilizing multi-level memristors. 61-74
- Gaurav Chadha, Scott A. Mahlke, Satish Narayanasamy:
EFetch: optimizing instruction fetch for event-driven web applications. 75-86
- Biswabandan Panda, Shankar Balachandran:
XStream: cross-core spatial streaming based MLC prefetchers for parallel applications in CMPs. 87-98
Session 2B1: parallelism studies
- Cedomir Segulja, Tarek S. Abdelrahman:
What is the cost of weak determinism? 99-112
- Ehsan Fatehi, Paul Gratz:
ILP and TLP in shared memory applications: a limit study. 113-126
Session 2B2: algorithms
- Wookeun Jung, Jongsoo Park, Jaejin Lee:
Versatile and scalable parallel histogram construction. 127-138
- Robert D. Cameron, Thomas C. Shermer, Arrvindh Shriraman, Kenneth S. Herdy, Dan Lin, Benjamin R. Hull, Meng Lin:
Bitwise data parallelism in regular expression matching. 139-150
Session 3A: GPUs (I)
- Rashid Kaleem, Rajkishore Barik, Tatiana Shpeisman, Brian T. Lewis, Chunling Hu, Keshav Pingali:
Adaptive heterogeneous scheduling for integrated GPUs. 151-162
- James A. Jablin, Thomas B. Jablin, Onur Mutlu, Maurice Herlihy:
Warp-aware trace scheduling for GPUs. 163-174
- Shin-Ying Lee, Carole-Jean Wu:
CAWS: criticality-aware warp scheduling for GPGPU workloads. 175-186
Session 3B: transactional memory
- Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, Maurice Herlihy:
Invyswell: a hybrid transactional memory for Haswell's restricted transactional memory. 187-200
- Lihang Zhao, Jeffrey T. Draper:
Consolidated conflict detection for hardware transactional memory. 201-212
- Kaushik Ravichandran, Ada Gavrilovska, Santosh Pande:
DeSTM: harnessing determinism in STMs for application development. 213-224
Session 4A: energy efficiency
- Qiumin Xu, Murali Annavaram:
PATS: pattern aware scheduling and power gating for GPGPUs. 225-236
- Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Ronald G. Dreslinski, Thomas F. Wenisch, Scott A. Mahlke:
Heterogeneous microarchitectures trump voltage scaling for low-power cores. 237-250
- Hamid Reza Ghasemi, Nam Sung Kim:
RCS: runtime resource and core scaling for power-constrained multi-core processors. 251-262
Session 4B: runtime systems
- Sean Treichler, Michael Bauer, Alex Aiken:
Realm: an event-based low-level runtime for distributed memory architectures. 263-276
- Matthias Diener, Eduardo Henrique Molina da Cruz, Philippe Olivier Alexandre Navaux, Anselm Busse, Hans-Ulrich Heiß:
kMAF: automatic kernel-level management of thread and data affinity. 277-288
- Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi N. Bhuyan:
Shuffling: a framework for lock contention aware thread scheduling for multicore multiprocessor systems. 289-300
Keynote II
- Bob Blainey:
Domain-specific models for innovation in analytics. 301-302
Session 5A1: compiler frameworks
- Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, Saman P. Amarasinghe:
OpenTuner: an extensible framework for program autotuning. 303-316
- Rahul Garg, Laurie J. Hendren:
Velociraptor: an embedded compiler toolkit for numerical programs targeting CPUs and GPUs. 317-330
Session 5A2: scheduling
- Hao Wang, Ripudaman Singh, Michael J. Schulte, Nam Sung Kim:
Memory scheduling towards high-throughput cooperative heterogeneous computing. 331-342
- Dragos Sbirlea, Zoran Budimlic, Vivek Sarkar:
Bounded memory scheduling of dynamic task graphs. 343-356
Session 6A: cache hierarchies (II)
- Wei Ding, Mahmut T. Kandemir, Diana R. Guttman, Adwait Jog, Chita R. Das, Praveen Yedlapalli:
Trading cache hit rate for memory performance. 357-368
- Guilherme Piccoli, Henrique Nazaré Santos, Raphael Ernani Rodrigues, Christiane Pousa, Edson Borin, Fernando Magno Quintão Pereira:
Compiler support for selective page migration in NUMA architectures. 369-380
- Ying Ye, Richard West, Zhuoqun Cheng, Ye Li:
COLORIS: a dynamic cache partitioning system using page coloring. 381-392
Session 6B: performance tools and I/O
- Arnamoy Bhattacharyya, Torsten Hoefler:
PEMOGEN: automatic adaptive performance modeling during program runtime. 393-404
- Xu Liu, Kamal Sharma, John M. Mellor-Crummey:
ArrayTool: a lightweight profiler to guide array regrouping. 405-416
- Arash Tavakkol, Mohammad Arjomand, Hamid Sarbazi-Azad:
Design for scalability in enterprise SSDs. 417-430
Session 7: GPUs (II)
- Davoud Anoushe Jamshidi, Mehrzad Samadi, Scott A. Mahlke:
D2MA: accelerating coarse-grained data transfer for GPUs. 431-442
- Janghaeng Lee, Mehrzad Samadi, Scott A. Mahlke:
VAST: the illusion of a large memory space for GPUs. 443-454
- Alberto Magni, Christophe Dubach, Michael F. P. O'Boyle:
Automatic optimization of thread-coarsening for graphics processors. 455-466
Poster session
- Javier Cabezas, Lluís Vilanova, Isaac Gelado, Thomas B. Jablin, Nacho Navarro, Wen-mei W. Hwu:
Automatic execution of single-GPU computations across multiple GPUs. 467-468
- Alexandros-Herodotos Haritatos, Georgios I. Goumas, Nikos Anastopoulos, Konstantinos Nikas, Kornilios Kourtis, Nectarios Koziris:
LCA: a memory link and cache-aware co-scheduling approach for CMPs. 469-470
- Simon Holmbacka, Sébastien Lafond, Johan Lilius:
A run-time power manager exploiting software parallelism. 471-472
- Magnus Jahre:
Graph-based performance accounting for chip multiprocessor memory systems. 473-474
- Snehasish Kumar, Arrvindh Shriraman, Vijayalakshmi Srinivasan, Dan Lin, Jordon Phillips:
SQRL: hardware accelerator for collecting software data structures. 475-476
- Yulong Luo, Guangming Tan:
Optimizing stencil code via locality of computation. 477-478
- Deepak Majeti, Kuldeep S. Meel, Rajkishore Barik, Vivek Sarkar:
ADHA: automatic data layout framework for heterogeneous architectures. 479-480
- William F. Ogilvie, Pavlos Petoumenos, Zheng Wang, Hugh Leather:
Active learning accelerated automatic heuristic construction for parallel program mapping. 481-482
- Sreepathi Pai, R. Govindarajan, Matthew J. Thazhuthaveetil:
Preemptive thread block scheduling with online structural runtime prediction for concurrent GPGPU kernels. 483-484
- Xiang Pan, Radu Teodorescu:
Using STT-RAM to enable energy-efficient near-threshold chip multiprocessors. 485-486
- Raj Parihar, Jacob Brock, Chen Ding, Michael C. Huang:
Protection and utilization in shared cache through rationing. 487-488
- Pushkar Ratnalikar, Arun Chauhan:
Automatic parallelism through macro dataflow in high-level array languages. 489-490
- Sudarshan Srinivasan, Nithesh Kurella, Israel Koren, Rance Rodrigues, Sandip Kundu:
A runtime support mechanism for fast mode switching of a self-morphing core for power efficiency. 491-492
- Bradley Thwaites, Gennady Pekhimenko, Hadi Esmaeilzadeh, Amir Yazdanbakhsh, Onur Mutlu, Jongse Park, Girish Mururu, Todd C. Mowry:
Rollback-free value prediction with approximate loads. 493-494
- Erik Tomusk, Christophe Dubach, Michael F. P. O'Boyle:
Measuring flexibility in single-ISA heterogeneous processors. 495-496
- Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen, Jeffrey S. Vetter:
SM-centric transformation: circumventing hardware restrictions for flexible GPU scheduling. 497-498
Poster Board
- Serguei Makarov, Angela Demke Brown, Ashvin Goel:
An event-based language for dynamic binary translation frameworks. 499-500
- Peng Li, Jeremy Buhler:
Improving performance of streaming applications with filtering and control messages. 501-502
- Jeeva Paudel, José Nelson Amaral:
Stratified sampling for even workload partitioning. 503-504
- Tejaswi Agarwal, Michela Becchi:
Design of a hybrid MPI-CUDA benchmark suite for CPU-GPU clusters. 505-506
- Sudharsan Jagathrakshakan, Venkata Kalyan Tavva, Madhu Mutyam:
Data remapping for an energy efficient burst chop in DRAM memory systems. 507-508
- Alexandre Isoard:
Data-reuse optimizations for pipelined tiling with parametric tile sizes. 509-510
- Adam Fidel, Nancy M. Amato, Lawrence Rauchwerger:
From petascale to the pocket: Adaptively scaling parallel programs for mobile SoCs. 511-512
- Alessandro Fanfarillo, Tobias Burnus, Valeria Cardellini, Salvatore Filippone, Dan Nagle, Damian W. I. Rouson:
Coarrays in GNU Fortran. 513-514
- Thomas R. W. Scogland, Wu-Chun Feng:
Locality-aware memory association for multi-target worksharing in OpenMP. 515-516
- Harshvardhan, Nancy M. Amato, Lawrence Rauchwerger:
Processing big data graphs on memory-restricted systems. 517-518