23rd ICS 2009: Yorktown Heights, NY, USA
- Michael Gschwind, Alexandru Nicolau, Valentina Salapura, José E. Moreira:
Proceedings of the 23rd international conference on Supercomputing, 2009, Yorktown Heights, NY, USA, June 8-12, 2009. ACM 2009, ISBN 978-1-60558-498-0
Keynote Address I
- Mateo Valero:
A european perspective on supercomputing. 1
Keynote Address II
- Don G. Grice:
The roadrunner project and the importance of energy efficiency on the road to exascale computing. 2
Keynote Address III
- Ian T. Foster:
Computing outside the box. 3
Applications of the cell processor
- Konstantis Daloukas, Christos D. Antonopoulos, Nikolaos Bellas:
Implementation of a wide-angle lens distortion correction algorithm on the cell broadband engine. 4-13
- Daniele Paolo Scarpazza, Gregory F. Russell:
High-performance regular expression scanning on the Cell/B.E. processor. 14-25
- Srinivas Chellappa, Franz Franchetti, Markus Püschel:
Computer generation of fast fourier transforms for the cell broadband engine. 26-35
- Tao Liu, Haibo Lin, Tong Chen, Kevin O'Brien, Ling Shao:
DBDB: optimizing DMATransfer for the cell be architecture. 36-45
Cache enhancement techniques
- Julien Dusser, Thomas Piquet, André Seznec:
Zero-content augmented caches. 46-55
- Mohammad Hammoud, Sangyeun Cho, Rami G. Melhem:
Dynamic cache clustering for chip multiprocessors. 56-67
- Lingxiang Xiang, Tianzhou Chen, Qingsong Shi, Wei Hu:
Less reused filter: improving l2 cache performance via filtering less reused lines. 68-79
- Chuanjun Zhang, Bing Xue:
Divide-and-conquer: a bubble replacement for low level caches. 80-89
Optimizing parallel applications
- Hiroshi Nakashima, Yohei Miyake, Hideyuki Usui, Yoshiharu Omura:
OhHelp: a scalable domain-decomposing dynamic load balancing for particle-in-cell simulations. 90-99
- Mehmet Belgin, Godmar Back, Calvin J. Ribbens:
Pattern-based sparse matrix representation for memory-efficient SMVM kernels. 100-109
- Abhinav Bhatele, Laxmikant V. Kalé, Sameer Kumar:
Dynamic topology aware load balancing algorithms for molecular dynamics applications. 110-116
Transactional memory I
- JaeWoong Chung, Woongki Baek, Christos Kozyrakis:
Fast memory snapshot for concurrent programming without synchronization. 117-125
- Vladimir Gajinov, Ferad Zyulkyarov, Osman S. Unsal, Adrián Cristal, Eduard Ayguadé, Tim Harris, Mateo Valero:
QuakeTM: parallelizing a complex sequential application using transactional memory. 126-135
- Arrvindh Shriraman, Sandhya Dwarkadas:
Refereeing conflicts in hardware transactional memory. 136-146
Compilers
- Albert Hartono, Muthu Manikandan Baskaran, Cédric Bastoul, Albert Cohen, Sriram Krishnamoorthy, Boyana Norris, J. Ramanujam, P. Sadayappan:
Parametric multi-level tiling of imperfectly nested loops. 147-157
- Cheng Wang, Youfeng Wu, Edson Borin, Shiliang Hu, Wei Liu, Dave Sager, Tin-Fook Ngai, Jesse Fang:
Dynamic parallelization of single-threaded binary programs using speculative slicing. 158-168
- Alexandru Nicolau, Guangqiang Li, Alexander V. Veidenbaum, Arun Kejariwal:
Synchronization optimizations for efficient execution on multi-cores. 169-180
- Jun Shirako, Jisheng M. Zhao, V. Krishna Nandivada, Vivek Sarkar:
Chunking parallel loops in the presence of synchronization. 181-192
High performance communications I
- Qasim Ali, Samuel P. Midkiff, Vijay S. Pai:
Efficient high performance collective communication for the cell blade. 193-203
- Junchang Wang, Haipeng Cheng, Bei Hua, Xinan Tang:
Practice of parallelizing network applications on multi-core architectures. 204-213
- Stavros Passas, Kostas Magoutis, Angelos Bilas:
Towards 100 gbit/s ethernet: multicore-based parallel communication protocol design. 214-224
- Jiuxing Liu, Bülent Abali:
Virtualization polling engine (VPE): using dedicated CPU cores to accelerate I/O virtualization. 225-234
Accelerating applications with GPUs I
- M. Suhail Rehman, Kishore Kothapalli, P. J. Narayanan:
Fast and scalable list ranking on the GPU. 235-243
- Sundaresan Venkatasubramanian, Richard W. Vuduc:
Tuned and wildly asynchronous stencil kernels for hybrid CPU/GPU systems. 244-255
- Jiayuan Meng, Kevin Skadron:
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs. 256-265
Architectures for High-Performance Computing
- Leo Porter, Dean M. Tullsen:
Creating artificial global history to improve branch prediction accuracy. 266-275
- Germán Rodríguez, Ramón Beivide, Cyriel Minkenberg, Jesús Labarta, Mateo Valero:
Exploring pattern-aware routing in generalized fat tree networks. 276-285
High-performance communications II
- Tobias Hilbrich, Bronis R. de Supinski, Martin Schulz, Matthias S. Müller:
A graph based approach for MPI deadlock detection. 296-305
- Matthew Small, Xin Yuan:
Maximizing MPI point-to-point communication performance on RDMA-enabled clusters with customized protocols. 306-315
- Anthony Danalis, Lori L. Pollock, D. Martin Swany, John Cavazos:
MPI-aware compiler optimizations for improving communication-computation overlap. 316-325
- Jiuxing Liu, Dan E. Poff, Bülent Abali:
Evaluating high performance communication: a power perspective. 326-337
Storage solutions for supercomputing
- Ji-Yong Shin, Zenglin Xia, Ning-Yi Xu, Rui Gao, Xiongfei Cai, Seungryoul Maeng, Feng-Hsiung Hsu:
FTL design exploration in reconfigurable high-performance SSD for server applications. 338-349
- Henry M. Monti, Ali Raza Butt, Sudharshan S. Vazhkudai:
/scratch as a cache: rethinking HPC center scratch storage. 350-359
- Chao Jin, Hong Jiang, Dan Feng, Lei Tian:
P-Code: a new RAID-6 code with optimal properties. 360-369
- Chuanyi Liu, Yu Gu, Linchun Sun, Bin Yan, Dongsheng Wang:
R-ADMAD: high reliability provision for large-scale de-duplication archival storage systems. 370-379
Accelerating applications with GPUs II
- Guangming Tan, Ziyu Guo, Mingyu Chen, Dan Meng:
Single-particle 3d reconstruction from cryo-electron microscopy images on GPU. 380-389
- Gabriel Falcão Paiva Fernandes, Vítor Manuel Mendes da Silva, Leonel Sousa:
How GPUs can outperform ASICs for fast LDPC decoding. 390-399
- Wenjing Ma, Gagan Agrawal:
A translation system for enabling data mining applications on GPUs. 400-409
Transactional memory II
- Polychronis Xekalakis, Nikolas Ioannou, Marcelo Cintra:
Combining thread level speculation helper threads and runahead execution. 410-420
- Salil Mohan Pant, Gregory T. Byrd:
Limited early value communication to improve performance of transactional memory. 421-429
Novel supercomputing applications
- Keith R. Bisset, Jiangzhuo Chen, Xizhou Feng, V. S. Anil Kumar, Madhav V. Marathe:
EpiFast: a fast algorithm for large scale realistic epidemic simulations on distributed memory systems. 430-439
- Rob van Nieuwpoort, John W. Romein:
Using many-core hardware to correlate radio astronomy signals. 440-449
- Jun Cao, Krista A. Novstrup, Ayush Goyal, Samuel P. Midkiff, James M. Caruthers:
A parallel levenberg-marquardt algorithm. 450-459
Power management
- Barry Rountree, David K. Lowenthal, Bronis R. de Supinski, Martin Schulz, Vincent W. Freeh, Tyler K. Bletsch:
Adagio: making DVS practical for complex HPC applications. 460-469
- Mohammad Arjomand, Hamid Sarbazi-Azad:
A comprehensive power-performance model for NoCs with multi-flit channel buffers. 470-478
- Andrew Herdrich, Ramesh Illikkal, Ravi R. Iyer, Donald Newell, Vineet Chadha, Jaideep Moses:
Rate-based QoS techniques for cache/memory in CMP platforms. 479-488
Posters
- Ahmad Faraj, Sameer Kumar, Brian E. Smith, Amith R. Mamidala, John A. Gunnels, Philip Heidelberger:
MPI collective communications on the blue gene/p supercomputer: algorithms and optimizations. 489-490
- James Poe, Clay Hughes, Tao Li:
TransMetric: architecture independent workload characterization for transactional memory benchmarks. 491-492
- Md. Mafijul Islam, Sally A. McKee, Per Stenström:
Cancellation of loads that return zero using zero-value caches. 493-494
- Huayong Wang, Henrique Andrade, Bugra Gedik, Kun-Lung Wu:
Auto-vectorization through code generation for stream processing applications. 495-496
- Aleksandr Ovcharenko, Onkar Sahni, Christopher D. Carothers, Kenneth E. Jansen, Mark S. Shephard:
Subdomain communication to increase scalability in large-scale scientific applications. 497-498
- Yasuo Ishii, Mary Inaba, Kei Hiraki:
Access map pattern matching for data cache prefetch. 499-500
- Karan Singh, Major Bhadauria, Sally A. McKee:
Prediction-based power estimation and scheduling for CMPs. 501-502
- Jih-Ching Chiu, Kai-Ming Yang, Yu-Liang Chou:
Design of a novel SIMD architecture by fusing operations and registers. 503-504
- Jian Li, Lixin Zhang, Charles Lefurgy, Richard R. Treumann, Wolfgang E. Denzel:
Thrifty interconnection network for HPC systems. 505-506
- Liang Gu, Xiaoming Li:
Performance modeling for DFT algorithms in FFTW. 507-508
- Major Bhadauria, Vincent M. Weaver, Sally A. McKee:
PARSEC: hardware profiling of emerging workloads for CMP design. 509-510
- Mohamed E. Hussein, Wael Abd-Almageed:
Approximate kernel matrix computation on GPUs for large scale learning applications. 511-512
- Diana Bautista, Julio Sahuquillo, Houcine Hassan, Salvador Petit, José Duato:
Dynamic task set partitioning based on balancing memory requirements to reduce power consumption. 513-514
- Alexandros Papakonstantinou, Karthik Gururaj, John A. Stratton, Deming Chen, Jason Cong, Wen-mei W. Hwu:
High-performance CUDA kernel execution on FPGAs. 515-516
- Shih-wei Liao, Tzu-Han Hung, Donald Nguyen, Hucheng Zhou, Chinyen Chou, Chia-Heng Tu:
Prefetch optimizations on large-scale applications via parameter value prediction. 519-520
- Scott Beamer, Krste Asanovic, Christopher Batten, Ajay Joshi, Vladimir Stojanovic:
Designing multi-socket systems using silicon photonics. 521-522
- Victor Lotrich, Norbert Flocke, Mark Ponton, Beverly A. Sanders, Erik Deumens, Rodney J. Bartlett, Ajith Perera:
An infrastructure for scalable and portable parallel programs for computational chemistry. 523-524