Per Stenström
Person information
- affiliation: Chalmers University of Technology, Goteborg, Sweden
2020 – today
- 2024
- [c129]Piyumal Ranawaka, Muhammad Waqar Azhar, Per Stenström:
DNNOPT: A Framework for Efficiently Selecting On-chip Memory Loop Optimizations of DNN Accelerators. CF 2024 - [c128]Qi Shao, Angelos Arelakis, Per Stenström:
HMComp: Extending Near-Memory Capacity using Compression in Hybrid Memory. ICS 2024: 74-84 - 2023
- [j67]Muhammad Waqar Azhar, Madhavan Manivannan, Per Stenström:
Approx-RM: Reducing Energy on Heterogeneous Multicore Processors under Accuracy and Timing Constraints. ACM Trans. Archit. Code Optim. 20(3): 44:1-44:25 (2023) - [c127]Lluc Alvarez, Abraham Ruiz, Arnau Bigas-Soldevilla, Pavel Kuroedov, Alberto González, Hamsika Mahale, Noe Bustamante, Albert Aguilera, Francesco Minervini, Javier Salamero, Oscar Palomar, Vassilis Papaefstathiou, Antonis Psathakis, Nikolaos Dimou, Michalis Giaourtas, Iasonas Mastorakis, Georgios Ieronymakis, Georgios-Michail Matzouranis, Vassilis Flouris, Nick Kossifidis, Manolis Marazakis, Bhavishya Goel, Madhavan Manivannan, Ahsen Ejaz, Panagiotis Strikos, Mateo Vázquez, Ioannis Sourdis, Pedro Trancoso, Per Stenström, Jens Hagemeyer, Lennart Tigges, Nils Kucza, Jean-Marc Philippe, Ioannis Papaefstathiou:
eProcessor: European, Extendable, Energy-Efficient, Extreme-Scale, Extensible, Processor Ecosystem. CF 2023: 309-314 - [c126]Nadja Ramhöj Holtryd, Madhavan Manivannan, Per Stenström:
SoK: Analysis of Root Causes and Defense Strategies for Attacks on Microarchitectural Optimizations. EuroS&P 2023: 631-650 - [c125]Nadja Ramhöj Holtryd, Madhavan Manivannan, Per Stenström:
SCALE: Secure and Scalable Cache Partitioning. HOST 2023: 68-79 - 2022
- [j66]Petros Voudouris, Per Stenström, Risat Pathan:
Bounding the execution time of parallel applications on unrelated multiprocessors. Real Time Syst. 58(2): 189-232 (2022) - [j65]Muhammad Waqar Azhar, Miquel Pericàs, Per Stenström:
Task-RM: A Resource Manager for Energy Reduction in Task-Parallel Applications under Quality of Service Constraints. ACM Trans. Archit. Code Optim. 19(1): 11:1-11:26 (2022) - [j64]Mehrzad Nejat, Madhavan Manivannan, Miquel Pericàs, Per Stenström:
Cooperative Slack Management: Saving Energy of Multicore Processors by Trading Performance Slack Between QoS-Constrained Applications. ACM Trans. Archit. Code Optim. 19(2): 21:1-21:27 (2022) - [c124]Alexandra Angerd, Angelos Arelakis, Vasilis Spiliopoulos, Erik Sintorn, Per Stenström:
GBDI: Going Beyond Base-Delta-Immediate Compression with Global Bases. HPCA 2022: 1115-1127 - [i5]Nadja Ramhöj Holtryd, Madhavan Manivannan, Per Stenström:
SoK: Analysis of Root Causes and Defense Strategies for Attacks on Microarchitectural Optimizations. CoRR abs/2212.10221 (2022) - 2021
- [j63]Petros Voudouris, Per Stenström, Risat Pathan:
Federated Scheduling of Sporadic DAGs on Unrelated Multiprocessors. ACM Trans. Embed. Comput. Syst. 20(5s): 87:1-87:25 (2021) - [c123]Nadja Ramhöj Holtryd, Madhavan Manivannan, Per Stenström, Miquel Pericàs:
CBP: Coordinated management of cache partitioning, bandwidth partitioning and prefetch throttling. PACT 2021: 213-225 - [i4]Nadja Ramhöj Holtryd, Madhavan Manivannan, Per Stenström, Miquel Pericàs:
CBP: Coordinated management of cache partitioning, bandwidth partitioning and prefetch throttling. CoRR abs/2102.11528 (2021) - 2020
- [j62]Mehrzad Nejat, Madhavan Manivannan, Miquel Pericàs, Per Stenström:
Coordinated management of DVFS and cache partitioning under QoS constraints to save energy in multi-core systems. J. Parallel Distributed Comput. 144: 246-259 (2020) - [c122]Alexandra Angerd, Erik Sintorn, Per Stenström:
A GPU Register File using Static Data Compression. ICPP 2020: 59:1-59:10 - [c121]Nadja Holtryd, Madhavan Manivannan, Per Stenström, Miquel Pericàs:
DELTA: Distributed Locality-Aware Cache Partitioning for Tile-based Chip Multiprocessors. IPDPS 2020: 578-589 - [c120]Mehrzad Nejat, Madhavan Manivannan, Miquel Pericàs, Per Stenström:
Coordinated Management of Processor Configuration and Cache Partitioning to Optimize Energy under QoS Constraints. IPDPS 2020: 590-601 - [i3]Alexandra Angerd, Erik Sintorn, Per Stenström:
A GPU Register File using Static Data Compression. CoRR abs/2006.05693 (2020)
2010 – 2019
- 2019
- [j61]Alba de Melo, Jesús Carretero, Per Stenström, Sanjay Ranka, Eduard Ayguadé:
Trends on heterogeneous and innovative hardware and software systems. J. Parallel Distributed Comput. 133: 362-364 (2019) - [c119]Muhammad Waqar Azhar, Miquel Pericàs, Per Stenström:
SaC: Exploiting Execution-Time Slack to Save Energy in Heterogeneous Multicore Systems. ICPP 2019: 26:1-26:12 - [c118]Mehrzad Nejat, Miquel Pericàs, Per Stenström:
QoS-Driven Coordinated Management of Resources to Save Energy in Multi-core Systems. IPDPS 2019: 303-313 - [e11]Pen-Chung Yew, Per Stenström, Junjie Wu, Xiaoli Gong, Tao Li:
Advanced Parallel Processing Technologies - 13th International Symposium, APPT 2019, Tianjin, China, August 15-16, 2019, Proceedings. Lecture Notes in Computer Science 11719, Springer 2019, ISBN 978-3-030-29610-0 [contents] - [i2]Mehrzad Nejat, Madhavan Manivannan, Miquel Pericàs, Per Stenström:
Coordinated Management of DVFS and Cache Partitioning under QoS Constraints to Save Energy in Multi-Core Systems. CoRR abs/1911.05101 (2019) - [i1]Mehrzad Nejat, Madhavan Manivannan, Miquel Pericàs, Per Stenström:
Coordinated Management of Processor Configuration and Cache Partitioning to Optimize Energy under QoS Constraints. CoRR abs/1911.05114 (2019) - 2018
- [j60]Madhavan Manivannan, Miquel Pericàs, Vassilis Papaefstathiou, Per Stenström:
Global Dead-Block Management for Task-Parallel Programs. ACM Trans. Archit. Code Optim. 15(3): 33:1-33:25 (2018) - [j59]Risat Pathan, Petros Voudouris, Per Stenström:
Scheduling Parallel Real-Time Recurrent Tasks on Multicore Platforms. IEEE Trans. Parallel Distributed Syst. 29(4): 915-928 (2018) - [c117]Dmitry Knyaginin, Vassilis Papaefstathiou, Per Stenström:
ProFess: A Probabilistic Hybrid Main Memory Management Framework for High Performance and Fairness. HPCA 2018: 143-155 - [e10]Skevos Evripidou, Per Stenström, Michael F. P. O'Boyle:
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, PACT 2018, Limassol, Cyprus, November 01-04, 2018. ACM 2018 [contents] - 2017
- [j58]Madhavan Manivannan, Miquel Pericàs, Vassilis Papaefstathiou, Per Stenström:
Runtime-Assisted Global Cache Management for Task-Based Parallel Programs. IEEE Comput. Archit. Lett. 16(2): 145-148 (2017) - [j57]Muhammad Waqar Azhar, Per Stenström, Vassilis Papaefstathiou:
SLOOP: QoS-Supervised Loop Execution to Reduce Energy on Heterogeneous Architectures. ACM Trans. Archit. Code Optim. 14(4): 41:1-41:25 (2017) - [j56]Alexandra Angerd, Erik Sintorn, Per Stenström:
A Framework for Automated and Controlled Floating-Point Accuracy Reduction in Graphics Applications on GPUs. ACM Trans. Archit. Code Optim. 14(4): 46:1-46:25 (2017) - [c116]Dmitry Knyaginin, Per Stenström:
Rock: a framework for pruning the design space of hybrid main memory systems. MEMSYS 2017: 337-347 - [c115]Petros Voudouris, Per Stenström, Risat Pathan:
Timing-Anomaly Free Dynamic Scheduling of Task-Based Parallel Applications. RTAS 2017: 365-376 - 2016
- [j55]Minghua Li, Guancheng Chen, Qijun Wang, Yonghua Lin, H. Peter Hofstee, Per Stenström, Dian Zhou:
PATer: A Hardware Prefetching Automatic Tuner on IBM POWER8 Processor. IEEE Comput. Archit. Lett. 15(1): 37-40 (2016) - [j54]Per Stenström:
2015 Maurice Wilkes Award Given to Christos Kozyrakis. IEEE Micro 36(3): 128-129 (2016) - [c114]Manolis Marazakis, John Goodacre, Didier Fuin, Paul M. Carpenter, John Thomson, Emil Matús, Antimo Bruno, Per Stenström, Jérôme Martin, Yves Durand, Isabelle Dor:
EUROSERVER: Share-anything scale-out micro-server design. DATE 2016: 678-683 - [c113]Madhavan Manivannan, Vassilis Papaefstathiou, Miquel Pericàs, Per Stenström:
RADAR: Runtime-assisted dead region management for last-level caches. HPCA 2016: 644-656 - [c112]Dmitry Knyaginin, Vassilis Papaefstathiou, Per Stenström:
Adaptive Row Addressing for Cost-Efficient Parallel Memory Protocols in Large-Capacity Memories. MEMSYS 2016: 121-132 - [c111]Petros Voudouris, Per Stenström, Risat Pathan:
Timing-anomaly free dynamic scheduling of task-based parallel applications. RTSS 2016: 371 - 2015
- [b1]Somayeh Sardashti, Angelos Arelakis, Per Stenström, David A. Wood:
A Primer on Compression in the Memory Hierarchy. Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers 2015, ISBN 978-3-031-00623-4 - [c110]Jochen Hollmann, J. Rubén Titos Gil, Per Stenström:
Enhancing Garbage Collection Synchronization Using Explicit Bit Barriers. ICPP 2015: 769-778 - [c109]Tobias Fjalling, Per Stenström:
Performance Impact of Batching Web-Application Requests Using Hot-Spot Processing on GPUs. IPDPS 2015: 989-999 - [c108]Angelos Arelakis, Fredrik Dahlgren, Per Stenström:
HyComp: a hybrid cache compression method for selection of data-type-specific compression methods. MICRO 2015: 38-49 - 2014
- [j53]Angelos Arelakis, Per Stenström:
A Case for a Value-Aware Cache. IEEE Comput. Archit. Lett. 13(1): 1-4 (2014) - [j52]M. M. Waliullah, Per Stenström:
Removal of Conflicts in Hardware Transactional Memory Systems. Int. J. Parallel Program. 42(1): 198-218 (2014) - [j51]Viktor K. Prasanna, Yves Robert, Per Stenström:
Introduction to the JPDC special issue on Perspectives on Parallel and Distributed Processing. J. Parallel Distributed Comput. 74(7): 2543 (2014) - [j50]Mafijul Md. Islam, Per Stenström:
Characterizing and Exploiting Small-Value Memory Instructions. IEEE Trans. Computers 63(7): 1640-1655 (2014) - [j49]J. Rubén Titos Gil, Anurag Negi, Manuel E. Acacio, José M. García, Per Stenström:
ZEBRA: Data-Centric Contention Management in Hardware Transactional Memory. IEEE Trans. Parallel Distributed Syst. 25(5): 1359-1369 (2014) - [c107]Per Stenström:
Effective resource management towards efficient computing. DATE 2014: 1 - [c106]Dmitry Knyaginin, Georgi Gaydadjiev, Per Stenström:
Crystal: A Design-Time Resource Partitioning Method for Hybrid Main Memory. ICPP 2014: 90-100 - [c105]Bhavishya Goel, J. Rubén Titos Gil, Anurag Negi, Sally A. McKee, Per Stenström:
Performance and Energy Analysis of the Restricted Transactional Memory Implementation on Haswell. IPDPS 2014: 615-624 - [c104]Madhavan Manivannan, Per Stenström:
Runtime-Guided Cache Coherence Optimizations in Multi-core Architectures. IPDPS 2014: 625-636 - [c103]Angelos Arelakis, Per Stenström:
SC2: A statistical compression cache scheme. ISCA 2014: 145-156 - [c102]Risat Mahmud Pathan, Per Stenström, Lars-Goran Green, Torbjorn Hult, Patrik Sandin:
Overhead-aware temporal partitioning on multicore processors. RTAS 2014: 251-262 - [e9]Arndt Bode, Michael Gerndt, Per Stenström, Lawrence Rauchwerger, Barton P. Miller, Martin Schulz:
2014 International Conference on Supercomputing, ICS'14, Muenchen, Germany, June 10-13, 2014. ACM 2014, ISBN 978-1-4503-2642-1 [contents] - 2013
- [j48]Michael J. Flynn, Oskar Mencer, Veljko M. Milutinovic, Goran Rakocevic, Per Stenström, Roman Trobec, Mateo Valero:
Moving from petaflops to petadata. Commun. ACM 56(5): 39-42 (2013) - [j47]J. Rubén Titos Gil, Anurag Negi, Manuel E. Acacio, José M. García, Per Stenström:
Eager Beats Lazy: Improving Store Management in Eager Hardware Transactional Memory. IEEE Trans. Parallel Distributed Syst. 24(11): 2192-2201 (2013) - [c101]Per Stenström:
Keynote talk: Towards automatic resource management in parallel architectures. PACT 2013: 5 - [c100]Alen Bardizbanyan, Peter Gavin, David B. Whalley, Magnus Själander, Per Larsson-Edefors, Sally A. McKee, Per Stenström:
Improving data access efficiency by using a tagless access buffer (TAB). CGO 2013: 28:1-28:11 - [c99]Adrià Armejach, Anurag Negi, Adrián Cristal, Osman S. Unsal, Per Stenström, Tim Harris:
HARP: Adaptive abort recurrence prediction for Hardware Transactional Memory. HiPC 2013: 196-205 - [c98]Madhavan Manivannan, Anurag Negi, Per Stenström:
Efficient Forwarding of Producer-Consumer Data in Task-Based Programs. ICPP 2013: 517-522 - 2012
- [j46]Per Stenström, Koen De Bosschere:
Introduction to the special issue on high-performance and embedded architectures and compilers. ACM Trans. Archit. Code Optim. 8(4): 18:1-18:2 (2012) - [c97]Anurag Negi, Adrià Armejach, Adrián Cristal, Osman S. Unsal, Per Stenström:
Transactional prefetching: narrowing the window of contention in hardware transactional memory. PACT 2012: 181-190 - [c96]Anurag Negi, J. Rubén Titos Gil, Manuel E. Acacio, José M. García, Per Stenström:
π-TM: Pessimistic invalidation for scalable lazy hardware transactional memory. HPCA 2012: 141-152 - [c95]Guancheng Chen, Per Stenström:
Critical lock analysis: diagnosing critical section bottlenecks in multithreaded applications. SC 2012: 71 - 2011
- [c94]Anurag Negi, Per Stenström, J. Rubén Titos Gil, Manuel E. Acacio, José M. García:
Pi-TM: Pessimistic Invalidation for Scalable Lazy Hardware Transactional Memory. PACT 2011: 203-204 - [c93]Mafijul Md. Islam, Per Stenström:
A unified approach to eliminate memory accesses early. CASES 2011: 55-64 - [c92]Anurag Negi, J. Rubén Titos Gil, Manuel E. Acacio, José M. García, Per Stenström:
Eager Meets Lazy: The Impact of Write-Buffering on Hardware Transactional Memory. ICPP 2011: 73-82 - [c91]Madhavan Manivannan, Ben H. H. Juurlink, Per Stenström:
Implications of Merging Phases on Scalability of Multi-core Architectures. ICPP 2011: 622-631 - [c90]J. Rubén Titos Gil, Anurag Negi, Manuel E. Acacio, José M. García, Per Stenström:
ZEBRA: a data-centric, hybrid-policy hardware transactional memory design. ICS 2011: 53-62 - [c89]Madhavan Manivannan, Ben H. H. Juurlink, Per Stenström:
Poster: implications of merging phases on scalability of multi-core architectures. ICS 2011: 380 - [c88]Anurag Negi, J. Rubén Titos Gil, Manuel E. Acacio, José M. García, Per Stenström:
The Impact of Non-coherent Buffers on Lazy Hardware Transactional Memory Systems. IPDPS Workshops 2011: 700-707 - [c87]Per Stenström, Doug Burger, Wen-mei W. Hwu, Vipin Kumar, Kunle Olukotun, David A. Padua, Burton Smith:
Panel Statement. IPDPS 2011: 877 - [c86]Mridha-Mohammad Waliullah, Per Stenström:
Classification and Elimination of Conflicts in Hardware Transactional Memory Systems. SBAC-PAD 2011: 96-103 - [e8]Per Stenström:
Transactions on High-Performance Embedded Architectures and Compilers III. Lecture Notes in Computer Science 6590, Springer 2011, ISBN 978-3-642-19447-4 [contents] - [e7]Per Stenström:
Transactions on High-Performance Embedded Architectures and Compilers IV. Lecture Notes in Computer Science 6760, Springer 2011, ISBN 978-3-642-24567-1 [contents] - 2010
- [j45]Yehuda Afek, Ulrich Drepper, Pascal Felber, Christof Fetzer, Vincent Gramoli, Michael Hohmuth, Etienne Rivière, Per Stenström, Osman S. Unsal, Walther Maldonado, Derin Harmanci, Patrick Marlier, Stephan Diestelhorst, Martin Pohlack, Adrián Cristal, Ibrahim Hur, Aleksandar Dragojevic, Rachid Guerraoui, Michal Kapalka, Sasa Tomic, Guy Korland, Nir Shavit, Martin Nowack, Torvald Riegel:
The Velox Transactional Memory Stack. IEEE Micro 30(5): 76-87 (2010) - [c85]Mafijul Md. Islam, Per Stenström:
Characterization and exploitation of narrow-width loads: the narrow-width cache approach. CASES 2010: 227-236 - [c84]Anurag Negi, M. M. Waliullah, Per Stenström:
LV*: a class of lazy versioning HTMs for low-cost integration of transactional memory systems. IFMT 2010: 5:1-5:10 - [c83]Anurag Negi, M. M. Waliullah, Per Stenström:
LV*: A low complexity lazy versioning HTM infrastructure. ICSAMOS 2010: 231-240
2000 – 2009
- 2009
- [j44]M. M. Waliullah, Per Stenström:
Schemes for avoiding starvation in transactional memory systems. Concurr. Comput. Pract. Exp. 21(7): 859-873 (2009) - [j43]Per Stenström, David B. Whalley:
Introduction. Trans. High Perform. Embed. Archit. Compil. 2: 3 (2009) - [j42]Martin Thuresson, Magnus Själander, Magnus Björk, Lars J. Svensson, Per Larsson-Edefors, Per Stenström:
FlexCore: Utilizing Exposed Datapath Control for Efficient Computing. J. Signal Process. Syst. 57(1): 5-19 (2009) - [c82]Jochen Hollmann, Per Stenström:
Using Hoarding to Increase Availability in Shared File Systems. ACIS-ICIS 2009: 422-429 - [c81]Md. Mafijul Islam, Per Stenström:
Zero-Value Caches: Cancelling Loads that Return Zero. PACT 2009: 237-245 - [c80]Martin Thuresson, Magnus Själander, Per Stenström:
A Flexible Code Compression Scheme Using Partitioned Look-Up Tables. HiPEAC 2009: 95-109 - [c79]Md. Mafijul Islam, Sally A. McKee, Per Stenström:
Cancellation of loads that return zero using zero-value caches. ICS 2009: 493-494 - [e6]Per Stenström:
Transactions on High-Performance Embedded Architectures and Compilers II. Lecture Notes in Computer Science 5470, Springer 2009, ISBN 978-3-642-00903-7 [contents] - 2008
- [j41]Fredrik Warg, Per Stenström:
Dual-thread Speculation: A Simple Approach to Uncover Thread-level Parallelism on a Simultaneous Multithreaded Processor. Int. J. Parallel Program. 36(2): 166-183 (2008) - [j40]Jaeheon Jeong, Per Stenström, Michel Dubois:
Simple Penalty-Sensitive Cache Replacement Policies. J. Instr. Level Parallelism 10 (2008) - [j39]Md. Mafijul Islam, Magnus Själander, Per Stenström:
Early detection and bypassing of trivial operations to improve energy efficiency of processors. Microprocess. Microsystems 32(4): 183-196 (2008) - [j38]Martin Thuresson, Lawrence Spracklen, Per Stenström:
Memory-Link Compression Schemes: A Value Locality Perspective. IEEE Trans. Computers 57(7): 916-927 (2008) - [j37]Reinhard Wilhelm, Jakob Engblom, Andreas Ermedahl, Niklas Holsti, Stephan Thesing, David B. Whalley, Guillem Bernat, Christian Ferdinand, Reinhold Heckmann, Tulika Mitra, Frank Mueller, Isabelle Puaut, Peter P. Puschner, Jan Staschulat, Per Stenström:
The worst-case execution-time problem - overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7(3): 36:1-36:53 (2008) - [c78]Alessandro Bardine, Manuel Comparetti, Pierfrancesco Foglia, Giacomo Gabrielli, Cosimo Antonio Prete, Per Stenström:
Leveraging Data Promotion for Low Power D-NUCA Caches. DSD 2008: 307-316 - [c77]Martin Thuresson, Per Stenström:
Accommodation of the Bandwidth of Large Cache Blocks Using Cache/Memory Link Compression. ICPP 2008: 478-486 - [c76]M. M. Waliullah, Per Stenström:
Intermediate checkpointing with conflicting access prediction in transactional memory systems. IPDPS 2008: 1-11 - [c75]Mafijul Md. Islam, Per Stenström:
Zero loads: canceling load requests by tracking zero values. MEDEA@PACT 2008: 16-23 - [c74]M. M. Waliullah, Per Stenström:
Efficient management of speculative data in hardware transactional memory systems. ICSAMOS 2008: 158-164 - [e5]Per Stenström, Michel Dubois, Manolis Katevenis, Rajiv Gupta, Theo Ungerer:
High Performance Embedded Architectures and Compilers, Third International Conference, HiPEAC 2008, Göteborg, Sweden, January 27-29, 2008, Proceedings. Lecture Notes in Computer Science 4917, Springer 2008, ISBN 978-3-540-77559-1 [contents] - 2007
- [j36]Jochen Hollmann, Anders Ardö, Per Stenström:
Effectiveness of caching in a distributed digital library system. J. Syst. Archit. 53(7): 403-416 (2007) - [j35]Jianwei Chen, Michel Dubois, Per Stenström:
SimWattch: Integrating Complete-System and User-Level Performance and Power Simulators. IEEE Micro 27(4): 34-48 (2007) - [j34]M. M. Waliullah, Per Stenström:
Starvation-free commit arbitration policies for transactional memory systems. SIGARCH Comput. Archit. News 35(1): 39-46 (2007) - [j33]Haakon Dybdahl, Per Stenström, Lasse Natvig:
An LRU-based replacement algorithm augmented with frequency of access in shared chip-multiprocessor caches. SIGARCH Comput. Archit. News 35(4): 45-52 (2007) - [j32]Alessandro Bardine, Pierfrancesco Foglia, Giacomo Gabrielli, Cosimo Antonio Prete, Per Stenström:
Improving power efficiency of D-NUCA caches. SIGARCH Comput. Archit. News 35(4): 53-58 (2007) - [j31]Koen De Bosschere, Wayne Luk, Xavier Martorell, Nacho Navarro, Michael F. P. O'Boyle, Dionisios N. Pnevmatikatos, Alex Ramírez, Pascal Sainrat, André Seznec, Per Stenström, Olivier Temam:
High-Performance Embedded Architecture and Compilation Roadmap. Trans. High Perform. Embed. Archit. Compil. 1: 5-29 (2007) - [j30]Per Stenström:
Introduction to Part 1. Trans. High Perform. Embed. Archit. Compil. 1: 33 (2007) - [c73]Marco Galluzzi, Enrique Vallejo, Adrián Cristal, Fernando Vallejo, Ramón Beivide, Per Stenström, James E. Smith, Mateo Valero:
Implicit Transactional Memory in Kilo-Instruction Multiprocessors. Asia-Pacific Computer Systems Architecture Conference 2007: 339-353 - [c72]Shekhar Borkar, Norman P. Jouppi, Per Stenström:
Microprocessors in the era of terascale integration. DATE 2007: 237-242 - [c71]M. M. Waliullah, Per Stenström:
Starvation-Free Transactional Memory-System Protocols. Euro-Par 2007: 280-291 - [c70]Haakon Dybdahl, Per Stenström:
An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors. HPCA 2007: 2-12 - [c69]Md. Mafijul Islam, Alexander Busck, Mikael Engbom, Simji Lee, Michel Dubois, Per Stenström:
Loop-level Speculative Parallelism in Embedded Applications. ICPP 2007: 3 - [c68]Per Stenström:
IPDPS Panel: Is the Multi-Core Roadmap going to Live Up to its Promises? IPDPS 2007: 14 - [c67]Ana Bosque, Pablo Ibáñez, Víctor Viñals, Per Stenström, José María Llabería:
Characterization of Apache web server with Specweb2005. MEDEA@PACT 2007: 65-72 - [c66]Martin Thuresson, Magnus Själander, Magnus Björk, Lars J. Svensson, Per Larsson-Edefors, Per Stenström:
FlexCore: Utilizing Exposed Datapath Control for Efficient Computing. ICSAMOS 2007: 18-25 - [c65]Md. Mafijul Islam, Per Stenström:
Energy and Performance Trade-offs between Instruction Reuse and Trivial Computations for Embedded Applications. SIES 2007: 86-93 - [e4]Utpal Banerjee, José Moreira, Michel Dubois, Per Stenström:
Proceedings of the 4th Conference on Computing Frontiers, 2007, Ischia, Italy, May 7-9, 2007. ACM 2007, ISBN 978-1-59593-683-7 [contents] - [e3]Koen De Bosschere, David R. Kaeli, Per Stenström, David B. Whalley, Theo Ungerer:
High Performance Embedded Architectures and Compilers, Second International Conference, HiPEAC 2007, Ghent, Belgium, January 28-30, 2007, Proceedings. Lecture Notes in Computer Science 4367, Springer 2007, ISBN 978-3-540-69337-6 [contents] - [e2]Per Stenström, Michael F. P. O'Boyle, François Bodin, Marcelo Cintra, Sally A. McKee:
Transactions on High-Performance Embedded Architectures and Compilers I. Lecture Notes in Computer Science 4050, Springer 2007, ISBN 978-3-540-71527-6 [contents] - 2006
- [j29]Burkhard Monien, Guang Gao, Horst D. Simon, Paul G. Spirakis, Per Stenström:
Introduction. J. Parallel Distributed Comput. 66(5): 615-616 (2006) - [c64]Haakon Dybdahl, Per Stenström:
Enhancing Last-Level Cache Performance by Block Bypassing and Early Miss Determination. Asia-Pacific Computer Systems Architecture Conference 2006: 52-66 - [c63]Jaeheon Jeong, Per Stenström, Michel Dubois:
Simple penalty-sensitive replacement policies for caches. Conf. Computing Frontiers 2006: 341-352 - [c62]Haakon Dybdahl, Per Stenström, Lasse Natvig:
A Cache-Partitioning Aware Replacement Policy for Chip Multiprocessors. HiPC 2006: 22-34 - [c61]Per Stenström:
Chip-multiprocessing and beyond. HPCA 2006: 109 - [c60]Haakon Dybdahl, Per Stenström, Lasse Natvig:
An LRU-based replacement algorithm augmented with frequency of access in shared chip-multiprocessor caches. MEDEA@PACT 2006: 45-52 - [c59]Md. Mafijul Islam, Per Stenström:
Reduction of Energy Consumption in Processors by Early Detection and Bypassing of Trivial Operations. ICSAMOS 2006: 28-34 - [c58]Fredrik Warg, Per Stenström:
Dual-Thread Speculation: Two Threads in the Machine are Worth Eight in the Bush. SBAC-PAD 2006: 91-98 - [c57]Martin Thuresson, Per Stenström:
Scalable Value-Cache Based Compression Schemes for Multiprocessors. SBAC-PAD 2006: 117-124 - 2005
- [j28]Frank Mueller, Per Stenström:
Introduction to the special issue. ACM Trans. Embed. Comput. Syst. 4(1): 1-2 (2005) - [c56]Martin Thuresson, Per Stenström:
Evaluation of extended dictionary-based static code compression schemes. Conf. Computing Frontiers 2005: 77-86 - [c55]Fredrik Warg, Per Stenström:
Reducing misspeculation overhead for module-level speculative execution. Conf. Computing Frontiers 2005: 289-298 - [c54]Per Stenström:
The Chip-Multiprocessing Paradigm Shift: Opportunities and Challenges. HiPEAC 2005: 5 - [c53]Enrique Vallejo, Marco Galluzzi, Adrián Cristal, Fernando Vallejo, Ramón Beivide, Per Stenström, James E. Smith, Mateo Valero:
Implementing Kilo-Instruction Multiprocessors. ICPS 2005: 325-336 - [c52]Magnus Ekman, Per Stenström:
A Cost-Effective Main Memory Organization for Future Servers. IPDPS 2005 - [c51]Magnus Ekman, Per Stenström:
A Robust Main-Memory Compression Scheme. ISCA 2005: 74-85 - [c50]Magnus Ekman, Per Stenström:
Enhancing Multiprocessor Architecture Simulation Speed Using Matched-Pair Comparison. ISPASS 2005: 89-99 - 2004
- [j27]Håkan Grahn, Per Stenström:
A comparative evaluation of hardware-only and software-only directory protocols in shared-memory multiprocessors. J. Syst. Archit. 50(9): 537-561 (2004) - [j26]Jonas Jalminger, Per Stenström:
A cache block reuse prediction scheme. Microprocess. Microsystems 28(7): 373-385 (2004) - [c49]Martin Kämpe, Per Stenström, Michel Dubois:
Self-correcting LRU replacement policies. Conf. Computing Frontiers 2004: 181-191 - [c48]Magnus Ekman, Per Stenström:
A case for multi-level main memory. WMPI 2004: 1-8 - 2003
- [c47]Jochen Hollmann, Anders Ardö, Per Stenström:
An Evaluation of Document Prefetching in a Distributed Digital Library. ECDL 2003: 276-287 - [c46]Per Stenström:
One Chip, One Server: How Do We Exploit Its Power? HiPC 2003: 405 - [c45]Jonas Jalminger, Per Stenström:
A Novel Approach to Cache Block Reuse Predictions. ICPP 2003: 294- - [c44]Magnus Ekman, Per Stenström:
Performance and Power Impact of Issue-width in Chip-Multiprocessor Cores. ICPP 2003: 359-368 - [c43]Jim Nilsson, Anders Landin, Per Stenström:
The Coherence Predictor Cache: A Resource-Efficient and Accurate Coherence Prediction Infrastructure. IPDPS 2003: 10 - [c42]Peter Rundberg, Per Stenström:
Speculative Lock Reordering: Optimistic Out-of-Order Execution of Critical Sections. IPDPS 2003: 11 - [c41]Fredrik Warg, Per Stenström:
Improving Speculative Thread-Level Parallelism Through Module Run-Length Prediction. IPDPS 2003: 12 - [c40]Jianwei Chen, Michel Dubois, Per Stenström:
Integrating complete-system and user-level performance/power simulators: the SimWattch approach. ISPASS 2003: 1-10 - 2002
- [j25]Jonas Jalminger, Per Stenström:
Improvement of energy-efficiency in off-chip caches by selective prefetching. Microprocess. Microsystems 26(3): 107-121 (2002) - [c39]Martin Kämpe, Per Stenström, Michel Dubois:
The FAB Predictor: Using Fourier Analysis to Predict the Outcome of Conditional Branches. HPCA 2002: 223-232 - [c38]Jochen Hollmann, Anders Ardö, Per Stenström:
Empirical Observations Regarding Predictability in User Access-Behavior in a Distributed Digital Library System. IPDPS 2002 - [c37]Magnus Ekman, Per Stenström, Fredrik Dahlgren:
TLB and snoop energy-reduction using virtual caches in low-power chip-multiprocessors. ISLPED 2002: 243-246 - 2001
- [j24]Peter Rundberg, Per Stenström:
An All-Software Thread-Level Data Dependence Speculation System for Multiprocessors. J. Instr. Level Parallelism 3 (2001) - [c36]Fredrik Warg, Per Stenström:
Limits on Speculative Module-Level Parallelism in Imperative and Object-Oriented Programs on CMP Platforms. IEEE PACT 2001: 221-230 - [c35]Ulf Assarsson, Per Stenström:
A Case Study of Load Distribution in Parallel View Frustum Culling and Collision Detection. Euro-Par 2001: 663-673 - [e1]Per Stenström:
Proceedings of the 28th Annual International Symposium on Computer Architecture, ISCA 2001, Göteborg, Sweden, June 30-July 4, 2001. ACM 2001, ISBN 0-7695-1162-7 [contents] - 2000
- [j23]Per Stenström, Erik Hagersten, David J. Lilja, Margaret Martonosi, Madan Venugopal:
Shared-memory multiprocessing: Current state and future directions. Adv. Comput. 53: 1-53 (2000) - [j22]Håkan Grahn, Per Stenström:
Comparative Evaluation of Latency-Tolerating and -Reducing Techniques for Hardware-Only and Software-Only Directory Protocols. J. Parallel Distributed Comput. 60(7): 807-834 (2000) - [c34]Silvia M. Müller, Per Stenström, Mateo Valero, Stamatis Vassiliadis:
Parallel Computer Architecture. Euro-Par 2000: 537-538 - [c33]Magnus Karlsson, Fredrik Dahlgren, Per Stenström:
A Prefetching Technique for Irregular Accesses to Linked Data Structures. HPCA 2000: 206-217 - [c32]Ashley Saulsbury, Fredrik Dahlgren, Per Stenström:
Recency-based TLB preloading. ISCA 2000: 117-127 - [c31]Magnus Karlsson, Per Stenström:
An analytical model of the working-set sizes in decision-support systems. SIGMETRICS 2000: 275-285
1990 – 1999
- 1999
- [j21]Jonas Skeppstedt, Fredrik Dahlgren, Per Stenström:
Evaluation of Compiler-Controlled Updating to Reduce Coherence-Miss Penalties in Shared-Memory Multiprocessors. J. Parallel Distributed Comput. 56(2): 122-143 (1999) - [j20]Veljko Milutinovic, Per Stenström:
Special Issue On Distributed Shared Memory Systems. Proc. IEEE 87(3): 399-404 (1999) - [j19]Thomas Lundqvist, Per Stenström:
An Integrated Path and Timing Analysis Method based on Cycle-Level Symbolic Execution. Real Time Syst. 17(2-3): 183-207 (1999) - [c30]Thomas Lundqvist, Per Stenström:
A Method to Improve the Estimated Worst-Case Performance of Data Caching. RTCSA 1999: 255-262 - [c29]Thomas Lundqvist, Per Stenström:
Timing Anomalies in Dynamically Scheduled Microprocessors. RTSS 1999: 12-21 - 1998
- [j18]Fredrik Dahlgren, Jonas Skeppstedt, Per Stenström:
An evaluation of hardware-based and compiler-controlled optimizations of snooping cache protocols. Future Gener. Comput. Syst. 13(6): 469-487 (1998) - [j17]Fredrik Dahlgren, Michel Dubois, Per Stenström:
Performance Evaluation and Cost Analysis of Cache Protocol Extensions for Shared-Memory Multiprocessors. IEEE Trans. Computers 47(10): 1041-1055 (1998) - [c28]Thomas Lundqvist, Per Stenström:
Integrating Path and Timing Analysis Using Instruction-Level Simulation Techniques. LCTES 1998: 1-15 - [c27]Peter S. Magnusson, Fredrik Larsson, Andreas Moestedt, Bengt Werner, Jim Nilsson, Per Stenström, Fredrik Lundholm, Magnus Karlsson, Fredrik Dahlgren, Håkan Grahn:
SimICS/Sun4m: A Virtual Workstation. USENIX ATC 1998 - [c26]Per Stenström, Fredrik Dahlgren:
A holistic approach to computer system design education based on system simulation techniques. WCAE@ISCA 1998: 13 - 1997
- [j16]Fredrik Dahlgren, Per Stenström, Mårten Björkman:
Reducing the Read-Miss Penalty for Flat COMA Protocols. Comput. J. 40(4): 208-219 (1997) - [j15]Per Stenström, Mats Brorsson, Fredrik Dahlgren, Håkan Grahn, Michel Dubois:
Boosting the Performance of Shared Memory Multiprocessors. Computer 30(7): 63-70 (1997) - [j14]Per Stenström, Erik Hagersten, David J. Lilja, Margaret Martonosi, Madan Venugopal:
Trends in Shared Memory Multiprocessing. Computer 30(12): 44-50 (1997) - [j13]Magnus Karlsson, Per Stenström:
Effectiveness of Dynamic Prefetching in Multiple-Writer Distributed Virtual Shared-Memory Systems. J. Parallel Distributed Comput. 43(2): 79-93 (1997) - [c25]Per Stenström, Jonas Skeppstedt:
A Performance Tuning Approach for Shared-Memory Multiprocessors. Euro-Par 1997: 72-83 - [c24]Håkan Grahn, Per Stenström:
Relative Performance of Hardware and Software-Only Directory Protocols Under Latency Tolerating and Reducing Techniques. IPPS 1997: 500- - 1996
- [j12]Per Stenström, Fredrik Dahlgren:
Applications for Shared Memory Multiprocessors (Guest Editors' Introduction). Computer 29(12): 29-31 (1996) - [j11]Håkan Grahn, Per Stenström:
Evaluation of a Competitive-Update Cache Coherence Protocol with Migratory Data Detection. J. Parallel Distributed Comput. 39(2): 168-180 (1996) - [j10]Per Stenström, Magnus Balldin, Jonas Skeppstedt:
The design of a non-blocking load processor architecture. Microprocess. Microsystems 20(2): 111-123 (1996) - [j9]Mats Brorsson, Per Stenström:
Characterising and Modelling Shared Memory Accesses in Multiprocessor Programs. Parallel Comput. 22(6): 869-893 (1996) - [j8]Jonas Skeppstedt, Per Stenström:
Using Dataflow Analysis Techniques to Reduce Ownership Overhead in Cache Coherence Protocols. ACM Trans. Program. Lang. Syst. 18(6): 659-682 (1996) - [j7]Fredrik Dahlgren, Per Stenström:
Evaluation of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors. IEEE Trans. Parallel Distributed Syst. 7(4): 385-398 (1996) - [c23]Magnus Karlsson, Per Stenström:
Performance Evaluation of a Cluster-Based Multiprocessor Built from ATM Switches and Bus-Based Multiprocessor Servers. HPCA 1996: 4-13 - 1995
- [j6]Håkan Grahn, Per Stenström, Michel Dubois:
Implementation and evaluation of update-based cache protocols under relaxed memory consistency models. Future Gener. Comput. Syst. 11(3): 247-271 (1995) - [j5]Fredrik Dahlgren, Per Stenström:
Using Write Caches to Improve Performance of Cache Coherence Protocols in Shared-Memory Multiprocessors. J. Parallel Distributed Comput. 26(2): 193-210 (1995) - [j4]Michel Dubois, Jonas Skeppstedt, Per Stenström:
Essential Misses and Data Traffic in Coherence Protocols. J. Parallel Distributed Comput. 29(2): 108-125 (1995) - [j3]Fredrik Dahlgren, Michel Dubois, Per Stenström:
Sequential Hardware Prefetching in Shared-Memory Multiprocessors. IEEE Trans. Parallel Distributed Syst. 6(7): 733-746 (1995) - [c22]Jonas Skeppstedt, Per Stenström:
A compiler algorithm that reduces read latency in ownership-based cache coherence protocols. PACT 1995: 69-78 - [c21]Mårten Björkman, Fredrik Dahlgren, Per Stenström:
Using hints to reduce the read miss penalty for flat COMA protocols. HICSS (1) 1995: 242-251 - [c20]Fredrik Dahlgren, Per Stenström:
Effectiveness of Hardware-Based Stride and Sequential Prefetching in Shared-Memory Multiprocessors. HPCA 1995: 68-77 - [c19]Håkan Grahn, Per Stenström:
Efficient Strategies for Software-Only Protocols in Shared-Memory Multiprocessors. ISCA 1995: 38-47 - 1994
- [c18]Jonas Skeppstedt, Per Stenström:
Simple Compiler Algorithms to Reduce Ownership Overhead in Cache Coherence Protocols. ASPLOS 1994: 286-296 - [c17]Per Stenström:
Introduction. HICSS (1) 1994: 520-521 - [c16]Fong Pong, Per Stenström, Michel Dubois:
An Integrated Methodology for the Verification of Directory-Based Cache Protocols. ICPP (1) 1994: 158-165 - [c15]Fredrik Dahlgren, Per Stenström:
Reducing the Write Traffic for a Hybrid Cache Protocol. ICPP (1) 1994: 166-173 - [c14]Fredrik Dahlgren, Michel Dubois, Per Stenström:
Combined Performance Gains of Simple Cache Protocol Extensions. ISCA 1994: 187-197 - [c13]Håkan Nilsson, Per Stenström:
An Adaptive Update-Based Cache Coherence Protocol for Reduction of Miss Rate and Traffic. PARLE 1994: 363-374 - [c12]Mats Brorsson, Per Stenström:
Modelling accesses to migratory and producer-consumer characterised data in a shared memory multiprocessor. SPDP 1994: 612-619 - 1993
- [c11]Mats Brorsson, Fredrik Dahlgren, Håkan Nilsson, Per Stenström:
The Cachemire Test Bench - A Flexible And Effective Approach For Simulation Of Multiprocessors. Annual Simulation Symposium 1993: 41-49 - [c10]Fredrik Dahlgren, Michel Dubois, Per Stenström:
Fixed and Adaptive Sequential Prefetching in Shared Memory Multiprocessors. ICPP (1) 1993: 56-63 - [c9]Michel Dubois, Jonas Skeppstedt, Livio Ricciulli, Krishnan Ramamurthy, Per Stenström:
The Detection and Elimination of Useless Misses in Multiprocessors. ISCA 1993: 88-97 - [c8]Per Stenström, Mats Brorsson, Lars Sandberg:
An Adaptive Cache Coherence Protocol Optimized for Migratory Sharing. ISCA 1993: 109-118 - 1992
- [c7]Per Stenström:
A Latency-Hiding Scheme for Multiprocessors with Buffered Multistage Networks. IPPS 1992: 39-42 - [c6]Per Stenström, Truman Joe, Anoop Gupta:
Comparative Performance Evaluation of Cache-Coherent NUMA and COMA Architectures. ISCA 1992: 80-91 - [c5]Håkan Nilsson, Per Stenström:
The Scalable Tree Protocol - A Cache Coherence Approach for Large-Scale Multiprocessors. SPDP 1992: 498-506 - 1991
- [c4]Per Stenström, Fredrik Dahlgren, Lars Lundberg:
A Lockup-Free Multiprocessor Cache Design. ICPP (1) 1991: 246-250 - [c3]Fredrik Dahlgren, Per Stenström:
On Reconfigurable On-Chip Data Caches. MICRO 1991: 189-198 - 1990
- [j2]Per Stenström:
A Survey of Cache Coherence Schemes for Multiprocessors. Computer 23(6): 12-24 (1990)
1980 – 1989
- 1989
- [c2]Per Stenström:
A Cache Consistency Protocol for Multiprocessors with Multistage Networks. ISCA 1989: 407-415 - 1988
- [j1]Per Stenström:
Reducing Contention in Shared-Memory Multiprocessors. Computer 21(11): 26-37 (1988) - 1987
- [c1]Per Stenström, Lars H. Philipson:
A Layered Emulator for Design Evaluation of MIMD Multiprocessors with Shared Memory. PARLE (1) 1987: 329-344
last updated on 2025-01-13 02:05 CET by the dblp team
all metadata released as open data under CC0 1.0 license