Naoya Maruyama
2020 – today
- 2021
- [j6] Francis J. Alexander, James A. Ang, Jenna A. Bilbrey, Jan Balewski, Tiernan Casey, Ryan Chard, Jong Choi, Sutanay Choudhury, Bert J. Debusschere, Anthony M. DeGennaro, Nikoli Dryden, J. Austin Ellis, Ian T. Foster, Cristina Garcia-Cardona, Sayan Ghosh, Peter Harrington, Yunzhi Huang, Shantenu Jha, Travis Johnston, Ai Kagawa, Ramakrishnan Kannan, Neeraj Kumar, Zhengchun Liu, Naoya Maruyama, Satoshi Matsuoka, Erin McCarthy, Jamaludin Mohd-Yusof, Peter Nugent, Yosuke Oyama, Thomas Proffen, David Pugmire, Sivasankaran Rajamanickam, Vinay Ramakrishnaiah, Malachi Schram, Sudip K. Seal, Ganesh Sivaraman, Christine Sweeney, Li Tan, Rajeev Thakur, Brian Van Essen, Logan T. Ward, Paul M. Welch, Michael Wolf, Sotiris S. Xantheas, Kevin G. Yager, Shinjae Yoo, Byung-Jun Yoon:
  Co-design Center for Exascale Machine Learning Technologies (ExaLearn). Int. J. High Perform. Comput. Appl. 35(6): 598-616 (2021)
- [j5] Yosuke Oyama, Naoya Maruyama, Nikoli Dryden, Erin McCarthy, Peter Harrington, Jan Balewski, Satoshi Matsuoka, Peter Nugent, Brian Van Essen:
  The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs With Hybrid Parallelism. IEEE Trans. Parallel Distributed Syst. 32(7): 1641-1652 (2021)
- 2020
- [p1] Julian M. Kunkel, Nabeeh Jumah, Anastasiia Novikova, Thomas Ludwig, Hisashi Yashiro, Naoya Maruyama, Mohamed Wahib, John Thuburn:
  AIMES: Advanced Computation and I/O Methods for Earth-System Simulations. Software for Exascale Computing 2020: 61-102
- [i3] Yosuke Oyama, Naoya Maruyama, Nikoli Dryden, Erin McCarthy, Peter Harrington, Jan Balewski, Satoshi Matsuoka, Peter Nugent, Brian Van Essen:
  The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism. CoRR abs/2007.12856 (2020)
2010 – 2019
- 2019
- [c58] Nikoli Dryden, Naoya Maruyama, Tom Benson, Tim Moon, Marc Snir, Brian Van Essen:
  Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism. IPDPS 2019: 210-220
- [c57] Nikoli Dryden, Naoya Maruyama, Tim Moon, Tom Benson, Marc Snir, Brian Van Essen:
  Channel and filter parallelism for large-scale CNN training. SC 2019: 10:1-10:20
- [c56] Ian Karlin, Yoonho Park, Bronis R. de Supinski, Peng Wang, Bert Still, David Beckingsale, Robert Blake, Tong Chen, Guojing Cong, Carlos H. A. Costa, Johann Dahm, Giacomo Domeniconi, Thomas Epperly, Aaron Fisher, Sara Kokkila Schumacher, Steven H. Langer, Hai Le, Eun Kyung Lee, Naoya Maruyama, Xinyu Que, David F. Richards, Björn Sjögreen, Jonathan Wong, Carol S. Woodward, Ulrike Meier Yang, Xiaohua Zhang, Bob Anderson, David Appelhans, Levi Barnes, Peter D. Barnes Jr., Sorin Bastea, David Böhme, Jamie A. Bramwell, James M. Brase, José R. Brunheroto, Barry Chen, Charway R. Cooper, Tony Degroot, Robert D. Falgout, Todd Gamblin, David J. Gardner, James N. Glosli, John A. Gunnels, Max P. Katz, Tzanio V. Kolev, I-Feng W. Kuo, Matthew P. LeGendre, Ruipeng Li, Pei-Hung Lin, Shelby Lockhart, Kathleen McCandless, Claudia Misale, Jaime H. Moreno, Rob Neely, Jarom Nelson, Rao Nimmakayala, Kathryn M. O'Brien, Kevin O'Brien, Ramesh Pankajakshan, Roger Pearce, Slaven Peles, Phil Regier, Steven C. Rennich, Martin Schulz, Howard Scott, James C. Sexton, Kathleen Shoga, Shiv Sundram, Guillaume Thomas-Collignon, Brian Van Essen, Alexey Voronin, Bob Walkup, Lu Wang, Chris Ward, Hui-Fang Wen, Daniel A. White, Christopher Young, Cyril Zeller, Edward Zywicz:
  Preparation and optimization of a diverse workload for a large-scale heterogeneous system. SC 2019: 32:1-32:17
- [i2] Nikoli Dryden, Naoya Maruyama, Tom Benson, Tim Moon, Marc Snir, Brian Van Essen:
  Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism. CoRR abs/1903.06681 (2019)
- 2018
- [c55] Motohiko Matsuda, Keisuke Fukuda, Naoya Maruyama:
  A Portability Layer of an All-pairs Operation for Hierarchical N-Body Algorithm Framework Tapas. HPC Asia 2018: 241-250
- [c54] Shin'ichiro Takizawa, Motohiko Matsuda, Naoya Maruyama, Yoshifumi Nakamura:
  A Scalable Multi-Granular Data Model for Data Parallel Workflows. HPC Asia 2018: 251-260
- [c53] Md. Zahangir Alom, Adam T. Moody, Naoya Maruyama, Brian C. Van Essen, Tarek M. Taha:
  Effective Quantization Approaches for Recurrent Neural Networks. IJCNN 2018: 1-8
- [i1] Md. Zahangir Alom, Adam T. Moody, Naoya Maruyama, Brian C. Van Essen, Tarek M. Taha:
  Effective Quantization Approaches for Recurrent Neural Networks. CoRR abs/1802.02615 (2018)
- 2017
- [j4] Koji Ueno, Toyotaro Suzumura, Naoya Maruyama, Katsuki Fujisawa, Satoshi Matsuoka:
  Efficient Breadth-First Search on Massively Parallel and Distributed-Memory Machines. Data Sci. Eng. 2(1): 22-35 (2017)
- [j3] Didem Unat, Anshu Dubey, Torsten Hoefler, John Shalf, Mark James Abraham, Mauro Bianco, Bradford L. Chamberlain, Romain Cledat, H. Carter Edwards, Hal Finkel, Karl Fuerlinger, Frank Hannig, Emmanuel Jeannot, Amir Kamil, Jeff Keasler, Paul H. J. Kelly, Vitus J. Leung, Hatem Ltaief, Naoya Maruyama, Chris J. Newburn, Miquel Pericàs:
  Trends in Data Locality Abstractions for HPC Systems. IEEE Trans. Parallel Distributed Syst. 28(10): 3007-3020 (2017)
- [c52] Artur Podobas, Hamid Reza Zohouri, Naoya Maruyama, Satoshi Matsuoka:
  Evaluating high-level design strategies on FPGAs for high-performance computing. FPL 2017: 1-4
- [c51] Artur Podobas, Hamid Reza Zohouri, Naoya Maruyama, Satoshi Matsuoka:
  Evaluating high-level design strategies on FPGAs for high-performance computing. FPL 2017: 1-4
- [c50] James Lin, Zhigeng Xu, Akira Nukada, Naoya Maruyama, Satoshi Matsuoka:
  Optimizations of Two Compute-Bound Scientific Kernels on the SW26010 Many-Core Processor. ICPP 2017: 432-441
- 2016
- [j2] Kiyoshi Kumahata, Kazuo Minami, Naoya Maruyama:
  High-performance conjugate gradient performance improvement on the K computer. Int. J. High Perform. Comput. Appl. 30(1): 55-70 (2016)
- [c49] Koji Ueno, Toyotaro Suzumura, Naoya Maruyama, Katsuki Fujisawa, Satoshi Matsuoka:
  Extreme scale breadth-first search on supercomputers. IEEE BigData 2016: 1040-1047
- [c48] Satoshi Matsuoka, Hideharu Amano, Kengo Nakajima, Koji Inoue, Tomohiro Kudoh, Naoya Maruyama, Kenjiro Taura, Takeshi Iwashita, Takahiro Katagiri, Toshihiro Hanawa, Toshio Endo:
  From FLOPS to BYTES: disruptive change in high-performance computing towards the post-moore era. Conf. Computing Frontiers 2016: 274-281
- [c47] Tetsuya Hoshino, Naoya Maruyama, Satoshi Matsuoka:
  A Directive-Based Data Layout Abstraction for Performance Portability of OpenACC Applications. HPCC/SmartCity/DSS 2016: 1147-1154
- [c46] Keisuke Fukuda, Motohiko Matsuda, Naoya Maruyama, Rio Yokota, Kenjiro Taura, Satoshi Matsuoka:
  Tapas: An Implicitly Parallel Programming Framework for Hierarchical N-Body Algorithms. ICPADS 2016: 1100-1109
- [c45] Abdelhalim Amer, Satoshi Matsuoka, Miquel Pericàs, Naoya Maruyama, Kenjiro Taura, Rio Yokota, Pavan Balaji:
  Scaling FMM with Data-Driven OpenMP Tasks on Multicore Architectures. IWOMP 2016: 156-170
- [c44] Hamid Reza Zohouri, Naoya Maruyama, Aaron Smith, Motohiko Matsuda, Satoshi Matsuoka:
  Evaluating and optimizing OpenCL kernels for high performance computing with FPGAs. SC 2016: 409-420
- [c43] Mohamed Wahib, Naoya Maruyama, Takayuki Aoki:
  Daino: a high-level framework for parallel and efficient AMR on GPUs. SC 2016: 621-632
- [e1] Naoya Maruyama, Bronis R. de Supinski, Mohamed Wahib:
  OpenMP: Memory, Devices, and Tasks - 12th International Workshop on OpenMP, IWOMP 2016, Nara, Japan, October 5-7, 2016, Proceedings. Lecture Notes in Computer Science 9903, 2016, ISBN 978-3-319-45549-5
- 2015
- [c42] Mohamed Wahib, Naoya Maruyama:
  Automated GPU Kernel Transformations in Large-Scale Production Stencil Applications. HPDC 2015: 259-270
- [c41] Naoya Maruyama:
  PDSEC Keynote. IPDPS Workshops 2015: 921
- [c40] Mohamed Wahib, Naoya Maruyama:
  Data-centric GPU-based adaptive mesh refinement. IA3@SC 2015: 3:1-3:7
- 2014
- [c39] Kento Sato, Kathryn M. Mohror, Adam Moody, Todd Gamblin, Bronis R. de Supinski, Naoya Maruyama, Satoshi Matsuoka:
  A User-Level InfiniBand-Based File System and Checkpoint Strategy for Burst Buffers. CCGRID 2014: 21-30
- [c38] Kento Sato, Adam Moody, Kathryn M. Mohror, Todd Gamblin, Bronis R. de Supinski, Naoya Maruyama, Satoshi Matsuoka:
  FMI: Fault Tolerant Messaging Interface for Fast and Transparent Recovery. IPDPS 2014: 1225-1234
- [c37] Motohiko Matsuda, Shin'ichiro Takizawa, Naoya Maruyama:
  Evaluation of Asynchronous MPI Communication in Map-Reduce System on the K Computer. EuroMPI/ASIA 2014: 163
- [c36] Tetsuya Hoshino, Naoya Maruyama, Satoshi Matsuoka:
  An OpenACC extension for data layout transformation. WACCPD@SC 2014: 12-18
- [c35] Mohamed Wahib, Naoya Maruyama:
  Scalable Kernel Fusion for Memory-Bound GPU Applications. SC 2014: 191-202
- 2013
- [c34] Tetsuya Hoshino, Naoya Maruyama, Satoshi Matsuoka, Ryoji Takaki:
  CUDA vs OpenACC: Performance Case Studies with Kernel Benchmarks and a Memory-Bound CFD Application. CCGRID 2013: 136-143
- [c33] Motohiko Matsuda, Naoya Maruyama, Shin'ichiro Takizawa:
  K MapReduce: A scalable tool for data-processing and search/ensemble applications on large-scale supercomputers. CLUSTER 2013: 1-8
- [c32] Mohamed Wahib, Naoya Maruyama:
  Highly optimized full GPU-acceleration of non-hydrostatic weather model SCALE-LES. CLUSTER 2013: 1-8
- [c31] Naoya Maruyama, Leif Kobbelt, Pavan Balaji, Nikola Puzovic, Samuel Thibault, Kun Zhou:
  Topic 15: GPU and Accelerator Computing - (Introduction). Euro-Par 2013: 800
- [c30] Toshiya Komoda, Shinobu Miwa, Hiroshi Nakamura, Naoya Maruyama:
  Integrating Multi-GPU Execution in an OpenACC Compiler. ICPP 2013: 260-269
- [c29] Mohamed-Slim Bouguerra, Ana Gainaru, Leonardo Arturo Bautista-Gomez, Franck Cappello, Satoshi Matsuoka, Naoya Maruyama:
  Improving the Computing Efficiency of HPC Systems Using a Combination of Proactive and Preventive Checkpointing. IPDPS 2013: 501-512
- [c28] Abdelhalim Amer, Naoya Maruyama, Miquel Pericàs, Kenjiro Taura, Rio Yokota, Satoshi Matsuoka:
  Fork-Join and Data-Driven Execution Models on Multi-core Architectures: Case Study of the FMM. ISC 2013: 255-266
- 2012
- [c27] Akihiro Nomura, Yutaka Ishikawa, Naoya Maruyama, Satoshi Matsuoka:
  Design and Implementation of Portable and Efficient Non-blocking Collective Communication. CCGRID 2012: 1-8
- [c26] Leonardo Arturo Bautista-Gomez, Thomas Ropars, Naoya Maruyama, Franck Cappello, Satoshi Matsuoka:
  Hierarchical Clustering Strategies for Fault Tolerance in Large Scale HPC Systems. CLUSTER 2012: 355-363
- [c25] Irina Demeshko, Naoya Maruyama, Hirofumi Tomita, Satoshi Matsuoka:
  Multi-GPU Implementation of the NICAM Atmospheric Model. Euro-Par Workshops 2012: 175-184
- [c24] Leonardo Arturo Bautista-Gomez, Bogdan Nicolae, Naoya Maruyama, Franck Cappello, Satoshi Matsuoka:
  Scalable Reed-Solomon-Based Reliable Local Storage for HPC Applications on IaaS Clouds. Euro-Par 2012: 313-324
- [c23] Aleksandr Drozd, Naoya Maruyama, Satoshi Matsuoka:
  Sequence Alignment on Massively Parallel Heterogeneous Systems. IPDPS Workshops 2012: 2498-2501
- [c22] Kento Sato, Naoya Maruyama, Kathryn M. Mohror, Adam Moody, Todd Gamblin, Bronis R. de Supinski, Satoshi Matsuoka:
  Design and modeling of a non-blocking checkpointing system. SC 2012: 19
- [c21] Kenjiro Taura, Jun Nakashima, Rio Yokota, Naoya Maruyama:
  A Task Parallel Implementation of Fast Multipole Methods. SC Companion 2012: 617-625
- [c20] Aleksandr Drozd, Naoya Maruyama, Satoshi Matsuoka:
  A Multi GPU Read Alignment Algorithm with Model-Based Performance Optimization. VECPAR 2012: 270-277
- 2011
- [c19] Takashi Shimokawabe, Takayuki Aoki, Tomohiro Takaki, Toshio Endo, Akinori Yamanaka, Naoya Maruyama, Akira Nukada, Satoshi Matsuoka:
  Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer. SC 2011: 3:1-3:11
- [c18] Naoya Maruyama, Tatsuo Nomura, Kento Sato, Satoshi Matsuoka:
  Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers. SC 2011: 11:1-11:12
- [c17] Aleksandr Drozd, Naoya Maruyama, Satoshi Matsuoka:
  Poster: fast GPU read alignment with burrows wheeler transform based index. SC Companion 2011: 21-22
- [c16] Leonardo Arturo Bautista-Gomez, Seiji Tsuboi, Dimitri Komatitsch, Franck Cappello, Naoya Maruyama, Satoshi Matsuoka:
  FTI: high performance fault tolerance interface for hybrid systems. SC 2011: 32:1-32:32
- [c15] Mark Silberstein, Naoya Maruyama:
  An exact algorithm for energy-efficient acceleration of task trees on CPU/GPU architectures. SYSTOR 2011: 7
- 2010
- [j1] Naoya Maruyama, Satoshi Matsuoka:
  Model-based Fault Localization: Finding Behavioral Outliers in Large-scale Computing Systems. New Gener. Comput. 28(3): 237-255 (2010)
- [c14] Leonardo Arturo Bautista-Gomez, Naoya Maruyama, Franck Cappello, Satoshi Matsuoka:
  Distributed Diskless Checkpoint for Large Scale Systems. CCGRID 2010: 63-72
- [c13] Hitoshi Nagasaka, Naoya Maruyama, Akira Nukada, Toshio Endo, Satoshi Matsuoka:
  Statistical power modeling of GPU kernels using performance counters. Green Computing Conference 2010: 115-122
- [c12] Leonardo Arturo Bautista-Gomez, Akira Nukada, Naoya Maruyama, Franck Cappello, Satoshi Matsuoka:
  Low-overhead diskless checkpoint for hybrid computing systems. HiPC 2010: 1-10
- [c11] Toshio Endo, Akira Nukada, Satoshi Matsuoka, Naoya Maruyama:
  Linpack evaluation on a supercomputer with heterogeneous accelerators. IPDPS 2010: 1-8
- [c10] Naoya Maruyama, Akira Nukada, Satoshi Matsuoka:
  A high-performance fault-tolerant software framework for memory on commodity GPUs. IPDPS 2010: 1-12
- [c9] Takashi Shimokawabe, Takayuki Aoki, Chiashi Muroi, Junichi Ishida, Kohei Kawano, Toshio Endo, Akira Nukada, Naoya Maruyama, Satoshi Matsuoka:
  An 80-Fold Speedup, 15.0 TFlops Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code. SC 2010: 1-11
2000 – 2009
- 2009
- [c8] Sumeth Lerthirunwong, Naoya Maruyama, Satoshi Matsuoka:
  Adaptive Resource Indexing Technique for Unstructured Peer-to-Peer Networks. CCGRID 2009: 172-179
- 2008
- [c7] Hitoshi Sato, Satoshi Matsuoka, Toshio Endo, Naoya Maruyama:
  Access-pattern and bandwidth aware file replication algorithm in a grid environment. GRID 2008: 250-257
- [c6] Naoya Maruyama, Satoshi Matsuoka:
  Model-based fault localization in large-scale computing systems. IPDPS 2008: 1-12
- [c5] Yasuhiko Ogata, Toshio Endo, Naoya Maruyama, Satoshi Matsuoka:
  An efficient, model-based CPU-GPU heterogeneous FFT library. IPDPS 2008: 1-10
- 2007
- [c4] Hideo Nishimura, Naoya Maruyama, Satoshi Matsuoka:
  Virtual Clusters on the Fly - Fast, Scalable, and Flexible Installation. CCGRID 2007: 549-556
- [c3] Shohei Yamasaki, Naoya Maruyama, Satoshi Matsuoka:
  Model-based resource selection for efficient virtual cluster deployment. VTDC@SC 2007: 6:1-6:7
- 2006
- [c2] Masaki Tatezono, Naoya Maruyama, Satoshi Matsuoka:
  Making Wide-Area, Multi-site MPI Feasible Using Xen VM. ISPA Workshops 2006: 387-396
- [c1] Alexander V. Mirgorodskiy, Naoya Maruyama, Barton P. Miller:
  Scalable systems software - Problem diagnosis in large-scale computing environments. SC 2006: 88