Guohao Dai
2020 – today
- 2024
- [j13] Shiyao Li, Zhenhua Zhu, Hanbo Sun, Xuefei Ning, Guohao Dai, Yiming Hu, Huazhong Yang, Yu Wang:
Toward High-Accuracy and Real-Time Two-Stage Small Object Detection on FPGA. IEEE Trans. Circuits Syst. Video Technol. 34(9): 8053-8066 (2024)
- [j12] Yiming Chen, Mingyen Lee, Guohao Dai, Mufeng Zhou, Nagadastagiri Challapalle, Tianyi Wang, Yao Yu, Yongpan Liu, Yu Wang, Huazhong Yang, Vijaykrishnan Narayanan, Xueqing Li:
GRAPHIC: Gather and Process Harmoniously in the Cache With High Parallelism and Flexibility. IEEE Trans. Emerg. Top. Comput. 12(1): 84-96 (2024)
- [c63] Kai Zhong, Zhenhua Zhu, Guohao Dai, Hongyi Wang, Xinhao Yang, Haoyu Zhang, Jin Si, Qiuli Mao, Shulin Zeng, Ke Hong, Genghan Zhang, Huazhong Yang, Yu Wang:
FEASTA: A Flexible and Efficient Accelerator for Sparse Tensor Algebra in Machine Learning. ASPLOS (3) 2024: 349-366
- [c62] Lin Zhao, Tianchen Zhao, Zinan Lin, Xuefei Ning, Guohao Dai, Huazhong Yang, Yu Wang:
FlashEval: Towards Fast and Accurate Evaluation of Text-to-Image Diffusion Generative Models. CVPR 2024: 16122-16131
- [c61] Hongyi Wang, Kai Zhong, Haoyu Zhang, Shulin Zeng, Zhenhua Zhu, Xinhao Yang, Shuang Wang, Guohao Dai, Huazhong Yang, Yu Wang:
DySpMM: From Fix to Dynamic for Sparse Matrix-Matrix Multiplication Accelerators. DAC 2024: 96:1-96:6
- [c60] Xueyuan Liu, Zhuoran Song, Guohao Dai, Gang Li, Can Xiao, Yan Xiang, Dehui Kong, Ke Xu, Xiaoyao Liang:
FusionArch: A Fusion-Based Accelerator for Point-Based Point Cloud Neural Networks. DATE 2024: 1-6
- [c59] Tongxin Xie, Tianchen Zhao, Zhenhua Zhu, Xuefei Ning, Bing Li, Guohao Dai, Huazhong Yang, Yu Wang:
DyPIM: Dynamic-Inference-Enabled Processing-In-Memory Accelerator. DATE 2024: 1-6
- [c58] Shulin Zeng, Jun Liu, Guohao Dai, Xinhao Yang, Tianyu Fu, Hongyi Wang, Wenheng Ma, Hanbo Sun, Shiyao Li, Zixiao Huang, Yadong Dai, Jintao Li, Zehao Wang, Ruoyu Zhang, Kairui Wen, Xuefei Ning, Yu Wang:
FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs. FPGA 2024: 223-234
- [c57] Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang:
Evaluating Quantized Large Language Models. ICML 2024
- [c56] Ke Hong, Guohao Dai, Jiaming Xu, Qiuli Mao, Xiuhong Li, Jun Liu, Kangdi Chen, Yuhan Dong, Yu Wang:
FlashDecoding++: Faster Large Language Model Inference with Asynchronization, Flat GEMM Optimization, and Heuristics. MLSys 2024
- [c55] Fan Yang, Zehao Wang, Haoyu Zhang, Zhenhua Zhu, Xinhao Yang, Guohao Dai, Yu Wang:
Efficient Deployment of Large Language Model across Cloud-Device Systems. SOCC 2024: 1-6
- [i35] Shulin Zeng, Jun Liu, Guohao Dai, Xinhao Yang, Tianyu Fu, Hongyi Wang, Wenheng Ma, Hanbo Sun, Shiyao Li, Zixiao Huang, Yadong Dai, Jintao Li, Zehao Wang, Ruoyu Zhang, Kairui Wen, Xuefei Ning, Yu Wang:
FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs. CoRR abs/2401.03868 (2024)
- [i34] Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, Yu Wang:
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K. CoRR abs/2402.05136 (2024)
- [i33] Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang:
Evaluating Quantized Large Language Models. CoRR abs/2402.18158 (2024)
- [i32] Lin Zhao, Tianchen Zhao, Zinan Lin, Xuefei Ning, Guohao Dai, Huazhong Yang, Yu Wang:
FlashEval: Towards Fast and Accurate Evaluation of Text-to-image Diffusion Generative Models. CoRR abs/2403.16379 (2024)
- [i31] Enshu Liu, Junyi Zhu, Zinan Lin, Xuefei Ning, Matthew B. Blaschko, Sergey Yekhanin, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang:
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better. CoRR abs/2404.02241 (2024)
- [i30] Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang:
A Survey on Efficient Inference for Large Language Models. CoRR abs/2404.14294 (2024)
- [i29] Yao Teng, Yue Wu, Han Shi, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu:
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis. CoRR abs/2405.14224 (2024)
- [i28] Si Xu, Zixiao Huang, Yan Zeng, Shengen Yan, Xuefei Ning, Haolin Ye, Sipei Gu, Chunsheng Shui, Zhezheng Lin, Hao Zhang, Sheng Wang, Guohao Dai, Yu Wang:
HetHub: A Heterogeneous distributed hybrid training system for large-scale models. CoRR abs/2405.16256 (2024)
- [i27] Tianchen Zhao, Xuefei Ning, Tongcheng Fang, Enshu Liu, Guyue Huang, Zinan Lin, Shengen Yan, Guohao Dai, Yu Wang:
MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization. CoRR abs/2405.17873 (2024)
- [i26] Tianchen Zhao, Tongcheng Fang, Enshu Liu, Rui Wan, Widyadewi Soedarmadji, Shiyao Li, Zinan Lin, Guohao Dai, Shengen Yan, Huazhong Yang, Xuefei Ning, Yu Wang:
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation. CoRR abs/2406.02540 (2024)
- [i25] Zhihang Yuan, Pu Lu, Hanling Zhang, Xuefei Ning, Linfeng Zhang, Tianchen Zhao, Shengen Yan, Guohao Dai, Yu Wang:
DiTFastAttn: Attention Compression for Diffusion Transformer Models. CoRR abs/2406.08552 (2024)
- [i24] Xuefei Ning, Zifu Wang, Shiyao Li, Zinan Lin, Peiran Yao, Tianyu Fu, Matthew B. Blaschko, Guohao Dai, Huazhong Yang, Yu Wang:
Can LLMs Learn by Teaching? A Preliminary Study. CoRR abs/2406.14629 (2024)
- [i23] Tianyu Fu, Haofeng Huang, Xuefei Ning, Genghan Zhang, Boju Chen, Tianqi Wu, Hongyi Wang, Zixiao Huang, Shiyao Li, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang:
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression. CoRR abs/2406.14909 (2024)
- [i22] Enshu Liu, Junyi Zhu, Zinan Lin, Xuefei Ning, Matthew B. Blaschko, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang:
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs. CoRR abs/2407.00945 (2024)
- [i21] Luning Wang, Shiyao Li, Xuefei Ning, Zhihang Yuan, Shengen Yan, Guohao Dai, Yu Wang:
CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios. CoRR abs/2409.10593 (2024)
- [i20] Jinhao Li, Shan Huang, Jiaming Xu, Jun Liu, Li Ding, Ningyi Xu, Guohao Dai:
MARCA: Mamba Accelerator with ReConfigurable Architecture. CoRR abs/2409.11440 (2024)
- [i19] Yao Teng, Han Shi, Xian Liu, Xuefei Ning, Guohao Dai, Yu Wang, Zhenguo Li, Xihui Liu:
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding. CoRR abs/2410.01699 (2024)
- [i18] Jinhao Li, Jiaming Xu, Shan Huang, Yonghua Chen, Wen Li, Jun Liu, Yaoxiu Lian, Jiayi Pan, Li Ding, Hao Zhou, Yu Wang, Guohao Dai:
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective. CoRR abs/2410.04466 (2024)
- [i17] Haoyu Zhang, Jun Liu, Zhenhua Zhu, Shulin Zeng, Maojia Sheng, Tao Yang, Guohao Dai, Yu Wang:
Efficient and Effective Retrieval of Dense-Sparse Hybrid Vectors using Graph-based Approximate Nearest Neighbor Search. CoRR abs/2410.20381 (2024)
- 2023
- [j11] Genghan Zhang, Yuetong Zhao, Yanting Tao, Zhongming Yu, Guohao Dai, Sitao Huang, Yuan Wen, Pavlos Petoumenos, Yu Wang:
Sgap: towards efficient sparse tensor algebra compilation for GPU. CCF Trans. High Perform. Comput. 5(2): 210-227 (2023)
- [j10] Shulin Zeng, Guohao Dai, Niansong Zhang, Xinhao Yang, Haoyu Zhang, Zhenhua Zhu, Huazhong Yang, Yu Wang:
Serving Multi-DNN Workloads on FPGAs: A Coordinated Architecture, Scheduling, and Mapping Perspective. IEEE Trans. Computers 72(5): 1314-1328 (2023)
- [j9] Jingbo Hu, Guohao Dai, Liuzheng Wang, Liyang Lai, Yu Huang, Huazhong Yang, Yu Wang:
Adaptive Multidimensional Parallel Fault Simulation Framework on Heterogeneous System. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 42(6): 1951-1964 (2023)
- [j8] Hanbo Sun, Zhenhua Zhu, Chenyu Wang, Xuefei Ning, Guohao Dai, Huazhong Yang, Yu Wang:
Gibbon: An Efficient Co-Exploration Framework of NN Model and Processing-In-Memory Architecture. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 42(11): 4075-4089 (2023)
- [j7] Zhenhua Zhu, Hanbo Sun, Tongxin Xie, Yu Zhu, Guohao Dai, Lixue Xia, Dimin Niu, Xiaoming Chen, Xiaobo Sharon Hu, Yu Cao, Yuan Xie, Huazhong Yang, Yu Wang:
MNSIM 2.0: A Behavior-Level Modeling Tool for Processing-In-Memory Architectures. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 42(11): 4112-4125 (2023)
- [j6] Kai Zhong, Shulin Zeng, Wentao Hou, Guohao Dai, Zhenhua Zhu, Xuecang Zhang, Shihai Xiao, Huazhong Yang, Yu Wang:
CoGNN: An Algorithm-Hardware Co-Design Approach to Accelerate GNN Inference With Minibatch Sampling. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 42(12): 4883-4896 (2023)
- [c54] Shuo Yin, Guohao Dai, Wei W. Xing:
High-Dimensional Yield Estimation Using Shrinkage Deep Features and Maximization of Integral Entropy Reduction. ASP-DAC 2023: 283-289
- [c53] Wentao Hou, Kai Zhong, Shulin Zeng, Guohao Dai, Huazhong Yang, Yu Wang:
NTGAT: A Graph Attention Network Accelerator with Runtime Node Tailoring. ASP-DAC 2023: 645-650
- [c52] Haotian Tang, Shang Yang, Zhijian Liu, Ke Hong, Zhongming Yu, Xiuyu Li, Guohao Dai, Yu Wang, Song Han:
TorchSparse++: Efficient Point Cloud Engine. CVPR Workshops 2023: 202-209
- [c51] Shiyao Li, Zhenhua Zhu, Yu Zhu, Qingpeng Zhu, Jiangwei Zhang, Wenxiu Sun, Guohao Dai, Fei Qiao, Huazhong Yang, Yu Wang:
Memory-Efficient and Real-Time SPAD-based dToF Depth Sensor with Spatial and Statistical Correlation. DAC 2023: 1-6
- [c50] Yanfang Liu, Guohao Dai, Wei W. Xing:
Seeking the Yield Barrier: High-Dimensional SRAM Evaluation Through Optimal Manifold. DAC 2023: 1-6
- [c49] Xinhao Yang, Tianyu Fu, Guohao Dai, Shulin Zeng, Kai Zhong, Ke Hong, Yu Wang:
An Efficient Accelerator for Point-based and Voxel-based Point Cloud Neural Networks. DAC 2023: 1-6
- [c48] Zhenhua Zhu, Jun Liu, Guohao Dai, Shulin Zeng, Bing Li, Huazhong Yang, Yu Wang:
Processing-In-Hierarchical-Memory Architecture for Billion-Scale Approximate Nearest Neighbor Search. DAC 2023: 1-6
- [c47] Yu Zhu, Zhenhua Zhu, Guohao Dai, Fengbin Tu, Hanbo Sun, Kwang-Ting Cheng, Huazhong Yang, Yu Wang:
PIM-HLS: An Automatic Hardware Generation Tool for Heterogeneous Processing-In-Memory-based Neural Network Accelerators. DAC 2023: 1-6
- [c46] Tianyu Fu, Chiyue Wei, Zhenhua Zhu, Shang Yang, Zhongming Yu, Guohao Dai, Huazhong Yang, Yu Wang:
CLAP: Locality Aware and Parallel Triangle Counting with Content Addressable Memory. DATE 2023: 1-6
- [c45] Hanbo Sun, Tongxin Xie, Zhenhua Zhu, Guohao Dai, Huazhong Yang, Yu Wang:
Minimizing Communication Conflicts in Network-On-Chip Based Processing-In-Memory Architecture. DATE 2023: 1-6
- [c44] Yijia Zhang, Yibo Han, Shijie Cao, Guohao Dai, Youshan Miao, Ting Cao, Fan Yang, Ningyi Xu:
Adam Accumulation to Reduce Memory Footprints of Both Activations and Gradients for Large-Scale DNN Training. ECAI 2023: 3058-3065
- [c43] Yaoxiu Lian, Xinhao Yang, Ke Hong, Yu Wang, Guohao Dai, Ningyi Xu:
A Point Transformer Accelerator with Fine-Grained Pipelines and Distribution-Aware Dynamic FPS. ICCAD 2023: 1-9
- [c42] Yanfang Liu, Guohao Dai, Yuanqing Cheng, Wang Kang, Wei W. Xing:
OPT: Optimal Proposal Transfer for Efficient Yield Optimization for Analog and SRAM Circuits. ICCAD 2023: 1-9
- [c41] Jun Liu, Guohao Dai, Hao Xia, Lidong Guo, Xiangsheng Shi, Jiaming Xu, Huazhong Yang, Yu Wang:
TSTC: Two-Level Sparsity Tensor Core Enabling both Algorithm Flexibility and Hardware Efficiency. ICCAD 2023: 1-9
- [c40] Tianchen Zhao, Xuefei Ning, Ke Hong, Zhongyuan Qiu, Pu Lu, Yali Zhao, Linfeng Zhang, Lipu Zhou, Guohao Dai, Huazhong Yang, Yu Wang:
Ada3D : Exploiting the Spatial Redundancy with Adaptive Inference for Efficient 3D Object Detection. ICCV 2023: 17682-17692
- [c39] Haotian Tang, Shang Yang, Zhijian Liu, Ke Hong, Zhongming Yu, Xiuyu Li, Guohao Dai, Yu Wang, Song Han:
TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs. MICRO 2023: 225-239
- [c38] Shulin Zeng, Zhenhua Zhu, Jun Liu, Haoyu Zhang, Guohao Dai, Zixuan Zhou, Shuangchen Li, Xuefei Ning, Yuan Xie, Huazhong Yang, Yu Wang:
DF-GAS: a Distributed FPGA-as-a-Service Architecture towards Billion-Scale Graph-based Approximate Nearest Neighbor Search. MICRO 2023: 283-296
- [c37] Ke Hong, Zhongming Yu, Guohao Dai, Xinhao Yang, Yaoxiu Lian, Zehao Liu, Ningyi Xu, Yuhan Dong, Yu Wang:
Exploiting Hardware Utilization and Adaptive Dataflow for Efficient Sparse Convolution in 3D Point Clouds. MLSys 2023
- [c36] Zhongming Yu, Guohao Dai, Shang Yang, Genghan Zhang, Hengrui Zhang, Feiwen Zhu, June Yang, Jishen Zhao, Yu Wang:
HyperGef: A Framework Enabling Efficient Fusion for Hypergraph Neural Network on GPUs. MLSys 2023
- [c35] Weijie Luo, Zihao Liu, Guohao Dai, Ningyi Xu:
History-Detr: Optimize Query Initialization Strategy by Using Historical Information and Kinematics. MMAsia 2023: 4:1-4:7
- [c34] Yukuo Cen, Zhenyu Hou, Yan Wang, Qibin Chen, Yizhen Luo, Zhongming Yu, Hengrui Zhang, Xingcheng Yao, Aohan Zeng, Shiguang Guo, Yuxiao Dong, Yang Yang, Peng Zhang, Guohao Dai, Yu Wang, Chang Zhou, Hongxia Yang, Jie Tang:
CogDL: A Comprehensive Library for Graph Deep Learning. WWW 2023: 747-758
- [i16] Yijia Zhang, Yibo Han, Shijie Cao, Guohao Dai, Youshan Miao, Ting Cao, Fan Yang, Ningyi Xu:
Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training. CoRR abs/2305.19982 (2023)
- [i15] Tianchen Zhao, Xuefei Ning, Ke Hong, Zhongyuan Qiu, Pu Lu, Yali Zhao, Linfeng Zhang, Lipu Zhou, Guohao Dai, Huazhong Yang, Yu Wang:
Ada3D : Exploiting the Spatial Redundancy with Adaptive Inference for Efficient 3D Object Detection. CoRR abs/2307.08209 (2023)
- [i14] Yanfang Liu, Guohao Dai, Wei W. Xing:
Seeking the Yield Barrier: High-Dimensional SRAM Evaluation Through Optimal Manifold. CoRR abs/2307.15773 (2023)
- [i13] Ke Hong, Guohao Dai, Jiaming Xu, Qiuli Mao, Xiuhong Li, Jun Liu, Kangdi Chen, Yuhan Dong, Yu Wang:
FlashDecoding++: Faster Large Language Model Inference on GPUs. CoRR abs/2311.01282 (2023)
- [i12] Haotian Tang, Shang Yang, Zhijian Liu, Ke Hong, Zhongming Yu, Xiuyu Li, Guohao Dai, Yu Wang, Song Han:
TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs. CoRR abs/2311.12862 (2023)
- [i11] Jinhao Li, Jiaming Xu, Shiyao Li, Shan Huang, Jun Liu, Yaoxiu Lian, Guohao Dai:
Fast and Efficient 2-bit LLM Inference on GPU: 2/4/16-bit in a Weight Matrix with Asynchronous Dequantization. CoRR abs/2311.16442 (2023)
- 2022
- [j5] Jincheng Yu, Zhilin Xu, Shulin Zeng, Chao Yu, Jiantao Qiu, Zhaoyang Shen, Yuanfan Xu, Guohao Dai, Yu Wang, Huazhong Yang:
INCAME: Interruptible CNN Accelerator for Multirobot Exploration. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 41(4): 964-978 (2022)
- [j4] Kai Zhong, Xuefei Ning, Guohao Dai, Zhenhua Zhu, Tianchen Zhao, Shulin Zeng, Yu Wang, Huazhong Yang:
Exploring the Potential of Low-Bit Training of Convolutional Neural Networks. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 41(12): 5421-5434 (2022)
- [j3] Shulin Zeng, Guohao Dai, Hanbo Sun, Jun Liu, Shiyao Li, Guangjun Ge, Kai Zhong, Kaiyuan Guo, Yu Wang, Huazhong Yang:
A Unified FPGA Virtualization Framework for General-Purpose Deep Neural Networks in the Cloud. ACM Trans. Reconfigurable Technol. Syst. 15(3): 24:1-24:31 (2022)
- [c33] Bangyan Wang, Lei Deng, Fei Sun, Guohao Dai, Liu Liu, Yu Wang, Yuan Xie:
A one-for-all and o(v log(v ))-cost solution for parallel merge style operations on sorted key-value arrays. ASPLOS 2022: 669-682
- [c32] Guohao Dai, Guyue Huang, Shang Yang, Zhongming Yu, Hengrui Zhang, Yufei Ding, Yuan Xie, Huazhong Yang, Yu Wang:
Heuristic adaptability to input dynamics for SpMM on GPUs. DAC 2022: 595-600
- [c31] Yu Zhu, Zhenhua Zhu, Guohao Dai, Kai Zhong, Huazhong Yang, Yu Wang:
Exploiting Parallelism with Vertex-Clustering in Processing-In-Memory-based GCN Accelerators. DATE 2022: 652-657
- [c30] Hanbo Sun, Chenyu Wang, Zhenhua Zhu, Xuefei Ning, Guohao Dai, Huazhong Yang, Yu Wang:
Gibbon: Efficient Co-Exploration of NN Model and Processing-In-Memory Architecture. DATE 2022: 867-872
- [c29] Guohao Dai, Zhenhua Zhu, Tianyu Fu, Chiyue Wei, Bangyan Wang, Xiangyu Li, Yuan Xie, Huazhong Yang, Yu Wang:
DIMMining: pruning-efficient and parallel graph mining on near-memory-computing. ISCA 2022: 130-145
- [c28] Jun Liu, Zhenhua Zhu, Jingbo Hu, Hanbo Sun, Li Liu, Lingzhi Liu, Guohao Dai, Huazhong Yang, Yu Wang:
Optimizing Graph-based Approximate Nearest Neighbor Search: Stronger and Smarter. MDM 2022: 179-184
- [c27] Hengrui Zhang, Zhongming Yu, Guohao Dai, Guyue Huang, Yufei Ding, Yuan Xie, Yu Wang:
Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective. MLSys 2022
- [i10] Guohao Dai, Guyue Huang, Shang Yang, Zhongming Yu, Hengrui Zhang, Yufei Ding, Yuan Xie, Huazhong Yang, Yu Wang:
Heuristic Adaptability to Input Dynamics for SpMM on GPUs. CoRR abs/2202.08556 (2022)
- [i9] Yiming Chen, Guohao Dai, Mufeng Zhou, Mingyen Lee, Nagadastagiri Challapalle, Guodong Yin, Zekun Yang, Yongpan Liu, Huazhong Yang, Vijaykrishnan Narayanan, Xueqing Li:
GRAPHIC: GatheR-And-Process in Highly parallel with In-SSD Compression Architecture in Very Large-Scale Graph. CoRR abs/2208.08600 (2022)
- [i8] Genghan Zhang, Yuetong Zhao, Yanting Tao, Zhongming Yu, Guohao Dai, Sitao Huang, Yuan Wen, Pavlos Petoumenos, Yu Wang:
Sgap: Towards Efficient Sparse Tensor Algebra Compilation for GPU. CoRR abs/2209.02882 (2022)
- [i7] Shuo Yin, Guohao Dai, Wei W. Xing:
High-Dimensional Yield Estimation using Shrinkage Deep Features and Maximization of Integral Entropy Reduction. CoRR abs/2212.02100 (2022)
- 2021
- [c26] Shulin Zeng, Guohao Dai, Hanbo Sun, Jun Liu, Hongren Zheng, Yusong Wu, Fan Zhang, Xinhao Yang, Yi Cai, Yu Wang, Huazhong Yang:
3M-AI: A Multi-task and Multi-core Virtualization Framework for Multi-FPGA AI Systems in the Cloud. FPGA 2021: 228
- [c25] Yitu Wang, Zhenhua Zhu, Fan Chen, Mingyuan Ma, Guohao Dai, Yu Wang, Hai Li, Yiran Chen:
Rerec: In-ReRAM Acceleration with Access-Aware Mapping for Personalized Recommendation. ICCAD 2021: 1-9
- [c24] Zhongming Yu, Guohao Dai, Guyue Huang, Yu Wang, Huazhong Yang:
Exploiting Online Locality and Reduction Parallelism for Sampled Dense Matrix Multiplication on GPUs. ICCD 2021: 567-574
- [i6] Yukuo Cen, Zhenyu Hou, Yan Wang, Qibin Chen, Yizhen Luo, Xingcheng Yao, Aohan Zeng, Shiguang Guo, Peng Zhang, Guohao Dai, Yu Wang, Chang Zhou, Hongxia Yang, Jie Tang:
CogDL: An Extensive Toolkit for Deep Learning on Graphs. CoRR abs/2103.00959 (2021)
- [i5] Guyue Huang, Guohao Dai, Yu Wang, Yufei Ding, Yuan Xie:
Efficient Sparse Matrix Kernels based on Adaptive Workload-Balancing and Parallel-Reduction. CoRR abs/2106.16064 (2021)
- [i4] Hengrui Zhang, Zhongming Yu, Guohao Dai, Guyue Huang, Yufei Ding, Yuan Xie, Yu Wang:
Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective. CoRR abs/2110.09524 (2021)
- 2020
- [c23] Jincheng Yu, Zhilin Xu, Shulin Zeng, Chao Yu, Jiantao Qiu, Chaoyang Shen, Yuanfan Xu, Guohao Dai, Yu Wang, Huazhong Yang:
INCA: INterruptible CNN Accelerator for Multi-tasking in Embedded Robots. DAC 2020: 1-6
- [c22] Shulin Zeng, Guohao Dai, Hanbo Sun, Kai Zhong, Guangjun Ge, Kaiyuan Guo, Yu Wang, Huazhong Yang:
Enabling Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud. FCCM 2020: 102-110
- [c21] Jincheng Yu, Zhilin Xu, Shulin Zeng, Chao Yu, Jiantao Qiu, Chaoyang Shen, Yuanfan Xu, Guohao Dai, Yu Wang, Huazhong Yang:
INCAME: INterruptible CNN Accelerator for Multi-robot Exploration. FPGA 2020: 316
- [c20] Shulin Zeng, Guohao Dai, Kai Zhong, Hanbo Sun, Guangjun Ge, Kaiyuan Guo, Yu Wang, Huazhong Yang:
Enable Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud. FPGA 2020: 317
- [c19] Zhenhua Zhu, Hanbo Sun, Kaizhong Qiu, Lixue Xia, Gokul Krishnan, Guohao Dai, Dimin Niu, Xiaoming Chen, Xiaobo Sharon Hu, Yu Cao, Yuan Xie, Yu Wang, Huazhong Yang:
MNSIM 2.0: A Behavior-Level Modeling Tool for Memristor-based Neuromorphic Computing Systems. ACM Great Lakes Symposium on VLSI 2020: 83-88
- [c18] Ziqian Wan, Guohao Dai, Yun Joon Soh, Jishen Zhao, Yu Wang:
An Order Sampling Processing-in-Memory Architecture for Approximate Graph Pattern Mining. ACM Great Lakes Symposium on VLSI 2020: 357-362
- [c17] Tianyu Fu, Ziqian Wan, Guohao Dai, Yu Wang, Huazhong Yang:
LessMine: Reducing Sample Space and Data Access for Dense Pattern Mining. HPEC 2020: 1-7
- [c16] Jingbo Hu, Guohao Dai, Yu Wang, Huazhong Yang:
GraphSDH: A General Graph Sampling Framework with Distribution and Hierarchy. HPEC 2020: 1-7
- [c15] Guyue Huang, Guohao Dai, Yu Wang, Huazhong Yang:
GE-SpMM: general-purpose sparse matrix-matrix multiplication on GPUs for graph neural networks. SC 2020: 72
- [i3] Shulin Zeng, Guohao Dai, Hanbo Sun, Kai Zhong, Guangjun Ge, Kaiyuan Guo, Yu Wang, Huazhong Yang:
Enabling Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud. CoRR abs/2003.12101 (2020)
- [i2] Guyue Huang, Guohao Dai, Yu Wang, Huazhong Yang:
GE-SpMM: General-purpose Sparse Matrix-Matrix Multiplication on GPUs for Graph Neural Networks. CoRR abs/2007.03179 (2020)
2010 – 2019
- 2019
- [j2] Guohao Dai, Tianhao Huang, Yu Wang, Huazhong Yang, John Wawrzynek:
HyVE: Hybrid Vertex-Edge Memory Hierarchy for Energy-Efficient Graph Processing. IEEE Trans. Computers 68(8): 1131-1146 (2019)
- [j1] Guohao Dai, Tianhao Huang, Yuze Chi, Jishen Zhao, Guangyu Sun, Yongpan Liu, Yu Wang, Yuan Xie, Huazhong Yang:
GraphH: A Processing-in-Memory Architecture for Large-Scale Graph Processing. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 38(4): 640-653 (2019)
- [c14] Guohao Dai, Tianhao Huang, Yu Wang, Huazhong Yang, John Wawrzynek:
GraphSAR: a sparsity-aware processing-in-memory architecture for large-scale graph processing on ReRAMs. ASP-DAC 2019: 120-126
- [c13] Zhenhua Zhu, Hanbo Sun, Yujun Lin, Guohao Dai, Lixue Xia, Song Han, Yu Wang, Huazhong Yang:
A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM. DAC 2019: 56
- [c12] Kun Wu, Guohao Dai, Xing Hu, Shuangchen Li, Xinfeng Xie, Yu Wang, Yuan Xie:
Memory-Bound Proof-of-Work Acceleration for Blockchain Applications. DAC 2019: 177
- [c11] Qijing Huang, Christopher Yarp, Sagar Karandikar, Nathan Pemberton, Benjamin Brock, Liang Ma, Guohao Dai, Robert Quitt, Krste Asanovic, John Wawrzynek:
Centrifuge: Evaluating full-system HLS-generated heterogenous-accelerator SoCs using FPGA-Acceleration. ICCAD 2019: 1-8
- 2018
- [c10] Tianhao Huang, Guohao Dai, Yu Wang, Huazhong Yang:
HyVE: Hybrid vertex-edge memory hierarchy for energy-efficient graph processing. DATE 2018: 973-978
- [c9] Guohao Dai, Tianhao Huang, Yu Wang, Huazhong Yang, John Wawrzynek:
NewGraph: Balanced Large-Scale Graph Processing on FPGAs with Low Preprocessing Overheads. FCCM 2018: 208
- [c8] Gushu Li, Guohao Dai, Shuangchen Li, Yu Wang, Yuan Xie:
GraphIA: an in-situ accelerator for large-scale graph processing. MEMSYS 2018: 79-84
- 2017
- [c7] Guohao Dai, Tianhao Huang, Yuze Chi, Ningyi Xu, Yu Wang, Huazhong Yang:
ForeGraph: Exploring Large-scale Graph Processing on Multi-FPGA Architecture. FPGA 2017: 217-226
- 2016
- [c6] Guohao Dai, Yuze Chi, Yu Wang, Huazhong Yang:
FPGP: Graph Processing Framework on FPGA A Case Study of Breadth-First Search. FPGA 2016: 105-110
- [c5] Yubin Li, Yuliang Sun, Guohao Dai, Qiang Xu, Yu Wang, Huazhong Yang:
Approximate Frequent Itemset Mining for streaming data on FPGA. FPL 2016: 1-4
- [c4] Yuze Chi, Guohao Dai, Yu Wang, Guangyu Sun, Guoliang Li, Huazhong Yang:
NXgraph: An efficient graph processing system on a single machine. ICDE 2016: 409-420
- 2015
- [c3] Yubin Li, Yuliang Sun, Guohao Dai, Yuzhi Wang, Jiacai Ni, Yu Wang, Guoliang Li, Huazhong Yang:
A self-aware data compression system on FPGA in Hadoop. FPT 2015: 196-199
- [i1] Yuze Chi, Guohao Dai, Yu Wang, Guangyu Sun, Guoliang Li, Huazhong Yang:
NXgraph: An Efficient Graph Processing System on a Single Machine. CoRR abs/1510.06916 (2015)
- 2014
- [c2] Guohao Dai, Yi Shan, Fei Chen, Yu Wang, Kun Wang, Huazhong Yang:
Online scheduling for FPGA computation in the Cloud. FPT 2014: 330-333
- 2013
- [c1] Sitao Huang, Guohao Dai, Yuliang Sun, Zilong Wang, Yu Wang, Huazhong Yang:
DTW-Based Subsequence Similarity Search on AMD Heterogeneous Computing Platform. HPCC/EUC 2013: 1054-1063
last updated on 2024-12-15 02:17 CET by the dblp team
all metadata released as open data under CC0 1.0 license