Conference Proceedings


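Conference name: 31st IEEE International Symposium on High Performance Computer Architecture (HPCA 2025)
Chinese title (translated): 31st IEEE International Symposium on High Performance Computer Architecture, Volume 3
Organization: Institute of Electrical and Electronics Engineers (IEEE)
Conference dates: 1-5 March 2025
Conference location: Las Vegas, Nevada, USA
Publication year: 2025
Holding number: 357272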


| Title | Authors | Year |
| --- | --- | --- |
| Hydra: Scale-out FHE Accelerator Architecture for Secure Deep Learning on FPGA | Yinghao Yang; Xicheng Xu; Haibin Zhang; Jie Song; Xin Tang; Hang Lu; Xiaowei Li | 2025 |
| WarpDrive: GPU-Based Fully Homomorphic Encryption Acceleration Leveraging Tensor and CUDA Cores | Guang Fan; Mingzhe Zhang; Fangyu Zheng; Shengyu Fan; Tian Zhou; Xianglong Deng; Wenxu Tang; Liang Kong; Yixuan Song; Shoumeng Yan | 2025 |
| MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from μWatts to MWatts for Sustainable AI | Arya Tschand; Arun Tejusve Raghunath Rajan; Sachin Idgunji; Anirban Ghosh; Jeremy Holleman; Csaba Kiraly; Pawan Ambalkar; Ritika Borkar; Ramesh Chukka; Trevor Cockrell; Oliver Curtis; Grigori Fursin; Miro Hodak; Hiwot Kassa; Anton Lokhmotov; Dejan Miskovic; Yuechao Pan; Manu Prasad Manmathan; Liz Raymond; Tom St. John; Arjun Suresh; Rowan Taubitz; Sean Zhan; Scott Wasson; David Kanter; Vijay Janapa Reddi | 2025 |
| Enterprise Class Modular Cache Hierarchy | Craig Walters; Deanna Berger; Robert Sonnelitter; Alper Buyuktosunoglu | 2025 |
| Predicting DRAM-Caused Risky VMs in Large-Scale Clouds | Yaoguang Yong; Xiaoming Du; Xuhua Ma; Yuxiang Wang; Bin Yao; Xudong Zheng; Huite Yi | 2025 |
| Enhancing Large-Scale AI Training Efficiency: The C4 Solution for Real-Time Anomaly Detection and Communication Optimization | Jianbo Dong; Bin Luo; Jun Zhang; Pengcheng Zhang; Fei Feng; Yikai Zhu; Ang Liu; Zian Chen; Yi Shi; Hairong Jiao; Gang Lu; Yu Guan; Ennan Zhai; Wencong Xiao; Hanyu Zhao; Man Yuan; Siran Yang; Xiang Li; Jiamang Wang; Rui Men; Jianwei Zhang; Chang Zhou; Dennis Cai; Yuan Xie; Binzhang Fu | 2025 |
| Revisiting Reliability in Large-Scale Machine Learning Research Clusters | Apostolos Kokolis; Michael Kuchnik; John Hoffman; Adithya Kumar; Parth Malani; Faye Ma; Zachary DeVito; Shubho Sengupta; Kalyan Saladi; Carole-Jean Wu | 2025 |
| HILP: Accounting for Workload-Level Parallelism in System-on-Chip Design Space Exploration | Joseph Rogers; Lieven Eeckhout; Magnus Jahre | 2025 |
| CORDOBA: Carbon-Efficient Optimization Framework for Computing Systems | Mariam Elgamal; Doug Carmean; Elnaz Ansari; Okay Zed; Ramesh Peri; Srilatha Manne; Udit Gupta; Gu-Yeon Wei; David Brooks; Gage Hills; Carole-Jean Wu | 2025 |
| Architecting Space Microdatacenters: A System-level Approach | Nathan Bleier; Rick Eason; Michael Lembeck; Rakesh Kumar | 2025 |
| ARTEMIS: Agile Discovery of Efficient Real-Time Systems-on-Chips in the Heterogeneous Era | Subhankar Pal; Aporva Amarnath; Behzad Boroujerdian; Augusto Vega; Alper Buyuktosunoglu; John-David Wellman; Vijay Janapa Reddi; Pradip Bose | 2025 |
| LEGO: Spatial Accelerator Generation and Optimization for Tensor Applications | Yujun Lin; Zhekai Zhang; Song Han | 2025 |
| DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency | Jovan Stojkovic; Chaojie Zhang; Inigo Goiri; Josep Torrellas; Esha Choukse | 2025 |
| throttLL'eM: Predictive GPU Throttling for Energy Efficient LLM Inference Serving | Andreas Kosmas Kakolyris; Dimosthenis Masouros; Petros Vavaroutsos; Sotirios Xydis; Dimitrios Soudris | 2025 |
| RpcNIC: Enabling Efficient Datacenter RPC Offloading on PCIe-attached SmartNICs | Jie Zhang; Hongjing Huang; Xuzheng Chen; Xiang Li; Jieru Zhao; Ming Liu; Zeke Wang | 2025 |
| NVMePass: A Lightweight, High-performance and Scalable NVMe Virtualization Architecture with I/O Queues Passthrough | Yiquan Chen; Zhen Jin; Yijing Wang; Yi Chen; Jiexiong Xu; Hao Yu; Jinlong Chen; Wenhai Lin; Kanghua Fang; Keyao Zhang; Chengkun Wei; Qiang Liu; Yuan Xie; Wenzhi Chen | 2025 |
| Warped-Compaction: Maximizing GPU Register File Bandwidth Utilization via Operand Compaction | Eunbi Jeong; Ipoom Jeong; Myung Kuk Yoon; Nam Sung Kim | 2025 |
| Cooperative Warp Execution in Tensor Core for RISC-V GPGPU | Abubakr Nada; Giuseppe Maria Sarda; Erwan Lenormand | 2025 |
| SparseWeaver: Converting Sparse Operations as Dense Operations on GPUs for Graph Workloads | Shinnung Jeong; Liam Paul Cooper; Ju Min Lee; Heelim Choi; Nicholas Parnenzini; Chihyo Ahn; Yongwoo Lee; Hanjun Kim; Hyesoon Kim | 2025 |
| HSMU-SpGEMM: Achieving High Shared Memory Utilization for Parallel Sparse General Matrix-Matrix Multiplication on Modern GPUs | Min Wu; Huizhang Luo; Fenfang Li; Yiran Zhang; Zhuo Tang; Kenli Li; Jeff Zhang; Chubo Liu | 2025 |