HPCA 2024 Trip Report
This week I attended the 30th IEEE International Symposium on High-Performance Computer Architecture (HPCA), held in Edinburgh, United Kingdom, from March 2 to 6, 2024. HPCA is a high-impact venue for presenting research results on a wide range of computer architecture topics.
On Sunday, I attended an exciting tutorial, Computing Systems Resilience to Hardware Faults: Tackling Complexity and Scale, organized by the University of Athens with talks from AMD and Meta engineers. The session focused on the challenges that hardware faults in microprocessors and memory devices pose as systems become more complex and more widely deployed. Topics included advanced simulation techniques for assessing vulnerabilities in CPUs, GPUs, and accelerators, as well as real-world insights from DRAM reliability studies and Meta’s large-scale efforts to detect hardware failures in data centers. It was an engaging and informative session that showed academia and industry working together on system resilience.
The main conference showcased recent research in computer architecture, machine learning, and hardware security. Sessions covered topics such as memory systems, hardware accelerators, and secure multi-GPU computing, with presentations on energy-efficient solutions for DNNs, innovations in FPGA and DRAM technologies, and techniques to mitigate hardware vulnerabilities. Experts from academia and industry shared insights into optimizing system performance, making the event a lively forum for high-performance computing research.
The keynotes covered critical topics in modern computing. Derek Chiou (UT-Austin/Microsoft) discussed the shift in cloud servers from processor-centric designs to SmartNICs/DPUs, focusing on their role in I/O, security, and virtualization. Kunle Olukotun (Stanford) addressed the challenges of optimizing hardware for generative AI and foundation models, emphasizing the need for specialized systems to handle complex data flows. Nir Shavit (MIT) explored how sparsity in deep neural networks can improve efficiency, highlighting the need for new parallel algorithms and hardware to manage irregular data patterns.
Edinburgh proved to be an excellent venue for the event. The weather was unseasonably mild for March, making it easy to explore the city and enjoy the surroundings. With its mix of history and modernity, the city offered a great setting for both the conference and some downtime. Overall, it was a productive and enjoyable experience.