ARITH 2024 Keynotes

“Report from IEEE WG P3109: Arithmetic Formats for Machine Learning”

Dr Andrew Fitzgibbon, Co-editor P3109 Working Documents

In machine learning, there is increasing interest in, and deployment of, low-bit-width floating-point formats. These formats, typically eight-bit but increasingly even smaller, have proved valuable in reducing computational requirements while maintaining the accuracy and utility of the machine learning algorithms on which our society increasingly depends for security, medicine, and numerous industrial applications in agriculture, entertainment, transport, and many other areas. The aim of WG P3109 is to synthesize existing practice and to define a set of arithmetic formats that best serve this growing community of practice. This talk will present the discussions, key ideas, and rationales behind the working group's current decisions, as represented in the interim report of September 2023. (See https://github.com/P3109/Public/blob/main/Shared%20Reports/P3109%20WG%20Interim%20Report.pdf)

Andrew Fitzgibbon has contributed to some of the most significant machine learning and computer vision based products and innovations of the last few decades. At the University of Oxford, he co-founded the company "2d3", which developed the world's first commercially available automated camera tracking system and received an Emmy Award for "revolutionary impact on the creation of complex visual effects". At Microsoft, he contributed to the machine learning system behind the human motion capture system in Kinect for Xbox 360, one of the first systems to use massive synthetic training data to deliver a real-world computer vision application. More recently, he contributed to the hand tracking algorithms in Microsoft's HoloLens, the first commercially available AR headset with fully articulated hand tracking. At Graphcore, his recent work has been on programming languages and molecular machine learning. He is also a highly cited researcher, having authored numerous papers with over a thousand citations, and is an inventor on dozens of patents.

 

“Immense-scale Machine Learning: The Big, the Small, and the Not Right At All”

Norman P. Jouppi, Google VP and Engineering Fellow.

We start with an introduction to the decade-long evolution of Google's Tensor Processing Systems, including the evolution of their numerics over time. We also discuss the unique requirements arising from serving billions of daily users across multiple products in production environments.

Training large language models (LLMs) can require 100,000 accelerators working together in synchrony for months, and LLM inference can require exaFLOP/OP speeds with response times of under a second. These demands have driven adoption of lower-precision formats across the industry. Finally, since random errors from various sources are likely to occur at this immense scale during months-long training runs, fault tolerance is also a key feature of immense-scale ML systems. All these factors pose interesting challenges for computer arithmetic.

Norman P. Jouppi is a Google VP and Engineering Fellow. He joined Google in 2013 to lead the design of Google's Tensor Processing Units (TPUs). He is known for his innovations in computer memory systems, and was the principal architect and lead designer of several microprocessors. His innovations in microprocessor design have been adopted in many high-performance microprocessors.

Norm received his Ph.D. in electrical engineering from Stanford University in 1984. While at Stanford he was one of the principal architects and designers of the MIPS microprocessor, and developed techniques for MOS VLSI timing verification. He joined HP in 2002 through its merger with Compaq, where he was previously a Staff Fellow at Compaq's Western Research Laboratory (formerly DECWRL) in Palo Alto, California. In 2010 he was named an HP Senior Fellow. From 1984 through 1996 he was a consulting assistant/associate professor in the electrical engineering department at Stanford University where he taught courses in computer architecture, VLSI, and circuit design.

Norm holds more than 125 U.S. patents. He has published over 125 technical papers, with several best paper awards and two International Symposium on Computer Architecture (ISCA) Influential Paper Awards. He is the recipient of the 2014 IEEE Harry H. Goode Award and the 2015 ACM/IEEE Eckert-Mauchly Award. He is a Fellow of the ACM, IEEE, and AAAS, and a member of the National Academy of Engineering.

 

Symposium at a glance

Times are CEST; the grid runs from 8:30 to 21:00. Days: Sunday, June 9th; Monday, June 10th; Tuesday, June 11th; Wednesday, June 12th.

8:30 Registration (two days)
9:00 Monday: Welcome. Tuesday: Session 4, Math Tools, Libraries and Software Evaluation. Wednesday: Session 6, Arithmetic Operators.
9:30 Monday: Session 1, Arithmetic for Cryptography.
11:00 Monday, Tuesday, Wednesday: Coffee Break.
11:45 Monday: Keynote Talk, Andrew Fitzgibbon. Tuesday: Keynote Talk, Norman P. Jouppi. Wednesday: Session 7, Alternative formats.
12:45 Monday, Tuesday: Lunch. Wednesday: Close, followed by Lunch.
14:15 Monday: Session 2, Datapath Design I. Tuesday: Session 5, Transcendental functions and error analysis.
15:45 Monday, Tuesday: Coffee Break.
16:30 Monday: Session 3, Datapath Design II (until 17:30). Tuesday: Steering Committee Meeting (SC Members only).
18:30 Sightseeing tour of Malaga
19:00 Sunday: Welcome Reception. Monday: Reception at the Malaga City Hall. Tuesday: Banquet.

Detailed Program — Subject to Change

Practical information

All coffee breaks and lunches are included in the registration fee, as are the Welcome Reception and the Banquet.

Dinners on Monday and Wednesday are not covered by the conference.

More information on the social program can be found on the dedicated page.

Disclaimer: The opinions expressed in the papers on this website are the opinions of the authors/speakers and not necessarily the opinions of the IEEE or of the conference and its organizers.


Sessions:

Sessions Authors Title

Session 1  

Arithmetic for Cryptography

Monday 9:30 - 11:00 

David Du Pont, Jonas Bertels, Furkan Turan, Michiel Van Beirendonck and Ingrid Verbauwhede: “Hardware Acceleration of the Prime-Factor and Rader NTT for BGV Fully Homomorphic Encryption”
Décio Luiz Gazzoni Filho, Guilherme Brandão, Gora Adj, Arwa Alblooshi, Isaac Canales-Martínez, Jorge Chávez-Saab and Julio López: “PQC-AMX: accelerating Saber and FrodoKEM on the Apple M1 and M3 SoCs”
Zabihollah Ahmadpour, Ghassem Jaberipur and Jeong-A Lee: “Montgomery Modular Multiplication via Single-Base Residue Number Systems”

Session 2 

Datapath Design I

Monday 14:15 - 15:45

Samuel Coward, Theo Drane, Emiliano Morini and George Constantinides: “Combining Power and Arithmetic Optimization via Datapath Rewriting”
Tom Hubrecht, Claude-Pierre Jeannerod and Jean-Michel Muller: “Useful applications of correctly-rounded operators of the form ab+cd+e”
David Lutz, Anisha Saini, Mairin Kroes, Thomas Elmer and Harsha Valsaraju: “Fused FP8 4-Way Dot Product with Scaling and FP32 Accumulation”

Session 3 

Datapath Design II

Monday 16:30 - 17:30

Vassil Dimitrov, Richard Ford, Laurent Imbert, Arjuna Madanayake, Nilan Udayanga and Will Wray: “Multiple-base Logarithmic Quantization and Application in Reduced Precision AI Computations”
Zeynep Kaya and Mario Garrido: “Novel Access Patterns based on Overlapping Loading and Processing Times to Reduce Latency and Increase Throughput in Memory-based FFTs”

Session 4 

Math Tools, Libraries and Software Evaluation

Tuesday 9:00 - 11:00

Ping Tang: “An Open-Source RISC-V Vector Math Library”
Mantas Mikaitis: “MATLAB Simulator of Level-Index Arithmetic”
Mikael Henriksson, Theodor Lindberg and Oscar Gustafsson: “APyTypes: Algorithmic Data Types in Python for Efficient Simulation of Finite Word-Length Effects”
Vincent Lefèvre: “An Emacs-Cairo Scrolling Bug due to Floating-Point Inaccuracy”

Session 5 

Transcendental functions and error analysis

Tuesday 14:15 - 15:45

Joris van der Hoeven and Fredrik Johansson: “Fast multiple precision exp(x) with precomputations”
Hui Chen, Lianghua Quan and Weiqiang Liu: “HGH-CORDIC: A High-Radix Generalized Hyperbolic Coordinate Rotation Digital Computer”
Denis Arzelier, Florent Bréhard, Mioara Joldes and Marc Mezzarobba: “Rounding Error Analysis of an Orbital Collision Probability Evaluation Algorithm”

Session 6 

Arithmetic Operators

Wednesday 9:00 - 11:00

Martin Langhammer, Bogdan Pasca and Igor Kucherenko: “Multiplier Architecture with a Carry-Based Partial Product Encoding”
Theo Drane, Samuel Coward, Mertcan Temel and Joe Leslie-Hurd: “On the Systematic Creation of Faithfully Rounded Commutative Truncated Booth Multipliers”
Ziying Cui, Ke Chen, Bi Wu, Chenggang Yan, Yu Gong and Weiqiang Liu: “A Time Efficient Comprehensive Model of Approximate Multipliers for Design Space Exploration”
Andreas Boettcher and Martin Kumm: “Small Logic-based Multipliers with Incomplete Sub-Multipliers for FPGAs”

Session 7 

Alternative formats

Wednesday 11:45 - 12:45

Raul Murillo, Alberto Antonio Del Barrio García and Guillermo Botella: “Square Root Unit with Minimum Iterations for Posit Arithmetic”
Micaela Serôdio, João Lopes, Jose Sousa, Horácio Neto and Mário Vestias: “PT-Float: A Floating-Point Unit with Dynamically Varying Exponent and Fraction Sizes”

ARITH 2024 ACCEPTED PAPERS

ARITH 2024 Accepted Papers with Abstracts