TSR-GEMM: Tile-Selective Precision Recovery for Robust Mixed-Precision Matrix Multiplication on GPU Tensor Cores

Published 20 April 2026

Updated 20 April 2026

Authors:

Dr. Pavithra L¹, Vedant Singh Chauhan², Ananya Singh³, Vinayak Shrivastava⁴, Abhinav Rai⁵

¹Department of Computational Intelligence, SRMIST

Chennai, India

{vc2685, as1178, vs, ar}@srmist.edu.in

Abstract:

Modern deep learning frameworks increasingly rely on mixed-precision general matrix multiplication (GEMM) to exploit the throughput advantages of half-precision (FP16) Tensor Cores on NVIDIA GPUs. While FP16 GEMM delivers substantial speedups over FP32 computation, it introduces numerical errors that are spatially non-uniform across the output matrix, concentrated in tiles whose input sub-blocks exhibit high condition numbers or significant cancellation. Existing recovery mechanisms, such as iterative refinement, operate at matrix-global granularity and therefore cannot exploit this spatial locality. We introduce TSR-GEMM (Tile-Selective Residual GEMM), a three-phase mixed-precision GEMM pipeline that (1) performs the bulk computation in FP16 using Tensor Cores while simultaneously accumulating per-tile norm statistics, (2) evaluates a lightweight instability score for each output tile based on input panel and output tile norms, and (3) selectively re-computes only flagged tiles in FP32 via cuBLAS. TSR-GEMM exposes a single tunable threshold τ that governs the precision–performance trade-off. On an NVIDIA RTX 3050 Ti GPU across matrix dimensions from 512×512 to 4096×4096, TSR-GEMM achieves FP32-comparable accuracy (5.4 × 10⁻⁸ relative error) at full recovery, while at 70% tile recovery it reduces error by 8× over pure FP16 with only a 12% throughput reduction relative to the no-recovery baseline. The τ sweep reveals a smooth, well-behaved Pareto frontier, confirming the instability score as a reliable predictor of per-tile numerical risk.

Index Terms—mixed-precision arithmetic, GEMM, Tensor Cores, CUDA, numerical accuracy, tile-selective recovery, GPU computing
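The three phases described in the abstract can be illustrated with a small CPU-only sketch. It is not the authors' implementation: FP16 is emulated by rounding through the standard-library `struct` half-precision format, Python double precision stands in for the FP32 cuBLAS path, and the score `‖A_panel‖·‖B_panel‖ / ‖C_tile‖` is an assumed placeholder, since the abstract states only that the score depends on input panel and output tile norms. All function names and the τ values in the sweep are hypothetical.

```python
import math
import random
import struct

def to_fp16(x):
    """Round a float to the nearest IEEE half-precision value (emulated FP16)."""
    return struct.unpack("e", struct.pack("e", x))[0]

def fro(rows):
    """Frobenius norm of a list-of-lists block."""
    return math.sqrt(sum(v * v for row in rows for v in row))

def matmul(A, B, rows, cols, quantize):
    """C[rows, cols] = A[rows, :] @ B[:, cols]; rounds inputs and outputs to FP16 if quantize."""
    q = to_fp16 if quantize else (lambda v: v)
    K = len(B)
    return [[q(sum(q(A[i][k]) * q(B[k][j]) for k in range(K))) for j in cols]
            for i in rows]

def tsr_gemm(A, B, tile, tau):
    """Tile-selective recovery sketch. Returns (C, fraction of tiles recomputed)."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0.0] * N for _ in range(M)]
    flagged = total = 0
    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            rows = range(i0, min(i0 + tile, M))
            cols = range(j0, min(j0 + tile, N))
            # Phase 1: low-precision tile plus norm statistics.
            c_tile = matmul(A, B, rows, cols, quantize=True)
            norm_a = fro([A[i] for i in rows])
            norm_b = fro([[B[k][j] for j in cols] for k in range(K)])
            norm_c = fro(c_tile)
            # Phase 2: ASSUMED score; large when the output is small
            # relative to its inputs, which suggests cancellation.
            score = norm_a * norm_b / (norm_c + 1e-30)
            # Phase 3: recompute only flagged tiles in higher precision.
            if score >= tau:
                c_tile = matmul(A, B, rows, cols, quantize=False)
                flagged += 1
            total += 1
            for r, i in enumerate(rows):
                for c, j in enumerate(cols):
                    C[i][j] = c_tile[r][c]
    return C, flagged / total

def rel_error(C, ref):
    """Relative Frobenius error of C against a reference result."""
    diff = [[c - r for c, r in zip(cr, rr)] for cr, rr in zip(C, ref)]
    return fro(diff) / fro(ref)

if __name__ == "__main__":
    random.seed(0)
    n = 16
    A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    ref = matmul(A, B, range(n), range(n), quantize=False)
    for tau in (float("inf"), 5.0, 4.0, 3.0, 0.0):  # hypothetical thresholds
        C, frac = tsr_gemm(A, B, tile=4, tau=tau)
        print(f"tau={tau}: recovered {frac:.0%} of tiles, rel. error {rel_error(C, ref):.2e}")
```

Lowering τ in this sketch can only add tiles to the flagged set, so the error against the reference never increases as the recovered fraction grows. A GPU version would run phase 1 as one Tensor Core GEMM over the whole matrix and batch the flagged tiles for phase 3, rather than interleaving the phases per tile as done here.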

International Scientific Journal of Engineering and Management
