Preprint
Article

This version is not peer-reviewed.

Multiplication-Free Gaussian Elimination and Matrix Inversion via Bitplane Semantics: Exact Trailing Updates with Boolean GEMM, GF(2) Gauss–Jordan, Bareiss, and Modular CRT

Submitted: 11 September 2025

Posted: 16 September 2025

Abstract
We extend our multiplication-free bit-sliced paradigm from matrix multiplication and determinants to Gaussian elimination and matrix inversion. All trailing updates are bilinear and can be executed by a Boolean (bit-sliced) GEMM that uses only bitwise AND, population count, shifts, and additions, so the matrix products themselves require zero scalar multiplications. We integrate the bit-sliced core with three exact pivot/inversion regimes: (i) fully Boolean Gauss–Jordan over $\mathbb{F}_2$; (ii) fraction-free Bareiss over $\mathbb{Z}$; (iii) modular LU over primes with CRT reconstruction. We provide executable code and small numerical checks; all GEMM-shaped updates are multiplication-free.

1. Introduction

Elimination and inversion reduce to panel factorizations and trailing matrix–matrix updates [1,2]. Our bit-sliced (bitplane) GEMM reconstructs integer products from Boolean plane products using AND+POPCNT and power-of-two shifts, thus removing all scalar multiplications from matrix products. This paper applies the same mechanism to Schur-complement updates in elimination/inversion and combines it with exact pivot regimes: Gauss–Jordan over $\mathbb{F}_2$, Bareiss over $\mathbb{Z}$, and modular LU + CRT. We emphasize exactness and reproducibility with simple Python reference code.

Contributions.

(1) Bit-sliced Boolean trailing updates inside blocked LU (zero scalar multiplications for all GEMMs); (2) a pure-Boolean Gauss–Jordan over $\mathbb{F}_2$; (3) fraction-free Bareiss with bit-sliced numerators; (4) a modular LU + CRT path with bit-sliced updates modulo primes; (5) executable code and verified small-case numerics.

2. Related Work

Foundational accuracy and stability are treated by Higham and by Golub–Van Loan [1,2]. Bareiss introduced fraction-free elimination [3]. CRT and residue-number reconstruction for exact linear algebra are classical [4,5]. Bit-sliced/bit-serial multiplication appears in hardware/software works such as BISMO [6] and in popcount-oriented references [7,8].

3. Method

3.1. Bit-Sliced Boolean GEMM (Recap)

Let $A = \sum_b A^{(b)} 2^b$ and $B = \sum_b B^{(b)} 2^b$, where the bit planes $A^{(b)}$ and $B^{(b)}$ have entries in $\{0,1\}$. With row-packing for $A^{(b)}$ and column-packing for $B^{(b)}$ along the shared inner dimension,
$$AB = \sum_{b_1, b_2} 2^{\,b_1 + b_2}\, \operatorname{popcount}\!\left(A^{(b_1)}_{\mathrm{rows}} \wedge B^{(b_2)}_{\mathrm{cols}}\right),$$
using only Boolean operations and shifts. Exactness holds in integer/modular settings provided accumulator widths (or CRT) are sufficient.
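The recap above can be illustrated with a minimal pure-Python sketch. It is not the Appendix code: the names `popcount`, `pack_planes`, and `bitsliced_gemm` are hypothetical, entries are assumed to be non-negative integers, and Python's unbounded integers stand in for sufficiently wide accumulators.

```python
def popcount(x):
    """Number of set bits in a non-negative Python int."""
    return bin(x).count("1")


def pack_planes(vectors, nbits):
    """Pack each vector (a row of A or a column of B) into nbits bitmasks.

    Bit k of planes[i][b] equals bit b of vectors[i][k], so k indexes the
    shared inner dimension.
    """
    planes = []
    for vec in vectors:
        masks = [0] * nbits
        for k, value in enumerate(vec):
            for b in range(nbits):
                if (value >> b) & 1:
                    masks[b] |= 1 << k
        planes.append(masks)
    return planes


def bitsliced_gemm(A, B):
    """Exact product of non-negative integer matrices (lists of lists).

    Only AND, popcount, shifts, and additions touch the matrix entries.
    """
    n_inner = len(B)
    cols = [[B[k][j] for k in range(n_inner)] for j in range(len(B[0]))]
    bits_a = max(1, max(max(r) for r in A).bit_length())
    bits_b = max(1, max(max(c) for c in cols).bit_length())
    pa = pack_planes(A, bits_a)      # row-packed planes of A
    pb = pack_planes(cols, bits_b)   # column-packed planes of B
    C = [[0] * len(cols) for _ in A]
    for i in range(len(A)):
        for j in range(len(cols)):
            acc = 0
            for b1 in range(bits_a):
                for b2 in range(bits_b):
                    # popcount of the AND is the Boolean dot product of planes
                    acc += popcount(pa[i][b1] & pb[j][b2]) << (b1 + b2)
            C[i][j] = acc
    return C


print(bitsliced_gemm([[3, 5], [2, 7]], [[1, 4], [6, 2]]))  # [[33, 22], [44, 22]]
```

The double loop over bit planes mirrors the sum over $b_1, b_2$; a production kernel would pack into fixed-width machine words and block over the inner dimension.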

3.2. Where GEMM Appears in Elimination

In a blocked LU, after forming a panel with pivots, the trailing update is
$$A_{22} \leftarrow A_{22} - L_{21} U_{12},$$
a GEMM that the bit-sliced core handles directly. Pivoting (row swaps) is compatible; we keep all panel solves and scalar inverses outside the matrix-product core.
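As a small illustration of the trailing update in the modular setting, the sketch below applies one update modulo a prime, with the product $L_{21} U_{12}$ formed by a compact bit-sliced helper. The function names `bool_gemm` and `trailing_update_mod_p` are hypothetical, the panel blocks are assumed to be already computed and reduced to residues in $[0, p)$, and the final subtraction and reduction stay outside the Boolean core.

```python
def bool_gemm(A, B):
    """Compact bit-sliced product of non-negative integer matrices."""
    inner = range(len(B))
    width = max(1, max(max(map(max, A)), max(map(max, B))).bit_length())
    # Rows of A and columns of B, packed per bit plane along the inner index.
    ra = [[sum(((A[i][k] >> b) & 1) << k for k in inner) for b in range(width)]
          for i in range(len(A))]
    cb = [[sum(((B[k][j] >> b) & 1) << k for k in inner) for b in range(width)]
          for j in range(len(B[0]))]
    return [[sum(bin(x & y).count("1") << (b1 + b2)
                 for b1, x in enumerate(r) for b2, y in enumerate(c))
             for c in cb] for r in ra]


def trailing_update_mod_p(A22, L21, U12, p):
    """Return (A22 - L21 @ U12) mod p; the product uses bool_gemm only."""
    P = bool_gemm(L21, U12)  # L21 and U12 hold residues in [0, p)
    return [[(a - c) % p for a, c in zip(row_a, row_p)]
            for row_a, row_p in zip(A22, P)]


# One elimination step on [[2,1,3],[4,5,6],[1,0,2]] modulo 7:
# pivot 2 has inverse 4, so L21 = [[4*4 % 7], [1*4 % 7]] = [[2], [4]].
print(trailing_update_mod_p([[5, 6], [0, 2]], [[2], [4]], [[1, 3]], 7))
# [[3, 0], [3, 4]]
```

The panel step (the modular inverse of the pivot and the scaling of the column) is done by hand in the comment, matching the convention that scalar inverses stay outside the matrix-product core.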

3.3. Exact Regimes

GF(2) Gauss–Jordan. Addition is XOR and $1^{-1} = 1$, so row scaling vanishes; the algorithm is purely Boolean.
Bareiss (fraction-free). Divisions are exact by construction; the update numerators $a_{kk} a_{ij} - a_{ik} a_{kj}$ are bilinear and can be formed by the bit-sliced product; exactness follows from Bareiss divisibility [3].
Modular LU + CRT. Perform LU modulo several primes; panel inverses are modular scalars; updates are bit-sliced modulo $p$; reconstruct over $\mathbb{Z}$ via CRT once the product of primes exceeds a bound [4,5].
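The first of these regimes is short enough to sketch in full. The following is an illustrative Gauss–Jordan inversion over GF(2), not the Appendix listing: the name `gf2_inverse` and the packing convention (each row stored as a Python int, bit $j$ holding column $j$) are assumptions made for the example.

```python
def gf2_inverse(rows, n):
    """Invert an n x n matrix over GF(2).

    rows[i] is an int whose bit j is entry (i, j). Returns the inverse in
    the same packing, or None if the matrix is singular.
    """
    # Augmented rows [A | I]: low n bits hold A, high n bits hold I.
    aug = [rows[i] | (1 << (n + i)) for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if (aug[r] >> col) & 1), None)
        if pivot is None:
            return None  # no pivot in this column: singular
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and (aug[r] >> col) & 1:
                aug[r] ^= aug[col]  # row addition over GF(2) is XOR
    return [a >> n for a in aug]


# A = [[1,1,0],[0,1,1],[0,0,1]] packed as [0b011, 0b110, 0b100]
print(gf2_inverse([0b011, 0b110, 0b100], 3))  # [7, 6, 4]
```

The result `[7, 6, 4]` unpacks to `[[1,1,1],[0,1,1],[0,0,1]]`. No scaling step appears because the only possible pivot value is 1.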

4. Results: Multiplication Counts

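We count conventional, multiply-based matrix–matrix products (GEMMs) only; a Boolean GEMM counts as zero because it performs no scalar multiplications. Each trailing update in a blocked LU costs one GEMM per block traditionally and zero in the bit-sliced model. Gauss–Jordan over $\mathbb{F}_2$ uses only row operations (XOR), so it needs no GEMM in either model.

| Task | Traditional GEMMs | Bit-sliced GEMMs |
|---|---|---|
| Trailing update $A_{22} \leftarrow A_{22} - L_{21} U_{12}$ | 1 per block | 0 |
| Blocked LU (integer / mod $p$) | many | 0 (all trailing) |
| Gauss–Jordan over $\mathbb{F}_2$ | 0 | 0 |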

5. Numerical Illustrations

Executable tests: (i) bit-sliced GEMM equals NumPy integer GEMM; (ii) GF(2) inverse on an invertible example; (iii) blocked LU driver that calls Boolean GEMM for each trailing update and reports GEMM counts; (iv) modular CRT inverse sanity check; and (v) Bareiss 2 × 2 step with exact divisibility.
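A check in the spirit of item (v) might look like the sketch below, which runs fraction-free elimination on a small integer matrix and asserts that every division is exact. The function name `bareiss_det` is hypothetical, and the numerators are written with ordinary `*` for brevity; in the scheme described in this paper they would be routed through the bit-sliced product.

```python
def bareiss_det(M):
    """Determinant of a square integer matrix by fraction-free elimination."""
    A = [row[:] for row in M]
    n = len(A)
    sign, prev = 1, 1  # prev is the previous pivot (1 before the first step)
    for k in range(n - 1):
        if A[k][k] == 0:
            swap = next((r for r in range(k + 1, n) if A[r][k] != 0), None)
            if swap is None:
                return 0  # whole column is zero: singular
            A[k], A[swap] = A[swap], A[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # 2 x 2 Bareiss numerator; plain '*' used here for brevity
                num = A[i][j] * A[k][k] - A[i][k] * A[k][j]
                assert num % prev == 0  # Bareiss divisibility
                A[i][j] = num // prev
        prev = A[k][k]
    return sign * A[n - 1][n - 1]


print(bareiss_det([[2, 1, 3], [4, 5, 6], [1, 0, 2]]))  # 3
```

All intermediate entries remain integers, which is why the internal assertion never fires on integer input.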

6. Discussion and Limitations

The bit-sliced model eliminates scalar multiplications from all matrix products within elimination/inversion. Exactness over Z follows by Bareiss or modular LU + CRT. Practical speed depends on POPCNT throughput and bandwidth. Large problems need careful blocking, packing, and prime/bound selection for CRT.
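To make the prime/bound selection concrete, the sketch below recovers an integer determinant from residues modulo several primes. It is a simplified instance of the modular route (determinant rather than a full LU or inverse), and it assumes Hadamard's inequality as the bound, which is one common choice and not necessarily the one used in the Appendix. All function names are hypothetical, and the elimination uses ordinary modular arithmetic where a full implementation would call the bit-sliced update.

```python
from math import isqrt


def det_mod_p(M, p):
    """Determinant of M modulo a prime p by Gaussian elimination."""
    A = [[x % p for x in row] for row in M]
    n, det = len(A), 1
    for k in range(n):
        piv = next((r for r in range(k, n) if A[r][k]), None)
        if piv is None:
            return 0
        if piv != k:
            A[k], A[piv] = A[piv], A[k]
            det = -det
        det = det * A[k][k] % p
        inv = pow(A[k][k], p - 2, p)  # Fermat inverse, valid since p is prime
        for i in range(k + 1, n):
            f = A[i][k] * inv % p
            for j in range(k, n):
                A[i][j] = (A[i][j] - f * A[k][j]) % p
    return det % p


def hadamard_bound(M):
    """Integer upper bound on |det M|: product of row 2-norms, rounded up."""
    prod = 1
    for row in M:
        prod *= sum(x * x for x in row)
    return isqrt(prod) + 1


def det_via_crt(M, primes):
    """Combine residues until the modulus exceeds twice the bound."""
    bound = hadamard_bound(M)
    modulus, residue = 1, 0
    for p in primes:
        r = det_mod_p(M, p)
        # Incremental CRT: keep residue mod modulus, match r mod p.
        t = (r - residue) * pow(modulus % p, p - 2, p) % p
        residue += modulus * t
        modulus *= p
        if modulus > 2 * bound:
            break
    else:
        raise ValueError("not enough primes for the bound")
    # Symmetric lift so that negative determinants are recovered.
    return residue - modulus if residue > modulus // 2 else residue


print(det_via_crt([[2, 1, 3], [4, 5, 6], [1, 0, 2]], [7, 11, 13]))  # 3
```

The factor of two in the stopping test leaves room for the sign. The primes in the list are assumed to be distinct.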

Funding

No external funding was received.

Data and Code Availability

All illustrative code is in the Appendix.

AI Assistance Disclosure

Drafting and editing were supported by an AI system; the author is responsible for all content and claims.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A Python Reference (Executable)

Appendix A.1. Bit-Sliced Boolean GEMM, GF(2) Inverse, Blocked LU Driver, CRT Inverse, Bareiss, Tests

[Code listings: Preprints 176361, images g001 to g004; not reproduced in this text.]

References

  1. N. J. Higham. Accuracy and Stability of Numerical Algorithms, 2nd ed. SIAM, 2002.
  2. G. H. Golub and C. F. Van Loan. Matrix Computations, 4th ed. Johns Hopkins, 2013.
  3. E. H. Bareiss. Sylvester’s Identity and Multistep Integer-Preserving Gaussian Elimination. Math. Comp. 1968, 22, 565–578.
  4. H. Garner. The Residue Number System. IRE Trans. Electronic Computers 1959, EC-8, 140–147.
  5. R. Crandall and C. Pomerance. Prime Numbers: A Computational Perspective, 2nd ed. Springer, 2005.
  6. Y. Umuroglu and M. Jahre. BISMO: A Scalable Bit-Serial Matrix Multiplication Overlay for Reconfigurable Computing. FPL, 2018.
  7. H. S. Warren. Hacker’s Delight, 2nd ed. Addison-Wesley, 2012.
  8. D. E. Knuth. The Art of Computer Programming, Vol. 4A. Addison-Wesley, 2011.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.