Integer Squaring Algorithm (sqr32)

2025-12-03

Takayuki HOSODA

Overview

This page describes a 32-bit integer squaring algorithm and its C implementation, designed for processors that provide only 16×16 multipliers.

By exploiting the symmetry of squaring and applying a Karatsuba-like decomposition, a single 32×32-bit multiplication is replaced with three 16×16-bit multiplications.

Note: On systems equipped with a native 64-bit multiplier, the ordinary expression a * a should be used instead.

Algorithm

a² = a_H² B² + 2 a_H a_L B + a_L²
B = 2¹⁶

Here, a_H and a_L denote the upper and lower 16-bit halves of the input value, respectively.

C Implementation

sqr32 \ computes the square of a given 32-bit integer and returns the result as a 64-bit unsigned integer.

/* sqr32 - calculate squre of given integer value
 * Rev.1.0 (2025-12-22) (c) 2009-2025 Takayuki HOSODA
 * SPDX-License-Identifier: BSD-3-Clause
 */
#include <inttypes.h>
uint64_t sqr32(int32_t a);
#ifdef HAVE_NO_MUL64
uint64_t sqr32(int32_t a) {
    uint32_t        ax, ah, al;
    uint64_t        bh, bm, bl;
    // Karatsuba method: a * a = a_h^2 * B^2 + 2 * a_h * a_l * B + a_l^2
    ax = (uint32_t)(a < 0 ? -a : a);       // assume two's complement; INT32_MIN wraps
    al = (ax      ) & 0xffff;              // lower 16-bit of input
    ah = (ax >> 16) & 0xffff;              // upper 16-bit of input
    bh = (uint64_t)(ah * ah) << (2 * 16);  // 16x16 multiplier and 64-bit shifter
    bm = (uint64_t)(ah * al) << (1 + 16);  // 16x16 multiplier and 64-bit shifter
    bl = (uint64_t)(al * al);              // 16x16 multiplier
    return bl + bm + bh;                   // 64-bit adder
}
#else
uint64_t sqr32(int32_t x) {
    int64_t eax = (int64_t)x;
    return (uint64_t)(eax * eax);
}
#endif

Note: Two's complement representation is assumed; INT32_MIN is treated as wrapping on negation.

Integer Squaring Algorithm (sqr32)

Overview

Algorithm

C Implementation

SEE ALSO