My misunderstanding about SME

My misunderstanding about SME

2025, Dec 25    

Misunderstanding about ZA size and shape

Assume SVL = 512 bits, and the element type is float32.

Misunderstanding ❌: ZA has size $SVL×SVL$, so I would have $512*512 = 2^{18}$ bits, which means my ZA register can hold $2^{13}$ floats.

Correct ✅: The ZA matrix is square at the “byte” level, not at the “bit” level.

ZA is not a bitmap. It is a “rectangle” with a height of $SVL/8$ rows, and each row is $SVL/8$ bytes wide.

      <--------- Width = 512 bits (64 Bytes) --------->
    +--------------------------------------------------+
 R  | Byte 0 | Byte 1 | ...                  | Byte 63 |
 o  |--------+--------+----------------------+---------|
 w  | Byte 0 | Byte 1 | ...                  | Byte 63 |
 s  |--------+--------+----------------------+---------|
    | ...    | ...    | ...                  | ...     |
(64)|--------+--------+----------------------+---------|
    | Byte 0 | Byte 1 | ...                  | Byte 63 |
    +--------------------------------------------------+

So the actual size of the ZA matrix is:

$64*512 \text{ bits} = 2^{15} \text{ bits} = 2^{12} \text{ bytes}$

About tile size and shape

In SME: one tile is always a square of N x N elements.

The value of N depends on how many such elements fit in one Z register.

Therefore, the number of tiles changes with the element size.

Data type Element size (Bytes) Vector capacity (Elements) Tile shape (Elements) Number of tiles Naming
int8 (byte) 1 64 64 x 64 1 ZA0.b
float16 2 32 32 x 32 2 ZA0.h, ZA1.h
float32 4 16 16 x 16 4 ZA0.sZA3.s
float64 8 8 8 x 8 8 ZA0.dZA7.d