Working Notes: a commonplace notebook for recording & exploring ideas.
Fp8
- 8-bit floating point, in two formats:
- E4M3: 1 sign bit, 4 exponent bits, 3 mantissa bits; range -448 to 448, plus NaN (no infinities)
- E5M2: 1 sign bit, 5 exponent bits, 2 mantissa bits; range -57344 to 57344, plus inf and NaN
- [[Mixed precision training]]: scales the loss so that gradients stay within the range of values the low-precision format can represent
- FP8 uses a distinct scaling factor for each tensor
- just-in-time scaling: scale based on the maximum absolute value (amax) of the current tensor
- delayed scaling: scale based on the maximum absolute value seen over previous iterations
- [[Transformer Engine]]
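A rough sketch of the numbers and the two scaling strategies in the notes above, in plain Python. The names `compute_scale`, `just_in_time_scale`, `DelayedScaler` and `history_len` are made up for illustration and are not the Transformer Engine API. The sketch assumes the common FP8 encodings (exponent bias 7 for E4M3, bias 15 for E5M2) and an initial scale of 1.0 before any history exists.

```python
from collections import deque

# Largest finite values, derived from the bit layouts (assumed encodings).
# E4M3: bias 7; only exponent=1111 with mantissa=111 is NaN, so the largest
# finite value has exponent 1111 and mantissa 110.
E4M3_MAX = 2.0 ** (15 - 7) * (1 + 6 / 8)   # 256 * 1.75
# E5M2: bias 15; exponent=11111 is reserved for inf/NaN, so the largest
# finite value has exponent 11110 and mantissa 11.
E5M2_MAX = 2.0 ** (30 - 15) * (1 + 3 / 4)  # 32768 * 1.75


def amax(tensor):
    """Maximum absolute value of a flat list of floats."""
    return max(abs(x) for x in tensor)


def compute_scale(amax_value, fp8_max):
    """Scale factor that maps amax_value onto the top of the FP8 range."""
    if amax_value == 0:
        return 1.0  # nothing to scale; avoid division by zero
    return fp8_max / amax_value


def just_in_time_scale(tensor, fp8_max):
    """Just-in-time scaling: use the amax of the tensor being cast."""
    return compute_scale(amax(tensor), fp8_max)


class DelayedScaler:
    """Delayed scaling: use the largest amax seen in recent iterations."""

    def __init__(self, fp8_max, history_len=4):
        self.fp8_max = fp8_max
        self.history = deque(maxlen=history_len)  # hypothetical window size

    def scale_for(self, tensor):
        # The scale comes only from previous iterations; the current
        # tensor's amax is recorded for use in later iterations.
        if self.history:
            scale = compute_scale(max(self.history), self.fp8_max)
        else:
            scale = 1.0  # assumed starting scale
        self.history.append(amax(tensor))
        return scale
```

Just-in-time scaling needs an extra pass over the tensor before casting. Delayed scaling avoids that pass, but the scale can be stale, so a value larger than anything in the history may overflow the FP8 range unless it is clamped.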
— Kunal