easydel.layers.quantization.__init__

class easydel.layers.quantization.__init__.EasyQuantizer(quantization_method: EasyDeLQuantizationMethods = EasyDeLQuantizationMethods.NF4, quantization_platform: Optional[EasyDeLPlatforms] = EasyDeLPlatforms.JAX, quantization_pattern: Optional[str] = None, block_size: int = 256, **kwargs)

Bases: object

quantize_linears(model: Module, /, *, quantization_pattern: Optional[str] = None, verbose: bool = True) → Module

Quantize the model's linear layers to the requested precision. Only layers matching quantization_pattern are quantized; all other layers are left unchanged.

Parameters
  • model – The model to quantize.

  • quantization_pattern (str) – Regular expression pattern selecting the layers to be quantized.

  • verbose (bool) – Whether to use tqdm to report progress.

Returns

The quantized model, with the same structure as the input.
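
The role of quantization_pattern can be illustrated with a minimal sketch using only the standard re module. The layer paths, the select_layers helper, the use of re.search, and the "None selects everything" behaviour are all assumptions made for illustration; they are not taken from the EasyDeL source.

```python
import re

# Hypothetical layer paths; real names depend on the model definition.
layer_paths = [
    "model/layers/0/self_attn/q_proj",
    "model/layers/0/self_attn/o_proj",
    "model/layers/0/mlp/gate_proj",
    "model/embed_tokens",
    "lm_head",
]


def select_layers(paths, quantization_pattern=None):
    """Return the paths that a pattern would mark for quantization.

    Assumptions (not confirmed by the documentation): the pattern is
    applied with re.search, and a pattern of None selects every path.
    """
    if quantization_pattern is None:
        return list(paths)
    compiled = re.compile(quantization_pattern)
    return [path for path in paths if compiled.search(path)]


# Select only the query and gate projections; everything else is skipped.
selected = select_layers(layer_paths, r".*(q_proj|gate_proj)$")
for path in selected:
    print("would quantize:", path)
```

A pattern that is too broad may also match embeddings or the output head, so it can be worth printing the selected paths before quantizing a real model.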

class easydel.layers.quantization.__init__.Linear8bit(*args: Any, **kwargs: Any)

Bases: QauntModule

An 8-bit quantized version of the linear transformation applied over the last dimension of the input.

classmethod from_linear(linear: Linear, rngs: Optional[Rngs] = None, **kwargs) → Linear8bit

Create a Linear8bit module from a regular Linear module.

Parameters
  • linear – The source Linear module

  • rngs – Random number generator state

Returns

A new Linear8bit module with quantized weights

get_kernel()

Get the dequantized quant_kernel weights.

get_quantized_kernel()

Get the quantized quant_kernel weights and quant_scales.
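
To make the relationship between the quantized kernel, its scales, and the dequantized kernel concrete, here is a minimal pure-Python sketch of symmetric absmax 8-bit quantization. It is a generic illustration, not EasyDeL's implementation: the helper names and the per-row scale granularity are assumptions.

```python
def quantize_int8(row):
    """Symmetric absmax quantization of one row of floats to int8 codes.

    Returns the integer codes and the single scale needed to undo them.
    """
    absmax = max(abs(v) for v in row)
    # Guard against an all-zero row, where any scale would do.
    scale = absmax / 127.0 if absmax > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in row]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from int8 codes and their scale."""
    return [c * scale for c in codes]


kernel_row = [0.5, -1.27, 0.003, 0.9]
codes, scale = quantize_int8(kernel_row)   # analogous to a quantized kernel plus scales
restored = dequantize_int8(codes, scale)   # analogous to a dequantized kernel

for original, approx in zip(kernel_row, restored):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

With this scheme the rounding error of each element is at most half of the scale, which is why small values next to a large outlier lose the most relative precision.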

static metadata()

static quantization_mapping()

to_linear(rngs: Optional[Rngs] = None) → Linear

Convert this Linear8bit module back to a regular Linear module.

Parameters

rngs – Random number generator state

Returns

A new Linear module with dequantized weights

class easydel.layers.quantization.__init__.LinearNF4(*args: Any, **kwargs: Any)

Bases: QauntModule

A 4-bit quantized version of the linear transformation using NF4 quantization.

classmethod from_linear(linear: Linear, rngs: Optional[Rngs] = None, block_size: int = 128, **kwargs) → LinearNF4

get_kernel()

Get the dequantized quant_kernel weights.

get_quantized_kernel()

Get the quantized quant_kernel weights and quant_scales.
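
The following pure-Python sketch shows how blockwise NF4 quantization is commonly described: each block is normalized by its absolute maximum, and every value is replaced by the index of the nearest of 16 fixed NF4 levels. The level table is the one published with QLoRA; the helper names, the flat-list layout, and the tiny block_size of 4 are assumptions chosen for readability (the signatures above default to 128 or 256), and the packing of two 4-bit codes per byte is omitted.

```python
# The 16 NF4 code values (as published with QLoRA), sorted ascending.
NF4_LEVELS = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def quantize_nf4(values, block_size=4):
    """Blockwise NF4: one absmax scale per block, one 4-bit index per value."""
    indices, scales = [], []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        absmax = max(abs(v) for v in block) or 1.0  # avoid dividing by zero
        scales.append(absmax)
        for v in block:
            normalized = v / absmax  # now within [-1, 1]
            nearest = min(range(16), key=lambda i: abs(NF4_LEVELS[i] - normalized))
            indices.append(nearest)
    return indices, scales


def dequantize_nf4(indices, scales, block_size=4):
    """Look up each level and rescale it by the scale of its block."""
    return [NF4_LEVELS[idx] * scales[pos // block_size]
            for pos, idx in enumerate(indices)]


weights = [0.2, -0.4, 0.1, 0.0, 3.0, -1.5, 0.75, 0.0]
indices, scales = quantize_nf4(weights, block_size=4)
restored = dequantize_nf4(indices, scales, block_size=4)
print(indices)
print(scales)
```

Smaller blocks track local weight magnitudes more closely at the cost of storing more scales, which is the trade-off a block_size parameter controls.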

static metadata()

static quantization_mapping()

to_linear(rngs: Optional[Rngs] = None) → Linear