easydel.modules.mistral.modeling_mistral

class easydel.modules.mistral.modeling_mistral.MistralAttention(*args: Any, **kwargs: Any)

Bases: UnifiedAttention

Multi-head attention layer with RoPE embeddings for Mistral models.

Inherits from UnifiedAttention with Mistral-specific customizations:

- Sliding window attention support
- Custom RoPE configuration
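To illustrate what sliding window attention support means in practice, here is a minimal JAX sketch of a causal mask restricted to a fixed window. The function name `sliding_window_causal_mask` is hypothetical and is not part of EasyDeL; the window convention (the current token counts toward the window) is also an assumption and may differ from the library's.

```python
import jax.numpy as jnp


def sliding_window_causal_mask(seq_len: int, window: int) -> jnp.ndarray:
    """Boolean mask where entry [q, k] is True if query q may attend to key k."""
    q_pos = jnp.arange(seq_len)[:, None]  # query positions as a column
    k_pos = jnp.arange(seq_len)[None, :]  # key positions as a row
    causal = k_pos <= q_pos               # never attend to future tokens
    in_window = (q_pos - k_pos) < window  # keep only the most recent `window` keys
    return causal & in_window


# Hypothetical sizes chosen for illustration only.
mask = sliding_window_causal_mask(seq_len=5, window=3)
```

Each row of `mask` has at most `window` True entries, so the cost of attention per query stays bounded as the sequence grows.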

class easydel.modules.mistral.modeling_mistral.MistralDecoderLayer(*args: Any, **kwargs: Any)

Bases: Module

Single decoder layer for Mistral models.

Combines sliding window attention with feedforward networks, using RMS normalization and residual connections.
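The combination of RMS normalization and residual connections can be sketched as a pre-norm block. This is an illustrative outline, not the EasyDeL implementation: the names `rms_norm` and `decoder_layer`, the pre-norm ordering, and the `eps` value are assumptions, and the attention and MLP sublayers are replaced by placeholder callables.

```python
import jax
import jax.numpy as jnp


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root-mean-square over the last axis."""
    variance = jnp.mean(jnp.square(x), axis=-1, keepdims=True)
    return x * jax.lax.rsqrt(variance + eps) * weight


def decoder_layer(x, attn_fn, mlp_fn, norm1_weight, norm2_weight):
    """Pre-norm residual block: normalize, apply the sublayer, add the input back."""
    h = x + attn_fn(rms_norm(x, norm1_weight))
    return h + mlp_fn(rms_norm(h, norm2_weight))


x = jnp.ones((2, 4, 8))  # (batch, seq_len, hidden), hypothetical sizes
w = jnp.ones((8,))
# Placeholder sublayers stand in for the real attention and MLP modules.
out = decoder_layer(x, lambda t: 0.5 * t, lambda t: 0.25 * t, w, w)
```

Because each sublayer only adds to its input, the output keeps the input shape and layers can be stacked freely.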

class easydel.modules.mistral.modeling_mistral.MistralForCausalLM(*args: Any, **kwargs: Any)

Bases: BaseCausalLMModule[MistralModel, MistralConfig]

Mistral model with a language modeling head for causal language modeling tasks.
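As a rough picture of what a language modeling head does, the sketch below projects hidden states onto the vocabulary and picks the next token greedily. The helper names `lm_head_logits` and `greedy_next_token` and the toy weights are hypothetical; this does not call or reproduce the EasyDeL API.

```python
import jax.numpy as jnp


def lm_head_logits(hidden_states, lm_head_weight):
    """Project (batch, seq, hidden) hidden states to (batch, seq, vocab) logits."""
    return hidden_states @ lm_head_weight


def greedy_next_token(logits):
    """Pick the highest-scoring vocabulary id at the final position."""
    return jnp.argmax(logits[:, -1, :], axis=-1)


# Toy values: batch of 1, sequence of 2, hidden size 2, vocabulary of 3.
hidden_states = jnp.array([[[1.0, 0.0], [0.0, 1.0]]])
lm_head_weight = jnp.array([[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
logits = lm_head_logits(hidden_states, lm_head_weight)
next_token = greedy_next_token(logits)
```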

get_decoder()

Returns the decoder part of the model’s graph definition.

get_embedding()

Returns the embedding layer of the module.

get_encoder()

Returns the encoder part of the model’s graph definition. Decoder-only models don’t have an encoder.

get_lm_head()

Returns the language model head of the module.

class easydel.modules.mistral.modeling_mistral.MistralForSequenceClassification(*args: Any, **kwargs: Any)

Bases: BaseSequenceClassificationModule[MistralModel, MistralConfig]

Mistral model for sequence classification tasks.
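One common way a decoder-only model produces sequence-level logits is to score the hidden state of the last non-padding token. The sketch below shows that idea only; whether EasyDeL pools this way is an assumption, and `pool_last_token`, `classify`, and `score_weight` are hypothetical names. It also assumes right-padded inputs.

```python
import jax.numpy as jnp


def pool_last_token(hidden_states, attention_mask):
    """Select the hidden state of the last non-padding token in each sequence."""
    last_index = attention_mask.sum(axis=-1) - 1  # assumes right padding
    batch_index = jnp.arange(hidden_states.shape[0])
    return hidden_states[batch_index, last_index]


def classify(hidden_states, attention_mask, score_weight):
    """Project the pooled hidden state to per-class logits."""
    return pool_last_token(hidden_states, attention_mask) @ score_weight


# Toy batch: 2 sequences, 3 positions, hidden size 2, 2 classes.
hidden_states = jnp.array(
    [[[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]],
     [[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]]]
)
attention_mask = jnp.array([[1, 1, 0], [1, 1, 1]])  # first sequence has one pad
score_weight = jnp.eye(2)
class_logits = classify(hidden_states, attention_mask, score_weight)
predicted = jnp.argmax(class_logits, axis=-1)
```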

get_decoder()

Returns the decoder part of the model’s graph definition.

get_embedding()

Returns the embedding layer of the module.

get_encoder()

Returns the encoder part of the model’s graph definition. Decoder-only models don’t have an encoder.

get_lm_head()

Returns the language model head of the module. This model has a sequence classification head, not an LM head.

class easydel.modules.mistral.modeling_mistral.MistralMLP(*args: Any, **kwargs: Any)

Bases: Module

Multi-Layer Perceptron module for Mistral models.

Implements the feedforward network with a SiLU activation function.
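A feedforward network with SiLU activation can be sketched as follows. The gated form used here (a SiLU-activated gate multiplied by a linear "up" projection, then projected back down) and the parameter names `w_gate`, `w_up`, and `w_down` are assumptions borrowed from common Mistral implementations; this page does not confirm them for EasyDeL.

```python
import jax
import jax.numpy as jnp


def gated_silu_mlp(x, w_gate, w_up, w_down):
    """Gated feedforward: down(silu(x @ w_gate) * (x @ w_up))."""
    gate = jax.nn.silu(x @ w_gate)  # SiLU-activated gating branch
    up = x @ w_up                   # linear branch
    return (gate * up) @ w_down     # project back to the hidden size


# Hypothetical sizes for illustration only.
hidden, intermediate = 8, 16
k1, k2, k3, k4 = jax.random.split(jax.random.PRNGKey(0), 4)
x = jax.random.normal(k1, (2, 4, hidden))
w_gate = jax.random.normal(k2, (hidden, intermediate)) * 0.02
w_up = jax.random.normal(k3, (hidden, intermediate)) * 0.02
w_down = jax.random.normal(k4, (intermediate, hidden)) * 0.02
y = gated_silu_mlp(x, w_gate, w_up, w_down)
```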

class easydel.modules.mistral.modeling_mistral.MistralModel(*args: Any, **kwargs: Any)

Bases: EasyDeLBaseModule

Mistral model implementation.

This implements the Mistral language model architecture, utilizing transformer blocks with RMSNorm, sliding window attention, and rotary position embeddings.
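For readers unfamiliar with rotary position embeddings, the sketch below rotates pairs of features by position-dependent angles. It uses the split-halves pairing and a base of 10000, both of which are assumptions rather than settings read from `MistralConfig`; `rope_angles` and `apply_rope` are hypothetical helper names.

```python
import jax.numpy as jnp


def rope_angles(seq_len, head_dim, base=10000.0):
    """Return cos/sin tables of shape (seq_len, head_dim // 2)."""
    inv_freq = 1.0 / (base ** (jnp.arange(0, head_dim, 2) / head_dim))
    angles = jnp.outer(jnp.arange(seq_len), inv_freq)
    return jnp.cos(angles), jnp.sin(angles)


def apply_rope(x, cos, sin):
    """Rotate feature pairs (x1[i], x2[i]) by the angle for each position."""
    x1, x2 = jnp.split(x, 2, axis=-1)
    return jnp.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)


x = jnp.ones((4, 8))  # (seq_len, head_dim), hypothetical sizes
cos, sin = rope_angles(seq_len=4, head_dim=8)
rotated = apply_rope(x, cos, sin)
```

Since each pair is only rotated, vector norms are preserved and position 0 is left unchanged.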

config

Configuration for the model.

Type: MistralConfig

dtype

Data type for computations.

Type: jnp.dtype

param_dtype

Data type for parameters.

Type: jnp.dtype

precision

Precision setting for JAX operations.

get_decoder()

Returns the decoder part of the model’s graph definition.

get_embedding()

Returns the embedding layer of the module.

get_encoder()

Returns the encoder part of the model’s graph definition. Decoder-only models don’t have an encoder.

get_lm_head()

Returns the language model head of the module. Base models don’t have a language model head.