easydel.modules.mistral.modeling_mistral
- class easydel.modules.mistral.modeling_mistral.MistralAttention(*args: Any, **kwargs: Any)
Bases: UnifiedAttention
Multi-head attention layer with RoPE embeddings for Mistral models.
Inherits from UnifiedAttention with Mistral-specific customizations:
  - Sliding window attention support
  - Custom RoPE configuration
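To make the sliding window idea concrete, here is a minimal single-head sketch in plain JAX. It does not use EasyDeL's UnifiedAttention API; the function names `sliding_window_causal_mask` and `masked_attention` are hypothetical, and the exact window boundary (a query sees itself plus the previous `window - 1` keys) is an assumption that may differ from the library's convention.

```python
import jax
import jax.numpy as jnp


def sliding_window_causal_mask(seq_len: int, window: int) -> jnp.ndarray:
    """Boolean (seq_len, seq_len) mask; True where query i may attend to key j."""
    q_pos = jnp.arange(seq_len)[:, None]  # query positions as a column
    k_pos = jnp.arange(seq_len)[None, :]  # key positions as a row
    causal = k_pos <= q_pos               # never attend to future tokens
    in_window = (q_pos - k_pos) < window  # assumed boundary: last `window` keys
    return causal & in_window


def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention restricted by a boolean mask."""
    scores = q @ k.T / jnp.sqrt(q.shape[-1])
    # Masked-out positions get the most negative finite value before softmax.
    scores = jnp.where(mask, scores, jnp.finfo(scores.dtype).min)
    weights = jax.nn.softmax(scores, axis=-1)
    return weights @ v


mask = sliding_window_causal_mask(seq_len=5, window=2)
x = jax.random.normal(jax.random.PRNGKey(0), (5, 8))
out = masked_attention(x, x, x, mask)  # shape (5, 8)
```

With a window at least as long as the sequence, the mask reduces to the ordinary causal mask.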
- class easydel.modules.mistral.modeling_mistral.MistralDecoderLayer(*args: Any, **kwargs: Any)
Bases: Module
Single decoder layer for Mistral models.
Combines sliding window attention with feedforward networks, using RMS normalization and residual connections.
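The data flow of such a layer can be sketched in a few lines of JAX. This is not the MistralDecoderLayer implementation: `rms_norm` and `decoder_layer` are hypothetical helpers, the attention and feedforward sublayers are passed in as plain callables, and the pre-normalization ordering (normalize, apply sublayer, add residual) is an assumption based on the usual Mistral layout.

```python
import jax.numpy as jnp


def rms_norm(x, weight, eps=1e-6):
    """Divide x by the root-mean-square of its last axis, then scale by weight."""
    mean_square = jnp.mean(jnp.square(x), axis=-1, keepdims=True)
    return x / jnp.sqrt(mean_square + eps) * weight


def decoder_layer(x, attn_fn, mlp_fn, attn_norm_weight, mlp_norm_weight):
    """Pre-norm residual block: attention sub-block, then feedforward sub-block."""
    # Attention sub-block: normalize, attend, add the residual.
    h = x + attn_fn(rms_norm(x, attn_norm_weight))
    # Feedforward sub-block: normalize, transform, add the residual.
    return h + mlp_fn(rms_norm(h, mlp_norm_weight))
```

Because each sublayer only adds to the residual stream, a layer whose sublayers output zeros is the identity, which is a convenient sanity check.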
- class easydel.modules.mistral.modeling_mistral.MistralForCausalLM(*args: Any, **kwargs: Any)
Bases: BaseCausalLMModule[MistralModel, MistralConfig]
Mistral model with a language modeling head for causal language modeling tasks.
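As a rough illustration of what a language modeling head does, the sketch below projects final hidden states to vocabulary logits and scores them against the next token. It is independent of BaseCausalLMModule; `causal_lm_loss`, its argument names, and the unbatched shapes are assumptions made for this example.

```python
import jax
import jax.numpy as jnp


def causal_lm_loss(hidden, lm_head, input_ids):
    """Next-token cross-entropy.

    hidden:    (seq_len, d_model) final hidden states
    lm_head:   (d_model, vocab_size) projection matrix
    input_ids: (seq_len,) integer token ids
    """
    logits = hidden @ lm_head                             # (seq_len, vocab_size)
    # Position t predicts token t + 1, so drop the last logit and the first id.
    log_probs = jax.nn.log_softmax(logits[:-1], axis=-1)
    targets = input_ids[1:]
    picked = jnp.take_along_axis(log_probs, targets[:, None], axis=-1)[:, 0]
    return -jnp.mean(picked)
```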
- class easydel.modules.mistral.modeling_mistral.MistralForSequenceClassification(*args: Any, **kwargs: Any)
Bases: BaseSequenceClassificationModule[MistralModel, MistralConfig]
Mistral model for sequence classification tasks.
- class easydel.modules.mistral.modeling_mistral.MistralMLP(*args: Any, **kwargs: Any)
Bases: Module
Multi-Layer Perceptron module for Mistral models.
Implements the feedforward network with SiLU activation function for efficient and effective representation learning.
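A minimal sketch of a SiLU feedforward block follows. It assumes the gated form commonly used by Mistral-style models (a SiLU-activated gate multiplied elementwise with an "up" projection, then a "down" projection) and omits biases. The function name `gated_silu_mlp` and the weight names are hypothetical and do not reflect MistralMLP's actual parameters.

```python
import jax
import jax.numpy as jnp


def gated_silu_mlp(x, w_gate, w_up, w_down):
    """Gated feedforward: down(silu(gate(x)) * up(x)), without biases."""
    gate = jax.nn.silu(x @ w_gate)  # (..., d_ff), SiLU-activated gate
    up = x @ w_up                   # (..., d_ff), linear branch
    return (gate * up) @ w_down     # back to (..., d_model)


keys = jax.random.split(jax.random.PRNGKey(0), 4)
d_model, d_ff = 8, 16
x = jax.random.normal(keys[0], (2, d_model))
w_gate = jax.random.normal(keys[1], (d_model, d_ff))
w_up = jax.random.normal(keys[2], (d_model, d_ff))
w_down = jax.random.normal(keys[3], (d_ff, d_model))
y = gated_silu_mlp(x, w_gate, w_up, w_down)  # shape (2, 8)
```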
- class easydel.modules.mistral.modeling_mistral.MistralModel(*args: Any, **kwargs: Any)
Bases: EasyDeLBaseModule
Mistral model implementation.
This implements the Mistral language model architecture, utilizing transformer blocks with RMSNorm, sliding window attention, and rotary position embeddings.
- config
Configuration for the model.
Type: MistralConfig
- dtype
Data type for computations.
Type: jnp.dtype
- param_dtype
Data type for parameters.
Type: jnp.dtype
- precision
Precision setting for JAX operations.