easydel.modules.deepseek_v2.modeling_deepseek_flax#
- class easydel.modules.deepseek_v2.modeling_deepseek_flax.DeepseekV2Attention(*args: Any, **kwargs: Any)[source]#
Bases:
AttentionModule
- class easydel.modules.deepseek_v2.modeling_deepseek_flax.DeepseekV2DecoderLayer(*args: Any, **kwargs: Any)[source]#
Bases:
Module
- class easydel.modules.deepseek_v2.modeling_deepseek_flax.DeepseekV2ForCausalLM(*args: Any, **kwargs: Any)[source]#
Bases:
EasyDeLBaseModule
DeepseekV2 model with a language modeling head for causal language modeling tasks.
This model extends the base DeepseekV2Model by adding a linear language modeling head on top of the transformer. It is intended for generative tasks such as text generation.
- class easydel.modules.deepseek_v2.modeling_deepseek_flax.DeepseekV2MLP(*args: Any, **kwargs: Any)[source]#
Bases:
Module
- class easydel.modules.deepseek_v2.modeling_deepseek_flax.DeepseekV2MoE(*args: Any, **kwargs: Any)[source]#
Bases:
Module
- class easydel.modules.deepseek_v2.modeling_deepseek_flax.DeepseekV2Model(*args: Any, **kwargs: Any)[source]#
Bases:
EasyDeLBaseModule
- property frequencies#
Retrieves or computes the frequency components (e.g., for RoPE) from the configuration.
Uses self.config.get_basic_frequencies() and caches the result.
- Returns
The frequency components, potentially cached.
- Return type
jnp.ndarray
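The compute-once-then-cache behaviour described for `frequencies` can be sketched as follows. This is only an illustration: `HypotheticalConfig` and `HypotheticalModel` are stand-ins invented here, the plain-list return value is an assumption (the real property is documented as returning a `jnp.ndarray`), and only the method name `get_basic_frequencies` comes from the documentation above.

```python
from functools import cached_property


class HypotheticalConfig:
    """Stand-in for a model config; not EasyDeL's real class."""

    def __init__(self, head_dim: int, base: float = 10000.0):
        self.head_dim = head_dim
        self.base = base
        self.calls = 0  # counts how often the frequencies are recomputed

    def get_basic_frequencies(self):
        # Assumed content: standard RoPE inverse frequencies base ** (-2i / head_dim).
        self.calls += 1
        return [self.base ** (-2 * i / self.head_dim) for i in range(self.head_dim // 2)]


class HypotheticalModel:
    """Stand-in for a model that exposes a cached `frequencies` property."""

    def __init__(self, config):
        self.config = config

    @cached_property
    def frequencies(self):
        # Computed on first access, then stored on the instance and reused.
        return self.config.get_basic_frequencies()


model = HypotheticalModel(HypotheticalConfig(head_dim=8))
first = model.frequencies
second = model.frequencies  # served from the cache, no recomputation
```

Here the second access returns the same object as the first, and `get_basic_frequencies()` runs only once.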
- class easydel.modules.deepseek_v2.modeling_deepseek_flax.MoEGate(*args: Any, **kwargs: Any)[source]#
Bases:
Module
- easydel.modules.deepseek_v2.modeling_deepseek_flax.apply_rotary_pos_emb(q, k, cos, sin, position_ids, unsqueeze_dim=1)[source]#
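The page does not document what `apply_rotary_pos_emb` does internally. As a rough orientation, the sketch below shows the common "rotate-half" form of rotary position embeddings on a single vector in pure Python. The helpers `rotate_half`, `apply_rope`, and `dot` are hypothetical names introduced for this example, and the library function may differ in details such as layout, batching, and how `position_ids` and `unsqueeze_dim` are used.

```python
import math


def rotate_half(x):
    # Split the vector into two halves and map (x1, x2) -> (-x2, x1).
    half = len(x) // 2
    return [-v for v in x[half:]] + list(x[:half])


def apply_rope(x, position, base=10000.0):
    # Hypothetical single-vector helper, not the library function.
    dim = len(x)
    inv_freq = [base ** (-2 * i / dim) for i in range(dim // 2)]
    # Both halves of the vector share the same set of angles.
    angles = [position * f for f in inv_freq] * 2
    cos = [math.cos(a) for a in angles]
    sin = [math.sin(a) for a in angles]
    rotated = rotate_half(x)
    return [xi * c + ri * s for xi, c, ri, s in zip(x, cos, rotated, sin)]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


q = [1.0, 0.0, 0.0, 1.0]
k = [0.5, 2.0, -1.0, 0.0]

# The query/key score depends only on the relative offset (2 in both cases).
score_a = dot(apply_rope(q, 3), apply_rope(k, 1))
score_b = dot(apply_rope(q, 5), apply_rope(k, 3))
```

The point of the rotation is that attention scores between rotated queries and keys depend on the distance between positions, not on their absolute values, which is why `score_a` and `score_b` agree up to floating-point error.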
- easydel.modules.deepseek_v2.modeling_deepseek_flax.init_deepseek_rotary_embedding(dim, max_position_embeddings=2048, base=10000, method: Literal['linear', 'yarn', 'dynamic', None] = None, kwargs: Optional[dict] = None)[source]#
- easydel.modules.deepseek_v2.modeling_deepseek_flax.yarn_find_correction_dim(num_rotations, dim, base=10000, max_position_embeddings=2048)[source]#
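The signature of `yarn_find_correction_dim` matches the helper used in YaRN-style RoPE scaling. Assuming it follows the usual YaRN reference formula (the page itself does not say), it returns the fractional rotary pair index at which a frequency component completes `num_rotations` full turns over `max_position_embeddings` positions. The function below is a sketch under that assumption, with a `_sketch` suffix to mark it as not the library implementation.

```python
import math


def yarn_find_correction_dim_sketch(num_rotations, dim, base=10000, max_position_embeddings=2048):
    # Pair index i has frequency base ** (-2i / dim). Solving
    #   max_position_embeddings * base ** (-2i / dim) / (2 * pi) = num_rotations
    # for i gives the expression below.
    return (dim * math.log(max_position_embeddings / (num_rotations * 2 * math.pi))) / (
        2 * math.log(base)
    )


dim = 64
index = yarn_find_correction_dim_sketch(num_rotations=1, dim=dim)

# Check: plugging the index back in recovers the requested number of rotations.
rotations = 2048 * 10000 ** (-2 * index / dim) / (2 * math.pi)
```

Components that rotate more often over the context sit at lower indices, so a larger `num_rotations` yields a smaller result.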