easydel.modules.xerxes.modeling_xerxes
- class easydel.modules.xerxes.modeling_xerxes.Identity(*args: Any, **kwargs: Any)
Bases: Module
No-op module used as a placeholder when optional layers are disabled.
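The placeholder pattern described for Identity can be sketched in plain Python. This is a minimal illustration, not EasyDeL's implementation: the class body and the `build_optional_layer` helper are hypothetical and only show why a no-op stand-in keeps the calling code free of branches.

```python
class Identity:
    """Hypothetical stand-in: accepts any constructor arguments and returns its input unchanged."""

    def __init__(self, *args, **kwargs):
        # Arguments are accepted and ignored so the class can replace any layer constructor.
        pass

    def __call__(self, x, *args, **kwargs):
        return x


def build_optional_layer(enabled, make_layer):
    """Hypothetical helper: build the real layer if enabled, otherwise a no-op placeholder."""
    return make_layer() if enabled else Identity()
```

With this arrangement the forward pass can always call the layer, whether or not the optional feature is switched on.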
- class easydel.modules.xerxes.modeling_xerxes.PostCross(*args: Any, **kwargs: Any)
Bases: Module
Applies a bounded tanh transform after cross attention.
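The exact form of the transform is not stated on this page. As a rough sketch of what a bounded tanh transform does, the NumPy function below squashes values into the open interval (-cap, cap). The function name and the `cap` parameter are assumptions made for illustration and are not taken from PostCross.

```python
import numpy as np


def bounded_tanh(x, cap=1.0):
    """Squash x smoothly into (-cap, cap); `cap` is a hypothetical parameter."""
    x = np.asarray(x, dtype=np.float64)
    # Near zero the output is approximately x; large magnitudes saturate toward +/- cap.
    return cap * np.tanh(x / cap)
```

Small activations pass through almost unchanged, while outliers are limited in magnitude, which is the usual motivation for applying such a transform to an attention output.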
- class easydel.modules.xerxes.modeling_xerxes.XerxesAttention(*args: Any, **kwargs: Any)
Bases: UnifiedAttention
Xerxes Attention with conditional Q/K normalization.
Inherits Q/K normalization from QKNormAttention.
Features:
  - Conditional Q/K normalization via the xe_kvnorm flag
  - Layer-specific sliding window (different patterns based on layer_idx or window_pattern)
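The two features listed for XerxesAttention can be illustrated with a framework-agnostic NumPy sketch. Several things here are assumptions: the normalization is taken to be an unscaled RMS norm over the last axis, the window is taken to be causal, and all function names are hypothetical. How a layer chooses its window from layer_idx or window_pattern is not shown, since this page does not specify it.

```python
import numpy as np


def rms_norm(x, eps=1e-6):
    """Unscaled RMS normalization over the last axis (assumed form, no learned weight)."""
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)


def maybe_norm_qk(q, k, xe_kvnorm):
    """Normalize queries and keys only when the flag is set; otherwise return them untouched."""
    if xe_kvnorm:
        return rms_norm(q), rms_norm(k)
    return q, k


def sliding_window_causal_mask(seq_len, window):
    """Boolean [seq_len, seq_len] mask: query i may attend to key j when 0 <= i - j < window."""
    pos = np.arange(seq_len)
    diff = pos[:, None] - pos[None, :]
    return (diff >= 0) & (diff < window)
```

For example, `sliding_window_causal_mask(4, 2)` lets each position attend to itself and the one position before it.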
- class easydel.modules.xerxes.modeling_xerxes.XerxesDecoderLayer(*args: Any, **kwargs: Any)
Bases: Module
Transformer decoder block with optional cross-attention and MoE.
- class easydel.modules.xerxes.modeling_xerxes.XerxesForCausalLM(*args: Any, **kwargs: Any)
Bases: EasyDeLBaseModule
Xerxes language model with LM head for causal generation.
- class easydel.modules.xerxes.modeling_xerxes.XerxesMLP(*args: Any, **kwargs: Any)
Bases: Module
Feed-forward network for Xerxes decoder blocks.
- class easydel.modules.xerxes.modeling_xerxes.XerxesModel(*args: Any, **kwargs: Any)
Bases: EasyDeLBaseModule
Xerxes decoder stack wiring embeddings, decoder layers, and final norm.
- property default_frequencies