easydel.modules.glm4_moe.modeling_glm4_moe

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeAttention(*args: Any, **kwargs: Any)

Bases: UnifiedAttention

Attention layer variant used inside GLM-4-MoE decoder blocks.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeDecoderLayer(*args: Any, **kwargs: Any)

Bases: Module

Single decoder block for GLM-4-MoE with attention and MoE MLP.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeForCausalLM(*args: Any, **kwargs: Any)

Bases: BaseCausalLMModule[Glm4MoeModel, Glm4MoeConfig]

GLM-4-MoE model with a language modeling head for causal language modeling.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeForSequenceClassification(*args: Any, **kwargs: Any)

Bases: BaseSequenceClassificationModule[Glm4MoeModel, Glm4MoeConfig]

GLM-4-MoE model for sequence classification tasks.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeMLP(*args: Any, **kwargs: Any)

Bases: Module

Dense feed-forward block used in non-MoE GLM-4-MoE layers.
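
The sketch below shows one common shape for a dense feed-forward block. It is an illustration, not the EasyDeL implementation: it assumes the gate/up/down projection layout suggested by the kernel names listed under Glm4MoeMLPStack.reform_param, and it assumes SiLU as the activation (in practice the activation would come from the model config). The function name gated_mlp and all sizes are hypothetical.

```python
import jax
import jax.numpy as jnp


def gated_mlp(x, gate_kernel, up_kernel, down_kernel):
    """Gated feed-forward: down(silu(x @ gate) * (x @ up)).

    Hypothetical helper; SiLU is an assumed activation.
    """
    gate = jax.nn.silu(x @ gate_kernel)  # (tokens, intermediate)
    up = x @ up_kernel                   # (tokens, intermediate)
    return (gate * up) @ down_kernel     # (tokens, hidden)


key = jax.random.PRNGKey(0)
k_x, k_g, k_u, k_d = jax.random.split(key, 4)
hidden, intermediate = 8, 16  # illustrative sizes only

x = jax.random.normal(k_x, (4, hidden))
gate_kernel = jax.random.normal(k_g, (hidden, intermediate))
up_kernel = jax.random.normal(k_u, (hidden, intermediate))
down_kernel = jax.random.normal(k_d, (intermediate, hidden))

mlp_out = gated_mlp(x, gate_kernel, up_kernel, down_kernel)
```

The output keeps the input's (tokens, hidden) shape, so the block can sit on a residual path inside a decoder layer.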

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeMLPStack(*args: Any, **kwargs: Any)

Bases: Module

Stacked expert MLP for GLM-4-MoE, built on ParallelMoELinear layers.

reform_param: ClassVar = {
    'down_proj$': {
        'inverse_spliter': <function Glm4MoeMLPStack.<lambda>>,
        'splits': [
            {'name': 'down_proj.kernel', 'spliter': <function Glm4MoeMLPStack.<lambda>>},
        ],
    },
    'gate_up_proj$': {
        'inverse_spliter': <function Glm4MoeMLPStack.<lambda>>,
        'splits': [
            {'name': 'gate_proj.kernel', 'spliter': <function Glm4MoeMLPStack.<lambda>>},
            {'name': 'up_proj.kernel', 'spliter': <function Glm4MoeMLPStack.<lambda>>},
        ],
    },
}
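
The reform_param mapping pairs each fused parameter name (for example gate_up_proj) with spliter functions that produce the per-projection kernels and an inverse_spliter that reverses the operation. The lambdas themselves are not shown on this page, so the following is only a sketch of what such a pair could look like. The split axis, the half-and-half layout, and the helper names split_gate_up and fuse_gate_up are all assumptions.

```python
import jax.numpy as jnp


def split_gate_up(fused_kernel):
    """Split a fused kernel into gate and up halves along the last axis.

    Assumes the fused layout is [gate | up]; the real layout may differ.
    """
    gate_kernel, up_kernel = jnp.split(fused_kernel, 2, axis=-1)
    return gate_kernel, up_kernel


def fuse_gate_up(gate_kernel, up_kernel):
    """Inverse of split_gate_up: concatenate the halves back together."""
    return jnp.concatenate([gate_kernel, up_kernel], axis=-1)


num_experts, hidden, intermediate = 4, 8, 16  # illustrative sizes only
fused = jnp.arange(
    num_experts * hidden * 2 * intermediate, dtype=jnp.float32
).reshape(num_experts, hidden, 2 * intermediate)

gate_kernel, up_kernel = split_gate_up(fused)
restored = fuse_gate_up(gate_kernel, up_kernel)
```

The property worth checking for any such pair is the round trip: applying the inverse to the split pieces should reproduce the fused tensor exactly.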

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeMoE(*args: Any, **kwargs: Any)

Bases: BaseMoeModule

GLM-4-MoE feed-forward wrapper combining router and expert stacks.
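
To make "combining router and expert stacks" concrete, here is a dense reference sketch of the combination step. It is not how EasyDeL dispatches tokens: it runs every expert on every token and then weights the results, which is wasteful but easy to read. The function combine_experts, the SiLU-gated expert form, and all shapes are assumptions for illustration.

```python
import jax
import jax.numpy as jnp


def combine_experts(x, expert_idx, expert_weights, gate_k, up_k, down_k):
    """Weight per-expert outputs by the router's choices (dense reference).

    x:              (tokens, hidden)
    expert_idx:     (tokens, top_k) integer expert ids from the router
    expert_weights: (tokens, top_k) routing weights for those experts
    gate_k, up_k:   (experts, hidden, intermediate)
    down_k:         (experts, intermediate, hidden)
    """
    num_experts = gate_k.shape[0]
    # Scatter the top-k weights into a dense (tokens, experts) matrix;
    # experts that were not selected get weight zero.
    one_hot = jax.nn.one_hot(expert_idx, num_experts)
    dense_weights = (one_hot * expert_weights[..., None]).sum(axis=1)
    # Run every expert on every token.
    gate = jax.nn.silu(jnp.einsum("th,ehi->eti", x, gate_k))
    up = jnp.einsum("th,ehi->eti", x, up_k)
    expert_out = jnp.einsum("eti,eih->eth", gate * up, down_k)
    # Weighted sum over experts for each token.
    return jnp.einsum("te,eth->th", dense_weights, expert_out)


key = jax.random.PRNGKey(0)
k_x, k_g, k_u, k_d = jax.random.split(key, 4)
num_experts, hidden, intermediate = 4, 8, 16  # illustrative sizes only

x = jax.random.normal(k_x, (3, hidden))
gate_kernels = 0.1 * jax.random.normal(k_g, (num_experts, hidden, intermediate))
up_kernels = 0.1 * jax.random.normal(k_u, (num_experts, hidden, intermediate))
down_kernels = 0.1 * jax.random.normal(k_d, (num_experts, intermediate, hidden))

# Hand-written router output: two experts per token, weights summing to 1.
expert_idx = jnp.array([[0, 1], [2, 3], [1, 3]])
expert_weights = jnp.array([[0.7, 0.3], [0.5, 0.5], [0.9, 0.1]])

moe_out = combine_experts(
    x, expert_idx, expert_weights, gate_kernels, up_kernels, down_kernels
)
```

A production MoE layer would instead group tokens by expert and compute only the selected experts; the dense form is useful mainly as a correctness reference.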

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeModel(*args: Any, **kwargs: Any)

Bases: EasyDeLBaseModule

Base GLM-4-MoE model, wrapped by the task-specific classes Glm4MoeForCausalLM and Glm4MoeForSequenceClassification.

get_decoder()

Return the decoder component of the model.

This method should be overridden by encoder-decoder models to return their decoder component. Useful for tasks that need access to the decoder separately from the encoder.

Returns: The decoder module.
Return type: nn.Module | EasyDeLBaseModule
Raises: NotImplementedError – If the model does not implement a decoder.

get_embedding()

Return the input embedding layer of the model.

This method should be overridden by models to return their token embedding layer. Useful for weight tying or accessing embeddings directly.

Returns: The embedding layer.
Return type: nn.Module | nn.Embed
Raises: NotImplementedError – If the model does not have an embedding layer.

get_encoder()

Return the encoder component of the model.

This method should be overridden by encoder-decoder models to return their encoder component. Useful for tasks that only need the encoder, such as feature extraction or embedding generation.

Returns: The encoder module.
Return type: nn.Module | EasyDeLBaseModule
Raises: NotImplementedError – If the model does not implement an encoder.

get_lm_head()

Return the language model head of the model.

This method should be overridden by language models to return their output projection layer that maps hidden states to vocabulary logits.

Returns: The language model head layer.
Return type: ParallelLinear
Raises: NotImplementedError – If the model does not have a language model head.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeTopKRouter(*args: Any, **kwargs: Any)

Bases: Module

Selects top-k experts per token for GLM-4-MoE routing.

get_selected_experts(scores)
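
The page does not document what get_selected_experts does with scores, so the following is only a generic sketch of top-k expert selection, not the method's actual behavior. It assumes scores is a non-negative (tokens, experts) array and renormalizes the selected scores so they sum to 1 per token; the helper name select_top_k is hypothetical.

```python
import jax
import jax.numpy as jnp


def select_top_k(scores, top_k):
    """Pick the top_k highest-scoring experts per token.

    Returns (expert_idx, weights), both shaped (tokens, top_k).
    Renormalizing the kept scores is an assumption of this sketch.
    """
    top_scores, expert_idx = jax.lax.top_k(scores, top_k)
    weights = top_scores / top_scores.sum(axis=-1, keepdims=True)
    return expert_idx, weights


# Two tokens, four experts; scores chosen without ties.
scores = jnp.array([
    [0.1, 0.6, 0.3, 0.0],
    [0.5, 0.1, 0.1, 0.3],
])
expert_idx, expert_weights = select_top_k(scores, top_k=2)
```

For the first token the two highest scores belong to experts 1 and 2, and for the second token to experts 0 and 3; each row of expert_weights sums to 1 after renormalization.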
