easydel.modules.glm4_moe.modeling_glm4_moe

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeAttention(*args: Any, **kwargs: Any)

Bases: UnifiedAttention

Attention layer variant used inside GLM-4-MoE decoder blocks.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeDecoderLayer(*args: Any, **kwargs: Any)

Bases: Module

Single decoder block for GLM-4-MoE with attention and MoE MLP.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeForCausalLM(*args: Any, **kwargs: Any)

Bases: BaseCausalLMModule[Glm4MoeModel, Glm4MoeConfig]

GLM-4-MoE model with a language modeling head for causal language modeling.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeForSequenceClassification(*args: Any, **kwargs: Any)

Bases: BaseSequenceClassificationModule[Glm4MoeModel, Glm4MoeConfig]

GLM-4-MoE model for sequence classification tasks.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeMLP(*args: Any, **kwargs: Any)

Bases: Module

Dense feed-forward block used in non-MoE GLM-4-MoE layers.

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeMLPStack(*args: Any, **kwargs: Any)

Bases: Module

GLM-4-MoE expert MLP stack built on ParallelMoELinear layers.

reform_param: ClassVar = {
    'down_proj$': {
        'inverse_spliter': <function Glm4MoeMLPStack.<lambda>>,
        'splits': [
            {'name': 'down_proj.kernel', 'spliter': <function Glm4MoeMLPStack.<lambda>>}
        ]
    },
    'gate_up_proj$': {
        'inverse_spliter': <function Glm4MoeMLPStack.<lambda>>,
        'splits': [
            {'name': 'gate_proj.kernel', 'spliter': <function Glm4MoeMLPStack.<lambda>>},
            {'name': 'up_proj.kernel', 'spliter': <function Glm4MoeMLPStack.<lambda>>}
        ]
    }
}
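
The reform_param mapping pairs each fused parameter (for example gate_up_proj) with named splits (gate_proj.kernel, up_proj.kernel) and an inverse. The lambdas themselves are not shown in this page, so the following is only a minimal NumPy sketch of what such a spliter and inverse_spliter pair could look like. The split axis, the gate-then-up ordering, and the helper names split_gate_up and merge_gate_up are assumptions for illustration, not EasyDeL's implementation.

```python
import numpy as np


def split_gate_up(fused):
    """Split a fused kernel of shape (..., 2 * d_ff) into gate and up halves.

    Assumption: the gate half comes first along the last axis.
    """
    gate, up = np.split(fused, 2, axis=-1)
    return gate, up


def merge_gate_up(gate, up):
    """Inverse of split_gate_up: concatenate the halves along the last axis."""
    return np.concatenate([gate, up], axis=-1)


# Toy fused tensor: 2 experts, hidden size 3, intermediate size 4,
# so the fused last axis has length 8.
fused = np.arange(2 * 3 * 8, dtype=np.float32).reshape(2, 3, 8)
gate, up = split_gate_up(fused)
restored = merge_gate_up(gate, up)
```

A round trip through the split and its inverse should reproduce the fused tensor exactly, which is the property a spliter/inverse_spliter pair needs for checkpoint conversion in both directions.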

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeMoE(*args: Any, **kwargs: Any)

Bases: BaseMoeModule

GLM-4-MoE feed-forward wrapper combining router and expert stacks.
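
To make "combining router and expert stacks" concrete, here is a minimal NumPy sketch of the combine step: each token's output is the weighted sum of the outputs of its selected experts. The function combine_experts, the single-matrix experts, and the precomputed routing weights are hypothetical simplifications. The actual module may use gated MLP experts, shared experts, and fused kernels.

```python
import numpy as np


def combine_experts(x, expert_kernels, top_idx, top_weights):
    """Weighted sum of per-token expert outputs.

    x:              (tokens, d) input activations
    expert_kernels: (n_experts, d, d) one linear map per expert (toy stand-in)
    top_idx:        (tokens, k) expert indices chosen by the router
    top_weights:    (tokens, k) routing weight for each chosen expert
    """
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for j in range(top_idx.shape[1]):
            e = top_idx[t, j]
            out[t] += top_weights[t, j] * (x[t] @ expert_kernels[e])
    return out


x = np.array([[1.0, 2.0], [3.0, 4.0]])
# Expert e scales its input by (e + 1).
expert_kernels = np.stack([np.eye(2), 2.0 * np.eye(2), 3.0 * np.eye(2)])
top_idx = np.array([[0, 1], [2, 0]])
top_weights = np.array([[0.5, 0.5], [0.25, 0.75]])
y = combine_experts(x, expert_kernels, top_idx, top_weights)
```

The explicit loops are for readability only. A production implementation would typically group tokens by expert and use batched matrix multiplications instead.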

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeModel(*args: Any, **kwargs: Any)

Bases: EasyDeLBaseModule

GLM-4-MoE model implementation.

get_decoder()

Return the decoder component of the model.

This method should be overridden by encoder-decoder models to return their decoder component. Useful for tasks that need access to the decoder separately from the encoder.

Returns

The decoder module.

Return type

nn.Module | EasyDeLBaseModule

Raises

NotImplementedError – If the model does not implement a decoder.

get_embedding()

Return the input embedding layer of the model.

This method should be overridden by models to return their token embedding layer. Useful for weight tying or accessing embeddings directly.

Returns

The embedding layer.

Return type

nn.Module | nn.Embed

Raises

NotImplementedError – If the model does not have an embedding layer.

get_encoder()

Return the encoder component of the model.

This method should be overridden by encoder-decoder models to return their encoder component. Useful for tasks that only need the encoder, such as feature extraction or embedding generation.

Returns

The encoder module.

Return type

nn.Module | EasyDeLBaseModule

Raises

NotImplementedError – If the model does not implement an encoder.

get_lm_head()

Return the language model head of the model.

This method should be overridden by language models to return their output projection layer that maps hidden states to vocabulary logits.

Returns

The language model head layer.

Return type

ParallelLinear

Raises

NotImplementedError – If the model does not have a language model head.
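
The four accessors above share one contract: a base class raises NotImplementedError, and a concrete model overrides only the accessors it supports. The sketch below shows that pattern in plain Python. The classes SketchBaseModel and SketchDecoderOnlyModel are hypothetical and are not EasyDeL classes. Which accessors Glm4MoeModel itself overrides is not stated on this page.

```python
class SketchBaseModel:
    """Hypothetical base class: every accessor raises until overridden."""

    def get_encoder(self):
        raise NotImplementedError("This model does not implement an encoder.")

    def get_embedding(self):
        raise NotImplementedError("This model does not have an embedding layer.")


class SketchDecoderOnlyModel(SketchBaseModel):
    """Hypothetical decoder-only model that exposes only its embedding."""

    def __init__(self):
        # Stand-in for a real embedding layer.
        self.embed_tokens = {"name": "embed_tokens"}

    def get_embedding(self):
        return self.embed_tokens


model = SketchDecoderOnlyModel()
embedding = model.get_embedding()

try:
    model.get_encoder()
    encoder_available = True
except NotImplementedError:
    encoder_available = False
```

Callers that work across architectures can use the same try/except shape to probe for an optional component before using it.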

class easydel.modules.glm4_moe.modeling_glm4_moe.Glm4MoeTopKRouter(*args: Any, **kwargs: Any)

Bases: Module

Selects top-k experts per token for GLM-4-MoE routing.

get_selected_experts(scores)
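
The method above carries no description here, so as a rough illustration of per-token top-k selection, the following NumPy sketch picks the k highest-scoring experts in each row of a score matrix. The function select_top_k and its k parameter are hypothetical. The real get_selected_experts takes only scores and may apply additional logic (for example expert grouping or bias correction) that this sketch does not model.

```python
import numpy as np


def select_top_k(scores, k):
    """Return indices of the k largest scores per token, highest first.

    scores: (tokens, n_experts) router scores
    """
    # Sorting the negated scores yields descending order; a stable sort
    # keeps the lower expert index first when scores tie.
    order = np.argsort(-scores, axis=-1, kind="stable")
    return order[:, :k]


scores = np.array(
    [
        [0.1, 0.7, 0.2, 0.0],
        [0.9, 0.05, 0.03, 0.02],
    ]
)
selected = select_top_k(scores, k=2)
```

For these scores the first token is routed to experts 1 and 2, and the second token to experts 0 and 1.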