easydel.modules.mistral3.mistral3_configuration

class easydel.modules.mistral3.mistral3_configuration.Mistral3Config(vision_config=None, text_config=None, image_token_index=10, projector_hidden_act='gelu', vision_feature_layer=-1, multimodal_projector_bias=False, spatial_merge_size=2, **kwargs)

Bases: EasyDeLBaseConfig

Configuration objects inherit from EasyDeLBaseConfig and can be used to control the model outputs. See the EasyDeLBaseConfig documentation for more information.

Parameters
  • vision_config (Union[AutoConfig, dict], optional, defaults to PixtralVisionConfig) – The config object or dictionary of the vision backbone.

  • text_config (Union[AutoConfig, dict], optional, defaults to MistralConfig) – The config object or dictionary of the text backbone.

  • image_token_index (int, optional, defaults to 10) – The image token index to encode the image prompt.

  • projector_hidden_act (str, optional, defaults to "gelu") – The activation function used by the multimodal projector.

  • vision_feature_layer (Union[int, list[int]], optional, defaults to -1) – The index of the vision backbone layer whose output is used as the vision feature. If multiple indices are provided, the features from the corresponding layers are concatenated to form the vision features.

  • multimodal_projector_bias (bool, optional, defaults to False) – Whether to use bias in the multimodal projector.

  • spatial_merge_size (int, optional, defaults to 2) – The downsampling factor for the spatial merge operation.
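The effect of vision_feature_layer and spatial_merge_size can be illustrated with a small, library-free sketch. It does not call EasyDeL; the helper names select_vision_features and merged_token_count are hypothetical, layer outputs are modeled as nested Python lists, and the assumption that the spatial merge collapses each spatial_merge_size x spatial_merge_size block of patches into one token is an interpretation of "downsampling factor", not something this page states.

```python
from typing import List, Sequence, Union

# One layer's output: a list of per-patch feature vectors (hypothetical stand-in
# for the real array type used by the vision backbone).
LayerOutput = List[List[float]]


def select_vision_features(
    hidden_states: Sequence[LayerOutput],
    vision_feature_layer: Union[int, Sequence[int]] = -1,
) -> LayerOutput:
    """Pick one layer, or concatenate several layers along the feature axis."""
    if isinstance(vision_feature_layer, int):
        # Single index: return a copy of that layer's per-patch features.
        return [list(vec) for vec in hidden_states[vision_feature_layer]]
    selected = [hidden_states[i] for i in vision_feature_layer]
    merged: LayerOutput = []
    # zip(*selected) walks the patches in lockstep across the chosen layers.
    for per_patch in zip(*selected):
        features: List[float] = []
        for vec in per_patch:
            features.extend(vec)
        merged.append(features)
    return merged


def merged_token_count(patch_rows: int, patch_cols: int, spatial_merge_size: int = 2) -> int:
    """Token count after merging, ASSUMING s x s patch blocks become one token."""
    s = spatial_merge_size
    if patch_rows % s or patch_cols % s:
        # Simplification made by this sketch only.
        raise ValueError("patch grid must be divisible by spatial_merge_size")
    return (patch_rows // s) * (patch_cols // s)


# Three layers, two patches, feature size 2.
hidden_states = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
    [[9.0, 10.0], [11.0, 12.0]],
]
last_layer = select_vision_features(hidden_states, -1)
first_and_last = select_vision_features(hidden_states, [0, -1])
tokens = merged_token_count(4, 6, spatial_merge_size=2)
```

With a list of indices, the feature size grows with the number of selected layers, so a projector consuming these features would need a matching input width.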

attribute_map: ClassVar = {'image_token_id': 'image_token_index'}
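The attribute_map entry suggests that image_token_id acts as an alias for image_token_index. The sketch below shows one way such aliasing can work, using a hypothetical stand-in class AliasedConfig; it is not the EasyDeL implementation, and the real base class may resolve aliases differently.

```python
class AliasedConfig:
    """Hypothetical stand-in showing attribute aliasing via an attribute_map."""

    attribute_map = {"image_token_id": "image_token_index"}

    def __init__(self, image_token_index: int = 10):
        self.image_token_index = image_token_index

    def __setattr__(self, key, value):
        # Redirect writes on an alias to the canonical attribute name.
        key = type(self).attribute_map.get(key, key)
        super().__setattr__(key, value)

    def __getattr__(self, key):
        # Called only when normal lookup fails, i.e. for alias names.
        mapped = type(self).attribute_map.get(key)
        if mapped is None:
            raise AttributeError(key)
        return getattr(self, mapped)


cfg = AliasedConfig()
default_alias_value = cfg.image_token_id  # reads image_token_index
cfg.image_token_id = 32                   # writes image_token_index
```

Under this scheme only the canonical name is ever stored on the instance, so the two names cannot drift apart.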
get_partition_rules(*args, **kwargs)

Get the partition rules for the model.

Returns
  The partition rules.

Return type
  tp.Tuple[tp.Tuple[str, PartitionSpec]]
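As a rough illustration of how a tuple of (str, PartitionSpec) pairs is typically consumed, the sketch below assumes each string is a regular expression tested against a parameter path and that the first match wins. This matching behavior is an assumption, not something stated on this page. The function resolve_partition_spec, the example patterns, and the axis names "fsdp" and "tp" are hypothetical, and plain tuples stand in for PartitionSpec so the snippet runs without JAX.

```python
import re
from typing import Optional, Sequence, Tuple

# Plain tuples stand in for jax.sharding.PartitionSpec in this sketch.
AxisSpec = Tuple[Optional[str], ...]
Rule = Tuple[str, AxisSpec]


def resolve_partition_spec(rules: Sequence[Rule], param_path: str) -> AxisSpec:
    """Return the spec of the first rule whose pattern matches the path."""
    for pattern, spec in rules:
        if re.search(pattern, param_path) is not None:
            return spec
    raise ValueError(f"no partition rule matches {param_path!r}")


# Hypothetical rules; the real ones come from get_partition_rules().
EXAMPLE_RULES: Tuple[Rule, ...] = (
    (r"embed_tokens/embedding", ("tp", "fsdp")),
    (r"multi_modal_projector/.*/kernel", ("fsdp", "tp")),
    (r".*", (None,)),  # catch-all: leave everything else replicated
)

embedding_spec = resolve_partition_spec(
    EXAMPLE_RULES, "language_model/embed_tokens/embedding"
)
fallback_spec = resolve_partition_spec(EXAMPLE_RULES, "vision_tower/ln_pre/scale")
```

Because the first match wins in this sketch, specific patterns are listed before the catch-all rule.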

is_composition = True
model_type: str = 'mistral3'
sub_configs: ClassVar = {'text_config': <class 'easydel.infra.base_config.EasyDeLBaseConfig'>, 'vision_config': <class 'easydel.infra.base_config.EasyDeLBaseConfig'>}