easydel.modules.aya_vision.aya_vision_configuration

class easydel.modules.aya_vision.aya_vision_configuration.AyaVisionConfig(vision_config=None, text_config=None, vision_feature_select_strategy='full', vision_feature_layer=-1, downsample_factor=2, adapter_layer_norm_eps=1e-06, image_token_index=255036, **kwargs)

Bases: EasyDeLBaseConfig

This is the configuration class to store the configuration of an [AyaVisionForConditionalGeneration] model. It is used to instantiate an AyaVision model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults yields a configuration similar to that of AyaVision, e.g. [CohereForAI/aya-vision-8b](https://huggingface.co/CohereForAI/aya-vision-8b).

Configuration objects inherit from [PretrainedConfig] and can be used to control the model outputs. Read the documentation from [PretrainedConfig] for more information.

Parameters
  • vision_config (Union[AutoConfig, dict], optional, defaults to CLIPVisionConfig) – The config object or dictionary of the vision backbone.

  • text_config (Union[AutoConfig, dict], optional, defaults to LlamaConfig) – The config object or dictionary of the text backbone.

  • vision_feature_select_strategy (str, optional, defaults to “full”) – The feature selection strategy used to select the vision feature from the vision backbone. Can be one of “default” or “full”. If “default”, the CLS token is removed from the vision features. If “full”, the full vision features are used.

  • vision_feature_layer (int, optional, defaults to -1) – The index of the vision backbone layer from which the vision feature is taken.

  • downsample_factor (int, optional, defaults to 2) – The downsample factor to apply to the vision features.

  • adapter_layer_norm_eps (float, optional, defaults to 1e-06) – The epsilon value used for layer normalization in the adapter.

  • image_token_index (int, optional, defaults to 255036) – The token index that marks image positions in the encoded prompt.
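The interaction between vision_feature_layer and vision_feature_select_strategy can be illustrated with a minimal pure-Python sketch. It is not EasyDeL's implementation: the helper name select_vision_features and the toy data are hypothetical, and the sketch assumes the CLS token is the first token of each layer's output.

```python
def select_vision_features(hidden_states, layer=-1, strategy="full"):
    """Pick one layer's token features and optionally drop the CLS token.

    hidden_states: list of per-layer outputs, each a list of token vectors.
    Assumption (not confirmed by the source): the CLS token comes first.
    """
    if strategy not in ("default", "full"):
        raise ValueError(f"Unexpected select strategy: {strategy!r}")
    # Negative indices count from the last layer, as with Python lists.
    features = hidden_states[layer]
    if strategy == "default":
        return features[1:]  # drop the assumed leading CLS token
    return features  # "full": keep every token


# Toy input: 2 layers, each with a CLS token followed by 4 patch tokens.
toy_hidden_states = [
    [[0.0], [0.1], [0.2], [0.3], [0.4]],
    [[1.0], [1.1], [1.2], [1.3], [1.4]],
]

full = select_vision_features(toy_hidden_states, layer=-1, strategy="full")
default = select_vision_features(toy_hidden_states, layer=-1, strategy="default")
print(len(full), len(default))  # prints "5 4"
```

With the documented defaults (layer -1, strategy "full"), every token of the last layer is kept; "default" returns one token fewer.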

get_partition_rules(*args, **kwargs)

Retrieves the combined partition rules from the text and vision configurations.

Parameters
  • *args – Positional arguments passed to the underlying config partition rule methods.

  • **kwargs – Keyword arguments passed to the underlying config partition rule methods.

Returns

Combined partition rules from both text and vision models.

Return type

Tuple
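As a rough sketch of what "combined" could mean here, the example below concatenates the rule tuples of two sub-configs. The stub classes, the (pattern, spec) rule format, and the text-before-vision ordering are all assumptions made for illustration, not EasyDeL's actual code.

```python
class StubSubConfig:
    """Hypothetical stand-in for a text or vision sub-config."""

    def __init__(self, rules):
        self._rules = tuple(rules)

    def get_partition_rules(self, *args, **kwargs):
        # A real sub-config may use the arguments; this stub ignores them.
        return self._rules


class StubCompositeConfig:
    """Hypothetical composite config holding a text and a vision sub-config."""

    def __init__(self, text_config, vision_config):
        self.text_config = text_config
        self.vision_config = vision_config

    def get_partition_rules(self, *args, **kwargs):
        # Forward the same arguments to both sub-configs, then concatenate.
        text_rules = self.text_config.get_partition_rules(*args, **kwargs)
        vision_rules = self.vision_config.get_partition_rules(*args, **kwargs)
        return tuple(text_rules) + tuple(vision_rules)  # ordering is assumed


# Rules shown as (parameter-name regex, placeholder sharding spec) pairs.
text = StubSubConfig([(r"embed_tokens/embedding", "spec_a"), (r"mlp/.*", "spec_b")])
vision = StubSubConfig([(r"patch_embedding/kernel", "spec_c")])

combined = StubCompositeConfig(text, vision).get_partition_rules()
print(len(combined))  # prints "3"
```

The result is a single flat tuple, which matches the documented Tuple return type.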

model_type: str = 'aya_vision'
sub_configs: dict[str, 'PretrainedConfig'] = {'text_config': <class 'easydel.modules.auto.auto_configuration.AutoEasyDeLConfig'>, 'vision_config': <class 'easydel.modules.auto.auto_configuration.AutoEasyDeLConfig'>}