easydel.modules.aya_vision.modeling_aya_vision_flax


class easydel.modules.aya_vision.modeling_aya_vision_flax.AyaVisionCausalLMOutputWithPast(loss: Optional[Union[Array, ndarray, bool, number]] = None, logits: Union[Array, ndarray, bool, number] = None, past_key_values: Optional[TransformerCache] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, image_hidden_states: Optional[Union[Array, ndarray, bool, number]] = None)[source]#

Bases: ModelOutput

Base class for AyaVision causal language model (or autoregressive) outputs.

Parameters

loss (chex.Array of shape (1,), optional, returned when labels is provided) – Language modeling loss (for next-token prediction).
logits (chex.Array of shape (batch_size, sequence_length, config.vocab_size)) – Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
past_key_values (tuple(tuple(chex.Array)), optional, returned when use_cache=True is passed or when config.use_cache=True) –
Tuple of tuple(chex.Array) of length config.n_layers, with each tuple having 2 tensors of shape (batch_size, num_heads, sequence_length, embed_size_per_head).

Contains pre-computed hidden-states (key and values in the self-attention blocks) that can be used (see past_key_values input) to speed up sequential decoding.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).

Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).

Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
image_hidden_states (chex.Array, optional) – A chex.Array of size (batch_size * num_patches, num_images, sequence_length, hidden_size). Image hidden states produced by the vision encoder after projecting its last hidden state.
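As a minimal sketch of how the logits field described above is typically consumed, the snippet below builds a random array with the documented shape (batch_size, sequence_length, config.vocab_size) and picks the next token greedily. The sizes and the random logits are illustrative assumptions; nothing here calls the EasyDeL model itself.

```python
import jax
import jax.numpy as jnp

# Illustrative sizes only; a real model takes these from its config.
batch_size, sequence_length, vocab_size = 2, 5, 11

# Stand-in for `output.logits`: scores before softmax, one row per position.
logits = jax.random.normal(
    jax.random.PRNGKey(0), (batch_size, sequence_length, vocab_size)
)

# Only the last position is needed to predict the next token.
last_logits = logits[:, -1, :]

# Softmax over the vocabulary axis turns scores into probabilities.
probs = jax.nn.softmax(last_logits, axis=-1)

# Greedy decoding: take the most likely token for each batch row.
next_token = jnp.argmax(probs, axis=-1)
```

Because softmax is monotonic, taking the argmax of last_logits directly gives the same token; the softmax step is shown only to make the "before SoftMax" wording concrete.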

attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None#

hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None#

image_hidden_states: Optional[Union[Array, ndarray, bool, number]] = None#

logits: Union[Array, ndarray, bool, number] = None#

loss: Optional[Union[Array, ndarray, bool, number]] = None#

past_key_values: Optional[TransformerCache] = None#

replace(**kwargs)#

class easydel.modules.aya_vision.modeling_aya_vision_flax.AyaVisionForConditionalGeneration(*args: Any, **kwargs: Any)[source]#

Bases: EasyDeLBaseModule

config: tp.Union[EasyDeLBaseConfig, _CP]#

dtype: jnp.dtype#

get_image_features(pixel_values: Union[Array, ndarray, bool, number]) → Union[Array, ndarray, bool, number][source]#

loss_type = 'ForCausalLM'#

param_dtype: jnp.dtype#

precision: lax.PrecisionLike#

prepare_inputs_for_generation(input_ids: Union[Array, ndarray, bool, number], max_length: int, pixel_values: Optional[Union[Array, ndarray, bool, number]] = None, attention_mask: Optional[Union[Array, ndarray, bool, number]] = None)[source]#

Prepares the model inputs for a generation task.

Parameters

input_ids – Input token IDs.
max_length – Length of the sequence to be generated.
pixel_values – Optional image pixel values.
attention_mask (tp.Optional[chex.Array]) – Mask applied to the attention weights.
token_type_ids (tp.Optional[chex.Array]) – Token type IDs (named in the docstring; not part of the signature shown above).

Returns

A dictionary containing past_key_values, attention_mask, and position IDs.
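The following is a rough sketch of how the attention_mask and position IDs in such a dictionary are commonly built in Flax-style generation code. It is an assumption about the general pattern, not EasyDeL's actual implementation: the function name sketch_prepare_inputs is hypothetical, and the past_key_values cache and pixel_values handling are left out because they are library-specific.

```python
import jax.numpy as jnp
from jax import lax


def sketch_prepare_inputs(input_ids, max_length, attention_mask=None):
    """Hypothetical helper: build a full-length mask and position IDs."""
    batch_size, seq_length = input_ids.shape

    # Start from an all-ones mask covering the whole generation window.
    extended_mask = jnp.ones((batch_size, max_length), dtype="i4")

    if attention_mask is not None:
        # Position IDs count only attended tokens, so padding is skipped.
        position_ids = attention_mask.cumsum(axis=-1) - 1
        # Copy the prompt's mask into the first `seq_length` slots.
        extended_mask = lax.dynamic_update_slice(
            extended_mask, attention_mask.astype("i4"), (0, 0)
        )
    else:
        # Without a mask, positions are simply 0..seq_length-1 per row.
        position_ids = jnp.broadcast_to(
            jnp.arange(seq_length, dtype="i4")[None, :],
            (batch_size, seq_length),
        )

    return {"attention_mask": extended_mask, "position_ids": position_ids}


# Usage: the first row is left-padded by one token.
input_ids = jnp.ones((2, 3), dtype="i4")
mask = jnp.array([[0, 1, 1], [1, 1, 1]], dtype="i4")
prepared = sketch_prepare_inputs(input_ids, max_length=6, attention_mask=mask)
```

Extending the mask to max_length up front keeps array shapes static across decoding steps, which is what lets a JIT-compiled generation loop avoid recompilation.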

rngs: nn.Rngs#

update_inputs_for_generation(model_outputs, model_kwargs)[source]#

class easydel.modules.aya_vision.modeling_aya_vision_flax.AyaVisionMultiModalProjector(*args: Any, **kwargs: Any)[source]#

Bases: Module

pixel_shuffle(image_features: Array) → Array[source]#
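The page gives no description of pixel_shuffle, so the sketch below shows one common form of the operation in vision-language projectors: neighbouring patches on a square grid are folded into the channel axis, shrinking the grid by a factor in each direction. The function name pixel_shuffle_sketch, the downsample_factor argument, its default of 2, and the (batch, patches, features) input layout are all assumptions for illustration, not the documented behaviour of this method.

```python
import jax.numpy as jnp


def pixel_shuffle_sketch(image_features, downsample_factor=2):
    """Hypothetical space-to-depth shuffle over a square patch grid."""
    batch_size, seq_length, feature_dim = image_features.shape
    # Assume the patches form a square grid (seq_length is a perfect square).
    height = width = int(seq_length ** 0.5)

    x = image_features.reshape(batch_size, width, height, feature_dim)
    # Fold adjacent patches along one grid axis into the channel axis.
    x = x.reshape(
        batch_size, width, height // downsample_factor,
        feature_dim * downsample_factor,
    )
    x = jnp.transpose(x, (0, 2, 1, 3))
    # Fold adjacent patches along the other grid axis as well.
    x = x.reshape(
        batch_size, height // downsample_factor,
        width // downsample_factor, -1,
    )
    x = jnp.transpose(x, (0, 2, 1, 3))
    return x


# Usage: a 4x4 grid of 8-dimensional patch features.
features = jnp.arange(1 * 16 * 8, dtype=jnp.float32).reshape(1, 16, 8)
shuffled = pixel_shuffle_sketch(features)
```

With a factor of 2, the 4x4 grid becomes 2x2 and the feature size grows from 8 to 32, so the total number of values is unchanged while the number of image tokens passed on drops by a factor of 4.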
