easydel.infra.modeling_outputs#
- class easydel.infra.modeling_outputs.AttentionLayerOutput(attention_output: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number], attention_weight: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, cache_view: Optional[Any] = None)[source]#
Bases: ModelOutput
- cache_view: Optional[Any] = None#
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
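The five helper methods listed above recur on every output class in this module. As a rough illustration of what their one-line descriptions imply, the sketch below reproduces the same interface on a plain frozen dataclass. `ToyLayerOutput` is a hypothetical stand-in written for this example; it is not the easydel implementation and does not register itself as a JAX PyTree.

```python
import dataclasses
import json
from typing import Any, Dict, Optional


@dataclasses.dataclass(frozen=True)
class ToyLayerOutput:
    """Hypothetical stand-in for an output container; NOT the easydel class."""

    attention_output: Any
    attention_weight: Optional[Any] = None
    cache_view: Optional[Any] = None

    def replace(self, **kwargs) -> "ToyLayerOutput":
        # Build a new instance; the original object is left untouched.
        return dataclasses.replace(self, **kwargs)

    def to_dict(self) -> Dict[str, Any]:
        # Shallow field-name -> value mapping.
        return {f.name: getattr(self, f.name) for f in dataclasses.fields(self)}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToyLayerOutput":
        return cls(**data)

    def to_json(self, **kwargs) -> str:
        # Extra keyword arguments are forwarded to json.dumps (e.g. indent=2).
        return json.dumps(self.to_dict(), **kwargs)

    @classmethod
    def from_json(cls, json_str: str) -> "ToyLayerOutput":
        return cls.from_dict(json.loads(json_str))


out = ToyLayerOutput(attention_output=[[0.1, 0.2]])
updated = out.replace(attention_weight=[[1.0]])
restored = ToyLayerOutput.from_json(updated.to_json())
```

The sketch uses nested lists so that the JSON round trip works without extra encoding; how the real classes serialize array fields is not covered by this page.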
- class easydel.infra.modeling_outputs.BaseModelOutput(last_hidden_state: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, past_key_values: Optional[Dict[str, Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for model’s outputs, with potential hidden states and attentions.
- Parameters
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
tp.Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
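To make the `attentions` shape concrete, here is a minimal NumPy sketch of one self-attention layer: the weights come out of a softmax with shape (batch_size, num_heads, sequence_length, sequence_length) and are then used to average the value vectors. The sizes and random inputs are assumptions chosen for illustration, and NumPy stands in for JAX to keep the example dependency-light.

```python
import numpy as np

batch_size, num_heads, seq_len, head_dim = 2, 4, 5, 8
rng = np.random.default_rng(0)

# Toy query/key/value tensors, one set per head.
q = rng.normal(size=(batch_size, num_heads, seq_len, head_dim))
k = rng.normal(size=(batch_size, num_heads, seq_len, head_dim))
v = rng.normal(size=(batch_size, num_heads, seq_len, head_dim))

# Scaled dot-product scores: (batch, heads, seq, seq).
scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(head_dim)

# Softmax over the last axis, shifted by the row maximum for stability.
scores = scores - scores.max(axis=-1, keepdims=True)
weights = np.exp(scores)
weights = weights / weights.sum(axis=-1, keepdims=True)

# Each output position is a weighted average of the value vectors.
context = weights @ v
```

Because every row of `weights` sums to one, each row of `context` is a convex combination of the corresponding value vectors.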
- class easydel.infra.modeling_outputs.BaseModelOutputWithNoAttention(last_hidden_state: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for model’s outputs, with potential hidden states.
- Parameters
last_hidden_state (chex.Array of shape (batch_size, num_channels, height, width)) – Sequence of hidden-states at the output of the last layer of the model.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – tp.Tuple of chex.Array (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each layer) of shape (batch_size, num_channels, height, width). Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.BaseModelOutputWithPast(last_hidden_state: Union[Array, ndarray, bool, number] = None, past_key_values: Optional[Dict[str, Union[Array, ndarray, bool, number]]] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for model’s outputs, with potential hidden states and attentions.
- Parameters
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.
past_key_values (tp.Dict[str, chex.Array]) – Dictionary of pre-computed hidden-states (key and values in the attention blocks) that can be used for fast auto-regressive decoding. Pre-computed key and value hidden-states are of shape [batch_size, max_length].
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
tp.Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
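The reason `past_key_values` speeds up auto-regressive decoding can be shown with a small single-head sketch: keys and values of earlier tokens are stored once and reused, and the result matches a full recomputation at every step. The dictionary layout, weight matrices, and sizes below are assumptions for illustration only and do not mirror easydel's cache format.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 6, 4
x = rng.normal(size=(seq_len, dim))  # toy token representations
w_q = rng.normal(size=(dim, dim))    # hypothetical projection weights
w_k = rng.normal(size=(dim, dim))
w_v = rng.normal(size=(dim, dim))


def attend(query, keys, values):
    """Attention output for a single query over all keys seen so far."""
    scores = keys @ query / np.sqrt(dim)
    scores = scores - scores.max()
    weights = np.exp(scores)
    weights = weights / weights.sum()
    return weights @ values


# Incremental decoding: project only the newest token and append to the cache.
cache = {"key": np.zeros((0, dim)), "value": np.zeros((0, dim))}
cached_outputs = []
for t in range(seq_len):
    cache["key"] = np.concatenate([cache["key"], x[t : t + 1] @ w_k])
    cache["value"] = np.concatenate([cache["value"], x[t : t + 1] @ w_v])
    cached_outputs.append(attend(x[t] @ w_q, cache["key"], cache["value"]))

# Reference: recompute every key and value from scratch at every step.
full_outputs = [
    attend(x[t] @ w_q, x[: t + 1] @ w_k, x[: t + 1] @ w_v) for t in range(seq_len)
]
```

The cached loop performs one key and one value projection per step, while the reference loop repeats all earlier projections each time.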
- class easydel.infra.modeling_outputs.BaseModelOutputWithPastAndCrossAttentions(last_hidden_state: Union[Array, ndarray, bool, number] = None, past_key_values: Optional[TransformerCache] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, cross_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for model’s outputs that may also contain a past key/values (to speed up sequential decoding).
- Parameters
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) –
Sequence of hidden-states at the output of the last layer of the model.
If past_key_values is used only the last hidden-state of the sequences of shape (batch_size, 1, hidden_size) is output.
past_key_values (tuple(tuple(chex.Array)), optional, returned when use_cache=True is passed or when config.use_cache=True) –
tp.Tuple of tuple(chex.Array) of length config.n_layers, with each tuple having 2 tensors of shape (batch_size, num_heads, sequence_length, embed_size_per_head) and, optionally if config.is_encoder_decoder=True, 2 additional tensors of shape (batch_size, num_heads, encoder_sequence_length, embed_size_per_head).
Contains pre-computed hidden-states (key and values in the self-attention blocks and optionally if config.is_encoder_decoder=True in the cross-attention blocks) that can be used (see past_key_values input) to speed up sequential decoding.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
tp.Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
cross_attentions (tuple(chex.Array), optional, returned when output_attentions=True and config.add_cross_attention=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder’s cross-attention layer, after the attention softmax, used to compute the weighted average in the cross-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.BaseModelOutputWithPooling(last_hidden_state: Union[Array, ndarray, bool, number] = None, pooler_output: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for model’s outputs that also contain a pooling of the last hidden states.
- Parameters
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.
pooler_output (chex.Array of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) further processed by a Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
tp.Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
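The `pooler_output` description (first token, then a linear layer, then tanh) can be sketched in a few lines of NumPy. The weight matrix `w`, bias `b`, and all sizes are hypothetical values chosen for illustration; a trained model would supply its own parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, seq_len, hidden_size = 3, 7, 16
last_hidden_state = rng.normal(size=(batch_size, seq_len, hidden_size))

# Hypothetical pooler parameters (randomly initialised here).
w = rng.normal(size=(hidden_size, hidden_size)) * 0.1
b = np.zeros(hidden_size)

# Take the hidden state of the first (classification) token of each sequence...
first_token = last_hidden_state[:, 0, :]

# ...and pass it through a linear layer followed by tanh.
pooler_output = np.tanh(first_token @ w + b)
```

The result has shape (batch_size, hidden_size), and the tanh keeps every entry inside [-1, 1].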
- class easydel.infra.modeling_outputs.BaseModelOutputWithPoolingAndCrossAttentions(last_hidden_state: Union[Array, ndarray, bool, number] = None, pooler_output: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, past_key_values: Optional[TransformerCache] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, cross_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for model’s outputs that also contain a pooling of the last hidden states.
- Parameters
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.
pooler_output (chex.Array of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) after further processing through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns the classification token after processing through a linear layer and a tanh activation function. The linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
tp.Tuple of chex.Array (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
cross_attentions (tuple(chex.Array), optional, returned when output_attentions=True and config.add_cross_attention=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder’s cross-attention layer, after the attention softmax, used to compute the weighted average in the cross-attention heads.
past_key_values (tuple(tuple(chex.Array)), optional, returned when use_cache=True is passed or when config.use_cache=True) –
tp.Tuple of tuple(chex.Array) of length config.n_layers, with each tuple having 2 tensors of shape (batch_size, num_heads, sequence_length, embed_size_per_head) and, optionally if config.is_encoder_decoder=True, 2 additional tensors of shape (batch_size, num_heads, encoder_sequence_length, embed_size_per_head).
Contains pre-computed hidden-states (key and values in the self-attention blocks and optionally if config.is_encoder_decoder=True in the cross-attention blocks) that can be used (see past_key_values input) to speed up sequential decoding.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.BaseModelOutputWithPoolingAndNoAttention(last_hidden_state: Union[Array, ndarray, bool, number] = None, pooler_output: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for model’s outputs that also contain a pooling of the last hidden states.
- Parameters
last_hidden_state (chex.Array of shape (batch_size, num_channels, height, width)) – Sequence of hidden-states at the output of the last layer of the model.
pooler_output (chex.Array of shape (batch_size, hidden_size)) – Last layer hidden-state after a pooling operation on the spatial dimensions.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – tp.Tuple of chex.Array (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each layer) of shape (batch_size, num_channels, height, width). Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
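For convolution-style models, "a pooling operation on the spatial dimensions" can be pictured as reducing the height and width axes of the final feature map. The sketch below uses a global mean, which is only one possible choice and is an assumption here; it also assumes that hidden_size equals the channel count of the last stage.

```python
import numpy as np

batch_size, num_channels, height, width = 2, 3, 4, 4

# Toy feature map with easily checked values: 0, 1, 2, ...
last_hidden_state = np.arange(
    batch_size * num_channels * height * width, dtype=np.float64
).reshape(batch_size, num_channels, height, width)

# Global average pooling over the spatial axes (height, width).
pooler_output = last_hidden_state.mean(axis=(2, 3))
```

The pooled tensor has one value per (sample, channel) pair, so its shape is (batch_size, num_channels).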
- class easydel.infra.modeling_outputs.BeamSearchOutput(sequences: Union[Array, ndarray, bool, number] = None, scores: Union[Array, ndarray, bool, number] = None)[source]#
Bases: ModelOutput
Flax Base class for outputs of decoder-only generation models using beam search.
- Parameters
sequences (chex.Array of shape (batch_size, max_length)) – The generated sequences.
scores (chex.Array of shape (batch_size,)) – The scores (log probabilities) of the generated sequences.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
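As a reminder of where `sequences` and `scores` come from, the following pure-Python sketch runs beam search on a toy model and accumulates log probabilities along each candidate. The `TRANSITIONS` table and the `beam_search` helper are hypothetical and written for this example; they are unrelated to easydel's generation code.

```python
import math
from typing import List, Tuple

# Hypothetical toy model: next-token probabilities depend only on the last token.
TRANSITIONS = {
    0: [0.1, 0.6, 0.3],
    1: [0.5, 0.1, 0.4],
    2: [0.3, 0.3, 0.4],
}


def beam_search(
    start: int, max_length: int, num_beams: int
) -> List[Tuple[List[int], float]]:
    """Return the num_beams best (tokens, summed log-probability) pairs."""
    beams = [([start], 0.0)]
    while len(beams[0][0]) < max_length:
        candidates = []
        for tokens, score in beams:
            for token, prob in enumerate(TRANSITIONS[tokens[-1]]):
                candidates.append((tokens + [token], score + math.log(prob)))
        # Keep only the highest-scoring candidates for the next step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams


beams = beam_search(start=0, max_length=4, num_beams=2)
sequences = [tokens for tokens, _ in beams]
scores = [score for _, score in beams]
best_sequence, best_score = beams[0]
```

In a batched setting, one best sequence and one score per batch element would line up with the (batch_size, max_length) and (batch_size,) shapes documented above.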
- class easydel.infra.modeling_outputs.CLIPOutput(loss: Union[Array, ndarray, bool, number] = None, logits_per_image: Union[Array, ndarray, bool, number] = None, logits_per_text: Union[Array, ndarray, bool, number] = None, text_embeds: Union[Array, ndarray, bool, number] = None, image_embeds: Union[Array, ndarray, bool, number] = None, text_model_output: BaseModelOutputWithPooling = None, vision_model_output: BaseModelOutputWithPooling = None)[source]#
Bases: ModelOutput
- Parameters
loss (chex.Array) – Training loss.
logits_per_image (chex.Array of shape (image_batch_size, text_batch_size)) – The scaled dot product scores between image_embeds and text_embeds. This represents the image-text similarity scores.
logits_per_text (chex.Array of shape (text_batch_size, image_batch_size)) – The scaled dot product scores between text_embeds and image_embeds. This represents the text-image similarity scores.
text_embeds (chex.Array of shape (batch_size, output_dim)) – The text embeddings obtained by applying the projection layer to the pooled output of [FlaxCLIPTextModel].
image_embeds (chex.Array of shape (batch_size, output_dim)) – The image embeddings obtained by applying the projection layer to the pooled output of [FlaxCLIPVisionModel].
text_model_output (BaseModelOutputWithPooling) – The output of the [FlaxCLIPTextModel].
vision_model_output (BaseModelOutputWithPooling) – The output of the [FlaxCLIPVisionModel].
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- text_model_output: BaseModelOutputWithPooling = None#
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- to_tuple() Tuple[Any][source]#
Convert self to a tuple containing all the attributes/keys that are not None.
- vision_model_output: BaseModelOutputWithPooling = None#
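The relationship between the two logit matrices is easiest to see in code: one is the scaled dot product of the embeddings, the other is its transpose. This is a minimal NumPy sketch; the L2 normalisation step and the `logit_scale` value are assumptions for illustration and are not stated on this page.

```python
import numpy as np

rng = np.random.default_rng(0)
image_batch_size, text_batch_size, output_dim = 3, 5, 8

image_embeds = rng.normal(size=(image_batch_size, output_dim))
text_embeds = rng.normal(size=(text_batch_size, output_dim))

# Assumption: embeddings are L2-normalised before the dot product.
image_embeds /= np.linalg.norm(image_embeds, axis=-1, keepdims=True)
text_embeds /= np.linalg.norm(text_embeds, axis=-1, keepdims=True)

logit_scale = 100.0  # hypothetical temperature factor

# (text_batch_size, image_batch_size): text-image similarity scores.
logits_per_text = logit_scale * text_embeds @ image_embeds.T

# (image_batch_size, text_batch_size): image-text similarity scores.
logits_per_image = logits_per_text.T
```

With unit-norm embeddings, every entry is a cosine similarity multiplied by `logit_scale`.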
- class easydel.infra.modeling_outputs.CLIPTextModelOutput(text_embeds: Union[Array, ndarray, bool, number] = None, last_hidden_state: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number], ...]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number], ...]] = None)[source]#
Bases: ModelOutput
Base class for text model’s outputs that also contain a pooling of the last hidden states.
- Parameters
text_embeds (chex.Array of shape (batch_size, output_dim)) – The text embeddings obtained by applying the projection layer to the pooled output of [FlaxCLIPTextModel].
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
tp.Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- easydel.infra.modeling_outputs.CausalLMOutput#
alias of MaskedLMOutput
- class easydel.infra.modeling_outputs.CausalLMOutputWithCrossAttentions(logits: Union[Array, ndarray, bool, number] = None, past_key_values: Optional[TransformerCache] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, cross_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for causal language model (or autoregressive) outputs.
- Parameters
logits (chex.Array of shape (batch_size, sequence_length, config.vocab_size)) – Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
tp.Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
cross_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Cross-attention weights after the attention softmax, used to compute the weighted average in the cross-attention heads.
past_key_values (tuple(tuple(chex.Array)), optional, returned when use_cache=True is passed or when config.use_cache=True) –
tp.Tuple of chex.Array tuples of length config.n_layers, with each tuple containing the cached key, value states of the self-attention and the cross-attention layers if model is used in encoder-decoder setting. Only relevant if config.is_decoder = True.
Contains pre-computed hidden-states (key and values in the attention blocks) that can be used (see past_key_values input) to speed up sequential decoding.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
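Since `logits` are scores before SoftMax, a caller typically normalises them or takes an argmax over the vocabulary axis. The sketch below shows both on random data; the sizes and the `log_softmax` helper are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, seq_len, vocab_size = 2, 4, 10
logits = rng.normal(size=(batch_size, seq_len, vocab_size))


def log_softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable log-softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))


# Per-position log probabilities over the vocabulary.
log_probs = log_softmax(logits)

# The prediction for the next token uses the scores at the last position.
next_token = logits[:, -1, :].argmax(axis=-1)
```

Softmax is monotonic, so taking the argmax of the raw logits selects the same token as taking it over the probabilities.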
- class easydel.infra.modeling_outputs.DecoderLayerOutput(hidden_states: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number], residual_states: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, cross_attention: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, attention_weight: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, router_logits: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, gate_loss: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, cache_view: Optional[Any] = None)[source]#
Bases: ModelOutput
- cache_view: Optional[Any] = None#
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.EncoderLayerOutput(hidden_states: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number], residual_states: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, attention_weight: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None)[source]#
Bases: ModelOutput
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.GreedySearchOutput(sequences: Union[Array, ndarray, bool, number] = None)[source]#
Bases: ModelOutput
Flax Base class for outputs of decoder-only generation models using greedy search.
- Parameters
sequences (chex.Array of shape (batch_size, max_length)) – The generated sequences.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
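Greedy search fills `sequences` by repeatedly appending the single most likely next token. The pure-Python sketch below does this for a small batch of prompts; the `TRANSITIONS` table and `greedy_search` helper are hypothetical toys, not easydel's generation loop.

```python
from typing import List

# Hypothetical toy model: next-token probabilities depend only on the last token.
TRANSITIONS = {
    0: [0.1, 0.6, 0.3],
    1: [0.5, 0.1, 0.4],
    2: [0.3, 0.3, 0.4],
}


def greedy_search(prompts: List[List[int]], max_length: int) -> List[List[int]]:
    """Extend every prompt to max_length by always taking the argmax token."""
    sequences = []
    for prompt in prompts:
        tokens = list(prompt)
        while len(tokens) < max_length:
            probs = TRANSITIONS[tokens[-1]]
            tokens.append(max(range(len(probs)), key=probs.__getitem__))
        sequences.append(tokens)
    return sequences


sequences = greedy_search([[0], [2]], max_length=5)
# sequences == [[0, 1, 0, 1, 0], [2, 2, 2, 2, 2]]
```

The nested list has one row per prompt and `max_length` entries per row, matching the documented (batch_size, max_length) layout.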
- class easydel.infra.modeling_outputs.ImageClassifierOutput(text_embeds: Union[Array, ndarray, bool, number] = None, last_hidden_state: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number], ...]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number], ...]] = None)[source]#
Bases: ModelOutput
Base class for text model’s outputs that also contain a pooling of the last hidden states.
- Parameters
text_embeds (chex.Array of shape (batch_size, output_dim)) – The text embeddings obtained by applying the projection layer to the pooled output of [FlaxCLIPTextModel].
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
tp.Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
tp.Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.ImageClassifierOutputWithNoAttention(logits: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for outputs of image classification models.
- Parameters
logits (chex.Array of shape (batch_size, config.num_labels)) – Classification (or regression if config.num_labels==1) scores (before SoftMax).
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – tp.Tuple of chex.Array (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each stage) of shape (batch_size, num_channels, height, width). Hidden-states (also called feature maps) of the model at the output of each stage.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
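The note that `logits` hold classification scores, or regression scores when config.num_labels==1, suggests two different ways of reading the same field. A minimal sketch of that branching follows; the `predict` helper is hypothetical and only illustrates the distinction.

```python
import numpy as np


def predict(logits: np.ndarray) -> np.ndarray:
    """Hypothetical helper: turn (batch_size, num_labels) scores into predictions."""
    if logits.shape[-1] == 1:
        # num_labels == 1: treat the single score as a regression value.
        return logits[:, 0]
    # Otherwise pick the highest-scoring class; softmax would not change the argmax.
    return logits.argmax(axis=-1)


class_logits = np.array([[0.2, 1.5, -0.3], [2.0, 0.1, 0.4]])
regression_logits = np.array([[0.7], [-1.2]])

class_predictions = predict(class_logits)
regression_predictions = predict(regression_logits)
```

Here `class_predictions` holds one class index per sample, while `regression_predictions` holds the raw scores.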
- class easydel.infra.modeling_outputs.MambaCausalLMOutput(last_hidden_state: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number] = None, hidden_states: Optional[Tuple[Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number]]] = None, attentions: Optional[Tuple[Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number]]] = None, past_key_values: Optional[Dict[str, Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number]]] = None, loss: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, logits: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number] = None, cache_params: Optional[List[Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number]]] = None)[source]#
Bases: BaseModelOutput
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.MambaOutput(last_hidden_state: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number] = None, hidden_states: Optional[Tuple[Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number]]] = None, attentions: Optional[Tuple[Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number]]] = None, past_key_values: Optional[Dict[str, Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number]]] = None, loss: Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number, NoneType] = None, cache_params: Optional[List[Union[jax.Array, numpy.ndarray, numpy.bool, numpy.number]]] = None)[source]#
Bases: BaseModelOutput
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
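The helper methods listed for each output class (replace, to_dict, to_json, from_dict, from_json) follow a familiar immutable-record pattern. As a rough illustration only, the sketch below reproduces that pattern on a hypothetical `ToyOutput` dataclass using the standard library. It is not EasyDeL's implementation; the class name, its fields, and the forwarding of keyword arguments to `json.dumps` are assumptions, and real outputs hold arrays that would need converting before JSON serialization.

```python
import dataclasses
import json
from typing import Any, Dict, List, Optional


@dataclasses.dataclass(frozen=True)
class ToyOutput:
    """Hypothetical stand-in for an output class; fields are plain lists."""

    logits: Optional[List[float]] = None
    loss: Optional[float] = None

    def replace(self, **kwargs) -> "ToyOutput":
        # Return a new instance; the original is left unchanged.
        return dataclasses.replace(self, **kwargs)

    def to_dict(self) -> Dict[str, Any]:
        return dataclasses.asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ToyOutput":
        return cls(**data)

    def to_json(self, **kwargs) -> str:
        # Assumption: extra keyword arguments are forwarded to json.dumps.
        return json.dumps(self.to_dict(), **kwargs)

    @classmethod
    def from_json(cls, json_str: str) -> "ToyOutput":
        return cls.from_dict(json.loads(json_str))


out = ToyOutput(logits=[0.1, 0.9])
with_loss = out.replace(loss=0.25)          # new object, `out` is untouched
restored = ToyOutput.from_json(with_loss.to_json())
```

The point of the pattern is that `replace` never mutates its receiver, which is what makes such objects safe to pass through traced or jitted functions.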
- class easydel.infra.modeling_outputs.MaskedLMOutput(logits: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, past_key_values: Optional[TransformerCache] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for masked language model outputs.
- Parameters
logits (chex.Array of shape (batch_size, sequence_length, config.vocab_size)) – Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
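Because the logits documented for MaskedLMOutput are scores before SoftMax, they usually need to be normalized before they can be read as probabilities. The sketch below shows one way to do that for a single masked position, using only the standard library. The helper names `softmax` and `top_k_tokens` and the toy values are hypothetical; a real pipeline would apply the equivalent array operation to the `(batch_size, sequence_length, config.vocab_size)` tensor.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one vector of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_tokens(position_logits: List[float], k: int) -> List[Tuple[int, float]]:
    """Return the k most probable (token_id, probability) pairs."""
    probs = softmax(position_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Toy logits for a single masked position over a 4-token vocabulary.
mask_position_logits = [2.0, 0.5, -1.0, 3.0]
predictions = top_k_tokens(mask_position_logits, k=2)
```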
- class easydel.infra.modeling_outputs.ModelOutput(*args, **kwargs)[source]#
Bases: OrderedDict
Base class for all model outputs as dataclass. Has a __getitem__ that allows indexing by integer or slice (like a tuple) or by string (like a dictionary); attributes that are None are ignored. Otherwise behaves like a regular Python dictionary.
- pop(key[, default]) v[source]#
Removes the specified key and returns the corresponding value.
If the key is not found, returns the default if given; otherwise raises a KeyError.
- setdefault(*args, **kwargs)[source]#
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
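The indexing rules described for ModelOutput (integer or slice access like a tuple, string access like a dictionary, None attributes skipped) can be sketched with a small stand-in. `ToyModelOutput` below is hypothetical and only mimics the documented behaviour; it is not the EasyDeL class.

```python
from collections import OrderedDict


class ToyModelOutput(OrderedDict):
    """Hypothetical stand-in that mimics the documented indexing rules."""

    def __init__(self, **fields):
        # Keep only the fields that are not None, in declaration order.
        super().__init__((k, v) for k, v in fields.items() if v is not None)

    def __getitem__(self, key):
        if isinstance(key, str):
            return super().__getitem__(key)  # dictionary-style access
        return tuple(self.values())[key]     # tuple-style access (int or slice)


out = ToyModelOutput(logits=[1.0, 2.0], hidden_states=None, loss=0.5)
first = out[0]          # same object as out["logits"]
second = out[1]         # hidden_states is None, so index 1 is the loss
```

Note the consequence for positional access: the integer index of a field depends on which earlier fields are None, so string keys are the more robust choice in calling code.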
- class easydel.infra.modeling_outputs.MoeCausalLMOutput(logits: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, past_key_values: Optional[TransformerCache] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None, aux_loss: Optional[Union[Array, ndarray, bool, number]] = None, router_logits: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, all_router_losses: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None)[source]#
Bases: MaskedLMOutput
Base class for causal language modeling (CLM) outputs of MoE models.
- Parameters
aux_loss (chex.Array, optional) – Auxiliary loss used for training MoE models.
router_logits (tuple(chex.Array), optional) – Tuple of chex.Array (one for each layer) of shape (batch_size, sequence_length, num_experts). The logits output of the router network, which are used to compute the mixture of experts.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
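To make the relationship between router_logits and an auxiliary loss concrete, the sketch below computes a Switch-Transformer-style load-balancing term for one layer. This is one common formulation and an assumption here: this page does not state how EasyDeL derives aux_loss. The function name is hypothetical, the batch and sequence dimensions are assumed to be flattened into a single token axis, and the usual scaling coefficient is omitted.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over one vector of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def load_balancing_loss(router_logits: List[List[float]]) -> float:
    """Switch-style auxiliary loss for one layer.

    router_logits has shape (num_tokens, num_experts).
    """
    num_tokens = len(router_logits)
    num_experts = len(router_logits[0])
    probs = [softmax(row) for row in router_logits]

    # f[e]: fraction of tokens whose top-1 expert is e.
    counts = [0] * num_experts
    for row in probs:
        counts[row.index(max(row))] += 1
    f = [c / num_tokens for c in counts]

    # p[e]: mean router probability assigned to expert e.
    p = [sum(row[e] for row in probs) / num_tokens for e in range(num_experts)]

    return num_experts * sum(fe * pe for fe, pe in zip(f, p))


balanced = [[5.0, 0.0], [0.0, 5.0]]   # each expert receives one token
collapsed = [[5.0, 0.0], [5.0, 0.0]]  # every token is routed to expert 0
```

Under this formulation the loss is smallest when tokens are spread evenly across experts and grows as routing collapses onto a few of them.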
- class easydel.infra.modeling_outputs.MoeModelOutput(last_hidden_state: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, past_key_values: Optional[TransformerCache] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, router_logits: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, all_router_losses: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, logits: Union[Array, ndarray, bool, number] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for MoE model outputs.
- Parameters
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) – Sequence of hidden-states at the output of the last layer of the model.
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
router_logits (tuple(chex.Array), optional) –
Tuple of chex.Array (one for each layer) of shape (batch_size, sequence_length, num_experts).
The logits output of the router network, which are used to compute the mixture of experts.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.MultipleChoiceModelOutput(logits: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for outputs of multiple choice models.
- Parameters
logits (chex.Array of shape (batch_size, num_choices)) –
Classification scores (before SoftMax).
num_choices is the second dimension of the input tensors (see input_ids).
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
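Since the MultipleChoiceModelOutput logits have one score per candidate answer, selecting an answer reduces to an argmax over the num_choices axis. A minimal sketch with plain lists follows; `pick_choices` and the toy values are hypothetical.

```python
from typing import List


def pick_choices(logits: List[List[float]]) -> List[int]:
    """Return the index of the highest-scoring choice for each example."""
    return [row.index(max(row)) for row in logits]


# Toy logits of shape (batch_size=2, num_choices=4).
toy_logits = [[0.1, 2.3, -0.5, 0.0], [1.5, 0.2, 0.3, 1.4]]
chosen = pick_choices(toy_logits)
```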
- class easydel.infra.modeling_outputs.NextSentencePredictorOutput(logits: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for outputs of models predicting if two sentences are consecutive or not.
- Parameters
logits (chex.Array of shape (batch_size, 2)) – Prediction scores of the next sequence prediction (classification) head (scores of True/False continuation before SoftMax).
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.QuestionAnsweringModelOutput(start_logits: Union[Array, ndarray, bool, number] = None, end_logits: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for outputs of question answering models.
- Parameters
start_logits (chex.Array of shape (batch_size, sequence_length)) – Span-start scores (before SoftMax).
end_logits (chex.Array of shape (batch_size, sequence_length)) – Span-end scores (before SoftMax).
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
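The start_logits and end_logits of QuestionAnsweringModelOutput are scored independently, so taking each argmax separately can yield an end position before the start. A common remedy, sketched below for a single sequence, is to search for the pair that maximizes the summed score subject to start <= end. The function name `best_span` and the `max_answer_len` parameter are hypothetical and not part of this module.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float], end_logits: List[float], max_answer_len: int = 30
) -> Tuple[int, int]:
    """Return (start, end) maximizing start_logits[s] + end_logits[e], s <= e."""
    best_score = float("-inf")
    best = (0, 0)
    for s, s_score in enumerate(start_logits):
        # Only consider end positions at or after s, within the length limit.
        last = min(len(end_logits), s + max_answer_len)
        for e in range(s, last):
            score = s_score + end_logits[e]
            if score > best_score:
                best_score, best = score, (s, e)
    return best


# Toy scores for one sequence of 5 tokens.
start = [0.1, 3.0, 0.2, 0.0, 1.0]
end = [2.5, 0.0, 0.3, 2.0, 0.1]
span = best_span(start, end)
```

In the toy data the highest end score sits at index 0, before the best start at index 1, which is exactly the case the joint search is meant to handle.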
- class easydel.infra.modeling_outputs.SampleOutput(sequences: Union[Array, ndarray, bool, number] = None)[source]#
Bases: ModelOutput
Flax base class for outputs of decoder-only generation models using sampling.
- Parameters
sequences (chex.Array of shape (batch_size, max_length)) – The generated sequences.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.Seq2SeqLMOutput(logits: Union[Array, ndarray, bool, number] = None, past_key_values: Optional[TransformerCache] = None, decoder_hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, decoder_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, cross_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, encoder_last_hidden_state: Optional[Union[Array, ndarray, bool, number]] = None, encoder_hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, encoder_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for sequence-to-sequence language model outputs.
- Parameters
logits (chex.Array of shape (batch_size, sequence_length, config.vocab_size)) – Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
past_key_values (tuple(tuple(chex.Array)), optional, returned when use_cache=True is passed or when config.use_cache=True) –
Tuple of tuple(chex.Array) of length config.n_layers, with each tuple having 2 tensors of shape (batch_size, num_heads, sequence_length, embed_size_per_head) and 2 additional tensors of shape (batch_size, num_heads, encoder_sequence_length, embed_size_per_head).
Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention blocks) that can be used (see past_key_values input) to speed up sequential decoding.
decoder_hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the decoder at the output of each layer plus the initial embedding outputs.
decoder_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
cross_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder’s cross-attention layer, after the attention softmax, used to compute the weighted average in the cross-attention heads.
encoder_last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size), optional) – Sequence of hidden-states at the output of the last layer of the encoder of the model.
encoder_hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the encoder at the output of each layer plus the initial embedding outputs.
encoder_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the encoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.Seq2SeqModelOutput(last_hidden_state: Union[Array, ndarray, bool, number] = None, past_key_values: Optional[TransformerCache] = None, decoder_hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, decoder_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, cross_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, encoder_last_hidden_state: Optional[Union[Array, ndarray, bool, number]] = None, encoder_hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, encoder_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for model encoder outputs that also contain pre-computed hidden states that can speed up sequential decoding.
- Parameters
last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size)) –
Sequence of hidden-states at the output of the last layer of the decoder of the model.
If past_key_values is used, only the last hidden-state of the sequences, of shape (batch_size, 1, hidden_size), is output.
past_key_values (tuple(tuple(chex.Array)), optional, returned when use_cache=True is passed or when config.use_cache=True) –
Tuple of tuple(chex.Array) of length config.n_layers, with each tuple having 2 tensors of shape (batch_size, num_heads, sequence_length, embed_size_per_head) and 2 additional tensors of shape (batch_size, num_heads, encoder_sequence_length, embed_size_per_head).
Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention blocks) that can be used (see past_key_values input) to speed up sequential decoding.
decoder_hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the decoder at the output of each layer plus the initial embedding outputs.
decoder_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
cross_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder’s cross-attention layer, after the attention softmax, used to compute the weighted average in the cross-attention heads.
encoder_last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size), optional) – Sequence of hidden-states at the output of the last layer of the encoder of the model.
encoder_hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the encoder at the output of each layer plus the initial embedding outputs.
encoder_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the encoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.Seq2SeqQuestionAnsweringModelOutput(start_logits: Union[Array, ndarray, bool, number] = None, end_logits: Union[Array, ndarray, bool, number] = None, past_key_values: Optional[TransformerCache] = None, decoder_hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, decoder_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, cross_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, encoder_last_hidden_state: Optional[Union[Array, ndarray, bool, number]] = None, encoder_hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, encoder_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for outputs of sequence-to-sequence question answering models.
- Parameters
start_logits (chex.Array of shape (batch_size, sequence_length)) – Span-start scores (before SoftMax).
end_logits (chex.Array of shape (batch_size, sequence_length)) – Span-end scores (before SoftMax).
past_key_values (tuple(tuple(chex.Array)), optional, returned when use_cache=True is passed or when config.use_cache=True) –
Tuple of tuple(chex.Array) of length config.n_layers, with each tuple having 2 tensors of shape (batch_size, num_heads, sequence_length, embed_size_per_head) and 2 additional tensors of shape (batch_size, num_heads, encoder_sequence_length, embed_size_per_head).
Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention blocks) that can be used (see past_key_values input) to speed up sequential decoding.
decoder_hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the decoder at the output of each layer plus the initial embedding outputs.
decoder_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
cross_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder’s cross-attention layer, after the attention softmax, used to compute the weighted average in the cross-attention heads.
encoder_last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size), optional) – Sequence of hidden-states at the output of the last layer of the encoder of the model.
encoder_hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the encoder at the output of each layer plus the initial embedding outputs.
encoder_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the encoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.Seq2SeqSequenceClassifierOutput(logits: Union[Array, ndarray, bool, number] = None, past_key_values: Optional[TransformerCache] = None, decoder_hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, decoder_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, cross_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, encoder_last_hidden_state: Optional[Union[Array, ndarray, bool, number]] = None, encoder_hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, encoder_attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for outputs of sequence-to-sequence sentence classification models.
- Parameters
logits (chex.Array of shape (batch_size, config.num_labels)) – Classification (or regression if config.num_labels==1) scores (before SoftMax).
past_key_values (tuple(tuple(chex.Array)), optional, returned when use_cache=True is passed or when config.use_cache=True) –
Tuple of tuple(chex.Array) of length config.n_layers, with each tuple having 2 tensors of shape (batch_size, num_heads, sequence_length, embed_size_per_head) and 2 additional tensors of shape (batch_size, num_heads, encoder_sequence_length, embed_size_per_head).
Contains pre-computed hidden-states (key and values in the self-attention blocks and in the cross-attention blocks) that can be used (see past_key_values input) to speed up sequential decoding.
decoder_hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the decoder at the output of each layer plus the initial embedding outputs.
decoder_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
cross_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the decoder’s cross-attention layer, after the attention softmax, used to compute the weighted average in the cross-attention heads.
encoder_last_hidden_state (chex.Array of shape (batch_size, sequence_length, hidden_size), optional) – Sequence of hidden-states at the output of the last layer of the encoder of the model.
encoder_hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the encoder at the output of each layer plus the initial embedding outputs.
encoder_attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights of the encoder, after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
- class easydel.infra.modeling_outputs.SequenceClassifierOutput(logits: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, past_key_values: Optional[TransformerCache] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None, aux_loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for outputs of sentence classification models.
- Parameters
logits (chex.Array of shape (batch_size, config.num_labels)) – Classification (or regression if config.num_labels==1) scores (before SoftMax).
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- past_key_values: Optional[TransformerCache] = None#
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
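The logits description for SequenceClassifierOutput covers two cases: classification scores, or a regression value when config.num_labels==1. The sketch below shows how calling code might branch on that, using one row of logits as a plain list. The function name `interpret_logits` is hypothetical.

```python
from typing import List, Union


def interpret_logits(row: List[float]) -> Union[float, int]:
    """Return a regression value for one logit, else the argmax label index."""
    if len(row) == 1:  # config.num_labels == 1: treat the score as a regression output
        return row[0]
    return row.index(max(row))  # classification: index of the highest score


regression_value = interpret_logits([0.73])
predicted_label = interpret_logits([0.2, 1.9, -0.4])
```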
- class easydel.infra.modeling_outputs.TokenClassifierOutput(logits: Union[Array, ndarray, bool, number] = None, hidden_states: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, attentions: Optional[Tuple[Union[Array, ndarray, bool, number]]] = None, loss: Optional[Union[Array, ndarray, bool, number]] = None)[source]#
Bases: ModelOutput
Base class for outputs of token classification models.
- Parameters
logits (chex.Array of shape (batch_size, sequence_length, config.num_labels)) – Classification scores (before SoftMax).
hidden_states (tuple(chex.Array), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) –
Tuple of chex.Array (one for the output of the embeddings + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the initial embedding outputs.
attentions (tuple(chex.Array), optional, returned when output_attentions=True is passed or when config.output_attentions=True) –
Tuple of chex.Array (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- classmethod from_dict(data: Dict[str, Any]) T#
Deserializes a dictionary into a PyTree object.
- classmethod from_json(json_str: str) T#
Deserializes a JSON string into a PyTree object.
- replace(**kwargs)#
Creates a new instance with specified fields replaced.
- to_dict() Dict[str, Any]#
Serializes the PyTree object to a dictionary.
- to_json(**kwargs) str#
Serializes the PyTree object to a JSON string.
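For TokenClassifierOutput, the logits carry one score vector per token, so decoding is a per-token argmax followed by a lookup in a label map. The sketch below shows this for a single sequence with plain lists. The `decode_tags` helper and the `id2label` mapping are hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def decode_tags(token_logits: List[List[float]], id2label: Dict[int, str]) -> List[str]:
    """Map each token's highest-scoring class index to its label string."""
    return [id2label[row.index(max(row))] for row in token_logits]


# Hypothetical label map and toy logits of shape (sequence_length=3, num_labels=3).
id2label = {0: "O", 1: "B-PER", 2: "I-PER"}
toy_logits = [[2.0, 0.1, 0.0], [0.2, 3.1, 0.4], [0.0, 0.5, 2.2]]
tags = decode_tags(toy_logits, id2label)
```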