easydel.modules.pixtral.modeling_pixtral_flax
- class easydel.modules.pixtral.modeling_pixtral_flax.PixtralAttention(*args: Any, **kwargs: Any)
Bases:
FlaxAttentionModule
- class easydel.modules.pixtral.modeling_pixtral_flax.PixtralBlock(*args: Any, **kwargs: Any)
Bases:
Module
- class easydel.modules.pixtral.modeling_pixtral_flax.PixtralMLP(*args: Any, **kwargs: Any)
Bases:
Module
- class easydel.modules.pixtral.modeling_pixtral_flax.PixtralTransformer(*args: Any, **kwargs: Any)
Bases:
Module
- class easydel.modules.pixtral.modeling_pixtral_flax.PixtralVisionModel(*args: Any, **kwargs: Any)
Bases:
EasyDeLBaseModule
- property frequencies
Returns frequency values from the config.
- easydel.modules.pixtral.modeling_pixtral_flax.apply_rotary_pos_emb(q, k, cos, sin, position_ids=None, unsqueeze_dim=0)
Applies Rotary Position Embedding to the query and key tensors.
- Parameters
q (jnp.ndarray) – The query tensor.
k (jnp.ndarray) – The key tensor.
cos (jnp.ndarray) – The cosine part of the rotary embedding.
sin (jnp.ndarray) – The sine part of the rotary embedding.
position_ids (jnp.ndarray, optional) – Deprecated and unused.
unsqueeze_dim (int, optional) – The dimension along which to unsqueeze cos and sin so that they broadcast against q and k. For example, if cos and sin have the shape [batch_size, seq_len, head_dim] and q and k have the shape [batch_size, heads, seq_len, head_dim], setting unsqueeze_dim=1 makes cos and sin broadcastable to the shapes of q and k. Similarly, if q and k have the shape [batch_size, seq_len, heads, head_dim], set unsqueeze_dim=2.
- Returns
tuple(jnp.ndarray) comprising the query and key tensors rotated using the Rotary Position Embedding.
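The broadcasting behaviour of unsqueeze_dim described above can be illustrated with a minimal sketch. This is not the EasyDeL implementation: apply_rope_sketch and rotate_half are hypothetical helpers, and the rotate-half formulation (q * cos + rotate_half(q) * sin) is an assumption based on the common RoPE convention.

```python
import jax.numpy as jnp


def rotate_half(x):
    """Split the last axis in two halves (x1, x2) and return (-x2, x1)."""
    x1, x2 = jnp.split(x, 2, axis=-1)
    return jnp.concatenate([-x2, x1], axis=-1)


def apply_rope_sketch(q, k, cos, sin, unsqueeze_dim=1):
    """Hypothetical stand-in for apply_rotary_pos_emb (assumed formulation)."""
    # Insert a singleton axis so cos/sin broadcast over the heads axis.
    cos = jnp.expand_dims(cos, unsqueeze_dim)
    sin = jnp.expand_dims(sin, unsqueeze_dim)
    q_rot = q * cos + rotate_half(q) * sin
    k_rot = k * cos + rotate_half(k) * sin
    return q_rot, k_rot


batch, heads, seq_len, head_dim = 2, 4, 6, 8
n = batch * heads * seq_len * head_dim
q = jnp.arange(n, dtype=jnp.float32).reshape(batch, heads, seq_len, head_dim) / n
k = q[..., ::-1]

# Rotation angles per position, duplicated so both halves share one angle.
positions = jnp.arange(seq_len, dtype=jnp.float32)
inv_freq = 1.0 / (10000.0 ** (jnp.arange(0, head_dim, 2, dtype=jnp.float32) / head_dim))
angles = jnp.outer(positions, inv_freq)              # (seq_len, head_dim // 2)
emb = jnp.concatenate([angles, angles], axis=-1)     # (seq_len, head_dim)
cos = jnp.broadcast_to(jnp.cos(emb), (batch, seq_len, head_dim))
sin = jnp.broadcast_to(jnp.sin(emb), (batch, seq_len, head_dim))

# Layout [batch, heads, seq_len, head_dim]: unsqueeze at axis 1.
q_rot, k_rot = apply_rope_sketch(q, k, cos, sin, unsqueeze_dim=1)

# Layout [batch, seq_len, heads, head_dim]: unsqueeze at axis 2.
q_t, k_t = jnp.swapaxes(q, 1, 2), jnp.swapaxes(k, 1, 2)
q_rot_t, k_rot_t = apply_rope_sketch(q_t, k_t, cos, sin, unsqueeze_dim=2)
```

Both layouts yield the same rotation up to the axis swap, and the outputs keep the shapes of the inputs. Note that the documented default is unsqueeze_dim=0, so callers using either layout above would need to pass the argument explicitly.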
- easydel.modules.pixtral.modeling_pixtral_flax.compute_frequencies(dim: int, max_patches_per_side: int, theta: float = 10000.0)
Computes frequencies with a fixed max length for RoPE.
- Parameters
dim – Embedding dimension.
max_patches_per_side – Maximum number of patches per side of the image.
theta – Base used to compute the rotary frequencies (the RoPE base).
- Returns
inv_freq – Computed frequencies of shape (max_patches_per_side**2, dim).
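As an illustration of how a table of shape (max_patches_per_side**2, dim) can arise, here is a minimal sketch of a 2D rotary frequency grid. It is an assumption, not the EasyDeL source: compute_frequencies_sketch is a hypothetical name, and the scheme (alternating frequency channels assigned to patch row and patch column, then duplicated along the last axis) follows the layout commonly used for Pixtral-style vision encoders. It assumes dim is divisible by 4.

```python
import jax.numpy as jnp


def compute_frequencies_sketch(dim, max_patches_per_side, theta=10000.0):
    """Hypothetical 2D RoPE frequency table; assumes dim % 4 == 0."""
    p = max_patches_per_side
    # One inverse frequency per channel pair: shape (dim // 2,).
    freqs = 1.0 / (theta ** (jnp.arange(0, dim, 2, dtype=jnp.float32) / dim))

    # Even-indexed frequencies encode the row, odd-indexed ones the column.
    rows = jnp.arange(p, dtype=jnp.float32)
    cols = jnp.arange(p, dtype=jnp.float32)
    freqs_h = jnp.outer(rows, freqs[0::2])  # (p, dim // 4)
    freqs_w = jnp.outer(cols, freqs[1::2])  # (p, dim // 4)

    # Build the (row, col) grid and flatten it to one entry per patch.
    grid = jnp.concatenate(
        [
            jnp.tile(freqs_h[:, None, :], (1, p, 1)),
            jnp.tile(freqs_w[None, :, :], (p, 1, 1)),
        ],
        axis=-1,
    ).reshape(-1, dim // 2)                  # (p * p, dim // 2)

    # Duplicate so the table matches the full embedding dimension.
    return jnp.concatenate([grid, grid], axis=-1)  # (p * p, dim)


inv_freq = compute_frequencies_sketch(dim=8, max_patches_per_side=4)
```

In this sketch the row for the patch at grid position (r, c) is r * max_patches_per_side + c, so the table has a fixed size regardless of the actual image resolution.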
- easydel.modules.pixtral.modeling_pixtral_flax.generate_block_attention_mask(patch_embeds_list, tensor)