easydel.modules.pixtral.modeling_pixtral_flax

class easydel.modules.pixtral.modeling_pixtral_flax.PixtralAttention(*args: Any, **kwargs: Any)

Bases: FlaxAttentionModule

class easydel.modules.pixtral.modeling_pixtral_flax.PixtralBlock(*args: Any, **kwargs: Any)

Bases: Module

class easydel.modules.pixtral.modeling_pixtral_flax.PixtralMLP(*args: Any, **kwargs: Any)

Bases: Module

class easydel.modules.pixtral.modeling_pixtral_flax.PixtralTransformer(*args: Any, **kwargs: Any)

Bases: Module

class easydel.modules.pixtral.modeling_pixtral_flax.PixtralVisionModel(*args: Any, **kwargs: Any)

Bases: EasyDeLBaseModule

property frequencies

Returns frequency values from the config.

easydel.modules.pixtral.modeling_pixtral_flax.apply_rotary_pos_emb(q, k, cos, sin, position_ids=None, unsqueeze_dim=0)

Applies Rotary Position Embedding to the query and key tensors.

Parameters
  • q (jnp.ndarray) – The query tensor.

  • k (jnp.ndarray) – The key tensor.

  • cos (jnp.ndarray) – The cosine part of the rotary embedding.

  • sin (jnp.ndarray) – The sine part of the rotary embedding.

  • position_ids (jnp.ndarray, optional) – Deprecated and unused.

  • unsqueeze_dim (int, optional) – The dimension along which to unsqueeze cos and sin so that they broadcast against q and k. For example, if cos and sin have shape [batch_size, seq_len, head_dim] and q and k have shape [batch_size, heads, seq_len, head_dim], setting unsqueeze_dim=1 makes cos and sin broadcastable to q and k. If q and k instead have shape [batch_size, seq_len, heads, head_dim], set unsqueeze_dim=2.

Returns

tuple(jnp.ndarray) comprising the query and key tensors rotated using the Rotary Position Embedding.
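The broadcasting behaviour described for unsqueeze_dim can be illustrated with a small JAX sketch. This is not the EasyDeL implementation: the names rotate_half_sketch and apply_rotary_sketch, the tensor sizes, and the way the angles are built are all assumptions made for illustration, following the common rotate-half formulation of RoPE.

```python
import jax.numpy as jnp


def rotate_half_sketch(x):
    """Hypothetical helper: swap the two halves of the last axis, negating the second."""
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    return jnp.concatenate([-x2, x1], axis=-1)


def apply_rotary_sketch(q, k, cos, sin, unsqueeze_dim=1):
    """Hypothetical re-implementation of the rotate-half RoPE formula."""
    # Insert a singleton axis so cos/sin broadcast across the heads axis of q and k.
    cos = jnp.expand_dims(cos, unsqueeze_dim)
    sin = jnp.expand_dims(sin, unsqueeze_dim)
    q_rot = q * cos + rotate_half_sketch(q) * sin
    k_rot = k * cos + rotate_half_sketch(k) * sin
    return q_rot, k_rot


# Illustrative sizes (assumptions, not taken from any Pixtral config).
batch, heads, seq_len, head_dim = 2, 4, 6, 8
q = jnp.ones((batch, heads, seq_len, head_dim))
k = jnp.ones((batch, heads, seq_len, head_dim))

# Assumed angle table: position times inverse frequency, duplicated to fill head_dim.
pos = jnp.arange(seq_len)[:, None]
inv = 1.0 / (10000.0 ** (jnp.arange(0, head_dim, 2) / head_dim))
angles = jnp.concatenate([pos * inv, pos * inv], axis=-1)  # (seq_len, head_dim)
cos = jnp.broadcast_to(jnp.cos(angles), (batch, seq_len, head_dim))
sin = jnp.broadcast_to(jnp.sin(angles), (batch, seq_len, head_dim))

# q and k are laid out as [batch, heads, seq_len, head_dim], so unsqueeze_dim=1.
q_rot, k_rot = apply_rotary_sketch(q, k, cos, sin, unsqueeze_dim=1)
```

With this layout, cos and sin become [batch_size, 1, seq_len, head_dim] after the unsqueeze and broadcast over the heads axis. The rotation leaves the shape of q and k unchanged and preserves the norm of each head vector.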

easydel.modules.pixtral.modeling_pixtral_flax.compute_frequencies(dim: int, max_patches_per_side: int, theta: float = 10000.0)

Computes frequencies with a fixed max length for RoPE.

Parameters
  • dim (int) – Embedding dimension.

  • max_patches_per_side (int) – Maximum number of patches per side of the image.

  • theta (float) – Base used to compute the rotary frequencies. Defaults to 10000.0.

Returns

inv_freq – Computed frequencies of shape (max_patches_per_side**2, dim).
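One way to obtain a table of the documented shape is sketched below in JAX. It is an assumption modeled on the 2D rotary layout commonly used for Pixtral-style vision encoders, not a copy of the EasyDeL source: the function name compute_frequencies_sketch, the split of frequencies between rows and columns, and the requirement that dim be divisible by 4 are all illustrative choices.

```python
import jax.numpy as jnp


def compute_frequencies_sketch(dim, max_patches_per_side, theta=10000.0):
    """Hypothetical 2D RoPE table of shape (max_patches_per_side**2, dim).

    Assumes dim is divisible by 4 so rows and columns get equal shares.
    """
    # Base inverse frequencies, one per rotated pair: shape (dim // 2,).
    freqs = 1.0 / (theta ** (jnp.arange(0, dim, 2, dtype=jnp.float32) / dim))
    idx = jnp.arange(max_patches_per_side, dtype=jnp.float32)
    # Assumed split: even-indexed frequencies encode the row, odd-indexed the column.
    freqs_h = jnp.outer(idx, freqs[::2])   # (side, dim // 4)
    freqs_w = jnp.outer(idx, freqs[1::2])  # (side, dim // 4)
    side = max_patches_per_side
    grid = jnp.concatenate(
        [
            jnp.broadcast_to(freqs_h[:, None, :], (side, side, freqs_h.shape[-1])),
            jnp.broadcast_to(freqs_w[None, :, :], (side, side, freqs_w.shape[-1])),
        ],
        axis=-1,
    ).reshape(-1, dim // 2)
    # Duplicate so the table spans the full embedding dimension.
    return jnp.concatenate([grid, grid], axis=-1)


inv_freq = compute_frequencies_sketch(dim=8, max_patches_per_side=4)
```

In this sketch each row of the table corresponds to one (row, column) patch position in a max_patches_per_side × max_patches_per_side grid, flattened in row-major order, so the entry for patch (0, 0) is all zeros.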

easydel.modules.pixtral.modeling_pixtral_flax.generate_block_attention_mask(patch_embeds_list, tensor)

easydel.modules.pixtral.modeling_pixtral_flax.position_ids_in_meshgrid(patch_embeds_list, max_width)

easydel.modules.pixtral.modeling_pixtral_flax.rotate_half(x)

Rotates half the hidden dims of the input.
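A minimal JAX sketch of this operation, assuming the usual convention in which the last axis is split into two halves (x1, x2) and returned as (-x2, x1). The name rotate_half_sketch is hypothetical and the code is illustrative rather than taken from the module.

```python
import jax.numpy as jnp


def rotate_half_sketch(x):
    """Hypothetical helper: map (x1, x2) along the last axis to (-x2, x1)."""
    half = x.shape[-1] // 2
    x1 = x[..., :half]
    x2 = x[..., half:]
    return jnp.concatenate([-x2, x1], axis=-1)


x = jnp.array([1.0, 2.0, 3.0, 4.0])
rotated = rotate_half_sketch(x)  # [-3., -4., 1., 2.]
```

Under this convention, applying the operation twice negates the input, which matches a 90-degree rotation applied to each (x1, x2) pair.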