easydel.inference.openai_api_modules

Contents

easydel.inference.openai_api_modules#

OpenAI API compatibility models and utilities.

This module provides Pydantic models and utilities for OpenAI API compatibility, enabling EasyDeL inference engines to work with OpenAI-compatible clients and tools.

Key Components:

Request/Response models for chat completions and text completions
Function calling support with multiple format parsers
Token usage tracking and metrics
Streaming response models

Classes:

ChatMessage: Single message in a conversation DeltaMessage: Incremental message for streaming UsageInfo: Token usage and performance metrics ChatCompletionRequest: Request for chat completions ChatCompletionResponse: Response from chat completions CompletionRequest: Request for text completions CompletionResponse: Response from text completions FunctionCallFormat: Supported function call formats FunctionCallFormatter: Formatter for function call prompts FunctionCallParser: Parser for extracting function calls

Example

>>> from easydel.inference.openai_api_modules import (
...     ChatCompletionRequest,
...     ChatMessage
... )
>>> request = ChatCompletionRequest(
...     model="gpt-3.5-turbo",
...     messages=[
...         ChatMessage(role="user", content="Hello!")
...     ],
...     temperature=0.7
... )

class easydel.inference.openai_api_modules.ChatCompletionRequest(*, model: str, messages: list[easydel.inference.openai_api_modules.ChatMessage], max_tokens: int | None = None, presence_penalty: float = 0.0, frequency_penalty: float = 0.0, repetition_penalty: float = 1.0, temperature: float = 0.7, top_p: float = 0.95, top_k: int = 0, min_p: float = 0.0, suppress_tokens: list[int] = <factory>, functions: list[easydel.inference.openai_api_modules.FunctionDefinition] | None = None, function_call: str | dict[str, typing.Any] | None = None, tools: list[easydel.inference.openai_api_modules.ToolDefinition] | None = None, tool_choice: str | dict[str, typing.Any] | None = None, n: int | None = 1, stream: bool | None = False, stop: str | list[str] | None = None, logit_bias: dict[str, float] | None = None, user: str | None = None, chat_template_kwargs: dict[str, int | float | str | bool] | None = None, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a request to the chat completion endpoint. Mirrors the OpenAI ChatCompletion request structure.

chat_template_kwargs: dict[str, int | float | str | bool] | None#

frequency_penalty: float#

function_call: str | dict[str, Any] | None#

functions: list[easydel.inference.openai_api_modules.FunctionDefinition] | None#

logit_bias: dict[str, float] | None#

max_tokens: int | None#

messages: list[easydel.inference.openai_api_modules.ChatMessage]#

min_p: float#

model: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n: int | None#

presence_penalty: float#

repetition_penalty: float#

stop: str | list[str] | None#

stream: bool | None#

suppress_tokens: list[int]#

temperature: float#

tool_choice: str | dict[str, Any] | None#

tools: list[easydel.inference.openai_api_modules.ToolDefinition] | None#

top_k: int#

top_p: float#

user: str | None#

class easydel.inference.openai_api_modules.ChatCompletionResponse(*, id: str = <factory>, object: str = 'chat.completion', created: int = <factory>, model: str, choices: list[easydel.inference.openai_api_modules.ChatCompletionResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a non-streaming response from the chat completion endpoint.

choices: list[easydel.inference.openai_api_modules.ChatCompletionResponseChoice]#

created: int#

id: str#

model: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

object: str#

usage: UsageInfo#

class easydel.inference.openai_api_modules.ChatCompletionResponseChoice(*, index: int, message: ChatMessage, finish_reason: Optional[Literal['stop', 'length', 'function_call', 'tool_calls', 'abort']] = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single choice within a non-streaming chat completion response.

finish_reason: Optional[Literal['stop', 'length', 'function_call', 'tool_calls', 'abort']]#

index: int#

message: ChatMessage#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class easydel.inference.openai_api_modules.ChatCompletionStreamResponse(*, id: str = <factory>, object: str = 'chat.completion.chunk', created: int = <factory>, model: str, choices: list[easydel.inference.openai_api_modules.ChatCompletionStreamResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a single chunk in a streaming response from the chat completion endpoint.

choices: list[easydel.inference.openai_api_modules.ChatCompletionStreamResponseChoice]#

created: int#

id: str#

model: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

object: str#

usage: UsageInfo#

class easydel.inference.openai_api_modules.ChatCompletionStreamResponseChoice(*, index: int, delta: DeltaMessage, finish_reason: Optional[Literal['stop', 'length', 'function_call']] = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single choice within a streaming chat completion response chunk.

delta: DeltaMessage#

finish_reason: Optional[Literal['stop', 'length', 'function_call']]#

index: int#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class easydel.inference.openai_api_modules.ChatMessage(*, role: str, content: str | list[Mapping[str, str]], name: str | None = None, function_call: dict[str, Any] | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single message in a chat conversation.

role#

Message role (system, user, assistant, function)

Type: str

content#

Message content (text or structured)

Type: str | list[Mapping[str, str]]

name#

Optional name for the message sender

Type: str | None

function_call#

Optional function call made by assistant

Type: dict[str, Any] | None

content: str | list[Mapping[str, str]]#

function_call: dict[str, Any] | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None#

role: str#

class easydel.inference.openai_api_modules.CompletionLogprobs(*, tokens: list[str], token_logprobs: list[float], top_logprobs: list[dict[str, float]] | None = None, text_offset: list[int] | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Log probabilities for token generation.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

text_offset: list[int] | None#

token_logprobs: list[float]#

tokens: list[str]#

top_logprobs: list[dict[str, float]] | None#

class easydel.inference.openai_api_modules.CompletionRequest(*, model: str, prompt: str | list[str], max_tokens: int | None = None, presence_penalty: float = 0.0, frequency_penalty: float = 0.0, repetition_penalty: float = 1.0, temperature: float = 0.7, top_p: float = 0.95, top_k: int = 0, min_p: float = 0.0, suppress_tokens: list[int] = <factory>, n: int | None = 1, stream: bool | None = False, stop: str | list[str] | None = None, logit_bias: dict[str, float] | None = None, user: str | None = None, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a request to the completions endpoint. Mirrors the OpenAI Completion request structure.

frequency_penalty: float#

logit_bias: dict[str, float] | None#

max_tokens: int | None#

min_p: float#

model: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n: int | None#

presence_penalty: float#

prompt: str | list[str]#

repetition_penalty: float#

stop: str | list[str] | None#

stream: bool | None#

suppress_tokens: list[int]#

temperature: float#

top_k: int#

top_p: float#

user: str | None#

class easydel.inference.openai_api_modules.CompletionResponse(*, id: str = <factory>, object: str = 'text_completion', created: int = <factory>, model: str, choices: list[easydel.inference.openai_api_modules.CompletionResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a response from the completions endpoint.

choices: list[easydel.inference.openai_api_modules.CompletionResponseChoice]#

created: int#

id: str#

model: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

object: str#

usage: UsageInfo#

class easydel.inference.openai_api_modules.CompletionResponseChoice(*, text: str, index: int, logprobs: easydel.inference.openai_api_modules.CompletionLogprobs | None = None, finish_reason: Optional[Literal['stop', 'length', 'function_call']] = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single choice within a completion response.

finish_reason: Optional[Literal['stop', 'length', 'function_call']]#

index: int#

logprobs: easydel.inference.openai_api_modules.CompletionLogprobs | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

text: str#

class easydel.inference.openai_api_modules.CompletionStreamResponse(*, id: str = <factory>, object: str = 'text_completion.chunk', created: int = <factory>, model: str, choices: list[easydel.inference.openai_api_modules.CompletionStreamResponseChoice], usage: easydel.inference.openai_api_modules.UsageInfo | None = None, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a streaming response from the completions endpoint.

choices: list[easydel.inference.openai_api_modules.CompletionStreamResponseChoice]#

created: int#

id: str#

model: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

object: str#

usage: easydel.inference.openai_api_modules.UsageInfo | None#

class easydel.inference.openai_api_modules.CompletionStreamResponseChoice(*, index: int, text: str, logprobs: easydel.inference.openai_api_modules.CompletionLogprobs | None = None, finish_reason: Optional[Literal['stop', 'length', 'function_call']] = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single choice within a streaming completion response chunk.

finish_reason: Optional[Literal['stop', 'length', 'function_call']]#

index: int#

logprobs: easydel.inference.openai_api_modules.CompletionLogprobs | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

text: str#

class easydel.inference.openai_api_modules.CountTokenRequest(*, model: str, conversation: str | list[easydel.inference.openai_api_modules.ChatMessage], **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a request to the token counting endpoint.

conversation: str | list[easydel.inference.openai_api_modules.ChatMessage]#

model: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class easydel.inference.openai_api_modules.DeltaFunctionCall(*, name: str | None = None, arguments: str | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

arguments: str | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None#

class easydel.inference.openai_api_modules.DeltaMessage(*, role: str | None = None, content: str | list[Mapping[str, str]] | None = None, function_call: dict[str, Any] | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a change (delta) in a chat message.

Used in streaming responses to send incremental updates.

role#

Optional role if starting new message

Type: str | None

content#

Incremental content to append

Type: str | list[Mapping[str, str]] | None

function_call#

Optional function call updates

Type: dict[str, Any] | None

content: str | list[Mapping[str, str]] | None#

function_call: dict[str, Any] | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

role: str | None#

class easydel.inference.openai_api_modules.DeltaToolCall(*, id: str | None = None, type: Optional[Literal['function']] = None, index: int, function: easydel.inference.openai_api_modules.DeltaFunctionCall | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

function: easydel.inference.openai_api_modules.DeltaFunctionCall | None#

id: str | None#

index: int#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Optional[Literal['function']]#

class easydel.inference.openai_api_modules.ExtractedToolCallInformation(*, tools_called: bool, tool_calls: list[easydel.inference.openai_api_modules.ToolCall], content: str | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

content: str | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

tool_calls: list[easydel.inference.openai_api_modules.ToolCall]#

tools_called: bool#

class easydel.inference.openai_api_modules.Function(*, name: str, description: str | None = None, parameters: dict[str, typing.Any] = <factory>, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Function definition for OpenAI-compatible function calling.

description: str | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str#

parameters: dict[str, Any]#

class easydel.inference.openai_api_modules.FunctionCall(*, name: str, arguments: str, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a function call in the OpenAI format.

arguments: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str#

class easydel.inference.openai_api_modules.FunctionCallFormat(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: str, Enum

Supported function call formats.

Different models and frameworks use different formats for function calling.

OPENAI#: OpenAI’s standard format

JSON_SCHEMA#: Direct JSON schema format

HERMES#: Hermes model function calling format

GORILLA#: Gorilla model function calling format

QWEN#: Qwen’s special token format (✿FUNCTION✿)

NOUS#: Nous XML-style format (<tool_call>)

GORILLA = 'gorilla'#

HERMES = 'hermes'#

JSON_SCHEMA = 'json_schema'#

NOUS = 'nous'#

OPENAI = 'openai'#

QWEN = 'qwen'#

class easydel.inference.openai_api_modules.FunctionDefinition(*, name: str, description: str | None = None, parameters: dict[str, typing.Any] = <factory>, required: list[str] | None = None, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Defines a function that can be called by the model.

name#

Function name

Type: str

description#

Function description for the model

Type: str | None

parameters#

JSON Schema for function parameters

Type: dict[str, Any]

required#

List of required parameter names

Type: list[str] | None

description: str | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str#

parameters: dict[str, Any]#

required: list[str] | None#

class easydel.inference.openai_api_modules.OpenAIBaseModel(**extra_data: Any)[source]#

Bases: BaseModel

field_names: ClassVar[set[str] | None] = None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class easydel.inference.openai_api_modules.Tool(*, type: str = 'function', function: Function, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Tool definition supporting function calling.

function: Function#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: str#

class easydel.inference.openai_api_modules.ToolCall(*, id: str, type: str = 'function', function: FunctionCall, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a tool call in responses.

function: FunctionCall#

id: str#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: str#

class easydel.inference.openai_api_modules.ToolDefinition(*, type: str = 'function', function: FunctionDefinition, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Defines a tool that can be called by the model.

function: FunctionDefinition#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: str#

class easydel.inference.openai_api_modules.UsageInfo(*, prompt_tokens: int = 0, completion_tokens: int | None = 0, total_tokens: int = 0, tokens_per_second: float = 0, processing_time: float = 0.0, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Token usage and performance metrics.

Tracks computational resources used for a request.

prompt_tokens#

Number of tokens in the prompt

Type: int

completion_tokens#

Number of tokens generated

Type: int | None

total_tokens#

Sum of prompt and completion tokens

Type: int

tokens_per_second#

Generation speed

Type: float

processing_time#

Total processing time in seconds

Type: float

completion_tokens: int | None#

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

processing_time: float#

prompt_tokens: int#

tokens_per_second: float#

total_tokens: int#