easydel.inference.openai_api_modules

Contents

easydel.inference.openai_api_modules#

OpenAI API compatibility models and utilities.

This module provides Pydantic models and utilities for OpenAI API compatibility, enabling EasyDeL inference engines to work with OpenAI-compatible clients and tools.

Key Components:
  • Request/Response models for chat completions and text completions

  • Function calling support with multiple format parsers

  • Token usage tracking and metrics

  • Streaming response models

Classes:

ChatMessage: Single message in a conversation DeltaMessage: Incremental message for streaming UsageInfo: Token usage and performance metrics ChatCompletionRequest: Request for chat completions ChatCompletionResponse: Response from chat completions CompletionRequest: Request for text completions CompletionResponse: Response from text completions FunctionCallFormat: Supported function call formats FunctionCallFormatter: Formatter for function call prompts FunctionCallParser: Parser for extracting function calls

Example

>>> from easydel.inference.openai_api_modules import (
...     ChatCompletionRequest,
...     ChatMessage
... )
>>> request = ChatCompletionRequest(
...     model="gpt-3.5-turbo",
...     messages=[
...         ChatMessage(role="user", content="Hello!")
...     ],
...     temperature=0.7
... )
class easydel.inference.openai_api_modules.ChatCompletionRequest(*, model: str, messages: list[easydel.inference.openai_api_modules.ChatMessage], max_tokens: int | None = None, presence_penalty: float = 0.0, frequency_penalty: float = 0.0, repetition_penalty: float = 1.0, temperature: float = 0.7, top_p: float = 0.95, top_k: int = 0, min_p: float = 0.0, suppress_tokens: list[int] = <factory>, functions: list[easydel.inference.openai_api_modules.FunctionDefinition] | None = None, function_call: str | dict[str, typing.Any] | None = None, tools: list[easydel.inference.openai_api_modules.ToolDefinition] | None = None, tool_choice: str | dict[str, typing.Any] | None = None, n: int | None = 1, stream: bool | None = False, stop: str | list[str] | None = None, logit_bias: dict[str, float] | None = None, user: str | None = None, chat_template_kwargs: dict[str, int | float | str | bool] | None = None, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a request to the chat completion endpoint. Mirrors the OpenAI ChatCompletion request structure.

chat_template_kwargs: dict[str, int | float | str | bool] | None#
frequency_penalty: float#
function_call: str | dict[str, Any] | None#
functions: list[easydel.inference.openai_api_modules.FunctionDefinition] | None#
logit_bias: dict[str, float] | None#
max_tokens: int | None#
messages: list[easydel.inference.openai_api_modules.ChatMessage]#
min_p: float#
model: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n: int | None#
presence_penalty: float#
repetition_penalty: float#
stop: str | list[str] | None#
stream: bool | None#
suppress_tokens: list[int]#
temperature: float#
tool_choice: str | dict[str, Any] | None#
tools: list[easydel.inference.openai_api_modules.ToolDefinition] | None#
top_k: int#
top_p: float#
user: str | None#
class easydel.inference.openai_api_modules.ChatCompletionResponse(*, id: str = <factory>, object: str = 'chat.completion', created: int = <factory>, model: str, choices: list[easydel.inference.openai_api_modules.ChatCompletionResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a non-streaming response from the chat completion endpoint.

choices: list[easydel.inference.openai_api_modules.ChatCompletionResponseChoice]#
created: int#
id: str#
model: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

object: str#
usage: UsageInfo#
class easydel.inference.openai_api_modules.ChatCompletionResponseChoice(*, index: int, message: ChatMessage, finish_reason: Optional[Literal['stop', 'length', 'function_call', 'tool_calls', 'abort']] = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single choice within a non-streaming chat completion response.

finish_reason: Optional[Literal['stop', 'length', 'function_call', 'tool_calls', 'abort']]#
index: int#
message: ChatMessage#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class easydel.inference.openai_api_modules.ChatCompletionStreamResponse(*, id: str = <factory>, object: str = 'chat.completion.chunk', created: int = <factory>, model: str, choices: list[easydel.inference.openai_api_modules.ChatCompletionStreamResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a single chunk in a streaming response from the chat completion endpoint.

choices: list[easydel.inference.openai_api_modules.ChatCompletionStreamResponseChoice]#
created: int#
id: str#
model: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

object: str#
usage: UsageInfo#
class easydel.inference.openai_api_modules.ChatCompletionStreamResponseChoice(*, index: int, delta: DeltaMessage, finish_reason: Optional[Literal['stop', 'length', 'function_call']] = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single choice within a streaming chat completion response chunk.

delta: DeltaMessage#
finish_reason: Optional[Literal['stop', 'length', 'function_call']]#
index: int#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class easydel.inference.openai_api_modules.ChatMessage(*, role: str, content: str | list[Mapping[str, str]], name: str | None = None, function_call: dict[str, Any] | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single message in a chat conversation.

role#

Message role (system, user, assistant, function)

Type

str

content#

Message content (text or structured)

Type

str | list[Mapping[str, str]]

name#

Optional name for the message sender

Type

str | None

function_call#

Optional function call made by assistant

Type

dict[str, Any] | None

content: str | list[Mapping[str, str]]#
function_call: dict[str, Any] | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None#
role: str#
class easydel.inference.openai_api_modules.CompletionLogprobs(*, tokens: list[str], token_logprobs: list[float], top_logprobs: list[dict[str, float]] | None = None, text_offset: list[int] | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Log probabilities for token generation.

model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

text_offset: list[int] | None#
token_logprobs: list[float]#
tokens: list[str]#
top_logprobs: list[dict[str, float]] | None#
class easydel.inference.openai_api_modules.CompletionRequest(*, model: str, prompt: str | list[str], max_tokens: int | None = None, presence_penalty: float = 0.0, frequency_penalty: float = 0.0, repetition_penalty: float = 1.0, temperature: float = 0.7, top_p: float = 0.95, top_k: int = 0, min_p: float = 0.0, suppress_tokens: list[int] = <factory>, n: int | None = 1, stream: bool | None = False, stop: str | list[str] | None = None, logit_bias: dict[str, float] | None = None, user: str | None = None, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a request to the completions endpoint. Mirrors the OpenAI Completion request structure.

frequency_penalty: float#
logit_bias: dict[str, float] | None#
max_tokens: int | None#
min_p: float#
model: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n: int | None#
presence_penalty: float#
prompt: str | list[str]#
repetition_penalty: float#
stop: str | list[str] | None#
stream: bool | None#
suppress_tokens: list[int]#
temperature: float#
top_k: int#
top_p: float#
user: str | None#
class easydel.inference.openai_api_modules.CompletionResponse(*, id: str = <factory>, object: str = 'text_completion', created: int = <factory>, model: str, choices: list[easydel.inference.openai_api_modules.CompletionResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a response from the completions endpoint.

choices: list[easydel.inference.openai_api_modules.CompletionResponseChoice]#
created: int#
id: str#
model: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

object: str#
usage: UsageInfo#
class easydel.inference.openai_api_modules.CompletionResponseChoice(*, text: str, index: int, logprobs: easydel.inference.openai_api_modules.CompletionLogprobs | None = None, finish_reason: Optional[Literal['stop', 'length', 'function_call']] = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single choice within a completion response.

finish_reason: Optional[Literal['stop', 'length', 'function_call']]#
index: int#
logprobs: easydel.inference.openai_api_modules.CompletionLogprobs | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

text: str#
class easydel.inference.openai_api_modules.CompletionStreamResponse(*, id: str = <factory>, object: str = 'text_completion.chunk', created: int = <factory>, model: str, choices: list[easydel.inference.openai_api_modules.CompletionStreamResponseChoice], usage: easydel.inference.openai_api_modules.UsageInfo | None = None, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Represents a streaming response from the completions endpoint.

choices: list[easydel.inference.openai_api_modules.CompletionStreamResponseChoice]#
created: int#
id: str#
model: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

object: str#
usage: easydel.inference.openai_api_modules.UsageInfo | None#
class easydel.inference.openai_api_modules.CompletionStreamResponseChoice(*, index: int, text: str, logprobs: easydel.inference.openai_api_modules.CompletionLogprobs | None = None, finish_reason: Optional[Literal['stop', 'length', 'function_call']] = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a single choice within a streaming completion response chunk.

finish_reason: Optional[Literal['stop', 'length', 'function_call']]#
index: int#
logprobs: easydel.inference.openai_api_modules.CompletionLogprobs | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

text: str#
class easydel.inference.openai_api_modules.CountTokenRequest(*, model: str, conversation: str | list[easydel.inference.openai_api_modules.ChatMessage], **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a request to the token counting endpoint.

conversation: str | list[easydel.inference.openai_api_modules.ChatMessage]#
model: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class easydel.inference.openai_api_modules.DeltaFunctionCall(*, name: str | None = None, arguments: str | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

arguments: str | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None#
class easydel.inference.openai_api_modules.DeltaMessage(*, role: str | None = None, content: str | list[Mapping[str, str]] | None = None, function_call: dict[str, Any] | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a change (delta) in a chat message.

Used in streaming responses to send incremental updates.

role#

Optional role if starting new message

Type

str | None

content#

Incremental content to append

Type

str | list[Mapping[str, str]] | None

function_call#

Optional function call updates

Type

dict[str, Any] | None

content: str | list[Mapping[str, str]] | None#
function_call: dict[str, Any] | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

role: str | None#
class easydel.inference.openai_api_modules.DeltaToolCall(*, id: str | None = None, type: Optional[Literal['function']] = None, index: int, function: easydel.inference.openai_api_modules.DeltaFunctionCall | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

function: easydel.inference.openai_api_modules.DeltaFunctionCall | None#
id: str | None#
index: int#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Optional[Literal['function']]#
class easydel.inference.openai_api_modules.ExtractedToolCallInformation(*, tools_called: bool, tool_calls: list[easydel.inference.openai_api_modules.ToolCall], content: str | None = None, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

content: str | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

tool_calls: list[easydel.inference.openai_api_modules.ToolCall]#
tools_called: bool#
class easydel.inference.openai_api_modules.Function(*, name: str, description: str | None = None, parameters: dict[str, typing.Any] = <factory>, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Function definition for OpenAI-compatible function calling.

description: str | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str#
parameters: dict[str, Any]#
class easydel.inference.openai_api_modules.FunctionCall(*, name: str, arguments: str, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a function call in the OpenAI format.

arguments: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str#
class easydel.inference.openai_api_modules.FunctionCallFormat(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#

Bases: str, Enum

Supported function call formats.

Different models and frameworks use different formats for function calling.

OPENAI#

OpenAI’s standard format

JSON_SCHEMA#

Direct JSON schema format

HERMES#

Hermes model function calling format

GORILLA#

Gorilla model function calling format

QWEN#

Qwen’s special token format (✿FUNCTION✿)

NOUS#

Nous XML-style format (<tool_call>)

GORILLA = 'gorilla'#
HERMES = 'hermes'#
JSON_SCHEMA = 'json_schema'#
NOUS = 'nous'#
OPENAI = 'openai'#
QWEN = 'qwen'#
class easydel.inference.openai_api_modules.FunctionDefinition(*, name: str, description: str | None = None, parameters: dict[str, typing.Any] = <factory>, required: list[str] | None = None, **extra_data: ~typing.Any)[source]#

Bases: OpenAIBaseModel

Defines a function that can be called by the model.

name#

Function name

Type

str

description#

Function description for the model

Type

str | None

parameters#

JSON Schema for function parameters

Type

dict[str, Any]

required#

List of required parameter names

Type

list[str] | None

description: str | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str#
parameters: dict[str, Any]#
required: list[str] | None#
class easydel.inference.openai_api_modules.OpenAIBaseModel(**extra_data: Any)[source]#

Bases: BaseModel

field_names: ClassVar[set[str] | None] = None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class easydel.inference.openai_api_modules.Tool(*, type: str = 'function', function: Function, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Tool definition supporting function calling.

function: Function#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: str#
class easydel.inference.openai_api_modules.ToolCall(*, id: str, type: str = 'function', function: FunctionCall, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Represents a tool call in responses.

function: FunctionCall#
id: str#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: str#
class easydel.inference.openai_api_modules.ToolDefinition(*, type: str = 'function', function: FunctionDefinition, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Defines a tool that can be called by the model.

function: FunctionDefinition#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: str#
class easydel.inference.openai_api_modules.UsageInfo(*, prompt_tokens: int = 0, completion_tokens: int | None = 0, total_tokens: int = 0, tokens_per_second: float = 0, processing_time: float = 0.0, **extra_data: Any)[source]#

Bases: OpenAIBaseModel

Token usage and performance metrics.

Tracks computational resources used for a request.

prompt_tokens#

Number of tokens in the prompt

Type

int

completion_tokens#

Number of tokens generated

Type

int | None

total_tokens#

Sum of prompt and completion tokens

Type

int

tokens_per_second#

Generation speed

Type

float

processing_time#

Total processing time in seconds

Type

float

completion_tokens: int | None#
model_config: ClassVar[ConfigDict] = {'extra': 'allow'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

processing_time: float#
prompt_tokens: int#
tokens_per_second: float#
total_tokens: int#