easydel.inference.openai_api_modules#
Defines Pydantic models for the vInference API, mimicking OpenAI’s structure.
- class easydel.inference.openai_api_modules.ChatCompletionRequest(*, model: str, messages: ~typing.List[~easydel.inference.openai_api_modules.ChatMessage], max_tokens: int = 16, presence_penalty: float = 0.0, frequency_penalty: float = 0.0, repetition_penalty: float = 1.0, temperature: float = 0.7, top_p: float = 1.0, top_k: int = 0, min_p: float = 0.0, suppress_tokens: ~typing.List[int] = <factory>, functions: ~typing.Optional[~typing.List[~easydel.inference.openai_api_modules.FunctionDefinition]] = None, function_call: ~typing.Optional[~typing.Union[str, ~typing.Dict[str, ~typing.Any]]] = None, tools: ~typing.Optional[~typing.List[~easydel.inference.openai_api_modules.ToolDefinition]] = None, tool_choice: ~typing.Optional[~typing.Union[str, ~typing.Dict[str, ~typing.Any]]] = None, n: ~typing.Optional[int] = 1, stream: ~typing.Optional[bool] = False, stop: ~typing.Optional[~typing.Union[str, ~typing.List[str]]] = None, logit_bias: ~typing.Optional[~typing.Dict[str, float]] = None, user: ~typing.Optional[str] = None)[source]#
Bases:
BaseModelRepresents a request to the chat completion endpoint. Mirrors the OpenAI ChatCompletion request structure.
- frequency_penalty: float#
- function_call: Optional[Union[str, Dict[str, Any]]]#
- functions: Optional[List[FunctionDefinition]]#
- logit_bias: Optional[Dict[str, float]]#
- max_tokens: int#
- messages: List[ChatMessage]#
- min_p: float#
- model: str#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- n: Optional[int]#
- presence_penalty: float#
- repetition_penalty: float#
- stop: Optional[Union[str, List[str]]]#
- stream: Optional[bool]#
- suppress_tokens: List[int]#
- temperature: float#
- tool_choice: Optional[Union[str, Dict[str, Any]]]#
- tools: Optional[List[ToolDefinition]]#
- top_k: int#
- top_p: float#
- user: Optional[str]#
- class easydel.inference.openai_api_modules.ChatCompletionResponse(*, id: str = <factory>, object: str = 'chat.completion', created: int = <factory>, model: str, choices: ~typing.List[~easydel.inference.openai_api_modules.ChatCompletionResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo)[source]#
Bases:
BaseModelRepresents a non-streaming response from the chat completion endpoint.
- choices: List[ChatCompletionResponseChoice]#
- created: int#
- id: str#
- model: str#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- object: str#
- class easydel.inference.openai_api_modules.ChatCompletionResponseChoice(*, index: int, message: ChatMessage, finish_reason: Optional[Literal['stop', 'length', 'function_call']] = None)[source]#
Bases:
BaseModelRepresents a single choice within a non-streaming chat completion response.
- finish_reason: Optional[Literal['stop', 'length', 'function_call']]#
- index: int#
- message: ChatMessage#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class easydel.inference.openai_api_modules.ChatCompletionStreamResponse(*, id: str = <factory>, object: str = 'chat.completion.chunk', created: int = <factory>, model: str, choices: ~typing.List[~easydel.inference.openai_api_modules.ChatCompletionStreamResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo)[source]#
Bases:
BaseModelRepresents a single chunk in a streaming response from the chat completion endpoint.
- choices: List[ChatCompletionStreamResponseChoice]#
- created: int#
- id: str#
- model: str#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- object: str#
- class easydel.inference.openai_api_modules.ChatCompletionStreamResponseChoice(*, index: int, delta: DeltaMessage, finish_reason: Optional[Literal['stop', 'length', 'function_call']] = None)[source]#
Bases:
BaseModelRepresents a single choice within a streaming chat completion response chunk.
- delta: DeltaMessage#
- finish_reason: Optional[Literal['stop', 'length', 'function_call']]#
- index: int#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class easydel.inference.openai_api_modules.ChatMessage(*, role: str, content: Union[str, List[Mapping[str, str]]], name: Optional[str] = None, function_call: Optional[Dict[str, Any]] = None)[source]#
Bases:
BaseModelRepresents a single message in a chat conversation.
- content: Union[str, List[Mapping[str, str]]]#
- function_call: Optional[Dict[str, Any]]#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- name: Optional[str]#
- role: str#
- class easydel.inference.openai_api_modules.CompletionLogprobs(*, tokens: List[str], token_logprobs: List[float], top_logprobs: Optional[List[Dict[str, float]]] = None, text_offset: Optional[List[int]] = None)[source]#
Bases:
BaseModelLog probabilities for token generation.
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- text_offset: Optional[List[int]]#
- token_logprobs: List[float]#
- tokens: List[str]#
- top_logprobs: Optional[List[Dict[str, float]]]#
- class easydel.inference.openai_api_modules.CompletionRequest(*, model: str, prompt: ~typing.Union[str, ~typing.List[str]], max_tokens: int = 16, presence_penalty: float = 0.0, frequency_penalty: float = 0.0, repetition_penalty: float = 1.0, temperature: float = 0.7, top_p: float = 1.0, top_k: int = 0, min_p: float = 0.0, suppress_tokens: ~typing.List[int] = <factory>, n: ~typing.Optional[int] = 1, stream: ~typing.Optional[bool] = False, stop: ~typing.Optional[~typing.Union[str, ~typing.List[str]]] = None, logit_bias: ~typing.Optional[~typing.Dict[str, float]] = None, user: ~typing.Optional[str] = None)[source]#
Bases:
BaseModelRepresents a request to the completions endpoint. Mirrors the OpenAI Completion request structure.
- frequency_penalty: float#
- logit_bias: Optional[Dict[str, float]]#
- max_tokens: int#
- min_p: float#
- model: str#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- n: Optional[int]#
- presence_penalty: float#
- prompt: Union[str, List[str]]#
- repetition_penalty: float#
- stop: Optional[Union[str, List[str]]]#
- stream: Optional[bool]#
- suppress_tokens: List[int]#
- temperature: float#
- top_k: int#
- top_p: float#
- user: Optional[str]#
- class easydel.inference.openai_api_modules.CompletionResponse(*, id: str = <factory>, object: str = 'text_completion', created: int = <factory>, model: str, choices: ~typing.List[~easydel.inference.openai_api_modules.CompletionResponseChoice], usage: ~easydel.inference.openai_api_modules.UsageInfo)[source]#
Bases:
BaseModelRepresents a response from the completions endpoint.
- choices: List[CompletionResponseChoice]#
- created: int#
- id: str#
- model: str#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- object: str#
- class easydel.inference.openai_api_modules.CompletionResponseChoice(*, text: str, index: int, logprobs: Optional[CompletionLogprobs] = None, finish_reason: Optional[Literal['stop', 'length']] = None)[source]#
Bases:
BaseModelRepresents a single choice within a completion response.
- finish_reason: Optional[Literal['stop', 'length']]#
- index: int#
- logprobs: Optional[CompletionLogprobs]#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- text: str#
- class easydel.inference.openai_api_modules.CompletionStreamResponse(*, id: str = <factory>, object: str = 'text_completion.chunk', created: int = <factory>, model: str, choices: ~typing.List[~easydel.inference.openai_api_modules.CompletionStreamResponseChoice], usage: ~typing.Optional[~easydel.inference.openai_api_modules.UsageInfo] = None)[source]#
Bases:
BaseModelRepresents a streaming response from the completions endpoint.
- choices: List[CompletionStreamResponseChoice]#
- created: int#
- id: str#
- model: str#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- object: str#
- class easydel.inference.openai_api_modules.CompletionStreamResponseChoice(*, index: int, text: str, logprobs: Optional[CompletionLogprobs] = None, finish_reason: Optional[Literal['stop', 'length']] = None)[source]#
Bases:
BaseModelRepresents a single choice within a streaming completion response chunk.
- finish_reason: Optional[Literal['stop', 'length']]#
- index: int#
- logprobs: Optional[CompletionLogprobs]#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- text: str#
- class easydel.inference.openai_api_modules.CountTokenRequest(*, model: str, conversation: Union[str, List[ChatMessage]])[source]#
Bases:
BaseModelRepresents a request to the token counting endpoint.
- conversation: Union[str, List[ChatMessage]]#
- model: str#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class easydel.inference.openai_api_modules.DeltaMessage(*, role: Optional[str] = None, content: Optional[Union[str, List[Mapping[str, str]]]] = None, function_call: Optional[Dict[str, Any]] = None)[source]#
Bases:
BaseModelRepresents a change (delta) in a chat message, used in streaming responses.
- content: Optional[Union[str, List[Mapping[str, str]]]]#
- function_call: Optional[Dict[str, Any]]#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- role: Optional[str]#
- class easydel.inference.openai_api_modules.FunctionDefinition(*, name: str, description: ~typing.Optional[str] = None, parameters: ~typing.Dict[str, ~typing.Any] = <factory>, required: ~typing.Optional[~typing.List[str]] = None)[source]#
Bases:
BaseModelDefines a function that can be called by the model.
- description: Optional[str]#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- name: str#
- parameters: Dict[str, Any]#
- required: Optional[List[str]]#
- class easydel.inference.openai_api_modules.ToolDefinition(*, type: str = 'function', function: FunctionDefinition)[source]#
Bases:
BaseModelDefines a tool that can be called by the model.
- function: FunctionDefinition#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- type: str#
- class easydel.inference.openai_api_modules.UsageInfo(*, prompt_tokens: int = 0, completion_tokens: Optional[int] = 0, total_tokens: int = 0, tokens_per_second: float = 0, processing_time: float = 0.0)[source]#
Bases:
BaseModelProvides information about token usage and processing time for a request.
- completion_tokens: Optional[int]#
- model_config: ClassVar[ConfigDict] = {}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- processing_time: float#
- prompt_tokens: int#
- tokens_per_second: float#
- total_tokens: int#