easydel.inference.vinference.api_server.api_client

A client for interacting with the vInference API server, mimicking OpenAI’s API structure.

exception easydel.inference.vinference.api_server.api_client.vInferenceAPIError(status_code: int, message: str, response_content: Optional[str] = None)

Bases: Exception

Custom exception class for vInference API errors.
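A minimal sketch of how a caller might inspect such an error. `APIErrorSketch` is a hypothetical stand-in that only copies the constructor arguments shown above; whether the real class exposes them as attributes of the same names is an assumption, and `describe` is a helper invented for illustration.

```python
from typing import Optional


class APIErrorSketch(Exception):
    """Hypothetical stand-in mirroring the documented constructor arguments."""

    def __init__(self, status_code: int, message: str, response_content: Optional[str] = None):
        super().__init__(f"HTTP {status_code}: {message}")
        # Assumption: the arguments are kept as attributes of the same names.
        self.status_code = status_code
        self.message = message
        self.response_content = response_content


def describe(err: APIErrorSketch) -> str:
    """Build a one-line log message; include the raw body only when present."""
    line = f"status={err.status_code} message={err.message}"
    if err.response_content is not None:
        # Truncate the body so a large error page does not flood the log.
        line += f" body={err.response_content[:200]}"
    return line


try:
    raise APIErrorSketch(503, "server overloaded", '{"error": "busy"}')
except APIErrorSketch as err:
    summary = describe(err)
```

Keeping the raw response body optional lets the same exception type cover both HTTP error statuses and response-parsing failures.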

class easydel.inference.vinference.api_server.api_client.vInferenceChatCompletionClient(base_url: str, max_retries: int = 5, timeout: float = 30.0)

Bases: object

Client for interacting with the vInference Chat Completion API endpoint.

This client handles communication with the vInference server, including sending requests, handling responses (streaming or non-streaming), managing retries, and parsing errors.
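The page does not say how retries are scheduled, so the following is only a sketch of one common way a `max_retries` setting can be honoured: a bounded loop with exponential backoff. `call_with_retries`, `base_delay`, and the use of `ConnectionError` as the transient failure are all assumptions for illustration, not the client's actual behaviour.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(send: Callable[[], T], max_retries: int = 5, base_delay: float = 0.0) -> T:
    """Call `send` until it succeeds or `max_retries` attempts have failed."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries):
        try:
            return send()
        except ConnectionError as exc:  # stand-in for a transient network failure
            last_error = exc
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(f"gave up after {max_retries} attempts") from last_error


# Demo: a callable that fails twice and then succeeds.
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky, max_retries=5)
```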

create_chat_completion(request: ChatCompletionRequest, extra_headers: Optional[dict] = None) → Generator[Union[ChatCompletionStreamResponse, ChatCompletionResponse], None, None]

Sends a chat completion request to the vInference API.

Handles both streaming and non-streaming responses based on the stream attribute in the request object.

Parameters

request (ChatCompletionRequest) – The chat completion request object.
extra_headers (Optional[dict]) – Optional dictionary of extra headers to include in the request. Defaults to None.

Yields

Union[ChatCompletionStreamResponse, ChatCompletionResponse] – For streaming requests, yields ChatCompletionStreamResponse objects for each chunk received. For non-streaming requests, yields a single ChatCompletionResponse object.

Raises

vInferenceAPIError – If the API returns an error status code or if there’s an issue parsing the response.
requests.RequestException – For underlying network connection issues.
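Because the method yields either many chunk objects or one full response, calling code can use a single loop for both modes. The sketch below illustrates that contract with stand-in types: `StreamChunk`, `FullResponse`, their `delta` and `content` fields, and `fake_create_chat_completion` are hypothetical and do not reflect the real response schemas.

```python
from dataclasses import dataclass
from typing import Generator, Iterable, List, Union


@dataclass
class StreamChunk:  # hypothetical stand-in for ChatCompletionStreamResponse
    delta: str


@dataclass
class FullResponse:  # hypothetical stand-in for ChatCompletionResponse
    content: str


def fake_create_chat_completion(stream: bool) -> Generator[Union[StreamChunk, FullResponse], None, None]:
    """Mimic the documented contract: many chunks when streaming, one object otherwise."""
    if stream:
        for piece in ("Hel", "lo", "!"):
            yield StreamChunk(delta=piece)
    else:
        yield FullResponse(content="Hello!")


def collect_text(responses: Iterable[Union[StreamChunk, FullResponse]]) -> str:
    """Join text from either mode by dispatching on the yielded type."""
    parts: List[str] = []
    for item in responses:
        if isinstance(item, StreamChunk):
            parts.append(item.delta)
        else:
            parts.append(item.content)
    return "".join(parts)


streamed = collect_text(fake_create_chat_completion(stream=True))
single = collect_text(fake_create_chat_completion(stream=False))
```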

class easydel.inference.vinference.api_server.api_client.vInferenceClient(base_url: str, max_retries: int = 5, timeout: float = 30.0)

Bases: object

Unified client for interacting with all vInference API endpoints.

This client provides access to both chat completions and text completions through a single interface.
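This page does not list the attributes of the unified client, so the following is only a sketch of the general facade pattern such a class could follow: one constructor that forwards shared settings to two endpoint-specific helpers. The attribute names `chat` and `completions`, the class names, and the OpenAI-style paths are assumptions.

```python
from dataclasses import dataclass


@dataclass
class EndpointClientSketch:
    """Hypothetical per-endpoint client holding the shared connection settings."""

    base_url: str
    path: str
    max_retries: int = 5
    timeout: float = 30.0

    @property
    def url(self) -> str:
        # Normalise a trailing slash so the joined URL has exactly one separator.
        return self.base_url.rstrip("/") + self.path


class UnifiedClientSketch:
    """Hypothetical facade: one object exposing both endpoint clients."""

    def __init__(self, base_url: str, max_retries: int = 5, timeout: float = 30.0):
        # Attribute names and paths are assumptions, not taken from the library.
        self.chat = EndpointClientSketch(base_url, "/v1/chat/completions", max_retries, timeout)
        self.completions = EndpointClientSketch(base_url, "/v1/completions", max_retries, timeout)


client = UnifiedClientSketch("http://localhost:8000/", timeout=10.0)
```

Composing two small clients keeps the retry and timeout settings in one place while leaving each endpoint's request logic separate.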

class easydel.inference.vinference.api_server.api_client.vInferenceCompletionClient(base_url: str, max_retries: int = 5, timeout: float = 30.0)

Bases: object

Client for interacting with the vInference Completion API endpoint.

This client handles communication with the vInference server for text completions, supporting both streaming and non-streaming modes.

create_completion(request: CompletionRequest, extra_headers: Optional[dict] = None) → Generator[Union[CompletionStreamResponse, CompletionResponse], None, None]

Sends a text completion request to the vInference API.

Handles both streaming and non-streaming responses based on the stream attribute in the request object.

Parameters

request (CompletionRequest) – The completion request object.
extra_headers (Optional[dict]) – Optional dictionary of extra headers to include in the request. Defaults to None.

Yields

Union[CompletionStreamResponse, CompletionResponse] – For streaming requests, yields CompletionStreamResponse objects for each chunk received. For non-streaming requests, yields a single CompletionResponse object.

Raises

vInferenceAPIError – If the API returns an error status code or if there’s an issue parsing the response.
requests.RequestException – For underlying network connection issues.
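Since the return type is a generator even in non-streaming mode, the single response still has to be pulled out with `next()` or a loop. The sketch below shows this with a hypothetical stand-in, `fake_create_completion`, and a plain dict in place of CompletionResponse. It also shows that a generator function's body does not run until iteration starts; whether the real method defers the HTTP request in the same way depends on its implementation and is an assumption here.

```python
from typing import Dict, Generator, List

calls: List[str] = []


def fake_create_completion(prompt: str) -> Generator[Dict[str, str], None, None]:
    """Hypothetical stand-in: a non-streaming call that yields exactly one response."""
    calls.append(prompt)  # executes only once iteration starts
    yield {"text": prompt.upper()}


gen = fake_create_completion("hi")
started_before_next = len(calls)  # the generator body has not run yet
response = next(gen)              # pulls the single response
leftover = next(gen, None)        # the generator is now exhausted
```

Passing a default to the second `next()` call avoids a `StopIteration` when the generator has nothing more to yield.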
