easydel.inference.vinference.api_server.api_client
A client for interacting with the vInference API server, mimicking OpenAI’s API structure.
- exception easydel.inference.vinference.api_server.api_client.vInferenceAPIError(status_code: int, message: str, response_content: Optional[str] = None)
Bases: Exception
Custom exception class for vInference API errors.
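A minimal sketch of how an exception with this constructor signature could be defined and handled. The class below is a stand-in written for illustration, not EasyDeL's source: storing the arguments as same-named attributes and the message format are assumptions, and `describe_error` is a hypothetical helper.

```python
from typing import Optional


class vInferenceAPIError(Exception):
    """Stand-in mirroring the documented constructor signature (illustrative only)."""

    def __init__(self, status_code: int, message: str, response_content: Optional[str] = None):
        # Assumption: the arguments are kept as attributes of the same names.
        self.status_code = status_code
        self.message = message
        self.response_content = response_content
        super().__init__(f"vInference API error {status_code}: {message}")


def describe_error(err: vInferenceAPIError) -> str:
    """Hypothetical helper: build a short log line from the fields carried by the error."""
    detail = f" (body: {err.response_content})" if err.response_content else ""
    return f"HTTP {err.status_code}: {err.message}{detail}"


try:
    raise vInferenceAPIError(503, "server overloaded", response_content='{"error": "busy"}')
except vInferenceAPIError as err:
    summary = describe_error(err)
```

Carrying the status code and raw response body on the exception lets callers log or branch on them without re-parsing the message string.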
- class easydel.inference.vinference.api_server.api_client.vInferenceChatCompletionClient(base_url: str, max_retries: int = 5, timeout: float = 30.0)
Bases: object
Client for interacting with the vInference Chat Completion API endpoint.
This client handles communication with the vInference server, including sending requests, handling responses (streaming or non-streaming), managing retries, and parsing errors.
- create_chat_completion(request: ChatCompletionRequest, extra_headers: Optional[dict] = None) → Generator[Union[ChatCompletionStreamResponse, ChatCompletionResponse], None, None]
Sends a chat completion request to the vInference API.
Handles both streaming and non-streaming responses based on the stream attribute in the request object.
- Parameters
request (ChatCompletionRequest) – The chat completion request object.
extra_headers (tp.Optional[dict]) – Optional dictionary of extra headers to include in the request. Defaults to None.
- Yields
tp.Union[ChatCompletionStreamResponse, ChatCompletionResponse] – For streaming requests, yields ChatCompletionStreamResponse objects for each chunk received. For non-streaming requests, yields a single ChatCompletionResponse object.
- Raises
vInferenceAPIError – If the API returns an error status code or if there’s an issue parsing the response.
requests.RequestException – For underlying network connection issues.
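Because the method is documented as a generator in both modes, a non-streaming call still has to be iterated (or advanced with `next()`) to obtain its single response. The sketch below illustrates that contract with stand-ins only: `FakeStreamChunk`, `FakeResponse`, their `text` field, and `fake_create_chat_completion` are hypothetical and do not reflect the real response schemas.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Union


@dataclass
class FakeStreamChunk:
    """Hypothetical stand-in for ChatCompletionStreamResponse."""
    text: str


@dataclass
class FakeResponse:
    """Hypothetical stand-in for ChatCompletionResponse."""
    text: str


def fake_create_chat_completion(stream: bool) -> Iterator[Union[FakeStreamChunk, FakeResponse]]:
    """Imitates the documented contract: several chunks when streaming, otherwise one response."""
    if stream:
        for piece in ("Hel", "lo", "!"):
            yield FakeStreamChunk(piece)
    else:
        yield FakeResponse("Hello!")


def collect_text(responses: Iterable[Union[FakeStreamChunk, FakeResponse]]) -> str:
    """Join the text of every yielded item; works the same for both modes."""
    parts: List[str] = []
    for item in responses:
        parts.append(item.text)
    return "".join(parts)


streamed = collect_text(fake_create_chat_completion(stream=True))
# Non-streaming: the generator yields exactly one object, so next() retrieves it.
single = next(fake_create_chat_completion(stream=False))
```

Writing the consumer as a loop keeps the calling code identical whether or not streaming is enabled; `next()` is a shortcut only when the request is known to be non-streaming.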
- class easydel.inference.vinference.api_server.api_client.vInferenceClient(base_url: str, max_retries: int = 5, timeout: float = 30.0)
Bases: object
Unified client for interacting with all vInference API endpoints.
This client provides access to both chat completions and text completions through a single interface.
- class easydel.inference.vinference.api_server.api_client.vInferenceCompletionClient(base_url: str, max_retries: int = 5, timeout: float = 30.0)
Bases: object
Client for interacting with the vInference Completion API endpoint.
This client handles communication with the vInference server for text completions, supporting both streaming and non-streaming modes.
- create_completion(request: CompletionRequest, extra_headers: Optional[dict] = None) → Generator[Union[CompletionStreamResponse, CompletionResponse], None, None]
Sends a text completion request to the vInference API.
Handles both streaming and non-streaming responses based on the stream attribute in the request object.
- Parameters
request (CompletionRequest) – The completion request object.
extra_headers (tp.Optional[dict]) – Optional dictionary of extra headers to include in the request. Defaults to None.
- Yields
tp.Union[CompletionStreamResponse, CompletionResponse] – For streaming requests, yields CompletionStreamResponse objects for each chunk received. For non-streaming requests, yields a single CompletionResponse object.
- Raises
vInferenceAPIError – If the API returns an error status code or if there’s an issue parsing the response.
requests.RequestException – For underlying network connection issues.
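One practical consequence of the generator return type, sketched under an assumption: if `create_completion` is written as an ordinary Python generator function (suggested by its signature, but not confirmed here), its body does not run until the first item is requested, so errors would surface during iteration rather than at the call itself. The names `FakeAPIError`, `fake_create_completion`, and `safe_collect` below are hypothetical stand-ins, not part of the library.

```python
from typing import Iterator, List, Optional, Tuple


class FakeAPIError(Exception):
    """Hypothetical stand-in for vInferenceAPIError."""


def fake_create_completion(fail: bool) -> Iterator[str]:
    """Stand-in generator function: its body runs only once iteration starts."""
    if fail:
        raise FakeAPIError("HTTP 500")
    yield "chunk-1"
    yield "chunk-2"


def safe_collect(fail: bool) -> Tuple[List[str], Optional[str]]:
    """Return (chunks, error_message); the try block spans the whole loop."""
    chunks: List[str] = []
    # Calling the generator function raises nothing, even when fail=True.
    responses = fake_create_completion(fail)
    try:
        for chunk in responses:
            chunks.append(chunk)
    except FakeAPIError as err:
        return chunks, str(err)
    return chunks, None


ok_chunks, ok_error = safe_collect(fail=False)
bad_chunks, bad_error = safe_collect(fail=True)
```

Placing the `try` around the loop rather than around the call alone also covers failures that occur partway through a stream, after some chunks have already been received.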