easydel.inference.vinference.api_server.api_server_test

easydel.inference.vinference.api_server.api_server_test#

An example asynchronous client script for testing the vInference API server.

class easydel.inference.vinference.api_server.api_server_test.ChatCompletionClient(base_url: str)[source]#

Bases: object

An asynchronous client for interacting with the chat completion endpoint.

async create_chat_completion(messages: List[Dict[str, str]], model: str, stream: bool = True, **kwargs) → AsyncGenerator[Dict[str, Any], None][source]#

Sends a chat completion request to the server and streams the response.

Parameters

messages (tp.List[tp.Dict[str, str]]) – A list of message dictionaries, e.g., [{“role”: “user”, “content”: “Hello!”}].
model (str) – The name of the model to use.
stream (bool) – Whether to request a streaming response. Defaults to True.
**kwargs – Additional parameters to pass to the API (e.g., temperature, max_tokens).

Yields

tp.Dict[str, tp.Any] – Each chunk of the response as a dictionary.

Raises

Exception – If the server returns a non-200 status code.

async easydel.inference.vinference.api_server.api_server_test.main()[source]#: Main function to run the example chat completion interaction.

easydel.inference.vinference.api_server.api_server_test

Contents

easydel.inference.vinference.api_server.api_server_test#