easydel.inference.vinference.api_server.api_server_test#
An example asynchronous client script for testing the vInference API server.
- class easydel.inference.vinference.api_server.api_server_test.ChatCompletionClient(base_url: str)[source]#
Bases:
objectAn asynchronous client for interacting with the chat completion endpoint.
- async create_chat_completion(messages: List[Dict[str, str]], model: str, stream: bool = True, **kwargs) AsyncGenerator[Dict[str, Any], None][source]#
Sends a chat completion request to the server and streams the response.
- Parameters
messages (tp.List[tp.Dict[str, str]]) – A list of message dictionaries, e.g., [{“role”: “user”, “content”: “Hello!”}].
model (str) – The name of the model to use.
stream (bool) – Whether to request a streaming response. Defaults to True.
**kwargs – Additional parameters to pass to the API (e.g., temperature, max_tokens).
- Yields
tp.Dict[str, tp.Any] – Each chunk of the response as a dictionary.
- Raises
Exception – If the server returns a non-200 status code.