easydel.inference.tools.parsers.hermes_tool_parser

class easydel.inference.tools.parsers.hermes_tool_parser.HermesToolParser(tokenizer: AutoTokenizer)

Bases: ToolParser

Tool call parser for Hermes models.

Handles tool calls wrapped in XML-style <tool_call> tags whose body is a JSON object. Designed for NousResearch Hermes models and other models that use XML-style delimiters for function calling.

Format:

<tool_call>{"name": "function_name", "arguments": {…}}</tool_call>
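
As a rough illustration of this wire format, the snippet below builds a Hermes-style tool-call string with the standard json module. The helper name format_tool_call is hypothetical and not part of EasyDeL; it only shows what a well-formed payload looks like.

```python
import json


def format_tool_call(name: str, arguments: dict) -> str:
    """Hypothetical helper: wrap a JSON function call in Hermes-style tags."""
    # The tag body is a single JSON object with "name" and "arguments" keys.
    payload = json.dumps({"name": name, "arguments": arguments})
    return f"<tool_call>{payload}</tool_call>"


print(format_tool_call("search", {"q": "weather"}))
# <tool_call>{"name": "search", "arguments": {"q": "weather"}}</tool_call>
```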

Features:

XML-style token boundary detection (<tool_call> and </tool_call>)
Token-level buffering for accurate boundary detection
Supports multiple tool calls in a single response
Handles partial JSON parsing for streaming
Scratch pad support for intermediate reasoning

current_tool_name_sent: Tracks if function name was sent in stream

prev_tool_call_arr: Previous tool calls for streaming comparison

current_tool_id: Index of current tool being processed

streamed_args_for_tool: Arguments sent so far for each tool

tool_call_start_token: Opening delimiter for tool calls

tool_call_end_token: Closing delimiter for tool calls

buffered_delta_text: Buffer for multi-token delimiter detection
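
To make the role of these attributes more concrete, here is a minimal sketch that groups them in a dataclass. The class name HermesStreamState, the field types, the default values (for example -1 for current_tool_id), and the start_next_tool method are all assumptions for illustration; the actual parser may initialize and update them differently.

```python
from dataclasses import dataclass, field


@dataclass
class HermesStreamState:
    """Hypothetical stand-in for the parser's streaming attributes."""

    current_tool_name_sent: bool = False  # assumed: a boolean flag
    prev_tool_call_arr: list = field(default_factory=list)
    current_tool_id: int = -1  # assumed: -1 means "no tool seen yet"
    streamed_args_for_tool: list = field(default_factory=list)
    tool_call_start_token: str = "<tool_call>"
    tool_call_end_token: str = "</tool_call>"
    buffered_delta_text: str = ""

    def start_next_tool(self) -> None:
        """Advance to the next tool call and reset its per-tool state."""
        self.current_tool_id += 1
        self.current_tool_name_sent = False
        self.streamed_args_for_tool.append("")


state = HermesStreamState()
state.start_next_tool()
print(state.current_tool_id)  # 0
```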

extract_tool_calls(model_output: str, request: ChatCompletionRequest) → ExtractedToolCallInformation

Extract tool calls from complete model response.

Parses XML-style tool call tags and extracts the JSON function calls inside them. Supports multiple tool calls and returns any text content that precedes them.

Parameters

model_output – Complete model output containing tool calls
request – Original chat completion request (unused)

Returns

ExtractedToolCallInformation with:

tools_called: Whether tool calls were found
tool_calls: List of ToolCall objects
content: Text content before tool calls (if any)

Return type

ExtractedToolCallInformation

Example

Input: 'Let me help. <tool_call>{"name": "search", "arguments": {"q": "weather"}}</tool_call>'
Output: tools_called=True, tool_calls=[ToolCall(…)], content="Let me help. "
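
The behavior described above can be approximated outside EasyDeL with a regular expression and the json module. The following is a minimal sketch, not the library's implementation: the function name extract_hermes_tool_calls is hypothetical, it returns a plain tuple of dicts rather than ExtractedToolCallInformation and ToolCall objects, and it silently skips malformed JSON, which is an assumption about error handling.

```python
import json
import re

# Non-greedy match so that several tool calls in one response stay separate.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_hermes_tool_calls(model_output: str):
    """Return (tools_called, tool_calls, content) for a complete response."""
    calls = []
    for match in TOOL_CALL_RE.finditer(model_output):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # assumption: ignore tool calls whose JSON is malformed
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {"name": payload["name"], "arguments": payload.get("arguments", {})}
            )
    if not calls:
        # No valid tool call: the whole output is ordinary content.
        return False, [], model_output
    # Content is whatever precedes the first opening tag.
    content = model_output[: model_output.find("<tool_call>")]
    return True, calls, content or None


text = 'Let me help. <tool_call>{"name": "search", "arguments": {"q": "weather"}}</tool_call>'
print(extract_hermes_tool_calls(text))
# (True, [{'name': 'search', 'arguments': {'q': 'weather'}}], 'Let me help. ')
```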

extract_tool_calls_streaming(previous_text: str, current_text: str, delta_text: str, previous_token_ids: Sequence[int], current_token_ids: Sequence[int], delta_token_ids: Sequence[int], request: ChatCompletionRequest) → easydel.inference.openai_api_modules.DeltaMessage | None

Extract tool calls from streaming model output.

Handles incremental parsing of tool calls during streaming generation. Maintains state across calls to track partial tool calls and arguments. Uses buffering to handle multi-token delimiters correctly.

Parameters

previous_text – Text generated before this delta
current_text – Text including this delta
delta_text – New text in this streaming chunk
previous_token_ids – Token IDs before this delta
current_token_ids – Token IDs including this delta
delta_token_ids – New token IDs in this chunk
request – Original chat completion request

Returns

DeltaMessage with incremental tool call updates or content, or None if more tokens are needed before anything can be parsed

State Management:

Tracks tool call boundaries with start/end token counts
Maintains current tool ID for multi-tool responses
Buffers partial arguments until complete
Handles transition between content and tool calls
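
One way to picture the boundary tracking listed above is to compare how many start and end delimiters appear before and after a delta. The sketch below is a simplified illustration under that assumption: the function classify_delta and its string labels are hypothetical, it works on text rather than token IDs, and it does not emit DeltaMessage objects or diff partial arguments.

```python
START, END = "<tool_call>", "</tool_call>"


def classify_delta(previous_text: str, current_text: str) -> str:
    """Label what the latest delta did, based on delimiter counts."""
    prev_start, prev_end = previous_text.count(START), previous_text.count(END)
    cur_start, cur_end = current_text.count(START), current_text.count(END)

    if cur_start > prev_start:
        return "tool_started"  # a new opening delimiter arrived in this delta
    if cur_end > prev_end:
        return "tool_ended"  # a closing delimiter arrived in this delta
    if cur_start > cur_end:
        return "tool_in_progress"  # inside an unclosed tool call
    return "content"  # ordinary text outside any tool call


prev = "Hello <tool_call>"
print(classify_delta("Hello ", prev))            # tool_started
print(classify_delta(prev, prev + '{"name"'))    # tool_in_progress
```

If a single delta contains both an opening and a closing delimiter, this sketch reports only "tool_started"; a fuller implementation would need to handle that case explicitly.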

tool_call_delta_buffer(delta_text: str) → str

Buffer delta text to handle multi-token delimiters.

This method accumulates partial tokens that might form tool call delimiters, ensuring accurate boundary detection when delimiters span multiple tokens.

Parameters

delta_text – The new text delta from streaming

Returns

Processed text with complete delimiters, or an empty string while buffering
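
The buffering idea can be sketched as holding back any trailing text that might be the beginning of a delimiter until the next delta resolves it. This is a simplified variation, not the method's actual logic: the class DelimiterBuffer and its feed and flush methods are hypothetical names, and unlike the documented method (which returns an empty string while buffering) this version releases the text that is already known to be safe.

```python
class DelimiterBuffer:
    """Hypothetical buffer that withholds possible delimiter prefixes."""

    def __init__(self, delimiters=("<tool_call>", "</tool_call>")):
        self.delimiters = tuple(delimiters)
        self.pending = ""  # text held back from the previous delta

    def _held_suffix_len(self, text: str) -> int:
        """Length of the longest suffix that is a proper delimiter prefix."""
        longest = max(len(d) for d in self.delimiters) - 1
        for size in range(min(longest, len(text)), 0, -1):
            tail = text[-size:]
            if any(len(tail) < len(d) and d.startswith(tail) for d in self.delimiters):
                return size
        return 0

    def feed(self, delta_text: str) -> str:
        """Add a delta; return the text that is safe to process now."""
        text = self.pending + delta_text
        hold = self._held_suffix_len(text)
        self.pending = text[len(text) - hold:]
        return text[: len(text) - hold]

    def flush(self) -> str:
        """Release whatever is still held, e.g. at end of generation."""
        text, self.pending = self.pending, ""
        return text


buf = DelimiterBuffer()
print(repr(buf.feed("Hi <tool")))  # 'Hi '
print(repr(buf.feed("_call>")))    # '<tool_call>'
```

Holding back only the ambiguous suffix keeps latency low for ordinary content while still guaranteeing that a delimiter is never split across two emitted chunks.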
