easydel.inference.tools.parsers.hermes_tool_parser
- class easydel.inference.tools.parsers.hermes_tool_parser.HermesToolParser(tokenizer: AutoTokenizer)
Bases: ToolParser
Tool call parser for Hermes models.
Handles tool calls wrapped in <tool_call> XML-style tags with JSON content. Designed for NousResearch Hermes models and similar architectures that use XML-style delimiters for function calling.
- Format:
<tool_call>{"name": "function_name", "arguments": {…}}</tool_call>
- Features:
XML-style token boundary detection (<tool_call> and </tool_call>)
Token-level buffering for accurate boundary detection
Supports multiple tool calls in a single response
Handles partial JSON parsing for streaming
Scratch pad support for intermediate reasoning
- current_tool_name_sent
Whether the current tool's function name has already been sent in the stream
- prev_tool_call_arr
Tool calls parsed on the previous step, kept for comparison during streaming
- current_tool_id
Index of the tool call currently being processed
- streamed_args_for_tool
Arguments sent so far for each tool call
- tool_call_start_token
Opening delimiter for tool calls (<tool_call>)
- tool_call_end_token
Closing delimiter for tool calls (</tool_call>)
- buffered_delta_text
Buffer used to detect delimiters that span multiple tokens
- extract_tool_calls(model_output: str, request: ChatCompletionRequest) → ExtractedToolCallInformation
Extract tool calls from complete model response.
Parses XML-style tool call tags and extracts JSON function calls. Supports multiple tool calls and returns remaining content.
- Parameters
model_output – Complete model output containing tool calls
request – Original chat completion request (unused)
- Returns
ExtractedToolCallInformation with:
tools_called: Whether tool calls were found
tool_calls: List of ToolCall objects
content: Text content before tool calls (if any)
- Return type
ExtractedToolCallInformation
Example
Input: 'Let me help. <tool_call>{"name": "search", "arguments": {"q": "weather"}}</tool_call>'
Output: tools_called=True, tool_calls=[ToolCall(…)], content="Let me help. "
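The extraction described above can be approximated with a regular expression over the complete output. The following is a minimal standalone sketch, not EasyDeL's implementation: `extract_tool_calls_sketch` and `TOOL_CALL_RE` are hypothetical names, and the result is a plain dict that only mirrors the three documented fields rather than a real ExtractedToolCallInformation.

```python
import json
import re

# Hypothetical pattern for the documented delimiters (assumption, not EasyDeL code).
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls_sketch(model_output: str) -> dict:
    """Return a dict mirroring tools_called / tool_calls / content."""
    matches = TOOL_CALL_RE.findall(model_output)
    if not matches:
        # No tool calls: the whole output is ordinary content.
        return {"tools_called": False, "tool_calls": [], "content": model_output}

    tool_calls = []
    for raw in matches:
        call = json.loads(raw)  # each tag body is one JSON object
        tool_calls.append(
            {
                "name": call["name"],
                # Arguments are re-serialized to a JSON string here (a design choice
                # of this sketch, following OpenAI-style responses).
                "arguments": json.dumps(call.get("arguments", {})),
            }
        )

    # Content is whatever precedes the first opening tag.
    content = model_output[: model_output.find("<tool_call>")]
    return {"tools_called": True, "tool_calls": tool_calls, "content": content or None}


result = extract_tool_calls_sketch(
    'Let me help. <tool_call>{"name": "search", "arguments": {"q": "weather"}}</tool_call>'
)
print(result["tools_called"], result["content"], result["tool_calls"][0]["name"])
```

Because the pattern is non-greedy and applied with `findall`, several `<tool_call>` blocks in one response each yield their own entry in `tool_calls`.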
- extract_tool_calls_streaming(previous_text: str, current_text: str, delta_text: str, previous_token_ids: Sequence[int], current_token_ids: Sequence[int], delta_token_ids: Sequence[int], request: ChatCompletionRequest) → easydel.inference.openai_api_modules.DeltaMessage | None
Extract tool calls from streaming model output.
Handles incremental parsing of tool calls during streaming generation. Maintains state across calls to track partial tool calls and arguments. Uses buffering to handle multi-token delimiters correctly.
- Parameters
previous_text – Text generated before this delta
current_text – Text including this delta
delta_text – New text in this streaming chunk
previous_token_ids – Token IDs before this delta
current_token_ids – Token IDs including this delta
delta_token_ids – New token IDs in this chunk
request – Original chat completion request
- Returns
DeltaMessage with incremental tool call updates or content, or None if more tokens are needed before anything can be emitted
- State Management:
Tracks tool call boundaries with start/end token counts
Maintains current tool ID for multi-tool responses
Buffers partial arguments until complete
Handles transition between content and tool calls
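The boundary tracking listed above can be illustrated by counting delimiters in the previous and current text. This is a simplified sketch under stated assumptions: it counts delimiter strings rather than token IDs, and `stream_phase` and its phase labels are hypothetical names that do not appear in EasyDeL.

```python
START, END = "<tool_call>", "</tool_call>"


def stream_phase(previous_text: str, current_text: str) -> str:
    """Classify the current streaming step from start/end delimiter counts."""
    prev_starts = previous_text.count(START)
    prev_ends = previous_text.count(END)
    cur_starts = current_text.count(START)
    cur_ends = current_text.count(END)

    if cur_starts == 0:
        return "content"  # no tool call has been opened yet
    if cur_starts > cur_ends:
        # A tool call is open; it is new if this delta introduced the start tag.
        return "tool_call_started" if cur_starts > prev_starts else "tool_call_in_progress"
    if cur_ends > prev_ends:
        return "tool_call_closed"  # this delta completed the closing tag
    return "content"  # all tool calls are closed; trailing text


steps = ["", "Hello ", "Hello <tool_call>", 'Hello <tool_call>{"name": "f"}',
         'Hello <tool_call>{"name": "f"}</tool_call>']
phases = [stream_phase(prev, cur) for prev, cur in zip(steps, steps[1:])]
print(phases)
```

In this sketch, the index of the tool call being processed would simply be `current_text.count(START) - 1`, which is one way to picture the role of current_tool_id in multi-tool responses.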
- tool_call_delta_buffer(delta_text: str) → str
Buffer delta text to handle multi-token delimiters.
This method accumulates partial tokens that might form tool call delimiters, ensuring accurate boundary detection when delimiters span multiple tokens.
- Parameters
delta_text – The new text delta from streaming
- Returns
The processed text with any delimiters complete, or an empty string while the delta is still being buffered
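One way to picture this kind of buffering is to hold back any trailing text that could still grow into a delimiter and release it once the ambiguity is resolved. The sketch below is an illustration of that idea only; `DelimiterBuffer` and `feed` are hypothetical names, and the actual logic in tool_call_delta_buffer may differ.

```python
START, END = "<tool_call>", "</tool_call>"


class DelimiterBuffer:
    """Hold back trailing text that might be the start of a delimiter."""

    def __init__(self, delimiters=(START, END)):
        self.delimiters = delimiters
        self.buffer = ""

    def _held_back(self, text: str) -> int:
        """Length of the longest suffix of text that is a proper prefix of a delimiter."""
        longest = 0
        for delim in self.delimiters:
            for k in range(min(len(delim) - 1, len(text)), 0, -1):
                if text.endswith(delim[:k]):
                    longest = max(longest, k)
                    break
        return longest

    def feed(self, delta_text: str) -> str:
        """Return text that is safe to process; empty string while still buffering."""
        text = self.buffer + delta_text
        keep = self._held_back(text)
        self.buffer = text[len(text) - keep:] if keep else ""
        return text[: len(text) - keep]


buf = DelimiterBuffer()
chunks = ["Hi <tool", "_call>", '{"name": "f"}', "</", "tool_call>"]
out = [buf.feed(chunk) for chunk in chunks]
print(out)
```

Here the partial `<tool` is withheld until `_call>` arrives, so downstream code only ever sees whole delimiters, and no text is lost: joining the outputs reproduces the joined input.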