easydel.inference.tools.parsers.hermes_tool_parser

class easydel.inference.tools.parsers.hermes_tool_parser.HermesToolParser(tokenizer: AutoTokenizer)

Bases: ToolParser

Tool call parser for Hermes models.

Handles tool calls wrapped in <tool_call> XML-style tags with JSON content. Designed for NousResearch Hermes models and similar architectures that use XML-style delimiters for function calling.

Format:

<tool_call>{"name": "function_name", "arguments": {…}}</tool_call>

Features:
  • XML-style token boundary detection (<tool_call> and </tool_call>)

  • Token-level buffering for accurate boundary detection

  • Supports multiple tool calls in a single response

  • Handles partial JSON parsing for streaming

  • Scratch pad support for intermediate reasoning

current_tool_name_sent

Whether the current tool's function name has already been sent in the stream

prev_tool_call_arr

Previously parsed tool calls, kept for comparison during streaming

current_tool_id

Index of the tool call currently being processed

streamed_args_for_tool

Arguments sent so far for each tool

tool_call_start_token

Opening delimiter for tool calls (<tool_call>)

tool_call_end_token

Closing delimiter for tool calls (</tool_call>)

buffered_delta_text

Buffer for multi-token delimiter detection

extract_tool_calls(model_output: str, request: ChatCompletionRequest) -> ExtractedToolCallInformation

Extract tool calls from complete model response.

Parses XML-style tool call tags and extracts JSON function calls. Supports multiple tool calls and returns remaining content.

Parameters
  • model_output – Complete model output containing tool calls

  • request – Original chat completion request (unused)

Returns

ExtractedToolCallInformation with:

  • tools_called: Whether tool calls were found

  • tool_calls: List of ToolCall objects

  • content: Text content before tool calls (if any)

Return type

ExtractedToolCallInformation

Example

Input: "Let me help. <tool_call>{"name": "search", "arguments": {"q": "weather"}}</tool_call>"

Output: tools_called=True, tool_calls=[ToolCall(…)], content="Let me help. "
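The extraction described above can be approximated with a regular expression and the standard json module. This is a minimal sketch, not the parser's actual implementation: the function name sketch_extract_tool_calls, the TOOL_CALL_RE pattern, and the plain-dict return value are assumptions made for illustration, standing in for ExtractedToolCallInformation and ToolCall.

```python
import json
import re

# Assumed pattern: capture the JSON payload between the two delimiters.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def sketch_extract_tool_calls(model_output: str) -> dict:
    """Hypothetical stand-in for extract_tool_calls; returns a plain dict."""
    payloads = TOOL_CALL_RE.findall(model_output)
    if not payloads:
        # No tool calls: the whole output is ordinary content.
        return {"tools_called": False, "tool_calls": [], "content": model_output}

    tool_calls = []
    for raw in payloads:
        call = json.loads(raw)
        tool_calls.append(
            {"name": call["name"], "arguments": json.dumps(call["arguments"])}
        )

    # Content is whatever precedes the first opening delimiter.
    content = model_output[: model_output.find("<tool_call>")]
    return {
        "tools_called": True,
        "tool_calls": tool_calls,
        "content": content or None,
    }


result = sketch_extract_tool_calls(
    'Let me help. <tool_call>{"name": "search", "arguments": {"q": "weather"}}</tool_call>'
)
print(result["tools_called"], repr(result["content"]))
```

Because findall collects every tagged payload, the same sketch handles several tool calls in one response. Malformed JSON raises json.JSONDecodeError here; how the real parser treats that case is not stated on this page.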

extract_tool_calls_streaming(previous_text: str, current_text: str, delta_text: str, previous_token_ids: Sequence[int], current_token_ids: Sequence[int], delta_token_ids: Sequence[int], request: ChatCompletionRequest) -> easydel.inference.openai_api_modules.DeltaMessage | None

Extract tool calls from streaming model output.

Handles incremental parsing of tool calls during streaming generation. Maintains state across calls to track partial tool calls and arguments. Uses buffering to handle multi-token delimiters correctly.

Parameters
  • previous_text – Text generated before this delta

  • current_text – Text including this delta

  • delta_text – New text in this streaming chunk

  • previous_token_ids – Token IDs before this delta

  • current_token_ids – Token IDs including this delta

  • delta_token_ids – New token IDs in this chunk

  • request – Original chat completion request

Returns

A DeltaMessage with incremental tool call updates or content, or None if more tokens are needed before anything can be parsed

State Management:
  • Tracks tool call boundaries with start/end token counts

  • Maintains current tool ID for multi-tool responses

  • Buffers partial arguments until complete

  • Handles transition between content and tool calls
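The relationship between the three text arguments, and the boundary tracking by delimiter counts, can be sketched as follows. Both helpers (iter_stream_texts and inside_open_tool_call) are hypothetical names introduced for illustration, and the sketch works on text only; the real method also receives the corresponding token ID sequences.

```python
from typing import Iterable, Iterator, Tuple

START, END = "<tool_call>", "</tool_call>"


def iter_stream_texts(chunks: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (previous_text, current_text, delta_text) for each chunk."""
    previous_text = ""
    for delta_text in chunks:
        current_text = previous_text + delta_text
        yield previous_text, current_text, delta_text
        previous_text = current_text


def inside_open_tool_call(text: str) -> bool:
    """True while more opening than closing delimiters have been seen."""
    return text.count(START) > text.count(END)


chunks = ["Let me help. ", "<tool_call>", '{"name": "search"', "}</tool_call>"]
steps = list(iter_stream_texts(chunks))
states = [inside_open_tool_call(current) for _, current, _ in steps]
print(states)
```

In this sketch a caller would emit ordinary content while the state is False and switch to tool call deltas once it turns True.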

tool_call_delta_buffer(delta_text: str) -> str

Buffer delta text to handle multi-token delimiters.

This method accumulates partial tokens that might form tool call delimiters, ensuring accurate boundary detection when delimiters span multiple tokens.

Parameters

delta_text – The new text delta from streaming

Returns

Processed text with complete delimiters, or an empty string while still buffering
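One way to picture this buffering is to hold back any trailing text that could still grow into a delimiter and release it once the ambiguity is resolved. The class below is a simplified sketch under that assumption; DelimiterBuffer and its methods are hypothetical and are not part of HermesToolParser.

```python
class DelimiterBuffer:
    """Hypothetical sketch: hold back text that may be a partial delimiter."""

    def __init__(self, delimiters=("<tool_call>", "</tool_call>")):
        self.delimiters = delimiters
        self.buffered = ""

    def _held_suffix_len(self, text: str) -> int:
        # Longest suffix of `text` that is a proper prefix of some delimiter.
        longest = 0
        for delim in self.delimiters:
            for n in range(1, len(delim)):
                if text.endswith(delim[:n]):
                    longest = max(longest, n)
        return longest

    def feed(self, delta_text: str) -> str:
        # Prepend what was held back, then split off the new ambiguous tail.
        text = self.buffered + delta_text
        hold = self._held_suffix_len(text)
        self.buffered = text[len(text) - hold:]
        return text[: len(text) - hold]


buf = DelimiterBuffer()
pieces = ["Hi <tool", "_call>", '{"a": 1}<', "/tool_call>"]
outputs = [buf.feed(piece) for piece in pieces]
print(outputs)
```

Each delimiter reaches the caller in one piece even though it arrived split across two deltas. A delta consisting only of a partial delimiter, such as "<tool", yields an empty string, matching the documented return value while buffering.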