easydel.inference.tools.parsers.llama_tool_parser
- class easydel.inference.tools.parsers.llama_tool_parser.Llama3JsonToolParser(tokenizer: PreTrainedTokenizerBase)
Bases: ToolParser
Tool call parser for Llama 3.x and 4 models with JSON format.
Intended for use with the examples/tool_chat_template_llama.jinja template. Handles JSON-formatted tool calls with support for both single and multiple tool invocations separated by semicolons.
Supported formats:
- Single: {"name": "func", "arguments": {…}}
- Multiple: {"name": "func1", …}; {"name": "func2", …}
- Uses the <|python_tag|> token as an optional marker
- Supports both "arguments" and "parameters" field names
Used when --enable-auto-tool-choice is set together with --tool-call-parser llama3_json or llama4_json.
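The payload shapes listed above can be illustrated with a minimal sketch. It is not the parser's actual implementation: `normalize_call` and `BOT_TOKEN` are hypothetical names, and the rule that "arguments" takes precedence when both keys are present is an assumption.

```python
import json

# Optional marker named in the description above.
BOT_TOKEN = "<|python_tag|>"


def normalize_call(raw: str) -> dict:
    """Parse one JSON tool call and return it with an "arguments" key.

    Hypothetical helper for illustration only.
    """
    text = raw.strip()
    # The marker is optional, so drop it only when it is present.
    if text.startswith(BOT_TOKEN):
        text = text[len(BOT_TOKEN):].lstrip()
    obj = json.loads(text)
    # Assumption: prefer "arguments", fall back to "parameters".
    args = obj.get("arguments", obj.get("parameters", {}))
    return {"name": obj["name"], "arguments": args}


single = normalize_call(
    '<|python_tag|>{"name": "get_weather", "parameters": {"city": "Paris"}}'
)
```

Here `single` carries the payload under "arguments" even though the model emitted "parameters".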
- bot_token
Special token marking tool calls
- Type
str
- tool_call_regex
Pattern for extracting JSON tool calls
- Type
re.Pattern
- prev_tool_call_arr
Previous tool calls for streaming comparison
- Type
list
- extract_tool_calls(model_output: str, request: ChatCompletionRequest) -> ExtractedToolCallInformation
Extract tool calls from complete Llama model response.
Extracts JSON content and ignores surrounding plain text. Supports both single JSON and multiple JSONs separated by semicolons. Handles both “arguments” and “parameters” field names for compatibility.
- Parameters
model_output – Complete model output text
request – Original request (unused)
- Returns
ExtractedToolCallInformation with parsed JSON tool calls
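As a rough sketch of the behaviour described above (JSON objects are picked out of the output while surrounding plain text is ignored), the following uses `json.JSONDecoder.raw_decode` from the standard library. This is a swapped-in technique: the parser itself is documented as using `tool_call_regex`, and `find_json_tool_calls` is a hypothetical name that is not part of EasyDeL.

```python
import json


def find_json_tool_calls(model_output: str) -> list[dict]:
    """Collect every JSON object with a "name" key found in the text."""
    decoder = json.JSONDecoder()
    calls = []
    pos = 0
    while True:
        start = model_output.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode returns the parsed value and the index just past it.
            obj, end = decoder.raw_decode(model_output, start)
        except json.JSONDecodeError:
            # Not valid JSON at this brace; keep scanning after it.
            pos = start + 1
            continue
        if isinstance(obj, dict) and "name" in obj:
            # Assumption: prefer "arguments", fall back to "parameters".
            args = obj.get("arguments", obj.get("parameters", {}))
            calls.append({"name": obj["name"], "arguments": args})
        # Skip past the whole object, so nested objects are not rescanned
        # and separators such as "; " are ignored.
        pos = end
    return calls


text = 'Sure. {"name": "a", "arguments": {"x": 1}}; {"name": "b", "parameters": {}} Done.'
calls = find_json_tool_calls(text)
```

For the sample `text`, `calls` contains two entries, one per JSON object, with the second normalized from "parameters" to "arguments".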
- extract_tool_calls_streaming(previous_text: str, current_text: str, delta_text: str, previous_token_ids: Sequence[int], current_token_ids: Sequence[int], delta_token_ids: Sequence[int], request: ChatCompletionRequest) -> easydel.inference.openai_api_modules.DeltaMessage | None
Extract tool calls from streaming model output.
Processes incremental model output to identify partial tool calls and emit appropriate streaming updates. Maintains state across calls to handle incomplete JSON/XML structures.
- Parameters
previous_text – Text accumulated up to previous call
current_text – Text accumulated including current chunk
delta_text – New text in current chunk
previous_token_ids – Token IDs up to previous call
current_token_ids – Token IDs including current chunk
delta_token_ids – New token IDs in current chunk
request – Original request with tool definitions
- Returns
Incremental tool call update, or None if no update
- Return type
easydel.inference.openai_api_modules.DeltaMessage | None
- Raises
NotImplementedError – Must be implemented by subclasses
Note
This method is stateful: it uses instance variables to track parsing progress across streaming chunks.
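The stateful pattern mentioned in the note can be sketched as below. This is only an illustration of tracking progress in instance variables between calls, not the parser's real logic: `StreamingDiffTracker`, `sent`, and `step` are hypothetical names, and the sketch returns plain strings rather than a `DeltaMessage`.

```python
from typing import Optional


class StreamingDiffTracker:
    """Hypothetical helper that remembers what has already been emitted."""

    def __init__(self) -> None:
        self.sent = ""  # text already streamed to the client

    def step(self, current_text: str) -> Optional[str]:
        """Return the unsent suffix of current_text, or None if nothing is new."""
        if not current_text.startswith(self.sent):
            # Defensive reset in case the accumulated text was rewritten.
            self.sent = ""
        delta = current_text[len(self.sent):]
        if not delta:
            return None
        self.sent = current_text
        return delta


tracker = StreamingDiffTracker()
first = tracker.step('{"name"')
repeat = tracker.step('{"name"')
second = tracker.step('{"name": "f"')
```

Because the state lives on the instance, a separate tracker (or parser) would be needed for each concurrent stream.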