easydel.inference.tools.parsers.llama_tool_parser


class easydel.inference.tools.parsers.llama_tool_parser.Llama3JsonToolParser(tokenizer: PreTrainedTokenizerBase)

Bases: ToolParser

Tool call parser for Llama 3.x and 4 models with JSON format.

Intended for use with the examples/tool_chat_template_llama.jinja template. Handles JSON-formatted tool calls with support for both single and multiple tool invocations separated by semicolons.

Format supported:

- Single: {"name": "func", "arguments": {…}}
- Multiple: {"name": "func1", …}; {"name": "func2", …}
- Uses the <|python_tag|> token as an optional marker
- Supports both "arguments" and "parameters" field names

Used when --enable-auto-tool-choice is set with --tool-call-parser llama3_json or llama4_json.
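The format described above can be illustrated with a minimal standalone sketch. It does not use EasyDeL and is not the parser's actual implementation: `split_tool_calls` and `BOT_TOKEN` are hypothetical names introduced here, and the only assumptions are the ones stated in the format list (an optional `<|python_tag|>` prefix, semicolon-separated JSON objects, and either "arguments" or "parameters" as the payload key).

```python
import json

# Optional marker named in the format description above.
BOT_TOKEN = "<|python_tag|>"


def split_tool_calls(text: str) -> list:
    """Parse semicolon-separated JSON tool calls (hypothetical helper).

    Returns a list of {"name": ..., "arguments": ...} dicts, accepting
    either "arguments" or "parameters" as the payload key.
    """
    body = text.strip()
    if body.startswith(BOT_TOKEN):
        body = body[len(BOT_TOKEN):]

    decoder = json.JSONDecoder()
    calls, pos = [], 0
    while pos < len(body):
        # Skip whitespace and the ';' separators between objects.
        if body[pos] in " \t\r\n;":
            pos += 1
            continue
        # raw_decode parses one JSON value and reports where it ended,
        # so semicolons inside string values are not treated as separators.
        obj, pos = decoder.raw_decode(body, pos)
        args = obj.get("arguments", obj.get("parameters", {}))
        calls.append({"name": obj["name"], "arguments": args})
    return calls


sample = (
    '<|python_tag|>{"name": "get_weather", "arguments": {"city": "Paris"}}; '
    '{"name": "get_time", "parameters": {"tz": "UTC"}}'
)
calls = split_tool_calls(sample)
```

Decoding object by object, rather than splitting the string on `;`, keeps the sketch correct when an argument value itself contains a semicolon.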

bot_token

Special token marking tool calls

Type: str

tool_call_regex

Pattern for extracting JSON tool calls

Type: re.Pattern

prev_tool_call_arr

Previous tool calls for streaming comparison

Type: list

extract_tool_calls(model_output: str, request: ChatCompletionRequest) → ExtractedToolCallInformation

Extract tool calls from complete Llama model response.

Extracts JSON content and ignores surrounding plain text. Supports both single JSON and multiple JSONs separated by semicolons. Handles both "arguments" and "parameters" field names for compatibility.

Parameters

model_output – Complete model output text
request – Original request (unused)

Returns

ExtractedToolCallInformation with parsed JSON tool calls
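One way to picture "extracts JSON content and ignores surrounding plain text" is the scan below. It is a sketch under stated assumptions, not EasyDeL's code: `ExtractionSketch` is a hypothetical stand-in for `ExtractedToolCallInformation` (its field names are assumed, not taken from the library), and `extract_from_text` is an illustrative helper that serializes arguments to a JSON string.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExtractionSketch:
    """Hypothetical stand-in for ExtractedToolCallInformation."""
    tools_called: bool
    tool_calls: list = field(default_factory=list)
    content: Optional[str] = None


def extract_from_text(model_output: str) -> ExtractionSketch:
    """Find JSON tool calls anywhere in the text (illustrative only)."""
    decoder = json.JSONDecoder()
    found, pos = [], 0
    while True:
        start = model_output.find("{", pos)
        if start == -1:
            break
        try:
            obj, end = decoder.raw_decode(model_output, start)
        except json.JSONDecodeError:
            pos = start + 1  # not valid JSON at this brace; keep scanning
            continue
        pos = end
        if isinstance(obj, dict) and "name" in obj:
            args = obj.get("arguments", obj.get("parameters", {}))
            found.append({"name": obj["name"], "arguments": json.dumps(args)})

    if not found:
        # No tool call detected: hand the text back as ordinary content.
        return ExtractionSketch(tools_called=False, content=model_output)
    return ExtractionSketch(tools_called=True, tool_calls=found)


result = extract_from_text(
    'Sure, calling it now: {"name": "add", "parameters": {"a": 1, "b": 2}} Done.'
)
plain = extract_from_text("No tools {here}.")
```

When nothing parses as a tool call, the sketch returns the original text as content so the caller can treat the response as a normal message.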

extract_tool_calls_streaming(previous_text: str, current_text: str, delta_text: str, previous_token_ids: Sequence[int], current_token_ids: Sequence[int], delta_token_ids: Sequence[int], request: ChatCompletionRequest) → easydel.inference.openai_api_modules.DeltaMessage | None

Extract tool calls from streaming model output.

Processes incremental model output to identify partial tool calls and emit appropriate streaming updates. Maintains state across calls to handle incomplete JSON structures.

Parameters

previous_text – Text accumulated up to previous call
current_text – Text accumulated including current chunk
delta_text – New text in current chunk
previous_token_ids – Token IDs up to previous call
current_token_ids – Token IDs including current chunk
delta_token_ids – New token IDs in current chunk
request – Original request with tool definitions

Returns

Incremental tool call update, or None if no update

Return type

DeltaMessage

Raises

NotImplementedError – Must be implemented by subclasses

Note

This method is stateful - it uses instance variables to track parsing progress across streaming chunks.
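The statefulness noted above can be sketched with a small standalone class. This is an illustration only, not the library's streaming logic: `StreamingSketch`, `balanced_prefix`, and the dict-shaped updates are hypothetical, and the sketch assumes a single tool call whose "name" key appears before its arguments.

```python
import re
from typing import Optional

NAME_RE = re.compile(r'"name"\s*:\s*"([^"]*)"')
ARGS_RE = re.compile(r'"(?:arguments|parameters)"\s*:\s*')


def balanced_prefix(text: str) -> str:
    """Return text up to and including the brace closing the first object."""
    depth, in_str, esc = 0, False, False
    for i, ch in enumerate(text):
        if in_str:
            if esc:
                esc = False
            elif ch == "\\":
                esc = True
            elif ch == '"':
                in_str = False
        elif ch == '"':
            in_str = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[: i + 1]
    return text  # object still open: everything so far belongs to it


class StreamingSketch:
    """Tracks what has already been emitted across chunks (hypothetical)."""

    def __init__(self) -> None:
        self.name_sent = False
        self.args_sent = 0  # argument characters already emitted

    def step(self, current_text: str) -> Optional[dict]:
        if not self.name_sent:
            m = NAME_RE.search(current_text)
            if m is None:
                return None  # name not complete yet
            self.name_sent = True
            return {"name": m.group(1)}
        m = ARGS_RE.search(current_text)
        if m is None:
            return None
        args_text = balanced_prefix(current_text[m.end():])
        new = args_text[self.args_sent:]
        if not new:
            return None
        self.args_sent = len(args_text)
        return {"arguments": new}


parser = StreamingSketch()
text, updates = "", []
for chunk in ['{"name": "get_', 'weather", "argu', 'ments": {"city": ', '"Paris"}}']:
    text += chunk
    updates.append(parser.step(text))
```

Each call receives the full accumulated text, and the instance variables record how much has been sent, so only the unseen portion of the arguments is emitted on each step.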
