easydel.trainers.prompt_utils

easydel.trainers.prompt_utils#

easydel.trainers.prompt_utils.apply_chat_template(example: dict[str, list[dict[str, str]]], tokenizer: Any, tools: Optional[list[Union[dict, Callable]]] = None) → dict[str, str][source]#

Apply a chat template to a conversational example along with the schema for a list of functions in tools.

For more details, see [maybe_apply_chat_template].

easydel.trainers.prompt_utils.convert_to_openai_format(input_data: Union[List[List[Dict[str, str]]], List[Dict[str, str]], Dict[str, str]]) → List[Dict[str, Union[str, List[Dict[str, str]]]]][source]#

Converts various input formats (list[list[dict]], list[dict], dict) into the OpenAI Chat Completions message list format.

If the input_data already conforms to the target OpenAIMessageList format (specifically with content as list of parts), it is returned directly.

Target Format Example for one message: {

“role”: “user”, “content”: [{“type”: “text”, “text”: “message content here”}]

}

Parameters: input_data – Data in one of the supported formats or already in the target OpenAIMessageList format. Keys like ‘role’, ‘content’, ‘text’, ‘message’ are searched case-insensitively within dictionaries during conversion.
Returns: A list of messages in the target OpenAI format. Returns an empty list if the input is invalid, cannot be parsed, results in no valid messages, or is an unsupported type. Returns the input directly if it already matches the target format.

easydel.trainers.prompt_utils.extract_prompt(example: dict[str, Sequence]) → dict[str, Sequence][source]#: Extracts the shared prompt from a preference data example, where the prompt is implicit within both the chosen and rejected completions.

easydel.trainers.prompt_utils.is_conversational(example: dict[str, Any]) → bool[source]#: Check if the example is in a conversational format.

easydel.trainers.prompt_utils.maybe_apply_chat_template(example: dict[str, list[dict[str, str]]], tokenizer: Any, tools: Optional[list[Union[dict, Callable]]] = None) → dict[str, str][source]#: If the example is in a conversational format, apply a chat template to it.

easydel.trainers.prompt_utils.maybe_extract_prompt(example: dict[str, list]) → dict[str, list][source]#: Extracts the shared prompt from a preference data example, where the prompt is implicit within both the chosen and rejected completions.

easydel.trainers.prompt_utils.maybe_unpair_preference_dataset(dataset: DatasetType, num_proc: Optional[int] = None, desc: Optional[str] = None) → DatasetType[source]#: Unpair a preference dataset if it is paired.

easydel.trainers.prompt_utils.reverse_openai_format(openai_messages: List[Dict[str, Union[str, List[Dict[str, str]]]]], content_key_name: str = 'content') → Optional[Union[Dict[str, str], List[Dict[str, str]]]][source]#

Converts a list of OpenAI Chat Completion messages back into simpler formats.

Input Format Example: [

{
“role”: “user”, “content”: [{“type”: “text”, “text”: “Hello AI.”}]

}, {

“role”: “assistant”, “content”: [{“type”: “text”, “text”: “Hello User!”}]

}

]

Output Format Examples: - If input has 1 message: {“role”: “user”, “content”: “Hello AI.”} - If input has >1 message: [

{“role”: “user”, “content”: “Hello AI.”}, {“role”: “assistant”, “content”: “Hello User!”}

]

If input is empty: []

Parameters

openai_messages – A list of messages in the OpenAI format.
content_key_name – The key name to use for the message text in the output dictionaries (e.g., “content”, “text”). Defaults to “content”.

Returns

A single dictionary if only one message was processed, a list of dictionaries if multiple messages were processed, an empty list if the input was empty, or None if the input list structure is invalid.

easydel.trainers.prompt_utils.unpair_preference_dataset(dataset: DatasetType, num_proc: Optional[int] = None, desc: Optional[str] = None) → DatasetType[source]#: Unpair a preference dataset.

easydel.trainers.prompt_utils

Contents

easydel.trainers.prompt_utils#