easydel.inference.evaluations.esurge_eval#

class easydel.inference.evaluations.esurge_eval.eSurgeLMEvalAdapter(surge: eSurge, processor: Any, max_length: int = 8192, max_new_tokens: int = 2048, top_p: float = 0.95, temperature: float = 0.0, batch_size: int | None = None)[source]#

Bases: object

Adapter for EasyDeL models to be compatible with lm-evaluation-harness.

This class inherits from lm_eval.api.model.LM to ensure compatibility with the harness, allowing EasyDeL models to be evaluated using the lm-evaluation-harness framework. It wraps an eSurge instance for efficient inference with advanced features like smart bytecode decoding and context management.

apply_chat_template(messages, add_generation_prompt: bool)[source]#

Apply chat template to messages.

This method is required by lm_eval for chat-based evaluations.

Parameters

messages – List of message dictionaries with ‘role’ and ‘content’ keys

Returns

String with the formatted chat template applied

property batch_size#

Get the batch size.

property device#

Get the device (CPU/GPU).

property eot_token_id#

Get the end-of-text token ID.

generate_until(instances)[source]#

Generate text until a specified set of stop sequences is reached for each instance.

This method is part of the lm-evaluation-harness LM interface.

Parameters

instances – List of Instance objects from lm-evaluation-harness. Each instance is expected to contain the prompt as the first argument and an optional dictionary as the second argument with a ‘until’ key containing a list of stop sequences.

Returns

List of generated strings, one for each instance.

greedy_until(requests)[source]#

Generate completions for prompts until a stop sequence is reached using greedy decoding.

This method is part of the lm-evaluation-harness LM interface. It currently raises NotImplementedError as its functionality is covered by generate_until.

Parameters

requests – List of (context, stopping_sequences) tuples.

Returns

List of generated completions.

Raises

NotImplementedError – This method is not implemented as generate_until provides similar functionality.

loglikelihood(instances)[source]#

Compute log-likelihood of completions given contexts.

This method is part of the lm-evaluation-harness LM interface. It currently provides a placeholder implementation, especially for non-multiple-choice tasks.

Parameters

instances – List of Instance objects from lm-evaluation-harness. For multiple-choice tasks, instances are expected to have context and continuation.

Returns

List of (log_likelihood, is_greedy) tuples.

For multiple-choice tasks, log-likelihood is high if the extracted choice matches the continuation, low otherwise. For other tasks, a placeholder value is returned.

loglikelihood_rolling(instances)[source]#

Calculate log-likelihood of token sequences in a rolling fashion.

This method is part of the lm-evaluation-harness LM interface. It currently provides a placeholder implementation as actual rolling log-likelihood calculation might not be directly supported by the current eSurge setup.

Parameters

instances – List of Instance objects from lm-evaluation-harness. Instances are expected to contain the token sequence as the first argument.

Returns

List of lists of (loglikelihood, is_greedy) pairs, one inner list per instance. Each inner list contains pairs for each token in the sequence (except the first). Currently returns placeholder values.

property max_gen_toks#

Get the maximum number of tokens to generate.

property max_length#

Get the maximum context length.

stop()[source]#

Stop the eSurge engine.

Terminates the underlying eSurge scheduler thread.

tok_decode(tokens)[source]#

Decode token IDs into a string.

Parameters

tokens – A list or tensor of token IDs.

Returns

The decoded string.

tok_encode(string: str)[source]#

Encode a string into token IDs.

Parameters

string – The input string.

Returns

A list of token IDs.

property tokenizer_name#

Get the tokenizer name for chat template support.

Returns the name or path of the tokenizer/model being used. This is required by lm_eval for proper chat template handling.