easydel.inference.evaluations.esurge_eval#
- class easydel.inference.evaluations.esurge_eval.eSurgeLMEvalAdapter(surge: eSurge, processor: Any, max_length: int = 8192, max_new_tokens: int = 2048, top_p: float = 0.95, temperature: float = 0.0, batch_size: int | None = None)[source]#
Bases:
objectAdapter for EasyDeL models to be compatible with lm-evaluation-harness.
This class inherits from lm_eval.api.model.LM to ensure compatibility with the harness, allowing EasyDeL models to be evaluated using the lm-evaluation-harness framework. It wraps an eSurge instance for efficient inference with advanced features like smart bytecode decoding and context management.
- apply_chat_template(messages, add_generation_prompt: bool)[source]#
Apply chat template to messages.
This method is required by lm_eval for chat-based evaluations.
- Parameters
messages – List of message dictionaries with ‘role’ and ‘content’ keys
- Returns
String with the formatted chat template applied
- property batch_size#
Get the batch size.
- property device#
Get the device (CPU/GPU).
- property eot_token_id#
Get the end-of-text token ID.
- generate_until(instances)[source]#
Generate text until a specified set of stop sequences is reached for each instance.
This method is part of the lm-evaluation-harness LM interface.
- Parameters
instances – List of Instance objects from lm-evaluation-harness. Each instance is expected to contain the prompt as the first argument and an optional dictionary as the second argument with a ‘until’ key containing a list of stop sequences.
- Returns
List of generated strings, one for each instance.
- greedy_until(requests)[source]#
Generate completions for prompts until a stop sequence is reached using greedy decoding.
This method is part of the lm-evaluation-harness LM interface. It currently raises NotImplementedError as its functionality is covered by generate_until.
- Parameters
requests – List of (context, stopping_sequences) tuples.
- Returns
List of generated completions.
- Raises
NotImplementedError – This method is not implemented as generate_until provides similar functionality.
- loglikelihood(instances)[source]#
Compute log-likelihood of completions given contexts.
This method is part of the lm-evaluation-harness LM interface. It currently provides a placeholder implementation, especially for non-multiple-choice tasks.
- Parameters
instances – List of Instance objects from lm-evaluation-harness. For multiple-choice tasks, instances are expected to have context and continuation.
- Returns
- List of (log_likelihood, is_greedy) tuples.
For multiple-choice tasks, log-likelihood is high if the extracted choice matches the continuation, low otherwise. For other tasks, a placeholder value is returned.
- loglikelihood_rolling(instances)[source]#
Calculate log-likelihood of token sequences in a rolling fashion.
This method is part of the lm-evaluation-harness LM interface. It currently provides a placeholder implementation as actual rolling log-likelihood calculation might not be directly supported by the current eSurge setup.
- Parameters
instances – List of Instance objects from lm-evaluation-harness. Instances are expected to contain the token sequence as the first argument.
- Returns
List of lists of (loglikelihood, is_greedy) pairs, one inner list per instance. Each inner list contains pairs for each token in the sequence (except the first). Currently returns placeholder values.
- property max_gen_toks#
Get the maximum number of tokens to generate.
- property max_length#
Get the maximum context length.
- tok_decode(tokens)[source]#
Decode token IDs into a string.
- Parameters
tokens – A list or tensor of token IDs.
- Returns
The decoded string.
- tok_encode(string: str)[source]#
Encode a string into token IDs.
- Parameters
string – The input string.
- Returns
A list of token IDs.
- property tokenizer_name#
Get the tokenizer name for chat template support.
Returns the name or path of the tokenizer/model being used. This is required by lm_eval for proper chat template handling.