easydel.inference.esurge.request

Request management for the eSurge engine.

Defines the core request structures and status tracking for managing inference requests throughout their lifecycle.

Classes:

EngineRequest: Main request object for tracking generation
EngineRequestStatus: Enum of request statuses

Example

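>>> from easydel.inference.esurge.request import EngineRequest, EngineRequestStatus
>>> # `params` is a SamplingParams instance (or None) created beforehand.
>>> request = EngineRequest(
...     request_id="req_123",
...     prompt_token_ids=[1, 2, 3],
...     sampling_params=params,
...     eos_token_id=2,
... )
>>> request.status = EngineRequestStatus.RUNNING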
class easydel.inference.esurge.request.EngineRequest(request_id: str, prompt_token_ids: list[int], sampling_params: easydel.inference.sampling_params.SamplingParams | None, eos_token_id: int | None, client_index: int = 0, arrival_time: float | None = None, priority: int = 0, parent_request_id: str | None = None, sample_index: int = 0)

Bases: object

Request object for tracking generation through the engine.

Manages the state and metadata of a single inference request, including tokens, sampling parameters, and execution status.

request_id

Unique identifier for the request.

prompt_token_ids

Input token IDs.

sampling_params

Parameters controlling generation.

eos_token_id

End-of-sequence token ID.

client_index

Index of the client making the request.

arrival_time

Timestamp when the request arrived.

priority

Request priority for scheduling.

parent_request_id

ID of the parent request when sampling n > 1 completions (None when n = 1).

sample_index

Index of this sample (0 to n-1) when sampling n > 1 completions.

status

Current request status.

events

List of events recorded during processing.

stop_reason

Reason for stopping generation.

Example

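>>> from easydel.inference.esurge.request import EngineRequest
>>> # `sampling_params` is a SamplingParams instance (or None) created beforehand.
>>> request = EngineRequest(
...     request_id="req_123",
...     prompt_token_ids=[1, 2, 3],
...     sampling_params=sampling_params,
...     eos_token_id=2,
... )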
append_output_token_ids(token_ids: int | list[int]) -> None
classmethod from_engine_core_request(request: EngineCoreRequest) -> EngineRequest
get_finished_reason() -> easydel.inference.esurge.engine_types.FinishReason | None
is_finished() -> bool
property is_output_corrupted: bool
property num_output_tokens: int
property num_tokens: int
property num_tokens_with_spec: int
record_event(event_type: EngineCoreEventType, timestamp: float | None = None) -> None
take_events() -> list[easydel.inference.esurge.engine_types.EngineCoreEvent] | None
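
The signatures above suggest a simple bookkeeping model: generated tokens accumulate on the request, the token-count properties are derived from the stored lists, and events are buffered until they are drained. The following is a minimal standalone sketch of that model, not EasyDeL's implementation. `SketchRequest`, its fields, the `"SCHEDULED"` event name, and the exact semantics (that `num_tokens` counts prompt plus output tokens, and that `take_events` returns None when nothing is buffered) are assumptions inferred from the signatures alone.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field


@dataclass
class SketchRequest:
    """Hypothetical stand-in for EngineRequest; not the EasyDeL class."""

    request_id: str
    prompt_token_ids: list[int]
    output_token_ids: list[int] = field(default_factory=list)
    events: list[tuple[str, float]] = field(default_factory=list)

    def append_output_token_ids(self, token_ids: int | list[int]) -> None:
        # Accept one token or a batch, mirroring the `int | list[int]` signature.
        if isinstance(token_ids, int):
            token_ids = [token_ids]
        self.output_token_ids.extend(token_ids)

    @property
    def num_output_tokens(self) -> int:
        return len(self.output_token_ids)

    @property
    def num_tokens(self) -> int:
        # Assumption: the total counts prompt tokens plus generated tokens.
        return len(self.prompt_token_ids) + len(self.output_token_ids)

    def record_event(self, event_type: str, timestamp: float | None = None) -> None:
        # Assumption: a missing timestamp falls back to a monotonic clock reading.
        if timestamp is None:
            timestamp = time.monotonic()
        self.events.append((event_type, timestamp))

    def take_events(self) -> list[tuple[str, float]] | None:
        # Drain the buffer so each event is handed out at most once.
        if not self.events:
            return None
        drained, self.events = self.events, []
        return drained


req = SketchRequest(request_id="req_123", prompt_token_ids=[1, 2, 3])
req.append_output_token_ids(7)        # single token
req.append_output_token_ids([8, 9])   # batch of tokens
req.record_event("SCHEDULED", timestamp=0.0)  # hypothetical event name
first_drain = req.take_events()
second_drain = req.take_events()
```

In this sketch the second call to `take_events` finds an empty buffer, which is one plausible reason for the `| None` in the documented return type.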
class easydel.inference.esurge.request.EngineRequestStatus(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: IntEnum

Status of a request.

WAITING = 1
WAITING_FOR_FSM = 2
WAITING_FOR_REMOTE_KVS = 3
RUNNING = 4
PREEMPTED = 5
FINISHED_STOPPED = 6
FINISHED_LENGTH_CAPPED = 7
FINISHED_ABORTED = 8
FINISHED_IGNORED = 9
static get_finished_reason(status: EngineRequestStatus) -> easydel.inference.esurge.engine_types.FinishReason | None
static is_finished(status: EngineRequestStatus) -> bool
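
Because the enum is an IntEnum and all FINISHED_* members carry the highest values (6 to 9), a terminal-state check can be written as a single integer comparison. The sketch below rebuilds the documented values with the standard library to illustrate this. `SketchStatus` is a hypothetical stand-in, and the comparison against PREEMPTED is an assumption about how `is_finished` might work, not a statement about EasyDeL's source.

```python
from __future__ import annotations

from enum import IntEnum


class SketchStatus(IntEnum):
    """Mirrors the documented EngineRequestStatus values; not the EasyDeL class."""

    WAITING = 1
    WAITING_FOR_FSM = 2
    WAITING_FOR_REMOTE_KVS = 3
    RUNNING = 4
    PREEMPTED = 5
    FINISHED_STOPPED = 6
    FINISHED_LENGTH_CAPPED = 7
    FINISHED_ABORTED = 8
    FINISHED_IGNORED = 9

    @staticmethod
    def is_finished(status: SketchStatus) -> bool:
        # Assumption: every value above PREEMPTED is a terminal state.
        return status > SketchStatus.PREEMPTED


# Collect the members this check treats as terminal, in definition order.
terminal = [s.name for s in SketchStatus if SketchStatus.is_finished(s)]
running_done = SketchStatus.is_finished(SketchStatus.RUNNING)
```

A design like this depends on the numeric ordering staying intact: a new non-terminal status would need a value at or below PREEMPTED for the comparison to remain correct.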