easydel.inference.esurge.request#
Request management for the eSurge engine.
Defines the core request structures and status tracking for managing inference requests throughout their lifecycle.
- Classes:
EngineRequest: Main request object for tracking generation.
EngineRequestStatus: Enum of request statuses.
Example
>>> request = EngineRequest(
... request_id="req_123",
... prompt_token_ids=[1, 2, 3],
... sampling_params=params,
... eos_token_id=2
... )
>>> request.status = EngineRequestStatus.RUNNING
- class easydel.inference.esurge.request.EngineRequest(request_id: str, prompt_token_ids: list[int], sampling_params: easydel.inference.sampling_params.SamplingParams | None, eos_token_id: int | None, client_index: int = 0, arrival_time: float | None = None, priority: int = 0, parent_request_id: str | None = None, sample_index: int = 0)[source]#
Bases: object
Request object for tracking generation through the engine.
Manages the state and metadata of a single inference request, including tokens, sampling parameters, and execution status.
- request_id#
Unique identifier for the request.
- prompt_token_ids#
Input token IDs.
- sampling_params#
Parameters controlling generation.
- eos_token_id#
End-of-sequence token ID.
- client_index#
Index of the client making the request.
- arrival_time#
Timestamp when the request arrived.
- priority#
Request priority for scheduling.
- parent_request_id#
ID of parent request for n>1 sampling (None for n=1).
- sample_index#
Index of this sample (0 to n-1) for n>1 sampling.
- status#
Current request status.
- events#
List of events during processing.
- stop_reason#
Reason for stopping generation.
Example
>>> request = EngineRequest(
... request_id="req_123",
... prompt_token_ids=[1, 2, 3],
... sampling_params=sampling_params,
... eos_token_id=2
... )
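The `parent_request_id` and `sample_index` attributes describe how one n>1 request relates to its per-sample requests. The sketch below illustrates that relationship with a stand-in dataclass rather than the real `EngineRequest`. The names `FanOutStub` and `fan_out`, and the child ID format `"{parent}_{i}"`, are assumptions made for illustration and are not taken from the library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FanOutStub:
    """Hypothetical stand-in holding a subset of the documented fields."""

    request_id: str
    prompt_token_ids: list[int]
    parent_request_id: Optional[str] = None  # None for n=1
    sample_index: int = 0  # 0 to n-1 for n>1 sampling


def fan_out(request_id: str, prompt_token_ids: list[int], n: int) -> list[FanOutStub]:
    """Expand one logical request into n per-sample stubs (illustrative only)."""
    if n == 1:
        # A single sample has no parent and keeps sample_index 0.
        return [FanOutStub(request_id, list(prompt_token_ids))]
    return [
        FanOutStub(
            request_id=f"{request_id}_{i}",  # assumed naming scheme
            prompt_token_ids=list(prompt_token_ids),
            parent_request_id=request_id,
            sample_index=i,
        )
        for i in range(n)
    ]


children = fan_out("req_123", [1, 2, 3], n=3)
```

Each stub copies the prompt tokens so that per-sample state stays independent. Whether the engine shares or copies prompt tokens is not stated in this reference.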
- classmethod from_engine_core_request(request: EngineCoreRequest) EngineRequest[source]#
- get_finished_reason() easydel.inference.esurge.engine_types.FinishReason | None[source]#
- property is_output_corrupted: bool#
- property num_output_tokens: int#
- property num_tokens: int#
- property num_tokens_with_spec: int#
- record_event(event_type: EngineCoreEventType, timestamp: float | None = None) None[source]#
- take_events() list[easydel.inference.esurge.engine_types.EngineCoreEvent] | None[source]#
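The members above carry no descriptions, so the following is only a sketch of the behavior their names and signatures suggest. It uses stand-in classes (`EventStub`, `EventTrackingStub`), not the library's types. Three things are assumptions: that `num_tokens` counts prompt plus generated tokens, that an `output_token_ids` list exists, and that `take_events()` drains the event list and returns `None` when it is empty.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EventStub:
    """Hypothetical stand-in for EngineCoreEvent."""

    type: str
    timestamp: float


@dataclass
class EventTrackingStub:
    """Hypothetical stand-in mirroring a few EngineRequest members."""

    prompt_token_ids: list[int]
    output_token_ids: list[int] = field(default_factory=list)  # assumed field
    events: list[EventStub] = field(default_factory=list)

    @property
    def num_output_tokens(self) -> int:
        return len(self.output_token_ids)

    @property
    def num_tokens(self) -> int:
        # Assumption: total length is prompt tokens plus generated tokens.
        return len(self.prompt_token_ids) + len(self.output_token_ids)

    def record_event(self, event_type: str, timestamp: Optional[float] = None) -> None:
        # Fall back to the current monotonic clock when no timestamp is given.
        if timestamp is None:
            timestamp = time.monotonic()
        self.events.append(EventStub(event_type, timestamp))

    def take_events(self) -> Optional[list[EventStub]]:
        # Assumption: hand over the pending events and reset the list.
        if not self.events:
            return None
        taken, self.events = self.events, []
        return taken


req = EventTrackingStub(prompt_token_ids=[1, 2, 3])
req.record_event("QUEUED", timestamp=0.0)
req.output_token_ids.extend([7, 8])
drained = req.take_events()
```

Draining on read means a caller that polls repeatedly sees each event only once, which is consistent with the `list[...] | None` return type in the signature.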
- class easydel.inference.esurge.request.EngineRequestStatus(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)[source]#
Bases: IntEnum
Status of a request.
- FINISHED_ABORTED = 8#
- FINISHED_IGNORED = 9#
- FINISHED_LENGTH_CAPPED = 7#
- FINISHED_STOPPED = 6#
- PREEMPTED = 5#
- RUNNING = 4#
- WAITING = 1#
- WAITING_FOR_FSM = 2#
- WAITING_FOR_REMOTE_KVS = 3#
- static get_finished_reason(status: EngineRequestStatus) easydel.inference.esurge.engine_types.FinishReason | None[source]#
- static is_finished(status: EngineRequestStatus) bool[source]#
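In the values listed above, every `FINISHED_*` member is numerically greater than `PREEMPTED`. The sketch below mirrors those values in a stand-in enum and shows one way a finished check could use that ordering. `StatusSketch` and `is_finished_sketch` are hypothetical names, and the comparison is an assumption about how `is_finished` might work, not the library's confirmed implementation.

```python
from enum import IntEnum


class StatusSketch(IntEnum):
    """Stand-in copying the documented EngineRequestStatus values."""

    WAITING = 1
    WAITING_FOR_FSM = 2
    WAITING_FOR_REMOTE_KVS = 3
    RUNNING = 4
    PREEMPTED = 5
    FINISHED_STOPPED = 6
    FINISHED_LENGTH_CAPPED = 7
    FINISHED_ABORTED = 8
    FINISHED_IGNORED = 9


def is_finished_sketch(status: StatusSketch) -> bool:
    # Assumption: a status is terminal when it sorts after PREEMPTED.
    return status > StatusSketch.PREEMPTED


finished = [s.name for s in StatusSketch if is_finished_sketch(s)]
```

With the real classes, the equivalent call would presumably be `EngineRequestStatus.is_finished(request.status)`, as the static method signature indicates.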