easydel.inference.vwhisper.utils
- easydel.inference.vwhisper.utils.chunk_iter_with_batch(audio_array: ndarray, chunk_length: int, stride_left: int, stride_right: int, batch_size: int, feature_extractor)
Split an audio array into overlapping chunks and yield them in batches.
- Parameters
audio_array – Input audio samples as a NumPy array
chunk_length – Length of each chunk, in samples
stride_left – Left stride (overlap on the left side of each chunk), in samples
stride_right – Right stride (overlap on the right side of each chunk), in samples
batch_size – Number of chunks per yielded batch
feature_extractor – Feature extractor applied to the audio chunks
- Yields
Batches of processed audio chunks
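The following is a minimal sketch of how overlapping chunks might be laid out and grouped into batches. It assumes each chunk start advances by chunk_length - stride_left - stride_right samples, as is common in Whisper-style long-form chunking. The helper name iter_chunk_batches is hypothetical, it yields only (start, end) index ranges, and it omits the feature extractor call; it is not EasyDeL's implementation.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_chunk_batches(
    audio: Sequence[float],
    chunk_length: int,
    stride_left: int,
    stride_right: int,
    batch_size: int,
) -> Iterator[List[Tuple[int, int]]]:
    """Yield batches of (start, end) sample ranges for overlapping chunks.

    Hypothetical helper for illustration only.
    """
    # Assumption: consecutive chunks overlap by stride_left + stride_right samples.
    step = chunk_length - stride_left - stride_right
    if step <= 0:
        raise ValueError("chunk_length must exceed stride_left + stride_right")

    batch: List[Tuple[int, int]] = []
    for start in range(0, len(audio), step):
        # The final chunks may be shorter than chunk_length.
        end = min(start + chunk_length, len(audio))
        batch.append((start, end))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the last, possibly smaller, batch.
        yield batch


audio = [0.0] * 10
batches = list(
    iter_chunk_batches(audio, chunk_length=6, stride_left=1, stride_right=1, batch_size=2)
)
```

With 10 samples, a chunk length of 6, and a stride of 1 on each side, the chunk starts fall at samples 0, 4, and 8, so neighbouring chunks share 2 samples.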
- easydel.inference.vwhisper.utils.process_audio_input(audio_input: Union[str, bytes, ndarray, Dict[str, Union[ndarray, int]]], feature_extractor)
Convert audio input into a NumPy array at the feature extractor's sampling rate.
- Parameters
audio_input – Input audio as a string, raw bytes, a NumPy array, or a dict of arrays and integers
feature_extractor – Feature extractor with sampling rate info
- Returns
Tuple of (audio_array, stride)
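As a rough illustration of normalizing mixed input types into an (audio_array, stride) pair, the sketch below handles only two cases: a plain sequence of samples and a dict. The dict keys "array", "sampling_rate", and "stride" are assumptions, as is the helper name normalize_audio_input. String and bytes inputs, and resampling, are left out because they need an audio decoding library.

```python
from typing import Any, Dict, List, Optional, Tuple, Union

AudioLike = Union[List[float], Dict[str, Any]]


def normalize_audio_input(
    audio_input: AudioLike,
    target_sampling_rate: int,
) -> Tuple[List[float], Optional[Tuple[int, int]]]:
    """Return (samples, stride) for list or dict input.

    Hypothetical helper for illustration only.
    """
    stride = None
    if isinstance(audio_input, dict):
        # Assumed keys: "array", "sampling_rate", and an optional "stride".
        samples = list(audio_input["array"])
        rate = audio_input["sampling_rate"]
        stride = audio_input.get("stride")
        if rate != target_sampling_rate:
            # A fuller implementation would resample here instead of failing.
            raise ValueError(f"expected {target_sampling_rate} Hz, got {rate} Hz")
    else:
        # Bare sequences are assumed to already be at the target rate.
        samples = list(audio_input)
    return samples, stride


samples, stride = normalize_audio_input(
    {"array": [0.0, 0.1, 0.2], "sampling_rate": 16000},
    target_sampling_rate=16000,
)
```

Raising on a sampling-rate mismatch keeps the sketch dependency-free; a real pipeline would more likely resample to the rate the feature extractor expects.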