savant_rs.deepstream

class savant_rs.deepstream.ComputeMode

Compute backend for transform operations.

  • DEFAULT – VIC on Jetson, dGPU on x86_64 (default).

  • GPU – always use GPU compute.

  • VIC – VIC hardware (Jetson only, raises error on dGPU).

DEFAULT = ComputeMode.DEFAULT
GPU = ComputeMode.GPU
VIC = ComputeMode.VIC
class savant_rs.deepstream.DsNvBufSurfaceGstBuffer

RAII guard for an NvBufSurface-backed GstBuffer.

Wraps a GStreamer buffer and automatically unrefs it when the Python object is garbage-collected. Use ptr to obtain the raw pointer for interop with functions that accept raw addresses, and take to transfer ownership out of the guard.

static from_ptr(ptr, add_ref=True)

Wrap a raw GstBuffer* pointer in a guard.

Parameters:
  • ptr (int) – Raw GstBuffer* pointer address.

  • add_ref (bool) – If True (default) an additional reference is taken — use for borrowed pointers (pad probes, callbacks). If False the guard assumes ownership of an existing reference — use for pointers obtained via the legacy int-returning API.

Raises:

ValueError – If ptr is 0 (null).

ptr

Raw GstBuffer* pointer address.

Raises:

RuntimeError – If the buffer has been consumed via take.

take()

Transfer ownership out of the guard and return the raw pointer.

After this call the guard is empty — ptr will raise and the destructor becomes a no-op.

Returns:

Raw GstBuffer* pointer (caller owns the reference).

Return type:

int

Raises:

RuntimeError – If already consumed.
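
The ownership rules above can be modelled in a few lines of plain Python. ToyGuard below is a hypothetical stand-in, not part of savant_rs; it only mirrors the documented contract (null pointers are rejected, ptr raises once the guard is consumed, take empties the guard) and performs no GStreamer reference counting.

```python
class ToyGuard:
    """Hypothetical stand-in that mirrors the documented guard contract."""

    def __init__(self, ptr: int):
        if ptr == 0:
            raise ValueError("ptr is 0 (null)")
        self._ptr = ptr

    @property
    def ptr(self) -> int:
        if self._ptr is None:
            raise RuntimeError("buffer has been consumed via take")
        return self._ptr

    def take(self) -> int:
        raw = self.ptr      # raises RuntimeError if already consumed
        self._ptr = None    # the guard is now empty
        return raw


def is_consumed(guard) -> bool:
    """Return True if reading guard.ptr raises RuntimeError."""
    try:
        guard.ptr
    except RuntimeError:
        return True
    return False


guard = ToyGuard(0x7F0000001000)  # made-up address
peeked = guard.ptr    # borrow: the guard still owns the reference
owned = guard.take()  # transfer: the caller now owns the reference
```

After take, the caller is responsible for the reference, for example by handing the pointer to a pipeline or calling release_buffer.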

class savant_rs.deepstream.DsNvNonUniformSurfaceBuffer(max_batch_size, gpu_id=0)

Zero-copy heterogeneous batch (nvstreammux2-style).

Assembles individual NvBufSurface buffers of arbitrary dimensions and pixel formats into a single batched GstBuffer.

Parameters:
  • max_batch_size (int) – Maximum number of surfaces in the batch.

  • gpu_id (int) – GPU device ID (default 0).

Raises:

RuntimeError – If batch creation fails.

add(src_buf, id=None)

Add a source buffer to the batch (zero-copy).

The source buffer’s NvBufSurface is appended to the batch without copying pixel data.

Parameters:
  • src_buf (int) – Raw GstBuffer* pointer of the source NVMM surface.

  • id (int | None) – Optional frame identifier stored in SavantIdMeta. When None, the id is inherited from the source buffer’s existing SavantIdMeta (if any).

Raises:
  • ValueError – If src_buf is 0 (null).

  • RuntimeError – If the batch is already finalized or full.

as_gst_buffer()

Return the underlying GstBuffer guard. Available only after finalize.

Returns:

Guard for the finalized batch buffer.

Return type:

DsNvBufSurfaceGstBuffer

Raises:

RuntimeError – If not yet finalized.

extract_slot_view(slot_index)

Create a zero-copy single-frame view of one filled slot.

Available only after finalize.

Raises:

RuntimeError – If the batch is not yet finalized or slot_index is out of bounds.

finalize()

Finalize the batch (non-consuming).

Writes SavantIdMeta with the collected frame IDs and assembles the heterogeneous NvBufSurface. Call as_gst_buffer afterward to access the buffer.

Raises:

RuntimeError – If already finalized.

gpu_id
is_finalized
max_batch_size
num_filled
slot_ptr(index)

Return (data_ptr, pitch, width, height) for a slot by index.

Parameters:

index (int) – Zero-based slot index (0 .. num_filled - 1).

Returns:

(data_ptr, pitch, width, height) — CUDA device pointer, row stride, and the slot’s native dimensions in pixels.

Return type:

tuple[int, int, int, int]

Raises:

RuntimeError – If not yet finalized or index is out of bounds.
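
Because every slot in a heterogeneous batch has its own pitch and dimensions, byte offsets have to be computed per slot. The sketch below assumes a packed single-plane format with a fixed number of bytes per pixel (4 for RGBA). pixel_offset is a hypothetical helper, and the tuple is made-up data in the (data_ptr, pitch, width, height) layout that slot_ptr returns.

```python
def pixel_offset(pitch: int, width: int, height: int,
                 x: int, y: int, bytes_per_pixel: int = 4) -> int:
    """Byte offset of pixel (x, y) from the start of a pitched surface."""
    if not (0 <= x < width and 0 <= y < height):
        raise IndexError(f"pixel ({x}, {y}) is outside {width}x{height}")
    # Rows are `pitch` bytes apart; pitch may exceed width * bytes_per_pixel.
    return y * pitch + x * bytes_per_pixel


# Made-up values, not read from a real surface.
data_ptr, pitch, width, height = 0x10000, 2048, 500, 300
addr = data_ptr + pixel_offset(pitch, width, height, x=10, y=2)
```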

class savant_rs.deepstream.DsNvSurfaceBufferGenerator(format, width, height, fps_num=30, fps_den=1, gpu_id=0, mem_type=None, pool_size=4)

Pool-backed generator of single-frame NvBufSurface buffers (Python wrapper for the native DsNvSurfaceBufferGenerator).

Parameters:
  • format (VideoFormat | str) – Video format.

  • width (int) – Frame width in pixels.

  • height (int) – Frame height in pixels.

  • fps_num (int) – Framerate numerator (default 30).

  • fps_den (int) – Framerate denominator (default 1).

  • gpu_id (int) – GPU device ID (default 0).

  • mem_type (MemType | int) – Memory type (default MemType.DEFAULT).

  • pool_size (int) – Buffer pool size (default 4).

acquire_surface(id=None)

Acquire a new NvBufSurface buffer from the pool.

Returns:

Guard owning the acquired buffer.

Return type:

GstBuffer

acquire_surface_with_params(pts_ns, duration_ns, id=None)

Acquire a buffer and stamp PTS and duration on it.

Convenience wrapper around acquire_surface() + set_buffer_pts() + set_buffer_duration().

Parameters:
  • pts_ns (int) – Presentation timestamp in nanoseconds.

  • duration_ns (int) – Frame duration in nanoseconds.

  • id (int or None) – Optional buffer ID / frame index.

Returns:

Guard owning the acquired buffer.

Return type:

GstBuffer
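
For a constant-framerate stream, pts_ns and duration_ns can be derived from the frame index and the fps_num / fps_den pair. This is a minimal sketch; frame_timing is a hypothetical helper and not part of savant_rs.

```python
from fractions import Fraction

NS_PER_SECOND = 1_000_000_000


def frame_timing(frame_index: int, fps_num: int = 30, fps_den: int = 1):
    """Return (pts_ns, duration_ns) for a constant-framerate stream."""
    frame_period = Fraction(NS_PER_SECOND * fps_den, fps_num)
    # Derive each PTS from the index so rounding errors do not accumulate.
    pts_ns = int(frame_period * frame_index)
    duration_ns = int(frame_period * (frame_index + 1)) - pts_ns
    return pts_ns, duration_ns


pts_ns, duration_ns = frame_timing(3)  # fourth frame at 30 fps
```

The two values could then be passed as the pts_ns and duration_ns arguments of acquire_surface_with_params.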

acquire_surface_with_ptr(id=None)

Acquire a buffer and return (GstBuffer, data_ptr, pitch).

create_surface(gst_buffer_dest, id=None)

Create a new NvBufSurface and attach it to the given buffer.

format
height
nvmm_caps_str()

Return the NVMM caps string for configuring an appsrc.

push_to_appsrc(appsrc_ptr, pts_ns, duration_ns, id=None)

Push a new NVMM buffer to an AppSrc element.

static send_eos(appsrc_ptr)

Send an end-of-stream signal to an AppSrc element.

transform(src_buf, config, id=None, src_rect=None)

Transform (scale + letterbox) a source buffer into a new destination.

transform_with_ptr(src_buf, config, id=None, src_rect=None)

Like transform() but also returns (GstBuffer, data_ptr, pitch).

width
class savant_rs.deepstream.DsNvUniformSurfaceBuffer

Pool-allocated batched NvBufSurface with per-slot fill tracking.

Obtained from DsNvUniformSurfaceBufferGenerator.acquire_batched_surface. Fill individual slots with fill_slot, then call finalize, then as_gst_buffer to access the buffer.

as_gst_buffer()

Return the underlying GstBuffer guard. Available only after finalize.

Returns:

Guard for the finalized batched buffer.

Return type:

DsNvBufSurfaceGstBuffer

Raises:

RuntimeError – If not yet finalized.

extract_slot_view(slot_index)

Create a zero-copy single-frame view of one filled slot.

Available only after finalize.

Raises:

RuntimeError – If the batch is not yet finalized or slot_index is out of bounds.

fill_slot(src_buf, src_rect=None, id=None)

Transform a source buffer into the next available batch slot.

The source surface is scaled (with optional letterboxing) into the destination slot according to the TransformConfig that was passed to acquire_batched_surface. The same source buffer may be used for several slots with different src_rect regions.

Parameters:
  • src_buf (int) – Raw GstBuffer* pointer of the source NVMM surface (as returned by DsNvSurfaceBufferGenerator.acquire_surface).

  • src_rect (Rect | None) – Optional crop rectangle applied to the source before scaling. When None the full source frame is used. Coordinates are (top, left, width, height) in pixels.

  • id (int | None) – Optional frame identifier stored in SavantIdMeta. When None, the id is inherited from the source buffer’s existing SavantIdMeta (if any).

Raises:
  • ValueError – If src_buf is 0 (null).

  • RuntimeError – If the batch is already finalized, the batch is full, or the GPU transform fails.
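
One way to use a single source buffer for several slots is to tile the frame into a grid of crop regions. The sketch below only computes the regions, in the (top, left, width, height) order documented for src_rect. tile_rects is a hypothetical helper and not part of savant_rs.

```python
def tile_rects(frame_width: int, frame_height: int, cols: int, rows: int):
    """Split a frame into a cols x rows grid of (top, left, width, height)."""
    rects = []
    for row in range(rows):
        top = row * frame_height // rows
        bottom = (row + 1) * frame_height // rows
        for col in range(cols):
            left = col * frame_width // cols
            right = (col + 1) * frame_width // cols
            rects.append((top, left, right - left, bottom - top))
    return rects


crops = tile_rects(1920, 1080, cols=2, rows=2)
```

Each tuple could be expanded into Rect(*crop) and passed as src_rect, one fill_slot call per tile, provided the number of tiles does not exceed max_batch_size.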

finalize()

Finalize the batch (non-consuming).

Writes SavantIdMeta with the collected frame IDs and sets numFilled on the underlying NvBufSurface. Call as_gst_buffer afterward to access the buffer.

Raises:

RuntimeError – If already finalized.

is_finalized
max_batch_size
num_filled
slot_ptr(index)

Return (data_ptr, pitch) for a slot by index.

Parameters:

index (int) – Zero-based slot index (0 .. max_batch_size - 1).

Returns:

(data_ptr, pitch) — CUDA device pointer and row stride in bytes.

Return type:

tuple[int, int]

Raises:

RuntimeError – If index is out of bounds.

class savant_rs.deepstream.DsNvUniformSurfaceBufferGenerator(format, width, height, max_batch_size, pool_size=2, fps_num=30, fps_den=1, gpu_id=0, mem_type=None)

Homogeneous batched NvBufSurface buffer generator.

Produces buffers whose surfaceList is an array of independently fillable GPU surfaces, all sharing the same pixel format and dimensions.

Parameters:
  • format (VideoFormat | str) – Pixel format (e.g. "RGBA").

  • width (int) – Slot width in pixels.

  • height (int) – Slot height in pixels.

  • max_batch_size (int) – Maximum number of slots per batch.

  • pool_size (int) – Number of pre-allocated batched buffers (default 2).

  • fps_num (int) – Framerate numerator (default 30).

  • fps_den (int) – Framerate denominator (default 1).

  • gpu_id (int) – GPU device ID (default 0).

  • mem_type (MemType | None) – Memory type (default MemType.DEFAULT).

Raises:

RuntimeError – If pool creation fails.

acquire_batched_surface(config)

Acquire a DsNvUniformSurfaceBuffer from the pool, ready for slot filling.

Parameters:

config (TransformConfig) – Scaling / letterboxing configuration applied to every fill_slot call on the returned surface.

Returns:

A fresh batched surface with num_filled == 0.

Return type:

DsNvUniformSurfaceBuffer

Raises:

RuntimeError – If the pool is exhausted.

format
gpu_id
height
max_batch_size
width
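
The documented call order for a uniform batch (acquire, fill, finalize, access) can be sketched as a small function. build_batch and the two stub classes are hypothetical: the stubs only record call order so the example runs without a GPU. With the real library, generator would be a DsNvUniformSurfaceBufferGenerator and config a TransformConfig.

```python
def build_batch(generator, config, sources):
    """Fill one batch from (frame_id, src_buf) pairs and return its guard."""
    batch = generator.acquire_batched_surface(config)
    for frame_id, src_buf in sources:
        batch.fill_slot(src_buf, src_rect=None, id=frame_id)  # full frame
    batch.finalize()  # as_gst_buffer is only available after finalize
    return batch.as_gst_buffer()


class StubBatch:
    """Hypothetical stand-in that records which methods were called."""

    def __init__(self):
        self.calls = []

    def fill_slot(self, src_buf, src_rect=None, id=None):
        self.calls.append(("fill_slot", id))

    def finalize(self):
        self.calls.append(("finalize", None))

    def as_gst_buffer(self):
        self.calls.append(("as_gst_buffer", None))
        return "guard"


class StubGenerator:
    """Hypothetical stand-in for the batch generator."""

    def __init__(self):
        self.batch = StubBatch()

    def acquire_batched_surface(self, config):
        return self.batch


stub = StubGenerator()
result = build_batch(stub, config=None, sources=[(1, "buf-a"), (2, "buf-b")])
```
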
class savant_rs.deepstream.DstPadding(left=0, top=0, right=0, bottom=0)

Optional per-side destination padding for letterboxing.

When set in TransformConfig.dst_padding, reduces the effective destination area before the letterbox rect is computed.

bottom
left
right
top
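
How per-side padding shrinks the destination area can be sketched with simple arithmetic. effective_dst_area is a hypothetical helper. The validation rule (rejecting padding that leaves no area) is an assumption, not documented library behaviour.

```python
def effective_dst_area(dst_width: int, dst_height: int,
                       left: int = 0, top: int = 0,
                       right: int = 0, bottom: int = 0):
    """Return (x, y, width, height) of the area left after per-side padding."""
    width = dst_width - left - right
    height = dst_height - top - bottom
    if width <= 0 or height <= 0:
        # Assumption: padding that consumes the whole destination is an error.
        raise ValueError("padding leaves no destination area")
    return left, top, width, height


area = effective_dst_area(640, 640, top=20, bottom=20)
```
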
class savant_rs.deepstream.Interpolation

Interpolation method for scaling.

  • NEAREST – nearest-neighbor.

  • BILINEAR – bilinear (the default used by TransformConfig).

  • ALGO1 – GPU: cubic, VIC: 5-tap.

  • ALGO2 – GPU: super, VIC: 10-tap.

  • ALGO3 – GPU: Lanczos, VIC: smart.

  • ALGO4 – GPU: (ignored), VIC: nicest.

  • DEFAULT – GPU: nearest, VIC: nearest.

ALGO1 = Interpolation.ALGO1
ALGO2 = Interpolation.ALGO2
ALGO3 = Interpolation.ALGO3
ALGO4 = Interpolation.ALGO4
BILINEAR = Interpolation.BILINEAR
DEFAULT = Interpolation.DEFAULT
NEAREST = Interpolation.NEAREST
class savant_rs.deepstream.MemType

NvBufSurface memory type.

  • DEFAULT — CUDA Device for dGPU, Surface Array for Jetson.

  • CUDA_PINNED — CUDA Host (pinned) memory.

  • CUDA_DEVICE — CUDA Device memory.

  • CUDA_UNIFIED — CUDA Unified memory.

  • SURFACE_ARRAY — NVRM Surface Array (Jetson only).

  • HANDLE — NVRM Handle (Jetson only).

  • SYSTEM — System memory (malloc).

CUDA_DEVICE = MemType.CUDA_DEVICE
CUDA_PINNED = MemType.CUDA_PINNED
CUDA_UNIFIED = MemType.CUDA_UNIFIED
DEFAULT = MemType.DEFAULT
HANDLE = MemType.HANDLE
SURFACE_ARRAY = MemType.SURFACE_ARRAY
SYSTEM = MemType.SYSTEM
name()

Return the canonical name of this memory type.

class savant_rs.deepstream.Padding

Padding mode for letterboxing.

  • NONE – scale to fill, may distort aspect ratio.

  • RIGHT_BOTTOM – image at top-left, padding on right/bottom.

  • SYMMETRIC – image centered, padding split equally between opposite sides (default).

NONE = Padding.NONE
RIGHT_BOTTOM = Padding.RIGHT_BOTTOM
SYMMETRIC = Padding.SYMMETRIC
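
The three modes differ only in where the scaled image is placed inside the destination. The sketch below illustrates that geometry. letterbox_rect is a hypothetical helper, the mode strings stand in for the Padding members, and the rounding rules are assumptions that may differ from the library by a pixel.

```python
def letterbox_rect(src_w: int, src_h: int, dst_w: int, dst_h: int,
                   mode: str = "SYMMETRIC"):
    """Return (left, top, width, height) of the scaled image inside dst."""
    if mode == "NONE":
        return 0, 0, dst_w, dst_h  # stretch to fill, aspect ratio may change
    scale = min(dst_w / src_w, dst_h / src_h)  # preserve the aspect ratio
    out_w = min(dst_w, round(src_w * scale))
    out_h = min(dst_h, round(src_h * scale))
    if mode == "RIGHT_BOTTOM":
        return 0, 0, out_w, out_h  # image anchored at the top-left corner
    if mode == "SYMMETRIC":
        return (dst_w - out_w) // 2, (dst_h - out_h) // 2, out_w, out_h
    raise ValueError(f"unknown padding mode: {mode}")


symmetric = letterbox_rect(1920, 1080, 640, 640)
right_bottom = letterbox_rect(1920, 1080, 640, 640, "RIGHT_BOTTOM")
```

For a 1920x1080 source and a 640x640 destination, the scaled image is 640x360. SYMMETRIC leaves 140 rows above and below it, while RIGHT_BOTTOM leaves 280 rows below.
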
class savant_rs.deepstream.Rect(top, left, width, height)

A rectangle in pixel coordinates (top, left, width, height).

Used as an optional source crop region for transform and send_frame.

height
left
top
width
class savant_rs.deepstream.SkiaContext(width, height, gpu_id=0)

GPU-accelerated Skia rendering context backed by CUDA-GL interop.

fbo_id
static from_nvbuf(buf, gpu_id=0)
height
render_to_nvbuf(buf, config=None)
width
class savant_rs.deepstream.SurfaceView

Zero-copy view of a single GPU surface.

Wraps an NvBufSurface-backed buffer or arbitrary CUDA memory with cached surface parameters. Implements __cuda_array_interface__ for single-plane formats (RGBA, BGRx, GRAY8) so the surface can be consumed by CuPy, PyTorch, and other CUDA-aware libraries.

Construction:

  • SurfaceView.from_buffer(buf, slot_index) — from a GstBuffer.

  • SurfaceView.from_cuda_array(obj) — from any object exposing __cuda_array_interface__ (CuPy array, PyTorch CUDA tensor, etc.).

channels

Number of interleaved channels per pixel.

color_format

Raw NvBufSurfaceColorFormat value.

data_ptr

CUDA data pointer to the first pixel.

static from_buffer(buf, slot_index=0)

Create a view from an NvBufSurface-backed buffer.

Parameters:
  • buf (GstBuffer | int) – Source buffer.

  • slot_index (int) – Zero-based slot index (default 0).

Raises:
  • ValueError – If buf is null or slot_index is out of bounds.

  • RuntimeError – If the buffer is not a valid NvBufSurface or uses a multi-plane format (NV12, I420, etc.).

static from_cuda_array(obj, gpu_id=0)

Create a view from any object exposing __cuda_array_interface__.

Supported shapes:

  • (H, W, C) — interleaved: C must be 1 (GRAY8) or 4 (RGBA).

  • (H, W) — grayscale (GRAY8).

The source object is kept alive for the lifetime of this view.

Parameters:
  • obj – A CuPy array, PyTorch CUDA tensor, or any object with __cuda_array_interface__.

  • gpu_id (int) – CUDA device ID (default 0).

Raises:
  • TypeError – If obj has no __cuda_array_interface__.

  • ValueError – If shape, dtype, or strides are unsupported.
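
The supported shapes listed above can be checked before a view is created. This is a minimal sketch of that rule; classify_shape is a hypothetical helper that mirrors the documented shapes only and does not check dtype or strides.

```python
def classify_shape(shape):
    """Map an array shape to (height, width, channels, format_name)."""
    if len(shape) == 2:
        height, width = shape
        return height, width, 1, "GRAY8"
    if len(shape) == 3:
        height, width, channels = shape
        if channels == 1:
            return height, width, 1, "GRAY8"
        if channels == 4:
            return height, width, 4, "RGBA"
    raise ValueError(f"unsupported shape: {shape}")


rgba = classify_shape((1080, 1920, 4))
gray = classify_shape((480, 640))

try:
    classify_shape((480, 640, 3))  # 3-channel RGB is not in the list
    rgb_rejected = False
except ValueError:
    rgb_rejected = True
```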

gpu_id

GPU device ID.

height

Surface height in pixels.

pitch

Row stride in bytes.

width

Surface width in pixels.

class savant_rs.deepstream.TransformConfig(padding=Ellipsis, dst_padding=None, interpolation=Ellipsis, compute_mode=Ellipsis)

Configuration for a transform (scale / letterbox) operation.

All fields have sensible defaults (Padding.SYMMETRIC, Interpolation.BILINEAR, ComputeMode.DEFAULT).

compute_mode
dst_padding
interpolation
padding
class savant_rs.deepstream.VideoFormat

Video pixel format.

  • RGBA — 8-bit RGBA (4 bytes/pixel).

  • BGRx — 8-bit BGRx (4 bytes/pixel, alpha ignored).

  • NV12 — YUV 4:2:0 semi-planar (default encoder format).

  • NV21 — YUV 4:2:0 semi-planar (UV swapped).

  • I420 — YUV 4:2:0 planar (JPEG encoder format).

  • UYVY — YUV 4:2:2 packed.

  • GRAY8 — single-channel grayscale.

BGRx = VideoFormat.BGRx
GRAY8 = VideoFormat.GRAY8
I420 = VideoFormat.I420
NV12 = VideoFormat.NV12
NV21 = VideoFormat.NV21
RGBA = VideoFormat.RGBA
UYVY = VideoFormat.UYVY
static from_name(name)

Parse a video format from a string name.

name()

Return the canonical name of this format (e.g. "NV12").
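
As a rough guide to how the formats compare in memory, the sketch below computes the size of a tightly packed frame. packed_frame_size is a hypothetical helper. Real NvBufSurface allocations are pitched and may be larger, and this sketch requires even dimensions for the chroma-subsampled formats.

```python
# Average bits per pixel for tightly packed frames (no row padding).
BITS_PER_PIXEL = {
    "RGBA": 32, "BGRx": 32,              # 4 bytes per pixel
    "NV12": 12, "NV21": 12, "I420": 12,  # YUV 4:2:0
    "UYVY": 16,                          # YUV 4:2:2
    "GRAY8": 8,
}
SUBSAMPLED = {"NV12", "NV21", "I420", "UYVY"}


def packed_frame_size(format_name: str, width: int, height: int) -> int:
    """Bytes in one tightly packed frame of the given format."""
    if format_name in SUBSAMPLED and (width % 2 or height % 2):
        raise ValueError("this sketch needs even dimensions for YUV formats")
    return width * height * BITS_PER_PIXEL[format_name] // 8


nv12_bytes = packed_frame_size("NV12", 1920, 1080)
rgba_bytes = packed_frame_size("RGBA", 1920, 1080)
```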

savant_rs.deepstream.bridge_savant_id_meta(element_ptr)

Install pad probes on an element to propagate SavantIdMeta.

Parameters:

element_ptr (int) – Raw pointer address of the GstElement.

savant_rs.deepstream.get_nvbufsurface_info(buf)

Extract NvBufSurface descriptor fields from an existing GstBuffer.

Returns:

(data_ptr, pitch, width, height)

Return type:

tuple[int, int, int, int]

savant_rs.deepstream.get_savant_id_meta(buf)

Read SavantIdMeta from a GStreamer buffer.

Returns:

Meta entries, e.g. [("frame", 42)].

Return type:

list[tuple[str, int]]
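
Since the result is a plain list of (kind, id) tuples, the ids can be extracted with a comprehension. frame_ids is a hypothetical helper. Only the "frame" kind appears in this reference, so entries of any other kind are skipped.

```python
def frame_ids(meta_entries):
    """Collect the integer ids of all "frame" entries, preserving order."""
    return [value for kind, value in meta_entries if kind == "frame"]


ids = frame_ids([("frame", 42), ("frame", 43)])
```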

savant_rs.deepstream.gpu_mem_used_mib(gpu_id=0)

Returns GPU memory currently used, in MiB.

  • dGPU (x86_64): Uses NVML to query device gpu_id.

  • Jetson (aarch64): Reads /proc/meminfo (unified memory).

Parameters:

gpu_id (int) – GPU device ID (default 0).

Returns:

GPU memory used in MiB.

Return type:

int

Raises:

RuntimeError – If NVML or /proc/meminfo is unavailable.
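
For the Jetson path, a /proc/meminfo-based estimate might look like the sketch below. Which fields the library actually reads is not stated here, so computing MemTotal minus MemAvailable is an assumption, and used_mib_from_meminfo is a hypothetical helper operating on the file contents.

```python
def used_mib_from_meminfo(text: str) -> int:
    """Estimate used memory in MiB from the contents of /proc/meminfo."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # the kernel reports KiB
    # Assumption: "used" means total minus available.
    used_kib = fields["MemTotal"] - fields["MemAvailable"]
    return used_kib // 1024


# Made-up sample in the /proc/meminfo format.
sample = """MemTotal:        8000000 kB
MemFree:         1000000 kB
MemAvailable:    3904000 kB
"""
used = used_mib_from_meminfo(sample)
```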

savant_rs.deepstream.init_cuda(gpu_id=0)

Initialize CUDA context for the given GPU device.

Parameters:

gpu_id (int) – GPU device ID (default 0).

savant_rs.deepstream.release_buffer(buf_ptr)

Release (unref) a raw GstBuffer* pointer.

Call this to free a buffer obtained from acquire_surface, acquire_surface_with_params, acquire_surface_with_ptr, transform, transform_with_ptr, or finalize when the buffer is no longer needed and is not being passed into a GStreamer pipeline.

Parameters:

buf_ptr (int) – Raw GstBuffer* pointer to release.

Raises:

ValueError – If buf_ptr is 0 (null).

savant_rs.deepstream.set_buffer_duration(buf, duration_ns)

Set the duration on a GstBuffer.

Parameters:
  • buf (GstBuffer | int) – Buffer to modify.

  • duration_ns (int) – Duration in nanoseconds.

savant_rs.deepstream.set_buffer_pts(buf, pts_ns)

Set the PTS (presentation timestamp) on a GstBuffer.

Parameters:
  • buf (GstBuffer | int) – Buffer to modify.

  • pts_ns (int) – PTS in nanoseconds.

savant_rs.deepstream.set_num_filled(buf, count)

Set numFilled on a batched NvBufSurface GstBuffer.

Parameters:
  • buf (GstBuffer | int) – Buffer containing a batched NvBufSurface.

  • count (int) – Number of filled slots.

Pure-Python helpers

The following symbols are injected into savant_rs.deepstream at import time and are available as from savant_rs.deepstream import ....

OpenCV CUDA GpuMat helpers for NvBufSurface buffers.

These helpers are injected into savant_rs.deepstream at import time, so imports such as from savant_rs.deepstream import nvgstbuf_as_gpu_mat work.

Two context managers for different call sites, plus two supporting helpers:

  • nvgstbuf_as_gpu_mat() — takes a DsNvBufSurfaceGstBuffer guard (or raw int pointer), extracts NvBufSurface metadata internally. Use outside callbacks (e.g. pre-filling a background before send_frame).

  • nvbuf_as_gpu_mat() — takes raw CUDA params (data_ptr, pitch, width, height) directly. Use inside the on_gpumat callback which already provides these values.

  • GpuMatCudaArray — exposes __cuda_array_interface__ (v3) for a cv2.cuda.GpuMat, bridging it to consumers like Picasso send_frame.

  • make_gpu_mat() — allocates a zero-initialised GpuMat.

class savant_rs._ds_gpumat.GpuMatCudaArray(mat: GpuMat)

Exposes __cuda_array_interface__ (v3) for a cv2.cuda.GpuMat.

OpenCV’s GpuMat does not implement the protocol natively, so this thin wrapper bridges it to any consumer that expects the interface (CuPy, SurfaceView.from_cuda_array, Picasso send_frame, etc.).

Only CV_8UC1 (GRAY8) and CV_8UC4 (RGBA) mats are supported.

The wrapper keeps a reference to the source mat so the underlying device memory stays alive for as long as this object exists.
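
The protocol itself is a dictionary describing shape, element type, device pointer and byte strides. The sketch below shows what such a provider could look like for a pitched 4-channel 8-bit surface. PitchedRGBA is a hypothetical class, not the GpuMatCudaArray implementation, and the pointer is a made-up value.

```python
class PitchedRGBA:
    """Minimal __cuda_array_interface__ (v3) provider for pitched RGBA data."""

    def __init__(self, data_ptr: int, pitch: int, width: int, height: int,
                 owner=None):
        self._owner = owner  # keeps the backing allocation alive
        self._interface = {
            "shape": (height, width, 4),
            "typestr": "|u1",            # unsigned 8-bit elements
            "data": (data_ptr, False),   # (device pointer, read_only)
            "strides": (pitch, 4, 1),    # bytes per row, pixel, channel
            "version": 3,
        }

    @property
    def __cuda_array_interface__(self):
        return self._interface


iface = PitchedRGBA(0x10000, 2048, 500, 300).__cuda_array_interface__
```

Explicit strides are needed whenever the pitch is larger than width * 4, because a consumer would otherwise assume tightly packed rows.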

savant_rs._ds_gpumat.from_gpumat(gen: DsNvSurfaceBufferGenerator, gpumat: GpuMat, *, interpolation: int = 1, id: int | None = None) → DsNvBufSurfaceGstBuffer

Acquire a buffer from the pool and fill it from a GpuMat.

If the source GpuMat dimensions differ from the generator’s dimensions the image is scaled using cv2.cuda.resize() with the given interpolation method. When sizes match the data is copied directly (zero-overhead copyTo).

Parameters:
  • gen – Surface generator (determines destination dimensions and format).

  • gpumat – Source GpuMat (must be CV_8UC4).

  • interpolation – OpenCV interpolation flag (default cv2.INTER_LINEAR). Common choices: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA.

  • id – Optional frame identifier for SavantIdMeta.

Returns:

DsNvBufSurfaceGstBuffer RAII guard owning the newly acquired buffer.

savant_rs._ds_gpumat.make_gpu_mat(width: int, height: int, channels: int = 4) → GpuMat

Allocate a cv2.cuda.GpuMat of the given size.

Returns:

A zero-initialised GpuMat with CV_8UC<channels> type.

savant_rs._ds_gpumat.nvbuf_as_gpu_mat(data_ptr: int, pitch: int, width: int, height: int, stream: Stream | None = None) → Generator[tuple[GpuMat, Stream], None, None]

Wrap raw CUDA memory as an OpenCV CUDA GpuMat.

Unlike nvgstbuf_as_gpu_mat(), this function takes the CUDA device pointer and layout directly — no GstBuffer or get_nvbufsurface_info call involved. Designed for the Picasso on_gpumat callback which already supplies these values.

Parameters:
  • data_ptr – CUDA device pointer to the surface data.

  • pitch – Row stride in bytes.

  • width – Surface width in pixels.

  • height – Surface height in pixels.

Yields:

(gpumat, stream) – the GpuMat is CV_8UC4.

savant_rs._ds_gpumat.nvgstbuf_as_gpu_mat(buf: DsNvBufSurfaceGstBuffer | int, stream: Stream | None = None) → Generator[tuple[GpuMat, Stream], None, None]

Expose an NvBufSurface DsNvBufSurfaceGstBuffer as an OpenCV CUDA GpuMat.

Extracts the CUDA device pointer, pitch, width and height from the buffer’s NvBufSurface metadata, then creates a zero-copy GpuMat together with a CUDA Stream. When the with block exits the stream is synchronised (waitForCompletion).

Parameters:

buf – DsNvBufSurfaceGstBuffer RAII guard or raw GstBuffer* pointer as int.

Yields:

(gpumat, stream) – the GpuMat is CV_8UC4 with the buffer’s native width, height and pitch.
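
The yield-then-synchronise behaviour described above follows a common context-manager pattern, sketched here in plain Python. ToyStream and as_toy_mat are hypothetical stand-ins (no OpenCV or GPU involved). Synchronising in a finally clause, so that it also runs on exceptions, is an assumption about the real helper.

```python
from contextlib import contextmanager


class ToyStream:
    """Hypothetical stand-in for cv2.cuda.Stream; records synchronisation."""

    def __init__(self):
        self.synchronised = False

    def waitForCompletion(self):
        self.synchronised = True


@contextmanager
def as_toy_mat(surface, stream=None):
    """Yield (surface, stream) and synchronise the stream on exit."""
    stream = stream if stream is not None else ToyStream()
    try:
        yield surface, stream
    finally:
        stream.waitForCompletion()  # wait for work queued on the stream


with as_toy_mat("surface") as (mat, stream):
    inside = stream.synchronised  # False: work may still be in flight
after = stream.synchronised       # True: the with block has exited
```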

SkiaCanvas helper, injected into savant_rs.deepstream at import time so that from savant_rs.deepstream import SkiaCanvas works.

class savant_rs._ds_skia_canvas.SkiaCanvas(ctx)

Convenience wrapper: SkiaContext + skia-python in one object.

Handles creation of the skia GrDirectContext and Surface backed by the SkiaContext’s GPU FBO.

canvas() → Canvas

Get the skia-python Canvas for drawing.

classmethod create(width: int, height: int, gpu_id: int = 0)

Create with an empty (transparent) canvas.

Parameters:
  • width – Canvas width in pixels.

  • height – Canvas height in pixels.

  • gpu_id – GPU device ID (default 0).

classmethod from_fbo(fbo_id: int, width: int, height: int) → SkiaCanvas

Create from an existing OpenGL FBO.

Used internally by the Picasso on_render callback to wrap the worker’s GPU canvas without creating a separate SkiaContext.

Parameters:
  • fbo_id – OpenGL FBO ID backing the canvas.

  • width – Canvas width in pixels.

  • height – Canvas height in pixels.

classmethod from_nvbuf(buf_ptr: int, gpu_id: int = 0)

Create with canvas pre-loaded from an NvBufSurface.

Canvas dimensions match the source buffer.

Parameters:
  • buf_ptr – Raw pointer of the source GstBuffer.

  • gpu_id – GPU device ID (default 0).

property gr_context: GrDirectContext

The Skia GPU GrDirectContext backing this canvas.

Use this to create GPU-resident images via skia.Image.makeTextureImage() for efficient repeated drawing without per-frame CPU -> GPU transfers:

    raster = skia.Image.MakeFromEncoded(data)
    gpu_img = raster.makeTextureImage(canvas.gr_context)
    # gpu_img now lives in VRAM; drawImage is pure GPU work

property height: int

Canvas height in pixels.

render_to_nvbuf(buf_ptr: int, config: TransformConfig | None = None)

Flush Skia and copy to destination NvBufSurface.

Supports optional scaling + letterboxing when canvas dimensions differ from the destination buffer.

Parameters:
  • buf_ptr – Raw pointer of the destination GstBuffer.

  • config – Optional TransformConfig for scaling / letterboxing. None means direct 1:1 copy (canvas and destination must have the same dimensions).

property width: int

Canvas width in pixels.