savant_rs.deepstream

class savant_rs.deepstream.BufferGenerator(format, width, height, fps_num=30, fps_den=1, gpu_id=0, mem_type=None, pool_size=4)

Python wrapper for BufferGenerator.

Parameters:
  • format (VideoFormat | str) – Video format.

  • width (int) – Frame width in pixels.

  • height (int) – Frame height in pixels.

  • fps_num (int) – Framerate numerator (default 30).

  • fps_den (int) – Framerate denominator (default 1).

  • gpu_id (int) – GPU device ID (default 0).

  • mem_type (MemType | int) – Memory type (default MemType.DEFAULT).

  • pool_size (int) – Buffer pool size (default 4).

acquire(id=None)

Acquire a new NvBufSurface buffer from the pool.

Returns:

Guard owning the acquired buffer.

Return type:

SharedBuffer

acquire_with_params(pts_ns, duration_ns, id=None)

Acquire a buffer and stamp PTS and duration on it.

Convenience wrapper around acquire() that stamps PTS and duration on the buffer.

Parameters:
  • pts_ns (int) – Presentation timestamp in nanoseconds.

  • duration_ns (int) – Frame duration in nanoseconds.

  • id (int or None) – Optional buffer ID / frame index.

Returns:

Guard owning the acquired buffer.

Return type:

SharedBuffer
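The timestamps passed to acquire_with_params() follow directly from the generator's framerate. A small illustrative helper (not part of savant_rs) that derives pts_ns and duration_ns for frame n of an fps_num/fps_den stream:

```python
# Hypothetical helper (not a savant_rs API): compute PTS and duration in
# nanoseconds for frame `n` of an fps_num/fps_den stream, suitable for
# passing to acquire_with_params(pts_ns, duration_ns).
def frame_timing_ns(n: int, fps_num: int = 30, fps_den: int = 1) -> tuple[int, int]:
    duration_ns = 1_000_000_000 * fps_den // fps_num
    pts_ns = 1_000_000_000 * n * fps_den // fps_num
    return pts_ns, duration_ns

pts, dur = frame_timing_ns(3, fps_num=30, fps_den=1)
# buf = gen.acquire_with_params(pts, dur, id=3)
```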

format
height
nvmm_caps_str()

Return the NVMM caps string for configuring an appsrc.
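NVMM caps use standard GStreamer caps syntax with the memory feature annotation. A hand-built reconstruction, for illustration only (the exact string returned by nvmm_caps_str() may differ in field order or detail):

```python
# Illustrative reconstruction of an NVMM caps string; use nvmm_caps_str()
# in real code rather than building the string by hand.
def nvmm_caps(fmt: str, width: int, height: int,
              fps_num: int = 30, fps_den: int = 1) -> str:
    return (
        f"video/x-raw(memory:NVMM), format={fmt}, "
        f"width={width}, height={height}, framerate={fps_num}/{fps_den}"
    )

caps = nvmm_caps("RGBA", 1280, 720)
# appsrc.set_property("caps", Gst.Caps.from_string(caps))
```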

static send_eos(appsrc_ptr)

Send an end-of-stream signal to an AppSrc element.

transform(src_buf, config, id=None, src_rect=None)

Transform (scale + letterbox) a source buffer into a new destination.

width
class savant_rs.deepstream.ComputeMode

Compute backend for transform operations.

  • DEFAULT – VIC on Jetson, dGPU on x86_64 (default).

  • GPU – always use GPU compute.

  • VIC – VIC hardware (Jetson only, raises error on dGPU).

DEFAULT = ComputeMode.DEFAULT
GPU = ComputeMode.GPU
VIC = ComputeMode.VIC
class savant_rs.deepstream.DstPadding(left=0, top=0, right=0, bottom=0)

Optional per-side destination padding for letterboxing.

When set in TransformConfig.dst_padding, reduces the effective destination area before the letterbox rect is computed.

bottom
left
right
top
static uniform(value)

Create destination padding with equal values on all sides.

Parameters:

value – Padding value applied to left, top, right, and bottom.

Returns:

A new DstPadding with all sides set to value.
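The effect of dst_padding is simple geometry: each side is subtracted from the destination before the letterbox rect is computed. A sketch of that arithmetic (illustrative names, not savant_rs internals):

```python
# Sketch of how dst_padding shrinks the effective destination area before
# the letterbox rect is computed, mirroring the documented behaviour.
def effective_dst(width, height, left=0, top=0, right=0, bottom=0):
    # Returns (x, y, w, h) of the area the letterbox rect is fitted into.
    return (left, top, width - left - right, height - top - bottom)

# DstPadding.uniform(16) applied to a 640x480 destination:
x, y, w, h = effective_dst(640, 480, left=16, top=16, right=16, bottom=16)
```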

class savant_rs.deepstream.Interpolation

Interpolation method for scaling.

Variants whose behaviour differs between GPU (dGPU / x86_64) and VIC (Video Image Compositor / Jetson) carry compound names.

  • NEAREST – nearest-neighbor (same on both).

  • BILINEAR – bilinear (default, same on both).

  • GPU_CUBIC_VIC_5TAP – GPU: cubic, VIC: 5-tap.

  • GPU_SUPER_VIC_10TAP – GPU: super-sampling, VIC: 10-tap.

  • GPU_LANCZOS_VIC_SMART – GPU: Lanczos, VIC: smart.

  • GPU_IGNORED_VIC_NICEST – GPU: ignored (no-op), VIC: nicest.

  • DEFAULT – platform default (nearest on both).

BILINEAR = Interpolation.BILINEAR
DEFAULT = Interpolation.DEFAULT
GPU_CUBIC_VIC_5TAP = Interpolation.GPU_CUBIC_VIC_5TAP
GPU_IGNORED_VIC_NICEST = Interpolation.GPU_IGNORED_VIC_NICEST
GPU_LANCZOS_VIC_SMART = Interpolation.GPU_LANCZOS_VIC_SMART
GPU_SUPER_VIC_10TAP = Interpolation.GPU_SUPER_VIC_10TAP
NEAREST = Interpolation.NEAREST
static from_name(name)

Parse an interpolation method from a string name.

Accepts canonical names ("cubic", "lanczos", etc.) and legacy names ("algo1" through "algo4"). Case-insensitive.

class savant_rs.deepstream.MemType

NvBufSurface memory type.

  • DEFAULT — CUDA Device for dGPU, Surface Array for Jetson.

  • CUDA_PINNED — CUDA Host (pinned) memory.

  • CUDA_DEVICE — CUDA Device memory.

  • CUDA_UNIFIED — CUDA Unified memory.

  • SURFACE_ARRAY — NVRM Surface Array (Jetson only).

  • HANDLE — NVRM Handle (Jetson only).

  • SYSTEM — System memory (malloc).

CUDA_DEVICE = MemType.CUDA_DEVICE
CUDA_PINNED = MemType.CUDA_PINNED
CUDA_UNIFIED = MemType.CUDA_UNIFIED
DEFAULT = MemType.DEFAULT
HANDLE = MemType.HANDLE
SURFACE_ARRAY = MemType.SURFACE_ARRAY
SYSTEM = MemType.SYSTEM
name()

Return the canonical name of this memory type.

class savant_rs.deepstream.NonUniformBatch(gpu_id=0)

Zero-copy heterogeneous batch (nvstreammux2-style).

Assembles individual NvBufSurface buffers of arbitrary dimensions and pixel formats into a single batched GstBuffer.

Parameters:

gpu_id (int) – GPU device ID (default 0).

add(src_view)

Add a source SurfaceView to the batch (zero-copy).

finalize(ids=None)

Finalize the batch and return the underlying SharedBuffer.

The batch is consumed; further calls will raise RuntimeError.

Parameters:

ids (list[tuple[SavantIdMetaKind, int]] | None) – Optional per-slot SavantIdMeta entries.

gpu_id
num_filled
class savant_rs.deepstream.Padding

Padding mode for letterboxing.

  • NONE – scale to fill, may distort aspect ratio.

  • RIGHT_BOTTOM – image at top-left, padding on right/bottom.

  • SYMMETRIC – image centered, padding split evenly between opposite sides (default).

NONE = Padding.NONE
RIGHT_BOTTOM = Padding.RIGHT_BOTTOM
SYMMETRIC = Padding.SYMMETRIC
static from_name(name)

Parse a padding mode from a string name.

Accepts "none", "right_bottom" / "rightbottom", "symmetric". Case-insensitive.
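The SYMMETRIC letterbox rect is plain aspect-ratio geometry: scale the source to fit, then center it. An illustrative computation (not savant_rs code; rounding may differ from the native implementation):

```python
# Illustrative SYMMETRIC letterbox math: scale the source to fit the
# destination while preserving aspect ratio, then center the result.
def letterbox_symmetric(src_w, src_h, dst_w, dst_h):
    scale = min(dst_w / src_w, dst_h / src_h)
    w, h = round(src_w * scale), round(src_h * scale)
    left = (dst_w - w) // 2
    top = (dst_h - h) // 2
    return left, top, w, h      # image rect inside the destination

# A 1920x1080 source in a 640x640 slot lands at 640x360, centered vertically.
rect = letterbox_symmetric(1920, 1080, 640, 640)
```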

class savant_rs.deepstream.Rect(top, left, width, height)

A rectangle in pixel coordinates (top, left, width, height).

Used as an optional source crop region for transform and send_frame.

height
left
top
width
class savant_rs.deepstream.SavantIdMetaKind

Kind tag for SavantIdMeta entries.

Each NvBufSurface buffer can carry a list of (SavantIdMetaKind, int) pairs that identify the logical frame or batch it belongs to.

  • FRAME — per-frame identifier.

  • BATCH — per-batch identifier.

BATCH = SavantIdMetaKind.BATCH
FRAME = SavantIdMetaKind.FRAME
class savant_rs.deepstream.SharedBuffer

Safe Python wrapper for a SharedBuffer.

Uses the Option<T> pattern to emulate Rust move semantics in Python. After a consuming Rust method (e.g. nvinfer.submit) calls take_inner, the wrapper becomes empty and all subsequent property access raises RuntimeError.

Python code cannot construct, clone, or deconstruct this type.

duration_ns

Buffer duration in nanoseconds, or None if unset.

is_consumed

True if the buffer has been consumed (inner is None).

pts_ns

Buffer PTS in nanoseconds, or None if unset.

savant_ids()

Read SavantIdMeta from the buffer.

Returns:

Meta entries, e.g. [(SavantIdMetaKind.FRAME, 42)].

Return type:

list[tuple[SavantIdMetaKind, int]]

set_savant_ids(ids)

Replace SavantIdMeta on the buffer.

Parameters:

ids (list[tuple[SavantIdMetaKind, int]]) – Meta entries to set.

strong_count

Number of strong Arc references to the underlying buffer.

class savant_rs.deepstream.SkiaContext(width, height, gpu_id=0)

GPU-accelerated Skia rendering context backed by CUDA-GL interop.

fbo_id
height
render_to_nvbuf(buf, config=None)
width
class savant_rs.deepstream.SurfaceBatch

Pool-allocated batched NvBufSurface with per-slot fill tracking.

Obtained from UniformBatchGenerator.acquire_batch. Fill individual slots with transform_slot, then call finalize, then shared_buffer to access the buffer.

finalize()

Finalize the batch: set numFilled and attach IDs from acquisition.

is_finalized
max_batch_size
memset_slot(index, value)

Fill a slot’s surface with a constant byte value.

num_filled
shared_buffer()

Return the underlying SharedBuffer. Available only after finalize.

transform_slot(slot, src_buf, src_rect=None)

Transform a source buffer into a specific batch slot.

upload_slot(index, data)

Upload pixel data from a NumPy array into a batch slot.

view(slot_index)

Create a zero-copy single-slot SurfaceView from the batch.

class savant_rs.deepstream.SurfaceView

Zero-copy view of a single GPU surface.

Wraps an NvBufSurface-backed buffer or arbitrary CUDA memory with cached surface parameters. Implements __cuda_array_interface__ for single-plane formats (RGBA, BGRx, GRAY8) so the surface can be consumed by CuPy, PyTorch, and other CUDA-aware libraries.

Construction:

  • SurfaceView.from_buffer(buf, slot_index) — from a GstBuffer.

  • SurfaceView.from_cuda_array(obj) — from any object exposing __cuda_array_interface__ (CuPy array, PyTorch CUDA tensor, etc.).
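Any object with a conforming __cuda_array_interface__ dict works with from_cuda_array. The mock below shows the fields the protocol requires; the pointer is fake and only illustrates the shape of the interface (a real call needs a live CUDA allocation):

```python
# Minimal object exposing __cuda_array_interface__ (v3). `ptr` would be a
# real CUDA device pointer in practice; this mock only illustrates the
# protocol fields from_cuda_array inspects.
class FakeCudaArray:
    def __init__(self, ptr: int, h: int, w: int, c: int):
        self.__cuda_array_interface__ = {
            "version": 3,
            "shape": (h, w, c),       # (H, W, C): C must be 1 or 4
            "typestr": "|u1",         # uint8
            "data": (ptr, False),     # (device pointer, read-only flag)
            "strides": None,          # None => C-contiguous
        }

arr = FakeCudaArray(0xDEAD0000, 720, 1280, 4)
# view = SurfaceView.from_cuda_array(arr)   # RGBA, given a real pointer
```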

channels

Number of interleaved channels per pixel.

color_format

Raw NvBufSurfaceColorFormat value.

cuda_stream

CUDA stream handle associated with this view (as an integer pointer).

Returns 0 for the default (legacy) stream.

data_ptr

CUDA data pointer to the first pixel.

fill(color)

Fill the surface with a repeating pixel colour.

color must have exactly as many elements as the surface has channels (e.g. [R, G, B, A] for RGBA, [Y] for GRAY8).

Example:

view.fill([128, 0, 255, 255])   # violet, opaque RGBA
Parameters:

color (list[int]) – Per-channel byte values.

Raises:
  • ValueError – If color length does not match the surface’s channel count.

  • RuntimeError – If the view has been consumed or the GPU operation fails.

static from_buffer(buf, slot_index=0, cuda_stream=0)

Create a view from an NvBufSurface-backed buffer.

Parameters:
  • buf (GstBuffer | int) – Source buffer.

  • slot_index (int) – Zero-based slot index (default 0).

  • cuda_stream (int) – CUDA stream handle (default 0, the default stream).

Raises:
  • ValueError – If buf is null or slot_index is out of bounds.

  • RuntimeError – If the buffer is not a valid NvBufSurface or uses a multi-plane format (NV12, I420, etc.).

static from_cuda_array(obj, gpu_id=0, cuda_stream=0)

Create a view from any object exposing __cuda_array_interface__.

Supported shapes:

  • (H, W, C) — interleaved: C must be 1 (GRAY8) or 4 (RGBA).

  • (H, W) — grayscale (GRAY8).

The source object is kept alive for the lifetime of this view.

Parameters:
  • obj – A CuPy array, PyTorch CUDA tensor, or any object with __cuda_array_interface__.

  • gpu_id (int) – CUDA device ID (default 0).

  • cuda_stream (int) – CUDA stream handle (default 0, the default stream).

Raises:
  • TypeError – If obj has no __cuda_array_interface__.

  • ValueError – If shape, dtype, or strides are unsupported.

gpu_id

GPU device ID.

height

Surface height in pixels.

memset(value)

Fill the surface with a constant byte value.

Every byte of the surface (up to pitch × height) is set to value. This is the fastest fill but only produces a uniform colour when all channels share the same byte (e.g. 0 for black, 255 for white on RGBA). Use fill() for arbitrary colours.

Parameters:

value (int) – Byte value (0–255) to fill every byte with.

Raises:

RuntimeError – If the view has been consumed or the GPU operation fails.
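The memset-vs-fill distinction is easiest to see at the byte level: memset(v) writes the same byte into every channel of every pixel. This is pure illustration, not savant_rs code:

```python
# memset(v) writes v into each byte, so one RGBA pixel becomes (v, v, v, v).
# Only values where all channels coincide give a meaningful colour.
def rgba_pixel_after_memset(v: int) -> tuple:
    return (v, v, v, v)

black = rgba_pixel_after_memset(0)      # (0, 0, 0, 0): black, transparent
white = rgba_pixel_after_memset(255)    # (255, 255, 255, 255): white, opaque
grey = rgba_pixel_after_memset(128)     # mid-grey with half-transparent alpha
# For anything non-uniform (e.g. opaque red) use fill([255, 0, 0, 255]).
```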

pitch

Row stride in bytes.

upload(data)

Upload pixel data from a NumPy array to the surface.

Parameters:

data (numpy.ndarray) – A 3-D uint8 array with shape (height, width, channels) matching the surface dimensions and color format (e.g. 4 channels for RGBA).

Raises:
  • ValueError – If data has wrong shape, dtype, or dimensions.

  • RuntimeError – If the view has been consumed or the GPU operation fails.
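The documented shape/dtype requirements can be checked before calling upload(). A sketch of such a pre-flight check (the helper name is illustrative, not a savant_rs API):

```python
import numpy as np

# Illustrative pre-flight validation mirroring upload()'s documented
# requirements: a 3-D uint8 array shaped (height, width, channels).
def check_upload(data: np.ndarray, height: int, width: int, channels: int) -> None:
    if data.ndim != 3:
        raise ValueError(f"expected 3-D array, got {data.ndim}-D")
    if data.dtype != np.uint8:
        raise ValueError(f"expected uint8, got {data.dtype}")
    if data.shape != (height, width, channels):
        raise ValueError(f"shape {data.shape} != {(height, width, channels)}")

frame = np.zeros((720, 1280, 4), dtype=np.uint8)
check_upload(frame, 720, 1280, 4)   # OK for a 1280x720 RGBA surface
# view.upload(frame)
```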

width

Surface width in pixels.

class savant_rs.deepstream.TransformConfig(padding=Ellipsis, dst_padding=None, interpolation=Ellipsis, compute_mode=Ellipsis)

Configuration for a transform (scale / letterbox) operation.

All fields have sensible defaults (Padding.SYMMETRIC, Interpolation.BILINEAR, ComputeMode.DEFAULT).

compute_mode
dst_padding
interpolation
padding
class savant_rs.deepstream.UniformBatchGenerator(format, width, height, max_batch_size, pool_size=2, fps_num=30, fps_den=1, gpu_id=0, mem_type=None)

Homogeneous batched NvBufSurface buffer generator.

Produces buffers whose surfaceList is an array of independently fillable GPU surfaces, all sharing the same pixel format and dimensions.

Parameters:
  • format (VideoFormat | str) – Pixel format (e.g. "RGBA").

  • width (int) – Slot width in pixels.

  • height (int) – Slot height in pixels.

  • max_batch_size (int) – Maximum number of slots per batch.

  • pool_size (int) – Number of pre-allocated batched buffers (default 2).

  • fps_num (int) – Framerate numerator (default 30).

  • fps_den (int) – Framerate denominator (default 1).

  • gpu_id (int) – GPU device ID (default 0).

  • mem_type (MemType | None) – Memory type (default MemType.DEFAULT).

Raises:

RuntimeError – If pool creation fails.

acquire_batch(config, ids=None)

Acquire a SurfaceBatch from the pool, ready for slot filling.

Parameters:
  • config (TransformConfig) – Scaling / letterboxing configuration.

  • ids (list[tuple[SavantIdMetaKind, int]] | None) – Optional per-slot SavantIdMeta entries.

format
gpu_id
height
max_batch_size
width
class savant_rs.deepstream.VideoFormat

Video pixel format.

  • RGBA — 8-bit RGBA (4 bytes/pixel).

  • BGRx — 8-bit BGRx (4 bytes/pixel, alpha ignored).

  • NV12 — YUV 4:2:0 semi-planar (default encoder format).

  • NV21 — YUV 4:2:0 semi-planar (UV swapped).

  • I420 — YUV 4:2:0 planar (JPEG encoder format).

  • UYVY — YUV 4:2:2 packed.

  • GRAY8 — single-channel grayscale.

BGRx = VideoFormat.BGRx
GRAY8 = VideoFormat.GRAY8
I420 = VideoFormat.I420
NV12 = VideoFormat.NV12
NV21 = VideoFormat.NV21
RGBA = VideoFormat.RGBA
UYVY = VideoFormat.UYVY
static from_name(name)

Parse a video format from a string name.

name()

Return the canonical name of this format (e.g. "NV12").
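The listed formats differ in bytes per pixel, which matters when sizing pools. Approximate minimum frame sizes (illustrative only; real NvBufSurface allocations add pitch alignment and padding):

```python
# Rough bytes-per-pixel for the documented formats. Actual allocations
# are larger due to pitch alignment; this only gives a lower bound.
BYTES_PER_PIXEL = {
    "RGBA": 4.0, "BGRx": 4.0,   # packed 4-byte
    "NV12": 1.5, "NV21": 1.5,   # YUV 4:2:0 semi-planar
    "I420": 1.5,                # YUV 4:2:0 planar
    "UYVY": 2.0,                # YUV 4:2:2 packed
    "GRAY8": 1.0,
}

def min_frame_bytes(fmt: str, width: int, height: int) -> int:
    return int(width * height * BYTES_PER_PIXEL[fmt])

# A 1920x1080 NV12 frame needs at least width * height * 1.5 bytes.
nv12_bytes = min_frame_bytes("NV12", 1920, 1080)
```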

savant_rs.deepstream.get_nvbufsurface_info(buf)

Extract NvBufSurface descriptor fields from an existing GstBuffer.

Returns:

(data_ptr, pitch, width, height)

Return type:

tuple[int, int, int, int]

savant_rs.deepstream.get_savant_id_meta(buf)

Read SavantIdMeta from a GStreamer buffer.

Returns:

Meta entries, e.g. [("frame", 42)].

Return type:

list[tuple[str, int]]

savant_rs.deepstream.gpu_architecture(gpu_id=0)

Returns the GPU architecture family name (x86_64 dGPU only, via NVML).

Returns a lowercase architecture name such as "ampere", "ada", "hopper", "turing", etc. Returns None on Jetson/aarch64.

Parameters:

gpu_id (int) – GPU device ID (default 0).

Returns:

Architecture name or None if not on x86_64.

Return type:

str | None

Raises:

RuntimeError – If NVML initialization fails.

savant_rs.deepstream.gpu_mem_used_mib(gpu_id=0)

Returns GPU memory currently used, in MiB.

  • dGPU (x86_64): Uses NVML to query device gpu_id.

  • Jetson (aarch64): Reads /proc/meminfo (unified memory).

Parameters:

gpu_id (int) – GPU device ID (default 0).

Returns:

GPU memory used in MiB.

Return type:

int

Raises:

RuntimeError – If NVML or /proc/meminfo is unavailable.

savant_rs.deepstream.gpu_platform_tag(gpu_id=0)

Returns a directory-safe platform tag for TensorRT engine caching.

  • Jetson: Jetson model name (e.g. "agx_orin_64gb", "orin_nano_8gb").

  • dGPU (x86_64): GPU architecture family (e.g. "ampere", "ada").

  • Unknown: "unknown" if the platform cannot be determined.

Parameters:

gpu_id (int) – GPU device ID (default 0).

Returns:

Platform tag string.

Return type:

str

Raises:

RuntimeError – If CUDA/NVML initialization fails.
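A typical use of the platform tag is to key a per-platform TensorRT engine cache so engines built on one architecture are never loaded on another. The directory layout below is hypothetical and the tag is hard-coded for illustration:

```python
from pathlib import Path

# Hypothetical cache layout: one engine directory per platform tag, so
# engines built on e.g. "ampere" never get loaded on "agx_orin_64gb".
def engine_cache_dir(base: str, tag: str) -> Path:
    return Path(base) / tag

d = engine_cache_dir("/cache/engines", "ampere")
# real code: d = engine_cache_dir("/cache/engines", gpu_platform_tag())
```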

savant_rs.deepstream.has_nvenc(gpu_id=0)

Returns True if the GPU has NVENC hardware encoding support.

  • Jetson: Orin Nano is the only Jetson without NVENC; all others have it. Unknown models conservatively return False.

  • dGPU (x86_64): Uses NVML encoder_capacity(H264) — returns False for datacenter GPUs without NVENC (H100, A100, A30, etc.).

Parameters:

gpu_id (int) – GPU device ID (default 0).

Returns:

True if NVENC is available.

Return type:

bool

Raises:

RuntimeError – If CUDA/NVML initialization fails.

savant_rs.deepstream.init_cuda(gpu_id=0)

Initialize CUDA context for the given GPU device.

Parameters:

gpu_id (int) – GPU device ID (default 0).

savant_rs.deepstream.is_jetson_kernel()

Returns True if the kernel is a Jetson (Tegra) kernel.

Checks uname -r for the “tegra” suffix.

savant_rs.deepstream.jetson_model(gpu_id=0)

Returns the Jetson model name if running on a Jetson device, or None if not.

Uses CUDA SM count and /proc/meminfo MemTotal to identify the model. Works inside containers where /proc/device-tree is typically not mounted. Requires uname -r to contain “tegra” and a working CUDA runtime.

Parameters:

gpu_id (int) – GPU device ID (default 0).

Returns:

Model name (e.g. “Orin Nano 8GB”) or None if not Jetson.

Return type:

str | None

Raises:

RuntimeError – If CUDA or /proc/meminfo is unavailable.

savant_rs.deepstream.set_num_filled(buf, count)

Set numFilled on a batched NvBufSurface GstBuffer.

Parameters:
  • buf (SharedBuffer | int) – Buffer containing a batched NvBufSurface.

  • count (int) – Number of filled slots.

Pure-Python helpers

The following symbols are injected into savant_rs.deepstream at import time and are available as from savant_rs.deepstream import ....

OpenCV CUDA GpuMat helpers for NvBufSurface buffers.

Injected into savant_rs.deepstream at import time so that from savant_rs.deepstream import nvgstbuf_as_gpu_mat etc. work.

Two context managers for different call sites:

  • nvgstbuf_as_gpu_mat() — takes a SharedBuffer guard (or raw int pointer), extracts NvBufSurface metadata internally. Use outside callbacks (e.g. pre-filling a background before send_frame).

  • nvbuf_as_gpu_mat() — takes raw CUDA params (data_ptr, pitch, width, height) directly. Use inside the on_gpumat callback which already provides these values.

  • GpuMatCudaArray — exposes __cuda_array_interface__ (v3) for a cv2.cuda.GpuMat, bridging it to consumers like Picasso send_frame.

  • make_gpu_mat() — allocates a zero-initialised GpuMat.

class savant_rs._ds_gpumat.GpuMatCudaArray(mat: GpuMat)

Exposes __cuda_array_interface__ (v3) for a cv2.cuda.GpuMat.

OpenCV’s GpuMat does not implement the protocol natively, so this thin wrapper bridges it to any consumer that expects the interface (CuPy, SurfaceView.from_cuda_array, Picasso send_frame, etc.).

Only CV_8UC1 (GRAY8) and CV_8UC4 (RGBA) mats are supported.

The wrapper keeps a reference to the source mat so the underlying device memory stays alive for as long as this object exists.
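Because a GpuMat's rows are padded to its step (pitch), the published interface must carry explicit strides. A sketch of the dict such a bridge exposes for a CV_8UC4 mat (names and the stand-in pointer are illustrative; real code reads them from cv2.cuda.GpuMat):

```python
# Sketch of the __cuda_array_interface__ (v3) dict a GpuMatCudaArray-style
# bridge publishes for a pitched CV_8UC4 (RGBA) mat. `data_ptr` and `step`
# would come from the real GpuMat; they are made up here.
def cai_for_pitched_rgba(data_ptr: int, rows: int, cols: int, step: int) -> dict:
    return {
        "version": 3,
        "shape": (rows, cols, 4),
        "typestr": "|u1",
        "data": (data_ptr, False),
        # Rows are padded to `step` bytes, so strides must be explicit.
        "strides": (step, 4, 1),
    }

cai = cai_for_pitched_rgba(0x7F000000, 720, 1280, 5632)
```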

savant_rs._ds_gpumat.from_gpumat(gen: BufferGenerator, gpumat: GpuMat, *, interpolation: int = 1, id: int | None = None) → SharedBuffer

Acquire a buffer from the pool and fill it from a GpuMat.

If the source GpuMat dimensions differ from the generator’s dimensions the image is scaled using cv2.cuda.resize() with the given interpolation method. When sizes match the data is copied directly (zero-overhead copyTo).

Parameters:
  • gen – Surface generator (determines destination dimensions and format).

  • gpumat – Source GpuMat (must be CV_8UC4).

  • interpolation – OpenCV interpolation flag (default cv2.INTER_LINEAR). Common choices: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA.

  • id – Optional frame identifier for SavantIdMeta.

Returns:

SharedBuffer RAII guard owning the newly acquired buffer.

savant_rs._ds_gpumat.make_gpu_mat(width: int, height: int, channels: int = 4) → GpuMat

Allocate a cv2.cuda.GpuMat of the given size.

Returns:

A zero-initialised GpuMat with CV_8UC<channels> type.

savant_rs._ds_gpumat.nvbuf_as_gpu_mat(data_ptr: int, pitch: int, width: int, height: int, stream: Stream | None = None) → Generator[tuple[GpuMat, Stream], None, None]

Wrap raw CUDA memory as an OpenCV CUDA GpuMat.

Unlike nvgstbuf_as_gpu_mat(), this function takes the CUDA device pointer and layout directly — no GstBuffer or get_nvbufsurface_info call involved. Designed for the Picasso on_gpumat callback which already supplies these values.

Parameters:
  • data_ptr – CUDA device pointer to the surface data.

  • pitch – Row stride in bytes.

  • width – Surface width in pixels.

  • height – Surface height in pixels.

Yields:

(gpumat, stream) – the GpuMat is CV_8UC4.

savant_rs._ds_gpumat.nvgstbuf_as_gpu_mat(buf: SharedBuffer | int, stream: Stream | None = None) → Generator[tuple[GpuMat, Stream], None, None]

Expose an NvBufSurface SharedBuffer as an OpenCV CUDA GpuMat.

Extracts the CUDA device pointer, pitch, width and height from the buffer’s NvBufSurface metadata, then creates a zero-copy GpuMat together with a CUDA Stream. When the with block exits the stream is synchronised (waitForCompletion).

Parameters:

buf (SharedBuffer | int) – SharedBuffer RAII guard or raw GstBuffer* pointer as int.

Yields:

(gpumat, stream) – the GpuMat is CV_8UC4 with the buffer’s native width, height and pitch.

Convenience wrapper: SkiaContext + skia-python in one object.

Injected into savant_rs.deepstream at import time so that from savant_rs.deepstream import SkiaCanvas works.

class savant_rs._ds_skia_canvas.SkiaCanvas(ctx)

Convenience wrapper: SkiaContext + skia-python in one object.

Handles creation of the skia GrDirectContext and Surface backed by the SkiaContext’s GPU FBO.

canvas() → Canvas

Get the skia-python Canvas for drawing.

classmethod create(width: int, height: int, gpu_id: int = 0)

Create with an empty (transparent) canvas.

Parameters:
  • width – Canvas width in pixels.

  • height – Canvas height in pixels.

  • gpu_id – GPU device ID (default 0).

classmethod from_fbo(fbo_id: int, width: int, height: int) → SkiaCanvas

Create from an existing OpenGL FBO.

Used internally by the Picasso on_render callback to wrap the worker’s GPU canvas without creating a separate SkiaContext.

Parameters:
  • fbo_id – OpenGL FBO ID backing the canvas.

  • width – Canvas width in pixels.

  • height – Canvas height in pixels.

property gr_context: GrDirectContext

The Skia GPU GrDirectContext backing this canvas.

Use this to create GPU-resident images via skia.Image.makeTextureImage() for efficient repeated drawing without per-frame CPU -> GPU transfers:

raster = skia.Image.MakeFromEncoded(data)
gpu_img = raster.makeTextureImage(canvas.gr_context)
# gpu_img now lives in VRAM; drawImage is pure GPU work
property height: int

Canvas height in pixels.

render_to_nvbuf(buf_ptr: int, config: TransformConfig | None = None)

Flush Skia and copy to destination NvBufSurface.

Supports optional scaling + letterboxing when canvas dimensions differ from the destination buffer.

Parameters:
  • buf_ptr – Raw pointer of the destination GstBuffer.

  • config – Optional TransformConfig for scaling / letterboxing. None means direct 1:1 copy (canvas and destination must have the same dimensions).

property width: int

Canvas width in pixels.