Process. Understand. Search. Video.

Stop wrestling with ML pipelines, video infrastructure, and large datasets. Build magical video AI functionality into your apps with just a few API calls.

/init_project, /push_video, /query

Process and understand video for any use case at any scale using computer vision.

01.

Initialize a project
Initialize a project using a pre-built workflow or a custom workflow you've specified to Sieve.

02.

Push a video
Submit a video to the Sieve API using a signed URL pointing to a storage bucket.

03.

Query for data
Find intervals of video that match a given query, or retrieve all processed information as JSON.
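The three calls above can be sketched as plain HTTP request descriptions. This is a minimal sketch and not Sieve's documented API: only the endpoint names /init_project, /push_video, and /query come from this page. The base URL, the JSON field names (project_name, workflow, video_url, query), and the build_request helper are assumptions made for illustration, and nothing is sent over the network.

```python
import json

# Placeholder host: the real Sieve API base URL is not given on this page.
BASE_URL = "https://api.example.com"


def build_request(endpoint, payload):
    """Describe a POST request without sending it (hypothetical helper)."""
    return {
        "method": "POST",
        "url": f"{BASE_URL}/{endpoint}",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload, sort_keys=True),
    }


# Step 1: initialize a project with a pre-built workflow (field names assumed).
init_req = build_request(
    "init_project",
    {"project_name": "demo", "workflow": "person_tracking"},
)

# Step 2: push a video via a signed URL pointing to a storage bucket.
push_req = build_request(
    "push_video",
    {
        "project_name": "demo",
        "video_url": "https://storage.example.com/clip.mp4?signature=PLACEHOLDER",
    },
)

# Step 3: query for intervals of video that match a condition.
query_req = build_request(
    "query",
    {"project_name": "demo", "query": {"class": "person"}},
)

for req in (init_req, push_req, query_req):
    print(req["method"], req["url"])
```

Each dictionary could be handed to any HTTP client once the real base URL, authentication scheme, and field names are taken from the Sieve docs.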
How it works

Choose a pre-built video AI workflow

Many video AI applications work out of the box, with no data collection, labeling, or infrastructure setup. Use video object detection models to detect and count objects, remove backgrounds or objects with state-of-the-art segmentation, and much more.
Person Tracking
Vehicle Tracking
Logo Recognition
Ball Tracking
Home Inspection
Esports Analytics
Video Editing
Dashcam Vision
Content Moderation
Video Captioning
Background Removal
Object Removal
Pet Tracking
Pose Estimation
Eye Gaze Tracking
Product Recognition
Shot Detection
Inventory Counting
Text Reading
Satellite Data

Build custom workflows using AI building blocks

Multiple models are available, each with different performance and capabilities. YOLOv7 is a state-of-the-art model for object detection, ResNet18 and MobileNet handle classification, and many others are available for their respective tasks.
Detection
Track objects and calculate position, size, dwell times, etc.
YOLO Family
State-of-the-art object detection models, fast and accurate. Pick from YOLOv4, YOLOv5, YOLOv6, YOLOv7, or YOLOR.
R-CNN Family
Region-based object detection models; Fast R-CNN and Faster R-CNN came out of Microsoft Research. Pick from R-CNN, Fast R-CNN, and Faster R-CNN.
DINO
State-of-the-art generalizable transformer model from Facebook.
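To make "position, size, dwell times" concrete, here is a minimal sketch of how those values could be derived from tracked detections. The input format (a list of dictionaries with frame, track_id, and box) and the summarize_tracks function are assumptions for illustration, not the output format of Sieve or of any particular detector.

```python
from collections import defaultdict


def summarize_tracks(detections, fps):
    """Summarize each track's dwell time, last position, and last size.

    detections: iterable of dicts with keys
      frame (int), track_id (int), box = (x1, y1, x2, y2) in pixels.
    fps: frames per second of the source video.
    """
    by_track = defaultdict(list)
    for det in detections:
        by_track[det["track_id"]].append(det)

    summary = {}
    for track_id, dets in by_track.items():
        dets.sort(key=lambda d: d["frame"])
        first, last = dets[0]["frame"], dets[-1]["frame"]
        x1, y1, x2, y2 = dets[-1]["box"]
        summary[track_id] = {
            # Time between first and last sighting, counting both end frames.
            "dwell_seconds": (last - first + 1) / fps,
            "last_center": ((x1 + x2) / 2, (y1 + y2) / 2),
            "last_area": (x2 - x1) * (y2 - y1),
        }
    return summary


detections = [
    {"frame": 0, "track_id": 1, "box": (0, 0, 40, 80)},
    {"frame": 30, "track_id": 1, "box": (5, 10, 45, 90)},
    {"frame": 59, "track_id": 1, "box": (10, 20, 50, 100)},
    {"frame": 10, "track_id": 2, "box": (100, 100, 120, 130)},
    {"frame": 24, "track_id": 2, "box": (110, 100, 130, 130)},
]
summary = summarize_tracks(detections, fps=30)
print(summary[1]["dwell_seconds"])  # 2.0
```

Dwell time here spans the first to the last sighting, so frames where the tracker briefly lost the object still count toward the total.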
Categorization
Classify and group videos and objects into different categories.
ResNet Family
Powerful classification model from Microsoft. Pick ResNet18 or ResNet50, depending on performance requirements.
MobileNet Family
Fast, light-weight classification model. Pick between MobileNetV1, MobileNetV2, and MobileNetV3.
ViT
State-of-the-art vision transformer from Google.
Segmentation
Extract and track masks across the entire video or for specific objects.
DeepLab Family
State-of-the-art segmentation models. Pick from DeepLab V3 and DeepLab V3+.
PointRend
Pixel-perfect object segmentation model from Facebook.
MaskDINO
State-of-the-art transformer model from IDEA Research for segmentation.
Inpainting
Fill erased regions of content with replacements that make sense.
E2FGVI
State-of-the-art model specifically for video inpainting.
OCR / Text Recognition
Recognize and search text using state-of-the-art OCR.
EasyOCR
Popular OCR library adapted by Sieve to work on video.
Custom Models
Upload your own models with custom code and weights, or request a new one from Sieve.
Custom Model
Any model with custom pre- and post-processing that runs on CPU or GPU.
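A custom model with its own pre- and post-processing might be organized along these lines. This is a toy sketch under stated assumptions: the class name, the preprocess / predict / postprocess method split, and the brightness "model" are all hypothetical and do not reflect the interface Sieve actually expects for uploads.

```python
class BrightnessClassifier:
    """Toy custom model that labels a grayscale frame 'bright' or 'dark'."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def preprocess(self, frame):
        # Scale 8-bit pixel values (0-255) to the range [0, 1].
        return [[pixel / 255 for pixel in row] for row in frame]

    def predict(self, pixels):
        # Stand-in for real inference: the mean intensity of the frame.
        flat = [pixel for row in pixels for pixel in row]
        return sum(flat) / len(flat)

    def postprocess(self, score):
        # Turn the raw score into a structured, JSON-friendly result.
        label = "bright" if score >= self.threshold else "dark"
        return {"label": label, "score": round(score, 3)}

    def __call__(self, frame):
        return self.postprocess(self.predict(self.preprocess(frame)))


model = BrightnessClassifier()
bright_result = model([[255, 255], [255, 0]])
dark_result = model([[0, 0], [0, 51]])
print(bright_result["label"], dark_result["label"])
```

Keeping the three stages separate makes it easy to swap the predict step for real inference code and weights while leaving input and output handling unchanged.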

Submit feedback for customization and better performance

All available models have been trained on millions of real-world examples, and Sieve offers a way to fine-tune them, either to detect new things or to improve on what they already know. This means they can be adapted to any use case and tuned for the best possible performance, without you having to train models or manage datasets yourself.

01.

Submit feedback via API calls.
Simple API calls to provide ML feedback, as few or as many as you'd like.

02.

System automatically improves.
Sieve automatically analyzes feedback and finds extra samples to retrain ML models.

03.

Track performance metrics.
Validate performance, track key metrics, and troubleshoot quality problems quickly.
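As one illustration of how submitted feedback can turn into a tracked metric, the sketch below computes per-label precision from a list of feedback records. The record fields (video_id, frame, predicted_label, is_correct) and the precision_by_label function are assumptions for illustration; the page does not specify Sieve's feedback format or which metrics it reports.

```python
from collections import Counter


def precision_by_label(feedback):
    """Fraction of predictions per label that reviewers marked correct."""
    total = Counter()
    correct = Counter()
    for item in feedback:
        label = item["predicted_label"]
        total[label] += 1
        if item["is_correct"]:
            correct[label] += 1
    return {label: correct[label] / total[label] for label in total}


# Hypothetical feedback records, one per reviewed prediction.
feedback = [
    {"video_id": "v1", "frame": 12, "predicted_label": "person", "is_correct": True},
    {"video_id": "v1", "frame": 40, "predicted_label": "person", "is_correct": False},
    {"video_id": "v2", "frame": 3, "predicted_label": "person", "is_correct": True},
    {"video_id": "v2", "frame": 7, "predicted_label": "car", "is_correct": True},
]
metrics = precision_by_label(feedback)
print(metrics)
```

Because each record is independent, this works with as few or as many feedback items as are available, though estimates from a handful of records will be noisy.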

Video as structured data

Sieve's API is your AI infrastructure for all things video. A fast, scalable, and cost-effective video processing engine. A queryable database of structured video metadata. A self-improving video AI engine that keeps track of performance.
A powerful video AI processing engine
Sieve allows developers to use out-of-the-box AI workflows built for specific use cases, configure their own workflows with popular models such as YOLOv7, PointRend, and ResNet, or upload their own custom models. Workflows are automatically built and deployed to run quickly, cost-effectively, and at scale.
MongoDB for Video Data
Sieve acts as a structured database for all video content. Sieve stores everything in a video as an object with properties that change over time, and allows users to query information as such. Perform arbitrary video search queries using MongoDB syntax to find the moments you're interested in.
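The idea of objects with time-varying properties, queried with MongoDB-style operators, can be illustrated locally. The sketch below is not Sieve's query engine: it evaluates a small subset of MongoDB comparison operators ($eq, $gt, $gte, $lt, $lte, $in) against per-frame records in memory and groups consecutive matches into intervals. The field names class and speed, and the functions matches and matching_intervals, are assumptions for illustration.

```python
import operator

# Small subset of MongoDB comparison operators, evaluated in memory.
OPS = {
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}


def matches(record, query):
    """Return True if the record satisfies every condition in the query."""
    for field, condition in query.items():
        if field not in record:
            return False
        value = record[field]
        if isinstance(condition, dict):
            if not all(OPS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            return False
    return True


def matching_intervals(frames, query):
    """Group consecutive matching frames into (start, end) intervals.

    frames: list of (frame_number, record) pairs, one per frame, in order.
    """
    intervals = []
    start = last = None
    for number, record in frames:
        if matches(record, query):
            if start is None:
                start = number
            last = number
        elif start is not None:
            intervals.append((start, last))
            start = None
    if start is not None:
        intervals.append((start, last))
    return intervals


# One tracked object whose speed property changes from frame to frame.
speeds = [0.5, 2.0, 3.0, 0.2, 4.0, 5.0]
frames = [(i, {"class": "person", "speed": s}) for i, s in enumerate(speeds)]
query = {"class": "person", "speed": {"$gt": 1.0}}
print(matching_intervals(frames, query))  # [(1, 2), (4, 5)]
```

The query dictionary has the same shape as a MongoDB filter document, which is why a filter written for one could plausibly be reused against a service that accepts that syntax.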