Process and understand video for any use case at any scale using computer vision.
Initialize a project
Initialize a project using a pre-built workflow or a custom workflow you've specified to Sieve.
Push a video
Submit a video to the Sieve API using a signed URL that points to a file in a storage bucket.
Query for data
Find intervals of video that match a given query, or retrieve all processed information as JSON.
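The three steps above can be sketched as three request descriptions. This is an illustration only: the host, endpoint paths, field names, and workflow name below are hypothetical placeholders, not Sieve's documented API, and nothing is actually sent over the network.

```python
import json
from dataclasses import dataclass, field

# Placeholder host (assumption): NOT a real Sieve endpoint.
BASE_URL = "https://sieve-api.example.invalid/v1"


@dataclass
class ApiRequest:
    """A plain description of an HTTP request; building one sends nothing."""
    method: str
    url: str
    body: dict = field(default_factory=dict)

    def to_json(self) -> str:
        return json.dumps(self.body, sort_keys=True)


def init_project(name: str, workflow: str) -> ApiRequest:
    # Step 1: create a project tied to a pre-built or custom workflow.
    return ApiRequest("POST", f"{BASE_URL}/projects",
                      {"name": name, "workflow": workflow})


def push_video(project: str, signed_url: str) -> ApiRequest:
    # Step 2: pass a signed URL to the video instead of uploading bytes.
    if not signed_url.startswith("https://"):
        raise ValueError("expected an https signed URL")
    return ApiRequest("POST", f"{BASE_URL}/projects/{project}/videos",
                      {"video_url": signed_url})


def query_intervals(project: str, query: dict) -> ApiRequest:
    # Step 3: ask for the intervals of video that match a query.
    return ApiRequest("POST", f"{BASE_URL}/projects/{project}/query",
                      {"query": query})


requests_to_send = [
    init_project("warehouse-cam", "object-detection"),
    push_video("warehouse-cam",
               "https://storage.example.invalid/bucket/clip.mp4?sig=PLACEHOLDER"),
    query_intervals("warehouse-cam", {"class": "person"}),
]

for req in requests_to_send:
    print(req.method, req.url, req.to_json())
```

Keeping request construction separate from sending makes each step easy to inspect before wiring it to a real HTTP client and the real endpoints.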
How it works
Choose a pre-built video AI workflow
Many video AI applications work out of the box, with no data collection, labeling, or infrastructure setup. These include video object detection models that detect and count objects, background and object removal using state-of-the-art segmentation, and much more.
Eye Gaze Tracking
Build custom workflows using AI building blocks
Multiple models are available, each with different performance and capabilities: YOLOv7 for state-of-the-art object detection and tracking, ResNet18 and MobileNet for classification, and many others for their respective tasks.
Track objects and calculate position, size, dwell times, etc.
State-of-the-art object detection model, fast and performant. Pick from YOLOv4, YOLOv5, YOLOv6, YOLOv7, or YOLOR.
Region-based object detection models, with Fast R-CNN and Faster R-CNN developed at Microsoft Research. Pick from R-CNN, Fast R-CNN, and Faster R-CNN.
State-of-the-art generalizable transformer model from Facebook.
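As a rough illustration of how position, size, and dwell time could be derived from tracker output, here is a minimal sketch. It assumes each detection carries a frame index, a track ID, and a pixel bounding box; the `Detection` type and `summarize_tracks` function are hypothetical names for this example and do not describe Sieve's output format. Dwell time is taken as the span from a track's first to last frame, so gaps inside a track are not subtracted.

```python
from collections import defaultdict
from typing import NamedTuple


class Detection(NamedTuple):
    frame: int      # frame index in the video
    track_id: int   # identity assigned by the tracker
    x: float        # left edge of the bounding box, in pixels
    y: float        # top edge of the bounding box, in pixels
    w: float        # box width, in pixels
    h: float        # box height, in pixels


def summarize_tracks(detections, fps):
    """Return dwell time, last center position, and mean box area per track."""
    by_track = defaultdict(list)
    for det in detections:
        by_track[det.track_id].append(det)

    summary = {}
    for track_id, dets in by_track.items():
        dets.sort(key=lambda d: d.frame)
        first, last = dets[0], dets[-1]
        summary[track_id] = {
            # Inclusive frame span converted to seconds.
            "dwell_seconds": (last.frame - first.frame + 1) / fps,
            "last_center": (last.x + last.w / 2, last.y + last.h / 2),
            "mean_area": sum(d.w * d.h for d in dets) / len(dets),
        }
    return summary


# Track 1 is visible for frames 0-59, track 2 for frames 30-44, at 30 fps.
detections = (
    [Detection(f, 1, 10.0 + f, 20.0, 40.0, 80.0) for f in range(60)]
    + [Detection(f, 2, 0.0, 0.0, 10.0, 10.0) for f in range(30, 45)]
)
stats = summarize_tracks(detections, fps=30)
print(stats[1]["dwell_seconds"], stats[2]["dwell_seconds"])  # 2.0 0.5
```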
Classify and group videos and objects into different categories.
Powerful classification model from Microsoft. Pick ResNet18 or ResNet50, depending on performance requirements.
Fast, lightweight classification model. Pick from MobileNetV1, MobileNetV2, and MobileNetV3.
State-of-the-art vision transformer from Google.
Extract and track masks across the entire video or for specific objects.
State-of-the-art segmentation models. Pick from DeepLab V3 and DeepLab V3+.
Pixel-perfect object segmentation model from Facebook.
State-of-the-art transformer model from Facebook for segmentation.
Fill erased regions of content with replacements that make sense.
State-of-the-art model built specifically for video inpainting.
OCR / Text Recognition
Recognize and search text using state-of-the-art OCR.
Popular OCR library implemented by Sieve to work on video.
Upload your own models with custom code and weights, or ask Sieve to add a new one.
Any model with custom pre- and post-processing that runs on CPU or GPU.
Submit feedback for customization and better performance
All available models have been trained on millions of real-world examples, and Sieve offers a way to fine-tune them, either to teach them new things to detect or to improve on what they already know. This lets them adapt to your use case and performance needs without you having to train models or manage datasets yourself.
Submit feedback via API calls.
Simple API calls to provide ML feedback, as few or as many as you'd like.
System automatically improves.
Sieve automatically analyzes feedback and finds extra samples to retrain ML models.
Track performance metrics.
Validate performance, track key metrics, and troubleshoot quality problems quickly.
Video as structured data
Sieve's API is your AI infrastructure for all things video. A fast, scalable, and cost-effective video processing engine. A queryable database of structured video metadata. A self-improving video AI engine that keeps track of performance.
A powerful video AI processing engine
Sieve allows developers to use out-of-the-box AI workflows built for specific use cases, configure their own workflows with popular models such as YOLOv7, PointRend, and ResNet, or upload their own custom models. Workflows are automatically built and deployed to run quickly, cost-effectively, and at scale.
MongoDB for Video Data
Sieve acts as a structured database for all video content. Sieve stores everything in a video as an object with properties that change over time, and allows users to query information as such. Perform arbitrary video search queries using MongoDB syntax to find the moments you're interested in.
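To make the idea of querying time-varying object properties concrete, here is a small in-memory sketch. It evaluates a limited subset of MongoDB query operators ($eq, $ne, $gt, $gte, $lt, $lte, $in, $and, $or) against one record per frame, then merges consecutive matching frames into time intervals. The record layout (`frame`, `class`, `speed`) and the function names are assumptions made for this example; they do not describe Sieve's schema or its query engine.

```python
import operator

# Subset of MongoDB comparison operators, mapped to Python equivalents.
OPS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}


def matches(record, query):
    """Return True if a record satisfies a MongoDB-style filter document."""
    for key, cond in query.items():
        if key == "$and":
            if not all(matches(record, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(record, sub) for sub in cond):
                return False
        else:
            if key not in record:
                return False
            value = record[key]
            if isinstance(cond, dict):
                # Operator form, e.g. {"speed": {"$gt": 2}}.
                if not all(OPS[op](value, arg) for op, arg in cond.items()):
                    return False
            elif value != cond:
                # Shorthand equality, e.g. {"class": "person"}.
                return False
    return True


def matching_intervals(frames, query, fps):
    """Merge consecutive matching frames into (start_s, end_s) intervals.

    Assumes one record per frame with consecutive frame numbers; the end
    time is exclusive (the start of the first non-matching frame).
    """
    intervals = []
    start = prev = None
    for rec in sorted(frames, key=lambda r: r["frame"]):
        hit = matches(rec, query)
        if hit and start is None:
            start = rec["frame"]
        elif not hit and start is not None:
            intervals.append((start / fps, (prev + 1) / fps))
            start = None
        prev = rec["frame"]
    if start is not None:
        intervals.append((start / fps, (prev + 1) / fps))
    return intervals


speeds = [0, 0, 3, 4, 5, 0, 0, 6, 7, 0]
frames = [{"frame": i, "class": "person", "speed": s}
          for i, s in enumerate(speeds)]
query = {"class": "person", "speed": {"$gt": 2}}
print(matching_intervals(frames, query, fps=10))  # [(0.2, 0.5), (0.7, 0.9)]
```

Here the person moves faster than 2 units in frames 2-4 and 7-8, which at 10 fps become the two intervals printed above.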