Twelve Labs

🎉 Twelve Labs raises $50 million Series A led by NVIDIA and NEA

Understanding videos is much more than extracting objects from images

A video contains rich information: movement, objects, sound, on-screen text, and speech. To understand a video in context, an AI must extract all of this information and also capture the relationships between objects and the connections between past and present moments.

How our AI works

Our AI extracts multiple features from video, such as time, objects, on-screen text, conversation, people, and actions, and synthesizes this vast amount of information into vectors. Vectors enable fast, semantic, and scalable search.
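As a rough illustration of why vectors make semantic search possible, here is a minimal sketch of nearest-neighbor search by cosine similarity. It is not the Twelve Labs implementation: the clip names, the three-dimensional embeddings, and the `search` helper are all hypothetical, and a real system would use model-generated embeddings and an approximate nearest-neighbor index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Return the ids of the top_k clips most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), clip_id)
              for clip_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [clip_id for _, clip_id in scored[:top_k]]


# Hypothetical toy embeddings; real ones would come from a video model.
index = {
    "clip_a": [0.9, 0.1, 0.0],
    "clip_b": [0.1, 0.8, 0.1],
    "clip_c": [0.7, 0.3, 0.0],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of a text query

results = search(query, index)
print(results)
```

Because the query and the clips live in the same vector space, ranking reduces to a similarity computation, with no keyword tags involved.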

Why developers and product managers love building with Twelve Labs

Rich Understanding

Powerful AI delivers context-specific search and insights, replacing ineffective keyword tagging.

Multimodal

Search anything within your video: visuals, conversations, logos, and text.

Easy Integration

End-to-end infrastructure to make all of your videos searchable. Start building with just a few API calls.

State-of-the-art Accuracy

The AI models behind Twelve Labs outperform even the strongest open-source and commercial models in both research and industry.

Our Strength

State-of-the-art performance

Ranked #1 in the video retrieval track of the 2021 ICCV VALUE Challenge hosted by Microsoft, outperforming the giants in cost as well as performance.

2021 VALUE Challenge Record (2021.11)

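| Rank | Model | Team | Mean-Rank | Ave-Score | TVR | How2R | YC2R | VATEX-EN |
|------|-------|------|-----------|-----------|-----|-------|------|----------|
| - | ViSeRet (ensemble) | Twelve Labs & KAIST | 1.75 | 35.67 | 14.18 | 7.74 | 62.72 | 58.03 |
| - | craig.starr (ensemble) | Kakao Brain | 2.25 | 35.32 | 15.41 | 6.75 | 66.04 | 53.07 |
| - | hgzjy25 | Tencent OVBU | - | 32.79 | 15.78 | 5.56 | 60.89 | 48.94 |
| - | ViSeRet (single) | Twelve Labs & KAIST | 4.25 | 32.17 | 9.77 | 7.74 | 55.73 | 55.46 |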
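The Ave-Score values in the record above appear to be the arithmetic mean of the four task scores (TVR, How2R, YC2R, VATEX-EN). That relationship is an assumption inferred from the numbers, not something this page states. A short check, using only the figures listed above:

```python
# (reported Ave-Score, [TVR, How2R, YC2R, VATEX-EN]) copied from the record above.
scores = {
    "ViSeRet (ensemble)": (35.67, [14.18, 7.74, 62.72, 58.03]),
    "craig.starr (ensemble)": (35.32, [15.41, 6.75, 66.04, 53.07]),
    "hgzjy25": (32.79, [15.78, 5.56, 60.89, 48.94]),
    "ViSeRet (single)": (32.17, [9.77, 7.74, 55.73, 55.46]),
}

matches = {}
for model, (reported, tasks) in scores.items():
    mean = sum(tasks) / len(tasks)
    # Allow for the reported value being shown to two decimals.
    matches[model] = abs(mean - reported) < 0.01
    print(f"{model}: mean of tasks = {mean:.4f}, reported = {reported}")
```

Under this reading, every reported Ave-Score agrees with the mean of its row to within display precision.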

4th Workshop on Closing the Loop Between Vision and Language

ViSeRet: A simple yet effective approach to moment retrieval via fine-grained video segmentation

Our video-first approach

End-to-end system born to process videos

Semantic search on complex queries

Easy finetuning & domain adaptation

Existing solutions

A collage of image and speech APIs, resulting in a brittle system

Rule/simple tag-based search

No finetuning or domain adaptation functionality

Interested in making your videos searchable?

Next-generation video understanding technology at your fingertips
