Earlier this year, we preannounced that TwelveLabs video understanding models were coming to Amazon Bedrock. Today, we’re announcing that the models are now available for searching through videos, classifying scenes, summarizing, and extracting insights with precision and reliability.
TwelveLabs has introduced Marengo, a video embedding model proficient at performing tasks such as search and classification, and Pegasus, a video language model that can generate text based on video data. These models are trained on Amazon SageMaker HyperPod to deliver groundbreaking video analysis that provides text summaries, metadata generation, and creative optimization.
With the TwelveLabs models in Amazon Bedrock, you can find specific moments using natural language video search capabilities like “show me the first touchdown of the game” or “find the scene where the main characters first meet” and instantly jump to those exact moments. You can also build applications that understand video content by generating descriptive text such as titles, topics, hashtags, summaries, chapters, or highlights for discovering insights and connections without requiring predefined labels or categories.
For example, you can find recurring themes in customer feedback or spot product usage patterns that weren’t apparent before. Whether you have hundreds or thousands of hours of video content, you can now transform that entire library into a searchable knowledge resource while maintaining enterprise-grade security and performance.
Let’s take a look at the Marengo and Pegasus videos that TwelveLabs has published.
You can transform video workflows with these models across industries. Media producers and editors can instantly locate specific scenes or dialogue, which means you can focus on storytelling rather than sifting through hours of footage. Marketing teams are streamlining their advertising workflows by quickly personalizing content to resonate with various audiences, while security teams are using the technology to proactively identify potential risks by spotting patterns across multiple video feeds.
Getting started with TwelveLabs models in Amazon Bedrock
Before getting started, if you’re new to using TwelveLabs models, go to the Amazon Bedrock console and choose Model access in the bottom left navigation pane. To access the latest TwelveLabs models, request access for Marengo Embed 2.7 or Pegasus 1.2 under TwelveLabs in the available Regions.
To use TwelveLabs models in Amazon Bedrock, choose Chat/Text Playground under Test in the left navigation pane. Choose Select model, select TwelveLabs as the category and Pegasus as the model, and then choose Apply.
For searching or generating text from your video, you should upload your video to an Amazon Simple Storage Service (Amazon S3) bucket or input a Base64-formatted video string.
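If your video isn’t in Amazon S3 yet, here’s a minimal sketch of uploading a local file with the AWS SDK for Python (Boto3); the bucket name and object key are placeholders you would replace with your own:

import boto3

# Placeholder bucket and object key - replace with your own values
BUCKET_NAME = "your-video-bucket"
OBJECT_KEY = "videos/sample.mp4"

# Upload the local video file to Amazon S3
s3 = boto3.client("s3")
s3.upload_file("sample.mp4", BUCKET_NAME, OBJECT_KEY)
print(f"Uploaded to s3://{BUCKET_NAME}/{OBJECT_KEY}")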
I’ll use a sample video generated with Amazon Nova Reel from the prompt: “A cute raccoon playing guitar underwater.”
Enter the S3 URI of your uploaded video and the S3 Bucket Owner, and then run your prompt: “Tell me about the video by timeline”.
You can get the following model response:
The video opens with an underwater scene featuring a sandy seabed and swaying seaweed, creating a serene atmosphere. At around 5 seconds, a raccoon appears, swimming towards the camera with a curious expression. By 10 seconds, the raccoon is seen swimming near the seabed. The focus then shifts to a group of koi fish swimming gracefully near the seabed from 11 to 13 seconds.
At 14 seconds, the scene transitions to a ukulele resting on the seabed, with the camera zooming in to highlight its details. The raccoon reappears at 20 seconds, swimming among the koi fish, and by 23 seconds, it begins to play the ukulele. The raccoon continues to play the ukulele from 29 to 41 seconds, with the koi fish swimming around it, adding a lively and colorful touch to the underwater setting.
From 42 to 49 seconds, the raccoon is shown playing the ukulele with the koi fish swimming around it. The video concludes with a screen displaying "Video generated with Amazon Nova Reel 1.1" and "Available in Amazon Bedrock" from 50 to 54 seconds, followed by the AWS logo from 55 to 57 seconds.
The TwelveLabs models can be easily integrated into your applications using the Amazon Bedrock Invoke API, which provides a unified interface for conversational AI interactions.
Here’s an example of how to use the AWS SDK for Python (Boto3) with the TwelveLabs Pegasus model:
import boto3
import json
import base64

AWS_REGION = "us-west-2"
MODEL_ID = "us.twelvelabs.pegasus-1-2-v1:0"  # OR "eu.twelvelabs.pegasus-1-2-v1:0" for cross-Region inference in Europe
VIDEO_PATH = "sample.mp4"

def read_file(file_path: str) -> str:
    """Read a file and return it as a base64-encoded string."""
    try:
        with open(file_path, 'rb') as file:
            file_content = file.read()
        return base64.b64encode(file_content).decode('utf-8')
    except Exception as e:
        raise Exception(f"Error reading file {file_path}: {str(e)}")

bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name=AWS_REGION
)

request_body = {
    "inputPrompt": "tell me about the video",
    "mediaSource": {
        "base64String": read_file(VIDEO_PATH)
    }
}

response = bedrock_runtime.invoke_model(
    modelId=MODEL_ID,
    body=json.dumps(request_body),
    contentType="application/json",
    accept="application/json"
)

response_body = json.loads(response['body'].read())
print(json.dumps(response_body, indent=2))
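If your video is already in Amazon S3, the request body can point at the object instead of sending a Base64 string, which is what the S3 URI and Bucket Owner fields in the console correspond to. As a sketch with placeholder values:

request_body = {
    "inputPrompt": "tell me about the video",
    "mediaSource": {
        "s3Location": {
            "uri": "s3://your-video-bucket/sample.mp4",  # placeholder S3 URI
            "bucketOwner": "111122223333"                # placeholder bucket owner account ID
        }
    }
}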
The TwelveLabs Marengo Embed 2.7 model generates vector embeddings from video, text, audio, or image inputs. These embeddings can be used for similarity search, clustering, and other machine learning (ML) tasks. The model supports asynchronous inference through the Bedrock StartAsyncInvoke API.
For a video source, you can send a request in JSON format to the TwelveLabs Marengo Embed 2.7 model using the StartAsyncInvoke API.
{ "modelId": "twelvelabs.marengo-embed-2-7-v1:0", "modelInput": { "inputType": "video", "mediaSource": { "s3Location": { "uri": "s3://your-video-object-s3-path", "bucketOwner": "your-video-object-s3-bucket-owner-account" } } }, "outputDataConfig": { "s3OutputDataConfig": { "s3Uri": "s3://your-bucket-name" } } }
You can get a response delivered to the specified S3 location.
{ "embedding": [0.345, -0.678, 0.901, ...], "embeddingOption": "visual-text", "startSec": 0.0, "endSec": 5.0 }
To help you get started, check out a broad range of code examples for multiple use cases and a variety of programming languages. To learn more, visit TwelveLabs Pegasus 1.2 and TwelveLabs Marengo Embed 2.7 in the AWS Documentation.
Now available
TwelveLabs models are generally available today in Amazon Bedrock: the Marengo model in the US East (N. Virginia), Europe (Ireland), and Asia Pacific (Seoul) Regions, and the Pegasus model in the US West (Oregon) and Europe (Ireland) Regions, accessible with cross-Region inference from US and Europe Regions. Check the full Region list for future updates. To learn more, visit the TwelveLabs in Amazon Bedrock product page and the Amazon Bedrock pricing page.
Give TwelveLabs models a try in the Amazon Bedrock console today, and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.
— Channy
Updated on July 16, 2025 – Revised the screenshots and code part.