|
Many applications need to interact with content available through different modalities. Some of these applications process complex documents, such as insurance claims and medical bills. Mobile apps need to analyze user-generated media. Organizations need to build a semantic index on top of their digital assets that include documents, images, audio, and video files. However, getting insights from unstructured multimodal content is not easy to set up: you have to implement processing pipelines for the different data formats and go through multiple steps to get the information you need. That usually means having multiple models in production, for which you have to handle cost optimizations (through fine-tuning and prompt engineering), safeguards (for example, against hallucinations), integrations with the target applications (including data formats), and model updates.
To make this process easier, we introduced in preview at AWS re:Invent Amazon Bedrock Data Automation, a capability of Amazon Bedrock that streamlines the generation of valuable insights from unstructured, multimodal content such as documents, images, audio, and videos. With Bedrock Data Automation, you can reduce the development time and effort to build intelligent document processing, media analysis, and other multimodal data-centric solutions.
You can use Bedrock Data Automation as a standalone feature or as a parser for Amazon Bedrock Knowledge Bases to index insights from multimodal content and provide more relevant responses for Retrieval-Augmented Generation (RAG).
Today, Bedrock Data Automation is generally available with support for cross-region inference endpoints, so that it can be used in more AWS Regions and seamlessly use compute across different locations. Based on your feedback during the preview, we also improved accuracy and added support for logo recognition in images and videos.
Let’s take a look at how it works in practice.
Using Amazon Bedrock Data Automation with cross-region inference endpoints
The blog post published for the Bedrock Data Automation preview shows how to use the visual demo in the Amazon Bedrock console to extract information from documents and videos. I recommend you go through the console demo experience to understand how this capability works and what you can do to customize it. For this post, I focus more on how Bedrock Data Automation works in your applications, starting with a few steps in the console and following with code samples.
The Data Automation section of the Amazon Bedrock console now asks for confirmation to enable cross-region support the first time you access it.
From an API perspective, the InvokeDataAutomationAsync operation now requires an additional parameter (dataAutomationProfileArn) to specify the data automation profile to use. The value for this parameter depends on the Region and your AWS account ID:
arn:aws:bedrock:<REGION>:<ACCOUNT_ID>:data-automation-profile/us.data-automation-v1
Also, the dataAutomationArn parameter has been renamed to dataAutomationProjectArn to better reflect that it contains the project Amazon Resource Name (ARN). When invoking Bedrock Data Automation, you now need to specify a project or a blueprint to use. If you pass in blueprints, you will get custom output. To continue getting the standard default output, configure the dataAutomationProjectArn parameter to use arn:aws:bedrock:<REGION>:aws:data-automation-project/public-default.
As the name suggests, the InvokeDataAutomationAsync operation is asynchronous. You pass the input and output configuration and, when the result is ready, it is written to an Amazon Simple Storage Service (Amazon S3) bucket as specified in the output configuration. You can receive Amazon EventBridge notifications from Bedrock Data Automation using the notificationConfiguration parameter.
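For example, here is a minimal sketch of an invocation that requests the standard default output and enables EventBridge notifications. The bucket, Region, and file names are placeholders, and the shape of notificationConfiguration is my reading of the Boto3 API reference, so verify it against the current documentation:

import boto3

REGION = '<REGION>'

bda = boto3.client('bedrock-data-automation-runtime', region_name=REGION)
account_id = boto3.client('sts').get_caller_identity()['Account']

# Use the public default project to get standard output and
# ask for an EventBridge notification when the job completes.
response = bda.invoke_data_automation_async(
    inputConfiguration={'s3Uri': 's3://<BUCKET>/BDA/Input/document.pdf'},
    outputConfiguration={'s3Uri': 's3://<BUCKET>/BDA/Output'},
    dataAutomationConfiguration={
        'dataAutomationProjectArn': f'arn:aws:bedrock:{REGION}:aws:data-automation-project/public-default'
    },
    dataAutomationProfileArn=f'arn:aws:bedrock:{REGION}:{account_id}:data-automation-profile/us.data-automation-v1',
    notificationConfiguration={
        'eventBridgeConfiguration': {'eventBridgeEnabled': True}
    }
)
print(response['invocationArn'])  # poll get_data_automation_status or wait for the event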
With Bedrock Data Automation, you can configure outputs in two ways:
- Standard output delivers predefined insights relevant to a data type, such as document semantics, video chapter summaries, and audio transcripts. With standard outputs, you can set up your desired insights in just a few steps.
- Custom output lets you use blueprints to specify your extraction needs for more tailored insights.
To see the new capabilities in action, I create a project and customize the standard output settings. For documents, I choose plain text instead of markdown. Note that you can automate these configuration steps using the Bedrock Data Automation API, as sketched below.
For videos, I want a full audio transcript and a summary of the entire video. I also ask for a summary of each chapter.
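As a minimal sketch of what that automation could look like, the following uses the build-time bedrock-data-automation client. The nested shape of standardOutputConfiguration is my assumption based on the Boto3 API reference, so check the current documentation before using it:

import boto3

# Build-time client for project and blueprint management
# (distinct from the bedrock-data-automation-runtime client that runs jobs).
bda_build = boto3.client('bedrock-data-automation', region_name='<REGION>')

# Assumption: configuration field names follow the Boto3 API reference.
response = bda_build.create_data_automation_project(
    projectName='my-multimodal-project',
    standardOutputConfiguration={
        'document': {
            'extraction': {
                'granularity': {'types': ['DOCUMENT', 'PAGE']},
                'boundingBox': {'state': 'DISABLED'}
            },
            'outputFormat': {
                'textFormat': {'types': ['PLAIN_TEXT']},  # plain text instead of markdown
                'additionalFileFormat': {'state': 'DISABLED'}
            },
            'generativeField': {'state': 'ENABLED'}
        },
        'video': {
            'extraction': {
                'category': {'state': 'ENABLED', 'types': ['TRANSCRIPT']},  # full audio transcript
                'boundingBox': {'state': 'DISABLED'}
            },
            'generativeField': {
                'state': 'ENABLED',
                'types': ['VIDEO_SUMMARY', 'CHAPTER_SUMMARY']  # whole-video and per-chapter summaries
            }
        }
    }
)
print(response['projectArn'])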
To configure a blueprint, I choose Custom output setup in the Data automation section of the Amazon Bedrock console navigation pane. There, I look for the US-Driver-License sample blueprint. You can browse the other sample blueprints for more examples and ideas.
Sample blueprints can't be edited, so I use the Actions menu to duplicate the blueprint and add it to my project. There, I can fine-tune the data to be extracted by modifying the blueprint and adding custom fields that can use generative AI to extract or compute data in the format I need.
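If you prefer to script this step too, one plausible approach is to read the sample blueprint's schema and create an editable copy from it. This is a sketch assuming get_blueprint and create_blueprint behave as described in the Boto3 API reference, and <SAMPLE_BLUEPRINT_ARN> is a placeholder:

import boto3

bda_build = boto3.client('bedrock-data-automation', region_name='<REGION>')

# Read the sample blueprint (its ARN is shown in the console).
sample = bda_build.get_blueprint(blueprintArn='<SAMPLE_BLUEPRINT_ARN>')['blueprint']

# Create an editable copy of the sample schema under a new name;
# custom fields can then be added to the copied schema.
copy = bda_build.create_blueprint(
    blueprintName='US-Driver-License-demo',
    type=sample['type'],     # e.g., 'DOCUMENT'
    schema=sample['schema']  # JSON schema of the fields to extract
)
print(copy['blueprint']['blueprintArn'])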
I upload the image of a US driver's license to an S3 bucket. Then, I use this sample Python script, which uses Bedrock Data Automation through the AWS SDK for Python (Boto3) to extract text information from the image:
import json
import sys
import time

import boto3

DEBUG = False

AWS_REGION = '<REGION>'
BUCKET_NAME = '<BUCKET>'
INPUT_PATH = 'BDA/Input'
OUTPUT_PATH = 'BDA/Output'

PROJECT_ID = '<PROJECT_ID>'
BLUEPRINT_NAME = 'US-Driver-License-demo'

# Fields to display
BLUEPRINT_FIELDS = [
    'NAME_DETAILS/FIRST_NAME',
    'NAME_DETAILS/MIDDLE_NAME',
    'NAME_DETAILS/LAST_NAME',
    'DATE_OF_BIRTH',
    'DATE_OF_ISSUE',
    'EXPIRATION_DATE'
]

# AWS SDK for Python (Boto3) clients
bda = boto3.client('bedrock-data-automation-runtime', region_name=AWS_REGION)
s3 = boto3.client('s3', region_name=AWS_REGION)
sts = boto3.client('sts')


def log(data):
    if DEBUG:
        if type(data) is dict:
            text = json.dumps(data, indent=4)
        else:
            text = str(data)
        print(text)


def get_aws_account_id() -> str:
    return sts.get_caller_identity().get('Account')


def get_json_object_from_s3_uri(s3_uri) -> dict:
    s3_uri_split = s3_uri.split('/')
    bucket = s3_uri_split[2]
    key = '/'.join(s3_uri_split[3:])
    object_content = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
    return json.loads(object_content)


def invoke_data_automation(input_s3_uri, output_s3_uri, data_automation_arn, aws_account_id) -> dict:
    params = {
        'inputConfiguration': {
            's3Uri': input_s3_uri
        },
        'outputConfiguration': {
            's3Uri': output_s3_uri
        },
        'dataAutomationConfiguration': {
            'dataAutomationProjectArn': data_automation_arn
        },
        'dataAutomationProfileArn': f"arn:aws:bedrock:{AWS_REGION}:{aws_account_id}:data-automation-profile/us.data-automation-v1"
    }

    response = bda.invoke_data_automation_async(**params)
    log(response)

    return response


def wait_for_data_automation_to_complete(invocation_arn, loop_time_in_seconds=1) -> dict:
    while True:
        response = bda.get_data_automation_status(
            invocationArn=invocation_arn
        )
        status = response['status']
        if status not in ('Created', 'InProgress'):
            print(f" {status}")
            return response
        print(".", end='', flush=True)
        time.sleep(loop_time_in_seconds)


def print_document_results(standard_output_result):
    print(f"Number of pages: {standard_output_result['metadata']['number_of_pages']}")
    for page in standard_output_result['pages']:
        print(f"- Page {page['page_index']}")
        if 'text' in page['representation']:
            print(f"{page['representation']['text']}")
        if 'markdown' in page['representation']:
            print(f"{page['representation']['markdown']}")


def print_video_results(standard_output_result):
    print(f"Duration: {standard_output_result['metadata']['duration_millis']} ms")
    print(f"Summary: {standard_output_result['video']['summary']}")
    statistics = standard_output_result['statistics']
    print("Statistics:")
    print(f"- Speaker count: {statistics['speaker_count']}")
    print(f"- Chapter count: {statistics['chapter_count']}")
    print(f"- Shot count: {statistics['shot_count']}")
    for chapter in standard_output_result['chapters']:
        print(f"Chapter {chapter['chapter_index']} {chapter['start_timecode_smpte']}-{chapter['end_timecode_smpte']} ({chapter['duration_millis']} ms)")
        if 'summary' in chapter:
            print(f"- Chapter summary: {chapter['summary']}")


def print_custom_results(custom_output_result):
    matched_blueprint_name = custom_output_result['matched_blueprint']['name']
    log(custom_output_result)
    print('\n- Custom output')
    print(f"Matched blueprint: {matched_blueprint_name} Confidence: {custom_output_result['matched_blueprint']['confidence']}")
    print(f"Document class: {custom_output_result['document_class']['type']}")
    if matched_blueprint_name == BLUEPRINT_NAME:
        print('\n- Fields')
        for field_with_group in BLUEPRINT_FIELDS:
            print_field(field_with_group, custom_output_result)


def print_results(job_metadata_s3_uri) -> None:
    job_metadata = get_json_object_from_s3_uri(job_metadata_s3_uri)
    log(job_metadata)

    for segment in job_metadata['output_metadata']:
        asset_id = segment['asset_id']
        print(f'\nAsset ID: {asset_id}')

        for segment_metadata in segment['segment_metadata']:
            # Standard output
            standard_output_path = segment_metadata['standard_output_path']
            standard_output_result = get_json_object_from_s3_uri(standard_output_path)
            log(standard_output_result)
            print('\n- Standard output')
            semantic_modality = standard_output_result['metadata']['semantic_modality']
            print(f"Semantic modality: {semantic_modality}")
            match semantic_modality:
                case 'DOCUMENT':
                    print_document_results(standard_output_result)
                case 'VIDEO':
                    print_video_results(standard_output_result)

            # Custom output
            if 'custom_output_status' in segment_metadata and segment_metadata['custom_output_status'] == 'MATCH':
                custom_output_path = segment_metadata['custom_output_path']
                custom_output_result = get_json_object_from_s3_uri(custom_output_path)
                print_custom_results(custom_output_result)


def print_field(field_with_group, custom_output_result) -> None:
    inference_result = custom_output_result['inference_result']
    explainability_info = custom_output_result['explainability_info'][0]
    if '/' in field_with_group:
        # For fields that are part of a group
        (group, field) = field_with_group.split('/')
        inference_result = inference_result[group]
        explainability_info = explainability_info[group]
    else:
        field = field_with_group
    value = inference_result[field]
    confidence = explainability_info[field]['confidence']
    print(f"{field}: {value or '<EMPTY>'} Confidence: {confidence}")


def main() -> None:
    if len(sys.argv) < 2:
        print("Please provide a filename as command line argument")
        sys.exit(1)
    file_name = sys.argv[1]

    aws_account_id = get_aws_account_id()
    input_s3_uri = f"s3://{BUCKET_NAME}/{INPUT_PATH}/{file_name}"  # File
    output_s3_uri = f"s3://{BUCKET_NAME}/{OUTPUT_PATH}"  # Folder
    data_automation_arn = f"arn:aws:bedrock:{AWS_REGION}:{aws_account_id}:data-automation-project/{PROJECT_ID}"

    print(f"Invoking Bedrock Data Automation for '{file_name}'", end='', flush=True)
    data_automation_response = invoke_data_automation(input_s3_uri, output_s3_uri, data_automation_arn, aws_account_id)
    data_automation_status = wait_for_data_automation_to_complete(data_automation_response['invocationArn'])

    if data_automation_status['status'] == 'Success':
        job_metadata_s3_uri = data_automation_status['outputConfiguration']['s3Uri']
        print_results(job_metadata_s3_uri)


if __name__ == "__main__":
    main()
The initial configuration in the script includes the name of the S3 bucket to use for input and output, the location of the input file in the bucket, the output path for the results, the project ID to apply custom output with Bedrock Data Automation, and the blueprint fields to show in the output.
I run the script, passing the name of the input file. In the output, I see the information extracted by Bedrock Data Automation. The US-Driver-License blueprint is a match, and the name and dates from the driver's license are printed in the output.
As expected, I see in the output the information I selected from the blueprint associated with the Bedrock Data Automation project.
Similarly, I run the same script on a video file from my colleague Mike Chambers. To keep the output small, I don't print the full audio transcript or the text displayed in the video.
Things to know
Amazon Bedrock Data Automation is now available via cross-region inference in the following two AWS Regions: US East (N. Virginia) and US West (Oregon). When using Bedrock Data Automation from those Regions, data can be processed using cross-region inference in any of these four Regions: US East (Ohio, N. Virginia) and US West (N. California, Oregon). All these Regions are in the US, so data is processed within the same geography. We're working to add support for more Regions in Europe and Asia later in 2025.
There's no change in pricing compared to the preview, including when using cross-region inference. For more information, visit Amazon Bedrock pricing.
Bedrock Data Automation now also includes a number of security, governance, and manageability capabilities, such as support for AWS Key Management Service (AWS KMS) customer managed keys for granular encryption control, AWS PrivateLink to connect directly to the Bedrock Data Automation API in your virtual private cloud (VPC) instead of connecting over the internet, and tagging of Bedrock Data Automation resources and jobs to track costs and enforce tag-based access policies in AWS Identity and Access Management (IAM).
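As an illustration, here is a sketch of how the encryption and tagging options might be passed on an invocation. The encryptionConfiguration and tags parameters are assumptions based on my reading of the Boto3 API reference, so verify them against the current documentation before use:

import boto3

bda = boto3.client('bedrock-data-automation-runtime', region_name='<REGION>')

# Assumption: encryptionConfiguration and tags are accepted by
# invoke_data_automation_async as described in the API reference.
response = bda.invoke_data_automation_async(
    inputConfiguration={'s3Uri': 's3://<BUCKET>/BDA/Input/claim.pdf'},
    outputConfiguration={'s3Uri': 's3://<BUCKET>/BDA/Output'},
    dataAutomationConfiguration={
        'dataAutomationProjectArn': 'arn:aws:bedrock:<REGION>:<ACCOUNT_ID>:data-automation-project/<PROJECT_ID>'
    },
    dataAutomationProfileArn='arn:aws:bedrock:<REGION>:<ACCOUNT_ID>:data-automation-profile/us.data-automation-v1',
    # Encrypt job output with a customer managed AWS KMS key.
    encryptionConfiguration={'kmsKeyId': '<KMS_KEY_ARN>'},
    # Tag the job for cost tracking and tag-based IAM access policies.
    tags=[{'key': 'project', 'value': 'claims-processing'}]
)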
I used Python in this blog post, but Bedrock Data Automation is available with any AWS SDK. For example, you can use Java, .NET, or Rust for a backend document processing application; JavaScript for a web app that processes images, videos, or audio files; and Swift for a native mobile app that processes content provided by end users. It's never been so easy to get insights from multimodal data.
Here are a few reading suggestions to learn more (including code samples):
– Danilo