Great Expectations can use S3, RDS, and Redshift, among other AWS options, as data sources. Typically, you assume an IAM role (the AssumeRole operation) when you’re connecting to and using AWS services.
GX doesn’t currently allow you to specify an AWS role to assume directly in your Datasource configuration. But with a few lines of code, you can quickly assume a role and pass its credentials to the relevant GX functions.
This approach uses Boto3, which is the AWS SDK package for creating and managing AWS services in Python.
First, if Boto3 is not already installed in your environment, install it:
pip install boto3
Once Boto3 is installed, import it into the file you’re working with:
import boto3
Establish a Boto3 session:
session = boto3.Session()
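Assume the role you need by calling assume_role on an STS client created from the session. RoleSessionName is required; the value here is only an example:
response = session.client("sts").assume_role(RoleArn="<your role ARN>", RoleSessionName="gx-session")
Get the temporary credentials for that role from the response:
creds = response["Credentials"]
Pass the credentials to get_context. They come back as a dictionary, so read them by key:
{"access_key_id": creds["AccessKeyId"],
 "secret_access_key": creds["SecretAccessKey"],
 "session_token": creds["SessionToken"]}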
Now that you can get the data context with the assumed role, you may proceed in your GX workflow as normal.
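The steps above can be gathered into one small helper. This is a minimal sketch rather than an official GX or Boto3 recipe: the function names assume_role_credentials and credentials_from_default_session and the default session name "gx-session" are hypothetical, and the returned dictionary simply mirrors the key names used in this post. The helper takes the STS client as an argument so it can be tried out with a stand-in client before pointing it at a real AWS account.

```python
from typing import Any, Dict


def assume_role_credentials(
    sts_client: Any,
    role_arn: str,
    session_name: str = "gx-session",  # hypothetical default, pick your own
) -> Dict[str, str]:
    """Assume role_arn via STS and return its temporary credentials.

    sts_client is expected to behave like boto3.client("sts").
    """
    # AssumeRole requires both the role ARN and a session name.
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    # The temporary credentials are a plain dict inside the response.
    creds = response["Credentials"]
    # Key names on the left follow the dictionary shown in this post.
    return {
        "access_key_id": creds["AccessKeyId"],
        "secret_access_key": creds["SecretAccessKey"],
        "session_token": creds["SessionToken"],
    }


def credentials_from_default_session(role_arn: str) -> Dict[str, str]:
    """Convenience wrapper that builds the STS client from a default session."""
    import boto3  # imported here so the helper above works without boto3

    sts_client = boto3.Session().client("sts")
    return assume_role_credentials(sts_client, role_arn)
```

Calling credentials_from_default_session("<your role ARN>") returns the dictionary from the last step. Keep in mind that assumed-role credentials are temporary, so a long-running job may need to call the helper again once they expire.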
Note: The Boto3 code snippets provided here were accurate as of publication, but if needed, you can refer to the latest Boto3 documentation.
Thanks to community member Polina (@shpolina) from our Slack community for contributing this solution!
Great Expectations is part of an increasingly flexible and powerful modern data ecosystem. This is just one example of the ways in which Great Expectations is able to give you greater control of your data quality processes within that ecosystem.
We’re committed to supporting and growing the community around Great Expectations. It’s not enough to build a great platform; we want to build a great community as well. Join our public Slack channel here, find us on GitHub, sign up for one of our weekly cloud workshops, or head to https://greatexpectations.io/ to learn more.