Alright, folks. This release is big, but we’re starting quiet. It’s a crescendo.
Our major goals for this release were to:
- Make it much, much easier to develop custom Expectations and fully integrate them into the Great Expectations ecosystem
- Make it much, much easier for teams to get started with Great Expectations when deploying against materialized data, such as data warehouses, data lakes, log files, etc.
We’re very excited to roll these changes out to the community and see what you bring to them. As per usual, we're releasing these features at the earliest possible moment, knowing that there will be rough edges and more learning to do. We're counting on feedback from the community to identify those rough edges and help guide the next round of development.
First and foremost, 0.13.0 completes a refactor that’s been a long time coming: modular Expectations. The new Expectation class now includes all the logic for creating an Expectation and integrating it with the other moving parts of Great Expectations: execution on multiple platforms, documentation, rendering, profiling, etc. It’s actually quite a lot. Gathering all these parts into a single class makes the experience of developing Expectations much better, which will in turn make the Great Expectations ecosystem much richer and more expressive.
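To make the "single class" idea concrete, here is a minimal toy sketch in plain Python. It does not use the Great Expectations API: `ToyExpectation`, `validate`, and `render` are hypothetical names invented for illustration, and the real Expectation class covers far more (multiple execution platforms, profiling, and so on). It only shows the shape of the design, with validation logic and human-readable rendering living side by side.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ToyExpectation:
    """Hypothetical stand-in for a modular Expectation (not the real API)."""

    column: str
    min_value: float
    max_value: float

    def validate(self, rows: List[Dict[str, Any]]) -> Dict[str, Any]:
        # Execution logic: collect values that fall outside the allowed range.
        values = [row[self.column] for row in rows]
        unexpected = [
            v for v in values if not (self.min_value <= v <= self.max_value)
        ]
        return {
            "success": not unexpected,
            "unexpected_count": len(unexpected),
            "unexpected_values": unexpected,
        }

    def render(self) -> str:
        # Documentation logic: describe the same rule in plain language.
        return (
            f"Values in column '{self.column}' must be between "
            f"{self.min_value} and {self.max_value}."
        )


# Usage: one object answers both "did the data pass?" and "what was the rule?"
rows = [{"age": 31}, {"age": 47}, {"age": 130}]
expectation = ToyExpectation(column="age", min_value=0, max_value=120)
result = expectation.validate(rows)
description = expectation.render()
```

Because the rule's parameters are defined once and shared by both methods, the check and its description cannot drift apart, which is the main appeal of keeping everything in one class.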
To learn how to use Modular Expectations, check out:
- How to create custom Expectations
- How to create a parameterized custom Expectation super, super fast.
- How to make a new custom Expectation renderable to Data Docs as a string, table row, or graph.
0.13.0 also introduces better internals for Datasources. The immediate payoff for these new-style Datasources is a much better experience around deploying and maintaining Great Expectations in production. We’ve also made some serious improvements to performance, especially when deploying Great Expectations against the kinds of large datasets that are common in data warehouses, data lakes, log files, etc.
You can read more about these changes in the reference documentation for Datasources. After we’ve had a little time to harden them, these classes will replace legacy Datasources and BatchKwargGenerators as the preferred method of connecting to data in Great Expectations.
- How-to guide: How to choose which DataConnector to use
- How to configure DataContext components using test_yaml_config
- The second tab under any of these how-to guides:
- How-to guide: How to configure a MSSQL Datasource
- How-to guide: How to configure a MySQL Datasource
- How-to guide: How to configure a Pandas/filesystem Datasource
- How-to guide: How to configure a Pandas/S3 Datasource
- How-to guide: How to configure a Redshift Datasource
- How-to guide: How to configure a Snowflake Datasource
- How-to guide: How to configure a Spark/filesystem Datasource
We'll be showing off some of these new workflows in a community webinar next Thursday. If you're interested, please register here: Webinar 0.13.0 New Features: New Capabilities in New-Style Data Sources - Dec 10th @ 3:30pm ET
What does “minimally breaking” mean?
Great Expectations v0.13.0 is a minimally breaking major release. It makes some powerful and long-awaited changes to the library, but they’re mostly under the hood.
- If you are already a user of Great Expectations, you will most likely be able to change versions without any issues, then migrate your existing code and configs to use new features on your preferred timeline.
- If you are a new user, you will be able to take advantage of new features right away.
For the time being, we’re labeling the new features as “experimental.” We’ve vetted them thoroughly against questions from the community Slack channel and against GitHub issues and Discussions. We’re confident that they represent a big step forward.
...and like all new abstractions, they’ll likely need some hardening to work perfectly in all cases. Therefore: we’re proceeding cautiously with an experimental release that allows you to continue to use the old versions, even though we don’t think you’ll want to for very long.
If you’d like hands-on help getting started with 0.13, you can sign up for a time with us here. We would love to work with you.
We’re very pleased with all of the progress in 0.13.0. It consolidates a huge amount of community learning, and strengthens the core of the project in some very important ways. It’s a great example of how open source creates a symbiotic learning cycle between code and community.
There’s a lot more to come on this front---expect a fast release cadence over the next several months.
If you’d like to participate in development, there are three specific areas where we’d like to enlist help from the community.
First, specific feedback on the new workflows and abstractions. Each time we’ve introduced new concepts into the Great Expectations ecosystem, it has taken a while to create a “curriculum” to teach new users how to get the most out of them. This time around, we’d love to shorten the learning curve. Our best tool for doing that is being in the room (virtually) with data teams as they try out the new abstractions for the first time. The teams get immediate answers to their questions; we make sure that those answers get back into shared resources for the whole community: how-to guides, tutorials, webinars, and so on. If you’re up for that kind of hands-on tutorial, as we said above, you can sign up for a time to work with us here.
Second, extending the library of Expectations to be more expressive. If you’ve created custom Expectations that others could use, but haven’t contributed them back to the open source library, we’d love to work with you to share them. If you’ve been thinking about creating new Expectations, but haven’t known how to get there, we’d love to work with you, too. The range of Expectations that can be created and shared is enormous, and we’re excited to start building them together. Let us know here what Expectations you'd like to see.
Third, developing more powerful data Profilers. Profilers have been in alpha for almost a year, and we’ve been steadily collecting feedback and ideas for improvements. We’ve landed on an approach that creates lots of scope for automation while still leaving developers in control. We’re looking for a handful of design partners to work closely with us, to make certain we’ve nailed the abstractions and experience. If you're interested in being a design partner for Profilers, let us know here.
If you’d like to be involved in any of these areas, or have other ideas for improving the library, please make yourself heard by filling out the surveys above, or jump into our Slack and ping a Great Expectations Core member. To keep up with updates and blog posts, sign up for our newsletter.
As always, thanks for your support, feedback, and contributions. It’s a pleasure to work on this project with you.