Community roundup: January 2023

We demoed new functionality, reviewed the product roadmap, and welcomed new team members at the Great Expectations January community meetup.

We gather as a community on the third Tuesday of every month. Sign up here to join the next one!

This month, we covered

Community stats, thanks, and kudos
A GX product roadmap update
Introducing Suzie Antal, senior product manager at GX
How to include primary keys in Validation Results, a new feature!

You can watch the complete recording below.

Thanks and kudos

The GX community is a huge part of our success, especially the people who contribute to the platform. Thank you to everyone who contributed on GitHub in December and January!

Our Slack supporters are key to our community’s ecosystem, especially during those times when GX’s developer relations team can’t be available.

Kudos to the top Slack supporters this month: Thiago Militino, Adarsha, Aleksei Chumagin, Amauri, Han Siong, Veronica Moi, Aravind Narayanan, Philip Fürste, and Dimitrios Truchan.

Roadmap update

Our core themes for the GX roadmap are usability, community, capability, and quality.

Here’s how those showed up in recent work on the GX platform:

Data Docs can now include failed rows! Watch a demo of this new ID/PK functionality.
We’ve improved the code quality of DataContexts, which will make maintenance and future work easier.
New API documentation uses Sphinx, so you’ll see the functionality you know and love from other Python projects coming to our API docs! Check out the progress here.
We’ve done a lot of cleanup on the documentation and tests for our core Expectations so that they’re a better example for community-contributed Expectations.
New integration guides are incoming: our integration guide for AWS S3 and Pandas is now available, with additional guides for Spark, Athena, and Redshift arriving within the next couple of weeks.
We now support Python 3.10!

As a reminder: we have updated our workflows to help us better respond to community PRs! As part of this new routine, PRs that go more than five days without any update may be closed. Do not hesitate to reopen your PR if it was closed due to inactivity. We completely understand that everyone has different availability.

Welcome

We are very excited to introduce Suzie Antal as the senior product manager for GX Open Source!

Suzie has firsthand knowledge of the importance of data quality from her work in analytics. She’s looking forward to learning more about the community’s needs and pain points, so don’t hesitate to reach out to her on Slack @Suzie A.

Feature demo

Will Shin, a GX software engineer, showed off the platform’s new ID/PK feature, which returns the index of any rows that failed an Expectation.

You can watch that demo below, or read on for a summary.

Background

ID/PK is a great example of a GX feature that was able to come to fruition because of the community.

Before ID/PK, Expectation Validation Results identified what was wrong with the data, but not where: there was no way to identify which particular rows failed the Expectation in most cases. The exception was Pandas users, who could use unexpected_index_list to create a list of index numbers using the default Pandas index.

The initial request for ID/PK came in a GitHub issue opened by KentonParton. In the following discussion, many community members contributed valuable insight as needs for this feature were fleshed out.

Community member Aidan Fennessy undertook the initial work on this feature, implementing ID/PK for Pandas, before passing the baton to the GX team to finish implementation. Thanks, Aidan and Kenton!

What ID/PK does

With ID/PK, you can specify the primary keys using unexpected_index_column_names in the result_format of a Checkpoint. The keys for rows that fail the Checkpoint’s Expectations will be included in the Unexpected Value Count table.
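
As a rough sketch of the configuration described above, a result_format dictionary might look something like the following. The column name pk_column is a hypothetical placeholder, and the way this dictionary is attached to a Checkpoint can differ between GX versions, so treat this as an illustration rather than a definitive configuration.

```python
# Sketch of a result_format dictionary for a Checkpoint.
# "pk_column" is a hypothetical column name: replace it with the
# column that uniquely identifies rows in your own data.
result_format = {
    # COMPLETE asks for the most detailed Validation Results.
    "result_format": "COMPLETE",
    # Columns whose values identify each row that fails an Expectation.
    "unexpected_index_column_names": ["pk_column"],
}

print(result_format["unexpected_index_column_names"])
```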

In addition to those keys, the output now includes by default a query that lets you retrieve all the rows that failed the Expectation. This output will vary slightly depending on whether you’re using Pandas, SQL, or Spark.

ID/PK also allows you to include multiple index column names. For details, watch the demo!
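
To make the idea of multiple index columns concrete, here is a small conceptual sketch in plain Python. It is not how GX implements ID/PK, and the names failing_row_keys, order_id, region, and amount are all made up for illustration. It only shows what it means to report the key values of failing rows when the key spans more than one column.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def failing_row_keys(
    rows: List[Row],
    key_columns: List[str],
    passes: Callable[[Row], bool],
) -> List[Dict[str, Any]]:
    """Return the key-column values of every row that fails the check.

    Hypothetical helper for illustration only; not part of GX.
    """
    return [
        {column: row[column] for column in key_columns}
        for row in rows
        if not passes(row)
    ]


# Sample data where order_id alone is not unique, so the key is
# the combination of order_id and region.
rows = [
    {"order_id": 1, "region": "EU", "amount": 25},
    {"order_id": 2, "region": "EU", "amount": -3},
    {"order_id": 2, "region": "US", "amount": 40},
    {"order_id": 3, "region": "US", "amount": None},
]


def amount_is_non_negative(row: Row) -> bool:
    # A missing amount counts as a failure in this sketch.
    amount = row["amount"]
    return amount is not None and amount >= 0


failed = failing_row_keys(rows, ["order_id", "region"], amount_is_non_negative)
print(failed)
# [{'order_id': 2, 'region': 'EU'}, {'order_id': 3, 'region': 'US'}]
```

With a single key column, the second row would be ambiguous because two rows share order_id 2. Including region in the key pins down exactly which row failed.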

Join the conversation

Mehul Batra is looking for insight into what languages people prefer for creating CLIs.
We want to know: what behavior would you like to see in Data Docs?
Deepa KP is looking for suggestions about moving records to different Snowflake tables depending on whether they pass or fail Expectations.
Fraser shared a 10-minute walkthrough of the newest features in Dagster.
Monica Miller dropped the link to register for Datanova, a free virtual data conference.

Additional updates

Next month, we’re meeting on February 21: get your invite here.
GX had a lot of great community contributions this month, on top of the new features discussed above!
Meet Josh Zheng, GX’s new director of developer relations.
Check out our tips on getting started with logging, the first pillar of observability.
GX is a great place to work, and here are some reasons why. And…
We’re hiring engineers and a developer advocate! Check out our open roles.

Have you done something cool with Great Expectations that you'd like to share? If you're interested in demoing or have a piece of data quality content that you'd like us to feature, DM @Kyle Eaton on our Slack.

January's community roundup features a demo of how to get the key of rows that fail Expectations, a roadmap update, and more!

Erin Kapp
