Community roundup: July 2023

Featuring multi-Batch Checkpoint configuration, the new GX Discourse forum, and more

Erin Kapp
July 20, 2023

At this month’s meetup, we:

  • Met Mollie Pettit, GX’s new senior community product manager!

  • Got a tour of the brand-new GX Discourse forum

  • Learned how to do multi-Batch Checkpoints efficiently

We also heard about work by this month’s contributors, the latest on the product roadmap, and more.

Watch the recording:

Sign up here to join the next one! 

Thanks and kudos

Our Slack supporters are an indispensable part of keeping the GX community vibrant! Kudos to all our top Slack supporters for July:

July 2023 top Slack supporters

And our GitHub contributors do great work every month. We want to especially recognize these contributors for July:

July 2023 featured GitHub contributors
  • You can connect to Azure Blob Store, Google Cloud Storage, and Amazon S3 more easily using Fluent Datasources now, thanks to Toivo Mattila’s work adding recursive file discovery!

  • expect_day_sum_to_be_close_to_equivalent_week_day_mean gets another useful update from Hadas Manor.

  • expect_queried_column_pair_values_to_be_both_filled_or_null makes its debut: thanks for this new column pair Expectation, Eden Omardeker!

  • New contributor workflow documentation was created by Christian Bromann. Contributors writing for contributors adds a whole other level of insight to those docs: thank you!

Welcome, Mollie!

We are incredibly excited to welcome Mollie Pettit as our new senior community product manager!

Mollie began her professional career in geoscience before moving into data science and then data visualization engineering. She’s always been drawn to fostering communities: one major example is co-founding the Data Visualization Society in 2019 and then running its global conference for two years.

Developer relations allows Mollie to combine her technical skills with community development. She officially joined GX full-time this month and is looking forward to getting to know the community!

You can reach her on the GX Slack as @Mollie Pettit and on Discourse @mollie.pettit.

We have Discourse

GX’s new Discourse forum has launched!

Slack is a great space for a lot of communications, but it falls short as a knowledge base. Most significantly, it only keeps messages for 90 days on its free plan. But even if message retention were longer, those messages wouldn’t be publicly searchable outside the app, which makes it much harder for users to find answers to their questions.

To address both of those issues, we’ve revamped the GX Discourse forum to be a more welcoming and effective place for Q&A! Slack will still be a place for community members to connect in a more social setting, about basically anything that isn’t a Q&A support-type question.

For current Slack users, we’ve established a pipeline to help ease the transition:

Discourse questions are automatically cross-posted to Slack. You can post in Discourse and be sure that everyone will see it, even if they’re sticking with Slack. Responses to the Slack thread will be cross-posted back to the Discourse.

In the meetup, Mollie gave a tour of the new setup:

Check out the Discourse at

Roadmap update

ICYMI: 🎉 Support for SQLAlchemy v2 and Pandas v2 is now live! 🎉

We’ve also made a number of improvements to GX Cloud. Most recently, we implemented a way to create and edit Expectations entirely in the UI and added historical charts for viewing Validation Results over time. 

If you’d like to check out Cloud yourself, sign up for the Beta here!

Exploring the API walkthrough

Developer advocate Haebichan Jung walked us through some flowcharts that help you explore the GX API.

If you have questions about the flowchart, @Haebichan Jung is happy to talk! Reach out to him in the Feedback section of the Discourse or the #gx-feedback channel in Slack.

Multi-Batch Checkpoints

Lately, multi-Batch Checkpoints have been a hot topic in the GX Slack.

Several of the users we’ve heard from have been struggling with the same thing: getting all the Batches, not just one of them, to be processed. The common workaround is to create a separate Asset for each file, a separate Batch for each Asset, and then a separate Batch Request and Expectation Suite, etc. This works, but is fairly inefficient.

Our recommended solution is to use a single Asset. While this might not immediately sound more efficient, it actually makes a huge difference from a coding standpoint: our solution takes 5 lines of code to handle as many Batches as you have, whereas the workaround takes 3 lines per Batch.
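
To see why a single Asset scales better, here is a minimal plain-Python sketch of the idea. It does not use the GX API and is not the code from the presentation: the file names, the regex, and the `group_into_batches` and `build_validations` helpers are all hypothetical, and it assumes each file in the Asset becomes one Batch. One pattern describes the whole Asset, and the list of validations is built in a loop rather than written out per file.

```python
import re

# Hypothetical file names; assume each matching file is one Batch of the same Asset.
FILES = [
    "orders_2023-05.csv",
    "orders_2023-06.csv",
    "orders_2023-07.csv",
    "readme.txt",  # does not match the pattern, so it is ignored
]

# One pattern describes the whole Asset instead of one definition per file.
ASSET_PATTERN = re.compile(r"orders_(?P<year>\d{4})-(?P<month>\d{2})\.csv")


def group_into_batches(files, pattern):
    """Return one batch description per file that fully matches the pattern."""
    batches = []
    for name in sorted(files):
        match = pattern.fullmatch(name)
        if match:
            batches.append({"file": name, **match.groupdict()})
    return batches


def build_validations(batches, suite_name):
    """Pair every batch with the same (hypothetical) Expectation Suite name."""
    return [{"batch": batch, "suite": suite_name} for batch in batches]


batches = group_into_batches(FILES, ASSET_PATTERN)
validations = build_validations(batches, "orders_suite")

for validation in validations:
    print(validation["batch"]["file"], "->", validation["suite"])
```

Adding a fourth matching file changes nothing in the code, which is the property that keeps the single-Asset approach at a fixed line count while the per-file workaround grows with every Batch.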

GX developer advocate @Austin Robinson demonstrated the difference between these two solutions:

You can grab the Gist used in the presentation here, or click here for documentation.

Join the conversation

Additional updates

Have you done something cool with Great Expectations that you'd like to share? If you're interested in demoing or have a piece of data quality content that you'd like us to feature, DM @Mollie Pettit on our Slack.
