Community roundup: May 2023

This month’s community roundup features a product update, recognition of this month’s contributors, and a demo of quick-starting GX on Databricks, followed by Q&A.

Erin Kapp
May 18, 2023
At this month’s meetup, we:

  • Celebrated this month’s contributors

  • Found out what’s next in GX Cloud

  • Learned more about using GX with Databricks

We’re hiring in developer relations! If you or anyone you know is interested in joining the GX team, don’t hesitate to apply.

You can watch the complete recording below:

 

The GX community gets together on the third Tuesday of the month: get your invitation to join the next one!

Thanks and kudos

We’re super excited to see the community continue to grow! That’s due in no small part to the efforts of our Slack supporters. 

Kudos to all the long-standing contributors and new faces who are our top supporters for May:

May 2023 top Slack supporters

Special recognition this month goes out to Tobias Bruckert, Will, Hadas Manor, and Rishi!

May 2023 featured GitHub contributors

  • Tobias updated the class name for the MulticolumnDatetimeDifferenceInMonths Metric (#7734).

  • Will fixed a bug in expect_column_values_to_be_in_type_list that was causing sparkdf_datasets checks to fail (#7684).

  • Hadas fixed a bug in expect_day_count_to_be_close_to_equivalent_week_day_mean and deleted an Expectation that it had made redundant (#7782).

  • Rishi fixed a broken link in the README.md (#7780).

We also have some exciting community-contributed PRs in progress, which we’re looking forward to sharing next month!

Product updates

GX product manager Suzie Antal shared an update on what’s coming next for GX Cloud.

We’re sharing these Cloud updates in the community because we want it to be a great tool for current GX OSS users who need to collaborate with other teams, particularly nontechnical ones. If that’s you, we want to hear what would help you most!

Coming next in Cloud:

  • Improved sharing of Validation Results to make collaboration faster and easier

  • UI-based Expectation creation—especially for less-technical users

  • A new data health dashboard

If you have thoughts or ideas about any of these, please contact Suzie! We’re especially interested in how we could make Expectations more accessible to less-technical users, and in which metrics and features you would find most helpful for measuring your data health. You can reach her at @Suzie Antal on the GX Slack or in the #gx-feedback channel.

Demo: GX and Databricks: a powerful alliance

You might have seen this blog post about quickly spinning up GX OSS on Databricks and accessing BigQuery public datasets. The repo at the center of that process was created by Tanner Beam, an analytics engineer at GX, to use in his own work.

In this presentation, Tanner builds on the content of that post by providing additional context and commentary while demonstrating the repo live. He also shares an example of how he operationalizes the repo by building a dashboard for the BigQuery data.

 

Thanks, Tanner!

You can grab the repo here.

Q&A

We had some great follow-up questions:

Emanuel asked about inserting this process in the middle of a data pipeline and options for triggering notebook runs.

He also asked about using GX with files on the order of 2GB.

Amani asked about using the GreatExpectationsOperator to validate a CSV file without loading the file into a database.

If we ran out of time for your question, or you had trouble connecting to ask it, please don’t hesitate to follow up on the GX Slack!

It’s easy to use

As further evidence of how easy it is to get started with this repo, GX’s director of developer relations, Josh Zheng, successfully worked through the process described in the blog post… right after logging into Databricks for the first time ever.

Additional updates

Have you done something cool with GX that you’d like to share? If you’re interested in presenting at a community meetup, or if there’s a topic you’d like to hear from the GX team on, DM @Josh Zheng on the GX Slack.


©2024 Great Expectations. All Rights Reserved.