At this month’s meetup, we:
Heard the latest on what’s coming soon in GX OSS
Got an exclusive demo of the GX Cloud Beta
Found out what’s new in the latest GreatExpectationsOperator for Airflow
Learned about using Docker with GX
And more!
You can watch the complete recording below:
The Great Expectations community gets together on the third Tuesday of every month. Sign up here to join the next one!
Thanks and kudos
The GX community is key to our success!
Special recognition this month goes out to Maayan Gad and Mantas Mykolaitis!
Maayan added support for a date function within conditional Expectations when using SQLAlchemy (#7359).
Mantas contributed the Expectation expect_queried_custom_query_results_to_be_custom and updated the query Expectation template so that templating can be used within the passed query (#7390).
Thank you to everyone who contributed to GX this month…
…and kudos to all of our Slack supporters for April!
The GX Slack community recently reached 10,000 members, and the efforts of our top Slack supporters have been pivotal to the community’s growth.
Product updates
We recently launched the GX Cloud Beta! Sign up here if you’re interested in trying it out.
Tal Gluck, product manager, shared what’s up next for the GX platform:
SQLAlchemy v2 support
Pandas v2 support
SDK improvements that will make it easier to write and contribute Data Assistants
Continued testing improvements for GX OSS
If you have feedback or ideas for upcoming GX features, let Tal know by messaging @Tal Gluck in #gx-feedback on the GX Slack!
Cloud Beta demo
Erik Hencier, the product manager for GX Cloud, gave an exclusive demo of the GX Cloud Beta for live viewers only. We hope everyone who attended enjoyed it!
To sign up for the GX Cloud Beta, go to https://greatexpectations.io/cloud. You can learn more about GX Cloud on our blog, or reach Erik on the GX Slack @ErikHencier.
The next generation GX and Airflow experience
Tamara Fingerlin, a developer advocate at Astronomer, showed how to orchestrate GX data validations from Apache Airflow using the newest version of the GreatExpectationsOperator.
If you haven’t come across Airflow, it’s a tool for programmatically authoring, scheduling, and monitoring data pipelines. With over 12 million downloads per month, it’s immensely popular and definitely worth checking out if you’re in the market for a pipeline orchestration tool.
Tamara gave us a crash course in important Airflow terminology and an update on what’s new (you can use decorators now!) before diving into the GX Airflow operator.
Highlights of the new GreatExpectationsOperator (GXO) include simplified GX configuration and full backwards compatibility with the older version of the operator. You no longer need to explicitly define things like Checkpoints, which is especially useful if you’re new to GX. A demo using European energy data showed the GXO in action.
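To give a feel for the simplified configuration, here is a minimal sketch of a DAG that validates a table without defining a Checkpoint. It is not the code from Tamara’s demo: the connection ID, table name, Expectation Suite name, and Data Context path are hypothetical placeholders, and the `schedule` argument assumes Airflow 2.4 or later. The operator arguments are kept in a plain dict, and the Airflow imports live inside `build_dag()`, so the configuration can be inspected even where Airflow is not installed.

```python
from datetime import datetime

# Arguments for the GreatExpectationsOperator. Note that no Checkpoint is
# defined: the operator is given a connection, a data asset, and a suite.
# All values below are hypothetical placeholders.
GX_VALIDATION_KWARGS = {
    "task_id": "validate_energy_data",
    "conn_id": "postgres_default",  # hypothetical Airflow connection ID
    "data_asset_name": "public.energy",  # hypothetical table to validate
    "expectation_suite_name": "energy_suite",  # hypothetical suite name
    "data_context_root_dir": "include/great_expectations",  # hypothetical path
    "return_json_dict": True,
}


def build_dag():
    """Build a DAG with a single GX validation task."""
    # Imported here so this module can be loaded without Airflow installed.
    from airflow.decorators import dag
    from great_expectations_provider.operators.great_expectations import (
        GreatExpectationsOperator,
    )

    # `schedule=None` means the DAG only runs when triggered manually.
    @dag(start_date=datetime(2023, 1, 1), schedule=None, catchup=False)
    def gx_energy_validation():
        GreatExpectationsOperator(**GX_VALIDATION_KWARGS)

    # Calling the decorated function returns the DAG object.
    return gx_energy_validation()
```

In a real DAG file you would call `build_dag()` at module level so that the Airflow scheduler can discover the DAG.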
Check out Tamara’s full presentation for details, plus a sneak peek at Astronomer’s new Cloud IDE, a notebook-style interface for writing DAGs that includes a cell to add the GreatExpectationsOperator without needing to write any Python.
The repo used in the presentation is available on GitHub here. If you have questions, you can contact Tamara at @Tamara Fingerlin in the #integration-airflow channel on the GX Slack.
Thanks for presenting, Tamara!
Your questions answered: Docker & GX
Ruben Orduz, a developer advocate at GX, presented on some of the use cases for GX and Docker, including:
How to use Docker to containerize GX for local development
Creating a self-contained deployment
Incorporating Docker into your CI pipeline
Accessing a Jupyter notebook from a Docker container via your local browser
Watch his presentation here…
…and then grab the sample repo on GitHub.
If you have questions about your GX + Docker setup, check out #gx-community-support on the GX Slack!
Join the conversation
We updated the GX Slack to make it easier to navigate and improve the overall experience! Get the details here.
Erica Robeen is looking for ideas about integrating GX with the W3C SHACL standard.
Cesar Garcia Saez shared his current project: establishing best practices for longitudinal datasets from open government data.
Vinicius Machado Mansur asked for suggestions on using flags in regular expressions.
Hassane Karkach requested feedback on their architecture for using GX with Spark/Kubernetes while loading DataContexts dynamically.
Vijay Mohan Jonnakuti asked for best practices for handling data quality on Snowflake.
Artur Iracki sought advice on using dynamic parameters within Expectations.
Additional updates
Next month we’re meeting on Tuesday, May 18! Get the invite here.
We’ve updated our community code of conduct to better describe how we work to make all GX community spaces safe, inclusive, and supportive.
Get a framework for using Databricks notebooks + GX data quality to set up a validated data workflow.
Have you tried our new fluent Datasource configuration?
This month we spotlighted GX contributor Hadas Manor!
We’re hiring in engineering and developer relations—check out our open roles.
Have you done something cool with GX that you’d like to share? If you’re interested in presenting at a community meetup, or if there’s a topic you’d like to hear from the GX team on, DM @Josh Zheng on the GX Slack.