Great Expectations sent a contingent to Data Council 2022 in Austin, TX. For many attendees, it was their first in-person conference in years, and it covered a broad range of topics relevant to data practitioners and startups in the space. These are some of our takeaways:
1. All Companies Are Data Companies, All Apps Are Data Apps
Data are the most important assets a business has.
The EU’s General Data Protection Regulation and the California Consumer Privacy Act are pushing more and more companies toward data governance and data management.
Business leaders are bombarded with hype around AI/ML.
2. Data Warehouses and Data Lakes Are Converging Into Lakehouses
Another theme was the emergence of data lakehouses, which are hybrid data systems designed to leverage the structure of a data warehouse with the flexibility of a data lake.
Presenter Vinoth Chandar, CEO & Founder of Onehouse, a pioneer in the lakehouse space, said many companies were struggling to choose between the two older models when they actually needed a lakehouse.
Chandar said Uber, Robinhood, Amazon, Walmart, and others have adopted lakehouse models to empower their data science and business intelligence teams to scale data governance.
Many expect this space to grow, but the proprietary nature of warehouses and lakes presents a challenge as companies grow, he said.
3. The Modern Stack Consists of Integrating Tools That Empower Collaboration
Open source adoption continues to accelerate but the market is “messy and frothy,” according to Maxime Beauchemin, CEO and Founder of Preset, and creator of Apache Superset and Apache Airflow.
It was vital that the tools, languages, and frameworks integrate easily because data developers would no longer tolerate inflexible, frustrating monolithic tools, Beauchemin said.
A number of presenters said there was no single answer and that different companies and personas required different solutions.
Side note: This is why Great Expectations is committed to an open source core. We want our product to integrate with the data stack you’re already using. You can check out and contribute to our Expectations Gallery.
Funders said they believed the data sector would eventually start to consolidate, but not yet.
4. Business Process Is Driving Data Engineering Direction
But tooling is still governed by IT
When it comes to making decisions about their data infrastructure, teams are looking for integrating tools that power collaboration but they are hamstrung by outdated business practices.
Peter Wang from Anaconda said the IT department’s involvement in modern data stack decisions often curbed innovation, but there were rarely consequences for making a safe choice that turned out to be limiting.
“Businesses only have to be wrong less often than their competitors,” he said. “No one ever got fired for hiring IBM.”
5. Data Literacy Is on the Rise
Another major trend is the rise of self-service for data products like dashboards and analysis tools.
A case study surfaced by Jesika Haria of LogicLoop found that 39 percent of roles at surveyed companies required some knowledge of SQL.
In the same study, users reported they wanted more collaboration and context in the data platforms themselves.
More semi-technical business users were contributing to data governance at their organizations.
Bonus: Facetime With the Great Expectations Team, Contributors, and Friends
As a distributed company that has grown exponentially during a pandemic, our team really benefited from meeting each other and other people in the data space. The safety of our team and communities is paramount. At the same time, there’s no substitute for face time and we are thankful we were able to negotiate that safely at Data Council.