This blog is about Metrics. That’s Metrics as in 'the specific GX Metric object,' not 'the generic concept of metrics.'

Most GX users won’t ever need to interact with a Metric. You’ll really only encounter Metrics directly when you’re deep in an Expectation’s code or creating a Custom Expectation.

So this post is a primer aimed at the subset of advanced GX users who are doing that under-the-hood Expectations work. And, of course, at anyone who’s interested just because.

If you were hoping to hear about generic metrics, __can I interest you in this blog post instead__?

## What is a Metric?

Metrics are a key component of Expectations. One easy way to define a Metric is:

A Metric is an answer to a question you have about your data.

… where the question is part of your Expectation.

### Minimalist Metrics

For a simple example of how a Metric relates to an Expectation, let’s consider __expect_column_max_to_be_between__. You use this Expectation to describe an acceptable range of values (provided by you) for the column’s maximum.

To determine if this Expectation is being met, GX needs to answer a question about the data: *what is the column’s maximum value?*

With the answer to that single question, you can get the results of the Expectation. So this Expectation needs just one Metric:

`column.max`

Similarly, you can determine the results of __expect_column_unique_value_count_to_be_between__ if you answer *how many unique values does the column have?*—which corresponds to the sole Metric

`column.distinct_values.count`
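To make this concrete, here’s roughly what those two Metrics compute, sketched with pandas (this illustrates the answers to the questions, not GX’s internal implementation):

```python
import pandas as pd

# Hypothetical column of values to ask questions about.
prices = pd.Series([3, 7, 7, 12])

# column.max -- what is the column's maximum value?
column_max = prices.max()            # 12

# column.distinct_values.count -- how many unique values does the column have?
distinct_count = prices.nunique()    # 3

# expect_column_max_to_be_between then just checks column_max against
# the range you provided, e.g. 0 <= column_max <= 20.
print(column_max, distinct_count)
```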

Those examples are straightforward because those Expectations __produce a single overall statistic or result for each Batch they evaluate__. The Expectation is passed or failed based on that one answer.

But Expectations can also produce a pass/fail for each row, __with the Expectation’s results based on the totality of the row results__. Getting that kind of answer entails asking more than one question, which means more than one Metric.

With this kind of Expectation—which here we’ll call __ColumnMap, after the class that’s used to implement them__—we start to see multiple Metrics.

### Metrics for (Column)Maps

In a ColumnMap Expectation, you’re evaluating individual rows. If all the rows pass, the Expectation passes.

So the main questions you’re asking about the data as a whole are:

- *How many rows are there?*
- *How many rows don’t meet the validation criteria?*
- *How many invalid values are there?*
- *What are the invalid values?*

Generally, these questions show up in a ColumnMap Expectation as the following Metrics:

- `table.row_count`
- `column_values.nonnull.unexpected_count`
- `column_values.<expectation_name>.unexpected_count`
- `column_values.<expectation_name>.unexpected_values`

Here `<expectation_name>` is a placeholder for the name of the specific Expectation.
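Sketched with pandas and a hypothetical “values must be positive” criterion, the four Metrics answer the four questions like this (an illustration of the answers, not GX’s internal code):

```python
import pandas as pd

# Hypothetical column; the made-up criterion is that values must be positive.
col = pd.Series([5, -2, None, 8, -7])

row_count = len(col)                   # table.row_count -> 5
null_count = int(col.isna().sum())     # column_values.nonnull.unexpected_count -> 1

nonnull = col.dropna()
failing = nonnull[nonnull <= 0]        # rows that break the criterion
unexpected_count = len(failing)        # column_values.<expectation_name>.unexpected_count -> 2
unexpected_values = failing.tolist()   # column_values.<expectation_name>.unexpected_values -> [-2.0, -7.0]

print(row_count, null_count, unexpected_count, unexpected_values)
```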

The first two questions and their respective Metrics are straightforward: `table.row_count` gives you the total number of rows, and `column_values.nonnull.unexpected_count` gives you the number of rows that fail, though in some scenarios you’ll see `column_values.null.unexpected_count` instead.

The answers to these two Metrics are what you need to determine whether the Expectation passes.

Strictly speaking, you don’t *need* to ask how many unexpected values there are (`column_values.<expectation_name>.unexpected_count`) or what they are (`column_values.<expectation_name>.unexpected_values`). But without this information, a failed ColumnMap Expectation can’t provide you with any context about the failure; in practice, you should always ask these questions.

For many ColumnMap Expectations in the Expectation Gallery, such as __expect_column_values_to_be_increasing__, these four Metrics are the ones you’ll see.

#### Metrics & `mostly`

There’s one more aspect to consider for ColumnMap Expectations: they can use the `mostly` parameter.

Using `mostly`, you specify the percentage of rows that needs to pass for the Expectation as a whole to pass. The default, if you don’t use `mostly`, is 100%.

Using `mostly` doesn’t require any additional Metrics: the pass percentage is computed from the same `table.row_count` and `column_values.nonnull.unexpected_count` Metrics that the default pass/fail behavior uses.
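The pass/fail arithmetic with `mostly` can be sketched like this (an illustrative reading of the calculation, not GX’s actual code; the function name and the null handling here are assumptions):

```python
def expectation_success(row_count, null_count, unexpected_count, mostly=1.0):
    """Sketch: pass if at least `mostly` of the evaluated rows meet the criterion.

    Illustrative only -- the name and null handling are assumptions, not GX's code.
    """
    evaluated = row_count - null_count
    if evaluated == 0:
        return True  # nothing to evaluate
    return (evaluated - unexpected_count) / evaluated >= mostly

print(expectation_success(100, 0, 3))               # default: every row must pass -> False
print(expectation_success(100, 0, 3, mostly=0.95))  # 97% pass rate clears 95% -> True
```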

### Making Metrics

We’ve talked about Metrics as the answers to questions. It’s natural to ask if the Metrics also calculate those answers.

In short: no. This is where the MetricProvider steps in.

As we start talking about calculation, recall that GX can use different Execution Engines, and pandas, Spark, and SQLAlchemy each need different code to carry out the same calculation... so each Metric actually needs multiple implementations.

MetricProvider handles the connection between the Metric and the appropriate Execution Engine. To quote the __MetricProvider conceptual guide__:

> To allow Expectations to work with multiple backends, methods for calculating Metrics need to be implemented for each ExecutionEngine. For example, [calculating the mean in] pandas is implemented by calling the built-in pandas `.mean()` method on the column, Spark is implemented with a built-in Spark `mean` function…
>
> …the inputs for MetricProvider classes are methods for calculating the Metric on different backend applications. Each method must be decorated with an appropriate decorator. On `new`, the MetricProvider class registers the decorated methods as part of the Metrics registry so that they can be invoked to calculate Metrics.
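A stripped-down sketch of that pattern (one Metric name, with a per-engine implementation registered by a decorator) might look like the following; the names and the registry here are illustrative, not GX’s actual internals:

```python
# Illustrative sketch of the MetricProvider idea: each Metric name maps to one
# implementation per Execution Engine, collected in a registry. Not GX's code.
METRIC_REGISTRY = {}  # (metric_name, engine) -> calculation function

def metric_implementation(metric_name, engine):
    """Decorator that registers fn as metric_name's implementation for engine."""
    def decorator(fn):
        METRIC_REGISTRY[(metric_name, engine)] = fn
        return fn
    return decorator

@metric_implementation("column.max", engine="pandas")
def _pandas_column_max(column):
    return column.max()  # pandas: call the built-in .max() on the Series

@metric_implementation("column.max", engine="sqlalchemy")
def _sql_column_max(column):
    return f"MAX({column})"  # stand-in: a real engine would build a SQL expression

def resolve_metric(metric_name, engine, column):
    """Look up and run the right implementation for the active engine."""
    return METRIC_REGISTRY[(metric_name, engine)](column)
```

Here `resolve_metric("column.max", "pandas", series)` would dispatch to the pandas implementation, while the same Metric name resolves to SQL-building code under the SQLAlchemy engine.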

That concludes this intro to Metrics in GX! You can __read more about implementing a Metric here__, or check out the rest of __our documentation__.