Output Statistics
Under the hood of the matching process
Zingg Enterprise Feature
If you’ve ever asked “How are my deterministic rules performing?” or “Did my latest incremental run improve cluster quality?”, Output Statistics is your answer. Output Statistics surfaces information about the linkages Zingg found among the records within a cluster. When you run Zingg incrementally, it also shows how cluster counts change as records are inserted into and updated in the identity graph. Zingg writes structured metrics for every match or incremental run, so you can:
See how dense or sparse your clusters are
Understand how much of a cluster is explained by deterministic rules vs. probabilistic links
Identify highly central records (connectors) and outliers
Track how clusters change across runs (growth, splits, merges, reassignments)
For example, if the number of clusters changes disproportionately to the number of records added or updated, you can trigger an alert.
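As a rough illustration of that kind of alert, the sketch below compares the change in cluster count with the number of records added or updated in a run. It is not part of Zingg: the function name `cluster_change_alert`, its parameters, and the `max_ratio` threshold are all hypothetical, and the counts are assumed to have been read from your own run statistics beforehand.

```python
def cluster_change_alert(clusters_before, clusters_after, records_changed, max_ratio=0.5):
    """Return True when the cluster count moved disproportionately.

    Hypothetical helper, not a Zingg API. "Disproportionate" is assumed to
    mean: more than `max_ratio` clusters gained or lost per record that was
    added or updated in the run.
    """
    if records_changed < 0:
        raise ValueError("records_changed must be non-negative")

    # Absolute change in the number of clusters between the two runs.
    delta = abs(clusters_after - clusters_before)

    # No records changed: any movement in the cluster count is suspicious.
    if records_changed == 0:
        return delta > 0

    return delta / records_changed > max_ratio


# Example values are made up for illustration.
quiet_run = cluster_change_alert(1000, 1004, records_changed=10)
noisy_run = cluster_change_alert(1000, 1040, records_changed=10)
print(quiet_run, noisy_run)
```

With these made-up numbers, the first run changes 4 clusters for 10 records and stays under the threshold, while the second changes 40 clusters for 10 records and raises the alert. The right threshold depends on your data, so treat `max_ratio` as a value to tune rather than a recommendation.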
What gets written
Zingg writes statistics to the stats directory whenever you run phases like match or incremental. The output comprises three types:
SUMMARY: High-level run summary
CLUSTER: One row per cluster with cluster-level matching metrics
RECORD: One row per record with its matching metrics within its cluster