# Field Definitions

#### fieldDefinition

This is a JSON array listing the fields from the source data that are used for matching, along with the kind of matching each one needs.

Each field denotes a **column** from the input. Fields have the following JSON attributes:

**fieldName**

The **name** of the field from the input data schema.

**fields**

To be defined later. For now, set this to the same value as `fieldName`.

**dataType**

The type of the column: `string`, `integer`, `double`, etc.

**matchType**

* The way to match the given field. Multiple match types can be combined (comma-separated), for example: `FUZZY,NUMERIC`.
* Here are the supported match types and descriptions:

|                        Match Type | Description                                                                                                                                                                                                                                                                                                                    | Applicable To                        |
| --------------------------------: | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------ |
|                             FUZZY | Broad matches tolerant to typos, abbreviations, and other variations. Uses fuzzy string similarity features.                                                                                                                                                                                                                   | string, integer, long, double, date  |
|                  FUZZY\_OPTIMISED | Same semantics as FUZZY but uses an optimized implementation. Provides similar matching quality with significantly lower CPU and memory usage on large datasets. Recommended for production/large-scale matching when FUZZY quality is desired but performance is critical. [Zingg Enterprise Feature](#user-content-fn-1)[^1] | string, integer, long, double, date  |
|                             EXACT | Exact match with no tolerance for variation. Preferable for country codes, pin codes, and other categorical variables where you expect no variations.                                                                                                                                                                          | string, integer, long, date, boolean |
|                         DONT\_USE | Included in the output but no computation is done on this field. Useful for IDs required in the output. DONT\_USE fields are not shown to the labeler when [showConcise](https://github.com/zinggAI/zingg/blob/main/docs/stepbystep/configuration/label.md) is set to true.                                                    | any                                  |
|                             EMAIL | Matches only the local part (before the @) of email addresses.                                                                                                                                                                                                                                                                 | string                               |
|                  EMAIL\_OPTIMISED | Same semantics as EMAIL but uses an optimized implementation for much faster evaluation on large datasets while preserving match behavior. Recommended for large-scale runs where many emails are compared. [Zingg Enterprise Feature](#user-content-fn-1)[^1]                                                                 | string                               |
|                           PINCODE | Matches postal / pin codes (supports typical local formats such as `xxxxx` or `xxxxx-xxxx` depending on data).                                                                                                                                                                                                                 | string                               |
|                   NULL\_OR\_BLANK | By default, Zingg treats nulls as matches. If NULL\_OR\_BLANK is added to a field that also has other match types (e.g., `FUZZY`), Zingg will build an explicit feature for null/blank values so the model can learn their effect.                                                                                             | string, integer, long, date, boolean |
|                              TEXT | Compares overlapping words between two strings. Good for descriptive or long-text fields without many typos.                                                                                                                                                                                                                   | string                               |
|                           NUMERIC | Extracts numbers from strings and compares how many are the same across both strings (useful for apartment numbers, building numbers, etc.).                                                                                                                                                                                   | string                               |
|              NUMERIC\_WITH\_UNITS | Extracts product codes or numbers with units (e.g., `16gb`) and compares how many are same across both strings                                                                                                                                                                                                                 | string                               |
|            ONLY\_ALPHABETS\_EXACT | Compares only the alphabetic characters and requires an exact match. Useful when numeric parts should be ignored (e.g., building name vs flat number).                                                                                                                                                                         | string                               |
|            ONLY\_ALPHABETS\_FUZZY | Compares only the alphabetic characters using a fuzzy comparison; numeric characters are ignored. Useful for addresses when you want to handle street names fuzzily while numeric parts are handled separately (e.g., via NUMERIC).                                                                                            | string                               |
| ONLY\_ALPHABETS\_FUZZY\_OPTIMISED | Same semantics as ONLY\_ALPHABETS\_FUZZY but uses optimized processing. Use this when you need the fuzzy alphabet-only behavior at production scale. [Zingg Enterprise Feature](#user-content-fn-1)[^1]                                                                                                                        | string                               |
|               MAPPING\_(FILENAME) | Maps input values to canonical values using a mapping file (e.g., nickname maps, company abbreviations, gender codes). Matching is tolerant to common variations defined in the mapping. See [Advanced Match Types](/latest/stepbystep/configuration/adv-matchtypes.md) for mapping file format and examples.                  | string                               |
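
To make the attributes above concrete, here is a minimal sketch of what a `fieldDefinition` array could look like, embedded in a short Python script that parses it and splits combined match types. The column names (`fname`, `address`, `recId`), the enclosing JSON object, and the helper `load_field_definitions` are assumptions made for illustration; they are not part of Zingg. The attribute check is likewise an assumption about what a complete entry contains, not Zingg's own validation.

```python
import json

# Hypothetical columns: "fname", "address" and "recId" are illustrative names only.
CONFIG_SNIPPET = """
{
  "fieldDefinition": [
    {"fieldName": "fname",   "fields": "fname",   "dataType": "string", "matchType": "FUZZY"},
    {"fieldName": "address", "fields": "address", "dataType": "string", "matchType": "ONLY_ALPHABETS_FUZZY,NUMERIC"},
    {"fieldName": "recId",   "fields": "recId",   "dataType": "string", "matchType": "DONT_USE"}
  ]
}
"""

# The four attributes described on this page.
EXPECTED_ATTRIBUTES = {"fieldName", "fields", "dataType", "matchType"}


def load_field_definitions(text):
    """Parse the snippet and return (fieldName, [match types]) pairs."""
    result = []
    for field in json.loads(text)["fieldDefinition"]:
        missing = EXPECTED_ATTRIBUTES - set(field)
        if missing:
            raise ValueError(f"{field.get('fieldName')}: missing {sorted(missing)}")
        # Combined match types are written comma-separated in a single string.
        match_types = [m.strip() for m in field["matchType"].split(",")]
        result.append((field["fieldName"], match_types))
    return result


if __name__ == "__main__":
    for name, match_types in load_field_definitions(CONFIG_SNIPPET):
        print(name, match_types)
```

In this sketch the `address` column combines `ONLY_ALPHABETS_FUZZY` with `NUMERIC`, so street names are compared fuzzily while the numeric parts are handled separately, and `recId` is carried through to the output without being used for matching.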

[^1]: Zingg Enterprise is the suite of proprietary products licensed by Zingg. Please refer to <https://www.zingg.ai/product/zingg-entity-resolution-compare-versions> for individual tier features.
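
As a rough intuition for what a few of the match types in the table look at, the sketch below mimics the descriptions of `EMAIL`, `NUMERIC`, and `ONLY_ALPHABETS_EXACT` in plain Python. This is not Zingg's implementation, which is not described on this page; the function names are hypothetical and the logic is only a simplified reading of the table.

```python
import re


def email_local_part(address):
    """Return the part of an email address before the last '@' (cf. EMAIL)."""
    return address.rsplit("@", 1)[0]


def shared_numbers(left, right):
    """Count distinct digit runs that appear in both strings (cf. NUMERIC)."""
    return len(set(re.findall(r"\d+", left)) & set(re.findall(r"\d+", right)))


def alphabets_only(text):
    """Drop everything except alphabetic characters (cf. ONLY_ALPHABETS_EXACT)."""
    return "".join(ch for ch in text if ch.isalpha())


if __name__ == "__main__":
    # Same local part, different domains.
    print(email_local_part("jane.doe@example.com") == email_local_part("jane.doe@example.org"))
    # Only the number 12 occurs in both strings.
    print(shared_numbers("Flat 12, Block 4", "Apt 12 Bldg 7"))
    # Same building name, different flat numbers.
    print(alphabets_only("Maple Towers 12") == alphabets_only("Maple Towers 47"))
```

The point of the sketch is only to show which part of a value each match type pays attention to, which can help when deciding which types to combine on a field.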

