Step-By-Step Guide
Instructions on how to install and use Zingg
Zingg needs a configuration file that defines the data and what kind of matching is needed. You can create the configuration file by following the instructions here.
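As a rough illustration of what such a configuration file contains, the sketch below assembles one as a Python dict and serializes it to JSON. The key names (`fieldDefinition`, `fieldName`, `matchType`, `data`, `output`, `modelId`, `zinggDir`) follow the layout of Zingg's sample configurations as best recalled and should be treated as assumptions. The column names, file paths, and the `build_config` helper are hypothetical. Check every key against the configuration instructions for your Zingg version.

```python
import json


def build_config(columns, data_location, output_location,
                 model_id=100, zingg_dir="models"):
    """Return a Zingg-style configuration as a plain dict.

    columns is a list of (field_name, match_type) pairs.
    All key names are assumptions modelled on Zingg's sample configs.
    """
    return {
        # One entry per input column, with the kind of matching to apply.
        "fieldDefinition": [
            {
                "fieldName": name,
                "fields": name,
                "dataType": "string",  # assumed: every column is a string
                "matchType": match_type,
            }
            for name, match_type in columns
        ],
        # Where the input records come from (hypothetical CSV source).
        "data": [{
            "name": "input",
            "format": "csv",
            "props": {"location": data_location,
                      "delimiter": ",", "header": True},
        }],
        # Where the matched output is written (hypothetical CSV sink).
        "output": [{
            "name": "output",
            "format": "csv",
            "props": {"location": output_location,
                      "delimiter": ",", "header": True},
        }],
        "modelId": model_id,
        "zinggDir": zingg_dir,
    }


# Hypothetical columns and paths, for illustration only.
config = build_config(
    [("fname", "fuzzy"), ("lname", "fuzzy"), ("ssn", "exact")],
    data_location="data/customers.csv",
    output_location="out/matches",
)
config_json = json.dumps(config, indent=2)
print(config_json)
```

Writing `config_json` to a file such as `config.json` gives a starting point that can then be edited by hand.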
Zingg builds a new set of models (blocking and similarity) for every new schema definition (columns and match types). Building them means running the findTrainingData and label phases multiple times to assemble the training dataset from which Zingg will learn. You can read more here.
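The repeated findTrainingData and label rounds can be sketched as a list of commands to run in order. This is a minimal sketch. It assumes Zingg is started through a `./scripts/zingg.sh` launcher that accepts `--phase` and `--conf` options, so adjust the script path and flags to match your installation. The helper names, the `config.json` path, and the choice of three rounds are hypothetical.

```python
def phase_command(phase, conf_path, script="./scripts/zingg.sh"):
    """Build the command line for one Zingg phase (launcher path is assumed)."""
    return [script, "--phase", phase, "--conf", conf_path]


def training_data_rounds(conf_path, rounds):
    """Return the commands for `rounds` iterations of findTrainingData + label."""
    commands = []
    for _ in range(rounds):
        # Each round first samples candidate pairs, then asks for labels.
        commands.append(phase_command("findTrainingData", conf_path))
        commands.append(phase_command("label", conf_path))
    return commands


# Print the commands rather than executing them; the label phase is
# interactive, so each command would normally be run by hand in a terminal.
for cmd in training_data_rounds("config.json", rounds=3):
    print(" ".join(cmd))
```

How many rounds are needed depends on the data, so treat the number as something to tune rather than a recommendation.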
The training data built with the findTrainingData and label phases is used to train Zingg and to build and save the models. This is done by running the train phase. Read more here.
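Running the train phase from a script might look like the sketch below. It assumes the same `./scripts/zingg.sh --phase ... --conf ...` launcher convention, which should be verified against your installation. The `run_train` helper and its `dry_run` switch are hypothetical conveniences, not part of Zingg.

```python
import subprocess


def train_command(conf_path, script="./scripts/zingg.sh"):
    """Build the command line for the train phase (launcher path is assumed)."""
    return [script, "--phase", "train", "--conf", conf_path]


def run_train(conf_path, dry_run=True):
    """Return the train command, executing it only when dry_run is False."""
    cmd = train_command(conf_path)
    if not dry_run:
        # check=True raises CalledProcessError if the launcher exits non-zero.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: shows what would be executed without requiring Zingg to be installed.
print(" ".join(run_train("config.json")))
```

Calling `run_train("config.json", dry_run=False)` would actually start training, so it only makes sense once labeled pairs exist.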
As long as your input columns and field types do not change, the same model should keep working and you do not need to build a new one. If you change the match type, you can continue to use the existing training data and add more labeled pairs on top of it.