If you already have labeled training data, you can supply it to Zingg as a starting point. Add a trainingSamples attribute to the config and define the training pairs there.
The training data supplied to Zingg must have a z_cluster column that groups records together; each z_cluster value uniquely identifies one group. It must also have a z_isMatch column, which is 1 if the records in the group match and 0 if they do not. The z_isMatch value has to be the same for every record in a z_cluster group: the records either match each other or they don't.
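The consistency rule above can be checked before the data is handed to Zingg. The following is a minimal sketch, not part of Zingg itself: the function name `check_training_rows` and the `fname`/`lname` fields are hypothetical, and the rows are assumed to be already loaded as Python dictionaries.

```python
from collections import defaultdict


def check_training_rows(rows):
    """Return a list of problems found in labeled training rows.

    Hypothetical helper (not a Zingg API). Each row is a dict that
    carries at least the 'z_cluster' and 'z_isMatch' keys.
    """
    labels_by_cluster = defaultdict(set)
    problems = []

    for row in rows:
        cluster = row["z_cluster"]
        label = row["z_isMatch"]
        # z_isMatch may only be 1 (match) or 0 (no match).
        if label not in (0, 1):
            problems.append(
                f"cluster {cluster}: z_isMatch must be 0 or 1, got {label!r}"
            )
        labels_by_cluster[cluster].add(label)

    # Every record in a z_cluster group must carry the same z_isMatch value.
    for cluster, labels in labels_by_cluster.items():
        if len(labels) > 1:
            mixed = sorted(map(str, labels))
            problems.append(f"cluster {cluster}: mixed z_isMatch values {mixed}")

    return problems


# Illustrative rows; the field names after z_isMatch are assumptions.
good_rows = [
    {"z_cluster": "c1", "z_isMatch": 1, "fname": "Thomas", "lname": "George"},
    {"z_cluster": "c1", "z_isMatch": 1, "fname": "Tomas", "lname": "George"},
    {"z_cluster": "c2", "z_isMatch": 0, "fname": "Anna", "lname": "Lee"},
    {"z_cluster": "c2", "z_isMatch": 0, "fname": "Brian", "lname": "Cole"},
]

# Same data, but the second record of c2 disagrees with the first.
bad_rows = good_rows[:3] + [
    {"z_cluster": "c2", "z_isMatch": 1, "fname": "Brian", "lname": "Cole"},
]
```

Calling `check_training_rows(good_rows)` returns an empty list, while `check_training_rows(bad_rows)` reports the `c2` group because its records disagree on z_isMatch.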
An example is provided in . Here, the first column specifies the z_cluster, the second column specifies the z_isMatch value and the remaining columns are the ones which are used for training the model.
The above training data can be specified using
In addition, labeled data of one model can also be exported and used as training data for another model. For details, check out .
Note: It is advisable to still run and a few rounds to tune Zingg with the supplied training data as well as patterns it needs to learn independently.
Supplementing Zingg With Existing Training Data