Model Difference
Comparison of two outputs with different models
Consider an existing model in which some fields are marked as fuzzy. We build the model and review its match output. We then train another model in which some of these attributes are marked as exact, more match types are added, or some field types are changed. The primary key remains the same across both models.
We want to understand whether those changes make the model better or worse, and what other changes we could make to reach the accuracy we are looking for.
In such a case, comparing the two outputs becomes important for understanding which model works better for us.
The model difference phase is run as follows:
./scripts/zingg.sh --phase diff --conf <path to original conf> --compareTo <path to new conf>
The output will be as follows:
zingg_modelDiff_originalModelId_newModelId
The output will contain the records impacted by changes in clusters resulting from the newly trained model.
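To make "impacted by changes in clusters" concrete, here is a minimal Python sketch of one way to compare two match outputs that share a primary key. It is an illustration only, not how Zingg implements the diff phase: the function names, the dictionary input shape (primary key to cluster id), and the sample data are all assumptions. Because cluster ids from two separately trained models are not expected to line up, the sketch compares the set of records each record is grouped with rather than the ids themselves.

```python
from collections import defaultdict


def cluster_members(assignments):
    """Map each primary key to the frozenset of keys sharing its cluster.

    `assignments` is a hypothetical {primary_key: cluster_id} mapping,
    not a Zingg data structure.
    """
    groups = defaultdict(set)
    for key, cluster_id in assignments.items():
        groups[cluster_id].add(key)
    return {key: frozenset(groups[cid]) for key, cid in assignments.items()}


def impacted_records(original, new):
    """Return the sorted primary keys whose cluster membership differs."""
    old_members = cluster_members(original)
    new_members = cluster_members(new)
    all_keys = old_members.keys() | new_members.keys()
    return sorted(k for k in all_keys if old_members.get(k) != new_members.get(k))


# Made-up sample data: the new model merges r3 into the cluster of r1 and r2.
original = {"r1": 10, "r2": 10, "r3": 11, "r4": 12}
new = {"r1": 7, "r2": 7, "r3": 7, "r4": 9}

impacted = impacted_records(original, new)
print(impacted)
```

In this sample, r4 is not reported even though its cluster id changed, because it is still alone in its cluster under both models.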