Finding Records For Training Set Creation

Finding pairs of records that match or don't match, so that they can be used to train Zingg

The findTrainingData phase prompts Zingg to search the data for edge cases that the user can label and that Zingg can then learn from. During this phase, Zingg combs through samples of the data and selects a limited number of representative pairs for the user to label. Zingg is deliberately frugal about training data, so that user effort is minimized and models can be built and deployed quickly.

The findTrainingData job writes the edge cases it finds to the folder configured through zinggDir/modelId in the config. Run it as follows:

./zingg.sh --phase findTrainingData --conf config.json

Run the findTrainingData phase first, then run the label phase. Repeat this cycle so that the Zingg models get smarter from user feedback.
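The cycle described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name run_training_rounds and the ZINGG_CMD variable are hypothetical helpers introduced here for illustration, the number of rounds is arbitrary, and the label phase is assumed to be invoked with the same --phase and --conf flags as findTrainingData.

```shell
#!/usr/bin/env bash
# Sketch: alternate the findTrainingData and label phases for a fixed number of rounds.
# ZINGG_CMD and run_training_rounds are hypothetical names, not part of Zingg itself.

# Path to the Zingg launcher; can be overridden from the environment.
ZINGG_CMD="${ZINGG_CMD:-./zingg.sh}"

run_training_rounds() {
  local rounds="$1"   # how many find/label cycles to run
  local conf="$2"     # path to the Zingg config file
  local i=1

  while [ "$i" -le "$rounds" ]; do
    # Progress messages go to stderr so stdout only carries the launcher's output.
    echo "Round $i: finding candidate pairs" >&2
    "$ZINGG_CMD" --phase findTrainingData --conf "$conf" || return 1

    # Assumption: the label phase takes the same flags as findTrainingData.
    echo "Round $i: labeling pairs" >&2
    "$ZINGG_CMD" --phase label --conf "$conf" || return 1

    i=$((i + 1))
  done
}
```

For example, run_training_rounds 3 config.json would run three find-and-label rounds and stop early if either phase exits with an error. How many rounds are enough depends on the data, so the fixed count here is only a placeholder.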



