Input Data
An array of input data. Each array entry here refers to a Zingg Pipe.
For the Zingg Community Version, list the most important fields first so that the blocking model can be learned more effectively. The Zingg Enterprise Version has a wider search space, so field ordering is less critical there.
If the data is self-describing, e.g. Avro or Parquet, there is no need to define the schema. Otherwise, field definitions with names and types must be provided.
For example, for the CSV under examples/febrl/test.csv
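A minimal sketch of what one such entry might look like for this file. The key names (`name`, `format`, `props`, `schema`), the `delimiter` and `header` properties, and the column names in the schema string are assumptions made for illustration; check them against the sample configuration shipped with your Zingg version and against the actual columns in the CSV.

```json
{
  "data": [
    {
      "name": "test",
      "format": "csv",
      "props": {
        "location": "examples/febrl/test.csv",
        "delimiter": ",",
        "header": false
      },
      "schema": "recId string, fname string, lname string, stNo string, add1 string, add2 string, city string, areacode string, state string, dob string, ssn string"
    }
  ]
}
```

Because CSV is not self-describing, the entry carries an explicit schema with a name and type for each field. For a self-describing format such as Parquet, the schema would be left out.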
Read more about Zingg Pipes for datastore connections here.