Community Python API

Community Zingg Entity Resolution Python Package

Community Zingg Python APIs for entity resolution, identity resolution, record linkage, data mastering, and deduplication using ML (https://www.zingg.ai)

NOTE

Requires Python 3.6+ and Spark 3.5.0. Otherwise, zingg.client.Zingg() cannot be executed.
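
The requirement in the note can be checked before constructing a Zingg client. The following is a minimal sketch, not part of the Zingg API: the helper names check_environment and installed_spark_version are hypothetical, and treating the Spark version as an exact match on 3.5.0 is an assumption based on the wording of the note.

```python
import sys

REQUIRED_PYTHON = (3, 6)   # minimum Python version, taken from the note
REQUIRED_SPARK = "3.5.0"   # Spark version named in the note (exact match is an assumption)


def installed_spark_version():
    """Return the installed pyspark version string, or None if pyspark is missing."""
    try:
        import pyspark
    except ImportError:
        return None
    return pyspark.__version__


def check_environment(python_version, spark_version):
    """Return a list of problems found; an empty list means the check passed."""
    problems = []
    if tuple(python_version[:2]) < REQUIRED_PYTHON:
        problems.append("Python 3.6 or newer is required")
    if spark_version is None:
        problems.append("pyspark is not installed")
    elif spark_version != REQUIRED_SPARK:
        problems.append(
            "Spark %s is expected, found %s" % (REQUIRED_SPARK, spark_version)
        )
    return problems


if __name__ == "__main__":
    # Report every problem instead of stopping at the first one.
    for problem in check_environment(sys.version_info, installed_spark_version()):
        print("WARNING:", problem)
```

Running the script prints one warning per unmet requirement and nothing when both versions match.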

API Documentation

  • zingg.client module
    • Classes
      • zingg.client.Zingg
      • zingg.client.ZinggWithSpark
      • zingg.client.Arguments
      • zingg.client.ClientOptions
      • zingg.client.FieldDefinition
  • zingg.pipes module
    • Classes
      • zingg.pipes.Pipe
      • zingg.pipes.CsvPipe
      • zingg.pipes.BigQueryPipe
      • zingg.pipes.SnowflakePipe

Example API Usage

@2021 Zingg Labs, Inc.

from zingg.client import *
from zingg.pipes import *

# Build the arguments for Zingg
args = Arguments()
# Set the field definitions: field name, data type, and match type
fname = FieldDefinition("fname", "string", MatchType.FUZZY)
lname = FieldDefinition("lname", "string", MatchType.FUZZY)
stNo = FieldDefinition("stNo", "string", MatchType.FUZZY)
add1 = FieldDefinition("add1", "string", MatchType.FUZZY)
add2 = FieldDefinition("add2", "string", MatchType.FUZZY)
city = FieldDefinition("city", "string", MatchType.FUZZY)
areacode = FieldDefinition("areacode", "string", MatchType.FUZZY)
state = FieldDefinition("state", "string", MatchType.FUZZY)
dob = FieldDefinition("dob", "string", MatchType.FUZZY)
ssn = FieldDefinition("ssn", "string", MatchType.FUZZY)

fieldDefs = [fname, lname, stNo, add1, add2, city, areacode, state, dob, ssn]

args.setFieldDefinition(fieldDefs)
# Set the model ID and the Zingg directory
args.setModelId("100")
args.setZinggDir("models")
args.setNumPartitions(4)
args.setLabelDataSampleSize(0.5)

# Read the dataset through an input pipe and register it in args
schema = "id string, fname string, lname string, stNo string, add1 string, add2 string, city string, areacode string, state string, dob string, ssn string"
inputPipe = CsvPipe("testFebrl", "examples/febrl/test.csv", schema)
args.setData(inputPipe)
# Write the results through an output pipe and register it in args
outputPipe = CsvPipe("resultFebrl", "/tmp/febrlOutput")
args.setOutput(outputPipe)

options = ClientOptions([ClientOptions.PHASE,"match"])

# Run Zingg for the phase chosen in options (here, "match")
zingg = Zingg(args, options)
zingg.initAndExecute()
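
In the example above, the field names appear twice: once in the FieldDefinition list and once in the hand-written schema string passed to CsvPipe. One way to keep the two in sync is to derive the schema string from a single list of names. This is a minimal sketch in plain Python, not part of the Zingg API; the helper build_schema and the constant FIELD_NAMES are hypothetical names, and the sketch assumes every column is a string with an extra leading id column, as in the example.

```python
# Field names used in the example above, in the order they appear in the CSV.
FIELD_NAMES = [
    "fname", "lname", "stNo", "add1", "add2",
    "city", "areacode", "state", "dob", "ssn",
]


def build_schema(field_names, id_column="id", data_type="string"):
    """Build a 'name type, name type, ...' schema string.

    The id column is placed first, followed by the given fields,
    and every column gets the same data type.
    """
    columns = [id_column] + list(field_names)
    return ", ".join("%s %s" % (name, data_type) for name in columns)


schema = build_schema(FIELD_NAMES)
print(schema)
```

With Zingg installed, the same list could also drive the field definitions, for example [FieldDefinition(name, "string", MatchType.FUZZY) for name in FIELD_NAMES], so that adding a column requires a change in only one place.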