Exasol

Exasol is an in-memory database built for analytics. You can use it as a data source or sink with Zingg.

Prerequisites

To use the Exasol database with Zingg, first download the required dependencies and reference them in the Zingg configuration.

Download Assembled Exasol Spark Connector Jar

Please download the latest assembled Exasol spark-connector jar file. On the releases page, go to the Assets section and make sure you download the jar file with the -assembly suffix in the file name.

Update Zingg Configuration

After downloading the Exasol spark-connector jar file, update the spark.jars parameter in Zingg's runtime properties so that Zingg can find the Exasol dependencies.

For example:

spark.jars=spark-connector_2.12-1.3.0-spark-3.3.2-assembly.jar

If there is more than one jar file, separate them with commas. Additionally, change the versions in the file name so that they match your Zingg and Spark versions.
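
For instance, a spark.jars entry that lists the connector together with a second dependency might look like the following (extra-dependency.jar is a placeholder name for illustration, not a real artifact):

# extra-dependency.jar is a placeholder for any additional jar you need
spark.jars=spark-connector_2.12-1.3.0-spark-3.3.2-assembly.jar,extra-dependency.jar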

Connector Settings

Finally, create a configuration JSON file for Zingg and update the data and output settings accordingly.

For example:

...
 "data": [
    {
        "name": "input",
        "format": "com.exasol.spark",
        "props": {
            "host": "10.11.0.2",
            "port": "8563",
            "username": "sys",
            "password": "exasol",
            "query": "SELECT * FROM DB_SCHEMA.CUSTOMERS"
        }
    }
 ],
 ...

Similarly, for output:

...
"output": [
   {
       "name": "output",
       "format": "com.exasol.spark",
       "props": {
           "host": "10.11.0.2",
           "port": "8563",
           "username": "sys",
           "password": "exasol",
           "create_table": "true",
           "table": "DB_SCHEMA.ENTITY_RESOLUTION",
       },
       "mode": "Append"
   }
],
...

Please note that the host parameter should be the first internal node's IPv4 address.

As Zingg uses the Exasol Spark connector underneath, please also check out its user guide and configuration options for more information.
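
If you drive Zingg through its Python API rather than a JSON file, the same Exasol input can be configured programmatically. The snippet below is a minimal sketch, assuming the generic Pipe and Arguments classes from the zingg Python package; the host, credentials, and query simply mirror the JSON example above.

from zingg.client import Arguments
from zingg.pipes import Pipe

# Generic pipe using the Exasol Spark connector format;
# properties mirror the "data" JSON example above
inputPipe = Pipe("input", "com.exasol.spark")
inputPipe.addProperty("host", "10.11.0.2")  # first internal node's IPv4 address
inputPipe.addProperty("port", "8563")
inputPipe.addProperty("username", "sys")
inputPipe.addProperty("password", "exasol")
inputPipe.addProperty("query", "SELECT * FROM DB_SCHEMA.CUSTOMERS")

args = Arguments()
args.setData(inputPipe)  # register the Exasol pipe as Zingg's input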
