Setting Up the Zingg Development Environment

The following steps will help you set up the Zingg development environment. The steps are the same across operating systems, but detailed instructions are provided for Ubuntu. The examples below were created on Ubuntu 22.04.2 LTS.

Make sure to update your Ubuntu installation:

sudo apt update


Step 0 : Install Ubuntu on WSL2 on Windows

  • Install wsl: Type the following command in Windows PowerShell.

wsl --install
  • Download Ubuntu 20.04 LTS from the Microsoft Store

  • Configure Ubuntu with a username and password

  • Open Ubuntu 20.04 LTS and start working

sudo apt update
  • Follow this tutorial for more information.


Step 1 : Clone the Zingg Repository

  • Install and set up Git: sudo apt install git

  • Verify : git --version

  • Set up Git by following the tutorial.

  • Clone the Zingg Repository: git clone https://github.com/zinggAI/zingg.git

Note :- It is suggested to fork the repository to your own account and then clone your fork.
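
When working from a fork, it can help to keep a second remote that points at the original repository, so that new changes can be fetched later. The following is a minimal sketch, not part of the Zingg tooling: the helper name add_upstream is hypothetical, and the remote name "upstream" is only a common convention.

```shell
# Hypothetical helper: add the original Zingg repository as a remote
# named "upstream" (a naming convention, not a Zingg requirement).
# Run it from inside your cloned fork.
add_upstream() {
    if git remote get-url upstream >/dev/null 2>&1; then
        echo "upstream already configured"
    else
        git remote add upstream https://github.com/zinggAI/zingg.git
    fi
}
```

After calling add_upstream inside the clone, git remote -v should list both origin (your fork) and upstream.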


Step 2 : Install JDK 1.8 (Java Development Kit)

  • Follow this tutorial to install Java 8 (JDK 1.8) on Ubuntu.

  • For example:

sudo apt install openjdk-8-jdk openjdk-8-jre
javac -version
java -version
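
The version commands above print a string such as "1.8.0_392", where the legacy "1.8" prefix means Java 8. As a rough sketch, the helpers below turn such a string into a major version number so the check can be scripted. Both function names are hypothetical, and the sample version strings are only illustrations.

```shell
# Extract the major Java version from a version string such as
# "1.8.0_392" (Java 8) or "11.0.21" (Java 11).
java_major_version() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;   # legacy scheme: 1.8.x means Java 8
        *)   echo "$1" | cut -d. -f1 ;;   # newer scheme: 11.x means Java 11
    esac
}

# Read the quoted version string from the installed JVM.
# `java -version` writes to stderr, hence the 2>&1.
installed_java_version() {
    java -version 2>&1 | head -n 1 | cut -d'"' -f2
}

# Example (commented out so that defining the helpers has no side effects):
# [ "$(java_major_version "$(installed_java_version)")" = "8" ] || echo "Java 8 not active"
```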

Step 3 : Install Apache Spark

wget https://archive.apache.org/dist/spark/spark-3.5.0/spark-3.5.0-bin-hadoop3.tgz
tar -xvf spark-3.5.0-bin-hadoop3.tgz
rm -rf spark-3.5.0-bin-hadoop3.tgz
sudo mv spark-3.5.0-bin-hadoop3 /opt/spark

Make sure that the Spark version you installed is compatible with your installed Java version, and that Zingg supports both.

Note :- Zingg supports Spark 3.5 and the corresponding Java version.
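
Before extracting a downloaded archive, it can be worth checking its integrity. Apache projects publish SHA-512 digests alongside their release downloads. The sketch below compares a file's digest with a value you copy from the download page. The helper name verify_sha512 is hypothetical, and sha512sum is assumed to be available (it ships with GNU coreutils on Ubuntu).

```shell
# Hypothetical helper: compare a file's SHA-512 digest with an expected value.
# Usage: verify_sha512 <file> <expected-hex-digest>
# Returns 0 when the digests match and non-zero otherwise.
verify_sha512() {
    actual=$(sha512sum "$1" | cut -d' ' -f1)
    # Lowercase the expected value, since published digests vary in case.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ "$actual" = "$expected" ]
}

# Example (the digest placeholder must be replaced with the published value):
# verify_sha512 spark-3.5.0-bin-hadoop3.tgz "<published-digest>" && echo "checksum OK"
```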


Step 4 : Install Apache Maven

  • Install the latest Maven package.

  • For example, for 3.8.8:

wget https://dlcdn.apache.org/maven/maven-3/3.8.8/binaries/apache-maven-3.8.8-bin.tar.gz
tar -xvf apache-maven-3.8.8-bin.tar.gz 
rm -rf apache-maven-3.8.8-bin.tar.gz 
cd apache-maven-3.8.8/
cd bin
./mvn --version

Step 5 : Update Env Variables

Open .bashrc and add the following environment variables at the end of the file:

vim ~/.bashrc

export SPARK_HOME=/opt/spark
export SPARK_MASTER="local[*]"
export MAVEN_HOME=/home/ubuntu/apache-maven-3.8.8
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin:$MAVEN_HOME/bin
export ZINGG_HOME=<path_to_zingg>/zingg/assembly/target
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

Save the file, exit the editor, and source .bashrc so that the changes take effect:

source ~/.bashrc

Verify:
echo $PATH
mvn --version

Here, <path_to_zingg> is the directory into which you cloned the Zingg repository. Similarly, if you installed Spark in a different directory, set SPARK_HOME accordingly. The same applies to MAVEN_HOME, which in this example assumes Maven was extracted under /home/ubuntu.

Note :- If you have already set JAVA_HOME and SPARK_HOME in the earlier steps, you do not need to set them again.
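
If you script this step instead of editing the file by hand, appending the same export twice is easy to do by accident. A minimal sketch of an idempotent append follows. The helper name append_once is hypothetical, and the example line reuses the SPARK_HOME value from this step.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so re-running the setup does not duplicate exports.
# Usage: append_once <line> <file>
append_once() {
    # -x matches whole lines, -F treats the line as a fixed string.
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Example (commented out so that nothing is written until you opt in):
# append_once 'export SPARK_HOME=/opt/spark' ~/.bashrc
```

Running the helper repeatedly with the same line leaves a single copy in the file.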


Step 6 : Compile the Zingg Repository

  • Make sure you are on the main branch:

git branch

  • Run the following to compile the Zingg repository:

mvn initialize
mvn clean compile package -Dspark=sparkVer

Note :- Replace sparkVer with the version of Spark you installed, for example -Dspark=3.5. If you still face errors, add -Dmaven.test.skip=true to the command above.

Note :- The value passed to -Dspark must match the profile for the Spark version you have installed. The available profiles are specified in pom.xml.
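
The -Dspark value in the example above is the major.minor part of the Spark version (3.5 for Spark 3.5.0). As a sketch, the hypothetical helpers below derive that value from a full version string and print the resulting build command for review. That every profile follows this major.minor pattern is an assumption, so check pom.xml for the profile names actually defined.

```shell
# Hypothetical helper: reduce a full Spark version such as "3.5.0"
# to the major.minor form ("3.5") used with -Dspark in this step.
spark_profile() {
    echo "$1" | cut -d. -f1,2
}

# Print (rather than run) the build command so it can be reviewed first.
zingg_build_command() {
    echo "mvn clean compile package -Dspark=$(spark_profile "$1")"
}
```

For example, zingg_build_command 3.5.0 prints mvn clean compile package -Dspark=3.5.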


Step 7 : If You Have Any Issue with 'SPARK_LOCAL_IP'

  • Install net-tools using sudo apt-get install -y net-tools

  • Run ifconfig in the terminal, find your IP address, and add it to /etc/hosts as the IP address for your PC name (hostname)
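
As an alternative to editing the hosts file, Spark also reads the SPARK_LOCAL_IP environment variable to decide which address to bind to. The sketch below is one way to set it and is not taken from the Zingg documentation. It assumes hostname -I is available (it is on Ubuntu) and that the first address it prints is the one you want. The helper name first_address is hypothetical.

```shell
# Pick the first entry from a space-separated list of addresses,
# which is the format printed by `hostname -I` on Ubuntu.
first_address() {
    echo "$1" | awk '{print $1}'
}

# Example (commented out so that nothing changes until you opt in):
# export SPARK_LOCAL_IP="$(first_address "$(hostname -I)")"
```

On machines with several network interfaces, check the chosen address before exporting it.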


Step 8 : Run Zingg to Find Training Data

  • Run this script in a terminal opened in the cloned zingg directory - ./scripts/zingg.sh --phase findTrainingData --conf examples/febrl/config.json


If everything is set up correctly, the Zingg banner should be displayed.
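
If the command fails before any banner appears, a common cause is running it from the wrong directory. A small pre-flight sketch follows. The helper name zingg_preflight is hypothetical, and it only checks for the two paths used in the command above.

```shell
# Hypothetical pre-flight check: confirm that the launcher script and the
# example config used in this step exist under the given repository root.
# Usage: zingg_preflight <path-to-zingg-clone>
zingg_preflight() {
    status=0
    for f in scripts/zingg.sh examples/febrl/config.json; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f"
            status=1
        fi
    done
    return "$status"
}
```

Calling zingg_preflight . from the cloned directory prints nothing and returns 0 when both files are present.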
