Running On AWS
One option is to run Zingg through spark-submit, passing the Zingg config and the phase to execute. Note that config.json must be available locally on the driver for Zingg to use it.
A step-by-step guide is provided here. The guide describes training locally using Zingg Docker, but the findTrainingData and label phases can also be executed directly on EMR.
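As a rough illustration of the spark-submit option, the sketch below assembles the command line for a single Zingg phase. It assumes the entry class is `zingg.client.Client` and that the phase and config are passed as `--phase` and `--conf`; check both against the Zingg release you are running. The helper name `build_zingg_submit_command` and the paths under `/home/hadoop` are hypothetical placeholders, not part of Zingg.

```python
import shlex


def build_zingg_submit_command(jar_path, phase, config_path):
    """Return the spark-submit argument list for one Zingg phase.

    Assumptions (verify against your Zingg version):
    - the entry class is zingg.client.Client
    - the phase and config are passed as --phase and --conf
    """
    return [
        "spark-submit",
        "--class", "zingg.client.Client",  # assumed Zingg entry point
        jar_path,                          # location of the Zingg jar
        "--phase", phase,                  # e.g. findTrainingData or label
        "--conf", config_path,             # must be a local path on the driver
    ]


# Hypothetical paths, for illustration only.
cmd = build_zingg_submit_command(
    "/home/hadoop/zingg.jar",
    "findTrainingData",
    "/home/hadoop/config.json",
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

The same argument list could be run on the driver with `subprocess.run(cmd, check=True)`, or its elements after `spark-submit` could be supplied as the arguments of an EMR Spark step.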
A second option is to run Zingg Python code in AWS EMR Notebooks.