ETL process definition, implementation and internal training for the client.

The project objective is to migrate data from SQL Server databases to a Hadoop environment. The technologies used are Cloudera CDH, Hive, Impala, Kudu and Spark.
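As an illustration of what one migration step could look like, the following is a minimal sketch in PySpark that reads a SQL Server table through Spark's generic JDBC data source and saves it as a Hive table. It is not the project's actual code: the function names, host, database, table and credentials are hypothetical, and it assumes the Microsoft JDBC driver for SQL Server is available on the Spark classpath.

```python
def sqlserver_jdbc_options(host, database, table, user, password, port=1433):
    """Build the option map for Spark's generic JDBC data source.

    All argument values are placeholders supplied by the caller; nothing
    here is taken from the real project configuration.
    """
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Driver class shipped with the Microsoft JDBC driver for SQL Server.
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


def copy_table_to_hive(spark, options, target_table):
    """Read one SQL Server table over JDBC and save it as a Hive table.

    `spark` is an existing SparkSession created with Hive support enabled.
    The target table is overwritten if it already exists.
    """
    df = spark.read.format("jdbc").options(**options).load()
    df.write.mode("overwrite").saveAsTable(target_table)
```

A caller would create the session with `SparkSession.builder.enableHiveSupport().getOrCreate()` and then invoke `copy_table_to_hive` once per source table. Overwrite mode is chosen here only to keep the sketch re-runnable; an incremental load would need a different write strategy.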

CDH-spark-impala

Installation and configuration of CDH version 6.3, which provided a free licence for installing CDH on all on-premises cluster nodes.

CLOUDERA MANAGER