The mission at BNP Paribas Fortis was to further develop the data pipeline of the bank's datahub. At the same time, the focus was on ensuring that new data was ingested and delivered to the bank's multiple internal and external clients. A crucial part was deploying the software releases of each pipeline component, which involved working with both batch and streaming data flows.

  • Support and manage the software release lifecycle of the bank for multiple projects.
    • CI/CD pipeline technologies: Ansible, Jenkins, Kubernetes, Artifactory and GitLab.
    • Runbook creation and release data testing with Swagger API.
  • ETL supervision and incident resolution for different CD clusters.
    • Spark cluster, Flink cluster, Kafka cluster and HDP/CDP cluster.
    • ETL workflows implemented with AutoSys.

Data Pipeline – Lambda Architecture

  • Configuration and use of the HDP/CDP platform with Hive, Tez and HDFS in an on-premises cluster.
  • Administration and monitoring of the Kafka cluster and its interaction with the Flink cluster, using Splunk.
  • Scripting of streaming jobs in Scala using Flink and Kafka for internal client clusters with Kafka, IBM MQ, Cassandra DB and BI tools.