Lead Data Engineer

Role Overview

We are seeking an experienced Lead Data Engineer to build and maintain scalable, high-performance data pipelines and infrastructure for our next-generation data platform. The platform ingests and processes real-time and historical data from diverse industrial sources such as airport systems, sensors, cameras, and APIs. You will work closely with AI/ML engineers, data scientists, and DevOps to enable reliable analytics, forecasting, and anomaly detection use cases.

Responsibilities

Analyzing business and system requirements and defining the optimal data pipeline design to fulfill them
Building scalable, performant, supportable and reliable data pipelines
Working on and optimizing existing data systems, as well as building new ones from the ground up
Defining and implementing a DevOps framework using CI/CD
Setting up new metrics and monitoring existing ones, analyzing data, and, in cooperation with other Data & Analytics team members, identifying and implementing system and process improvements
Supporting the collection of metadata into our data catalogue system, where all available data is maintained and catalogued
Working closely with data architects, data analysts and other data warehouse engineers and data scientists
Collaborating across different teams, geographies, stakeholders, and levels of seniority
Working with the team using an agile development methodology
Maintaining a customer focus with an eye on continuous improvement
Supporting the Agile way of working using the Scrum framework

Qualifications/Experience

The ideal candidate will have a Bachelor’s or Master’s degree in Computer Science, at least 7 years of professional experience working with data sets and building data pipelines, and familiarity with the following software, tools, and techniques:

Programming skills in Python/PySpark
Expert SQL knowledge and experience with relational databases, query authoring (SQL), as well as working familiarity with a variety of databases
Ability to develop, maintain, and distribute code in a modularized fashion
Experience implementing Data Lake/Big Data projects and building data pipelines on cloud and/or on-premises platforms:
Cloud technology stack: AWS, Azure or GCP, Databricks (proven experience is a big plus!)
Data pipelines, data transformation, data storage, data quality management, DevOps
Experience building data pipelines with a DevOps framework in mind
Basic knowledge of data warehousing concepts for large, complex data sets – defining, building, and optimizing data models based on use case requirements
Good understanding of Software Development Lifecycle, source code management, code reviews, etc.
Experience in performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
Energetic, enthusiastic, and results-oriented personality
Motivation and ability to perform as a consultant in data engineering projects
Ability to work both independently and within a team
Strong will to overcome the complexities involved in developing and supporting data pipelines.