Job Description
Summary
The Data Engineer will play a key role in designing, developing, and maintaining our data infrastructure, enabling the organization to extract valuable insights from diverse datasets. The candidate will collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders, to ensure the seamless flow of data and facilitate data-driven decision-making.
Duties & Responsibilities:
- Implement a scalable data lake/lakehouse solution on AWS.
- Design and implement big data pipelines using batch and streaming tools.
- Develop data modeling workflows with dbt.
- Orchestrate workflows with Airflow.
- Collaborate with global teams to ensure data governance standards are met.
- Build and deploy end-to-end CI/CD pipelines, incorporating DevOps best practices.
Requirements:
- 3+ years' experience in a similar role.
- Bachelor's degree or higher in a quantitative/technical field (e.g., Computer Science, Engineering, or similar).
- Strong communication skills and the ability to work well in a team.
- Solid proficiency in Python for developing data ingestion pipelines.
- Advanced knowledge of REST APIs and WebSockets.
- Experience with Delta Lake infrastructures and with processing large datasets using Apache Spark/PySpark.
- AWS experience preferred, with proficiency in a wide range of AWS services (e.g., EC2, S3, Glue, Fargate, Lambda, IAM, EMR, Kinesis).
- Working knowledge of dbt for data modeling.
- Expert knowledge of SQL and experience with cloud-based data warehouses such as Redshift, Snowflake, or Google BigQuery.
- Experience designing CI/CD pipelines and working with IaC (Terraform preferred).
- Experience building data pipelines and using orchestration tools such as Airflow.
- Good understanding of cloud infrastructure platforms.
- Self-starter and detail-oriented; able to complete projects with minimal supervision.
Skills
- AWS
- Database Management
- Development
- Python
- Software Engineering
- SQL