Senior Data / ETL Engineer

Company: EazyML
Location: Remote

Company Description

EazyML is an innovative machine learning platform designed to predict outcomes from textual data with unparalleled transparency and ease of use. As the first of its kind, EazyML sets a new standard for user-friendly machine learning solutions. The platform empowers organizations to efficiently leverage machine learning without requiring extensive technical expertise.

Role Description

This full-time remote role for a Senior Data/ETL Engineer involves designing, developing, and maintaining ETL processes to ensure seamless data integration and transformation. The role includes collaborating with cross-functional teams to analyze data requirements, create data models, and optimize performance. The engineer will also troubleshoot and resolve data-related challenges while ensuring the integrity and accuracy of data pipelines.

We are seeking a highly skilled Senior Data Engineer / ETL Developer to design, develop, and optimize scalable data pipelines, APIs, and database solutions across modern data platforms.

The ideal candidate has deep hands-on expertise in PySpark, Python, SQL, and distributed data systems, along with exposure to AI/ML workflows, and will partner with analytics and data science teams to build the reliable, high-performance data infrastructure that powers our machine learning solutions.

Required Qualifications

  • 5+ years of hands-on experience in data engineering.
  • Strong expertise in PySpark and distributed data processing frameworks (Spark, Databricks, Hive, etc.).
  • Advanced proficiency in Python for data engineering.
  • Deep knowledge of SQL and database performance tuning.
  • Strong understanding of data integration, data modeling, warehousing concepts, and ETL/ELT architectures, with hands-on experience using ETL tools.
  • Excellent analytical skills to interpret, structure, and manage complex datasets.
  • Experience with cloud data platforms (AWS, Azure, or GCP).
  • Experience with orchestration tools such as Airflow (see the sketch after this list).
  • Experience designing APIs for data services (FastAPI, Flask, etc.).
  • Familiarity with modern data stack tools and real-time streaming architectures.
  • Ability to work effectively in a remote, collaborative environment, with strong communication and organizational skills.
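
To give candidates a concrete sense of the orchestration work, here is a minimal sketch of a daily ETL DAG in Airflow 2.x. The DAG name, schedule, and task bodies are hypothetical placeholders, not EazyML's actual pipelines:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract():
        # Pull raw records from a source system (placeholder logic)
        pass


    def transform():
        # Clean and reshape the extracted records (placeholder logic)
        pass


    def load():
        # Write the transformed records to the warehouse (placeholder logic)
        pass


    with DAG(
        dag_id="daily_orders_etl",          # hypothetical pipeline name
        start_date=datetime(2026, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # Enforce ordering: extract, then transform, then load
        extract_task >> transform_task >> load_task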

Key Responsibilities

Data Engineering & Pipeline Development

  • Design, develop, and optimize large-scale ETL/ELT pipelines using PySpark and distributed data processing frameworks (see the sketch after this list).
  • Build high-performance data ingestion workflows from structured and unstructured sources.
  • Implement scalable data models, data marts, and enterprise data warehouse solutions.
  • Ensure data quality, reliability, lineage, and governance across pipelines.
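
As an illustration of this pipeline work, here is a minimal PySpark ETL sketch; the bucket paths, column names, and derived fields are hypothetical:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders_etl").getOrCreate()

    # Extract: read raw JSON events (path is a placeholder)
    raw = spark.read.json("s3://raw-bucket/orders/")

    # Transform: drop malformed rows, fix types, derive revenue and a partition key
    orders = (
        raw.filter(F.col("order_id").isNotNull())
           .withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("order_date", F.to_date("order_ts"))
           .withColumn("revenue", F.col("quantity") * F.col("unit_price"))
    )

    # Load: write partitioned Parquet into the warehouse landing zone
    (orders.write
           .mode("overwrite")
           .partitionBy("order_date")
           .parquet("s3://warehouse-bucket/orders/"))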

Programming & Database Expertise

  • Write and optimize complex SQL queries, stored procedures, triggers, and functions (see the sketch after this list).
  • Develop clean, modular, and efficient Python code for data processing and automation.
  • Work with relational and NoSQL databases (MySQL, PostgreSQL, SQL Server, MongoDB, etc.).
  • Manage database migrations, schema changes, and lifecycle processes.
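
The sketch below illustrates this kind of query tuning and schema-change work, using Python's built-in sqlite3 module so it runs anywhere; the table, index, and query are hypothetical:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Migration step: create the table and an index for a hot query path
    cur.execute(
        "CREATE TABLE orders ("
        "  id INTEGER PRIMARY KEY,"
        "  customer_id INTEGER,"
        "  total REAL,"
        "  created_at TEXT)"
    )
    cur.execute("CREATE INDEX idx_orders_customer ON orders (customer_id, created_at)")

    # A typical analytical query: per-customer totals over a date range
    query = """
        SELECT customer_id, SUM(total) AS lifetime_value
        FROM orders
        WHERE created_at >= ?
        GROUP BY customer_id
    """

    # EXPLAIN QUERY PLAN shows whether the index is used instead of a full scan
    for row in cur.execute("EXPLAIN QUERY PLAN " + query, ("2026-01-01",)):
        print(row)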

AI/ML & Data Science Collaboration

  • Partner with Data Science teams to productionize machine learning models (see the sketch after this list).
  • Integrate ML models into scalable data platforms.
  • Support model deployment and MLOps processes.
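
As a concrete example of productionizing a model, here is a sketch of batch scoring with a pre-trained scikit-learn model applied through a PySpark pandas UDF; the model artifact, feature names, and paths are all hypothetical:

    import joblib
    import pandas as pd
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import pandas_udf
    from pyspark.sql.types import DoubleType

    spark = SparkSession.builder.appName("churn_batch_scoring").getOrCreate()

    # Load a pre-trained model artifact and broadcast it to the executors
    model = joblib.load("models/churn_model.pkl")   # hypothetical artifact
    bc_model = spark.sparkContext.broadcast(model)

    @pandas_udf(DoubleType())
    def churn_score(tenure: pd.Series, monthly_spend: pd.Series) -> pd.Series:
        # Score each batch of rows with the broadcast model
        features = pd.DataFrame({"tenure": tenure, "monthly_spend": monthly_spend})
        return pd.Series(bc_model.value.predict_proba(features)[:, 1])

    customers = spark.read.parquet("s3://warehouse-bucket/customers/")
    scored = customers.withColumn("churn_score", churn_score("tenure", "monthly_spend"))
    scored.write.mode("overwrite").parquet("s3://warehouse-bucket/churn_scores/")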

Architecture & Best Practices

  • Design scalable, cloud-based data architectures (AWS, Azure, or GCP).
  • Drive best practices in CI/CD, testing, performance optimization, and cloud deployments (see the sketch after this list).
  • Work within Agile development environments using tools like Azure DevOps or GitHub.
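
On the testing side, a CI pipeline would typically run unit tests against transformation logic on every commit. A minimal pytest-style sketch, with a hypothetical cleaning function:

    import pandas as pd


    def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
        # Hypothetical transformation under test: enforce non-null, unique keys
        return df.dropna(subset=["order_id"]).drop_duplicates(subset=["order_id"])


    def test_clean_orders_drops_nulls_and_duplicates():
        raw = pd.DataFrame({
            "order_id": [1.0, 1.0, None, 2.0],
            "total": [10.0, 10.0, 5.0, 7.5],
        })
        cleaned = clean_orders(raw)
        assert cleaned["order_id"].notna().all()
        assert cleaned["order_id"].is_unique

Running the same pytest command from Azure DevOps or GitHub Actions is one common way to gate merges on passing tests.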

Preferred Qualifications

  • Exposure to AI/ML workflows, machine learning platforms, and MLOps tools (MLflow or similar).
  • Experience with ETL tools such as Talend, Apache NiFi, or Informatica.
  • Knowledge of API frameworks such as Flask, FastAPI, or Django REST Framework (see the sketch after this list).
  • Familiarity with CI/CD and ALM tools (Azure DevOps, GitHub).
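
For the API-framework item above, here is a minimal FastAPI sketch of a read-only data service; the service name, schema, and in-memory store are hypothetical stand-ins for a real database layer:

    from fastapi import FastAPI, HTTPException
    from pydantic import BaseModel

    app = FastAPI(title="orders-data-service")   # hypothetical service name


    class Order(BaseModel):
        order_id: int
        customer_id: int
        total: float


    # In-memory stand-in for a real database-backed repository
    ORDERS = {1: Order(order_id=1, customer_id=42, total=99.5)}


    @app.get("/orders/{order_id}", response_model=Order)
    def get_order(order_id: int) -> Order:
        order = ORDERS.get(order_id)
        if order is None:
            raise HTTPException(status_code=404, detail="order not found")
        return order

Serve locally with uvicorn, e.g. uvicorn service:app --reload (assuming the file is named service.py).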

Education

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).

Posted: March 10th, 2026