A legacy of excellence, driving innovation and personalized service to create exceptional customer experiences.
About H.E. Services:
At H.E. Services' vibrant tech center in Hyderabad, you'll have the opportunity to contribute to technology innovation for Holman Automotive, a leading American fleet management and automotive services company. Our goal is to continue investing in people, processes, and facilities to ensure expansion in a way that allows us to support our customers and develop new tech solutions.
Holman has come a long way during its first 100 years in business. The automotive markets Holman serves include fleet management and leasing; vehicle fabrication and upfitting; component manufacturing and productivity solutions; powertrain distribution and logistics services; commercial and personal insurance and risk management; and retail automotive sales as one of the largest privately owned dealership groups in the United States.
Join us and be part of a team that’s transforming the way Holman operates, creating a more efficient, data-driven, and customer-centric future.
Read more about us: https://talent500.co/holman/home

We are seeking a hands-on Lead ETL / ELT Data Test Engineer to ensure the accuracy, reliability, and trustworthiness of enterprise data pipelines. This role is SQL-first and focuses on base data testing using Python (Pandas) to validate ETL / ELT pipelines built on Azure Data Lake-based analytics platforms.
The ideal candidate is comfortable working with ambiguous or lightly documented requirements, independently exploring data lakes, understanding data lineage, and translating business intent into clear, testable data quality validations. Experience with dbt and PySpark is beneficial but not required.

Roles & Responsibilities:
ETL / ELT & Data Pipeline Testing:
- Lead hands-on data quality validation across ingestion, transformation, and analytics layers.
- Design and execute end-to-end ETL / ELT testing, including:
  - Source-to-target reconciliation
  - Transformation and business-rule validation
  - Aggregations, metrics, and derived fields
  - Incremental vs. full load behavior
- Validate historical loads, backfills, and reprocessing to ensure data accuracy and consistency over time.
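To make the expectations concrete, a minimal sketch of the source-to-target reconciliation described above might look like the following in Pandas. The table and column names here are hypothetical, not Holman's actual schemas:

```python
import pandas as pd

def reconcile(source: pd.DataFrame, target: pd.DataFrame, keys: list[str]) -> pd.DataFrame:
    """Return rows present on only one side of a source-to-target comparison."""
    merged = source.merge(
        target, on=keys, how="outer", indicator=True, suffixes=("_src", "_tgt")
    )
    # Rows flagged "left_only" were lost by the pipeline; "right_only" were invented.
    return merged[merged["_merge"] != "both"]

# Hypothetical data: the pipeline dropped the id=3 row, so it surfaces as "left_only".
src = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
tgt = pd.DataFrame({"id": [1, 2], "amount": [10.0, 20.0]})
diff = reconcile(src, tgt, keys=["id"])
```

In practice the same pattern extends to value-level checks by comparing the `_src` / `_tgt` column pairs on rows where `_merge == "both"`.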
Data Architecture & Modeling Validation:
- Validate structured datasets for primary and foreign key integrity, join correctness and cardinality, and referential integrity across datasets.
- Test dimensional and analytical models, including fact and dimension tables, grain validation, measures and aggregations, and Slowly Changing Dimensions (SCDs).
- Ensure data quality strategies align with medallion architecture (Bronze, Silver, Gold).
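As an illustrative sketch of the key-integrity and referential-integrity checks listed above (the fact and dimension tables below are hypothetical examples, not real Holman datasets):

```python
import pandas as pd

def check_primary_key(df: pd.DataFrame, key_cols: list[str]) -> pd.DataFrame:
    """Rows whose key is null or duplicated; empty result means the PK holds."""
    nulls = df[df[key_cols].isna().any(axis=1)]
    dupes = df[df.duplicated(subset=key_cols, keep=False)]
    return pd.concat([nulls, dupes]).drop_duplicates()

def check_referential_integrity(
    fact: pd.DataFrame, dim: pd.DataFrame, fk: str, pk: str
) -> pd.DataFrame:
    """Fact rows whose foreign key has no match in the dimension (orphans)."""
    return fact[~fact[fk].isin(dim[pk])]

# Hypothetical star-schema fragment: customer_id=99 has no dimension row.
dim = pd.DataFrame({"customer_id": [1, 2]})
fact = pd.DataFrame({"order_id": [10, 11, 12], "customer_id": [1, 2, 99]})
pk_violations = check_primary_key(fact, ["order_id"])
orphans = check_referential_integrity(fact, dim, fk="customer_id", pk="customer_id")
```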
SQL & Python (Pandas) Data Testing:
- Perform advanced SQL-based data validation, including complex joins, window functions, aggregations, and data profiling.
- Execute Python-based data testing using Pandas for dataset profiling and exploratory analysis, source-to-target reconciliation, aggregation and business-rule validation, and schema, row-level, and regression checks.
- Use Python primarily as a data testing and validation tool, not for application development.
- Build reusable, scalable Pandas-based validation utilities to support evolving pipelines.
- Use PySpark (nice to have) only when validating very large or distributed datasets.
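A reusable aggregation-validation utility of the kind this role would build might look like the sketch below; the group and value columns and the tolerance are illustrative assumptions:

```python
import pandas as pd

def validate_aggregate(
    detail: pd.DataFrame,
    summary: pd.DataFrame,
    group_col: str,
    value_col: str,
    tol: float = 1e-9,
) -> pd.DataFrame:
    """Recompute a summary from detail rows and return groups whose totals disagree."""
    recomputed = detail.groupby(group_col, as_index=False)[value_col].sum()
    merged = recomputed.merge(summary, on=group_col, suffixes=("_detail", "_summary"))
    # Flag groups where the published summary drifts from the recomputed total.
    diff = (merged[f"{value_col}_detail"] - merged[f"{value_col}_summary"]).abs()
    return merged[diff > tol]

# Hypothetical detail vs. published summary: the EU total is off by 5.0.
detail = pd.DataFrame({"region": ["US", "US", "EU"], "revenue": [100.0, 50.0, 75.0]})
summary = pd.DataFrame({"region": ["US", "EU"], "revenue": [150.0, 80.0]})
bad = validate_aggregate(detail, summary, group_col="region", value_col="revenue")
```

Because the utility is parameterized on column names, the same function can be reused across pipelines as models evolve.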
Must Have Skills:
- 7-10 years of QA or Data Quality experience, with 3-5+ years focused on data platforms.
- Strong hands-on experience validating ETL / ELT pipelines and data transformations.
- Expert-level SQL skills (complex joins, aggregations, window functions).
- Strong experience with Python for data testing, specifically using Pandas for validation and reconciliation.
- Experience working with Azure Data Lake (ADLS Gen2) or similar data lake platforms.
- Solid understanding of relational data fundamentals, fact and dimension modeling, star schemas and analytical datasets, and medallion architecture concepts.
- Proven ability to work effectively with ambiguous requirements and limited documentation.
Nice to Have:
- Experience with dbt Core or dbt Cloud (model validation, tests, lineage).
- Familiarity with Great Expectations or similar data quality frameworks.
- Experience using PySpark for validating large-scale or distributed datasets.
- Exposure to Databricks environments.
Education:
- B.Tech / Computer Science degree preferred.
Communicating / Planning:
- Plan and execute data testing strategies aligned with platform architecture.
- Identify data quality risks early and define mitigation strategies.
- Build repeatable, scalable validation approaches rather than one-off testing.
- Balance multiple priorities while maintaining accuracy and attention to detail.
- Adapt quickly to changing data models, pipelines, and business priorities.
- Operate independently with moderate guidance.
- Clearly explain complex data issues to technical and non-technical audiences.
- Ask effective questions to clarify unclear or evolving requirements.
- Communicate data risks, assumptions, and quality status with confidence.
- Build trust through consistency, transparency, and data-driven recommendations.
- Collaborate effectively across engineering, analytics, and product teams.
Outputs & Impacts:
- Reliable, validated datasets across bronze, silver, and gold layers
- High confidence in ETL / ELT pipelines and analytical outputs
- Reusable SQL and Pandas-based Python validation utilities
- Trusted dimensional models ready for analytics and reporting
- Early detection of data quality issues before production impact
- Increased trust in enterprise data platforms across the organization
…