Sr. Data Scientist

Company: Genzeon Global
Location: Hyderabad
Job Description:

Job Overview

We are seeking an experienced Data Engineer with strong AI engineering expertise to design, build, and optimize large-scale healthcare data platforms. The ideal candidate will have hands-on experience in data pipelines, orchestration frameworks, EHR data integration, and AI/ML model enablement, driving the transformation of healthcare data into actionable intelligence.

Key Responsibilities

  • Architect, build, and optimize scalable ETL/ELT pipelines to process structured and unstructured healthcare data.
  • Design data systems that directly support AI/ML workflows, including feature engineering, data versioning, and real-time model serving.
  • Ingest and normalize healthcare data from EHR systems (Epic, Cerner, etc.) with expertise in FHIR, HL7, and Epic Clarity extracts, ensuring HIPAA/HITRUST compliance.
  • Implement and manage workflows using Airflow, Prefect, or similar orchestration tools.
  • Collaborate with data scientists, ML engineers, and backend developers to deliver high-quality AI-ready datasets.
  • Implement monitoring, anomaly detection, and automated quality checks for large-scale healthcare datasets.

Minimum Qualifications

  • Bachelor’s degree in Computer Science, Data Engineering, AI/ML, or related field.
  • 4+ years of experience in data engineering with proficiency in Python, MongoDB, and distributed data systems.
  • Proven track record in building data pipelines for healthcare/EHR systems.
  • Experience with data lakes, cloud storage (Azure, AWS, or GCP), and scalable data architectures.

Highly Preferred Qualifications

  • Expertise in AI/ML data pipelines, including feature store design, model input pipelines, and real-time data streaming.
  • Proficiency with Spark, Databricks, or equivalent big data tools.
  • Hands-on experience with Azure Data Factory, Synapse, and AI/ML ecosystem tools.
  • Experience with RESTful APIs and microservices for exposing healthcare/AI data assets.
  • Knowledge of data quality, lineage, and governance frameworks (e.g., Great Expectations, DataHub).
  • Familiarity with vector databases, embeddings, and LLM integration for AI use cases.

Posted: March 26th, 2026