AI Data Engineer; ML Data Pipelines
Position: AI Data Engineer (ML Data Pipelines)
• Required Skills: Python, SQL, Spark, Databricks, Airflow, Feature Engineering, Data Pipelines, Data Quality, Great Expectations, AWS, Azure, GCP, Kafka
• Remote Job
Job Description
This is a remote position.
We are seeking an AI Data Engineer to design and build production-grade data pipelines that power machine learning systems. This role focuses on creating scalable ingestion, transformation, and feature engineering workflows that support model training, evaluation, and real‑time inference.
You will work closely with Data Scientists, Machine Learning Engineers, and Platform teams to ensure high‑quality, reliable, and efficient data flows across cloud environments. The ideal candidate understands both traditional data engineering and the unique data needs of ML systems.
Key Responsibilities
• Design and build scalable data pipelines for ML workflows
• Develop feature engineering and data preparation processes
• Implement batch and real‑time data ingestion systems
• Ensure data quality, validation, and monitoring
• Collaborate with ML engineers to support model training and deployment
• Integrate pipelines with orchestration tools (Airflow or similar)
• Optimize pipeline performance and cloud cost efficiency
• Maintain documentation and version control of data workflows
Requirements
• 4+ years of experience in Data Engineering
• Strong Python and SQL skills
• Experience building data pipelines for ML or analytics systems
• Hands‑on experience with Spark, Databricks, or similar distributed processing frameworks
• Experience with orchestration tools (Airflow or similar)
• Experience in AWS, Azure, or GCP environments
• Familiarity with data quality validation and monitoring frameworks
• Understanding of feature engineering and the model data lifecycle
Preferred Qualifications
• Experience with streaming systems (Kafka, Kinesis, Pub/Sub)
• Experience supporting model deployment and MLOps workflows
• Experience with feature stores or vector databases
• Familiarity with ML frameworks (TensorFlow, PyTorch)