Senior AI/ML Engineer

  • Published on 06/04/2026
  • Mohali (052)
  • To be defined

Description:

The Role:


We are hiring a Senior Machine Learning Engineer to own the ML layer of an early-stage generative video product end-to-end. The role covers model selection and benchmarking, building the core generative agents, the video analysis pipeline, fine-tuning experiments, and the training-data strategy. A large share of early engineering effort sits in this role, and we are looking for someone who can scope, prototype, and ship research-heavy work independently.


Key Responsibilities:

• Model benchmarking - run a structured evaluation across candidate LLMs, video-generation models, TTS systems, and analysis VLMs, and deliver a clear recommendation for the production stack.

• Build the core generative agents - design, prototype, and ship the agents that drive the product, including prompt design, orchestration, tool use, and evaluation harnesses.

• Video analysis pipeline - evaluate and build the VLM-based layer that reviews, scores, and grades generated video outputs on technical and creative dimensions.

• Fine-tuning and LoRA experiments - run PEFT / LoRA experiments on the chosen video model and produce go / no-go recommendations with clear evidence behind them.

• Training-data strategy - document how data is sourced, cleaned, labelled, de-duplicated, and versioned for fine-tuning, and define the workflow the team will scale as the dataset grows.

• Deployment and inference - containerise models, deploy them to RunPod or an equivalent GPU cloud, and integrate them with internal services together with the Full Stack Engineer.

• Technical leadership - set conventions for experiment tracking, reproducibility, and code quality inside the ML codebase, and mentor junior hires as the team grows.


Required Skills & Experience:

• PyTorch - production experience building, training, and debugging transformer and diffusion models.

• HuggingFace ecosystem - fluency with transformers, diffusers, datasets, and accelerate across training and inference pipelines.

• PEFT / LoRA fine-tuning - practical experience fine-tuning large models with parameter-efficient methods, with at least one PEFT / LoRA model shipped to production or a production-grade demo.

• Video generation models - hands-on experience with Wan2.1, CogVideoX, or comparable open-source video diffusion models, including reading the paper, running the weights, and diagnosing failure modes.

• TTS and ASR - working knowledge of XTTS v2 (or equivalent) and Whisper, including fine-tuning, voice cloning, and quality evaluation.

• Vision-Language Models - experience querying and evaluating the Gemini API, Qwen2-VL, or similar multimodal models for video understanding tasks.

• GPU cloud - experience running workloads on RunPod, Modal, Lambda Labs, or SageMaker, including provisioning, monitoring, and managing the cost of training and inference jobs.

• 5+ years of applied ML engineering - with at least 2 years fine-tuning and deploying transformer-based models in a production or production-adjacent environment.

• Engineering fundamentals - clean Python, solid git hygiene, unit tests where they matter, and clear written communication.


Nice to Have:

• Published research, blog posts, or open-source contributions in video, multimodal, or generative models.

• Experience with distillation, quantization, or inference-cost optimisation for large generative models.

• MLOps tooling - Weights & Biases, MLflow, or DVC.

• Prior experience as a founding or early ML engineer at a startup.

The processing of personal data received will be carried out in accordance with applicable laws, including the UK General Data Protection Regulation (UK GDPR) and the Data Protection Act 2018.