Data Scientist

  • Published on 06/03/2026
  • Ludhiana (041)
  • To be defined

Description:

  • Design and implement end-to-end evaluation frameworks to assess performance, reliability, and safety of multi-agent AI systems
  • Lead experimentation and A/B testing efforts to systematically test hypotheses, validate model improvements, and track performance across agent iterations
  • Curate and maintain high-quality ground truth datasets to enable accurate, reproducible evaluation of multi-agent outputs
  • Identify and address reliability and accuracy gaps in agent workflows, including failure modes and edge cases, in production-like environments
  • Stay current on emerging research in agentic AI, LLM evaluation, and multi-agent coordination to continuously improve framework design

Technical Skills

  • Proficiency in Python and ML frameworks
  • Hands-on experience with LLM APIs and agentic frameworks (LangChain, LlamaIndex, Semantic Kernel)
  • Familiarity with evaluation tooling (Ragas, DeepEval, LangSmith, or similar)
  • Experience with data pipelines, experiment tracking (MLflow, W&B), and CI/CD for ML workflows
  • Strong foundation in statistics, NLP, prompt engineering, experimental design, and A/B testing methodology
  • Proficiency in Azure ML, Azure OpenAI Service, and Azure AI Foundry for model deployment, evaluation, and orchestration
  • Familiarity with Azure Monitor and Application Insights for tracking reliability and performance of deployed agent systems


The processing of personal data received will be carried out in accordance with applicable laws, including the UK General Data Protection Regulation (UK GDPR) and the Data Protection Act 2018.