Jebish Purbey

// Biography

I am an NLP researcher and machine learning engineer based in Kathmandu, Nepal, interested in language model behavior, evaluation, and multilingual NLP.

Evaluation is one of my main research tools. I use it to study model behavior across languages, cultures, modalities, and task settings, especially when benchmark results travel poorly outside their original setting. My work spans multilingual and multimodal evaluation, multilingual language modeling, AI-generated text detection, cultural representation in vision-language models, regulatory NLP, and agentic reasoning systems.

A side interest of mine is how these findings can eventually inform better model and system design for language technologies, alongside better ways of studying them.

I currently serve as ML Agents Community Lead at Cohere Labs. In parallel, I work as a Researcher at ZeroGrad.ai and as an AI Engineer at Dogma Group. Previously, I was a Research Fellow at Traversaal.ai and a Community Researcher at Cohere Labs. I have also taught AI, embedded systems, and programming at both the National College of Engineering and Pulchowk Campus. My recent publication record spans ICLR, NeurIPS, IJCNLP-AACL, COLING workshops, and CANDAR.

B.E. Electronics & Communication

Pulchowk Campus, Tribhuvan University

Valedictorian · 83.75%

// Experience

Professional Timeline

June 2025 - Present

ML Agents Community Lead

Cohere Labs

Currently working on Street-Navigation Agents and evaluation of reasoning robustness in mixed-motive scenarios, alongside benchmark design and LLM-as-judge pipelines for large-scale reasoning evaluation.

June 2025 - Present

AI Engineer

Dogma Group

Building production systems, including a policy-compliance RAG agent, a feasibility-analysis agent on Azure, OCR plus LLM pipelines for operational automation, automated CAD analysis agents, and automated ticket-triaging agents, while exploring novel applications of LLMs and vision in operational settings.

Sep 2024 - Present

Researcher

ZeroGrad.ai

Leading multilingual and multimodal NLP research, including South Asian LLMs for 23 Indic languages, Mantra-14B, regulatory question answering, multilingual AI-text detection, and behavior shifts across languages, with $20,000+ in grant funding for research and evaluation work.

Dec 2024 - May 2025

Research Fellow

Traversaal.ai

Architected AgentPro, a ReAct-based agentic framework for data-science workflows, and helped build DSBC for benchmarking agentic performance across eight task categories.

Apr 2021 - Apr 2025

Research Assistant & Undergraduate Researcher

Pulchowk Campus, IoE, Tribhuvan University

Worked across early research projects in hate speech detection, Nepali abstractive summarization, and cultural heritage detection, spanning dataset creation, low-resource NLP baselines, interpretability analysis, and computer vision systems. This period shaped my transition into multilingual NLP and contributed to work later published at CANDAR 2025.

Jan 2024 - Aug 2025

Instructor

National College of Engineering & Pulchowk Campus, TU

Taught AI, embedded systems, numerical methods, digital signal processing, and programming while supervising labs, projects, and final-year AI work.

Sep 2023 - May 2025

Community Researcher

Cohere Labs

Worked on INCLUDE, Kaleidoscope, and cultural representation studies in VLMs, helping establish a research path centered on evaluation, cross-context behavior, and model analysis.