NLP Researcher & Machine Learning Engineer
I am passionate about multilingual NLP, human-centered LLMs/VLMs, and agentic AI systems. I currently lead research at Cohere Labs Community and M2ai, working on large-scale language model evaluations and low-resource language processing.
My current aim is to study how language models generalize, reason, and adapt across contexts. I also develop workflows and agent-based systems built on these models at my workplace, Dogma International.
I am an AI researcher and machine learning engineer with expertise in multilingual NLP, vision-language models, and agentic AI systems. I hold a Bachelor of Engineering in Electronics and Communication (Valedictorian) from Tribhuvan University, Nepal.
Currently, I lead community research at Cohere Labs and spearhead multilingual NLP initiatives at M2ai, with funding from OpenAI, Anthropic, and Cohere. My research spans cultural representation in VLMs, multilingual benchmarking, regulatory NLP, and agentic AI frameworks.
Leading 20+ researchers on large-scale benchmarking of LLM reasoning across 50 task categories. Designed evaluation pipelines for 6 reasoning approaches (CoT, SoT, BoT, LtM, CoVE, GoT) across 8 language models.
Leading research on multilingual, multimodal, and low-resource NLP with $20,000+ in grants from OpenAI, Anthropic, and Cohere. Developed Mantra-14B, a state-of-the-art Hindi-English bilingual model. Built AI-generated text detection across 23 languages and 12 generators.
Architected AgentPro, a ReAct-based agentic framework for complex data science workflows. Constructed comprehensive benchmarks across 8 task categories to evaluate agentic architecture performance.
Led a study on cultural knowledge disparities in VLMs across 200+ countries. Co-developed INCLUDE, a multilingual benchmark for localized knowledge. Contributed to the Kaleidoscope multimodal evaluation benchmark.
Co-developed the Scottish Charity Agent with a RAG-based pipeline for compliance reviews. Deployed the Feasibility Analysis Agent on Azure. Engineered an OCR and LLM pipeline for penalty charge extraction.
Instructed undergraduate students in Artificial Intelligence, Embedded Systems, Numerical Methods, Computer Programming, and Digital Signal Processing. Managed curriculum delivery and lab sessions.
Email: jebishpurbey@gmail.com
Location: Kathmandu, Nepal
LinkedIn: linkedin.com/in/jebishpurbey
Google Scholar: Google Scholar Profile
GitHub: github.com/jebish