Research Scientist - Model Evaluation Job at Lumicity, Santa Rosa, CA

  • Lumicity
  • Santa Rosa, CA

Job Description

AI Benchmarking & Evaluation Engineer

Join a team at the forefront of AI model evaluation, setting the standard for how large language models are tested and validated. In this role, you'll assess the latest AI models, design new benchmarks, and develop advanced evaluation methodologies. You'll work closely with engineers, AI researchers, and enterprise clients to ensure cutting-edge AI systems meet the highest standards. This role is a bridge between research and practical implementation and will suit someone who enjoys taking academic papers and creating working models.

Key Responsibilities:

  • Analyze and benchmark newly released AI models (DeepSeek, Gemini, etc.)
  • Develop and implement novel evaluation frameworks
  • Build datasets, manage labeling processes, and publish findings
  • Enhance automated evaluation techniques for AI-generated content
  • Collaborate with top AI labs and enterprise partners to refine best practices

Who You Are:

  • MSc or PhD from a leading Computer Science or Machine Learning school
  • At least 3 years of experience in applied AI, with a focus on benchmarking or model evaluation
  • Strong background in designing evaluation methodologies
  • Passion for advancing AI assessment standards
  • Solid skills in Python, PyTorch/TensorFlow, and Django

Make a real impact in AI research and development. Apply today!
