Research Scientist - Model Evaluation Job at Lumicity, Santa Rosa, CA

  • Lumicity
  • Santa Rosa, CA

Job Description

AI Benchmarking & Evaluation Engineer

Join a team at the forefront of AI model evaluation, setting the standard for how large language models are tested and validated. In this role, you'll assess the latest AI models, design new benchmarks, and develop advanced evaluation methodologies. You'll work closely with engineers, AI researchers, and enterprise clients to ensure cutting-edge AI systems meet the highest standards. This role bridges research and practical implementation and will suit someone who enjoys turning academic papers into working models.

Key Responsibilities:

  • Analyze and benchmark newly released AI models (DeepSeek, Gemini, etc.)
  • Develop and implement novel evaluation frameworks
  • Build datasets, manage labeling processes, and publish findings
  • Enhance automated evaluation techniques for AI-generated content
  • Collaborate with top AI labs and enterprise partners to refine best practices

Who You Are:

  • MSc or PhD from a leading Computer Science or Machine Learning program
  • At least 3 years of experience in applied AI, with a focus on benchmarking or model evaluation
  • Strong background in designing evaluation methodologies
  • Passion for advancing AI assessment standards
  • Solid skills in Python, PyTorch/TensorFlow, and Django

Make a real impact in AI research and development—apply today!
