Research Scientist - Model Evaluation Job at Lumicity, Santa Rosa, CA

  • Lumicity
  • Santa Rosa, CA

Job Description

AI Benchmarking & Evaluation Engineer

Join a team at the forefront of AI model evaluation, setting the standard for how large language models are tested and validated. In this role, you'll assess the latest AI models, design new benchmarks, and develop advanced evaluation methodologies. You'll work closely with engineers, AI researchers, and enterprise clients to ensure cutting-edge AI systems meet the highest standards. This role is a bridge between research and practical implementation and will suit someone who enjoys taking academic papers and creating working models.

Key Responsibilities:

  • Analyze and benchmark newly released AI models (DeepSeek, Gemini, etc.)
  • Develop and implement novel evaluation frameworks
  • Build datasets, manage labeling processes, and publish findings
  • Enhance automated evaluation techniques for AI-generated content
  • Collaborate with top AI labs and enterprise partners to refine best practices

Who You Are:

  • MSc or PhD from a leading Computer Science or Machine Learning program
  • At least 3 years of experience in applied AI, with a focus on benchmarking or model evaluation
  • Strong background in designing evaluation methodologies
  • Passion for advancing AI assessment standards
  • Solid skills in Python, PyTorch or TensorFlow, and Django

Make a real impact in AI research and development—apply today!
