Data Scientist

Remote, USA Full-time Posted 2025-11-24
This description is a summary of our understanding of the job description.

Role Description

We're seeking a data-driven analyst to conduct comprehensive failure analysis on AI agent performance across finance-sector tasks. You'll identify patterns, root causes, and systemic issues in our evaluation framework by analyzing task performance across multiple dimensions, including task type, file type, and evaluation criteria.

  • Statistical Failure Analysis: Identify patterns in AI agent failures across task components (prompts, rubrics, templates, file types, tags)
  • Root Cause Analysis: Determine whether failures stem from task design, rubric clarity, file complexity, or agent limitations
  • Dimension Analysis: Analyze performance variations across finance sub-domains, file types, and task categories
  • Reporting & Visualization: Create dashboards and reports highlighting failure clusters, edge cases, and improvement opportunities
  • Quality Framework: Recommend improvements to task design, rubric structure, and evaluation criteria based on statistical findings
  • Stakeholder Communication: Present insights to data labeling experts and technical teams

Qualifications

  • Statistical Expertise: Strong foundation in statistical analysis, hypothesis testing, and pattern recognition
  • Programming: Proficiency in Python (pandas, scipy, matplotlib/seaborn) or R for data analysis
  • Data Analysis: Experience with exploratory data analysis and deriving actionable insights from complex datasets
  • AI/ML Familiarity: Understanding of LLM evaluation methods and quality metrics
  • Tools: Comfortable working with Excel, data visualization tools (Tableau/Looker), and SQL

Requirements

  • Experience with AI/ML model evaluation or quality assurance
  • Background in finance or willingness to learn finance domain concepts
  • Experience with multi-dimensional failure analysis
  • Familiarity with benchmark datasets and evaluation frameworks
  • 2-4 years of relevant experience