Perform various evaluations on input RAG data
Core types, schemas, and domain models for RAG evaluation
RAG evaluation metric scorers: faithfulness, relevance, context precision/recall
Dataset loading, validation, generation, and versioning for RAG evals
Cost tracking, pricing, budgeting, and reporting for RAG evaluations
LLM-as-judge with calibration, consensus voting, and cost tracking
Central evaluation orchestrator that ties metrics, judge, cost, gate, and dataset together
Structured logging, OpenTelemetry tracing, and metrics for RAG evaluations
CLI entry point and commands for the RAG evaluation toolkit
Quality gates and CI/CD regression checks for RAG evaluations
Evaluate RAG pipelines: retrieval precision, faithfulness, answer correctness. Multi-provider judge (Claude/OpenAI). Zero-config CLI.
MCP server exposing RAG evaluation tools (judge, suite, gate)
Evaluate statically-analyzable expressions
Simple JavaScript expression evaluator
Evaluate Node.js require() module content directly
Mathematical expression evaluator fork with exports map, prototype pollution and code injection security fixes
A flexible math expression evaluator
require or eval modules
Lightweight RAG evaluation metrics for CI/CD pipelines
Mark scopes for deopt which contain a direct eval call
Get callsites from the V8 stack trace API
JavaScript expression parsing and evaluation.
Alias for eval global.
CLI for installing agent harness + UI + RAG + eval patterns from attrition.sh into .claude/skills/ and AGENTS.md.
No description provided.