SWE-bench Pro evaluation harness for nexus-agents (731 multi-language instances, ScaleAI/SWE-bench_Pro)
Command-Line Interface scripts and related utilities.
An override replacement for @protobufjs/inquire
Compile eval calls with string literals
Computer Algebra System in CoffeeScript
Evaluation CLI for AI Observability on Dynatrace
Corelay Mesh eval pipeline — test suites, LLM-judged scoring, deploy-gate thresholds.
Shared domain types and Zod schemas for agent-eval-harness
Pragmatic eval framework for LLM features. Runs eval files as Bun tests with scorers, baselines, and reporting.
Custom rules and configuration recommendations for the `axe-core` library for GitHub projects
Universal eval-guard for AI coding agents — thin alias for @holdpoint/cli
WebGL module for Gjs
hpcc-js - WASM Graphviz CLI
Alias for eval global.
  
Open-source testing and regression detection framework for AI agents. Golden baseline diffing, CI/CD integration, works with LangGraph, CrewAI, OpenAI, Claude, HuggingFace, Ollama, and MCP.
Cloudflare Workers MCP server wrapper: ai-eval
Statsig helps you move faster with feature gates (feature flags), and/or dynamic configs. It also allows you to run A/B/n tests to validate your new features and understand their impact on your KPIs. If you're new to Statsig, check out our product and cre
A Redis-backed leaky-bucket rate limiter
Patchwork is a responsive CSS framework built on Flexbox.
JavaScript code execution context for the browser and a wrapper around the Node.js vm module
REPL environment.