cross-ecosystem search · live

Results for ruby_llm-evals

Found in 2 of 7 ecosystems · npm 1–24 of 430 · 2 matches across other registries

How we search: free-text search on npm, crates.io, RubyGems, NuGet, and Maven. PyPI and Go support exact-name lookup only.

npm matches

Showing 24 of 430 · JavaScript

mcp-evals v2.0.1

npm

GitHub Action for evaluating MCP server tool calls using LLM-based scoring

Aging — last published 11 months ago — check before adopting.

promptfoo v0.121.14

npm

LLM eval & testing toolkit

Maintained. Actively maintained.

@mastra/evals v1.2.4

npm

No description provided.

Maintained. Actively maintained.

vitest-evals v0.11.0

npm

Harness-backed AI testing on top of Vitest.

Maintained. Niche but actively maintained.

evalite v0.19.0

npm

Test your LLM-powered apps with a TypeScript-native, Vitest-based eval runner. No API key required.

Aging — last published 7 months ago — check before adopting.

openevals v0.2.0

npm

Much like tests in traditional software, evals are an important part of bringing LLM applications to production. The goal of this package is to help provide a starting point for you to write evals for your LLM applications, from which you can write more c

Worth a look. Actively maintained and growing.

@arizeai/phoenix-evals v1.0.3

npm

A library for running evaluations for AI use cases

Maintained. Actively maintained.

@outputai/evals v0.6.0

npm

Offline evaluation framework for Output.ai workflows

Maintained. Actively maintained.

autoevals v0.0.132

npm

Universal library for evaluating AI models

Maintenance: Healthy

Popularity: Widely used

Security: Clean

Deps: Moderate

Solid choice. Well-adopted and actively maintained.

@harnessio/react-ai-evals-service-client v0.24.0

npm

Harness AI Evals Service APIs integrated with react hooks

Maintained. Actively maintained.

axiom v0.52.2

npm

Axiom AI SDK provides an API to wrap your AI calls with observability instrumentation, plus offline evals and online evals.

Worth a look. Actively maintained and growing.

agentv v4.31.3

npm

CLI entry point for AgentV

Maintained. Niche but actively maintained.

@agentuity/evals v2.0.25

npm

No description provided.

Maintained. Actively maintained.

@agent-assistant/telemetry v0.4.35

npm

Usage, cost, and response telemetry primitives for Agent Assistant

Maintained. Actively maintained.

dirac-cli v0.3.44

npm

Autonomous coding agent CLI, capable of creating/editing files, running commands, using the browser, and more

Maintained. Niche but actively maintained.

@m4trix/evals v0.34.3

npm

Ink-based evaluation console for m4trix

Maintained. Actively maintained.

@statsig/statsig-ai v0.0.1-beta.30

npm

Beta notice: This SDK is currently in beta. All APIs are experimental and subject to change. Please review the release notes (https://github.com/statsig-io/statsig-ai-node/releases) for any breaking changes before upgrading.

Maintained. Actively maintained.

@deepagents/evals v2.4.0

npm

A general-purpose LLM evaluation framework with dataset loading, scoring, run persistence, model comparison, and console reporting.

Maintained. Actively maintained.

@vitest-evals/harness-ai-sdk v0.11.0

npm

AI SDK harness adapter for vitest-evals.

Maintained. Actively maintained.

@cool-ai/beach-evals v0.0.30

npm

Record, replay, and score evaluation primitives for Beach applications — built on the event log.

Maintained. Actively maintained.

Arize evals package

Maintained. Niche but actively maintained.

switchroom v0.14.43

npm

Run Claude Code 24/7 on your Claude Pro/Max subscription over Telegram. Open-source alternative to OpenClaw and NanoClaw — no API keys.

Maintained. Actively maintained.

@vitest-evals/harness-pi-ai v0.11.0

npm

pi-ai harness adapter with tool replay for vitest-evals.

Maintained. Actively maintained.

@lobu/promptfoo-provider v11.0.0

npm

promptfoo custom provider for running evals against a Lobu agent

Maintained. Niche but actively maintained.

RubyGems matches

2 matches · Ruby

ruby_llm-evals v0.1.0

RubyGems

LLM evaluation engine for Rails.

Maintained. Niche but actively maintained.

ruby_llm-contract v0.10.1

RubyGems

Wraps RubyLLM::Chat with input/output contracts, business-rule validation, retry with model escalation on validation failure, pre-flight cost ceilings, and an evaluation framework. Sibling abstraction to RubyLLM::Agent — same niche (reusable class-based prompts), wider contract.

Maintained. Niche but actively maintained.