MCP Server for Asana
A custom version of the handlebars package which, to improve security, does not use 'eval' or 'new Function'
Errors
Promptfoo + OpenCode eval harness for agent behavior. Owns model/tier policy, provider wiring, package discovery, state export, and artifact guards. Consumers own eval YAML, prompts, fixtures, and assertions.
Pipeline-kit eval foundation — defineEval, runEval, case/scorer/score types
This plugin adds a context menu option on conflict notes to open a merge menu. Once clicked, a diff viewer is shown to aid in merging the notes together.
PostgreSQL MCP server - query, schema introspection, explain, and health checks for AI assistants
Fork and update (v0.6.5) of the original @modelcontextprotocol/server-puppeteer MCP server for browser automation using Puppeteer.
Information on LLM models, context window token limit, output token limit, pricing and more
Official @modelcontextprotocol/sdk + @modelcontextprotocol/ext-apps adapter for the SolvaPay MCP toolbox (createSolvaPayMcpServer, registerPayableTool) with ./fetch + ./express subpath exports for runtime-specific OAuth bridges.
should deep clone everything even global objects, functions, circularities, ...
Pipeline-kit eval scorers — exactMatch, numericClose, jsonShape, llmJudge
JavaScript Expression Language: Powerful context-based expression parser and evaluator
Adds "filter", "selector", etc. to PouchDB's changes/replicate/sync APIs
require or eval modules
Model Context Protocol Extension of Coinbase Agentkit
LLM-as-judge evaluation engine — score agent outputs with structured rubrics
Web Codegen Scorer is a tool for evaluating the quality of web code generated by Large Language Models (LLMs).
A command line tool for setting up commercetools MCP server
MCP server for the Lexware Office API — manage invoices, contacts, articles, vouchers, and more
Evaluation harness for validating Refract analyzers against public benchmark pages and labeled claim-history examples
Statsig helps you move faster with feature gates (feature flags), and/or dynamic configs. It also allows you to run A/B/n tests to validate your new features and understand their impact on your KPIs. If you're new to Statsig, check out our product and cre
Evaluation framework for testing and scoring Agentium agents