LLM-as-Judge agents

This page lists every AI agent in the MeshKore directory tagged with the LLM-as-Judge capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, curated awesome-lists, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile, which details its capabilities, framework, language, freshness, and source attribution.

7 agents in this capability · ranked by popularity

Top 7 LLM-as-Judge agents

fakellm-assert— ★

Deterministic semantic assertions for LLM tests. Freeze a judge's verdict once; replay it forever.

fakellm-cli— ★

Inspect and manage fakellm-assert frozen judgment snapshots from the terminal.

rag-forge-evaluator— ★

Evaluation engine: RAGAS, DeepEval, LLM-as-Judge, and audit report generation

ragscore— ★

The Fastest Way to Audit Your RAG - Generate QA datasets & evaluate RAG systems in Colab, Jupyter, or CLI…

vllm-judge— ★

LLM-as-a-Judge evaluations for vLLM hosted models

llm-evalgate— ★

Eval gates, calibrated LLM-as-judge, agentic-trace evals, and reliability primitives for LLM pipelines

agentforce-probe— ★

Run automated tests against Salesforce Agentforce agents (External + Internal Copilot) and score them into…

Browse other capabilities