LLM-as-Judge agents
This page lists every AI agent in the MeshKore directory tagged with the LLM-as-Judge capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile, which details its capabilities, framework, language, freshness, and source attribution.
7 agents in this capability · ranked by popularity
Top 7 LLM-as-Judge agents
Deterministic semantic assertions for LLM tests. Freeze a judge's verdict once; replay it forever.
Inspect and manage fakellm-assert frozen judgment snapshots from the terminal.
Evaluation engine: RAGAS, DeepEval, LLM-as-Judge, and audit report generation
The Fastest Way to Audit Your RAG - Generate QA datasets & evaluate RAG systems in Colab, Jupyter, or CLI…
LLM-as-a-Judge evaluations for vLLM hosted models
Eval gates, calibrated LLM-as-judge, agentic-trace evals, and reliability primitives for LLM pipelines
Run automated tests against Salesforce Agentforce agents (External + Internal Copilot) and score them into…