AI Safety agents

This page lists every AI agent in the MeshKore directory tagged with the AI Safety capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile, which covers its capabilities, framework, language, freshness, and source attribution.

99 agents in this capability · ranked by popularity

Top 99 AI Safety agents

agent-governance-toolkit 4,272 ★

AI Agent Governance Toolkit — Policy enforcement, zero-trust identity, execution sandboxing, and reliability…

uqlm 1,166 ★

UQLM: Uncertainty Quantification for Language Models, is a Python package for UQ-based LLM hallucination…

langfair 259 ★

LangFair is a Python library for conducting use-case level LLM bias and fairness assessments

ToolEmu 208 ★

[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents…

Awesome-Embodied-AI-Safety 98 ★

Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses | 500+ Papers | Perception, Cognition…

selectools 9 ★

Production-ready Python framework for AI agents with built-in guardrails, audit logging, cost tracking, and…

Kite 9 ★

Production-ready agentic AI framework. High-performance, lightweight, simple. Built-in safety, memory, and 4…

agent47 3 ★

Your AI agent just burned $200. AgentGuard stops it at $5. Runtime cost guardrails for AI agents — budget…

agentapprove— ★

Approve AI agent actions from your iPhone or Apple Watch

@openguardrails/moltguard— ★

AI agent security plugin for OpenClaw: prompt injection detection, PII sanitization, and monitoring dashboard

@apexguard/sdk— ★

Runtime security middleware for LLM agents — prompt injection, tool misuse, and memory poisoning defense

@atbash/langchain— ★

Atbash safety guard for LangChain DynamicStructuredTool

@atbash/atbash-langchain— ★

Atbash safety guard for LangChain DynamicStructuredTool

@atbash/atbash-langgraph— ★

Atbash safety guard and audit nodes for LangGraph workflows

@atbash/autogen— ★

Atbash safety judge plugin for AutoGen-style multi-agent orchestration

@atbash/atbash-autogen— ★

Atbash safety judge plugin for AutoGen-style multi-agent orchestration

@bookedsolid/rea— ★

Agentic governance layer for Claude Code — policy enforcement, hook-based safety gates, audit logging, and…

@mindburn/helm-crewai-js— ★

HELM governance adapter for CrewAI JavaScript/TypeScript — task and tool governance

@verra/sdk— ★

Verra AI governance SDK: detection pipeline (PII, jailbreak, prompt injection, policy violation) with…

@futurespeak-ai/claw-framework— ★

Constitutional AI Governance Framework — Asimov's cLaws with HMAC-SHA256 integrity verification, memory…

@atbash/langgraph— ★

Atbash safety guard and audit nodes for LangGraph workflows

@authensor/langchain— ★

Authensor guardrail adapter for LangChain/LangGraph

@sgraal/mcp— ★

AI agent memory governance MCP server — preflight validation before every action. Works with Claude Desktop…

acp-crewai— ★

Agentic Control Plane governance for CrewAI agents. Wrap any tool with @governed; ACP decides…

acp-langchain— ★

Agentic Control Plane governance for LangChain / LangGraph agents. Wrap any tool with @governed; ACP decides…

agent-action-guard— ★

Runtime classifier for screening AI agent actions as safe, harmful, or unethical.

agent-control-sdk— ★

Python SDK for Agent Control - protect your AI agents with controls

agent-safety-mcp— ★

MCP server for AI agent safety — cost guards, injection scanning, decision tracing, agent identity (KYA), and…

agent-safety-middleware— ★

One-line safety middleware for AI agent APIs. Prompt injection scanning, cost budgets, decision audit trails…

agentlock— ★

Authorization framework for AI agent tool calls. Your AI agent needs a login screen — AgentLock is that login…

agentmesh-mcp-server— ★

MCP Server for Claude Desktop - Agent OS kernel primitives including code safety verification, CMVK…

agentmesh_drift— ★

Mathematical drift detection library for calculating drift/hallucination scores between outputs

agentsec-eval— ★

Security assessment framework for AI agents — adversarial test runner + server-side audit + scoring

agentshield-core— ★

Prompt injection & tool call security middleware for agentic LLM systems

agentwall— ★

A dotfile-driven firewall that protects the OS from destructive LLM agent tool calls

agi-pragma— ★

AI Action Firewall — seven-stage Decision Intelligence Core for safe agentic AI

air-crewai-trust— ★

AIR Trust Layer for CrewAI — audit trails, data tokenization, consent gates, and injection detection

air-langchain-trust— ★

AIR Trust Layer for LangChain — audit trails, Gate policy enforcement, consent gates, and injection detection

air-openai-trust— ★

AIR Trust Layer for OpenAI Python SDK — audit trails, PII detection, injection scanning, and HMAC-SHA256…

argus-llm— ★

Production-grade LLM observability. G-ARVIS scoring for Groundedness, Accuracy, Reliability, Variance…

autogen-kya— ★

KYA (Know Your Agent) identity verification for Microsoft AutoGen agents

claude-code-adk-validator— ★

Hybrid security + TDD validation for Claude Code with automatic test result capture using Google Gemini

crewai-eydii— ★

EYDII Verify tools and guardrails for CrewAI — verify every agent action before execution

crewai-forge— ★

Forge Verify + Execute tools and guardrails for CrewAI — verify agent actions and track executions with…

dspy-kya— ★

KYA (Know Your Agent) identity verification for DSPy modules

langchain-blindfold— ★

LangChain integration for Blindfold PII detection and protection

langchain-recourse— ★

LangChain tools for RecourseOS - evaluate consequences before destructive actions

llama-index-tools-eydii— ★

EYDII Verify tools for LlamaIndex — verify every agent action before execution

llama-index-tools-forge— ★

Forge Verify + Execute tools for LlamaIndex — verify agent actions and track executions with cryptographic…

llama-recourse— ★

LlamaIndex tools for RecourseOS - evaluate consequences before destructive actions

llama-stack-provider-trustyai-garak— ★

Garak red-teaming evaluation adapter for eval-hub

llm-pentest— ★

Security testing toolkit for LLM-based systems

llm-security-firewall— ★

Cognitive Security Middleware - The 'Electronic Stability Program' (ESP) for Large Language Models…

llm-sentinel-sdk— ★

Runtime monitoring SDK for AI applications — detect prompt injections and adversarial attacks in production.

llm-taint— ★

Lightweight taint tracking for LLM pipelines — label secrets at entry, block them at unsafe sinks

llmgateways— ★

Protect OpenAI and Anthropic API calls from prompt injection, jailbreaks, and data-extraction attacks.

neeraj-llmguard— ★

Open-source prompt injection firewall, hallucination blocker, and agent memory layer for any LLM app

openaiguardrails-sdk— ★

Official Python client for Open AI Guardrails policy distribution, audit evidence, and OPA control-plane APIs.

pot-sdk-crewai— ★

ThoughtProof Protocol — CrewAI integration for multi-model adversarial verification

prompt-firewall-groq— ★

Production-ready LLM security firewall powered by Groq

pydantic-ai-eydii— ★

EYDII Verify tools and middleware for Pydantic AI — verify every agent action before execution

pydantic-ai-forge— ★

Forge Verify tools and middleware for Pydantic AI — verify every agent action before execution

pydantic-ai-guardrails— ★

Production-ready guardrails for Pydantic AI with native integration patterns

quilr-litellm-guardrails— ★

Quilr Guardrails Integration for LiteLLM

rag-guard-enterprise— ★

Enterprise-grade data poisoning detection & alerting for RAG systems

raguard— ★

Security middleware for RAG pipelines — detect adversarial hallucination attacks before they reach your LLM.

safeagentdb— ★

Shadow-Sandbox DB Layer -- let AI agents modify your database safely with tenant isolation, Pydantic…

saferagenticai-mcp— ★

MCP server exposing the SaferAgenticAI safety framework (canonical criteria + Implementation Patterns layer)…

scbe-agent-bus— ★

SCBE agent-bus: Python surface over the SCBE governed event runner. Routes AI/human/AI events through the…

sologate-langchain— ★

Governance gate for LangChain agents. Powered by Sentinel AI — pauses risky actions for human approval, logs…

stripllm— ★

LLM sanitization SDK — DOMPurify, but for LLM context windows.

swarm-safety— ★

SWARM: System-Wide Assessment of Risk in Multi-agent systems - A Distributional AGI Safety framework

ultraguard— ★

Enterprise-grade LLM security framework with 40+ scanners and programmable guardrails

weave-protocol-llamaindex— ★

Security scanning and monitoring for LlamaIndex applications - part of Weave Protocol

yuragi— ★

LLM Confidence Fragility Analyzer — Measure how fragile your AI's confidence really is

@flowdot.ai/guardian-agent— ★

TypeScript reference implementation of the guardian-agent spec: a runtime supervisor for tool-using LLM…

langgraph-sfm— ★

Causal intent monitoring for LangGraph agents using bundled Structural Final Models.

@atbash/mcp— ★

Atbash safety judge exposed as a standalone MCP server

@atbash/atbash-mcp— ★

Atbash safety judge exposed as a standalone MCP server

federated-agent-audit— ★

Privacy-preserving audit framework for multi-agent AI systems. Detects cross-agent data leaks, inference…

phantasm-llm— ★

PHANTASM: Invert LLM hallucination, confabulation, and uncertainty into productive features.

crewai-arcgate— ★

Arc Gate runtime governance for CrewAI agents

agentguard-kernel— ★

Constitutional Governance Kernel for AI Agents — trust scoring, approvals, audit trail

llm-fact-guard— ★

Zero-dependency LLM hallucination detection middleware with billing & dashboard — real-time fact-checking…

rampart-llm— ★

Policy-as-code guardrail enforcement for enterprise LLM applications

a2a-trustgate— ★

A2A TrustGate CLI — Safety, compliance, and governance for AI agents. Screen every action before it executes.

agentledger-llm— ★

Action-time proof and delegation verification for MCP agents

ai-sdk-guardrails— ★

Input and output guardrails middleware for Vercel AI SDK.

agentprobe-injection— ★

Harness for measuring LLM agent resistance to indirect prompt injection and comparing defense effectiveness.

anthropic-omega— ★

Anthropic Claude SDK with OmegaEngine governance - AI safety and compliance

autogen-omega— ★

OmegaEngine governance integration for autogen-omega

cohere-omega— ★

OmegaEngine governance integration for cohere-omega

crewai-omega— ★

OmegaEngine governance integration for crewai-omega

dspy-omega— ★

OmegaEngine governance integration for dspy-omega

gemini-omega— ★

Google Gemini SDK with OmegaEngine governance

langchain-omega— ★

LangChain integration with OmegaEngine governance - callbacks, tools, and safety chains

langgraph-omega— ★

OmegaEngine governance integration for langgraph-omega

llamaindex-omega— ★

LlamaIndex integration with OmegaEngine governance - RAG safety and compliance

mistral-omega— ★

OmegaEngine governance integration for mistral-omega

Browse other capabilities