Audio & Voice agents

This page lists every AI agent in the MeshKore directory tagged with the Audio & Voice category. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile with details on capabilities, framework, language, freshness, and source attribution.
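
The ranking step described above can be sketched as follows. This is a minimal illustration, not MeshKore's actual worker: the `AgentCard` type and the `parse_stars` and `rank_by_stars` helpers are hypothetical names introduced here, and only the three sample entries are taken from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCard:
    """Hypothetical normalized record for one directory entry."""
    name: str
    stars: int


def parse_stars(text: str) -> int:
    """Turn a display string such as '39,328 ★' into an integer."""
    return int(text.replace("★", "").replace(",", "").strip())


def rank_by_stars(cards):
    """Return cards sorted by star count, highest first (ties keep input order)."""
    return sorted(cards, key=lambda card: card.stars, reverse=True)


# Sample entries copied from the listing, deliberately out of order.
cards = [
    AgentCard("rasa", parse_stars("21,179 ★")),
    AgentCard("ChatTTS", parse_stars("39,328 ★")),
    AgentCard("CosyVoice", parse_stars("21,264 ★")),
]
ranked = rank_by_stars(cards)
print([card.name for card in ranked])
```

Because Python's sort is stable, entries with equal star counts keep the order in which they were ingested.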

1,071 agents in this category · ranked by popularity

Top 200 Audio & Voice agents

ChatTTS · 39,328 ★

A generative speech model for daily dialogue.

CosyVoice · 21,264 ★

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

rasa · 21,179 ★

💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue…

leon · 17,266 ★

🧠 Leon is your open-source personal assistant.

ten-framework · 10,613 ★

Open-source framework for conversational voice AI agents

moonshine · 8,268 ★

Very low latency speech to text, intent recognition, and text to speech, for building voice agents and…

Vision-Agents · 7,849 ★

Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider. Uses…

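wukong-robot · 7,119 ★

🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot and smart speaker project. It supports multi-turn ChatGPT conversations and may be the first open-source smart speaker project to support brain-computer interaction.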

OnlySwitch · 5,678 ★

⚙️ All-in-One menu bar app, hide 💻MacBook Pro's notch, dark mode, AirPods, Shortcuts

Red-DiscordBot · 5,543 ★

A multi-function Discord bot

cactus · 5,239 ★

Low-latency AI engine for mobile devices & wearables

cheetah · 4,263 ★

Mac app for crushing tech interviews with AI

awesome-bots · 4,141 ★

The most awesome list about bots ⭐️🤖

auto-subs · 3,450 ★

Instantly generate AI-powered subtitles on your device. Works standalone or connects to DaVinci Resolve.

SimpleMem · 3,435 ★

SimpleMem: Efficient Lifelong Memory for LLM Agents — Text & Multimodal

openwhispr · 3,393 ★

Voice-to-text dictation app with local (Nvidia Parakeet/Whisper) and cloud models (BYOK). Privacy-first and…

faster-whisper-GUI · 2,957 ★

faster_whisper GUI with PySide6

amurex · 2,827 ★

World's first AI meeting copilot → The Invisible Companion for Work + Life

speechgpt · 2,755 ★

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

polyglot · 2,590 ★

🤖️ Cross-platform AI language practice app

rasa_core · 2,342 ★

Rasa Core is now part of the Rasa repo: An open source machine learning framework to automate text-and…

VisionClaw · 2,324 ★

Real-time AI assistant for Meta Ray-Ban smart glasses -- voice + vision + agentic actions via Gemini Live and…

awesome-whisper · 2,309 ★

🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

comfyui_LLM_party · 2,258 ★

LLM Agent Framework in ComfyUI includes MCP server, Omost, GPT-sovits, ChatTTS, GOT-OCR2.0, and FLUX prompt…

ui · 2,237 ★

ElevenLabs UI is a component library and custom registry built on top of shadcn/ui to help you build…

baresip · 2,110 ★

Baresip is a modular SIP User-Agent with audio and video support

epub_to_audiobook · 1,984 ★

EPUB to audiobook converter, optimized for Audiobookshelf, WebUI included

pluely · 1,976 ★

The Open Source Alternative to Cluely - A lightning-fast, privacy-first AI assistant that works seamlessly…

Dot · 1,911 ★

Text-To-Speech, RAG, and LLMs. All local!

react-simple-chatbot · 1,756 ★

💬 Easy way to create conversation chats

ElatoAI · 1,756 ★

Realtime Voice AI with 100+ Models on Arduino ESP32 with Secure Websockets and Edge Functions for AI Toys…

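bailing · 1,701 ★

Bailing (百聆) is a GPT-4o-style voice dialogue robot built from an ASR + LLM + TTS pipeline. It integrates strong large models such as DeepSeek R1, connects to openClaw, and acts as a true personal voice assistant, with latency as low as 800 ms, support for low-spec machines such as Macs, and support for interruption.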

RCLI · 1,514 ★

Talk to your Mac, query your docs, no cloud required. On-device voice AI + RAG

yt-whisper · 1,439 ★

Using OpenAI's Whisper to automatically generate YouTube subtitles

Dragonfire · 1,409 ★

The open-source virtual assistant for Ubuntu-based Linux distributions

langchain4j-aideepin · 1,288 ★

AI-based productivity tool (chat, drawing, knowledge base, workflows, MCP service marketplace, voice input and output, long-term memory)

telegram-chatgpt-concierge-bot · 1,131 ★

Interact with OpenAI's ChatGPT via Telegram and Voice.

lotti · 1,114 ★

Open-source private logbook with a local agentic layer. Long-living AI agents read what you record and…

AI-Waifu-Vtuber · 1,078 ★

AI Vtuber for Streaming on Youtube/Twitch

AVA-AI-Voice-Agent-for-Asterisk · 1,045 ★

An open-source AI Voice Agent that integrates with Asterisk/FreePBX using Audiosocket/RTP technology

Whisperboard · 1,032 ★

The open-source iOS app that's making quality voice transcription more accessible on mobile devices.

realtime-phone-agents-course · 977 ★

Build realtime AI voice agents using FastRTC for low-latency streaming, Superlinked for vector search, Twilio…

lobe-vidol · 954 ★

🧸 Lobe Vidol - Making Virtual Idols Accessible for EveryOne

voquill · 947 ★

Open source voice dictation technology

blurr · 922 ★

This app can now use Android, just like a human.

agent-starter-react · 873 ★

A complete voice AI frontend app for LiveKit Agents with Next.js

local-talking-llm · 852 ★

A talking LLM that runs on your own computer without needing the internet.

esp-ai · 831 ★

The simplest and lowest-cost AI integration solution. If you like this project, please give it a Star~ |…

aws-lex-web-ui · 822 ★

Sample Amazon Lex chat bot web interface

openclaw-nerve · 821 ★

Real-time web cockpit for OpenClaw: voice conversations, agent automated kanban board, workspace/file…

gitpodcast · 807 ★

Convert any git repository into an engaging podcast

june · 784 ★

Local voice chatbot for engaging conversations, powered by Ollama, Hugging Face Transformers, and Coqui TTS…

whisper.rn · 781 ★

React Native binding of whisper.cpp.

SwiftWhisper · 779 ★

🎤 The easiest way to transcribe audio in Swift

viral-clips-crew · 756 ★

Your CrewAI Powered Video Editing Assistant

whisper.unity · 729 ★

Running speech to text model (whisper.cpp) in Unity3d on your local machine.

LocalAIVoiceChat · 721 ★

Local AI talk with a custom voice based on Zephyr 7B model. Uses RealtimeSTT with faster_whisper for…

VideoAgent · 714 ★

VideoAgent: All-in-One Agentic Framework for Video Understanding, Editing, and Remaking

BabelDuck · 681 ★

Beginner-friendly AI conversation practice application

voice-assistant-scripts · 681 ★

Example scripts for AI agents created with the Alan AI Platform.

bolna · 654 ★

Conversational voice AI agents

speech-to-text · 615 ★

Real-time transcription using faster-whisper

stealth · 597 ★

An open source Ruby framework for text and voice chatbots. 🤖

VLog · 588 ★

[CVPR 2025] Video Narration as Vocabulary & Video as Long Document

echokit_server · 565 ★

Open Source Voice Agent Platform

Starmoon · 546 ★

A conversational, AI device + software framework for companionship, entertainment, education, healthcare, IoT…

LLM-Agents-Ecosystem-Handbook · 524 ★

One-stop handbook for building, deploying, and understanding LLM agents with 60+ skeletons, tutorials…

JARVIS · 523 ★

Your own personal voice assistant: Voice to Text to LLM to Speech, displayed in a web interface

joinly · 519 ★

Make your meetings accessible to AI Agents

ollama-voice-mac · 517 ★

Mac compatible Ollama Voice

Facemoji · 454 ★

😆 A voice chatbot that can imitate your expression. OpenCV+Dlib+Live2D+Moments Recorder+Turing Robot+Iflytek…

okcash · 434 ★

OK | Every voice, every meme, every transaction makes $OK stronger and more vibrant. Powered by all of us—and…

react-voice-agent · 431 ★

smol-podcaster · 414 ★

smol-podcaster is your podcast production agent 🎙️

project-raven · 403 ★

Open-source AI meeting copilot - real-time transcription, echo cancellation, and AI assistance. Captures…

visionOS-examples · 400 ★

visionOS examples ⸺ Spatial Computing Accelerators for Apple Vision Pro

Stream-Omni · 386 ★

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across…

Whisper-transcription_and_diarization-speaker-identification- · 377 ★

How to use OpenAI's Whisper to transcribe and diarize audio files

edgen · 372 ★

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs…

adk-rust · 350 ★

Rust Agent Development Kit (ADK-Rust): Build AI agents in Rust with modular components for models, tools…

say · 348 ★

say - command line tool for voice and video calling

maxheadbox · 343 ★

Tiny truly local voice-activated LLM Agent that runs on a Raspberry Pi

macos-local-voice-agents · 323 ★

Pipecat voice AI agents running locally on macOS

tiledesk-dashboard · 318 ★

Tiledesk is the open source AI agent builder, written in Node.js and Angular. This repository is dedicated to…

jarvis · 318 ★

Jarvis is a voice-activated, conversational AI assistant powered by a local LLM (Qwen via Ollama). It listens…

twewy-discord-chatbot · 315 ★

Discord AI Chatbot using DialoGPT, trained on the game transcript of The World Ends With You

hack-interview · 310 ★

AI-powered tool for real-time interview question transcription and response generation.

RuntimeSpeechRecognizer · 306 ★

Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on Whisper OpenAI…

gpt-voice-conversation-chatbot · 302 ★

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while…

whisper-node · 302 ★

Node.js bindings for OpenAI's Whisper. (C++ CPU version by ggerganov)

AI-Talks · 296 ★

AI Talks - ChatGPT Assistant via Streamlit

tiledesk · 296 ★

Install Tiledesk on your server using Helm for Kubernetes orchestration and Docker Compose for running…

ai-devices · 295 ★

AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more

firefox-voice · 292 ★

Firefox Voice is an experiment in a voice-controlled web user agent

TranscriberBot · 291 ★

TranscriberBot for Telegram

tetos · 279 ★

A unified interface for multiple Text-to-Speech (TTS) providers.

safestclaw · 276 ★

Safestclaw is the alternative to openclaw.. You can naturally chat with it via text and voice, and you can…

voiceai · 274 ★

Set of 📝 with 🔗 to help those building Voice AI agents 🎙️🤖

aixplora · 274 ★

AIxplora is an open-source tool that lets you query all kinds of files, with no limit on length or format.

ai_webui · 270 ★

AI-WEBUI: A universal web interface for AI creation; an easy-to-use AI tool for processing images, audio, and video

openclaw-assistant · 269 ★

OpenClaw voice assistant app for Android - Wake word activation & system assistant integration

skills · 266 ★

Collections of skills for building with ElevenLabs

DB-GPT-Web · 265 ★

DB-GPT WebUI, LLM to vision.

Skill-Anything · 259 ★

Any source (PDF, video, web, audio, text) to interactive learning package with quizzes, flashcards and spaced…

Stage-Whisper · 258 ★

The main repo for Stage Whisper — a free, secure, and easy-to-use transcription app for journalists, powered…

twelvet · 257 ★

(Spring Boot 3.X microservices framework) Based on Spring Boot 3.X with Spring Cloud Alibaba / Spring Cloud Tencent +…

GPT-Automator · 256 ★

Your voice-controlled Mac assistant

KarmaBot · 256 ★

🤖 A Multipurpose Discord Bot with a Music System & Utility commands used by 200K+ users!

openai-voice-agent-sdk-sample · 256 ★

Sample application to add voice capabilities to the Agents SDK

react-native-chatbot · 253 ★

💬 Easy way to create conversation chats

gpt_server · 253 ★

gpt_server is an open-source framework for production-grade deployment of LLMs, embedding, reranker, ASR, TTS, text-to-image, image editing, and text-to-video models.

llama_ros · 252 ★

llama.cpp (GGUF LLMs) and llava.cpp (GGUF VLMs) for ROS 2

sapphire · 252 ★

She's the AI agent you come home to.

sepia-docs · 251 ★

Documentation and Wiki for SEPIA. Please post your questions and bug-reports here in the issues section…

jaicf-kotlin · 249 ★

Kotlin framework for conversational voice assistants and chatbots development

daily-bots-web-demo · 247 ★

Daily Bots Web Demo showcasing how to build real-time voice AI agents

jarvis-ai-assistant · 233 ★

Voice-activated AI assistant with speech recognition and NLP. Automate tasks effortlessly with this…

M.I.L.E.S · 232 ★

M.I.L.E.S, a GPT-4-Turbo voice assistant, self-adapts its prompts and AI model, can play any Spotify song…

AI · 226 ★

The definitive, open-source Swift framework for interfacing with generative AI.

neuralnoise · 225 ★

The AI Podcast Studio: generate podcasts scripts and their audio version with a team of AI workers in a…

agent-starter-python · 222 ★

A complete voice AI starter for LiveKit Agents with Python.

SpotifyTranscripts · 220 ★

🎙️ AI generated subtitles and segmented chapters for podcasts

amazon-sumerian-hosts · 210 ★

Amazon Sumerian Hosts (Hosts) is an experimental open source project that aims to make it easy to create…

nodejs-whisper · 209 ★

NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.

multimodal-mcp-client · 209 ★

[DEPRECATED] Superseded by systempromptio/systemprompt-template and systempromptio/systemprompt-core…

gdansk-ai · 199 ★

Full stack voice chatbot

chatbot-watson-android · 197 ★

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

leon-cli · 195 ★

⌨️ Command-line interface (CLI) for a better use of Leon, your open-source personal assistant. GNU/Linux…

BentoChain · 194 ★

A voice-enabled chatbot application built using 🦜️🔗 LangChain, text-to-speech, and speech-to-text models…

openai_tts · 193 ★

Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible endpoint to…

AutoGLM-TERMUX · 190 ★

Quickly deploy Open-AutoGLM agent on Android phone using Termux. Support AI voice recognition and enable…

uxie · 189 ★

PDF reader app with note-taking, annotations, collaboration, AI features (chat, flashcards generation w…

wyoming_openai · 186 ★

OpenAI-Compatible Proxy Middleware for the Wyoming Protocol

openai-whisper-realtime · 185 ★

A quick experiment to achieve almost realtime transcription using Whisper.

presenter · 185 ★

A Multi-Agent AI Tool that creates beautiful presentations with voice-overs 🎦🔥

realtime-ai · 179 ★

A real-time Agent framework for audio and video.

BlahST · 174 ★

Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak…

flutter_whisper.cpp · 171 ★

Flutter App That Can Transcribe Audio Offline/On Device with Whisper C++ Bindings via Rust

sayna · 171 ★

Sayna is a unified Voice Layer for AI Agents with seamless integration into existing agentic frameworks

Wally · 169 ★

Cute voice assistant built on ESP32 to help users with reminders, productivity, and daily conversations.

kkclaw · 165 ★

🦞 A cute desktop lobster AI assistant: a desktop lobster pet with OpenClaw AI, Edge TTS voice, and emotion animations

sample-strands-agent-with-agentcore · 164 ★

Reference architecture for agentic AI chatbots with Strands Agents and Amazon Bedrock AgentCore

audio-to-text-transcription · 162 ★

This repository contains a Python script that allows users to download the audio from a YouTube video…

kobold_assistant · 161 ★

Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using…

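Unitale · 158 ★

An AI audiobook production tool based on Indextts and Qwen3TTS. It uses an LLM to automatically break down scripts and recognize emotions, and integrates multi-character TTS…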

podgenai · 156 ★

OpenAI GPT based informational audiobook/podcast mp3 generator

talkGPT4All · 153 ★

A voice chatbot based on GPT4All and talkGPT, running on your local pc!

podcast-llm · 146 ★

Automatically generate engaging AI podcasts from nothing but an episode title.

BiliSum · 144 ★

AI video summarizer and knowledge base for Bilibili, YouTube and local…

AI-Voice-Agent · 143 ★

Self-hosted AI voice agent

jacobo-workflows · 142 ★

7 production n8n workflows from Jacobo, a multi-agent AI system (WhatsApp + Voice). Open source by default.

llm_intents · 142 ★

Exposes internet search tools for use by LLM-backed Assist in Home Assistant

portable-hermes-agent · 141 ★

Hermes Agent made portable desktop for Windows — 100 tools, GUI, local models via LM Studio, TTS, Music…

whisper-clip · 137 ★

WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly…

magda-core · 135 ★

A DAW built for automation, transformation, and fast musical iteration

flowcraft · 135 ★

Production-grade Go SDK for building AI agents with long-term memory, knowledge retrieval, and voice …

ai-waifu · 135 ★

AI VTuber Waifu and voice assistant

Auto-Subtitled-Video-Generator · 134 ★

Input a YouTube video link or upload a video file and get a video with subtitles.

SlackONOS · 134 ★

🎵 Democratic Slack/Discord bot for Sonos control with Spotify integration. Queue music, vote to skip, and let…

simplechat · 134 ★

Secure AI conversations with documents, video, audio, and more. Personal workspaces for focused context…

NachoBot · 133 ★

A multi-purpose goofy bot built by modifying the Maibot core

PersonalAssistantChatbot · 133 ★

A personal assistant chatbot capable of performing many of the same tasks as Google Assistant, plus more extra…

skills · 129 ★

AAHL's Agent Skills. A collection of practical agent skills, covering Home Assistant smart-home control, Microsoft Edge…

hypercheap-voiceAI · 128 ★

The most cost-effective, highest performance AI voice agent possible today

chatgpt-web · 128 ★

ChatGPT web application with support for multiple conversations, a large prompt library, PWA, ASR, and TTS

Awesome-Colorful-LLM · 128 ★

Recent advancements propelled by large language models (LLMs), encompassing an array of domains including…

eva · 128 ★

A New End-to-end Framework for Evaluating Voice Agents

Goida-AI-Unlocker · 123 ★

🛡 Installer for an unblocker of foreign AI services (and more) for Russia on Windows 10/11 🌍

PodAgent · 121 ★

PodAgent: A Comprehensive Framework for Podcast Generation

whis · 119 ★

Voice-to-text CLI for terminal users

OpenToys · 118 ★

Make local AI toys, robots, and devices with a MacBook and an Arduino ESP32

ai-course-notes · 118 ★

303 Chinese-language AI/LLM lecture notes, with online reading, PDF download, and LaTeX source viewing | Stanford CS336/CS224R/CS25 | Berkeley LLM Agents | Agent engineering practice

MisterWhisper · 115 ★

Push to talk voice recognition using Whisper

VoiceAgentRAG · 114 ★

workersai · 113 ★

Full-stack AI chat platform built on Cloudflare using Workers, Durable Objects, KV, and AI Gateway. Features…

speechless · 111 ★

LLM based agents with proactive interactions, long-term memory, external tool integration, and local…

zyron-assistant · 110 ★

⚡ A local, privacy-focused AI desktop assistant for Windows. Control your PC remotely via Telegram or locally…

telegram-llm-bot · 110 ★

Telegram LLM bot backed by OpenAI, Whisper, Beam, LLaMA, Weaviate, MinIO and MongoDB

awesome-openai-whisper · 106 ★

A curated list of awesome OpenAI Whisper resources

JARVIS-AI-ASSISTANT · 103 ★

A true Artificial Intelligent Assistant with ALICE as backend and offline speech recognition with vosk engine…

openai-realtime-python · 103 ★

Real-time voice agent powered by Agora and OpenAI

ChatVox · 103 ★

A "Chat With Any Video" project built in 24 hours as a personal challenge for @Supabase's AI Hackathon.

ai-phone-agent · 103 ★

AI Phone Agent: A starter kit to build AI agents that answer real phone calls and talk to customers in real…

LLMChat · 101 ★

A Discord chatbot that supports popular LLMs for text generation and ultra-realistic voices for voice chat.

modelguide · 101 ★

Open-source voice agent orchestration framework - build production voice AI pipelines without vendor lock-in

gptspeaker · 100 ★

The ChatGPT/DeepSeek Voice Assistant uses a Raspberry Pi (or desktop) to enable spoken conversation with…

agent-starter-node · 100 ★

A complete voice AI starter app for LiveKit Agents with Node.js

line · 99 ★

Cartesia Line SDK for voice agents.

tankwork · 98 ★

Desktop agent framework for creating AI agents that can see and control your computer through voice and text…

mobileclaw · 98 ★

🦞 MobileClaw: a lobster walkie-talkie with eyes | Multimodal voice+vision walkie-talkie for OpenClaw AI agents. iOS & Android.

GPTube · 97 ★

🎥 Youtube Video Summarizer and Question Answering App Using Whisper and Langchain

ZZZ-HiveMind-core · 97 ★

Join the OVOS collective, utils for OpenVoiceOS mesh networking

XnneHangLab · 96 ★

Hoping to use code to paint a heart for waifus.

whisper-nextjs · 96 ★

Next.js app for serverless deployments of OpenAI Whisper on Banana.dev

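laibot-client · 94 ★

Open-source AI: build voice dialogue robots, smart speakers, and more on open-source software and hardware. With human-machine dialogue and natural interaction, Laibot (来宝) has unlimited possibilities. Note that Laibot runs on Python 3!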

simulflow · 94 ★

A Clojure library for building real-time voice-enabled AI Agents. Simulflow handles the orchestration of…

ChatGPT-voice-control · 93 ★

Voice control for ChatGPT. Talk to ChatGPT and hear ChatGPT's responses in a natural voice.

realtime-interview-copilot · 93 ★

Realtime Interview Copilot is a web application that assists users in crafting responses during interviews…

J.A.R.V.I.S · 92 ★

Iron Man-inspired personal virtual assistant

shellChatGPT · 92 ★

Shell wrapper for OpenAI's ChatGPT, Whisper, and TTS. Features LocalAI, Ollama, Gemini, Anthropic, and more.

Browse other categories

Code & Development (23,649) · AI Infrastructure (21,941) · General (10,806) · Data & Research (5,080) · Business (3,002) · Image & Vision (1,815) · Content & Writing (1,269) · Personal Assistant (819) · Crypto & DeFi (295) · Translation (290)