Speech agents

This page lists every AI agent in the MeshKore directory tagged with the Speech capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile with details on capabilities, framework, language, freshness, and source attribution.
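The filtering and ranking described above can be sketched roughly as follows. This is only an illustration, not the MeshKore worker's actual code: the `Agent` record, its field names, the `top_agents` helper, and the "example-annotator" entry with its "Annotation" tag are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Agent:
    """Hypothetical normalized record; the field names are assumptions."""
    name: str
    stars: int
    capabilities: frozenset = field(default_factory=frozenset)


def top_agents(agents, capability, limit=200):
    """Return agents tagged with `capability`, most-starred first."""
    tagged = [a for a in agents if capability in a.capabilities]
    # sorted() is stable, so agents with equal stars keep their input order.
    return sorted(tagged, key=lambda a: a.stars, reverse=True)[:limit]


sample = [
    Agent("faster-whisper", 23174, frozenset({"Speech"})),
    Agent("ChatTTS", 39328, frozenset({"Speech"})),
    # Made-up entry with a made-up tag, included only to show the filter.
    Agent("example-annotator", 10, frozenset({"Annotation"})),
]

for agent in top_agents(sample, "Speech"):
    print(f"{agent.name} {agent.stars:,} ★")
```

Agents with the same star count (several appear in the list below) would stay in whatever order the input provides under this sketch; how the real directory breaks ties is not stated on this page.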

516 agents in this capability · ranked by popularity

Top 200 Speech agents

ChatTTS 39,328 ★

A generative speech model for daily dialogue.

faster-whisper 23,174 ★

Faster Whisper transcription with CTranslate2

leon 17,266 ★

🧠 Leon is your open-source personal assistant.

AudioGPT 10,176 ★

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

moonshine 8,268 ★

Very low latency speech to text, intent recognition, and text to speech, for building voice agents and…

Awesome-Prompt-Engineering 5,966 ★

This repository contains hand-curated resources for Prompt Engineering with a focus on Generative…

ml-road 4,798 ★

Machine Learning and Agentic AI Resources, Practice and Research

speech-to-speech 4,764 ★

Build local voice agents with open-source models

WhisperLive 4,047 ★

A nearly-live implementation of OpenAI's Whisper.

auto-subs 3,450 ★

Instantly generate AI-powered subtitles on your device. Works standalone or connects to DaVinci Resolve.

openwhispr 3,393 ★

Voice-to-text dictation app with local (Nvidia Parakeet/Whisper) and cloud models (BYOK). Privacy-first and…

whisper-standalone-win 3,045 ★

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

speechgpt 2,755 ★

💬 SpeechGPT is a web application that enables you to converse with ChatGPT.

awesome-whisper 2,309 ★

🔊 Awesome list for Whisper — an open-source AI-powered speech recognition system developed by OpenAI

pluely 1,976 ★

The Open Source Alternative to Cluely - A lightning-fast, privacy-first AI assistant that works seamlessly…

openai-edge-tts 1,899 ★

Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs

alan-sdk-ios 1,888 ★

The Self-Coding System for Your App — Alan AI SDK for iOS

NLP-Models-Tensorflow 1,780 ★

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 < Tensorflow < 2.0

alan-sdk-flutter 1,768 ★

The Self-Coding System for Your App — Alan AI SDK for Flutter

react-simple-chatbot 1,756 ★

💬 Easy way to create conversation chats

ElatoAI 1,756 ★

Realtime Voice AI with 100+ Models on Arduino ESP32 with Secure Websockets and Edge Functions for AI Toys…

alan-sdk-ionic 1,661 ★

The Self-Coding System for Your App — Alan AI SDK for Ionic

Dragonfire 1,409 ★

The open-source virtual assistant for Ubuntu-based Linux distributions

alan-sdk-cordova 1,140 ★

The Self-Coding System for Your App — Alan AI SDK for Cordova

lotti 1,114 ★

Open-source private logbook with a local agentic layer. Long-living AI agents read what you record and…

AI-Waifu-Vtuber 1,078 ★

AI Vtuber for Streaming on Youtube/Twitch

whisper-writer 1,058 ★

💬📝 A small dictation app using OpenAI's Whisper speech recognition model.

Whisperboard 1,032 ★

The open-source iOS app that's making quality voice transcription more accessible on mobile devices.

Transformers-for-NLP-2nd-Edition 962 ★

Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and…

voquill 947 ★

Open source voice dictation technology

sokuji 876 ★

Live speech translation powered by on-device AI and cloud providers — OpenAI, Google Gemini, Palabra.ai…

local-talking-llm 852 ★

A talking LLM that runs on your own computer without needing the internet.

whisper-playground 833 ★

Build real time speech2text web apps using OpenAI's Whisper https://openai.com/blog/whisper/

use-whisper 786 ★

React hook for OpenAI Whisper with speech recorder, real-time transcription, and silence removal built-in

whisper.rn 781 ★

React Native binding of whisper.cpp.

SwiftWhisper 779 ★

🎤 The easiest way to transcribe audio in Swift

whisper.unity 729 ★

Running speech to text model (whisper.cpp) in Unity3d on your local machine.

ttsfm 727 ★

TTSFM mirrors OpenAI's TTS service, providing a compatible interface for text-to-speech conversion with…

BabelDuck 681 ★

Beginner-friendly AI conversation practice application

whisper_android 660 ★

Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android

gpt-home 644 ★

ChatGPT at home! A better alternative to commercial smart home assistants, built on the Raspberry Pi using…

speech-to-text 615 ★

Real-time transcription using faster-whisper

chatterbox-tts-api 600 ★

Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate voice cloned…

alan-sdk-reactnative 579 ★

The Self-Coding System for Your App — Alan AI SDK for React Native

ollama-voice-mac 517 ★

Mac compatible Ollama Voice

JARVIS-ChatGPT 450 ★

A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM…

alan-sdk-pcf 428 ★

The Self-Coding System for Your App — Alan AI SDK for Power Apps

project-raven 403 ★

Open-source AI meeting copilot - real-time transcription, echo cancellation, and AI assistance. Captures…

Stream-Omni 386 ★

Stream-Omni is a GPT-4o-like language-vision-speech chatbot that simultaneously supports interaction across…

potato 383 ★

potato: the portable annotation tool

edgen 372 ★

⚡ Edgen: Local, private GenAI server alternative to OpenAI. No GPU required. Run AI models locally: LLMs…

insanely-fast-whisper-api 356 ★

An API to transcribe audio with OpenAI's Whisper Large v3!

llm 349 ★

A powerful Rust library and CLI tool to unify and orchestrate multiple LLM, Agent and voice backends (OpenAI…

Webscout 345 ★

Webscout is the all-in-one search and AI toolkit you need. Discover insights with Yep.com, DuckDuckGo, and…

whisper-website 324 ★

Simple self-hosted web application, which can be used to convert audio to subtitles by OpenAI's Whisper model

openai-chat-api-workflow 316 ★

🎩 An Alfred 5 Workflow for using OpenAI Chat API to interact with GPT models 🤖💬 It also allows image…

ChatGPT-OpenAI-Smart-Speaker 312 ★

This AI Smart Speaker uses speech recognition, TTS (text-to-speech), and STT (speech-to-text) to enable voice…

RuntimeSpeechRecognizer 306 ★

Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on Whisper OpenAI…

gpt-voice-conversation-chatbot 302 ★

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while…

tetos 279 ★

A unified interface for multiple Text-to-Speech (TTS) providers.

voiceai 274 ★

Set of 📝 with 🔗 to help those building Voice AI agents 🎙️🤖

MITSUHA 274 ★

World's First Multilingual Inexpensive Therapeutic Sophisticated Ultra-responsive Holographic Agent. In…

ai_webui 270 ★

AI-WEBUI: A universal web interface for AI creation, an easy-to-use AI tool for processing images, audio, and video

openclaw-assistant 269 ★

OpenClaw voice assistant app for Android - Wake word activation & system assistant integration

DB-GPT-Web 265 ★

DB-GPT WebUI, LLM to vision.

Stage-Whisper 258 ★

The main repo for Stage Whisper — a free, secure, and easy-to-use transcription app for journalists, powered…

react-native-chatbot 253 ★

💬 Easy way to create conversation chats

sepia-docs 251 ★

Documentation and Wiki for SEPIA. Please post your questions and bug-reports here in the issues section…

jarvis-ai-assistant 233 ★

Voice-activated AI assistant with speech recognition and NLP. Automate tasks effortlessly with this…

baibot 227 ★

🤖 A Matrix bot for using different capabilities (text-generation, text-to-speech, speech-to-text…

SpotifyTranscripts 220 ★

🎙️ AI generated subtitles and segmented chapters for podcasts

amazon-sumerian-hosts 210 ★

Amazon Sumerian Hosts (Hosts) is an experimental open source project that aims to make it easy to create…

nodejs-whisper 209 ★

NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.

Dictate 208 ★

A powerful Whisper AI keyboard for reliable speech transcription

gdansk-ai 199 ★

Full stack voice chatbot

chatbot-watson-android 197 ★

An Android ChatBot powered by Watson Services - Assistant, Speech-to-Text and Text-to-Speech on IBM Cloud.

samantha-os1-openai-realtime 196 ★

Samantha OS1 is a conversational AI assistant powered by the Realtime API from OpenAI

BentoChain 194 ★

A voice-enabled chatbot application built using 🦜️🔗 LangChain, text-to-speech, and speech-to-text models…

openai_tts 193 ★

Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible endpoint to…

voqal 192 ★

Voice native AI agent for the builders of tomorrow

zai-tts 189 ★

🗣️ ZAI/GLM TTS to OpenAI Speech API, a free speech synthesis API with voice cloning support, based on Zhipu TTS

qvac 186 ★

QVAC - Local AI SDK and libraries for building private, cross-platform, peer-to-peer AI applications. Run…

Patter 184 ★

Open-source voice-AI SDK. The Vapi/Retell alternative for builders who want to own the stack. Give your AI…

DeLive 184 ★

System audio capture + multi-provider ASR + local-first AI review workspace. Floating live captions, 12 ASR…

BlahST 174 ★

Input text from speech in any Linux window, the lean, fast and accurate way, using whisper.cpp OFFLINE. Speak…

ospeak 171 ★

CLI tool for running text through OpenAI Text to speech

sayna 171 ★

Sayna is a unified Voice Layer for AI Agents with seamless integration with existing agentic frameworks

web-whisper 165 ★

OpenAI's Whisper Audio to text transcription right into your web browser! An open source AI subtitling suite.

kobold_assistant 161 ★

Like ChatGPT's voice conversations with an AI, but entirely offline/private/trade-secret-friendly, using…

aidialer 160 ★

A full stack app for interruptible, low-latency and near-human quality AI phone calls built from stitching…

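Unitale 158 ★

An AI audiobook production tool based on Indextts and Qwen3TTS. Uses an LLM to automatically break down scripts and recognize emotions, and integrates multi-character TTS…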

whitelightning 152 ★

WhiteLightning distills massive, state-of-the-art language models into lightweight, hyper-efficient text…

cerul 138 ★

The video search layer for AI agents. Search video by meaning — across speech, visuals, and on-screen text.

whisper-clip 137 ★

WhisperClip simplifies your life by automatically transcribing audio recordings and saving the text directly…

Auto-Subtitled-Video-Generator 134 ★

Input a YouTube video link or upload a video file and get a video with subtitles.

PersonalAssistantChatbot 133 ★

A personal assistant chatbot capable of performing many of the same tasks as Google Assistant, plus more extra…

template-repo 128 ★

Agent orchestration & security template featuring MCP tool building, agent2agent workflows, mechanistic…

WhatsappAPI 122 ★

A simple API to integrate chatbots written in JavaScript with WhatsApp Web 💬📲 (Store…

whisper-to-input 122 ★

An Android keyboard that performs speech-to-text (STT/ASR) with OpenAI Whisper and inputs the recognized text…

PodAgent 121 ★

PodAgent: A Comprehensive Framework for Podcast Generation

uttertype 117 ★

Short code for dictation using OpenAI Whisper for transcription.

MisterWhisper 115 ★

Push to talk voice recognition using Whisper

workersai 113 ★

Full-stack AI chat platform built on Cloudflare using Workers, Durable Objects, KV, and AI Gateway. Features…

speechless 111 ★

LLM based agents with proactive interactions, long-term memory, external tool integration, and local…

awesome-openai-whisper 106 ★

A curated list of awesome OpenAI's Whisper

JARVIS-AI-ASSISTANT 103 ★

A true Artificial Intelligent Assistant with ALICE as backend and offline speech recognition with vosk engine…

InsightSolver-Colab 102 ★

InsightSolver: Colab notebooks for exploring and solving operational issues using deep learning, machine…

LLMChat 101 ★

A Discord chatbot that supports popular LLMs for text generation and ultra-realistic voices for voice chat.

go-whisper 101 ★

Speech to Text using a Docker image with ggerganov/whisper.cpp

gptspeaker 100 ★

The ChatGPT/DeepSeek Voice Assistant uses a Raspberry Pi (or desktop) to enable spoken conversation with…

speech-rest-api 99 ★

Transcription and TTS Rest API (OpenAI Whisper, Speechbrain)

ha-openai-whisper-stt-api 97 ★

HACS custom integration for using Whisper speech-to-text (OpenAI, GroqCloud or Mistral) API in the Assist…

unspeech 95 ★

🗣️🔊 Your Text-to-Speech Services, All-in-One.

ios-chatbot 94 ★

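laibot-client 94 ★

Open-source AI: build voice conversation robots and smart speakers on open-source software and hardware… Human-machine dialogue, natural interaction: Laibot (来宝) has unlimited possibilities. Note: Laibot runs on Python 3!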

Chiku 94 ★

A modern AI chatbot with chat, image generation, and text-to-speech features, designed for a smooth and…

simulflow 94 ★

A Clojure library for building real-time voice-enabled AI Agents. Simulflow handles the orchestration of…

ChatGPT-voice-control 93 ★

Voice control for ChatGPT. Talk to ChatGPT and hear ChatGPT's responses in a natural voice.

realtime-interview-copilot 93 ★

Realtime Interview Copilot is a web application that assists users in crafting responses during interviews…

feros 92 ★

Open-source voice agent OS. Rust runtime, AI-driven builder, sub second latency. Self-host everything.

J.A.R.V.I.S 92 ★

Iron man inspired Personal virtual assistant

pywhisper 90 ★

openai/whisper + extra features

Talk2GPT 89 ★

GPT-3 client for Windows and Unix with memories management that supports both text and speech in any…

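chatgpt_android 88 ★

ChatGPT for Android: a personalized AI that only needs an API key set locally, with chat history stored locally. To try the voice version, download the commercial edition or integrate the Azure Speech SDK yourself (paid, with a free quota currently offered).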

asktube 86 ★

AskTube - An AI-powered YouTube video summarizer and QA assistant powered by Retrieval Augmented Generation…

NOVA-NodeJS 86 ★

NOVA is a customizable voice assistant made with Node.js.

SpeechAgents 86 ★

SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems

OpenAI-Text-To-Speech-for-Unity 85 ★

Implementation of OpenAI's Text-To-Speech in Unity. Synthesize any text and play it via any AudioSource.

trx 82 ★

Agent-first CLI for audio/video transcription via Whisper

openai-whisper-api 80 ★

A sample speech transcription app implementing OpenAI Text to Speech API based on Whisper, an automatic…

Awesome-Multimodal-Chatbot 79 ★

Awesome Multimodal Assistant is a curated list of multimodal chatbots/conversational assistants that utilize…

Whisper_to_ChatGPT 77 ★

Chrome extension for voice-to-text conversations with ChatGPT using OpenAI Whisper API

whisper-openai-gradio-implementation 75 ★

Whisper is an automatic speech recognition (ASR) system Gradio Web UI Implementation

AmigaGPT 75 ★

AmigaOS 3.1/4.1 and MorphOS application for chatting with ChatGPT or generating images

VivaDicta 74 ★

iOS & watchOS speech-to-text app with AI voice keyboard, on-device RAG, and chat with your notes - powered by…

AriaType 73 ★

Voice-driven writing, input, and cross-app work for your desktop.

ttv-chat-bot 73 ★

Twitch livestream bot that can control colors for overlays from Stream Elements, play sound effects, handle…

WatBot 72 ★

An Android ChatBot powered by IBM Watson Services (Assistant V1, Text-to-Speech, and Speech-to-Text with…

AI-Voice-assistant 71 ★

AI Voice Assistant: Talk to an AI agent that helps you with event scheduling, contact management, accessing…

computing-Korean-STT-error-rates 71 ★

A Python function package that computes the character error rate (CER) and word error rate (WER) of output transcripts from Korean STT sentence recognizers

Echo 71 ★

Production-ready audio and video transcription app that can run on your laptop or in the cloud.

svelte-openai-realtime-api 70 ★

Svelte component for using the OpenAI Realtime API

OpenAI_Whisper_ASR 66 ★

A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper "State of the…

IntelliJava 64 ★

Integrate with the latest language models, image generation, speech, and deep learning frameworks like…

VRCTextboxSTT 63 ★

A SpeechToText application that uses OpenAI's whisper via faster-whisper to transcribe audio and send that…

web-ai-toolkit 63 ★

The Web AI Toolkit is a powerful, privacy-first JavaScript library that brings advanced AI capabilities…

speechdigest 62 ★

Audio to summary with openAI Whisper & GPT 3.5/4 using streamlit

Voice_ChatBot 60 ★

Chatbot in Russian with speech recognition using PocketSphinx and speech synthesis using RHVoice. The…

OpenAI-TTS-Gradio 59 ★

Use OpenAI TTS(Text to Speech) API with Gradio

swiftube-frontend 57 ★

It's like ChatGPT for videos.

MeuxCompanion 55 ★

A self-hosted AI companion web app with anime-style Live2D and VRM characters. Talk with your companion via…

Jarvis-Termux 53 ★

jarvis ai for Termux;)

adk-mcp-a2a-crash-course 53 ★

This project demonstrates a multi-agent system using Google's Agent Development Kit (ADK), Agent2Agent (A2A)…

JARVIS-AI-Assistant 51 ★

JARVIS AI Assistant 🤖 A virtual assistant project inspired by Tony Stark's JARVIS, powered by speech…

whisper.cpp_windows 50 ★

Just an .exe that can be used for those unable to build whisper.cpp in Windows.

deepgram-voice-agent-demo 50 ★

Demo for Deepgram Voice Agent API

DigitalLife 49 ★

A "digital life" with long-term memory and a Live2D avatar

Unity-QuestConversationalAI 49 ★

Unity packages for real-time conversational AI with speech-to-speech capabilities. Integrates OpenAI and…

streamlit_whisper_transcription 48 ★

Streamlit Audio Transcription with OPENAI's Whisper Ai: An interactive Streamlit app demonstrating real-time…

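kuon 47 ★

Kuon (久远): an in-development LLM voice assistant, currently focused on ease of use and a quick start, with support for selective conversation memory and Model Context Protocol (MCP) services. KUON: A large language model-based…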

live-interview 44 ★

Chatbot with a 3D avatar that can answer interview questions on your behalf. It can speak and understand…

whisper-server 44 ★

macOS menu bar app providing a local HTTP server compatible with the OpenAI Whisper API for fast and private…

kwami 43 ★

👻 kwami.io | A 3D Interactive AI Companion Library for creating engaging AI companions with visual (blob)…

Python-Voice-Assistant 43 ★

A Python based Voice Assistant like Siri

MMM-WhisperGPT 43 ★

A Whisper + ChatGPT MagicMirror Module.

dispatch 42 ★

Revamp your morning routine and supercharge productivity with Dispatch. The ultimate Apple Shortcut powered…

azure-podcast-generator 42 ★

Generate an engaging podcast based on your document using Azure OpenAI and Azure Speech.

saiku 40 ★

AI Agent capable of automating various tasks using MCP

whisper-subtitles 40 ★

🎬 AI-powered localhost subtitle generator for hearing-impaired users. Automatic speech recognition using…

OpenAI_Whisper_Streamlit 40 ★

A minimalistic automatic speech recognition streamlit based webapp powered by OpenAI's Whisper

Web-AI-Spotify-DJ 40 ★

Spotify Web AI DJ - client side agentic smarts using Gemma 2, two billion parameter LLM, to play what a user…

HA_MistralAI 40 ★

Home Assistant custom integration — Mistral AI as conversation agent and Voxtral as speech-to-text engine. …

Jugalbandi-Manager 39 ★

Jugalbandi (JB) Manager is a full AI-powered conversational chatbot platform. It's platform agnostic and can…

audiolizr 39 ★

A bentoML-powered API to transcribe audio and make sense of it

sussurro 39 ★

A fully local, open-source voice-to-text tool that acts as a system-wide AI dictation layer, converting…

AIReceptionist 39 ★

Open-source, self-hosted AI phone receptionist powered by OpenAI Realtime API. High-fidelity speech-to-speech…

on-the-road-copilot 38 ★

A minimal speech-to-structured output app built with Azure OpenAI Realtime API.

pdf-to-audiobook 38 ★

Uses OpenAI API to clean pdf then converts it to professional grade audiobook with text to speech.

Taiwanese-Whisper 37 ★

Fine-tune Whisper model for Taiwanese speech recognition

GPT_ALL 37 ★

This project aims to combine the latest LLMs, Multi-Step Asynchronous Function Calling, Natural Language…

Daisy-openAI-chat 37 ★

Python platform for working with LLMs

voice-input-method 37 ★

An AI-native, cross-platform, offline voice input method

sky-livekit-agent-perplexica 37 ★

Sky LiveKit Agent Perplexica is a local, free solution integrating LiveKit with advanced internet search. It…

VISOR---A-Voice-Assistant 36 ★

V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory!

Unity_OpenAI 36 ★

This GitHub repository shows how to integrate openai GPT-3 language model and ChatGPT API into a Unity…

Audio-transcriber 36 ★

Simple Python audio transcriber using OpenAI's Whisper speech recognition model

Waifu_AI_Vtuber 35 ★

Waifu_AI_Vtuber is an AI virtual YouTuber chatbot powered by OpenAI GPT-3.5, interacting in real-time with…

voice-stream 35 ★

A framework for creating voice-based agents. Integrates LLMs with speech recognition and text-to-speech

azure-avatar-demo 34 ★

Text To Speech Demo in ReactJS Application using Azure Avatar AI Service.

voice-gpt 34 ★

Let's turn ChatGPT into VoiceGPT (Vue JS, Vite, Open AI, AWS Polly) ChatGPT Clone (kind of lol)

whisper-speech-to-text 34 ★

Whisper Speech-to-Text is a JavaScript library for recording and transcribing user audio into text via…

interviewcopilot 34 ★

Real-time transcription, AI-driven answer suggestions, and interview simulation using Next.js, React, Azure…

clawtalk-ios 34 ★

Native iOS app for talking to your OpenClaw agents by voice or text. On-device speech recognition, streaming…

Youtube-Shorts-Generator 33 ★

Harness OpenAI's power to effortlessly create YouTube Shorts with this project. Includes tools for generating…

openai_stt_ha 33 ★

OpenAI Whisper in Home Assistant via the OpenAI API for use in the Assist pipeline

YATSEE 33 ★

YATSEE - Yet Another Tool for Speech Extraction & Enrichment

word-teacher 32 ★

Efficient AI English Learning: Read & Speak via Web | An efficient web app for practicing English reading aloud and conversation with AI

Sophia-AI-Assistant 32 ★

Sophia AI Assistant is a Python-based desktop AI that performs a variety of tasks, including answering…

Assistant 32 ★

A machine learning powered, voice-based virtual assistant for Raspberry Pi. Supports several features like…

Rofunc-ros 31 ★

A ROS package for Human-centered Interactive Intelligent Humanoid Robots
