Scraper agents

This page lists every AI agent in the MeshKore directory tagged with the Scraper capability. Agents are sourced from public platforms (GitHub, Hugging Face, npm, PyPI, awesome-list curations, and direct submissions), normalized by the MeshKore worker, and ranked by GitHub stars. Each card links to the agent's profile with details on capabilities, framework, language, freshness, and source attribution.

122 agents in this capability · ranked by popularity

Top 122 Scraper agents

firecrawl · 124,884 ★

🔥 Search, scrape, and clean the web for AI agents.

huginn · 49,334 ★

Create agents that monitor and act on your behalf. Your agents are standing by!

Jobs_Applier_AI_Agent_AIHawk · 29,814 ★

AIHawk aims to ease the job hunt by automating the job application process. Utilizing artificial…

Scrapegraph-ai · 26,170 ★

Python scraper based on AI

llm-scraper · 6,749 ★

Turn any webpage into structured data using LLMs

myGPTReader · 4,421 ★

A community-driven way to read and chat with AI bots - powered by chatGPT.

AnyCrawl · 3,173 ★

AnyCrawl 🚀: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP…

CyberScraper-2077 · 3,028 ★

A powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

oxylabs-ai-studio-py · 2,919 ★

Structured data gathering from any website using AI-powered scraper, crawler, and browser automation…

weibo_terminater · 2,320 ★

The final Weibo crawler: scrape anything from Weibo (comments, post contents, followers, anything). The Terminator

thepipe · 1,526 ★

Get clean data from tricky documents, powered by vision-language models ⚡

apify-mcp-server · 1,275 ★

The Apify MCP server enables your AI agents to extract data from social media, search engines, maps…

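RedBox · 998 ★

Create high-quality content with AI: the best image-generation tool built on gpt-image-2, automatic AI image layout, an Openclaw for Xiaohongshu, and an AI workbench for independent media creators. RedClaw, an AI creation tool for Xiaohongshu, supports downloading Xiaohongshu image-and-text posts, learning creative styles, and Xiaohongshu AI creation |…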

x-tweet-fetcher · 842 ★

Fetch X/Twitter tweets, replies, timelines, and articles without login or API keys — field tool for AI agents.

firecrawl-app-examples · 741 ★

🔥 This repository contains complete application examples, including websites and other projects, developed…

ai-scraper-py · 662 ★

AI Scraper is a powerful scraping tool and scrape agent built to automate data extraction with unmatched…

AutoScraper · 485 ★

Official implementation of the paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation"…

n8n-claw · 447 ★

OpenClaw-inspired autonomous AI agent built entirely in n8n. Adaptive RAG-powered memory, Skills via MCP…

scraperai · 405 ★

ScraperAI is an open-source, AI-powered tool designed to simplify web scraping for users of all skill levels.

resume_render_from_job_description · 405 ★

Resume_Builder_AIHawk is a powerful Python tool that allows you to automatically customize your resume based…

extractor · 317 ★

Use LLMs to robustly extract web data

reader · 308 ★

📚 This is an adapted version of Jina AI's Reader for local deployment using Docker. Convert any URL to an…

gpt4V-scraper · 302 ★

AI agent that can SEE 👁️, control, navigate, & do stuff for you on your browser.

llm-reader · 290 ★

Turn Webpage to LLM friendly input text. Similar to Firecrawl and Jina Reader API. Makes RAG, AI web…

knowledge-gpt · 290 ★

Extract knowledge from all information sources using gpt and other language models. Index and make Q&A…

teracrawl · 263 ★

High-performance web crawler API optimized for LLMs. Turn any search or website into clean Markdown using…

lego-ai-parser · 239 ★

Lego AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements.

search-result-scraper-markdown · 239 ★

This project provides a powerful web scraping tool that fetches search results and converts them into…

AI-Resume-Analyzer-and-LinkedIn-Scraper-using-Generative-AI · 205 ★

Developed an AI application using LLM to analyze user resumes and provided the summarization, strengths…

unofficial-claude-api · 200 ★

Unofficial Claude API supporting direct HTTP chat creation/deletion/retrieval, messages with multiple file…

BrowserPilot · 149 ★

Open‑source alternative to Perplexity Comet, director.ai and firecrawl combined

Upwork-AI-jobs-applier · 143 ★

AI tool for automating Upwork job applications using AI agents to find and qualify jobs, write personalized…

web-scout-mcp · 129 ★

A powerful MCP server extension providing web search and content extraction capabilities. Integrates…

crw · 116 ★

Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search API with MCP server for AI…

wxpath · 111 ★

wxpath - declarative web crawling with XPath; a Web Query Language (WQL)

x-twitter-scraper · 96 ★

Twitter scraper API skill for tweet search, advanced Twitter search, profile tweets, follower export, media…

anansi · 88 ★

A self-healing web scraper built for hostile sites: selectors repair themselves, browser rendering kicks in…

RAG-based-job-search-assistant · 87 ★

linkedin-jobs-RAG

WebScraper · 87 ★

Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation…

ai-web-scraper · 78 ★

AI web scraper built with Crawl4AI for extracting structured leads data from websites.

advanced-sitemap-parser · 76 ★

XML sitemap parser designed to extract and process millions of URLs while bypassing most modern anti-bot…

Custom-MCP-Server · 75 ★

MCP server for scraping LinkedIn, Facebook, Instagram profiles and Google search.

Website-Crawler · 74 ★

Extract data from websites in LLM ready JSON or CSV format. Crawl or Scrape entire website with Website…

OpenAver · 73 ★

Modern JAV metadata manager — multi-source scraping, Jellyfin integration, and AI-ready API. Built with…

ytfetcher · 72 ★

⚡ Build structured YouTube datasets at scale — effortlessly fetch transcripts and rich metadata for NLP, ML…

reddit_karma_farmer_auto_commentator_with_AI · 64 ★

Reddit_Commentator_AIHawk is a Python project showcasing the power of artificial intelligence in social media…

bedrock-agents-webscraper · 59 ★

This repo provides guidance on setting up a bedrock agent to webscrape and internet search via action groups

slither · 59 ★

A simple, easy to use framework for adding randomized, anonymous IP addresses and user-agents to web…

oxylabs-ai-studio-js · 47 ★

Structured data gathering from any website using AI-powered scraper, crawler, and browser automation…

Reddit-AI-Agent · 37 ★

Reddit AI Agent is an intelligent tool that helps you explore Reddit like never before! 🔎 It allows you to…

langchain-webscraper-demo · 35 ★

A chatbot demo that scrapes a website and stores the result in a vector db, which can then be queried via…

gptauto · 33 ★

ChatGPT selenium scraper written in Python

crawl4ai-skill · 31 ★

Web scraping skill for Claude AI. Crawl websites, extract structured data with CSS/LLM strategies, handle…

OpenCometAI · 29 ★

Open Comet is an autonomous AI agent integrated into your Chrome browser. It enables safe, transparent, and…

git-repo-parser · 29 ★

A tool to scrape all files from a GitHub repository and turn them into a JSON or TXT file. Useful for AI and…

x-scraper · 28 ★

A Twitter/X scraper built with Playwright for browser automation and OpenAI GPT-4 for AI-powered tweet…

web-crawling-guides · 27 ★

How-to guides on web crawling and scraping

perplexity-ai-export · 27 ★

Grabs all your Perplexity conversations data, spits it out into a nice file folder structure and allows you…

AI-web_scraper · 27 ★

Just mention what you want and it will extract/scrape data from the Web. Useful to create AI web…

spider-clients · 26 ★

Python, Javascript, and Rust libraries for the Spider Cloud API.

wikibot · 26 ★

A 🤖 which provides features from Wikipedia like summary, title searches, location API etc.

openai-scraper · 26 ★

This is a template repository for building a web scraper with OpenAI support. The repository provides a basic…

Agent-WebCloak · 26 ★

[IEEE S&P'26] WebCloak: Characterizing and Mitigating the Threats of LLM-Driven Web Agents as Intelligent…

pricing-page-scraper · 25 ★

Parse SaaS pricing page using Open AI - GPT-3.5

zero-gtm · 23 ★

web-extract-with-chatgpt · 22 ★

A Python project that extracts data from websites with the option to process the data through @openai's…

liquidation-cluster-signal-scraper · 21 ★

A bot that scrapes open-interest and liquidation heatmaps to alert traders when a "Short Squeeze" or "Long…

wechat-article-to-md · 20 ★

Claude Code Skill - Scrapes WeChat Official Account articles and converts them to Markdown, downloading images automatically | WeChat Article to Markdown Converter

graph · 19 ★

⚡️ Real-time Knowledge Graph for AI Agents. Connect LLMs to verified weather, stock, and currency data via…

GPT-auto-webscraping · 19 ★

olostep-mcp-server · 17 ★

MCP server for Olostep — the web scraping, crawling, and search infrastructure used by top AI companies…

Product-Matching · 17 ★

The topic is about product matching via Machine Learning. This involves using various machine learning…

SCAPO · 17 ★

🧘 Reddit-powered AI optimization tips | Save your time and credits | can cover 380+ services | Real tips from…

llmweb-rs · 17 ★

Webpage to structured data in Rust & LLM

base44-docs-tool · 16 ★

Instant, local access to complete Base44 documentation with AI assistant integration

AURORA · 16 ★

AURORA (Artificial Unified Responsive Optimized Reasoning Agent) uses lobes and web research for RAG based…

DocsScraper.jl · 14 ★

Efficient RAG knowledge pack creator from online Julia documentation

scraper · 14 ★

RAG-based Web Scraping

Shop-filterOpenCode · 11 ★

ShopFilter is by OpenCode AI…

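dingtalk-ai-robot · 11 ★

DingTalk smart bot supporting AI Q&A, knowledge-base retrieval, JIRA management, server maintenance, weekly and daily report summaries, quick ticket creation, and more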

ai-docs-vector-db-hybrid-scraper · 11 ★

Retrieval-augmented docs ingestion stack: Firecrawl + Crawl4AI + Qdrant vector search with FastAPI and MCP…

headlines-gpt · 10 ★

Scrapes headlines from CNN and FOX, then has ChatGPT do cross-analysis

InstaWaves · 10 ★

Telegram bot which helps in promoting Instagram accounts

chatgpt-presentation-generator-bot · 10 ★

Telegram bot utilizing OpenAI's GPT to generate presentations and abstracts in PPTX and DOCX formats.

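Destiny-job-scout · 9 ★

🦅 DestinyScout: an L3-level autonomous, personalized job-search engine built on an Agent-Native LLM and Boss直聘 (BOSS Zhipin). Say goodbye to mechanical copy-and-paste: it deeply incorporates your personal career DNA…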

omniwire · 9 ★

Infrastructure layer for AI agent swarms — 88 MCP tools · A2A · OmniMesh VPN · Scrapling scraper · COC sync ·…

Job-Prep · 9 ★

This is the repository for a Streamlit application that helps with job applications. This app integrates…

linkedin-job-hunting-assistant · 8 ★

A Python tool that automates LinkedIn job search, ranking, and export by combining Bright Data's LinkedIn Job…

stackoverflow-scraper-messenger-bot · 8 ★

A messenger bot that answers messages by scraping stackoverflow questions and answers

llm-scraper-py · 8 ★

Python implementation of https://github.com/mishushakov/llm-scraper

flyscrape · 8 ★

The Most Powerful Open-source LLM Friendly Typescript Web Crawler & Scraper

ai-news-scraper · 8 ★

AI News Scraper & Semantic Search: A Python application that scrapes news articles, uses GenAI to generate…

promobot · 7 ★

PromoBot - A web scraper that monitors promotion sites by searching keywords and reporting to a Telegram…

Real-Time-Social-Media-Content-Retrievel-System · 7 ★

The Real Time Social Media Content Retrieval System fetches real-time LinkedIn posts based on user queries…

Awesome-Auto-Research · 6 ★

Tracking the systems that automate scientific research — from literature scrapers to full paper-writing…

sdk · 6 ★

Lightfeed SDK to search and filter web data

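pangolinfo-amazon-scraper-cli · 5 ★

Pangolinfo Amazon scraper and data collection tool: real-time Amazon data collection (product details, keywords, reviews, rankings, categories/niches) built on the Pangolinfo Amazon Scrape API / Data API, outputting AI…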

WebScraperToolkit · 5 ★

AI-first web scraping engine with stealth bypass, MCP server, and multimodal output (Markdown, JSON, PDF) for…

Scraper · 5 ★

A robust, local-first intelligence application for scraping and analyzing Dark Web data.

paper-reading-agent · 5 ★

Paper Reading Agent Team…

renderscholar · 5 ★

Tired of LLMs citing fake papers? renderscholar is a Google Scholar scraper (inspired by Andrej Karpathy’s…

Free-API · 5 ★

🆓 Access a collection of free, public JSON APIs with no limits or authentication required. Explore…

simple-chatgpt-wrapper · 5 ★

A simple npm package to perform requests as a user on the OpenAI ChatGPT page.

scraper-flow · 5 ★

redditlens · 4 ★

Find real pain points on Reddit and draft value-first replies. CLI + Claude Code skill. Serper + Reddit…

startups-from-ai · 4 ★

This AI bot goes online, gathers information about AI startups, and posts updates about them on X and Dev.to.

Scrapo · 4 ★

AI-native, agent-first web scraping for Python — cost-aware tiered fetching (HTTP → browser → stealth →…

Stone_Scraper · 4 ★

Stone Scraper is an AI-powered tool for automated web data extraction. Built with Streamlit, Langchain, and…

Job-Resume-Generator-using-OpenAI-LangChain · 4 ★

This is the repository for a Streamlit application that helps with job applications. This app integrates…

scraperapi-mcp · 4 ★

This MCP server enables LLMs to retrieve and process web scraping requests using ScraperAPI.

purify · 3 ★

Single-binary web scraper for AI agents. Headless Chrome + Readability → clean Markdown. Up to 99% token …

openai-sdk-with-web-unlocker · 3 ★

Integrating OpenAI Agents SDK with Bright Data Web Unlocker, enabling AI agents to access, extract, and…

mcp-webscraper · 3 ★

Local MCP Server with Claude Desktop (Windows + WSL) with scraping and crawling tools.

otodom_scraper_and_information_retrieval · 3 ★

Otodom scraper and information retrieval

crewai_multiagent · 3 ★

CrewAI Multiagent is an AI-powered automation suite for research, news, poetry, code execution, and PDF…

Linkedin-MCP-Server · 3 ★

A lightweight MCP server for LinkedIn automation. Supports profile, job, company and post scraping. Enables…

wechat-article-to-markdown · 3 ★

Convert WeChat Official Account articles to clean Markdown with metadata extraction, image download, and code…

agent-search-cli · 3 ★

Enable AI agents to search, crawl, and extract web data with IP rotation, CAPTCHA handling, and rate limit…

job-materializer · 2 ★

Scrapes newly posted jobs from a variety of sites, and an AI agent filters them based on the user's resume.

apex-growth-openclaw-skill · 2 ★

Autonomous AI agent skill for aggressive lead generation and growth hacking.

nitjsr-hub · 2 ★

A Student hub, real-time plugin based web-application featuring a chat, marketplace, video conferencing, and…

x2md · 2 ★

Convert an X (Twitter) URL into clean markdown — tweets, photos, videos, long-form X Articles, and top…
