Top 23 web-scraper Open-Source Projects
-
Monkey-DL (Anime Downloader)
Bulk download your favourite anime episodes from your favourite anime websites
-
spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use. (by postmodern)
-
web-scraping
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
-
google-maps-scraper
Scrapes data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, number of reviews, latitude and longitude, reviews, email and more for each place (by gosom)
-
summarizer
A Reddit bot that summarizes news articles written in Spanish or English. It uses a custom built algorithm to rank words and sentences.
-
facebook_page_scraper
Scrapes the front end of Facebook pages with no limitations and can turn the data into structured JSON or CSV
-
CobWeb-lnx
CobWeb is a Python library for web scraping. The library consists of two classes: Spider and Scraper.
-
tagalog-dictionary-scraper
Builds a Tagalog dictionary by collecting Tagalog words from tagalog.pinoydictionary.com
Work on a personal project. There's a list of 100 sample projects at https://github.com/arpit-omprakash/100ProjectsOfCode
Use lightnovel-crawler in a terminal on a computer, or through its Discord bot, to find series across multiple light novel / web novel sites, then choose the format to download (EPUB, PDF, TXT, and many more).
It's been a cool learning experience making a Product Hunt listing, a small demo video, and all the social posts (Twitter, LinkedIn, etc.).
Project mention: AI Report #4: AutoGPT And Open-source lags behind Part 2 | news.ycombinator.com | 2023-06-15
> The google search function is also limited. For comparison, SerpAPI masterfully scrapes Google Search using a proxy network and very intelligent parsing. In experiments using SerpAPI in combination with Microsoft’s guidance module, I got much farther than AutoGPT.
Thanks for your kind words. We are working on SerpApi integration for Auto-GPT: https://github.com/serpapi/public-roadmap/issues/905
Project mention: [OpenSource] I am building high performance Plex alternative in Go for Movies and TV Show | /r/golang | 2023-06-02
I also built a similar tool; it lets you choose and play movies. I used WebTorrent behind the scenes. https://github.com/qascade/yast
web-scraper related posts
-
Show HN: A Google Maps Scraper
-
Google Maps Scraper in Golang
-
I'm trying and failing to compile someone else's project to wasm.
-
Help with Paperback IOS.
-
Fired from an internship after 2 weeks
-
Need help thinking of a personal project
-
Multiparadigmatic Web Scraping Tool!
Index
What are some of the best open-source web-scraper projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | awesome-crawler | 6,167 |
2 | 100ProjectsOfCode | 2,965 |
3 | soup | 2,133 |
4 | lightnovel-crawler | 1,304 |
5 | stealth | 997 |
6 | Monkey-DL (Anime Downloader) | 810 |
7 | spidr | 793 |
8 | web-scraping | 678 |
9 | google-maps-scraper | 649 |
10 | PHP Scraper | 498 |
11 | basketball_reference_web_scraper | 411 |
12 | crawler | 300 |
13 | summarizer | 267 |
14 | awesome-web-scraper | 240 |
15 | facebook_page_scraper | 200 |
16 | cascadia | 134 |
17 | Senpwai | 132 |
18 | get-sauce | 113 |
19 | public-roadmap | 42 |
20 | CobWeb-lnx | 38 |
21 | yast | 29 |
22 | reddit-bots | 23 |
23 | tagalog-dictionary-scraper | 23 |