web-scraper

Open-source projects categorized as web-scraper

Top 23 web-scraper Open-Source Projects

  • awesome-crawler

    A collection of awesome web crawlers and spiders in different languages

  • 100ProjectsOfCode

    A list of practical knowledge-building projects.

  • Project mention: Fired from an internship after 2 weeks | /r/cscareerquestions | 2023-06-02

    Work on a personal project. There's a list of 100 sample projects at https://github.com/arpit-omprakash/100ProjectsOfCode

  • soup

    Web Scraper in Go, similar to BeautifulSoup

  • lightnovel-crawler

    Generate and download e-books from online sources.

  • Project mention: Help with Paperback IOS. | /r/mangapiracy | 2023-06-18

    Use Lightnovel Crawler on a computer in a terminal, or through their Discord bot, to find series across multiple LN / webnovel sites, then choose the format to download (epub, pdf, txt, and many more)

  • stealth

    Stealth - Secure, Peer-to-Peer, Private and Automatable Web Browser/Scraper/Proxy

  • Monkey-DL (Anime Downloader)

    Bulk download your favourite anime episodes from your favourite anime websites

  • spidr

    A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use. (by postmodern)

  • web-scraping

    Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist

  • Project mention: web-scraping: NEW Data - star count:554.0 | /r/algoprojects | 2023-09-25
  • google-maps-scraper

    Scrape data from Google Maps. Extracts data such as the name, address, phone number, website URL, rating, review count, latitude and longitude, reviews, email, and more for each place (by gosom)

  • Project mention: Show HN: A Google Maps Scraper | news.ycombinator.com | 2023-12-03
  • PHP Scraper

    A universal web-util for PHP.

  • basketball_reference_web_scraper

    NBA Stats API via Basketball Reference

  • crawler

    Library for Rapid (Web) Crawler and Scraper Development (by crwlrsoft)

  • summarizer

    A Reddit bot that summarizes news articles written in Spanish or English. It uses a custom built algorithm to rank words and sentences.

  • awesome-web-scraper

    A collection of awesome web scrapers and crawlers.

  • facebook_page_scraper

    Scrapes the front end of Facebook pages with no limitations and provides a feature to turn the data into structured JSON or CSV

  • cascadia

    Command-line CSS selector built on the Go cascadia package

  • Senpwai

    A desktop app for tracking and batch downloading anime

  • Project mention: Building W-9 Crafter | dev.to | 2024-03-28

    It's been a cool learning experience making a Product Hunt listing, a small demo video, and allll the social posts (Twitter, LinkedIn, etc).

  • get-sauce

    A command line program to download Hentai videos and images from multiple websites

  • public-roadmap

    Public roadmap for SerpApi, LLC (https://serpapi.com) (by serpapi)

  • Project mention: AI Report #4: AutoGPT And Open-source lags behind Part 2 | news.ycombinator.com | 2023-06-15

    > The google search function is also limited. For comparison, SerpAPI masterfully scrapes Google Search using a proxy network and very intelligent parsing. In experiments using SerpAPI in combination with Microsoft’s guidance module, I got much farther than AutoGPT.

    Thanks for your kind words. We are working on SerpApi integration for Auto-GPT: https://github.com/serpapi/public-roadmap/issues/905

  • CobWeb-lnx

    CobWeb is a Python library for web scraping. The library consists of two classes: Spider and Scraper.

  • Project mention: Who has contributed to and who has used open-source projects? | /r/devpt | 2023-06-30
  • yast

    Yet Another Streaming Tool

  • Project mention: [OpenSource] I am building high performance Plex alternative in Go for Movies and TV Show | /r/golang | 2023-06-02

    I also built a similar tool; it lets you choose and play movies. I used webtorrent behind the scenes. https://github.com/qascade/yast

  • reddit-bots

    A collection of Reddit bots that I use to enhance the subreddits I manage.

  • tagalog-dictionary-scraper

    Builds a Tagalog dictionary by collecting Tagalog words from tagalog.pinoydictionary.com

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source web-scraper projects? This list will help you:

Rank Project Stars
1 awesome-crawler 6,167
2 100ProjectsOfCode 2,965
3 soup 2,133
4 lightnovel-crawler 1,304
5 stealth 997
6 Monkey-DL (Anime Downloader) 810
7 spidr 793
8 web-scraping 678
9 google-maps-scraper 649
10 PHP Scraper 498
11 basketball_reference_web_scraper 411
12 crawler 300
13 summarizer 267
14 awesome-web-scraper 240
15 facebook_page_scraper 200
16 cascadia 134
17 Senpwai 132
18 get-sauce 113
19 public-roadmap 42
20 CobWeb-lnx 38
21 yast 29
22 reddit-bots 23
23 tagalog-dictionary-scraper 23
