Top 23 JavaScript Scraper Projects
-
browser-fingerprinting
Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️♂️ when scraping the web?
-
freeDictionaryAPI
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.
-
website-scraper-puppeteer
Plugin for website-scraper which returns html for dynamic websites using puppeteer
-
fredy
❤️ Fredy - [F]ind [R]eal [E]states [D]amn Eas[y] - Fredy will constantly search for new listings on sites like Immoscout or Immowelt and send new results to you, so that you can focus on more important things in life ;)
-
amazon_scraper
Amazon product scraper using rotating proxies and headless Chrome from ScrapingAnt
-
instagram-without-api-node
Simple Node.js code to fetch unlimited public Instagram pictures from any user, without the API and without credentials.
-
html_tag_annotator
A Machine Learning tool to create the training dataset very quickly & easily by using a smart chrome extension
-
nba-topshop-scraper
Node script that uses Selenium to scrape card information from NBA Top Shot, including card names, rarity, and the current lowest price. Data is scraped once per day.
-
reddit-in-valve-games
Desktop app that automates fetching text-only posts from Reddit and binding them to keyboard keys in a Valve game so they can be shared in the chat. Made with ElectronJS & Ruby.
ytdl-core for streaming from Youtube
Project mention: A site that tracks the price of a Big Mac in every US McDonald's | news.ycombinator.com | 2024-01-13
Yes, there is a lot written about it. Here is one link I have saved:
https://github.com/niespodd/browser-fingerprinting
Project mention: ScrapeGraphAI: Web scraping using LLM and direct graph logic | news.ycombinator.com | 2024-05-07
Agreed!
Apify's Website Content Crawler[0] does a decent job of this for most websites in my experience. It allows you to "extract" content via different built-in methods (e.g. Extractus [1]).
We currently use this at Magic Loops[2] and it works _most_ of the time.
The long-tail is difficult though, and it's not uncommon for users to back out to raw HTML, and then have our tool write some custom logic to parse the content they want from the scraped results (fun fact: before GPT-4 Turbo, the HTML page was often too large for the context window... and sometimes it still is!).
Would love a dedicated tool for this. I know the folks at Reworkd[3] are working on something similar, but not sure how much is public yet.
[0] https://apify.com/apify/website-content-crawler
[1] https://github.com/extractus/article-extractor
[2] https://magicloops.dev/
[3] https://reworkd.ai/
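The fallback described in the comment above (use extracted content when available, otherwise back out to raw HTML that may be too large for a model's context window) can be illustrated roughly as follows. This is only a minimal sketch, not how Apify, Magic Loops, or article-extractor actually work: `shrinkHtml`, `pickContent`, and the `maxChars` budget are hypothetical names and values chosen for illustration, and the regex-based stripping is a crude heuristic rather than a real HTML parser.

```javascript
// Hypothetical helper: shrink raw HTML so it is more likely to fit a
// context window. Regex stripping is a rough heuristic, not a parser.
function shrinkHtml(html, maxChars = 20000) {
  const stripped = html
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, "")
    // Drop elements that rarely carry readable content.
    .replace(/<(script|style|noscript|svg)\b[\s\S]*?<\/\1\s*>/gi, "")
    // Collapse runs of whitespace into single spaces.
    .replace(/\s+/g, " ")
    .trim();
  // Hard cut at the character budget (an assumed stand-in for a token limit).
  return stripped.length > maxChars ? stripped.slice(0, maxChars) : stripped;
}

// Hypothetical fallback chain: prefer text from an extractor, otherwise
// fall back to the shrunken raw HTML for custom parsing downstream.
function pickContent(extractedText, rawHtml, maxChars = 20000) {
  if (typeof extractedText === "string" && extractedText.trim().length > 0) {
    return { source: "extractor", content: extractedText.trim() };
  }
  return { source: "raw-html", content: shrinkHtml(rawHtml, maxChars) };
}
```

A character budget is used here only because it needs no dependencies; a real pipeline would more likely count tokens with the tokenizer of whichever model it targets.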
Templater scraper scripts
JavaScript Scraper related posts
-
Plug-in for formatting saved websites directly into Obsidian
-
Simple Youtube Downloader in under 50 Javascript lines
-
Nextjs ytdl-core youtube downloader
-
Built a website to help you find... pocket knives!
-
Feedback on new game
-
How can I get all the words in this dictionary api?
-
Index
What are some of the best open-source Scraper projects in JavaScript? This list will help you:
# | Project | Stars |
---|---|---|
1 | node-ytdl-core | 4,337 |
2 | scrape-it | 3,988 |
3 | browser-fingerprinting | 3,938 |
4 | freeDictionaryAPI | 2,427 |
5 | google-play-scraper | 2,244 |
6 | node-website-scraper | 1,519 |
7 | article-extractor | 1,426 |
8 | website-scraper-puppeteer | 307 |
9 | fredy | 206 |
10 | obsidian-scrapers | 125 |
11 | chorus | 114 |
12 | amazon_scraper | 76 |
13 | instagram-without-api-node | 64 |
14 | easy-reddit-downloader | 58 |
15 | itchio-godot-scraper | 28 |
16 | trawler | 22 |
17 | vlrgg-api | 20 |
18 | XboxStoreAPI | 13 |
19 | html_tag_annotator | 12 |
20 | nba-topshop-scraper | 11 |
21 | awscraper | 8 |
22 | reddit-in-valve-games | 7 |
23 | tumblweed | 6 |