Top 9 JavaScript Readability Projects

percollate

Mentions: 14 | Stars: 4,143 | Activity: 5.7 | Language: JavaScript

A command-line tool to turn web pages into readable PDF, EPUB, HTML, or Markdown docs.

Project mention: The Case Against AI Everything, Everywhere, All at Once | news.ycombinator.com | 2023-10-19

You can still choose automation. The easier route for me is to use wallabag to save the article. Then on my remarkable tablet I can grab a very readable document with https://github.com/koreader/koreader.
The other option is to use https://github.com/danburzo/percollate to convert a webpage to a nice document directly. I use both tools depending on my needs.

article-extractor

Mentions: 3 | Stars: 1,423 | Activity: 7.1 | Language: JavaScript

Extract the main article from a given URL with Node.js.

Project mention: ScrapeGraphAI: Web scraping using LLM and direct graph logic | news.ycombinator.com | 2024-05-07

Agreed!
Apify's Website Content Crawler[0] does a decent job of this for most websites in my experience. It allows you to "extract" content via different built-in methods (e.g. Extractus [1]).
We currently use this at Magic Loops[2] and it works _most_ of the time.
The long-tail is difficult though, and it's not uncommon for users to back out to raw HTML, and then have our tool write some custom logic to parse the content they want from the scraped results (fun fact: before GPT-4 Turbo, the HTML page was often too large for the context window... and sometimes it still is!).
Would love a dedicated tool for this. I know the folks at Reworkd[3] are working on something similar, but not sure how much is public yet.
[0] https://apify.com/apify/website-content-crawler
[1] https://github.com/extractus/article-extractor
[2] https://magicloops.dev/
[3] https://reworkd.ai/

Just-Read

Mentions: 5 | Stars: 1,179 | Activity: 7.3 | Language: JavaScript

A customizable read mode web extension.
apca-w3

Mentions: 1 | Stars: 141 | Activity: 5.1 | Language: JavaScript

The APCA version, to be licensed for use with guidelines: W3/AGWG.
stutter

Mentions: 1 | Stars: 131 | Activity: 0.0 | Language: JavaScript

RSVP (rapid serial visual presentation) for browsers (by jamestomasino)
retext-readability

Mentions: 1 | Stars: 89 | Activity: 6.3 | Language: JavaScript

A retext plugin to check readability.
readability-extractor

Mentions: 0 | Stars: 33 | Activity: 5.3 | Language: JavaScript

JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.
line-length

Mentions: 1 | Stars: 4 | Activity: 6.7 | Language: JavaScript

Measure lengths of text on a page
validate-access

Mentions: 3 | Stars: 3 | Activity: 0.0 | Language: JavaScript

Parse and validate a given directory with multiple entries.

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

JavaScript Readability related posts

How do Instapaper and Pocket apps extract the content of the articles?
1 project | /r/opensource | 4 Dec 2023

Share my down(load) function!
1 project | /r/commandline | 22 May 2023

Reverse Engineering or Recreating the Chrome Extension?
1 project | /r/RemarkableTablet | 21 Jan 2023

How do I enabled right click menu and developer console on a site that disabled it?
1 project | /r/uBlockOrigin | 13 Oct 2022

software or browser extension to reformat text?
1 project | /r/TBI | 11 Oct 2022

Reading web articles on the reMarkable
1 project | /r/RemarkableTablet | 30 Aug 2022

Pa. commission proposes adding and increasing fees, axing gas tax to fund transportation needs.
1 project | /r/pittsburgh | 14 Jul 2022

Index

What are some of the best open-source Readability projects in JavaScript? This list will help you:

#	Project	Stars
1	percollate	4,143
2	article-extractor	1,423
3	Just-Read	1,179
4	apca-w3	141
5	stutter	131
6	retext-readability	89
7	readability-extractor	33
8	line-length	4
9	validate-access	3