Top 9 JavaScript Readability Projects
-
readability-extractor
JavaScript/Node wrapper around Mozilla's Readability library so that ArchiveBox can call it as a one-shot CLI command to extract each page's article text.
Project mention: The Case Against AI Everything, Everywhere, All at Once | news.ycombinator.com | 2023-10-19
You can still choose automation. The easier route for me is to use wallabag to save the article. Then on my reMarkable tablet I can grab a very readable document with https://github.com/koreader/koreader.
The other option is to use https://github.com/danburzo/percollate to convert a webpage to a nice document directly. I use both tools depending on my needs.
Project mention: ScrapeGraphAI: Web scraping using LLM and direct graph logic | news.ycombinator.com | 2024-05-07
Agreed!
Apify's Website Content Crawler[0] does a decent job of this for most websites in my experience. It allows you to "extract" content via different built-in methods (e.g. Extractus [1]).
We currently use this at Magic Loops[2] and it works _most_ of the time.
The long-tail is difficult though, and it's not uncommon for users to back out to raw HTML, and then have our tool write some custom logic to parse the content they want from the scraped results (fun fact: before GPT-4 Turbo, the HTML page was often too large for the context window... and sometimes it still is!).
Would love a dedicated tool for this. I know the folks at Reworkd[3] are working on something similar, but not sure how much is public yet.
[0] https://apify.com/apify/website-content-crawler
[1] https://github.com/extractus/article-extractor
[2] https://magicloops.dev/
[3] https://reworkd.ai/
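The fallback described in the comment above (dropping back to raw HTML, which may be too large for a model's context window) can be illustrated with a small sketch. This is not how Magic Loops, Apify, or Extractus work internally; the function names, the regex-based stripping, and the four-characters-per-token estimate are all assumptions made for illustration.

```javascript
// Naive sketch: shrink raw HTML to plain text and check it against a size budget.
// stripHtml, estimateTokens and fitToBudget are hypothetical helper names.
// Regex stripping is a rough approximation; it does not decode HTML entities
// and can be fooled by unusual markup.

function stripHtml(html) {
  return html
    // Drop script/style/noscript blocks together with their contents.
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Drop HTML comments.
    .replace(/<!--[\s\S]*?-->/g, " ")
    // Drop the remaining tags but keep the text between them.
    .replace(/<[^>]+>/g, " ")
    // Collapse runs of whitespace into single spaces.
    .replace(/\s+/g, " ")
    .trim();
}

function estimateTokens(text) {
  // Assumption: roughly 4 characters per token. Real tokenizers differ.
  return Math.ceil(text.length / 4);
}

function fitToBudget(html, maxTokens) {
  const text = stripHtml(html);
  if (estimateTokens(text) <= maxTokens) {
    return { text, truncated: false };
  }
  // Over budget: keep only the leading part that fits the estimate.
  return { text: text.slice(0, maxTokens * 4), truncated: true };
}
```

Calling `fitToBudget(html, 2000)` returns the stripped text plus a `truncated` flag, so the caller can decide whether to switch to site-specific parsing logic. A real HTML parser would be more robust than regular expressions for anything beyond a quick experiment.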
JavaScript Readability related posts
- How do Instapaper and Pocket apps extract the content of the articles?
- Share my down(load) function!
- Reverse Engineering or Recreating the Chrome Extension?
- How do I enabled right click menu and developer console on a site that disabled it?
- software or browser extension to reformat text?
- Reading web articles on the reMarkable
- Pa. commission proposes adding and increasing fees, axing gas tax to fund transportation needs.
A note from our sponsor - SurveyJS
surveyjs.io | 29 May 2024
Index
What are some of the best open-source Readability projects in JavaScript? This list will help you:
# | Project | Stars
---|---|---
1 | percollate | 4,143
2 | article-extractor | 1,423
3 | Just-Read | 1,179
4 | apca-w3 | 141
5 | stutter | 131
6 | retext-readability | 89
7 | readability-extractor | 33
8 | line-length | 4
9 | validate-access | 3