Show HN: Newser, a utility written in Go to generate a PDF with news content

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • newser

    Newser is a simple utility to generate a PDF with your favorite news articles

  • I got myself a Supernote A5X (awesome device, btw), and since it doesn't have a web browser I wanted a way to read news on it. I hacked this utility together in a couple of days, and it works wonders for me, so I thought it might interest others. It can also serve as a noise-free newspaper generator, since it strips images, ads, links, and other clutter.

    https://github.com/lnenad/newser

    (there is a screenshot of the first page of the generated pdf)

    It scrapes news websites for content and puts it into a PDF. I point the output at the Supernote directory in my Dropbox and run it daily, so there is a fresh PDF of news whenever I want it.

    It's probably rough around the edges (so far I've added crawl support for The Verge, Ars Technica, and Engadget), but I think it's a good base, so if anyone wants to contribute, feel free. Things I'd like to add: pictures (maybe), and parsing the article HTML to preserve font styling and similar formatting.

    I've tried to generalize it as much as possible, so the crawling is pretty much automatic and is controlled by a config file where you define "rules" for how to parse each website.
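
As a rough illustration of what a rule-driven config could look like, here is a minimal Go sketch. It is not newser's actual schema: the `SiteRule` and `Config` types, the JSON field names, the selector strings, and the sample values are all hypothetical. It only shows the general idea of loading per-site rules from a config file and picking the rule that applies to a given URL; applying the selectors to a page would need an HTML parsing library and is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// SiteRule is a hypothetical rule shape (NOT newser's real config format).
// Each rule says which host it covers and which selectors find the content.
type SiteRule struct {
	Name          string `json:"name"`
	Host          string `json:"host"`
	LinkSelector  string `json:"link_selector"`  // where article links live
	TitleSelector string `json:"title_selector"` // article headline
	BodySelector  string `json:"body_selector"`  // article text
}

// Config is the hypothetical top-level config: an output path plus site rules.
type Config struct {
	Output string     `json:"output"`
	Sites  []SiteRule `json:"sites"`
}

// ParseConfig decodes a JSON config document into a Config.
func ParseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	return cfg, nil
}

// RuleFor returns the first rule whose host equals the URL's host
// or is a parent domain of it (so "www.example.com" matches "example.com").
func (c Config) RuleFor(rawURL string) (SiteRule, bool) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return SiteRule{}, false
	}
	host := strings.ToLower(u.Hostname())
	for _, r := range c.Sites {
		if host == r.Host || strings.HasSuffix(host, "."+r.Host) {
			return r, true
		}
	}
	return SiteRule{}, false
}

// sampleConfig holds made-up values purely for illustration.
const sampleConfig = `{
  "output": "news.pdf",
  "sites": [
    {
      "name": "verge",
      "host": "theverge.com",
      "link_selector": "h2 a",
      "title_selector": "h1",
      "body_selector": "article p"
    }
  ]
}`

func main() {
	cfg, err := ParseConfig([]byte(sampleConfig))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if rule, ok := cfg.RuleFor("https://www.theverge.com/tech/some-article"); ok {
		fmt.Printf("using rule %q, body selector %q\n", rule.Name, rule.BodySelector)
	}
}
```

Keeping the rules as data means that supporting another site is a config change rather than a code change, which is also the approach taken by selector repositories such as ftr-site-config.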

  • ftr-site-config

    Site-specific article extraction rules to aid content extractors, feed readers, and 'read later' applications.

  • This is great!

    If it's useful, I work on a project where we maintain a repository of XPath selectors for extracting article content from many different sites: https://github.com/fivefilters/ftr-site-config - they're based on the original public Instapaper rules.

    We also have PDF generation, but it's not really for crawling, and wasn't created for reading on a device like the Supernote, more for printing and reading: https://pdf.fivefilters.org/simple-print/

  • rss2kindle

    Convert RSS feed to a PDF for reading on Kindle

  • I had a similar setup for creating PDF files from RSS feeds (https://github.com/adityam/rss2kindle). I simply downloaded each webpage, used pandoc to convert the HTML to ConTeXt, and typeset it with ConTeXt (this gave me a lot of control over the formatting and took care of including external images as well). A separate script emailed the PDF to my Kindle address.

    The script worked reliably for multiple years until I stopped using the Kindle. I now have a Supernote A6X, and both pandoc and ConTeXt have improved significantly in the last decade, so I should give this another shot.
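
The pandoc and ConTeXt pipeline described in the comment above can be sketched in Go roughly as follows. This is not the rss2kindle script itself: the function names `PipelineCommands` and `Run` and the file name `article.html` are made up for illustration, and the sketch assumes `pandoc` and `context` are installed and on the PATH. The download and email steps are omitted.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// PipelineCommands returns the two command lines needed to turn a saved
// HTML page into a PDF: pandoc converts HTML to a standalone ConTeXt
// source file, then the context runner typesets that file.
// (Hypothetical helper; the name and shape are illustrative only.)
func PipelineCommands(htmlPath string) [][]string {
	base := strings.TrimSuffix(htmlPath, filepath.Ext(htmlPath))
	tex := base + ".tex"
	return [][]string{
		{"pandoc", "--from=html", "--to=context", "--standalone", "--output=" + tex, htmlPath},
		{"context", tex},
	}
}

// Run executes the commands in order and stops at the first failure.
// Tool output is forwarded to the terminal so errors stay visible.
func Run(cmds [][]string) error {
	for _, argv := range cmds {
		cmd := exec.Command(argv[0], argv[1:]...)
		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("%s: %w", argv[0], err)
		}
	}
	return nil
}

func main() {
	// Print the commands instead of running them, so the sketch works
	// even on a machine without pandoc or ConTeXt installed.
	for _, argv := range PipelineCommands("article.html") {
		fmt.Println(strings.Join(argv, " "))
	}
}
```

To actually produce a PDF, call `Run(PipelineCommands("article.html"))` after saving the page to disk.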

