There are a billion things to consider when building a decent web crawler, especially when it comes to interacting with pages on the modern web. For example, a lot of content is loaded dynamically by the browser nowadays and won't show up if you make a simple HTTP request. Open your browser's devtools and watch the network tab while a page loads, and you'll see it makes loads of auxiliary requests. Some content is also only loaded after you interact with it (e.g. hover, click). For that reason I'd recommend using something like chromedp and doing browser-based crawling, even if it's much slower.