What's the best tool to build pipelines from REST APIs?

memphis

Mentions: 52 | Stars: 3,171 | Activity: 9.9 | Language: Go

Memphis.dev is a highly scalable and effortless data streaming platform

Great recommendations by the rest of the members here. I would love to learn more about your use case if possible, as we are adding native REST, WebSocket, and gRPC support to our message broker (Memphis). Let's chat if possible; I would love to work on this together.

xidel

Mentions: 18 | Stars: 653 | Activity: 5.6 | Language: Pascal

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

Xidel for extraction and pagination

gnu-parallel

Mentions: 23 | Stars: 25 | Activity: 10.0 | Language: Perl

A clone of GNU Parallel (git://git.savannah.gnu.org/parallel.git)

GNU Parallel for parallelism, retry, and resumption

jq

Mentions: 306 | Stars: 25,063 | Activity: 0.0 | Language: C

Discontinued. Command-line JSON processor [Moved to: https://github.com/jqlang/jq] (by stedolan)

jq for JSON processing
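The division of labor suggested above (extraction and pagination, then parallelism and retry, then JSON processing) can be illustrated with a rough Python analogue. This is not how Xidel, GNU Parallel, or jq work internally; it is only a standard-library sketch of the same three roles. The `?page=N` query parameter, the `items` key, and every function name here are assumptions made for illustration.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_json(url, timeout=10):
    """Download one URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def with_retry(func, attempts=3, delay=0.5):
    """Wrap func so it is retried on any exception, waiting longer each time."""
    def wrapped(*args, **kwargs):
        for attempt in range(1, attempts + 1):
            try:
                return func(*args, **kwargs)
            except Exception:
                if attempt == attempts:
                    raise  # out of attempts: surface the last error
                time.sleep(delay * attempt)
    return wrapped


def fetch_pages(base_url, pages, fetch=fetch_json, workers=4, retry_delay=0.5):
    """Fetch the given page numbers in parallel and flatten their records."""
    # Pagination: one URL per page (the query parameter name is hypothetical).
    urls = [f"{base_url}?page={n}" for n in pages]
    reliable_fetch = with_retry(fetch, delay=retry_delay)
    # Parallelism: pool.map returns results in the same order as the URLs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(reliable_fetch, urls))
    # JSON processing: keep only the records under the assumed "items" key.
    return [item for page in results for item in page.get("items", [])]
```

Passing the `fetch` function as a parameter keeps the sketch easy to try against a stub before pointing it at a real API.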

aws_lambda_reddit_api

Mentions: 2 | Stars: 0 | Activity: 4.2 | Language: Python

Batch data processing project using data from the reddit api.

I have a small project like this that I did before, which I am going to shamelessly plug, lol: https://github.com/PanzerFlow/aws_lambda_reddit_api

Mage

Mentions: 77 | Stars: 7,202 | Activity: 9.9 | Language: Python

🧙 The modern replacement for Airflow. Mage is an open-source data pipeline tool for transforming and integrating data. https://github.com/mage-ai/mage-ai

AWS: deploy using maintained Terraform scripts.

CoinCap-firehose-s3-DynamicPartitioning

Mentions: 1 | Stars: 0 | Activity: 10.0 | Language: TypeScript

AWS CDK project using typescript. Services: Lambda, Kinesis Firehose, Glue, Quicksight.

I agree with the cron-triggered Lambda approach. For inspiration, I have a small project where a Lambda pulls data from a public API and writes it to a Firehose, which buffers the data and writes it to S3. There is also a cron job on Glue which catalogues the data. https://github.com/TrygviZL/CoinCap-firehose-s3-DynamicPartitioning
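The Lambda-to-Firehose step described in that comment might look roughly like the following. It is a minimal sketch and is not taken from the linked repository (which uses TypeScript and the AWS CDK). The stream name, the API URL, and the assumption that the API returns a JSON list are all hypothetical. The only AWS call used is the boto3 Firehose client's `put_record`.

```python
import json
import urllib.request

STREAM_NAME = "example-delivery-stream"     # hypothetical Firehose stream name
API_URL = "https://api.example.com/prices"  # hypothetical public API endpoint


def fetch_records(url=API_URL):
    """Pull one batch from the API; assumes the body is a JSON list of objects."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def handler(event, context, firehose=None, fetch=fetch_records):
    """Entry point for a scheduled (cron-triggered) Lambda invocation."""
    if firehose is None:
        import boto3  # bundled with the AWS Lambda Python runtime
        firehose = boto3.client("firehose")
    records = fetch()
    for record in records:
        # Newline-delimited JSON keeps the objects Firehose buffers separable.
        firehose.put_record(
            DeliveryStreamName=STREAM_NAME,
            Record={"Data": (json.dumps(record) + "\n").encode("utf-8")},
        )
    return {"written": len(records)}
```

The optional `firehose` and `fetch` arguments exist only so the handler can be exercised locally with stand-ins; Lambda itself passes just `event` and `context`.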

astro-sdk

Mentions: 7 | Stars: 324 | Activity: 8.5 | Language: Python

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

I have an example here using COVID data. Basically, you just write a Python function that reads the API and returns a dataframe (or any number of dataframes), and downstream tasks can then read the output as either a dataframe or a SQL table.

Rudderstack

Mentions: 83 | Stars: 3,964 | Activity: 9.8 | Language: Go

Privacy and Security focused Segment-alternative, in Golang and React

RudderStack is an open-source tool to build data pipelines with high-availability and high-precision event ordering. It is suitable for your use case as

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Show HN: Excel to Python Compiler
3 projects | news.ycombinator.com | 23 May 2024

What codegen is (actually) good for
2 projects | news.ycombinator.com | 28 Sep 2023

Pandas AI – The Future of Data Analysis
7 projects | news.ycombinator.com | 17 May 2023

Resources for beginners in data
1 project | /r/DadosBrasil | 17 Mar 2023

Brazil data sub.
1 project | /r/brdev | 17 Mar 2023

This page summarizes the projects mentioned and recommended in the original post on /r/dataengineering.
Tags: Data, Python, Data Science, Xquery, customer-data-platform
Post date: 8 Oct 2022
