Top 9 Java ETL Projects
- kestra: Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
- Smooks: Extensible data integration Java framework for building XML and non-XML fragment-based applications.
- ReplicaDB: An open-source tool for database replication, designed for efficiently transferring bulk data between relational and non-relational databases.
- kafka-connect-file-pulse: 🔗 A multipurpose Kafka Connect connector that makes it easy to parse, transform and stream any file, in any format, into Apache Kafka.
- contube: A scalable data connector framework that facilitates efficient data transfer between diverse systems.
Project mention: SQL Convertor for Easy Migration from Presto, Trino, ClickHouse, and Hive to Apache Doris | dev.to | 2024-05-27
Apache Doris is an all-in-one data platform capable of real-time reporting, ad-hoc queries, data lakehousing, log management and analysis, and batch data processing. As more companies replace their component-heavy data architectures with Apache Doris, there is a growing need for a more convenient data migration solution. That is why the Doris SQL Convertor was created.
Kestra's communication is asynchronous and based on a queuing mechanism. It leverages the Micronaut framework and offers two runners: one that uses a database (JDBC) for both the message queue and resource storage, and another that uses Kafka as the message queue and Elasticsearch as the resource storage. The platform is fully extensible and plugin-based, providing a rich set of plugins for various workflow tasks, triggers, and data storage options. For those interested, the GitHub repository is available here: https://github.com/kestra-io/kestra
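The queue-based, swappable-backend idea described above can be illustrated with a minimal sketch. This is not Kestra's actual API: the names `MessageQueue`, `InMemoryQueue`, and `runTasks` are hypothetical, and an in-memory queue stands in for the JDBC or Kafka backends purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueueRunnerSketch {

    /** Hypothetical queue abstraction; a JDBC- or Kafka-backed class could implement it. */
    public interface MessageQueue {
        void emit(String message);
        /** Returns the next message, or null if none arrives within the timeout. */
        String poll(long timeoutMillis);
    }

    /** In-memory stand-in for a real backend, used only in this illustration. */
    public static class InMemoryQueue implements MessageQueue {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        @Override
        public void emit(String message) {
            queue.add(message); // unbounded queue, so add() does not block
        }

        @Override
        public String poll(long timeoutMillis) {
            try {
                return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }

    /** Emits task names to the queue while a worker thread consumes them asynchronously. */
    public static List<String> runTasks(MessageQueue tasks, List<String> taskNames) {
        MessageQueue results = new InMemoryQueue();
        Thread worker = new Thread(() -> {
            for (int i = 0; i < taskNames.size(); i++) {
                String task = tasks.poll(1000);
                if (task == null) {
                    break; // nothing arrived in time; stop waiting
                }
                results.emit(task + ":done");
            }
        });
        worker.start();
        for (String name : taskNames) {
            tasks.emit(name); // the producer never calls the worker directly
        }
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        List<String> completed = new ArrayList<>();
        String result;
        while ((result = results.poll(10)) != null) {
            completed.add(result);
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(runTasks(new InMemoryQueue(), List.of("extract", "transform", "load")));
    }
}
```

Because the producer and the worker only share the `MessageQueue` interface, replacing the in-memory class with a database-backed or broker-backed one would not change `runTasks`, which is the property the two-runner design relies on.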
Project mention: Kafka Connect Filepulse 2.13.0 is now available! This version includes support for SFTP and Alibaba OSS. It also contains many bug fixes and improvements. 🚀 | /r/apachekafka | 2023-09-15
Project mention: Show HN: ConTube – A Scalable Data Connect Framework for Pulsar/Kafka Ecosystems | news.ycombinator.com | 2023-12-04
Java ETL related posts
- Kafka Connect Filepulse 2.13.0 is now available! This version includes support for SFTP and Alibaba OSS. It also contains many bug fixes and improvements. 🚀
- Best ‘E’TL tools for extracting data from on-prem SQL databases
- Maven unable to resolve a dependency given in pom.xml. I've instead tried manually downloading installing the jar, but now maven cannot find the package.
- Download json and csv file from github repository with apache kafka
- Streaming data into Kafka S01/E04 — Loading Log files using Grok Expression
Index
What are some of the best open-source ETL projects in Java? This list will help you:
# | Project | Stars |
---|---|---|
1 | doris | 11,547 |
2 | kestra | 6,803 |
3 | zingg | 895 |
4 | Smooks | 386 |
5 | ReplicaDB | 371 |
6 | kafka-connect-file-pulse | 308 |
7 | neo4j-jdbc | 127 |
8 | contube | 10 |
9 | dcc-import | 1 |