Java data-integration

Open-source Java projects categorized as data-integration

Top 4 Java data-integration Projects

  • seatunnel

    SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

  • Project mention: SeaTunnel – super high-performance, distributed data integration tool | news.ycombinator.com | 2024-04-28
  • kestra

    Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.

  • Project mention: A High-Performance, Java-Based Orchestration Platform | /r/java | 2023-10-11

    Kestra's communication is asynchronous and based on a queuing mechanism. It leverages the Micronaut framework and offers two runners: one that uses a database (JDBC) for both the message queue and resource storage, and another that uses Kafka as the message queue and Elasticsearch as the resource storage. The platform is fully extensible and plugin-based, providing a rich set of plugins for various workflow tasks, triggers, and data storage options. For those interested, the GitHub repository is available here: https://github.com/kestra-io/kestra

  • hudi

    Upserts, Deletes And Incremental Processing on Big Data.

  • Project mention: Getting Started with Flink SQL, Apache Iceberg and DynamoDB Catalog | dev.to | 2023-12-18

    Apache Iceberg is one of the three types of lakehouse; the other two are Apache Hudi and Delta Lake.

  • bitsail

    BitSail is a distributed, high-performance data integration engine that supports batch, streaming, and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.
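
The kestra mention above describes asynchronous communication built on a queuing mechanism. A minimal Java sketch of that general pattern follows. It is not Kestra's actual implementation or API: the names `QueueRunnerSketch`, `TaskMessage`, and `runOnce` are hypothetical, and an in-memory `LinkedBlockingQueue` stands in for the JDBC or Kafka queue mentioned in the post.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of queue-based asynchronous communication.
// Not Kestra's API: an in-memory queue stands in for JDBC or Kafka.
class QueueRunnerSketch {

    // Hypothetical message: which task of which flow should run.
    static final class TaskMessage {
        final String flowId;
        final String taskId;

        TaskMessage(String flowId, String taskId) {
            this.flowId = flowId;
            this.taskId = taskId;
        }
    }

    // Sentinel that tells the worker to stop (compared by identity).
    private static final TaskMessage STOP = new TaskMessage("", "");

    // Submits messages to a queue and lets a separate worker thread consume them.
    static List<String> runOnce(List<TaskMessage> submitted) {
        BlockingQueue<TaskMessage> queue = new LinkedBlockingQueue<>();
        List<String> results = Collections.synchronizedList(new ArrayList<>());

        // The worker never talks to the producer directly; it only reads the queue.
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    TaskMessage message = queue.take(); // blocks until a message arrives
                    if (message == STOP) {
                        return;
                    }
                    results.add(message.flowId + "/" + message.taskId + ":done");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        try {
            for (TaskMessage message : submitted) {
                queue.put(message); // the producer continues without waiting for the result
            }
            queue.put(STOP);
            worker.join(); // wait only so the sketch can return the results
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return new ArrayList<>(results);
    }

    public static void main(String[] args) {
        List<TaskMessage> submitted = new ArrayList<>();
        submitted.add(new TaskMessage("daily-sync", "extract"));
        submitted.add(new TaskMessage("daily-sync", "load"));
        System.out.println(runOnce(submitted));
    }
}
```

Because a single worker drains a first-in, first-out queue, results come back in submission order. Swapping the in-memory queue for a database table or a Kafka topic would change the transport but not the shape of the producer and the worker.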

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Java data-integration related posts

  • SeaTunnel – super high-performance, distributed data integration tool

    1 project | news.ycombinator.com | 28 Apr 2024
  • Apache SeaTunnel: Next-generation high-performance, distributed integration tool

    1 project | news.ycombinator.com | 27 Apr 2024
  • Questions Regarding design DW

    1 project | /r/dataengineering | 24 Jun 2023
  • SeaTunnel Zeta engine, the first choice for massive data synchronization, is officially released!

    1 project | dev.to | 5 Jan 2023
  • Major Release! SeaTunnel 2.3.0-beta supports the self-innovate SeaTunnel Engine and more connectors!

    1 project | dev.to | 3 Nov 2022
  • SeaTunnel Will Support CDC As A Feature Soon!

    1 project | /r/u_SeaTunnel | 3 Nov 2022
  • SeaTunnel Will Support CDC As A Feature Soon!

    1 project | dev.to | 3 Nov 2022

Index

What are some of the best open-source data-integration projects in Java? This list will help you:

#  Project    Stars
1  seatunnel  7,431
2  kestra     6,605
3  hudi       5,102
4  bitsail    1,584
