Java Data

Open-source Java projects categorized as Data

Top 15 Java Data Projects

  • Presto

    The official home of the Presto distributed SQL query engine for big data

  • Project mention: Multi-Database Support in DuckDB | news.ycombinator.com | 2024-01-28

    We have some of this functionality in Presto (https://github.com/prestodb/presto), but it takes a fair bit of work to implement it for all the different backends.

  • kestra

    Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.

  • Project mention: A High-Performance, Java-Based Orchestration Platform | /r/java | 2023-10-11

    Kestra's communication is asynchronous and based on a queuing mechanism. It leverages the Micronaut framework and offers two runners: one that uses a database (JDBC) for both the message queue and resource storage, and another that uses Kafka as the message queue and Elasticsearch as the resource storage. The platform is fully extensible and plugin-based, providing a rich set of plugins for various workflow tasks, triggers, and data storage options. For those interested, the GitHub repository is available here: https://github.com/kestra-io/kestra

  • data-transfer-project

    The Data Transfer Project makes it easy for people to transfer their data between online service providers. We are establishing a common framework, including data models and protocols, to enable direct transfer of data both into and out of participating online service providers.

  • Project mention: Apple TV, now with more Tailscale | news.ycombinator.com | 2023-09-18

    I would argue that it is exactly in line with Apple's brand identity.

    Pretty much everybody agrees that you need to back up your cloud storage as well as your local computer, and Apple even backs up your i-devices to the cloud, and yet there is no automated way of backing up your iCloud storage.

    About a decade ago, Google initiated the Data Transfer Framework[1] that allows you to transfer data from one cloud provider to another, directly from provider to provider instead of downloading it first. It sadly appears to not have gotten enough traction to be of any use.

    [1]: https://github.com/google/data-transfer-project

  • proteus

    Proteus: A JSON-based LayoutInflater for Android

  • Project mention: Am i safe by sticking with Java and XML for years ahead ? | /r/androiddev | 2023-06-04

    I guess it wouldn't be a first https://github.com/flipkart-incubator/proteus , but

  • nessie

    Nessie: Transactional Catalog for Data Lakes with Git-like semantics

  • Project mention: A deep dive into the concept and world of Apache Iceberg Catalogs | dev.to | 2024-03-01

    Nessie is an innovative open-source catalog that extends beyond the traditional catalog capabilities in the Apache Iceberg ecosystem, introducing git-like features to data management. This catalog not only tracks table metadata but also allows users to capture commits at a holistic level, enabling advanced operations such as multi-table transactions, rollbacks, branching, and tagging. These features provide a new layer of flexibility and control over data changes, resembling version control systems in software development.

  • jimmer

    A revolutionary ORM framework for both Java and Kotlin.

  • micronaut-data

    Ahead of Time Data Repositories

  • Project mention: [Micronaut] Accessing SQL Server 2 | dev.to | 2024-01-19

    micronaut-data/doc-examples/r2dbc-example-java - GitHub

  • riot

    🧨 Get data in & out of Redis with RIOT (by redis)

  • rapiddweller-benerator-ce

    BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.

  • ModelRunner

    No-code, model-driven, natural language data access platform

  • Db4o-gpl

    New Db4o GPL source code for Java 7+ & .netstandard2.0, Android, Xamarin..., the best database project to help you learn how to make databases

  • nextcloud-tables

    📊 Android client for the Nextcloud Tables app

  • Project mention: ⟳ 2 apps added, 54 updated at f-droid.org | /r/FDroidUpdates | 2023-06-11

    Nextcloud Tables (version 1.0.7): Companion app for Nextcloud Tables

  • SheetsIO

    Small configurable Java app that pulls data from a Google Spreadsheet (using v4 api) and writes to files and a local webserver.

  • Data-Structures-and-Algorithms

    Solutions to Arrays, Strings, Lists, Sorting, Stacks, Trees and General DS problems using JAVA. (by anishkumar127)

  • SparkDB

    CSV-to-database-structure project (by NaDeSys)

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Java Data related posts

  • A deep dive into the concept and world of Apache Iceberg Catalogs

    1 project | dev.to | 1 Mar 2024
  • Apple releases Pkl – configuration as code language

    14 projects | news.ycombinator.com | 3 Feb 2024
  • Multi-Database Support in DuckDB

    3 projects | news.ycombinator.com | 28 Jan 2024
  • Why is Hive Metastore everywhere? (Especially Iceberg)

    1 project | /r/dataengineering | 30 Jun 2023
  • Missouri trans 'snitch form' down after people spammed it with the 'Bee Movie' script

    4 projects | /r/politics | 22 Apr 2023
  • Uploading Data from a CSV file

    1 project | /r/redis | 22 Feb 2023
  • Is it safe to update docker/docker-compose?

    2 projects | /r/synology | 9 Feb 2023

Index

What are some of the best open-source Data projects in Java? This list will help you:

#   Project                          Stars
1   Presto                           15,591
2   kestra                            6,340
3   data-transfer-project             3,548
4   proteus                           1,293
5   nessie                              834
6   jimmer                              630
7   micronaut-data                      457
8   riot                                227
9   rapiddweller-benerator-ce           128
10  ModelRunner                          57
11  Db4o-gpl                             29
12  nextcloud-tables                     26
13  SheetsIO                             20
14  Data-Structures-and-Algorithms       12
15  SparkDB                               3
