Five Apache projects you probably didn't know about

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  • Apache Spark

    Apache Spark - A unified analytics engine for large-scale data processing

  • Apache SeaTunnel is a data integration platform that offers the three pillars of data pipelines: sources, transforms, and sinks. It offers an abstract API over three possible engines: the Zeta engine from SeaTunnel or a wrapper around Apache Spark or Apache Flink. Be careful, as each engine comes with its own set of features.

  • skywalking

    APM, Application Performance Monitoring System

  • Apache SkyWalking is an APM tool, focusing on microservices, Cloud Native apps, and Kubernetes architectures. It builds its architecture on four kinds of components:

  • shardingsphere-elasticjob-ui

    Administrator console of ElasticJob

  • ShardingSphere claims to offer an ecosystem able to transform any database into a distributed database system. It acts as a proxy between your code and your database(s). It comes in two flavors:

  • seatunnel

    SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

  • Apache SeaTunnel is a data integration platform that offers the three pillars of data pipelines: sources, transforms, and sinks. It offers an abstract API over three possible engines: the Zeta engine from SeaTunnel or a wrapper around Apache Spark or Apache Flink. Be careful, as each engine comes with its own set of features.

  • Nginx

    An official read-only mirror of http://hg.nginx.org/nginx/ which is updated hourly. Pull requests on GitHub cannot be accepted and will be automatically closed. The proper way to submit changes to nginx is via the nginx development mailing list, see http://nginx.org/en/docs/contributing_changes.html

  • APISIX is an API Gateway. It builds upon OpenResty, a Lua layer built on top of the famous nginx reverse-proxy. APISIX adds abstractions to the mix, e.g., Route, Service, Upstream, and offers a plugin-based architecture.

  • doris

    Apache Doris is an easy-to-use, high performance and unified analytics database.

  • Apache Doris is a real-time data warehouse.

  • apisix-ingress-controller

    APISIX Ingress Controller for Kubernetes

  • In early 2021, I started to work on the Apache APISIX project. I have to admit that I had never heard about it before. In this post, I'd like to introduce some Apache projects that are less well-known than HTTPD or Kafka.

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Apache Iceberg as storage for on-premise data store (cluster)

    3 projects | /r/dataengineering | 16 Mar 2023
  • Uber Interview Experience/Asking Suggestions

    4 projects | /r/dataengineering | 1 Feb 2023
  • What is the separation of storage and compute in data platforms and why does it matter?

    3 projects | dev.to | 29 Nov 2022
  • What are your favourite GitHub repos that shows how data engineering should be done?

    4 projects | /r/dataengineering | 18 Nov 2022
  • 5 Reasons Your Data Lakehouse should Embrace Dremio Cloud

    2 projects | dev.to | 9 Aug 2022