-
Scout Monitoring
Free Django app performance insights with Scout Monitoring. Get Scout setup in minutes, and let us sweat the small stuff. A couple lines in settings.py is all you need to start monitoring your apps. Sign up for our free tier today.
-
prosto
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
-
PostgreSQL
Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitting_a_Patch
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
I share the author's point of view, which led me to start a new relational programming language that compiles to SQL. If that sounds interesting, you can find it here: https://github.com/erezsh/Preql
The only way out that I can see is to design embedded domain specific languages (EDSLs) that inherit the expressiveness, composability and type safety from the host language. That's what Opaleye and Rel8 (Postgres EDSLs for Haskell do. Haskell is particularly good for this. The query language can be just a monad and therefore users can carry all of their knowledge of monadic programming to writing database queries.
This approach doesn't resolve all of the author's complaints but it does solve many.
Disclaimer: I'm the author of Opaleye. Rel8 is built on Opaleye. Other relational query EDSLs are available.
[1] https://github.com/tomjaguarpaw/haskell-opaleye/
One alternative to SQL (type of thinking) is Column-SQL [1] which is based on a new data model. This model is relies on two equal constructs: sets (tables) and functions (columns). It is opposed to the relational algebra which is based on only sets and set operations. One benefit of Column-SQL is that it does not use joins and group-by for connectivity and aggregation, respectively, which are known to be quite difficult to understand and error prone in use. Instead, many typical data processing patterns are implemented by defining new columns: link columns instead of join, and aggregate columns instead of group-by.
More details about "Why functions and column-orientation" (as opposed to sets) can be found in [2]. Shortly, problems with set-orientation and SQL are because producing sets is not what we frequently need - we need new columns and not new table. And hence applying set operations is a kind of workaround due the absence of column operations.
This approach is implemented in the Prosto data processing toolkit [0] and Column-SQL[1] is a syntactic way to define its operations.
[0] https://github.com/asavinov/prosto Prosto is a data processing toolkit - an alternative to map-reduce and join-groupby
[1] https://prosto.readthedocs.io/en/latest/text/column-sql.html Column-SQL (work in progress)
[2] https://prosto.readthedocs.io/en/latest/text/why.html Why functions and column-orientation?
[1] https://scattered-thoughts.net/writing/against-sql/
[2] https://www.postgresql.org/
I've also tried to rewrite some of my challenging queries [0] in my hypothetical syntax and while I think your observation about pipeline length is correct, the result still came out much better than SQL. Frankly, even in F#, most of my pipelines are around 5 functions too. In my view, pipelines are just a convenient mental model. I'd love to see your sketches, here are my (very WIP) concepts: [1]
[0]: for example this monstrosity: https://gitlab.com/dvdkon/jrutil/-/blob/dbb971c18526e68dcc97...