Internet Object – A JSON alternative data serialization format

zed

13 1,322 9.4 Go

A novel data lake based on super-structured data (by brimdata)

This is a very real problem being addressed here and I am intrigued by all the great comments in this thread.
In the Zed project, we've been thinking about and iterating on a better data model for serialization for a few years, and have concluded that schemas kind of get in the way (e.g., the way Parquet, Avro, and JSON Schema define a schema then have a set of values that adhere to the schema). In Zed, a modern and fine-grained type system allows for a structure that is a superset of both the JSON and the relational models, where a schema is simply a special case of the type system (i.e., a named record type).
If you're interested, you can check out the Zed formats here... https://github.com/brimdata/zed/tree/main/docs/formats

hujson

10 575 0.0 Go

HuJSON: JSON for Humans (JWCC: JSON w/ comments and trailing commas)

One of the variants that permit comments: https://github.com/tailscale/hujson
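To make "JSON with comments and trailing commas" concrete, here is a minimal Python sketch that reduces such a document to plain JSON before handing it to a standard parser. It is not the HuJSON library's API (that project is written in Go); the function name `strip_jwcc` is made up for this illustration, and the sketch only assumes `//` line comments, `/* */` block comments, and trailing commas before `]` or `}`.

```python
import json


def strip_jwcc(text: str) -> str:
    """Remove comments and trailing commas so json.loads can parse the result.

    Hypothetical helper for illustration; not part of any real library.
    """
    out = []
    i, n = 0, len(text)
    in_string = False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                # Copy the escaped character verbatim (covers \" inside strings).
                out.append(text[i + 1])
                i += 2
                continue
            if ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Line comment: skip up to (not including) the newline.
            end = text.find("\n", i)
            i = n if end == -1 else end
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                raise ValueError("unterminated block comment")
            out.append(" ")  # keep tokens on either side separated
            i = end + 2
        elif ch in "]}":
            # Drop a trailing comma that precedes this closing bracket.
            j = len(out) - 1
            while j >= 0 and out[j].isspace():
                j -= 1
            if j >= 0 and out[j] == ",":
                del out[j]
            out.append(ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


example = """{
  // line comment
  "a": [1, 2, /* block comment */ ],
  "url": "http://example.com//path",
}"""
parsed = json.loads(strip_jwcc(example))
print(parsed)
```

The string check matters: the `//` inside the URL value must survive, which is why the sketch tracks whether it is inside a string instead of using a regular expression.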

Dixy

3 28 10.0 Swift

Data format based on dictionaries

YAML and its "arrays" are really broken. The problem I see with Internet Object is that it implies the same kind of mechanism.
Every time I read about new formats, they seem to get either the 1-n relations or the n-n relations implemented well, but not both. I guess that's what's so hard about map/reduce...
Regarding YAML: somebody on HN mentioned his project DIXY a couple years ago, and it's much much _much_ easier to parse than YAML. [1] I'm using this over YAML pretty much everywhere now.
[1] https://github.com/kuyawa/Dixy

jsonschema-key-compression

1 95 8.9 TypeScript

Compress json-data based on its json-schema while still having valid json

So the plain data is smaller because some information comes from the schema instead of the object. Guess what, you can do the same with JSON already [1].
[1] https://github.com/pubkey/jsonschema-key-compression
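The idea in the comment above can be sketched in a few lines of Python: derive a table of short keys from the schema, rename keys on the way out, and reverse the mapping on the way in. This is only an illustration of the general technique, not the algorithm or API of jsonschema-key-compression (a TypeScript library); the function names, the `|N` short-key scheme, and the sample schema are all assumptions made for this example.

```python
import json


def build_key_table(schema: dict) -> dict:
    """Map every property name found in a (simplified) JSON Schema to a short key."""
    names = []

    def walk(node: dict) -> None:
        for name, sub in node.get("properties", {}).items():
            if name not in names:
                names.append(name)
            walk(sub)
        items = node.get("items")
        if isinstance(items, dict):
            walk(items)

    walk(schema)
    # Hypothetical short-key scheme; a real key starting with "|" could collide.
    return {name: f"|{i}" for i, name in enumerate(names)}


def rename_keys(value, table: dict):
    """Recursively rename object keys using the table; unknown keys are kept."""
    if isinstance(value, dict):
        return {table.get(k, k): rename_keys(v, table) for k, v in value.items()}
    if isinstance(value, list):
        return [rename_keys(v, table) for v in value]
    return value


schema = {
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "addresses": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "streetName": {"type": "string"},
                    "postalCode": {"type": "string"},
                },
            },
        },
    },
}

table = build_key_table(schema)
inverse = {short: name for name, short in table.items()}

doc = {
    "firstName": "Ada",
    "lastName": "Lovelace",
    "addresses": [{"streetName": "Main St", "postalCode": "12345"}],
}
compressed = rename_keys(doc, table)
restored = rename_keys(compressed, inverse)

print(json.dumps(compressed))
print(len(json.dumps(doc)), "->", len(json.dumps(compressed)), "bytes")
```

The compressed form is still ordinary JSON, so any existing JSON tooling can store or transmit it; only the two ends that share the schema need the key table.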

simdjson

65 18,570 9.2 C++

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

That's true, but the main argument made by the website is about the space advantage, so it's very relevant that that space advantage is basically nullified by the widespread use of compression.
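The compression point is easy to check with a rough experiment. The Python sketch below serializes the same synthetic records twice, once as JSON objects with keys repeated on every row and once as key-less positional arrays, then compares sizes before and after zlib compression. The record layout and row count are made up for the demonstration, and real data may behave differently.

```python
import json
import zlib

# Synthetic records (hypothetical layout, chosen only for this demonstration).
rows = [
    {"name": f"user{i}", "age": 20 + i % 50, "active": i % 2 == 0}
    for i in range(2000)
]

# Same data, with keys on every row vs. positional arrays without keys.
keyed = json.dumps(rows, separators=(",", ":")).encode()
positional = json.dumps(
    [list(r.values()) for r in rows], separators=(",", ":")
).encode()

keyed_gz = zlib.compress(keyed, 9)
positional_gz = zlib.compress(positional, 9)

raw_ratio = len(keyed) / len(positional)
gz_ratio = len(keyed_gz) / len(positional_gz)

print(f"raw:        keyed={len(keyed)}  positional={len(positional)}  ratio={raw_ratio:.2f}")
print(f"compressed: keyed={len(keyed_gz)}  positional={len(positional_gz)}  ratio={gz_ratio:.2f}")
```

On repetitive data like this, the repeated keys become cheap back-references in the compressed stream, so the size gap between the two encodings narrows considerably once compression is applied.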
If your worry is parsing speed, then JSON not only has battle-tested parsers, but also has SIMD-assisted parsers which can process gigabytes a second on a single core (e.g. https://github.com/simdjson/simdjson). It would take Internet Object years to develop parsers as performant as that, even if it did, by some miracle, achieve wide uptake. So the notional advantage afforded by not having keys on each row is neither here nor there.
And incidentally, as someone who's written a handful of parsers, I suspect that this scheme would not be particularly easy to parse. You need lookahead because of optional fields, as well as maintaining state and a lookup table for mapping positions to keys, etc. I can draw up a quick parser in pseudocode or Python to explain, if you disagree.
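As a rough illustration of the bookkeeping described above, here is a minimal Python sketch of a parser for a positional format with optional fields. The format is a toy invented for this example (a comma-separated header where a trailing `?` marks an optional field); it is not Internet Object's actual syntax, and the function names are hypothetical.

```python
def parse_header(header: str) -> list:
    """Build the position-to-key lookup table: a list of (key, is_optional)."""
    fields = []
    for raw in header.split(","):
        name = raw.strip()
        optional = name.endswith("?")
        fields.append((name.rstrip("?"), optional))
    return fields


def parse_row(row: str, fields: list) -> dict:
    """Map positional values to keys; values are kept as strings in this sketch."""
    values = [v.strip() for v in row.split(",")]
    required = sum(1 for _, optional in fields if not optional)
    if not required <= len(values) <= len(fields):
        raise ValueError(f"expected {required}..{len(fields)} values, got {len(values)}")

    # The whole row must be read before any value can be assigned: how many
    # optional fields are present depends on the total value count.
    spare = len(values) - required
    result = {}
    pos = 0
    for key, optional in fields:
        if optional and spare == 0:
            result[key] = None  # optional field omitted in this row
            continue
        if optional:
            spare -= 1
        result[key] = values[pos]
        pos += 1
    return result


fields = parse_header("name, age?, city")
full = parse_row("Ann, 30, Oslo", fields)
short = parse_row("Bob, Rome", fields)
print(full)
print(short)
```

Even in this toy version the parser cannot decide what the second value of a row means until it knows how many values the row contains, and it needs the header state for every row. A keyed format avoids both, since each value carries its own key.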

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Tips on adding JSON output to your command line utility. (2021)
2 projects | news.ycombinator.com | 20 Apr 2024

1BRC Merykitty's Magic SWAR: 8 Lines of Code Explained in 3k Words
4 projects | news.ycombinator.com | 9 Mar 2024

Training great LLMs from ground zero in the wilderness as a startup
3 projects | news.ycombinator.com | 6 Mar 2024

simdjson: Parsing Gigabytes of JSON per Second
1 project | news.ycombinator.com | 23 Jan 2024

Simdjson: Parsing Gigabytes of JSON per Second
1 project | news.ycombinator.com | 30 Nov 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
JSON json-parser NoSQL Simd Compression
Post date: 24 Oct 2021
