Internet Object – A JSON alternative data serialization format

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • zed

    A novel data lake based on super-structured data (by brimdata)

  • This is a very real problem being addressed here and I am intrigued by all the great comments in this thread.

    In the Zed project, we've been thinking about and iterating on a better data model for serialization for a few years, and have concluded that schemas kind of get in the way (e.g., the way Parquet, Avro, and JSON Schema define a schema then have a set of values that adhere to the schema). In Zed, a modern and fine-grained type system allows for a structure that is a superset of both the JSON and the relational models, where a schema is simply a special case of the type system (i.e., a named record type).

    If you're interested, you can check out the Zed formats here... https://github.com/brimdata/zed/tree/main/docs/formats

  • hujson

    HuJSON: JSON for Humans (JWCC: JSON w/ comments and trailing commas)

  • One of the variants that permit comments: https://github.com/tailscale/hujson

  • Dixy

    Data format based on dictionaries

  • YAML and its "Arrays" are really broken. The problem I see with Internet Object is that it's also implying this kind of mechanism.

    Every time I read about new formats, they seem to get either the 1-n relations or the n-n relations implemented well, but not both. I guess that's what's so hard about map/reduce...

    Regarding YAML: somebody on HN mentioned his project DIXY a couple years ago, and it's much much _much_ easier to parse than YAML. [1] I'm using this over YAML pretty much everywhere now.

    [1] https://github.com/kuyawa/Dixy

  • jsonschema-key-compression

    Compress json-data based on its json-schema while still having valid json

  • So the plain data is smaller because some information comes from the schema instead of the object. Guess what, you can already do the same with JSON. [1]

    [1] https://github.com/pubkey/jsonschema-key-compression
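
The idea in the comment above can be sketched in a few lines of Python. This is only an illustration of schema-driven key shortening, not the API or the alias scheme of jsonschema-key-compression: the function names, the `|0`, `|1` alias format, and the sample schema are all assumptions made for the example, and only top-level properties are handled.

```python
import json

def build_key_table(schema: dict) -> dict:
    """Map each top-level property name in the schema to a short alias.

    Sorting the names makes the table deterministic, so both sides can
    rebuild it from the schema alone. The alias format is hypothetical.
    """
    names = sorted(schema["properties"])
    return {name: f"|{index:x}" for index, name in enumerate(names)}

def compress(obj: dict, table: dict) -> dict:
    """Replace known keys with their aliases; unknown keys pass through."""
    return {table.get(key, key): value for key, value in obj.items()}

def decompress(obj: dict, table: dict) -> dict:
    """Invert compress() using the same table."""
    reverse = {alias: name for name, alias in table.items()}
    return {reverse.get(key, key): value for key, value in obj.items()}

# Hypothetical schema and document, used only for illustration.
schema = {
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "emailAddress": {"type": "string"},
    },
}
doc = {"firstName": "Ada", "lastName": "Lovelace", "emailAddress": "ada@example.com"}

table = build_key_table(schema)
compressed = compress(doc, table)
restored = decompress(compressed, table)

# The compressed form is still ordinary JSON, just with shorter keys.
print(json.dumps(compressed))
print(len(json.dumps(doc)), "->", len(json.dumps(compressed)), "bytes")
```

The sketch assumes no real key in the data collides with an alias; a production scheme would need to reserve or escape the alias prefix.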

  • simdjson

    Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

  • That's true, but the main argument made by the website is about the space advantage, so it's very relevant that that space advantage is basically nullified by the widespread use of compression.

    If your worry is parsing speed, then JSON not only has battle-tested parsers, but also has SIMD-assisted parsers which can process gigabytes a second on a single core (e.g. https://github.com/simdjson/simdjson). It would take Internet Object years to develop parsers as performant as that, even if it did, by some miracle, achieve wide uptake. So the notional advantage afforded by not having keys on each row is neither here nor there.

    And incidentally, as someone who's written a handful of parsers, I suspect that this scheme would not be particularly easy to parse. You need lookahead because of optional fields, as well as maintaining state and a lookup table for mapping positions to keys, etc. I can draw up a quick parser in pseudocode or Python to explain, if you disagree.
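
To make the parsing concern concrete, here is a minimal Python sketch of a positional row parser. It does not implement the actual Internet Object syntax: the header notation (`email?` for an optional field), the comma-separated rows, and the function names are assumptions chosen for illustration. It shows the position-to-key lookup table the comment mentions, and it sidesteps lookahead only by restricting omissions to empty slots or missing trailing values.

```python
from typing import Dict, List, Tuple

Field = Tuple[str, bool]  # (key, is_optional)

def parse_header(header: str) -> List[Field]:
    """Turn 'name, age, email?' into an ordered position-to-key table."""
    fields = []
    for raw in header.split(","):
        name = raw.strip()
        optional = name.endswith("?")
        fields.append((name.rstrip("?"), optional))
    return fields

def parse_row(fields: List[Field], row: str) -> Dict[str, str]:
    """Map the values of one row onto keys by position.

    An empty slot or a missing trailing value means "absent", which is
    allowed only for optional fields. Values are kept as strings; quoting,
    nesting, and escaped commas are deliberately out of scope.
    """
    values = [value.strip() for value in row.split(",")]
    if len(values) > len(fields):
        raise ValueError("row has more values than the header has fields")
    record = {}
    for position, (key, optional) in enumerate(fields):
        value = values[position] if position < len(values) else ""
        if value == "":
            if not optional:
                raise ValueError(f"missing required field {key!r}")
            continue
        record[key] = value
    return record

# Hypothetical header and rows, used only for illustration.
fields = parse_header("name, age, email?")
rows = ["Ada, 36, ada@example.com", "Bob, 41"]
records = [parse_row(fields, row) for row in rows]
print(records)
```

Even this stripped-down version has to carry the header table as state for every row and decide what an absent value means. Allowing optional fields in the middle of a row without an empty placeholder would require type-directed guessing or lookahead, which is the difficulty the comment points at.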



Related posts

  • Tips on adding JSON output to your command line utility. (2021)

    2 projects | news.ycombinator.com | 20 Apr 2024
  • 1BRC Merykitty's Magic SWAR: 8 Lines of Code Explained in 3k Words

    4 projects | news.ycombinator.com | 9 Mar 2024
  • Training great LLMs from ground zero in the wilderness as a startup

    3 projects | news.ycombinator.com | 6 Mar 2024
  • simdjson: Parsing Gigabytes of JSON per Second

    1 project | news.ycombinator.com | 23 Jan 2024
  • Simdjson: Parsing Gigabytes of JSON per Second

    1 project | news.ycombinator.com | 30 Nov 2023