Ask HN: How do your ML teams version datasets and models?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • dvc

    🦉 ML Experiments and Data Management with Git

  • dud

    A lightweight CLI tool for versioning data alongside source code and building data pipelines.

  • I've used DVC in the past and generally liked its approach. That said, I wholeheartedly agree that it's clunky. It does a lot of things implicitly, which can make it hard to reason about. It was also extremely slow on medium-sized datasets (low 10s of GBs).

    In response, I created a command-line tool that addresses these issues[0]. To reduce the comparison to an analogy: Dud : DVC :: Flask : Django.

    [0]: https://github.com/kevin-hanselman/dud
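
For readers new to the idea of versioning data alongside source code, here is a minimal sketch of the general content-addressed approach: the large file goes into a cache keyed by its hash, and a small pointer file is what gets committed to Git. This is an illustration only, not how DVC or Dud are implemented; the function names, the `.ptr.json` suffix, and the cache layout are all assumptions made up for this example.

```python
import hashlib
import json
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def cache_path(cache_dir: Path, digest: str) -> Path:
    """Hypothetical cache layout: <cache>/<first 2 hex chars>/<remaining chars>."""
    return cache_dir / digest[:2] / digest[2:]


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into the cache and write a small pointer file next to it.

    The pointer file (hypothetical '.ptr.json' suffix) is small enough to
    commit to Git; the data itself lives only in the cache.
    """
    digest = file_digest(data_file)
    cached = cache_path(cache_dir, digest)
    if not cached.exists():  # identical content is stored only once
        cached.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".ptr.json")
    pointer.write_text(json.dumps({"path": data_file.name, "sha256": digest}, indent=2))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file named in the pointer from the cache."""
    meta = json.loads(pointer.read_text())
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_path(cache_dir, meta["sha256"]), target)
    return target
```

Checking out an older commit and calling `restore` on its pointer file would bring back the matching version of the data, provided that version is still in the cache. Real tools add much more on top (remote storage, pipelines, directory handling), which this sketch does not attempt.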

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Git-annex – Managing large files with Git

    2 projects | news.ycombinator.com | 15 Jan 2022
  • Git Version Controlled Datasets in S3

    1 project | news.ycombinator.com | 25 Oct 2023
  • Where do I best store my test data when using github for code?

    1 project | /r/learnmachinelearning | 9 May 2023
  • Using git to version control experimental data (not code)?

    1 project | /r/git | 19 Apr 2023
  • Introduction to Data Version Control

    1 project | dev.to | 1 Apr 2023