Stuff we figured out about AI in 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • llama2.c

    Inference Llama 2 in one file of pure C

  • For inference, less than 1 KLOC of pure, dependency-free C is enough, even including the tokenizer and command-line parsing[1]. This was a non-obvious fact for me: in principle, you could have run a modern LLM 20 years ago with just 1,000 lines of code, assuming you were fine with inference potentially taking days.

    Training wouldn't be that much harder. Micrograd[2] is 200 LOC of pure Python, so 1,000 lines would probably be enough to train an (extremely slow) LLM. By "extremely slow", I mean that a training run that normally takes hours could take dozens of years, but the results would, in principle, be the same.

    If you wrote it in C instead of Python and used something like llama.cpp's optimization tricks, you could probably get somewhat acceptable training performance in 2 or 3 KLOC. You'd still be off by one or two orders of magnitude compared to a GPU cluster, but far better than naive, loopy Python.

    [1] https://github.com/karpathy/llama2.c

    [2] https://github.com/karpathy/micrograd
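To make the "200 LOC autograd engine" claim concrete, here is a micrograd-style sketch (not the actual micrograd code) of the core reverse-mode machinery in pure Python; class and method names are illustrative:

```python
class Value:
    """A scalar that records its computation graph for reverse-mode autodiff."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None  # closure that applies the local chain rule
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()


# d(x*y + x)/dx = y + 1 = 4 at y = 3
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad)  # → 4.0
```

With more operations (tanh, exp, power) and a small `Neuron`/`Layer`/`MLP` wrapper on top, this is essentially the whole training engine; everything else in a real LLM codebase is the model architecture and the data pipeline.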

  • llama

    Inference code for Llama models

  • > Instead, it turns out a few hundred lines of Python is genuinely enough to train a basic version!

    Actually, it's not just a basic version. Llama 1/2's model.py is 500 lines: https://github.com/facebookresearch/llama/blob/main/llama/mo...

    Mistral (which is rumored to have forked Llama) is 369 lines: https://github.com/mistralai/mistral-src/blob/main/mistral/m...

    Both of these are SOTA open-source models.
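Part of why those model files stay so short is that the core computation, scaled dot-product attention with a causal mask, is only a few lines. Below is a hedged numpy sketch for a single head; the function name and shapes are illustrative, not taken from the Llama or Mistral sources:

```python
import numpy as np

def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v: (seq_len, head_dim) arrays for one attention head.
    Returns a (seq_len, head_dim) array where each position is a
    weighted mix of values at positions up to and including itself.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                      # (seq, seq) similarities
    mask = np.triu(np.ones_like(scores), k=1).astype(bool)
    scores = np.where(mask, -np.inf, scores)           # hide future tokens
    # Row-wise softmax (subtract the max for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

The real implementations add multi-head projections, rotary embeddings, KV caching, and (in Mistral's case) sliding-window attention, but each of those is itself a short, self-contained piece, which is how a SOTA model fits in a few hundred lines.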

  • mistral-src

    Reference implementation of Mistral AI 7B v0.1 model.


  • micrograd

    A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API


NOTE: The mention count for each project combines mentions in common posts with user-suggested alternatives, so a higher number indicates a more popular project.


Related posts

  • Mistral 7B vs. Mixtral 8x7B

    1 project | dev.to | 26 Mar 2024
  • How to have your own ChatGPT on your machine (and make it talk to itself)

    1 project | dev.to | 24 Jan 2024
  • How to Serve LLM Completions in Production

    1 project | dev.to | 18 Jan 2024
  • Mistral website was just updated

    3 projects | /r/LocalLLaMA | 11 Dec 2023
  • Mistral AI – open-source models

    1 project | news.ycombinator.com | 8 Dec 2023