mistral-src vs llama2.c

| | mistral-src | llama2.c |
|---|---|---|
| Mentions | 9 | 14 |
| Stars | 8,732 | 16,071 |
| Growth | 4.1% | - |
| Activity | 7.3 | 9.2 |
| Latest commit | about 2 months ago | 13 days ago |
| Language | Jupyter Notebook | C |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mistral-src
-
Mistral 7B vs. Mixtral 8x7B
Mistral AI, a French startup, has released two impressive large language models (LLMs): Mistral 7B and Mixtral 8x7B. These models push the boundaries of performance and introduce architectural innovations aimed at optimizing inference speed and computational efficiency.
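One concrete example of such an innovation in Mistral 7B is sliding-window attention, where each token attends only to the previous W positions instead of the full context, making attention cost linear rather than quadratic in sequence length. Below is a minimal, hedged sketch of the masking idea; the window size and shapes are illustrative, not Mistral's actual code:

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: position i may attend to positions j
    with i - window < j <= i (causal + sliding window)."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

# With a window of 4, token 10 attends only to tokens 7..10, so the
# per-token attention cost stays constant as the sequence grows.
print(sliding_window_mask(seq_len=12, window=4).astype(int))
```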
-
How to have your own ChatGPT on your machine (and make it talk to itself)
However, some models are publicly available. This is the case for Mistral, a fast and efficient French model that seems to outperform GPT-4 on some tasks. And it is under the Apache 2.0 license 😊.
-
How to Serve LLM Completions in Production
I recommend starting with either llama2 or Mistral. You need to download the pretrained weights and convert them into GGUF format before they can be used with llama.cpp.
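As a hedged sketch of the serving step itself, assuming llama.cpp's Python bindings (llama-cpp-python) and a hypothetical local path to an already-converted GGUF file:

```python
# Minimal sketch: assumes `pip install llama-cpp-python` and a GGUF file
# already produced from the pretrained weights with llama.cpp's converter.
from llama_cpp import Llama

llm = Llama(model_path="./models/mistral-7b.Q4_K_M.gguf")  # hypothetical path
out = llm("Q: What is GGUF? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```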
-
Stuff we figured out about AI in 2023
> Instead, it turns out a few hundred lines of Python is genuinely enough to train a basic version!
actually it's not just a basic version. Llama 1/2's model.py is 500 lines: https://github.com/facebookresearch/llama/blob/main/llama/mo...
Mistral (which is rumored to have forked llama) is 369 lines: https://github.com/mistralai/mistral-src/blob/main/mistral/m...
and both of these are SOTA open source models.
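That compactness is plausible once you notice that a decoder layer is mostly a handful of matrix multiplications. Here is a hedged, minimal sketch of single-head causal self-attention, illustrative only: no KV cache, RoPE, or multi-head plumbing:

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv, Wo):
    """x: (seq_len, d_model). The core of a decoder layer in a few lines."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores += np.triu(np.full(scores.shape, -1e9), k=1)  # mask future tokens
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)            # row-wise softmax
    return (probs @ v) @ Wo

# Toy usage with made-up shapes:
rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(5, d))
Wq, Wk, Wv, Wo = (rng.normal(size=(d, d)) / d**0.5 for _ in range(4))
print(causal_self_attention(x, Wq, Wk, Wv, Wo).shape)  # (5, 8)
```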
-
How Open is Generative AI? Part 2
MistralAI, a French startup, developed a 7.3-billion-parameter LLM named Mistral for various applications. While the company is committed to open-sourcing its technology under Apache 2.0, the training dataset details for Mistral remain undisclosed. The Mistral Instruct model was fine-tuned using publicly available instruction datasets from the Hugging Face repository, though specifics about the licenses and potential constraints are not detailed. Recently, MistralAI released Mixtral 8x7B, a model based on the sparse mixture-of-experts (SMoE) architecture, consisting of several specialized models (likely eight, as suggested by its name) activated as needed.
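As a hedged toy sketch of what sparse mixture-of-experts routing means in code (top-2 routing over 8 experts matches Mixtral's public description; everything else here is made up for illustration):

```python
import numpy as np

def smoe_layer(x, gate_w, experts, top_k=2):
    """Route a token to its top_k experts and mix their outputs
    by softmax-renormalized router scores."""
    logits = x @ gate_w                    # one router score per expert
    top = np.argsort(logits)[-top_k:]      # indices of the top_k experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                           # softmax over the chosen experts
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

# Toy usage: 8 "experts" that are just random linear maps (cf. the 8x7B name).
rng = np.random.default_rng(0)
d = 16
experts = [lambda x, W=rng.normal(size=(d, d)) / d**0.5: x @ W for _ in range(8)]
gate_w = rng.normal(size=(d, 8))
print(smoe_layer(rng.normal(size=d), gate_w, experts).shape)  # (16,)
```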
- Mistral website was just updated
- Mistral AI – open-source models
- Mistral 8x7B 32k model [magnet]
-
Ask HN: Why the LLaMA code base is so short
I was getting into LLMs and picked up some projects. I tried to dive into the code to see what the secret sauce is.
But the code is so short that there is almost nothing to read.
https://github.com/facebookresearch/llama
I then proceeded to check https://github.com/mistralai/mistral-src and surprisingly it's the same.
What exactly are those codebases? It feels like you just download the models.
llama2.c
-
Stuff we figured out about AI in 2023
For inference, less than 1 KLOC of pure, dependency-free C is enough (even including the tokenizer and command-line parsing) [1]. This was a non-obvious fact for me: in principle, you could have run a modern LLM 20 years ago with just 1,000 lines of code, assuming you're fine with things potentially taking days to run, of course.
Training wouldn't be that much harder: Micrograd [2] is 200 LOC of pure Python, and 1,000 lines would probably be enough for training an (extremely slow) LLM. By "extremely slow", I mean that a training run that normally takes hours could probably take dozens of years, but the results would, in principle, be the same.
If you were writing in C instead of Python and used something like llama.cpp's optimization tricks, you could probably get somewhat acceptable training performance in 2 or 3 KLOC. You'd still be off by one or two orders of magnitude compared to a GPU cluster, but a lot better than naive, loopy Python.
[1] https://github.com/karpathy/llama2.c
[2] https://github.com/karpathy/micrograd
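To make the Micrograd point above concrete, here is a hedged, stripped-down sketch of the reverse-mode autograd idea at its core (same spirit, not the actual micrograd code):

```python
class Value:
    """Scalar that tracks gradients, in the spirit of micrograd."""
    def __init__(self, data, parents=()):
        self.data, self.grad = data, 0.0
        self._parents, self._backward = parents, lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def backward(self):
        # Topologically order the graph, then propagate gradients in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# d(x*y + x)/dx = y + 1 = 4.0 when y = 3
x, y = Value(2.0), Value(3.0)
(x * y + x).backward()
print(x.grad)  # 4.0
```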
-
Minimal neural network implementation
A bit off topic, but ML guru Andrej Karpathy has implemented a state-of-the-art Llama 2 model in plain C with no dependencies on third-party libraries. See the repo.
-
WebLLM: Llama2 in the Browser
Related: I compiled karpathy's llama2.c (https://github.com/karpathy/llama2.c) to WASM without modifications and ran it in the browser. It was a fun exercise to directly compare native vs. web performance. I'm getting 80% of native performance on my M1 MacBook Air and haven't spent any time optimizing the WASM side.
Demo: https://diegomarcos.com/llama2.c-web/
Code:
-
Lfortran: Modern interactive LLVM-based Fortran compiler
Would be cool for there to be a `llama2.f`, similar to https://github.com/karpathy/llama2.c, to demo its capabilities.
-
Llama2.c L2E LLM – Multi OS Binary and Unikernel Release
This is a fork of https://github.com/karpathy/llama2.c
karpathy's llama2.c is like llama.cpp, but it is written in C and the Python training code is available in the same repo. llama2.c's goal is to be an elegant single-file C implementation of inference and an elegant Python implementation of training (a sketch of the per-token sampling step such an engine runs follows this entry).
His goal is for people to understand how Llama 2 and LLMs work, so he keeps it simple and sweet. As the project progresses, features and performance improvements will be added.
Currently it can infer the baby (small) story models trained by Karpathy at a fast pace. It can also infer Meta's Llama 2 7B models, but at a very slow rate, around 1 token per second.
So currently this can be used for learning or as a tech preview.
Our friendly fork tries to make it portable, performant, and more usable (bells and whistles) over time. Since we mirror upstream closely, the inference capabilities of our fork are similar, but slightly faster if compiled with acceleration. What we try to do differently is make this bootable (not there yet) and portable. Right now you get binary portability: use the same run.com on any x86_64 machine running any OS and it will work (made possible by the Cosmopolitan toolchain). The other part that works is unikernels: boot this as a unikernel in VMs (made possible by the unikraft unikernel and toolchain).
For now, see our fork as a release-early, release-often toy tech demo. We plan to build it out into a useful product.
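For context on the per-token sampling step referenced above, here is a hedged Python transcription of what a run.c-style generation loop does with the model's output logits (not the fork's actual code):

```python
import numpy as np

def sample_next_token(logits: np.ndarray, temperature: float = 1.0) -> int:
    """Pick the next token id from raw logits, once per generated token."""
    if temperature == 0.0:
        return int(np.argmax(logits))              # greedy decoding
    z = (logits - logits.max()) / temperature
    probs = np.exp(z)
    probs /= probs.sum()                           # softmax with temperature
    return int(np.random.choice(len(probs), p=probs))

# Toy usage with a fake 5-token vocabulary:
print(sample_next_token(np.array([0.1, 2.0, -1.0, 0.5, 0.0]), temperature=0.8))
```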
- FLaNK Stack Weekly for 14 Aug 2023
-
Adding LLaMa2.c support for Web with GGML.JS
In my latest release of ggml.js, I've added support for Karpathy's llama2.c model.
-
Beginner's Guide to Llama Models
I really enjoyed Andrej Karpathy's llama2.c project (https://github.com/karpathy/llama2.c), which runs through creating and running a miniature Llama 2 architecture model from scratch.
-
How to scale LLMs better with an alternative to transformers
- https://github.com/karpathy/llama2.c
I think there may be some applications in this limited space that are worth looking into. You won't replicate GPT-anything, but it may be possible to solve some nice problems much more efficiently than one would expect at first.
-
A simple guide to fine-tuning Llama 2
It does now: https://github.com/karpathy/llama2.c#metas-llama-2-models
What are some alternatives?
ReAct - [ICLR 2023] ReAct: Synergizing Reasoning and Acting in Language Models
llama2.c - Llama 2 Everywhere (L2E)
lida - Automatic Generation of Visualizations and Infographics using Large Language Models
fastGPT - Fast GPT-2 inference written in Fortran
ragas - Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
CML_AMP_Churn_Prediction_mlflow - Build a scikit-learn model to predict churn using customer telco data.
vllm - A high-throughput and memory-efficient inference and serving engine for LLMs
feldera - Feldera Continuous Analytics Platform
llama - Inference code for Llama models
awesome-data-temporality - A curated list to help you manage temporal data across many modalities 🚀.
text-generation-webui-colab - A colab gradio web UI for running Large Language Models
dify - Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.