| | llama-mistral | megablocks |
|---|---|---|
| Mentions | 5 | 6 |
| Stars | 373 | 1,083 |
| Growth | - | 3.0% |
| Activity | 8.4 | 8.7 |
| Latest commit | 6 months ago | 8 days ago |
| Language | Python | Python |
| License | GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
llama-mistral
- Inference code for Mistral and Mixtral hacked up
- French AI startup Mistral secures €2B valuation
No. Without the inference code, the best we can do is guess at its implementation, so any benchmark figures we get could be quite wrong. It does seem better than Llama2-70B in my tests, which rely on the work done by Dmytro Dzhulgakov[0] and DiscoResearch[1].
But the point of releasing over BitTorrent is the effervescence it creates: hobbyist research and early attempts at MoE quantization are already underway[2], and they are benefiting from the community.
[0]: https://github.com/dzhulgakov/llama-mistral
[1]: https://huggingface.co/DiscoResearch/mixtral-7b-8expert
[2]: https://github.com/TimDettmers/bitsandbytes/tree/sparse_moe
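For context on what those implementation guesses look like: Mixtral is widely believed to route each token through 2 of 8 expert feed-forward networks per layer. Below is a minimal top-2 routing sketch in PyTorch, with illustrative dimensions and a plain MLP standing in for the real expert block; this is an assumption-laden toy, not code from either repo.

```python
# Toy top-2 MoE routing sketch (assumed Mixtral-style; not the released code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopTwoMoE(nn.Module):
    def __init__(self, dim=4096, hidden=14336, n_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts, bias=False)  # router
        # Plain MLP experts here; the real model likely uses SwiGLU FFNs.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):  # x: (num_tokens, dim)
        logits = self.gate(x)                                  # (tokens, experts)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)  # 2 experts per token
        weights = F.softmax(weights, dim=-1)                   # renormalize over top-2
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel():                              # tokens routed to expert e
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out

moe = TopTwoMoE()
y = moe(torch.randn(5, 4096))  # 5 tokens in, 5 expert-mixed outputs back
```

The sketch only shows the routing arithmetic; everything outside the FFN (attention, norms) is reportedly shared with a dense Llama-style transformer, which is why forks of the Llama inference code were a natural starting point.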
- Code to run Mistral - mixtral-8x7b-32kseqlen
- New Mistral models just dropped (magnet links)
Someone made this. https://github.com/dzhulgakov/llama-mistral
- Mistral 8x7B 32k model [magnet]
If anyone can help with running this, it would be appreciated. Resources so far:
- https://github.com/dzhulgakov/llama-mistral
megablocks
- FLaNK AI - 01 April 2024
- Mistral has released a new 87GB model
- Megablocks-Public
This is a fork of https://github.com/stanford-futuredata/megablocks. Should link to the original when possible, per the HN posting guidelines.
- New Mistral models just dropped (magnet links)
I guess with 40+ GB of VRAM (until quantized) and MegaBlocks as the runtime. https://github.com/stanford-futuredata/megablocks
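Back-of-the-envelope arithmetic behind that VRAM guess, using the commonly cited ~46.7B total parameter count for Mixtral 8x7B (an assumption, not a measured figure):

```python
# Rough weight-memory estimate for Mixtral 8x7B (no KV cache or activations).
# ~46.7B total parameters is the commonly cited figure; treat it as an assumption.
total_params = 46.7e9

for name, bytes_per_param in [("fp16/bf16", 2), ("int8", 1), ("4-bit", 0.5)]:
    gib = total_params * bytes_per_param / 1024**3
    print(f"{name:9s} ~{gib:5.1f} GiB")

# fp16/bf16 ~ 87.0 GiB  -> matches the ~87GB release size mentioned above
# int8      ~ 43.5 GiB  -> the "40+ GB of VRAM" ballpark
# 4-bit     ~ 21.8 GiB  -> why quantization makes single-GPU inference plausible
```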
- MegaBlocks: Efficient Sparse Training with Mixture-of-Experts
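That last mention is the MegaBlocks paper itself. Its core idea, roughly: MoE routing produces unevenly sized per-expert token batches, and padding or dropping tokens to force fixed batch sizes wastes compute or hurts quality; MegaBlocks instead expresses the expert computation as block-sparse matrix multiplies so no tokens are dropped. A dense toy emulation of that grouped computation (illustrative only; the real library uses custom block-sparse kernels):

```python
# Dense toy emulation of "dropless" grouped expert computation. MegaBlocks
# realizes this with block-sparse kernels; this loop only mimics the math.
import torch

num_tokens, dim, n_experts = 16, 8, 4
x = torch.randn(num_tokens, dim)
assign = torch.randint(n_experts, (num_tokens,))   # router picks, unevenly
w = torch.randn(n_experts, dim, dim)               # one weight matrix per expert

order = torch.argsort(assign)                      # group tokens by expert
grouped = x[order]
counts = torch.bincount(assign, minlength=n_experts)

out = torch.empty_like(grouped)
start = 0
for e in range(n_experts):
    end = start + int(counts[e])
    out[start:end] = grouped[start:end] @ w[e]     # variable-size matmul, no padding
    start = end

result = torch.empty_like(out)
result[order] = out                                # scatter back to token order
```

Replacing this Python loop with a single block-sparse matmul over the grouped tokens is, as I understand the paper, the part MegaBlocks' kernels provide.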
What are some alternatives?
llama.cpp - LLM inference in C/C++
speedb - A RocksDB-compliant, high-performance, scalable embedded key-value store
megablocks-public
lapdev - Self-Hosted Remote Dev Environment
tracecat - 😼 The open source alternative to Tines / Splunk SOAR. Build AI-assisted workflows, orchestrate alerts, and close cases fast.