| | gemma.cpp | xpk |
|---|---|---|
| Mentions | 8 | 1 |
| Stars | 5,582 | 57 |
| Growth | 8.4% | - |
| Activity | 9.3 | 9.0 |
| Last commit | about 20 hours ago | 1 day ago |
| Language | C++ | Python |
| License | Apache License 2.0 | Apache License 2.0 |
Stars: the number of stars a project has on GitHub. Growth: month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed; recent commits carry more weight than older ones. For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
gemma.cpp

- LLaMA Now Goes Faster on CPUs
  "For C++, also check out our https://github.com/google/gemma.cpp/blob/main/gemma.cc, which has direct calls to MatVec."
- FLaNK Stack 26 February 2024
- Gemma.cpp: lightweight, standalone C++ inference engine for Gemma models
  "Looks like they're working on it: https://github.com/google/gemma.cpp/issues/16"
- Source code of Google Gemma model in C++
- Gemma: New Open Models
  "They have implemented the model also on their own C++ inference engine: https://github.com/google/gemma.cpp"
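For reference, gemma.cpp is driven from the command line rather than as a library. A minimal invocation, assuming you have downloaded a tokenizer and compressed weights for the 2B instruction-tuned checkpoint, looks roughly like this; the flag names follow the project README at the time of writing and the file names are placeholders, so treat the sketch as illustrative:

```shell
# Build gemma.cpp (CMake-based; assumes a recent C++ toolchain).
cmake -B build
cmake --build build -j

# Run interactive inference with the 2B instruction-tuned model.
# File names below are examples and depend on the checkpoint you download.
./build/gemma \
  --tokenizer tokenizer.spm \
  --compressed_weights 2b-it-sfp.sbs \
  --model 2b-it
```

The `--compressed_weights` path points at the project's 8-bit switched-floating-point format, which is what keeps the engine lightweight enough to run on CPU.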
xpk

- Gemma: New Open Models
  "There is a lot of work to make the actual infrastructure and lower-level management of lots and lots of GPUs/TPUs open as well; my team focuses on making the infrastructure at least a bit more approachable on GKE and Kubernetes:
  https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main
  and https://github.com/google/xpk (a bit more focused on HPC, but includes AI)
  and https://github.com/stas00/ml-engineering (not associated with GKE, but describes training with SLURM).
  Actual training is still done by a fairly small pool of very experienced people, but it's getting better. And every day serving models gets that much faster: you can often simply build on Triton and TensorRT-LLM or vLLM and see significant wins month to month."
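To make the xpk side concrete: xpk wraps GKE cluster provisioning and workload scheduling in a few subcommands. A sketch along the lines of the project README, with the cluster name, TPU type, slice count, and training command all placeholder values you would replace (check the xpk README for the current flags):

```shell
# Create a GKE cluster with accelerator node pools (values are examples).
python3 xpk.py cluster create \
  --cluster my-xpk-cluster \
  --tpu-type=v5litepod-16 \
  --num-slices=2

# Submit a training workload to that cluster; xpk handles scheduling
# the command onto the accelerator slices.
python3 xpk.py workload create \
  --workload hello-world \
  --cluster my-xpk-cluster \
  --tpu-type=v5litepod-16 \
  --command "python3 train.py"
```

The appeal is that users express jobs in terms of slices and commands rather than raw Kubernetes manifests.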
What are some alternatives?
llamafile - Distribute and run LLMs with a single file.
ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.
mud-pi - A simple MUD server in Python, for teaching purposes, which could be run on a Raspberry Pi
gemma_pytorch - The official PyTorch implementation of Google's Gemma models
gemma - Open weights LLM from Google DeepMind.
htmx - high power tools for HTML
plantuml - Generate diagrams from textual description
prql - PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
lnav - Log file navigator
ibis - the portable Python dataframe library
lotion - An open-source Notion UI built with Vue 3
FLiPStackWeekly - FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...