Gemma.cpp: lightweight, standalone C++ inference engine for Gemma models

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • gemma.cpp

    Lightweight, standalone C++ inference engine for Google's Gemma models (a minimal invocation sketch appears after the project list below).

  • Yes, thanks for pointing that out. The README is being updated; you can see a work-in-progress version in the dev branch: https://github.com/google/gemma.cpp/tree/dev?tab=readme-ov-f...

  • EasyDeL

    Accelerate your training with this open-source library. Optimize performance with streamlined training and serving options in JAX. 🚀

  • This is not a production backend, as it says in the readme.

    There are some very interesting efforts in JAX/TPU land like https://github.com/erfanzar/EasyDeL

  • llamafile

    Distribute and run LLMs with a single file.

  • llama.cpp has integrated Gemma support, so you can use llamafile for this. It is a standalone executable that is portable across most popular OSes.

    https://github.com/Mozilla-Ocho/llamafile/releases

    Download the executable from the releases page under Assets. You want either just main or just server; don't get the huge ones with the model inlined in the file. The executable is about 30 MB (see the download-and-run sketch after the project list below):

    https://github.com/Mozilla-Ocho/llamafile/releases/download/...

  • highway

    Performance-portable, length-agnostic SIMD with runtime dispatch

  • Thanks so much!

    Everyone working on this self-selected into contributing, so I think of it less as my team than ... a team?

    I specifically want to call out Jan Wassenberg (author of https://github.com/google/highway), who started gemma.cpp with me as a small project just a few months ago, along with Phil Culliton, Dan Zheng, and Paul Chang, and of course the GDM Gemma team.

  • gemma-cpp-python

    A Python wrapper for gemma.cpp
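For readers who want to try gemma.cpp itself, here is a minimal Python sketch that drives a locally built gemma binary. This is a sketch under assumptions, not an official recipe: the flag names (--tokenizer, --compressed_weights, --model) and the sample file names follow the project's README at the time of writing and may have changed, and piping the prompt over stdin relies on the CLI's interactive loop reading standard input. Verify everything against the current README.

```python
# Minimal sketch: drive a locally built gemma.cpp binary from Python.
# ASSUMPTIONS (check the gemma.cpp README): the binary lives at ./build/gemma,
# it accepts --tokenizer / --compressed_weights / --model, and it reads the
# prompt from stdin. The tokenizer/weights/model names below are placeholders.
import subprocess

def run_gemma(prompt: str,
              binary: str = "./build/gemma",
              tokenizer: str = "tokenizer.spm",
              weights: str = "2b-it-sfp.sbs",
              model: str = "2b-it") -> str:
    """Send one prompt to the gemma.cpp CLI and return whatever it prints."""
    result = subprocess.run(
        [binary,
         "--tokenizer", tokenizer,
         "--compressed_weights", weights,
         "--model", model],
        input=prompt,
        capture_output=True,
        text=True,
        check=True,  # raise if the binary exits with a non-zero status
    )
    return result.stdout

if __name__ == "__main__":
    print(run_gemma("Give me a one-line summary of SIMD."))
```

The raw output may include the CLI's own banner text in addition to the generated tokens, so a real wrapper (such as gemma-cpp-python, listed above) would parse or suppress that; this sketch only shows the shape of the invocation.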

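To make the llamafile instructions above concrete, here is a hedged Python sketch of the same download-and-run flow. The release asset URL and the GGUF model path are placeholders (the real asset names are on the releases page linked above, and the exact download URL in the comment is truncated here), and the -m/-p options are the llama.cpp-style flags llamafile is expected to accept; confirm both against the llamafile README before relying on this.

```python
# Sketch of the flow described in the llamafile comment above: fetch a small
# standalone executable (~30 MB, no model baked in), mark it executable, and
# run it against a local GGUF model. ASSET_URL and MODEL_PATH are placeholders,
# not real file names; check the releases page for the current assets.
import os
import stat
import subprocess
import urllib.request

ASSET_URL = "https://github.com/Mozilla-Ocho/llamafile/releases/download/<version>/<asset>"  # placeholder
BINARY = "./llamafile"
MODEL_PATH = "model.gguf"  # placeholder: any GGUF model llama.cpp supports

def download_llamafile() -> None:
    """Download the standalone executable and make it runnable on Unix-likes."""
    urllib.request.urlretrieve(ASSET_URL, BINARY)
    os.chmod(BINARY, os.stat(BINARY).st_mode | stat.S_IEXEC)

def run_prompt(prompt: str) -> str:
    """Run a single prompt; -m selects the model file, -p passes the prompt."""
    result = subprocess.run(
        [BINARY, "-m", MODEL_PATH, "-p", prompt],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

if __name__ == "__main__":
    if not os.path.exists(BINARY):
        download_llamafile()
    print(run_prompt("Explain runtime SIMD dispatch in one sentence."))
```

On Windows the downloaded file needs a .exe extension instead of the chmod step; that detail, like the flag names, should be checked against the llamafile documentation.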
NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times for AMD Zen 4

    3 projects | news.ycombinator.com | 31 Mar 2024
  • Permuting Bits with GF2P8AFFINEQB

    1 project | news.ycombinator.com | 27 Sep 2023
  • Six times faster than C

    4 projects | news.ycombinator.com | 6 Jul 2023
  • AMD EPYC 97x4 “Bergamo” CPUs: 128 Zen 4c CPU Cores for Servers, Shipping Now

    1 project | news.ycombinator.com | 24 Jun 2023
  • 10~17x faster than what? A performance analysis of Intel's x86-simd-sort (AVX-512)

    3 projects | news.ycombinator.com | 10 Jun 2023