Gemma: New Open Models

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • gemma_pytorch

    The official PyTorch implementation of Google's Gemma models

  • https://github.com/google/gemma_pytorch/blob/main/tokenizer/...

    I decoded this model protobuf in Python and diffed it against the Llama 2 tokenizer.
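The decoding step is straightforward with the sentencepiece library; a minimal sketch of such a vocabulary diff (the real loading calls are shown in comments, and the piece sets below are toy stand-ins, not the actual Gemma or Llama 2 vocabularies):

```python
# Sketch: diffing two SentencePiece vocabularies.
# With the real .model protobufs you would load the pieces like this:
#   import sentencepiece as spm
#   sp = spm.SentencePieceProcessor(model_file="tokenizer.model")
#   pieces = {sp.id_to_piece(i) for i in range(sp.get_piece_size())}
# The sets below are toy stand-ins for illustration only.
gemma_pieces = {"<bos>", "<eos>", "<start_of_turn>", "<end_of_turn>", "▁the"}
llama2_pieces = {"<s>", "</s>", "<unk>", "▁the"}

only_in_gemma = sorted(gemma_pieces - llama2_pieces)
only_in_llama2 = sorted(llama2_pieces - gemma_pieces)
shared = sorted(gemma_pieces & llama2_pieces)

print("only in Gemma:", only_in_gemma)
print("only in Llama 2:", only_in_llama2)
print("shared:", shared)
```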

  • ollama

    Get up and running with Llama 3, Mistral, Gemma, and other large language models.

  • Already available in Ollama v0.1.26 preview release, if you'd like to start playing with it locally:

    - https://github.com/ollama/ollama/releases/tag/v0.1.26
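A minimal local session might look like the following sketch (the `gemma:2b` tag is one of the tags Ollama published for Gemma; a 7B tag exists as well):

```shell
# Sketch: pull and run Gemma locally via Ollama (requires v0.1.26 or later).
# Guarded so it is a no-op on machines where ollama is not installed.
if command -v ollama >/dev/null 2>&1; then
  ollama pull gemma:2b
  ollama run gemma:2b "Why is the sky blue?"
else
  echo "ollama not installed; see the release link above"
fi
```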

  • gemma.cpp

    A lightweight, standalone C++ inference engine for Google's Gemma models.

  • They have also implemented the model in their own C++ inference engine: https://github.com/google/gemma.cpp
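A build-and-run sketch based on the gemma.cpp README at its initial release; the flag names and weight filenames may have changed since, and the weights themselves come from Kaggle:

```shell
# Sketch: building and running gemma.cpp (flags as of the initial release).
# Guarded so it is a no-op outside a gemma.cpp checkout.
if [ -f CMakeLists.txt ]; then
  cmake -B build && make -C build -j gemma
  ./build/gemma --tokenizer tokenizer.spm \
    --compressed_weights 2b-it-sfp.sbs --model 2b-it
else
  echo "run inside a gemma.cpp checkout with weights downloaded"
fi
```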

  • gemma

    Open weights LLM from Google DeepMind.

  • We've documented the architecture (including key differences) in our technical report here (https://goo.gle/GemmaReport), and you can see the architecture implementation in our Git Repo (https://github.com/google-deepmind/gemma).

  • llama.cpp

    LLM inference in C/C++

  • It should be possible to run it via llama.cpp[0] now.

    [0] https://github.com/ggerganov/llama.cpp/pull/5631
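A sketch of what running it looks like, assuming a llama.cpp checkout and a GGUF conversion of the weights (the filename below is hypothetical; the inference binary was `./main` at the time of that PR, and is called `llama-cli` in newer trees):

```shell
# Sketch: inference with llama.cpp after Gemma support landed in PR #5631.
# Guarded so it is a no-op where llama.cpp has not been built.
if [ -x ./main ]; then
  ./main -m gemma-7b-it.gguf -p "Why is the sky blue?" -n 128
else
  echo "build llama.cpp first (e.g. 'make -j') and convert the weights to GGUF"
fi
```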

  • text-to-text-transfer-transformer

    Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

  • Google released the T5 paper about 5 years ago:

    https://arxiv.org/abs/1910.10683

    This included full model weights along with a detailed description of the dataset, training process, and ablations that led them to that architecture. T5 was state-of-the-art on many benchmarks when it was released, but it was of course quickly eclipsed by GPT-3.

    Following GPT-3, it became much more common for labs not to release full details or model weights. Before that, it was standard practice at Google (BERT, T5), Meta (BART), OpenAI (GPT-1, GPT-2), and others to release full training details and model weights.
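The core idea of the T5 paper is easy to illustrate: every task is cast as string-in, string-out generation by prepending a task prefix. A toy sketch (the prefixes are examples used in the paper; the helper function is hypothetical):

```python
# Sketch: T5's unified text-to-text framing. Each task becomes plain text
# generation by prepending a task prefix to the input.
def to_text_to_text(task_prefix: str, input_text: str) -> str:
    """Build a T5-style prompt: '<task prefix> <input>'."""
    return f"{task_prefix} {input_text}"

translation = to_text_to_text("translate English to German:", "That is good.")
summary = to_text_to_text("summarize:", "state authorities dispatched crews ...")

print(translation)  # translate English to German: That is good.
print(summary)
```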

  • ai-on-gke

  • There is a lot of work to make the actual infrastructure and lower-level management of large GPU/TPU fleets open as well - my team focuses on making the infrastructure side more approachable on GKE and Kubernetes.

    https://github.com/GoogleCloudPlatform/ai-on-gke/tree/main

    and

    https://github.com/google/xpk (a bit more focused on HPC, but includes AI)

    and

    https://github.com/stas00/ml-engineering (not associated with GKE, but describes training with SLURM)

    The actual training is still a bit of a small pool of very experienced people, but it's getting better. And every day serving models gets that much faster - you can often simply draft on Triton and TensorRT-LLM or vLLM and see significant wins month to month.

  • xpk

    xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool that helps Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
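A hypothetical sketch of the xpk workflow based on its README; exact flags vary by version, and the cluster, workload, and TPU-type names below are made up for illustration:

```shell
# Sketch: creating a cluster and submitting a training workload with xpk.
# Guarded so it is a no-op outside an xpk checkout with gcloud configured.
if command -v python3 >/dev/null 2>&1 && [ -f xpk.py ]; then
  python3 xpk.py cluster create --cluster demo-cluster --tpu-type=v5litepod-16
  python3 xpk.py workload create --cluster demo-cluster \
    --workload demo-job --command "python3 train.py"
else
  echo "run inside an xpk checkout with gcloud configured"
fi
```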


  • ml-engineering

    Machine Learning Engineering Open Book


NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Ask HN: Voice Equivalent to "This Person Does Not Exist"?

    1 project | news.ycombinator.com | 22 May 2024
  • SB-1047 will stifle open-source AI and decrease safety

    2 projects | news.ycombinator.com | 29 Apr 2024
  • Sequence-to-Sequence Toolkit Written in Python

    1 project | news.ycombinator.com | 30 Mar 2024
  • Gemma doesn't suck anymore – 8 bug fixes

    3 projects | news.ycombinator.com | 11 Mar 2024
  • Show HN: LlamaGym – fine-tune LLM agents with online reinforcement learning

    2 projects | news.ycombinator.com | 10 Mar 2024