Llama-cpp-python Alternatives

Similar projects and alternatives to llama-cpp-python

text-generation-webui

876 36,827 9.9 Python llama-cpp-python VS text-generation-webui

A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
llama.cpp

778 57,984 10.0 C++ llama-cpp-python VS llama.cpp

LLM inference in C/C++
InfluxDB

www.influxdata.com featured

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
ollama

209 64,536 9.9 Go llama-cpp-python VS ollama

Get up and running with Llama 3, Mistral, Gemma, and other large language models.
gpt4all

139 64,901 9.8 C++ llama-cpp-python VS gpt4all

gpt4all: run open-source LLMs anywhere
mlc-llm

89 17,150 9.9 Python llama-cpp-python VS mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
LocalAI

83 20,076 9.9 C++ llama-cpp-python VS LocalAI

:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. It allows to generate Text, Audio, Video, Images. Also with voice cloning capabilities.
FastChat

83 34,514 9.6 Python llama-cpp-python VS FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
ggml

69 9,802 9.8 C llama-cpp-python VS ggml

Tensor library for machine learning
aider

64 9,914 9.9 Python llama-cpp-python VS aider

aider is AI pair programming in your terminal
khoj

50 4,885 9.9 Python llama-cpp-python VS khoj

Your AI second brain. A copilot to get answers to your questions, whether they be from your own notes or from the internet. Use powerful, online (e.g gpt4) or private, local (e.g mistral) LLMs. Self-host locally or use our web app. Access from Obsidian, Emacs, Desktop app, Web or Whatsapp.
alpaca_lora_4bit

41 529 8.6 Python llama-cpp-python VS alpaca_lora_4bit
text-generation-inference

29 7,995 9.6 Python llama-cpp-python VS text-generation-inference

Large Language Model Text Generation Inference
refact

34 1,428 9.8 JavaScript llama-cpp-python VS refact

WebUI for Fine-Tuning and Self-hosting of Open-Source Large Language Models for Coding
basaran

22 1,281 10.0 Python llama-cpp-python VS basaran

Discontinued Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-based text generation models.
continue

18 11,309 10.0 TypeScript llama-cpp-python VS continue

⏩ Open-source VS Code and JetBrains extensions that enable you to easily create your own modular AI software development system
TensorRT-LLM

14 6,705 8.4 C++ llama-cpp-python VS TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
intel-extension-for-pytorch

16 1,365 9.7 Python llama-cpp-python VS intel-extension-for-pytorch

A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
localLLM_guidance

3 147 4.2 Jupyter Notebook llama-cpp-python VS localLLM_guidance

Local LLM ReAct Agent with Guidance
gpt4all-chat

3 1,188 9.3 C++ llama-cpp-python VS gpt4all-chat

Discontinued gpt4all-j chat
KoboldAI

58 150 8.6 Python llama-cpp-python VS KoboldAI
SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better llama-cpp-python alternative or higher similarity.

Suggest an alternative to llama-cpp-python

llama-cpp-python reviews and mentions

Posts with mentions or reviews of llama-cpp-python. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-28.

Ollama v0.1.33 with Llama 3, Phi 3, and Qwen 110B
11 projects | news.ycombinator.com | 28 Apr 2024

There's a Python binding for llama.cpp which is actively maintained and has worked well for me: https://github.com/abetlen/llama-cpp-python
FLaNK AI for 11 March 2024
46 projects | dev.to | 11 Mar 2024
OpenAI: Memory and New Controls for ChatGPT
4 projects | news.ycombinator.com | 13 Feb 2024

I'll share the core bit that took a while to figure out the right format, my main script is a hot mess using embeddings with SentenceTransformer, so I won't share that yet. E.g: last night I did a PR for llama-cpp-python that shows how Phi might be used with JSON only for the author to write almost exactly the same code at pretty much the same time. https://github.com/abetlen/llama-cpp-python/pull/1184
TinyLlama LLM: A Step-by-Step Guide to Implementing the 1.1B Model on Google Colab
2 projects | dev.to | 6 Jan 2024

Python Bindings for llama.cpp
Mistral-8x7B-Chat
4 projects | news.ycombinator.com | 10 Dec 2023
Running Mistral LLM on Apple Silicon Using Apple's MLX Framework Is Much Faster
2 projects | news.ycombinator.com | 6 Dec 2023

If the model could be made to work with llama.cpp, then https://github.com/abetlen/llama-cpp-python might be more compact. llama.cpp only supports a limited list of model types though.
Run ChatGPT-like LLMs on your laptop in 3 lines of code
9 projects | news.ycombinator.com | 6 Sep 2023
Code Llama, a state-of-the-art large language model for coding
4 projects | news.ycombinator.com | 24 Aug 2023

https://github.com/abetlen/llama-cpp-python has a web server mode that replicates openai's API iirc and the readme shows it has docker builds already.
Meta: Code Llama, an AI Tool for Coding
18 projects | news.ycombinator.com | 24 Aug 2023

LocalAI https://localai.io/ and LMStudio https://lmstudio.ai/ both have fairly complete OpenAI compatibility layers. llama-cpp-python has a FastAPI server as well: https://github.com/abetlen/llama-cpp-python/blob/main/llama_... (as of this moment it hasn't merged GGUF update yet though)
First steps with llama
2 projects | dev.to | 31 Jul 2023

I went with Python, llama-cpp-python, since my goal is just to get a small project up and running locally.
A note from our sponsor - InfluxDB
www.influxdata.com | 13 May 2024

Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality. Learn more →