Is there any open source app to load a model and expose API like OpenAI?

InfluxDB - Power Real-Time Data Analytics at Scale

Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

www.influxdata.com

featured

SaaSHub - Software Alternatives and Reviews

SaaSHub helps you find the best software and product alternatives

www.saashub.com

featured

llm-api

2 145 6.6 Python

Run any Large Language Model behind a unified API

1b5d/llm-api: Run any Large Language Model behind a unified API (github.com)

litellm

28 8,696 10.0 Python

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

I use this with ollama and works perfectly https://github.com/BerriAI/litellm

InfluxDB

www.influxdata.com featured

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
LocalAI

83 20,346 9.9 C++

:robot: The free, Open Source OpenAI alternative. Self-hosted, community-driven and local-first. Drop-in replacement for OpenAI running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more models architectures. It allows to generate Text, Audio, Video, Images. Also with voice cloning capabilities.
text-generation-inference

29 7,995 9.6 Python

Large Language Model Text Generation Inference
server

24 7,414 9.5 Python

The Triton Inference Server provides an optimized cloud and edge inferencing solution. (by triton-inference-server)
SaaSHub

www.saashub.com featured

SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

best way to serve llama V2 (llama.cpp VS triton VS HF text generation inference)

3 projects | /r/LocalLLaMA | 25 Sep 2023
Hugging Face reverts the license back to Apache 2.0

1 project | news.ycombinator.com | 8 Apr 2024
FLaNK Stack 05 Feb 2024

49 projects | dev.to | 5 Feb 2024
AI Code assistant for about 50-70 users

4 projects | /r/LocalLLaMA | 6 Dec 2023
"A matching Triton is not available"

1 project | /r/StableDiffusion | 15 Oct 2023

Is there any open source app to load a model and expose API like OpenAI?

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA
Inference Bloom GPU NLP Machine Learning
Post date: 9 Dec 2023

llm-api

litellm

InfluxDB

LocalAI

text-generation-inference

server

SaaSHub

Related posts

best way to serve llama V2 (llama.cpp VS triton VS HF text generation inference)

Hugging Face reverts the license back to Apache 2.0

FLaNK Stack 05 Feb 2024

AI Code assistant for about 50-70 users

"A matching Triton is not available"

Is there any open source app to load a model and expose API like OpenAI?

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA Inference Bloom GPU NLP Machine Learning Post date: 9 Dec 2023

llm-api

litellm

InfluxDB

LocalAI

text-generation-inference

server

SaaSHub

Related posts

best way to serve llama V2 (llama.cpp VS triton VS HF text generation inference)

Hugging Face reverts the license back to Apache 2.0

FLaNK Stack 05 Feb 2024

AI Code assistant for about 50-70 users

"A matching Triton is not available"

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA
Inference Bloom GPU NLP Machine Learning
Post date: 9 Dec 2023