| | llama3 | promptfoo |
|---|---|---|
| Mentions | 19 | 5 |
| Stars | 20,272 | 328 |
| Growth | 15.0% | - |
| Activity | 8.9 | 10.0 |
| Last commit | 8 days ago | 11 months ago |
| Language | Python | TypeScript |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
llama3
- Hindi-Language AI Chatbot for Enterprises Using Qdrant, MLFlow, and LangChain
Now, let's start building the next part of the chatbot. In this part, we will use the LLM from Ollama and integrate it with the chatbot. More specifically, we will use the Llama-3 model. Llama 3 is Meta's latest and most advanced open-source large language model (LLM). It is the successor to Llama 2 and represents a significant improvement in performance across a variety of benchmarks and tasks. Llama 3 comes in two main sizes: an 8-billion-parameter model and a 70-billion-parameter model, and it supports context lengths of up to 8,000 tokens.
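As a minimal sketch of the integration step described above, the snippet below calls a locally served Llama 3 model through Ollama's REST API (the documented `/api/generate` endpoint) using only the standard library. It assumes an Ollama server is already running on the default port with the model pulled (`ollama pull llama3`); in the actual chatbot this call would sit behind a LangChain chain rather than be invoked directly.

```python
import json
import urllib.request

# Assumption: a local Ollama server is running on its default port (11434)
# and the Llama 3 model has been pulled with `ollama pull llama3`.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    # stream=False asks Ollama for a single JSON response
    # instead of a stream of partial tokens.
    return {"model": model, "prompt": prompt, "stream": False}


def ask_llama3(prompt: str) -> str:
    """Send a prompt to the local Llama 3 model and return its reply."""
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

For example, `ask_llama3("नमस्ते, आप कैसे हैं?")` would return the model's Hindi reply as a plain string, which the chatbot can then post-process.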
- FLaNK AI-April 22, 2024
- Meta Llama 3 GitHub
- Mark Zuckerberg himself appears in the list of direct contributors to Llama 3
- Mark Zuckerberg: Llama 3, $10B Models, Caesar Augustus, Bioweapons [video]
- Llama 3 in [8B and 70B] sizes is out
What is fascinating is how the smaller 8B version outperformed the bigger previous-gen 70B model in every benchmark listed on the model card:
- Llama 3 GitHub Repository
- Meta Llama 3
promptfoo
- Ollama v0.1.33 with Llama 3, Phi 3, and Qwen 110B
Jumping in because I'm a big believer in (1) local LLMs, and (2) evals specific to individual use cases.
[0] https://github.com/typpo/promptfoo
- Meta Llama 3
- Launch HN: Talc AI (YC S23) – Test Sets for AI
Congrats on the launch!
I've been interested in automatic test-set generation because I find that the chore of writing tests is one of the reasons people shy away from evals. Recently landed eval test-set generation for promptfoo (https://github.com/typpo/promptfoo), but it is non-RAG, so it's simpler than your implementation.
Was also eyeballing this paper https://arxiv.org/abs/2401.03038, which outlines a method for generating asserts from prompt version history that may also be useful for these eval tools.
- GPT-Prompt-Engineer
Thanks for the promptfoo mention. For anyone else who might prefer deterministic, programmatic evaluation of LLM outputs, I've been building promptfoo: https://github.com/typpo/promptfoo
Example asserts include basic string checks, regex, is-json, cosine similarity, etc.
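A minimal promptfoo config illustrating the assert types listed above might look like the following sketch; the provider string, model name, and prompt are placeholder assumptions, not taken from the original post.

```yaml
# promptfooconfig.yaml — a hypothetical sketch, not from the original post.
prompts:
  - "Summarize in one sentence: {{text}}"
providers:
  - ollama:llama3   # assumes a local Ollama server with llama3 pulled
tests:
  - vars:
      text: "Llama 3 comes in 8B and 70B sizes."
    assert:
      - type: contains      # basic string check
        value: "Llama 3"
      - type: regex         # regular-expression match
        value: "8B|70B"
      - type: similar       # cosine similarity against a reference answer
        value: "Llama 3 is available in two sizes."
        threshold: 0.8
```

An `is-json` assert would be added the same way for prompts that are expected to return JSON. Running `npx promptfoo eval` against a config like this evaluates each test case deterministically, which is the programmatic style of evaluation the comment describes.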
What are some alternatives?
- rebuff - LLM Prompt Injection Detector
- gpt-engineer - Specify what you want it to build, the AI asks for clarification, and then builds it.
- ChainForge - An open-source visual programming environment for battle-testing prompts to LLMs.
- gateway - A Blazing Fast AI Gateway. Route to 100+ LLMs with 1 fast & friendly API.
- shap-e - Generate 3D objects conditioned on text or images
- sugarcane-ai - npm-like package ecosystem for Prompts 🤖
- plandex - AI-driven development in your terminal. Designed for large, real-world tasks.
- gpt-prompt-engineer
- cloudseeder - One-click install internet appliances that operate on your terms. Transform your home computer into a sovereign and secure cloud.
- ollama - Get up and running with Llama 3, Mistral, Gemma, and other large language models.
- llama-chat - Implements a simple REPL chat with a locally running instance of Ollama.
- TensorRT-LLM - TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.