LLMTest_NeedleInAHaystack
open_router
| | LLMTest_NeedleInAHaystack | open_router |
|---|---|---|
| Mentions | 4 | 1 |
| Stars | 1,065 | 58 |
| Growth | - | - |
| Activity | 8.4 | 6.5 |
| Last commit | 23 days ago | 9 days ago |
| Language | Jupyter Notebook | Ruby |
| License | GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
LLMTest_NeedleInAHaystack
- Claude 3 beats GPT-4 on Aider's code editing benchmark – aider
- Our next-generation model: Gemini 1.5
- GPT-4 vs Claude-2 context recall analysis
  This research follows the “haystack test” Greg Kamradt published when the updated GPT-4 came out (twitter, code). That test provided useful insight into (the lack of) context recall performance. But it was performed on a very small sample (limiting its statistical significance) and was initially limited to GPT-4 (he has since published an updated version that also uses Claude 2.1). Moreover, the test data consists of essays that were likely already used in pretraining LLMs, and the results were evaluated by GPT-4, potentially introducing confounding variables into the mix.
- Analysis to test in-context retrieval ability of GPT-4-128K context
open_router
- Claude 3 beats GPT-4 on Aider's code editing benchmark – aider
  I’ve been using it in production and it works great. Makes a world of open source models just as easy to use as OpenAI. Here’s my Ruby gem for it: https://github.com/OlympiaAI/open_router
What are some alternatives?
rag-stack - 🤖 Deploy a private ChatGPT alternative hosted within your VPC. 🔮 Connect it to your organization's knowledge base and use it as a corporate oracle. Supports open-source LLMs like Llama 2, Falcon, and GPT4All.
SillyTavern - LLM Frontend for Power Users.
Lobe Chat - LobeChat is an open-source, extensible (Function Calling), high-performance chatbot framework. It supports one-click free deployment of your private ChatGPT/LLM web application.
OpenCodeInterpreter - OpenCodeInterpreter is a suite of open-source code generation systems aimed at bridging the gap between large language models and sophisticated proprietary systems like the GPT-4 Code Interpreter. It significantly enhances code generation capabilities by integrating execution and iterative refinement functionalities.
ChatterUI - Simple frontend for LLMs built in react-native.