LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy (by gkamradt)
rag-stack
🤖 Deploy a private ChatGPT alternative hosted within your VPC. 🔮 Connect it to your organization's knowledge base and use it as a corporate oracle. Supports open-source LLMs like Llama 2, Falcon, and GPT4All. (by psychic-api)
| | LLMTest_NeedleInAHaystack | rag-stack |
|---|---|---|
| Mentions | 4 | 4 |
| Stars | 1,065 | 1,416 |
| Growth | - | 1.6% |
| Activity | 8.4 | 8.3 |
| Latest commit | 23 days ago | 8 months ago |
| Language | Jupyter Notebook | TypeScript |
| License | GNU General Public License v3.0 or later | MIT License |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
LLMTest_NeedleInAHaystack
Posts with mentions or reviews of LLMTest_NeedleInAHaystack.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-03-27.
- Claude 3 beats GPT-4 on Aider's code editing benchmark – aider
- Our next-generation model: Gemini 1.5
- GPT-4 vs Claude-2 context recall analysis
This research follows the “haystack test” Greg Kamradt published when the updated GPT-4 came out (twitter, code). That test provided useful insight into (the lack of) context recall performance. But it was performed on a very small sample (limiting its statistical significance) and was initially limited to GPT-4 (he has since published an updated version that also covers Claude 2.1). Moreover, the test data consists of essays that were likely already used in pretraining LLMs, and the results were evaluated by GPT-4, potentially introducing confounding variables into the mix.
- Analysis to test in-context retrieval ability of GPT-4-128K context
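The haystack test discussed in these posts can be sketched as a small harness: insert a "needle" sentence at varying fractional depths of a long filler context, ask the model about it, and score whether the answer recovers the needle. The sketch below is an assumption about the harness shape, not the repo's actual code; `mock_llm` is a trivial substring-matching stand-in where a real run would call an LLM API with the same prompt.

```python
# Needle-in-a-haystack harness sketch (illustrative, not the repo's code).
# `mock_llm` is a stand-in for a real LLM API call.

def build_context(filler: str, needle: str, depth: float) -> str:
    """Insert the needle at a fractional depth (0.0-1.0) of the filler text."""
    pos = int(len(filler) * depth)
    return filler[:pos] + " " + needle + " " + filler[pos:]

def mock_llm(prompt: str) -> str:
    # Stand-in "model": answers correctly iff the needle is present in the
    # prompt. A real model may fail at certain depths or context lengths.
    if "Dolores Park" in prompt:
        return "Eat a sandwich at Dolores Park."
    return "I don't know."

def run_test(filler: str, needle: str, question: str, depths) -> dict:
    """Score recall at each depth; here scored by a simple string match."""
    results = {}
    for depth in depths:
        context = build_context(filler, needle, depth)
        prompt = f"{context}\n\nQuestion: {question}"
        results[depth] = "Dolores Park" in mock_llm(prompt)
    return results

filler = "Lorem ipsum dolor sit amet. " * 200
needle = "The best thing to do in San Francisco is eat a sandwich at Dolores Park."
scores = run_test(filler, needle,
                  "What is the best thing to do in San Francisco?",
                  [0.0, 0.5, 1.0])
print(scores)
```

A real harness would sweep many context lengths and depths and, as the post above notes, ideally use a scoring method independent of the model under test.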
rag-stack
Posts with mentions or reviews of rag-stack.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2024-02-15.
- Our next-generation model: Gemini 1.5
Retrieval augmented generation.
> Retrieval Augmented Generation (RAG) is a technique where the capabilities of a large language model (LLM) are augmented by retrieving information from other systems and inserting it into the LLM’s context window via a prompt.
(stolen from: https://github.com/psychic-api/rag-stack)
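The RAG pattern quoted above can be sketched in a few lines: retrieve the passages most relevant to the user's question, then insert them into the prompt ahead of the question. This is a minimal sketch using naive keyword-overlap retrieval; a real stack like RAGstack uses a vector database for retrieval and sends the assembled prompt to an LLM, both of which are stubbed out here.

```python
# Minimal RAG sketch: naive retrieval + prompt assembly (illustrative only).

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (vector-DB stand-in)."""
    q_words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Insert retrieved passages into the LLM's context window via a prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "RAGstack deploys open-source LLMs like Llama 2 and Falcon to a VPC.",
    "The cafeteria menu changes every Tuesday.",
    "Qdrant is a vector database often used for retrieval.",
]
prompt = build_prompt("Which LLMs does RAGstack deploy?", docs)
print(prompt)
```

The assembled prompt would then be sent to the LLM, which answers grounded in the retrieved passages rather than its training data alone.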
- Anyone here managed to set up the self-hosted version of RAGstack?
I spent hours trying to make it work on Ubuntu, solving one error message after another, but I couldn't get it running. Has anyone managed? https://github.com/psychic-api/rag-stack
- Show HN: ChatMyFiles, Open Source ChatPDF
- (Optional) Open-source LLM such as Falcon, Llama, or GPT4All, deployed to your virtual private cloud (VPC).
We used Langchain to interface with open-source LLMs and RAGstack to deploy to Google Cloud: https://github.com/psychic-api/rag-stack
- Show HN: RAGstack – private ChatGPT for enterprise VPCs, built with Llama 2
What are some alternatives?
When comparing LLMTest_NeedleInAHaystack and rag-stack you can also consider the following projects:
open_router - Ruby library for OpenRouter API