-
deepeval
Discontinued Unit Testing For LLMs [Moved to: https://github.com/confident-ai/deepeval] (by mr-gpt)
-
agentops
Open-source Python SDK for agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks such as CrewAI, LangChain, and AutoGen.
-
promptfoo
Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
-
agenta
The all-in-one LLM developer platform: prompt management, evaluation, human feedback, and deployment all in one place.
-
ai-notes
Notes for software engineers getting up to speed on new AI developments. Serves as a datastore for https://latent.space writing and product brainstorming, with cleaned-up canonical references under the /Resources folder.
I'd add ours too, although we're trying to be an end-to-end one-stop platform.
https://github.com/agenta-ai/agenta
Added to my notes! https://github.com/swyxio/ai-notes/