Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM.
Why do you think that https://github.com/tlkh/tf-metal-experiments is a good alternative to llm-colosseum?