Show HN: Chess-LLM, using constrained-generation to force LLMs to battle it out


outlines

33 6,086 9.7 Python

Structured Text Generation

While playing with the Outlines library (https://outlines-dev.github.io/outlines/), my friend Maxime and I talked about how funny it would be to pair LLMs in chess matches and let them play until one wins. The first time I tried it, it took substantial prompt engineering to get some of those LLMs to propose valid moves. Large language models can mostly stay focused and even play rather well; see https://news.ycombinator.com/item?id=37616170 for an example. Small language models, however, aren't as easy to convince.
Some of those LLMs have seen very little chess notation, so after the first few opening moves they show no sound tactics, let alone strategy. They would end up either repeating the same move or hallucinating moves that are not valid (Kxe5, but there would be a queen on e5!).
Then Outlines came along and we could force them to pick valid moves at little cost! Maxime worked super fast and built a first version of this idea as a Gradio space.
I think it is pretty fun to watch the (mostly terrible, but at least valid) chess that those LLMs play. Maybe it will even teach us something about how to create small LLMs that play much better than the ones on the leaderboard.
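To make the "force them to pick valid moves" idea concrete, here is a minimal sketch of the principle. It is not the code behind the space and not the Outlines API: the function names and the toy scores are made up for illustration, and the list of legal moves is hard-coded (in practice it could come from a chess library). Real constrained generation restricts the model token by token rather than picking from a finished list, but the effect is the same: a move outside the allowed set can never be selected.

```python
from typing import Dict, Sequence


def unconstrained_pick(scores: Dict[str, float]) -> str:
    """Pick whatever the model scores highest, legal or not."""
    return max(scores, key=scores.get)


def constrained_pick(scores: Dict[str, float], legal_moves: Sequence[str]) -> str:
    """Pick the highest-scoring move among the legal ones only."""
    if not legal_moves:
        raise ValueError("no legal moves available; the game is over")
    # Legal moves the model never scored get -inf, so they lose to any scored legal move.
    return max(legal_moves, key=lambda move: scores.get(move, float("-inf")))


# Hypothetical scores from a small model: its favourite move, Kxe5, is illegal here.
toy_scores = {"Kxe5": 0.9, "Kd6": 0.6, "Kf6": 0.3}
# Hypothetical legal-move list for the position (hard-coded for the sketch).
legal = ["Kd6", "Kf6", "Kd7"]

print(unconstrained_pick(toy_scores))       # Kxe5 (invalid move)
print(constrained_pick(toy_scores, legal))  # Kd6 (best move that is actually legal)
```

The constraint does nothing for playing strength: the model still decides which legal move it prefers, which is why the games are valid but mostly terrible.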
Anyway, you can check it out here:
https://huggingface.co/spaces/mlabonne/chessllm
What is interactive about it: you can pick the LLMs from the models available on Hugging Face (within reason: small LLMs are preferable so that the space does not crash), or push one of your own small models to HF and have it fight the others. At the end of each game the leaderboard is updated.
Hope you find it fun!





This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com. Post date: 14 Mar 2024
