A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.