Minimal C implementation of speculative decoding, based on llama2.c
Why do you think that https://github.com/codeplea/genann is a good alternative to speculative_decoding.c?