| | coral-pi-rest-server | rwkv.cpp |
|---|---|---|
| Mentions | 44 | 12 |
| Stars | 66 | 1,111 |
| Growth | - | 2.6% |
| Activity | 0.0 | 6.8 |
| Last commit | 7 months ago | 27 days ago |
| Language | Jupyter Notebook | C++ |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
coral-pi-rest-server
- BeagleY-AI: 4 TOPS-capable $70 board from Beagleboard
- Do you recommend Orange PI for ML or LLM projects?
- Framework for machine learning?
That said, you can always look at something like https://coral.ai/products/accelerator/ to help with the performance you need.
- Mini PC for AI
Should only be ~$60 https://coral.ai/products/accelerator/
- What are some USB devices worth using in a Home Lab Environment?
The Coral USB accelerator might be of interest if you want to do some light ML with a low power budget.
- Is a PCIe x1 enough for light ML tasks?
- Would I be able to run ggml models such as whisper.cpp or llama.cpp on a Raspberry Pi with a Coral AI USB Accelerator?
However, a Pi doesn't have the strength to run something like llama.cpp on its own, of course, so I've been considering using something like the Coral USB Accelerator (https://coral.ai/products/accelerator). As I've been learning more about it, it seems very geared towards TensorFlow Lite models. But whisper.cpp and llama.cpp use ggml models.
- Looking for a Mini PC for Home Assistant and Frigate.
- AI development suite on a stick?
- Modder wires ChatGPT into Skyrim VR so NPCs can roleplay and remember past conversations
Recently found this thing, though I haven't found a use case for me.
rwkv.cpp
- Eagle 7B: Soaring past Transformers
There's https://github.com/saharNooby/rwkv.cpp, which is related-ish[0] to ggml/llama.cpp
[0]: https://github.com/ggerganov/llama.cpp/issues/846
- People who've used RWKV, whats your wishlist for it?
- The Eleuther AI Mafia
Quantisation is thankfully as applicable to RWKV as it is to transformers, most notably in our RWKV.cpp community project: https://github.com/saharNooby/rwkv.cpp
Tooling/ecosystem is something I am actively working on, as there is still a gap to transformer-level tooling. But I'm glad that there is a noticeable difference!
And yes, experiments are important to ensure improvements in the architecture, even if "Linear Transformers" replace "Transformers". Alternatives should always be explored, to learn from such trade-offs for the benefit of the ecosystem.
(This was lightly covered in the podcast, where I share that, in my opinion, we should have more research into text-based diffusion networks.)
- Tiny models for contextually coherent conversations?
- New model: RWKV-4-Raven-7B-v12-Eng49%-Chn49%-Jpn1%-Other1%-20230530-ctx8192.pth
Q8_0 models: only for https://github.com/saharNooby/rwkv.cpp (fast CPU).
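Q8_0 refers to ggml's 8-bit block quantization: weights are split into blocks (32 values per block in ggml), each stored as one float scale plus a signed 8-bit integer per value. A minimal sketch of the idea in plain Python (the helper names are made up; ggml's real format also packs the scale as fp16):

```python
def quantize_q8_0(block):
    """Quantize one block of floats, Q8_0-style: one shared scale
    plus a signed 8-bit integer per value."""
    amax = max(abs(x) for x in block)      # largest magnitude in the block
    scale = amax / 127.0 if amax else 0.0  # map [-amax, amax] onto [-127, 127]
    quants = [max(-127, min(127, round(x / scale))) if scale else 0
              for x in block]
    return scale, quants

def dequantize_q8_0(scale, quants):
    """Recover approximate floats from the scale and int8 values."""
    return [scale * q for q in quants]
```

Round-tripping a block loses at most about half a quantization step (scale / 2) per value, which is why Q8_0 is close to lossless in practice while needing roughly a quarter of the fp32 storage.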
- [R] RWKV: Reinventing RNNs for the Transformer Era
- 4096 Context length (and beyond)
There's https://github.com/saharNooby/rwkv.cpp which seems to work, and might be compatible with text-generation-webui.
- The Coming of Local LLMs
Also worth checking out https://github.com/saharNooby/rwkv.cpp which is based on Georgi's library and offers support for the RWKV family of models which are Apache-2.0 licensed.
- KoboldCpp - Combining all the various ggml.cpp CPU LLM inference projects with a WebUI and API (formerly llamacpp-for-kobold)
I'm most interested in that last one. I've heard the RWKV models are very fast, don't need much RAM, and can handle huge context lengths, so maybe their 14B can work for me. I wasn't sure how ready for use they were, but looking into it more, projects like rwkv.cpp, ChatRWKV, and a whole lot of other community projects are mentioned on their GitHub.
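The "fast, low RAM, huge context" properties come from RWKV being a recurrent network: at inference time it carries a fixed-size state forward instead of a key/value cache that grows with every token. A heavily simplified toy of that recurrence idea (not the actual RWKV time-mixing formulas; `wkv_stream` and its parameters are illustrative):

```python
import math

def wkv_stream(keys, values, decay=0.9):
    """Toy RWKV-style recurrence: an exponentially decayed weighted
    average over all past tokens, maintained in O(1) state (one
    numerator and one denominator) rather than an O(n) KV cache."""
    num, den = 0.0, 0.0
    outputs = []
    for k, v in zip(keys, values):
        w = math.exp(k)            # attention-like weight for this token
        num = decay * num + w * v  # decay old contributions, add new one
        den = decay * den + w
        outputs.append(num / den)  # normalized mixture of everything seen
    return outputs
```

With decay=1.0 this reduces to a running weighted mean; context length only affects how many steps you iterate, never the memory footprint, which is why rwkv.cpp can offer long contexts cheaply on CPU.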
- rwkv.cpp: FP16 & INT4 inference on CPU for RWKV language model (r/MachineLearning)
What are some alternatives?
alpaca.cpp - Locally run an Instruction-Tuned Chat-Style LLM
llama.cpp - LLM inference in C/C++
double-take - Unified UI and API for processing and training images for facial recognition.
RWKV-LM - RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, saves VRAM, fast training, "infinite" ctx_len, and free sentence embedding.
rpi-urban-mobility-tracker - The easiest way to count pedestrians, cyclists, and vehicles on edge computing devices or live video feeds.
ChatRWKV - ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
opentts - Open Text to Speech Server
mpt-30B-inference - Run inference on MPT-30B using CPU
HASS-coral-rest-api - Coral REST API for HASS
verbaflow - Neural Language Model for Go
os-nvr