Top 23 Jupyter Notebook AI Projects
-
generative-ai-for-beginners
18 Lessons, Get Started Building with Generative AI: https://microsoft.github.io/generative-ai-for-beginners/
-
h4cker
This repository is primarily maintained by Omar Santos (@santosomar) and includes thousands of resources related to ethical hacking, bug bounties, digital forensics and incident response (DFIR), artificial intelligence security, vulnerability research, exploit development, reverse engineering, and more.
-
dopamine
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
-
generative-ai
Sample code and notebooks for Generative AI on Google Cloud, with Gemini on Vertex AI (by GoogleCloudPlatform)
-
Dreambooth-Stable-Diffusion
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion (https://arxiv.org/abs/2112.10752). Tweaks focused on training faces, objects, and styles. (by JoePenna)
-
machine-learning-experiments
Interactive Machine Learning experiments: models training + models demo
-
vertex-ai-samples
Sample code and notebooks for Vertex AI, the end-to-end machine learning platform on Google Cloud
-
imodels
Interpretable ML package for concise, transparent, and accurate predictive modeling (sklearn-compatible).
-
tensor-house
A collection of reference Jupyter notebooks and demo AI/ML applications for enterprise use cases: marketing, pricing, supply chain, smart manufacturing, and more.
-
Deep-Learning-In-Production
Build, train, deploy, scale and maintain deep learning models. Understand ML infrastructure and MLOps using hands-on examples.
-
chameleon-llm
Codes for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
-
bark
BARK INFINITY GUI CMD: powered-up Bark text-prompted generative audio model (by JonathanFly)
-
PConv-Keras
Unofficial implementation of "Image Inpainting for Irregular Holes Using Partial Convolutions". Try at: www.fixmyphoto.ai
Generative AI For Beginners: a collection of resources to learn about Generative AI, including tutorials, code samples, and more.
Project mention: Show HN: Next-token prediction in JavaScript - build fast LLMs from scratch | news.ycombinator.com | 2024-04-10
People on here will be happy to say that I do a similar thing, though my sequence length is dynamic because I also use a second data structure. To use pretentious academic speak: I use a simple bigram LM (2-gram) for single next-word likelihood and, separately, a trie that models all words and phrases (so, n-gram). I'm not sure how many total nodes there are because sentence lengths vary in the training data, but with about 200,000 entry points (keys) there are probably 2-10 million total nodes in the default setup.
"Constructing 7-gram LM": They likely started with bigrams (what I use), which only tell you the next word based on one word given, then thought to increase accuracy by modeling more words in a sequence, and eventually let the user (developer) pass in any amount they want to model (https://github.com/google-research/google-research/blob/5c87...).
I thought of this too at first, but I actually got more accuracy (and speed) out of keeping them as bigrams and making a totally separate structure that models an n-gram of all phrases (it could be a 24-token sequence or 100+ tokens; I model it all), and if that phrase is found, I just take the bigram assumption of the last token of the phrase.
This works better when the training data is diverse (for a very generic model), but theirs would probably outperform mine on accuracy when the training data has many nearly identical sentences that only change wildly toward the end. I don't find this pattern in typical data, though maybe certain coding and other tasks have it. But because theirs isn't dynamic and makes you provide that number, even a low one (any phrase longer than 2 words), it will always have to do more lookup work than simple bigrams, and it's also limited by that fixed number as far as accuracy goes. I wonder how scalable that is: if I need to train on occasional ~100-word sentences but also (and mostly) ~3-word sentences, I guess I set this to 100 and have a mostly "undefined" trie.
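The bigram-plus-phrase-trie scheme the commenter describes can be sketched roughly as follows. This is a minimal illustration only: the actual library is JavaScript, and the class and method names here are invented for the sketch.

```python
from collections import defaultdict

class BigramTrieLM:
    """Sketch of the commenter's approach: a plain bigram table for
    single-step next-word prediction, plus a separate trie that models
    whole phrases of any length. When the context walks off the trie,
    prediction backs off to the bare bigram of the last token."""

    def __init__(self):
        self.bigrams = defaultdict(lambda: defaultdict(int))
        self.trie = {}  # nested dict, one level per token

    def train(self, tokens):
        # Bigram counts: next-word likelihood given exactly one word.
        for a, b in zip(tokens, tokens[1:]):
            self.bigrams[a][b] += 1
        # The trie models the full token sequence ("an n-gram of all phrases").
        node = self.trie
        for tok in tokens:
            node = node.setdefault(tok, {})

    def predict(self, context):
        # Walk the trie along the context to check for a known phrase;
        # either way, the final step is the bigram of the last token.
        node = self.trie
        for tok in context:
            if tok not in node:
                break  # unknown phrase: back off to the bigram alone
            node = node[tok]
        candidates = self.bigrams.get(context[-1])
        return max(candidates, key=candidates.get) if candidates else None

lm = BigramTrieLM()
lm.train("the cat sat on the mat".split())
print(lm.predict(["the", "cat"]))  # sat
```

Note the design trade-off the comment points at: the phrase trie costs memory, but lookup stays cheap because the final prediction is always a single bigram step.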
I also thought of the name "LMJS", theirs is "jslm" :) but I went with simply "next-token-prediction" because that's what it ultimately does as a library. I don't know what theirs is really designed for other than proving a concept. Most of their code files are actually comments and hypothetical scenarios.
I recently added a browser example showing simple autocomplete using my library: https://github.com/bennyschmidt/next-token-prediction/tree/m... (video)
Next I'm implementing 8-dimensional embeddings, converted to normalized vectors between 0 and 1, to see if doing math on them does anything useful beyond similarity. Right now they look like this:
[nextFrequency, prevalence, specificity, length, firstLetter, lastLetter, firstVowel, lastVowel]
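The comment names eight per-token features but not how each is computed, so the formulas below are illustrative guesses only (the original code is JavaScript and these definitions are not published). A Python sketch that squashes each feature into [0, 1]:

```python
VOWELS = "aeiou"

def letter01(ch):
    # Map 'a'..'z' onto [0, 1]; non-letters map to 0.
    return (ord(ch) - ord("a")) / 25 if ch.isalpha() else 0.0

def embed(word, corpus):
    """Illustrative 8-d embedding. Feature names come from the comment;
    every formula here is an assumption made for the sketch."""
    w = word.lower()
    counts = {}
    for tok in corpus:
        counts[tok] = counts.get(tok, 0) + 1
    vowel_positions = [i for i, c in enumerate(w) if c in VOWELS]
    n = len(w)
    return [
        # nextFrequency: share of bigrams in which this word is the second token
        sum(1 for a, b in zip(corpus, corpus[1:]) if b == w) / max(len(corpus) - 1, 1),
        # prevalence: share of corpus tokens that are this word
        counts.get(w, 0) / len(corpus),
        # specificity: inverse prevalence (rarer = more specific)
        1 - counts.get(w, 0) / len(corpus),
        min(n, 20) / 20,                # length, capped at 20 characters
        letter01(w[0]),                 # firstLetter
        letter01(w[-1]),                # lastLetter
        (vowel_positions[0] + 1) / n if vowel_positions else 0.0,  # firstVowel position
        (vowel_positions[-1] + 1) / n if vowel_positions else 0.0, # lastVowel position
    ]

corpus = "the cat sat on the mat".split()
vec = embed("cat", corpus)
print(len(vec))  # 8
```

Because every component lands in [0, 1], vector arithmetic (distances, averages) stays well-scaled across features without a separate normalization pass.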
https://github.com/microsoft/AI-For-Beginners
https://microsoft.github.io/AI-For-Beginners/
Deci's YOLO-NAS Pose: redefining pose estimation! Elevating healthcare, sports, tech, and robotics with precision and speed. GitHub and blog links below! Repo: https://github.com/spmallick/learnopencv/tree/master/YOLO-NAS-Pose
Project mention: The Era of 1-bit LLMs: ternary parameters for cost-effective computing | news.ycombinator.com | 2024-02-28
https://github.com/Stability-AI/StableLM?tab=readme-ov-file#...
Project mention: [D] Where can I find a list of the foundational academic papers in RL/ML/DL and what are your go-to places to find new academic papers in RL/ML/DL? | /r/MachineLearning | 2023-07-07
Labml.ai stopped working in May. I like https://github.com/dair-ai/ML-Papers-of-the-Week
I used code based on similar examples from GitHub [1]. According to the docs [2], imagegeneration@005 was released on the 11th, so I guessed it's Imagen 2, though there is no confirmation.
[1] https://github.com/GoogleCloudPlatform/generative-ai/blob/ma...
[2] https://console.cloud.google.com/vertex-ai/publishers/google...
Project mention: Will there be comprehensive tutorials for fine-tuning SD XL when it comes out? | /r/StableDiffusion | 2023-07-01
Tons of stuff here, no? https://github.com/JoePenna/Dreambooth-Stable-Diffusion/
Project mention: Gemini 1.5 outshines GPT-4-Turbo-128K on long code prompts, HVM author | news.ycombinator.com | 2024-02-18
Project mention: To Bridge the Gap Until the Official Audiobooks Are Released I Tried Making a Myne TTS [P5V5] | /r/HonzukiNoGekokujou | 2023-10-19
So I looked around and decided to use Bark Infinity. (I originally wanted to use Amazon Polly, but I don't have a credit card.) I experimented and found that the female storyteller voice sounds quite decent. So I used that, plus a reference clip of Myne's voice as a prompt (which I think may have helped a little; I don't understand all of the program's features), to generate a whole chapter. That worked quite well.
Jupyter Notebook AI related posts
-
Ask HN: Why all these GitHub fake accounts starring my project
-
Alternative Chunking Methods
-
Machine Learning and AI Beyond the Basics Book
-
Google Research website is down
-
GPT-4, without specialized training, beat a GPT-3.5 class model that cost $10B
-
FREE AI Course By Microsoft: ZERO to HERO!
-
Building an Open Source Decentralized E-Book Search Engine
Index
What are some of the best open-source AI projects in Jupyter Notebook? This list will help you:
# | Project | Stars
---|---|---
1 | generative-ai-for-beginners | 43,780 |
2 | google-research | 32,991 |
3 | AI-For-Beginners | 31,684 |
4 | learnopencv | 20,471 |
5 | h4cker | 16,717 |
6 | StableLM | 15,853 |
7 | stable-diffusion-webui-colab | 15,290 |
8 | dopamine | 10,378 |
9 | ML-Papers-of-the-Week | 8,943 |
10 | generative-ai | 5,640 |
11 | nlpaug | 4,252 |
12 | ArtLine | 3,531 |
13 | Dreambooth-Stable-Diffusion | 3,170 |
14 | examples | 2,465 |
15 | clip-retrieval | 2,163 |
16 | machine-learning-experiments | 1,607 |
17 | vertex-ai-samples | 1,384 |
18 | imodels | 1,293 |
19 | tensor-house | 1,179 |
20 | Deep-Learning-In-Production | 1,073 |
21 | chameleon-llm | 1,020 |
22 | bark | 960 |
23 | PConv-Keras | 893 |