Top 11 Jupyter Notebook Dataset Projects
-
indonlu
The first vast natural language processing benchmark for the Indonesian language. It provides multiple downstream tasks, pre-trained IndoBERT models, and starter code. (AACL-IJCNLP 2020)
-
cleora
Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.
-
SKAB
SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Anomaly Detection algorithms.
-
artificial-self-AMLD-2020
Workshop material for the AMLD 2020 workshop on "Meet your Artificial Self: Generate text that sounds like you"
-
parsee-datasets
Datasets, case studies and benchmarks for extracting structured information from PDFs, HTML files or images, created by the Parsee.ai team. Datasets also on Hugging Face: https://huggingface.co/parsee-ai
-
Data-Science-Data-Analystics-Contribution---Hacktoberfest-2022
Submit just 4 PRs to earn T-shirts 🔥 in Hacktoberfest 2022
-
ProTaska-GPT
Unleash the Potential of Datasets with Intelligent Tasks, Tutorials, and Algorithm Recommendations.
To test this, we created 3 different datasets, all based on the same selection of 1,156 randomly selected annual reports for the year 2023 of publicly listed US companies.
The resulting (fully labeled) datasets contain a combined total of 10,404 rows, 37,536,847 tokens, and 1,156 images, and can be found on GitHub and Hugging Face: https://github.com/parsee-ai/parsee-datasets/tree/main/datas...
For our study, we are evaluating 8 state-of-the-art (M)LLMs on a subset of 100 reports with some interesting results.
Project mention: Learn Data Science with a GPT-powered Tutor: ProTaska-GPT | /r/learnpython | 2023-06-19
Index
What are some of the best open-source Dataset projects in Jupyter Notebook? This list will help you:
# | Project | Stars |
---|---|---|
1 | indonlu | 490 |
2 | cleora | 477 |
3 | SKAB | 296 |
4 | Tegridy-MIDI-Dataset | 127 |
5 | ekya | 94 |
6 | artificial-self-AMLD-2020 | 80 |
7 | openfema-samples | 21 |
8 | intel-processors | 16 |
9 | parsee-datasets | 61 |
10 | Data-Science-Data-Analystics-Contribution---Hacktoberfest-2022 | 5 |
11 | ProTaska-GPT | 2 |