Top 12 Jupyter Notebook Benchmark Projects
-
llm-colosseum
Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM
-
indonlu
The first-ever vast natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)
-
SKAB
SKAB - Skoltech Anomaly Benchmark. Time-series data for evaluating Anomaly Detection algorithms.
-
Awesome_Satellite_Benchmark_Datasets
Supplementary material for our paper "THERE IS NO DATA LIKE MORE DATA" is provided.
-
Microbenchmarks
Microbenchmarks comparing the Julia Programming language with other languages (by JuliaLang)
-
food-recognition-benchmark-starter-kit
This repository is the main Food Recognition Benchmark template and Starter kit. Clone the repository to compete now!
-
hashtable-bench
A benchmark for hash tables and hash functions in C++, evaluated on different data as comprehensively as possible
Project mention: How to build a prediction model where there is negligible relation between the target variable and independent variables? | /r/datascience | 2023-05-31
Project mention: Romeo and Julia, Where Romeo Is Basic Statistics | news.ycombinator.com | 2024-03-15
> Every language I've ever seen with garbage collection has gone through decades of "now the garbage collection is better" or "just wait until the next version, garbage collection will be better".
Ok, but the Go example I linked is already in production; you can use it right now. This isn't an "it will get better in two releases" situation: Go's GC as of today has pause times that are sub-millisecond. The Java Shenandoah example I linked is still mostly in beta, but it's also something you can use right now, though admittedly it'll probably be a while before it's in a mainline release.
> This is besides the point of performance and no longer talking about reality, it's just FUD from a "what if" future.
It's not "just FUD", there are dozens of reported security issues that have happened because of bad manual memory management problems. Off the top of my head, Heartbleed was a famous case.
This isn't me badmouthing anyone; manual memory management is hard to get right, even for very smart people.
> Right, but you get it by avoiding allocation and avoiding the garbage collector the same way avoiding allocation in C++ is important, but in julia it won't be woven in to the performance, it will cause big pauses.
Fair enough, I did look at the code for the official benchmarks (https://github.com/JuliaLang/Microbenchmarks/blob/master/per...) and outside of the integer parsing code it does indeed seem to avoid dynamic allocations so I will concede that the benchmarks might be a bit more skewed compared to real-world code.
I still have a hunch that if you compared allocation-heavy Julia to malloc+free-heavy C++, the differences wouldn't really be that far off, but that's just a hunch and I don't have data to back that up; might be a fun test to write though, so maybe I'll try that this weekend.
-----
Sort of tangential, but I also think there's value in having decent concurrency constructs built into the language. With C++, if you stick to built-ins you are basically stuck with mutexes, and despite what people like to pretend, mutex-based code is really, really hard to get right and very easy to screw up in a non-obvious way. If you allow yourself to use libraries, then you have things like ZeroMQ and OpenMP, so it's really not that dire realistically. However, I think there's value in having nice, easy-to-use concurrency constructs in the language other than mutexes, and I do wonder whether that encourages people to use multiple threads more frequently, because they don't have to worry about weird deadlock situations as much.
Again, I believe Rust actually does address this because of the single-owner-enforced-at-compile-time stuff, but I haven't used it enough to really draw a conclusion on it.
Project mention: SciTS: A tool to benchmark Time-series on different databases | news.ycombinator.com | 2023-07-10
file-format-benchmark: benchmark script of key operations between different file formats
Jupyter Notebook Benchmark related posts
-
LLM Colosseum
-
Evaluate LLMs in Real Time with Street Fighter III
-
SKAB: NEW Data - star count:238.0
-
I Still ‘Lisp’ (and You Should Too)
Index
What are some of the best open-source Benchmark projects in Jupyter Notebook? This list will help you:
# | Project | Stars |
---|---|---|
1 | llm-colosseum | 942 |
2 | human-learn | 780 |
3 | indonlu | 490 |
4 | SKAB | 295 |
5 | Awesome_Satellite_Benchmark_Datasets | 284 |
6 | tf-metal-experiments | 265 |
7 | benchmarks | 164 |
8 | Microbenchmarks | 84 |
9 | food-recognition-benchmark-starter-kit | 66 |
10 | hashtable-bench | 12 |
11 | SciTS | 11 |
12 | file-format-benchmark | 2 |