Top 23 Python Distributed Projects
-
Ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
-
nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyper-parameter tuning.
-
fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
-
code2vec
TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
-
evotorch
Advanced evolutionary computation library built directly on top of PyTorch, created at NNAISENSE.
-
runhouse
Write local debuggable Python which traverses your powerful remote infra. Deploy as-is. Unobtrusive, unopinionated, PyTorch-like APIs.
Project mention: Ray: Unified framework for scaling AI and Python applications | news.ycombinator.com | 2024-05-03
Project mention: Optuna – A Hyperparameter Optimization Framework | news.ycombinator.com | 2024-04-06
I didn’t even know WandB did hyperparameter optimization, I figured it was a neural network visualizer based on 2 minute papers. Didn’t seem like many alternatives out there to Optuna with TPE + persistence in conditional continuous & discrete spaces.
Anyway, it’s doable to make a multi objective decide_to_prune function with Optuna, here’s an example https://github.com/optuna/optuna/issues/3450#issuecomment-19...
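The comment above does not show what such a function looks like, and the linked example is truncated here. As a rough illustration only, the following is a minimal sketch of one way a multi-objective pruning decision could be made, assuming all objectives are minimized and using Pareto dominance against earlier trials. The names `dominates`, `decide_to_prune`, and `min_dominators` are hypothetical and are not part of Optuna's API; the code uses only the standard library.

```python
from typing import Iterable, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` (all objectives minimized).

    `a` must be no worse than `b` on every objective and strictly
    better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def decide_to_prune(
    current: Sequence[float],
    history: Iterable[Sequence[float]],
    min_dominators: int = 2,
) -> bool:
    """Hypothetical pruning rule, not Optuna's own logic.

    `current` holds the running trial's intermediate objective values at
    some step; `history` holds the values earlier trials had at that same
    step. The trial is pruned once at least `min_dominators` earlier
    trials dominate it.
    """
    dominators = sum(1 for past in history if dominates(past, current))
    return dominators >= min_dominators


# Example values (made up): (validation loss, latency in seconds).
history = [(0.30, 1.2), (0.25, 1.0), (0.40, 0.8)]
print(decide_to_prune((0.35, 1.3), history))  # dominated by two earlier trials
print(decide_to_prune((0.20, 0.9), history))  # dominated by none
```

Inside an Optuna objective, a `True` result would typically be followed by raising `optuna.TrialPruned`; how the per-step history is stored and retrieved is left open in this sketch.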
Project mention: Potential of the Julia programming language for high energy physics computing | news.ycombinator.com | 2023-12-04
> Yes, julia can be called from other languages rather easily
This seems false to me. StaticCompiler.jl [1] puts in their limitations that "GC-tracked allocations and global variables do not work with compile_executable or compile_shlib. This has some interesting consequences, including that all functions within the function you want to compile must either be inlined or return only native types (otherwise Julia would have to allocate a place to put the results, which will fail)." PackageCompiler.jl [2] has the same limitations if I'm not mistaken. So then you have to fall back to distributing the Julia "binary" with a full Julia runtime, which is pretty heavy. There are some packages which do this. For example, PySR [3] does this.
There is some word going around though that there is an even better static compiler in the making, but as long as that one is not publicly available I'd say that Julia cannot easily be called from other languages.
[1]: https://github.com/tshort/StaticCompiler.jl
[2]: https://github.com/JuliaLang/PackageCompiler.jl
[3]: https://github.com/MilesCranmer/PySR
An awesome read!
Something related that I found out about from HN a few months back is another engine called quokka. It's particularly interesting and applicable how quokka schedules distributed queries to outperform Spark https://github.com/marsupialtail/quokka/blob/master/blog/why...
Project mention: Show HN: Real-time image autocomplete in <100 lines of code with SDXL Lightning | news.ycombinator.com | 2024-02-23
We made a small app for SDXL Lightning, running your own Python code on GPUs. It generates images in real time.
https://potatoes.ai/
We know there was a fal.ai post yesterday, and that got a lot of interest, but we also made this demo yesterday and didn't share — just wanted to mention it as an alternative option for people who like running their own code and custom models instead of using a prebuilt API provider.
The backend code is open-source too and you can deploy it yourself: https://github.com/modal-labs/modal-examples/blob/main/06_gpu_and_ml/stable_diffusion/stable_diffusion_xl_lightning.py
Project mention: Show HN: Hatchet – Open-source distributed task queue | news.ycombinator.com | 2024-03-08
Project mention: [P] Introducing PPO and Rainbow DQN to our super fast evolutionary HPO reinforcement learning framework | /r/MachineLearning | 2023-10-15
Python Distributed related posts
-
Ray: Unified framework for scaling AI and Python applications
-
Optuna – A Hyperparameter Optimization Framework
-
Future Plan for Arq
-
How to test optimal parameters
-
Optuna – A Hyperparameter Optimization Framework
-
Word2vec
-
How Query Engines Work
Index
What are some of the best open-source Distributed projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Ray | 31,414 |
2 | nni | 13,797 |
3 | optuna | 9,751 |
4 | modin | 9,498 |
5 | scrapy-redis | 5,466 |
6 | Gerapy | 3,223 |
7 | lingvo | 2,779 |
8 | arq | 1,959 |
9 | PySR | 1,961 |
10 | fugue | 1,887 |
11 | MLBox | 1,477 |
12 | quokka | 1,082 |
13 | code2vec | 1,081 |
14 | pottery | 1,019 |
15 | evotorch | 976 |
16 | bagua | 865 |
17 | runhouse | 725 |
18 | optuna-examples | 609 |
19 | Pyrlang | 585 |
20 | modal-examples | 585 |
21 | wakaq | 566 |
22 | AgileRL | 501 |
23 | malib | 467 |