Dask
Poetry
Metric | Dask | Poetry
---|---|---
Mentions | 32 | 378
Stars | 12,113 | 29,807
Growth | 0.9% | 1.1%
Activity | 9.6 | 9.6
Latest commit | 3 days ago | 6 days ago
Language | Python | Python
License | BSD 3-clause "New" or "Revised" License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Dask
- The Distributed Tensor Algebra Compiler (2022)
-
A peek into Location Data Science at Ola
Data scientists work on phenomenally large datasets, and Dask is a handy tool for exploration within the confines of a single cloud VM or their local PCs. Location data visualization is an essential part of deciding further algorithm development and roadmap for projects. This lays the foundation for data engineering and science to work at scale, with petabytes of data.
- File format for large data with many columns
-
What is the best way to save a csv.file in number only ? PC hangs when my file is more than 2GB
Dask
-
Large Scale Hydrology: Geocomputational tools that you use
We're using a lot of Python. In addition to these, gridMET, Dask, HoloViz, and kerchunk.
-
msgspec - a fast & friendly JSON/MessagePack library
I wrote this for speeding up the RPC messaging in dask, but figured it might be useful for others as well. The source is available on github here: https://github.com/jcrist/msgspec.
-
What does it mean to scale your python powered pipeline?
Dask: Distributed data frames, machine learning and more
-
Data pipelines with Luigi
To do that, we use Dask efficiently, simply creating on-demand local (or remote) clusters in the task's run() method:
-
Is Numpy always more efficient than Pandas? And how much should we rely on Python anyway?
Look into Dask, see: https://dask.org/
-
Ask HN: Is PySPark a Dead-End?
[1] https://dask.org/
Poetry
-
Understanding Dependencies in Programming
You can manage dependencies in Python with the package manager pip, which comes pre-installed with Python. Pip allows you to install and uninstall Python packages, and it uses a requirements.txt file to keep track of which packages your project depends on. However, pip does not have robust dependency resolution features or isolate dependencies for different projects; this is where tools like pipenv and poetry come in. These tools create a virtual environment for each project, separating the project's dependencies from the system-wide Python environment and other projects.
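The per-project isolation described above can be sketched with Python's standard-library `venv` module. This is only a minimal illustration of what a virtual environment is, not how pipenv or Poetry create theirs internally; the `create_project_env` helper and the `.venv` directory name are assumptions made for the example.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create an isolated virtual environment in <project_dir>/.venv (hypothetical layout)."""
    env_dir = project_dir / ".venv"
    # with_pip=False keeps the sketch fast and offline; use with_pip=True
    # to bootstrap pip inside the new environment.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    return env_dir


# Throwaway project directory so the example does not touch real projects.
project = Path(tempfile.mkdtemp(prefix="demo_project_"))
env = create_project_env(project)

# Every virtual environment carries a pyvenv.cfg marker file at its root.
has_config = (env / "pyvenv.cfg").is_file()
print(env, has_config)
```

Packages installed into such an environment stay out of the system-wide Python installation and out of other projects' environments.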
-
Implementing semantic image search with Amazon Titan and Supabase Vector
Poetry provides packaging and dependency management for Python. If you haven't already, install poetry via pip:
-
From Kotlin Scripting to Python
Poetry
-
How to Enhance Content with Semantify
The Semantify repository provides an example Astro.js project. Ensure you have poetry installed, then build the project from the root of the repository:
-
Uv: Python Packaging in Rust
Has anyone else been paying attention to how hilariously hard it is to package PyTorch in poetry?
https://github.com/python-poetry/poetry/issues/6409
-
Boring Python: dependency management (2022)
Based on this comment from 5 days ago [0], it seems to be working? I'm not sure, as I didn't dig in too far, but based on that comment it seems fair to say that it's not fully Poetry's fault: torch removed hashes (which Poetry needs to be effective) for a while and only recently added them back in.
Not sure where I would stand if I fully investigated it tho.
[0] https://github.com/python-poetry/poetry/issues/6409#issuecom...
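To illustrate why missing hashes matter, here is a minimal sketch of the kind of check a lock file enables: a pinned SHA-256 digest is compared against the digest of the downloaded file. The function names and the sample bytes are hypothetical, and this is not Poetry's actual implementation.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """Check downloaded bytes against the digest pinned at lock time."""
    return sha256_of(data) == expected_sha256


# Hypothetical artifact contents standing in for a downloaded wheel.
artifact = b"example wheel contents"
pinned = sha256_of(artifact)  # what a lock file would have recorded

ok = verify_artifact(artifact, pinned)
tampered_ok = verify_artifact(artifact + b"!", pinned)
print(ok, tampered_ok)
```

If an index publishes no hashes, there is nothing to pin, so a check like this cannot be performed.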
-
Fun with Avatars: Crafting the core engine | Part. 1
We will be running this project in Python 3.10 on Mac/Linux, and we will use Poetry to manage our dependencies. Later, we will bundle our app into a container using docker for deployment.
-
Python Packaging, One Year Later: A Look Back at 2023 in Python Packaging
Here are the two main packaging issues I run into, specifically when using Poetry:
1) Lack of support for building extension modules (as mentioned by the article). There is a workaround using an undocumented feature [0], which I've tried, but ultimately decided it was not the right approach. I still use Poetry, but build the extension as a separate step in CI, rather than kludging it into Poetry.
2) Lack of support for offline installs [1], e.g. being able to download the dependencies, copy them to another machine, and perform the install from the downloaded dependencies (similar to using "pip --no-index --find-links=."). Again, you can work around this (by using "poetry export --with-credentials" and "pip download" for fetching the dependencies, then firing up pypiserver [2] to run a local PyPI server on the offline machine), but ideally this would all be a first class feature of Poetry, similar to how it is in pip.
I don't have the capacity to create Pull Requests for addressing these issues with Poetry, and I'm very grateful for the maintainers and those who do contribute. Instead, on the linked issues I share my notes on the matter, in the hope that it may at least help others and potentially get us closer to a solution.
Regardless, I'm sticking with Poetry for now. Though to be fair, the only other Python packaging tools I've used extensively are Pipenv and pip/setuptools. It's time consuming to thoroughly try out these other packaging tools, and is generally lower priority than developing features/fixing bugs, so it's helpful to read about the author's experience with these other tools, such as PDM and Hatch.
[0] https://github.com/python-poetry/poetry/issues/2740
[1] https://github.com/python-poetry/poetry/issues/2184
[2] https://pypi.org/project/pypiserver/
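The offline workaround from point 2 can be sketched as three commands. The sketch below only builds and prints them; it does not run anything. It swaps the pypiserver step for pip's `--no-index --find-links` form mentioned in the same comment. The helper name, `requirements.txt`, and the `wheelhouse` directory are assumptions, and depending on the Poetry version `poetry export` may need the export plugin installed.

```python
import shlex


def offline_install_commands(requirements="requirements.txt", wheel_dir="wheelhouse"):
    """Build (but do not run) the commands for an offline install workflow."""
    return {
        # On the online machine: turn the lock file into a requirements file.
        "export": ["poetry", "export", "--format", "requirements.txt",
                   "--output", requirements, "--with-credentials"],
        # On the online machine: fetch every pinned dependency into a directory.
        "download": ["pip", "download", "--requirement", requirements,
                     "--dest", wheel_dir],
        # On the offline machine: install only from the copied directory.
        "install": ["pip", "install", "--no-index",
                    f"--find-links={wheel_dir}", "--requirement", requirements],
    }


commands = offline_install_commands()
for step, argv in commands.items():
    print(f"{step}: {shlex.join(argv)}")
```

The directory named by `wheel_dir` is what gets copied between the two machines.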
-
Introducing Flama for Robust Machine Learning APIs
We believe that poetry is currently the best tool for this purpose, besides being the most popular one at the moment. This is why we will use poetry to manage the dependencies of our project throughout this series of posts. Poetry allows you to declare the libraries your project depends on, and it will manage (install/update) them for you. Poetry also allows you to package your project into a distributable format and publish it to a repository, such as PyPI. We strongly recommend that you learn more about this tool by reading the official documentation.
-
How do you resolve dependency conflicts?
I started using poetry. The problem is that poetry will not install if there is a dependency conflict, and there is no way to ignore it: github
What are some alternatives?
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Pipenv - Python Development Workflow for Humans.
Numba - NumPy aware dynamic Python compiler using LLVM
PDM - A modern Python package and dependency manager supporting the latest PEP standards
Kedro - Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
hatch - Modern, extensible Python project management
NetworkX - Network Analysis in Python
pyenv - Simple Python version management
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
pip-tools - A set of tools to keep your pinned Python dependencies fresh.
Interactive Parallel Computing with IPython - IPython Parallel: Interactive Parallel Computing in Python
virtualenv - Virtual Python Environment builder