-
pandas-profiling
Discontinued. Create HTML profiling reports from pandas DataFrame objects. [Moved to: https://github.com/ydataai/pandas-profiling] (by pandas-profiling)
Looks like a Datasette[0] clone that runs on top of something (Jupyter), which in turn runs on top of Python (IPython). I'd like to see how long it takes to open a massive dataset in Mito versus Datasette :P
[0]: https://datasette.io/
Mito is open source, but using Pro features does require a Pro or Enterprise license. You can check out the callout in the license [1], as well as the restrictions on Mito Pro features [2]. We're in the process of fixing up the upgrade-to-Pro process a bit... as you can tell... :)
You can of course fork Mito and turn off telemetry as long as you open source your changes! Go for it - happy to hop on a call and help you get set up with the codebase, if you want. Yay open source!
[1] https://github.com/mito-ds/monorepo/blob/974091b455950c6c50e...
I played around with many of these before:
https://github.com/quantopian/qgrid
https://github.com/man-group/dtale
I find that I'm actually a lot faster using basic Pandas methods to get the data I want in exactly the form I want it.
If I really want to show everything, I just use:
'''
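The snippet this comment pointed to did not survive, so what follows is only a guess at the kind of thing meant, not the commenter's actual code. It is a minimal sketch, assuming plain pandas: a filter/group/sort chain to get the data into shape, and `pd.option_context` to print a frame without truncation. The sample data and the `show_everything` helper are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; any DataFrame works here.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", "Pune", "Oslo", "Lima"],
    "year": [2020, 2020, 2021, 2021, 2021],
    "sales": [120.0, 85.5, 99.0, 140.25, 91.0],
})

# Basic pandas methods: filter rows, group, aggregate, sort.
summary = (
    df[df["year"] == 2021]
    .groupby("city", as_index=False)["sales"]
    .sum()
    .sort_values("sales", ascending=False)
)


def show_everything(frame: pd.DataFrame) -> str:
    """Return the frame's repr with row and column truncation disabled.

    The display options are only changed inside the `with` block, so
    the global pandas settings are restored afterwards.
    """
    with pd.option_context(
        "display.max_rows", None,
        "display.max_columns", None,
    ):
        return repr(frame)


print(show_everything(summary))
```

In a notebook, the same `option_context` block can wrap a `display(frame)` call instead of returning a string.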
For those who are going through the thread finding new tools: pandas-profiling[0] is a library for automatic EDA (part of what bamboolib[1] does).
[0]: https://github.com/pandas-profiling/pandas-profiling
One cool library I saw recently for helping on the visualisation side is https://github.com/vegafusion/vegafusion
It allows you to use Altair in Python for visualising data, but does the computation in the backend using Arrow DataFusion. Not for 15GB perhaps, but cool nonetheless.
If you can write visualisations in Python itself, I am a big fan of Altair's syntax (https://github.com/altair-viz/altair), which is based on Vega-Lite. A while back, I wrote a brief guide and comparison of the main plotting libraries: https://datapane.com/reports/87NNEJ7/the-ultimate-guide-to-p...
One benefit of having them in actual code is that you can programmatically automate the creation of things like dashboards and reports. For instance, schedule a script to share an interactive plot every Monday morning, or build a live dashboard that updates every 10 minutes. This opens up a lot of possibilities that would be impossible in a traditional drag-and-drop tool.
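To make the "plots as code" point concrete, here is a rough sketch of a script that regenerates a standalone interactive report on each run. Altair charts compile down to Vega-Lite JSON, so to stay dependency-free this version hand-writes a small Vega-Lite spec with the standard library instead of calling Altair. The data, field names, function names and output filename are all hypothetical, and the page assumes the public jsDelivr builds of vega, vega-lite and vega-embed are reachable.

```python
import json
from datetime import date
from html import escape
from pathlib import Path


def build_spec(rows):
    """Return a Vega-Lite bar chart spec (a plain dict) for the given rows."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": "day", "type": "ordinal"},
            "y": {"field": "signups", "type": "quantitative"},
        },
    }


def render_report(spec, title):
    """Embed the spec in a standalone HTML page rendered by vega-embed."""
    safe_title = escape(title)
    return f"""<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>{safe_title}</title>
  <script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
  <script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
  <script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
</head>
<body>
  <h1>{safe_title}</h1>
  <div id="chart"></div>
  <script>vegaEmbed("#chart", {json.dumps(spec)});</script>
</body>
</html>"""


if __name__ == "__main__":
    # Hypothetical data; a real script would query a database or an API here.
    rows = [
        {"day": "Mon", "signups": 12},
        {"day": "Tue", "signups": 18},
        {"day": "Wed", "signups": 9},
    ]
    out = Path(f"report-{date.today().isoformat()}.html")
    out.write_text(render_report(build_spec(rows), "Weekly signups"), encoding="utf-8")
    print(f"wrote {out}")
```

Saved as, say, `weekly_report.py`, a cron entry such as `0 8 * * 1 python weekly_report.py` would run it at 08:00 every Monday; sharing the resulting file is left out of the sketch.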