Metric | ydata-quality | awesome-data-centric-ai
---|---|---
Mentions | 1 | 7
Stars | 413 | 306
Growth | 0.7% | 1.3%
Activity | 0.0 | 3.2
Last commit | 20 days ago | 6 months ago
Language | Jupyter Notebook | Jupyter Notebook
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
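As a rough illustration of the two derived metrics described above, here is a minimal Python sketch. It is not the site's actual formula: the function names, the sample star counts, and the linear mapping from activity score to "top X%" are all assumptions, the last one extrapolated from the single example given (9.0 corresponds to the top 10%).

```python
def month_over_month_growth(stars_now: int, stars_month_ago: int) -> float:
    """Return the change in stars over one month, as a percentage."""
    if stars_month_ago <= 0:
        raise ValueError("stars_month_ago must be positive")
    return (stars_now - stars_month_ago) / stars_month_ago * 100.0


def top_share_from_activity(activity: float) -> float:
    """Map a 0-10 activity score to the 'top X%' share it would imply.

    Assumption: the score behaves like a decile rank, so 9.0 -> top 10%.
    The real scoring (which weights recent commits more) is not published here.
    """
    if not 0.0 <= activity <= 10.0:
        raise ValueError("activity must be between 0 and 10")
    return (10.0 - activity) * 10.0


if __name__ == "__main__":
    # Hypothetical star counts, chosen only to show the calculation.
    print(f"growth: {month_over_month_growth(110, 100):.1f}%")
    print(f"activity 9.0 -> top {top_share_from_activity(9.0):.0f}%")
```

Under this reading, a higher activity score always means a smaller "top" share, which matches the description that recent commits push a project up the ranking.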
ydata-quality
-
[P] Open-source python library for assessing Data Quality
Hi r/MachineLearning community! We at YData created an open-source project regarding data quality ( https://github.com/ydataai/ydata-quality ) and wanted to share it with you all!
awesome-data-centric-ai
-
Thoughts: Continue current degree with one year left, or start anew with degree apprenticeship
I would finish the degree anyway. It's only one year left. If teachers miss classes, I would disregard that and try to learn on my own, and then yes, I would move on to an internship (or even do it at the same time if it's possible). If you like, come and meet us at the Data-Centric AI Community and we can do some projects together :)
-
Data science projects
Definitely a lot of growth in the AI space, and it will evolve rapidly in the next few years. There are several paid propositions on the Data-Centric AI Community Discord; check them out.
-
I absolutely hate my internship
2: Tbh, quit (?) We have open jobs at the Data-Centric AI Community. Bonus points: you can vent there as much as you want
-
Prioritise Data Science Projects
Let me invite you to the Data-Centric AI Community: we have several code-along sessions and projects, and a lot of beginners starting to learn DS that you can connect with.
-
Imbalanced data
If you need specific help with your project you can find me at the Data-Centric AI Community and we'll be happy to take a look and give you some tips to move forward :)
-
Building my first Portfolio
You can share your progress with us at the Data-Centric AI Community and ask someone to review it; we often do that with CVs as well and help each other out.
-
[Q] How to generate synthetic dataset for anomaly detection?
Maybe you can use a synthetic data generator and use your current dataset as input? I believe there are a lot of GAN-based models for this purpose out there. The ones listed on https://github.com/Data-Centric-AI-Community/awesome-data-centric-ai are mostly focused on structured data, but I'm sure there are similar packages for images.
What are some alternatives?
data - Data and code behind the articles and graphics at FiveThirtyEight
ydata-synthetic - Synthetic data generators for tabular and time-series data
ta - Technical Analysis Library using Pandas and Numpy
machine_learning_complete - A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.
alphalens - Performance analysis of predictive (alpha) stock factors
walkalongs - Resources and solutions of various technologies that I am currently learning
feature-engineering-tutorials - Data Science Feature Engineering and Selection Tutorials
DataScienceProjects
code - Compilation of R and Python programming codes on the Data Professor YouTube channel.
Portfolio
PANDAS-TUTORIAL - Jupyter Notebooks and Data Sets for Pandas Library
fullnamematchscore-go - Generates a match score of two person names from 0-100, where 100 is the highest, on how closely two individual full names match. The scoring is based on a series of tests, algorithms, AI, and an ever-growing body of Machine Learning-based generated knowledge