Top 23 Python Computer Vision Projects
- EasyOCR: Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
- d2l-en: Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries, including Stanford, MIT, Harvard, and Cambridge.
- datasets: 🤗 The largest hub of ready-to-use datasets for ML models, with fast, easy-to-use and efficient data manipulation tools.
- vit-pytorch: Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch.
- labelme: Image polygonal annotation with Python (polygon, rectangle, circle, line, point, and image-level flag annotation).
- gaussian-splatting: Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering".
- pytorch-grad-cam: Advanced AI explainability for computer vision. Supports CNNs, Vision Transformers, classification, object detection, segmentation, image similarity, and more.
- U-2-Net: The code for the Pattern Recognition 2020 paper "U²-Net: Going Deeper with Nested U-Structure for Salient Object Detection".
- deeplake: Database for AI. Store vectors, images, texts, videos, etc. Use with LLMs/LangChain. Store, query, version, and visualize any AI data. Stream data in real time to PyTorch/TensorFlow. https://activeloop.ai
Camera connected to a Raspberry Pi? Something like this could run locally: https://github.com/ageitgey/face_recognition
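As a rough illustration of the idea, here is a minimal sketch using face_recognition's documented load_image_file/face_encodings/compare_faces calls (the file names are placeholders):

```python
import face_recognition

# Encode one known face (placeholder file names throughout).
known_image = face_recognition.load_image_file("known_person.jpg")
known_encoding = face_recognition.face_encodings(known_image)[0]

# Check every face found in a captured frame against it.
frame = face_recognition.load_image_file("camera_frame.jpg")
for encoding in face_recognition.face_encodings(frame):
    match = face_recognition.compare_faces([known_encoding], encoding)[0]
    print("Recognised!" if match else "Unknown face")
```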
Project mention: Leveraging GPT-4 for PDF Data Extraction: A Comprehensive Guide | dev.to | 2023-12-27
PyTesseract Module [GitHub], EasyOCR Module [GitHub], PaddlePaddle OCR [GitHub]
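For reference, basic EasyOCR usage looks roughly like this (a minimal sketch; "document.png" is a placeholder path):

```python
import easyocr

# Build a reader for English; the first run downloads the detection/recognition models.
reader = easyocr.Reader(["en"])

# readtext returns (bounding box, text, confidence) triples.
for bbox, text, confidence in reader.readtext("document.png"):
    print(f"{confidence:.2f}  {text}")
```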
Project mention: 23 issues to grow yourself as an exceptional open-source Python expert | dev.to | 2023-10-19
Project mention: Is it easier to go from Pytorch to TF and Keras than the other way around? | /r/pytorch | 2023-05-13
I also need to learn PySpark, so right now I am going to download the Fashion-MNIST dataset, use PySpark to downsize each image and put the images into separate folders according to their labels (just to show employers I can do some basic ETL with PySpark; I'm not sure how I am going to load it for training in PyTorch yet, though). Then I am going to write the simplest LeNet to try to categorize the Fashion-MNIST dataset (results will most likely be bad, but that's okay). Next, I'll try to learn transfer learning in PyTorch for CNNs, or maybe skip ahead to ViT. Ideally at that point I want to study the attention mechanism a bit more and try to implement SimpleViT, which I saw here: https://github.com/lucidrains/vit-pytorch/blob/main/vit_pytorch/simple_vit.py
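For anyone following a similar path, instantiating SimpleViT from vit-pytorch looks roughly like this (a sketch based on the repo's README; the hyperparameters are illustrative, sized for 28x28 grayscale Fashion-MNIST images):

```python
import torch
from vit_pytorch import SimpleViT

model = SimpleViT(
    image_size=28,   # Fashion-MNIST images are 28x28
    patch_size=7,    # yields a 4x4 grid of patches
    num_classes=10,
    dim=256,
    depth=4,
    heads=8,
    mlp_dim=512,
    channels=1,      # grayscale input
)

imgs = torch.randn(8, 1, 28, 28)  # a dummy batch
logits = model(imgs)              # shape: (8, 10)
```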
Let's start by defining the ResNet module according to the Residual Network architecture, as replicated[1] by the torchvision implementation of the model we will import. Detailed architecture variants with depths of 18, 34, 50, 101, and 152 layers can be found in the table below.
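Loading any of those variants from torchvision is a one-liner; a minimal sketch using torchvision's current weights-enum API:

```python
import torch
from torchvision.models import resnet18, ResNet18_Weights

# Swap in resnet34/50/101/152 (with their weight enums) for the deeper variants.
model = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)
model.eval()

x = torch.randn(1, 3, 224, 224)  # one ImageNet-sized RGB image
with torch.no_grad():
    logits = model(x)            # shape: (1, 1000)
```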
You can always slice the images into smaller ones, run detection on each tile, and combine results. Supervision has a utility for this - https://supervision.roboflow.com/latest/detection/tools/infe..., but it only works with detections. You can get a much more accurate result this way. Here is some side-by-side comparison: https://github.com/roboflow/supervision/releases/tag/0.14.0.
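A minimal sketch of that slicing workflow, assuming a YOLOv8 model via ultralytics as the underlying detector (any callback returning sv.Detections works):

```python
import cv2
import numpy as np
import supervision as sv
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

def callback(image_slice: np.ndarray) -> sv.Detections:
    # Run the detector on one tile and convert to supervision's format.
    result = model(image_slice)[0]
    return sv.Detections.from_ultralytics(result)

slicer = sv.InferenceSlicer(callback=callback)
image = cv2.imread("large_image.jpg")  # placeholder path
detections = slicer(image)             # tiled inference, merged results
```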
Project mention: Logistic Regression for Image Classification Using OpenCV | news.ycombinator.com | 2023-12-31
In this case there's no advantage to using logistic regression on an image other than the novelty. Logistic regression is excellent for feature explainability, but you can't explain anything from an image.
Traditional (non-deep-learning) classification algorithms such as SVMs and Random Forests perform a lot better on MNIST, reaching up to 97% accuracy compared to the 88% from logistic regression in this post. Check the original MNIST benchmarks here: http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/#
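To reproduce that kind of comparison, here is a rough scikit-learn sketch (subsampled so the SVM finishes quickly; "mnist_784" is the dataset's OpenML name):

```python
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X = X / 255.0  # scale pixels to [0, 1]

# Subsample: fitting an RBF SVM on all 70k samples is very slow.
X_train, X_test, y_train, y_test = train_test_split(
    X[:10000], y[:10000], test_size=0.2, random_state=0)

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("rbf svm", SVC(kernel="rbf"))]:
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))
```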
Project mention: Show HN: Gaussian Splat renderer in VR with Unity | news.ycombinator.com | 2024-01-24
Chris' post doesn't really give much background info, so here's what's going on here and why it's awesome.
Real-time 3D rendering has historically been based on rasterisation of polygons. This has brought us a long way and has a lot of advantages, but making photorealistic scenes takes a lot of work from the artist. You can scan real objects with photogrammetry and then convert the scans to high-poly meshes, but photogrammetry rigs are pro-level tools, and the assets won't render at real-time speeds. Unreal 5 introduced Nanite, which is a very advanced LoD algorithm, and that helps a lot, but again, we seem to be hitting the limits of what can be done with polygon-based rendering.
3D Gaussian Splatting is a new AI-based technique that lets you render, in real time, photorealistic 3D scenes that were captured with only a few photos taken using normal cameras. It replaces polygon-based rendering with radiance fields.
https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/
3DGS uses several advanced techniques:
1. A 3D point cloud is estimated using "structure from motion" techniques.
2. The points are turned into "3D Gaussians", which are sort of floating blobs of light, where each one has a position, an opacity, and a covariance matrix, plus view-dependent colour encoded using "spherical harmonics" (no, me neither). They're ellipsoids, so they can be thought of as spheres that are stretched and rotated.
3. Rendering is done via a form of point-based rasterisation in which the 3D Gaussians are projected to the 2D screen (into "splats"), sorted so transparency works, and then rasterised on the fly using custom shaders; a toy sketch of the covariance maths follows this list.
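To make step 3 concrete, here is a toy numpy sketch of the covariance maths from the paper: build an ellipsoid's 3D covariance as Sigma = R S S^T R^T, then project it to a 2D screen-space splat as Sigma' = J W Sigma W^T J^T (all values below are illustrative, not from the reference implementation):

```python
import numpy as np

def covariance_3d(scale, rotation):
    """Sigma = R S S^T R^T: an ellipsoid from per-axis scales and a rotation."""
    S = np.diag(scale)
    return rotation @ S @ S.T @ rotation.T

def covariance_2d(sigma3d, J, W):
    """Sigma' = J W Sigma W^T J^T: the screen-space (EWA) projection."""
    return J @ W @ sigma3d @ W.T @ J.T

scale = np.array([2.0, 0.5, 0.5])  # stretched along x
R = np.eye(3)                      # rotation (from a unit quaternion in practice)
W = np.eye(3)                      # camera rotation
J = np.array([[1.0, 0.0, 0.0],     # Jacobian of the perspective projection,
              [0.0, 1.0, 0.0]])    # linearised at the Gaussian's centre

splat = covariance_2d(covariance_3d(scale, R), J, W)
print(splat)  # 2x2 covariance of the screen-space ellipse
```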
The neural network isn't actually used at rendering time, so GPUs can render the scene nice and fast.
In terms of what it can do, the technique might be most similar to Unreal's Nanite. Both are designed for static scenes. Whilst 3D Gaussians can be moved around on the fly, so the scene can be changed in principle, none of the existing animation tools, game engines, or art packages know what to do without polygons. But this sort of thing could be used to rapidly create VR worlds based only on videos taken from different angles, which seems useful.
Project mention: Show HN: Toolkit for LLM Fine-Tuning, Ablating and Testing | news.ycombinator.com | 2024-04-07
This is a great project, a little bit similar to https://github.com/ludwig-ai/ludwig, but it includes testing capabilities and ablation.
Questions regarding the LLM testing aspect: how extensive is the test coverage for LLM use cases, and what is the current state of this part of the project? Do you offer any guarantees, or is it considered an open-ended problem?
Would love to see more progress toward this area!
For the two examples we will be looking at, we will be using pytorch_grad_cam, an incredible open-source package that makes working with Grad-CAM very easy. There are other excellent tutorials to check out on the repo as well.
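A minimal sketch of that workflow with pytorch_grad_cam (the target layer and class index are illustrative choices; the calls follow the package's documented API):

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()
target_layers = [model.layer4[-1]]  # last conv block is a common choice

input_tensor = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image

cam = GradCAM(model=model, target_layers=target_layers)
grayscale_cam = cam(input_tensor=input_tensor,
                    targets=[ClassifierOutputTarget(281)])  # 281 = "tabby cat"
# grayscale_cam[0] is an HxW heatmap you can overlay on the input image,
# e.g. with pytorch_grad_cam.utils.image.show_cam_on_image.
print(grayscale_cam[0].shape)
```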
Project mention: Smerf: Streamable Memory Efficient Radiance Fields | news.ycombinator.com | 2023-12-13
You're under the right paper for doing this. Instead of one big model, they have several smaller ones for regions in the scene. This way rendering is fast for large scenes.
This is similar to Block-NeRF [0], in their project page they show some videos of what youβre asking.
As for an easy way of doing this, nothing out-of-the-box. You can keep an eye on nerfstudio [1], and if you feel brave you could implement this paper and make a PR!
[0] https://waymo.com/intl/es/research/block-nerf/
[1] https://github.com/nerfstudio-project/nerfstudio
Project mention: I used the ChatGPT API to create a proof-of-concept AI driven video game. Using generative AI for the images and dialogue and GPT-3.5 for narrative and game control. More info in comments. | /r/ChatGPT | 2023-06-17
I use a finetuned custom Stable Diffusion model in combination with a style embedding for the characters for image generation, and U²-Net for background removal.
In this brief walkthrough, I will illustrate how to leverage open-source FiftyOne and Anomalib to build deployment-ready anomaly detection models. First, we will load and visualize the MVTec AD dataset in the FiftyOne App. Next, we will use Albumentations to test out augmentation techniques. We will then train an anomaly detection model with Anomalib and evaluate the model with FiftyOne.
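For the augmentation step, an Albumentations pipeline looks roughly like this (a minimal sketch; the transforms and probabilities are illustrative, and "sample.png" is a placeholder):

```python
import albumentations as A
import cv2

# A small augmentation pipeline to preview on MVTec AD-style images.
transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.3),
    A.GaussNoise(p=0.2),
])

image = cv2.imread("sample.png")             # placeholder path
augmented = transform(image=image)["image"]  # same HxWxC layout as the input
```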
Python Computer Vision related posts
- Voxel51 Is Hiring AI Researchers and Scientists: What the New Open Science Positions Mean
- Show HN: I made a ROS package for realtime semantic segmentation
- How to Estimate Depth from a Single Image
- How to Detect Small Objects
- How to Cluster Images
- Running OCR against PDFs and images directly in the browser
- Supervision: Reusable Computer Vision
Index
What are some of the best open-source Computer Vision projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | Face Recognition | 51,816 |
2 | pytorch-CycleGAN-and-pix2pix | 22,029 |
3 | EasyOCR | 21,953 |
4 | d2l-en | 21,704 |
5 | datasets | 18,443 |
6 | vit-pytorch | 18,006 |
7 | vision | 15,454 |
8 | supervision | 14,068 |
9 | facenet | 13,507 |
10 | labelme | 12,361 |
11 | fashion-mnist | 11,439 |
12 | gaussian-splatting | 11,391 |
13 | ludwig | 10,827 |
14 | Meshroom | 10,599 |
15 | pytorch-grad-cam | 9,456 |
16 | Kornia | 9,395 |
17 | nerfstudio | 8,533 |
18 | RobustVideoMatting | 8,189 |
19 | U-2-Net | 8,115 |
20 | deeplake | 7,729 |
21 | autogluon | 7,124 |
22 | fiftyone | 6,712 |
23 | BackgroundMattingV2 | 6,665 |