| | LAVIS | pytorch-widedeep |
|---|---|---|
| Mentions | 18 | 7 |
| Stars | 8,838 | 1,242 |
| Growth | 2.9% | - |
| Activity | 6.3 | 8.7 |
| Latest commit | 24 days ago | 7 days ago |
| Language | Jupyter Notebook | Python |
| License | BSD 3-Clause "New" or "Revised" License | Apache License 2.0 |
Stars - the number of stars a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed; recent commits carry more weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects we track.
LAVIS
- FLaNK AI for 11 March 2024
- FLaNK 04 March 2024
- [D] Why is most Open Source AI happening outside the USA?
  For multimodal, there's China (*many), then Salesforce.
- Need help with a Colab notebook running LAVIS blip2_instruct_vicuna13b?
  Been trying all day to get working inference for this example: https://github.com/salesforce/LAVIS/tree/main/projects/instructblip
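For reference, LAVIS inference generally follows its `load_model_and_preprocess` pattern. The `blip2_vicuna_instruct` / `vicuna13b` identifiers below match the InstructBLIP project page, but treat them as assumptions to verify against your installed LAVIS version; the sketch only attempts the heavy model load when LAVIS is actually available (a large-memory GPU is assumed for the 13B checkpoint).

```python
# Sketch of InstructBLIP inference with LAVIS. Model identifiers are
# assumptions based on the LAVIS model zoo; verify against your version.
import importlib.util

def build_generate_input(image_tensor, question):
    """LAVIS generate() consumes a dict with 'image' and 'prompt' keys."""
    return {"image": image_tensor, "prompt": question}

def run_instructblip(raw_image, question):
    # Only attempt the model load when LAVIS is installed.
    if importlib.util.find_spec("lavis") is None:
        raise RuntimeError("LAVIS is not installed (pip install salesforce-lavis)")
    import torch
    from lavis.models import load_model_and_preprocess

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # The 13B Vicuna checkpoint needs a large-memory GPU.
    model, vis_processors, _ = load_model_and_preprocess(
        name="blip2_vicuna_instruct", model_type="vicuna13b",
        is_eval=True, device=device,
    )
    image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
    return model.generate(build_generate_input(image, question))
```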
- most sane web3 job listing
  There have also been big breakthroughs in computer vision. Not long ago it was hard to recognize whether a photo contained a bird; that's solved now by models like CLIP, YOLO, or Segment Anything. Research has since moved on to generating 3D scenes from images or interactively answering questions about images.
- I work at a non-tech company and have been asked to make software that is impossible. How do I explain it to my boss?
  The new hotness is multimodal vision-language models like InstructBLIP that can interactively answer questions about images. Check out the examples in the GitHub repo; I would not have thought this was possible a few years ago.
- Two-minute Daily AI Update (Date: 5/15/2023)
  Salesforce's BLIP family has a new member, InstructBLIP: a vision-language instruction-tuning framework built on BLIP-2 models. It achieves state-of-the-art zero-shot generalization on a wide range of vision-language tasks, substantially outperforming BLIP-2 and Flamingo. (Source)
- InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
  GitHub
- Can I use my own art as a training set?
  Most of my workflows are self-made. For captioning I used BLIP-2 in a custom script that automates the process: it walks directories and their sub-directories and creates a .txt file beside each image. This way I can keep my images organized in their proper directories without having to dump them all in a single place.
- FLiP Stack Weekly for 13-Feb-2023
pytorch-widedeep
- Why can't I import pytorch-widedeep?
  Ask the dev: https://github.com/jrzaurin/pytorch-widedeep/issues
- [P] pytorch-widedeep model alert: TabPerceiver and TabFastFormer are now available in the library
  New DL models for tabular data, plus new functionalities, added to the pytorch-widedeep library.
- [P] pytorch-widedeep model alert: SAINT and the FT-Transformer are now available in the library
  🚨MODEL ALERT!🚨 New DL models for tabular data added to the pytorch-widedeep library: SAINT by Gowthami Somepalli and collaborators (paper: https://arxiv.org/abs/2106.01342) and the FT-Transformer, already used in the SAINT paper but officially introduced by Yury Gorishniy and collaborators (paper: https://arxiv.org/abs/2106.11959). More functionalities coming soon to the [library](https://github.com/jrzaurin/pytorch-widedeep).
- How to do K-Fold cross validation for hyperparameter tuning on pytorch-widedeep?
  Now in pytorch-widedeep (https://github.com/jrzaurin/pytorch-widedeep) the recommended path for training a single model is:
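Whatever the single-model training path looks like, k-fold cross validation just wraps it in a loop over index splits. Below is a library-agnostic sketch of that loop; the commented lines show where a pytorch-widedeep `Trainer` would slot in, and those calls are assumptions to check against the library's documentation.

```python
import random

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Round-robin split keeps fold sizes within 1 of each other.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, valid

# Inside the loop you would train one model per fold, roughly (assumed
# pytorch-widedeep API, verify against the docs):
#   for train, valid in kfold_indices(len(y), k=5):
#       trainer = Trainer(model, objective="binary")
#       trainer.fit(X_tab=X_tab[train], target=y[train])
#       preds = trainer.predict(X_tab=X_tab[valid])
```

Averaging the per-fold validation metric then gives the score for one hyperparameter configuration.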
- [P] pytorch-widedeep v1.0: deep learning for tabular data that you can combine with images and text
  Main repo
- Pytorch-widedeep v1.0: deep learning for tabular data
- [P] pytorch-widedeep, deep learning for tabular data: Deep Learning vs LightGBM
  A thorough comparison between deep learning algorithms for tabular data (using pytorch-widedeep) and LightGBM on classification and regression problems.
What are some alternatives?
CLIP-Caption-Reward - PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)
tabnet - PyTorch implementation of TabNet paper: https://arxiv.org/pdf/1908.07442.pdf
sparseml - Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
alibi-detect - Algorithms for outlier, adversarial and drift detection
robo-vln - Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
rtdl - Research on Tabular Deep Learning (Python package & papers) [Moved to: https://github.com/Yura52/rtdl]
DeepViewAgg - [CVPR'22 Best Paper Finalist] Official PyTorch implementation of the method presented in "Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation"
torchio - Medical imaging toolkit for deep learning
linkis - Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
scrambpy - Scramb.py is a region based JPEG Image Scrambler and Descrambler written in Python for End-to-End-Encrypted (E2EE) Image distribution through unaware channels.
multimodal - A collection of multimodal datasets and visual features for VQA and captioning in PyTorch. Just run "pip install multimodal"
autogluon - Fast and Accurate ML in 3 Lines of Code