SaaSHub helps you find the best software and product alternatives Learn more →
Top 23 Tensorflow Open-Source Projects
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
-
Ray
Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
-
data-science-ipython-notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
-
DeepSpeech
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
-
d2l-en
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
-
ncnn
ncnn is a high-performance neural network inference framework optimized for the mobile platform
-
datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
-
face-api.js
JavaScript API for face detection and face recognition in the browser and nodejs with tensorflow.js
-
SaaSHub
SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives
# L2-normalize the encoding tensors image_encoding = tf.math.l2_normalize(image_encoding, axis=1) audio_encoding = tf.math.l2_normalize(audio_encoding, axis=1) # Find euclidean distance between image_encoding and audio_encoding # Essentially trying to detect if the face is saying the audio # Will return nan without the 1e-12 offset due to https://github.com/tensorflow/tensorflow/issues/12071 d = tf.norm((image_encoding - audio_encoding) + 1e-12, ord='euclidean', axis=1, keepdims=True) discriminator = keras.Model(inputs=[image_input, audio_input], outputs=[d], name="discriminator")
Project mention: Reading list to join AI field from Hugging Face cofounder | news.ycombinator.com | 2024-05-18Not sure what you are implying. Thomas Wolf has the second highest number of commits on HuggingFace/transformers. He is clearly competent & deeply technical
https://github.com/huggingface/transformers/
Project mention: Side Quest #3: maybe the real Deepfakes were the friends we made along the way | dev.to | 2024-05-20def batcher_from_directory(batch_size:int, dataset_path:str, shuffle=False,seed=None) -> tf.data.Dataset: """ Return a tensorflow Dataset object that returns images and spectrograms as required. Partly inspired by https://github.com/keras-team/keras/blob/v3.3.3/keras/src/utils/image_dataset_utils.py Args: batch_size: The batch size. dataset_path: The path to the dataset folder which must contain the image folder and audio folder. shuffle: Whether to shuffle the dataset. Default to False. seed: The seed for the shuffle. Default to None. """ image_dataset_path = os.path.join(dataset_path, "image") # create the foundation datasets og_dataset = tf.data.Dataset.from_generator(lambda: original_image_path_gen(image_dataset_path), output_signature=tf.TensorSpec(shape=(), dtype=tf.string)) og_dataset = og_dataset.repeat(None) # repeat indefinitely ref_dataset = tf.data.Dataset.from_generator(lambda: ref_image_path_gen(image_dataset_path), output_signature=(tf.TensorSpec(shape=(), dtype=tf.string), tf.TensorSpec(shape=(), dtype=tf.bool))) ref_dataset = ref_dataset.repeat(None) # repeat indefinitely # create the input datasets og_image_dataset = og_dataset.map(lambda x: tf.py_function(load_image, [x, tf.convert_to_tensor(False, dtype=tf.bool)], tf.float32), num_parallel_calls=tf.data.AUTOTUNE) masked_image_dataset = og_image_dataset.map(lambda x: tf.py_function(load_masked_image, [x], tf.float32), num_parallel_calls=tf.data.AUTOTUNE) ref_image_dataset = ref_dataset.map(lambda x, y: tf.py_function(load_image, [x, y], tf.float32), num_parallel_calls=tf.data.AUTOTUNE) audio_spec_dataset = og_dataset.map(lambda x: tf.py_function(load_audio_data, [x, dataset_path], tf.float64), num_parallel_calls=tf.data.AUTOTUNE) unsync_spec_dataset = ref_dataset.map(lambda x, _: tf.py_function(load_audio_data, [x, dataset_path], tf.float64), num_parallel_calls=tf.data.AUTOTUNE) # ensure shape as tensorflow does not accept unknown shapes og_image_dataset = og_image_dataset.map(lambda x: tf.ensure_shape(x, IMAGE_SHAPE)) masked_image_dataset = masked_image_dataset.map(lambda x: tf.ensure_shape(x, MASKED_IMAGE_SHAPE)) ref_image_dataset = ref_image_dataset.map(lambda x: tf.ensure_shape(x, IMAGE_SHAPE)) audio_spec_dataset = audio_spec_dataset.map(lambda x: tf.ensure_shape(x, AUDIO_SPECTROGRAM_SHAPE)) unsync_spec_dataset = unsync_spec_dataset.map(lambda x: tf.ensure_shape(x, AUDIO_SPECTROGRAM_SHAPE)) # multi input using https://discuss.tensorflow.org/t/train-a-model-on-multiple-input-dataset/17829/4 full_dataset = tf.data.Dataset.zip((masked_image_dataset, ref_image_dataset, audio_spec_dataset, unsync_spec_dataset), og_image_dataset) # if shuffle: # full_dataset = full_dataset.shuffle(buffer_size=batch_size * 8, seed=seed) # not sure why buffer size is such # batch full_dataset = full_dataset.batch(batch_size=batch_size) return full_dataset
In 2019, a new language representation called BERT (Bedirectional Encoder Representation from Transformers) was introduced. The main idea behind this paradigm is to first pre-train a language model using a massive amount of unlabeled data then fine-tune all the parameters using labeled data from the downstream tasks. This allows the model to generalize well to different NLP tasks. Moreover, it has been shown that this language representation model can be used to solve downstream tasks without being explicitly trained on, e.g classify a text without training phase.
Project mention: Show HN: Memories, FOSS Google Photos alternative built for high performance | news.ycombinator.com | 2024-03-21I have been using https://www.photoprism.app for a couple of years, and it works better than expected, with the latest updates it's actually quite fast and the face tagging works reasonably well.
Project mention: Ray: Unified framework for scaling AI and Python applications | news.ycombinator.com | 2024-05-03
virtual dj and others stem separator is shrinked model of this https://github.com/deezer/spleeter you will get better results downloading original + their large model.
Project mention: ESpeak-ng: speech synthesizer with more than one hundred languages and accents | news.ycombinator.com | 2024-05-01As I understand it DeepSpeech is no longer actively maintained by Mozilla: https://github.com/mozilla/DeepSpeech/issues/3693
For Text To Speech, I've found Piper TTS useful (for situations where "quality"=="realistic"/"natual"): https://github.com/rhasspy/piper
For Speech to Text (which AIUI DeepSpeech provided), I've had some success with Vosk: https://github.com/alphacep/vosk-api
Project mention: Intuituvely Understanding Harris Corner Detector | news.ycombinator.com | 2023-09-11The most widely used algorithms for classical feature detection today are "whatever opencv implements"
In terms of tech that's advancing at the moment? https://co-tracker.github.io/ if you want to track individual points, https://github.com/matterport/Mask_RCNN and its descendents if you want to detect, say, the cover of a book.
Project mention: AMD Funded a Drop-In CUDA Implementation Built on ROCm: It's Open-Source | news.ycombinator.com | 2024-02-12ncnn uses Vulkan for GPU acceleration, I've seen it used in a few projects to get AMD hardware support.
https://github.com/Tencent/ncnn
Project mention: 🐍🐍 23 issues to grow yourself as an exceptional open-source Python expert 🧑💻 🥇 | dev.to | 2023-10-19
Then I used face-api.js to find the coordinates of each eye.
Project mention: License Plate Recognition with Home Assistant, Codeproject.ai, and Frigate NVR | news.ycombinator.com | 2024-04-26
You can always slice the images into smaller ones, run detection on each tile, and combine results. Supervision has a utility for this - https://supervision.roboflow.com/latest/detection/tools/infe..., but it only works with detections. You can get a much more accurate result this way. Here is some side-by-side comparison: https://github.com/roboflow/supervision/releases/tag/0.14.0.
See also https://github.com/unifyai/ivy which I have not tried but seems along the lines of what you are describing, working with all the major frameworks
Tensorflow related posts
-
Side Quest #3: maybe the real Deepfakes were the friends we made along the way
-
Reading list to join AI field from Hugging Face cofounder
-
Apple to Power AI Features with M2 Ultra Servers
-
XLSTM: Extended Long Short-Term Memory
-
Container size analysis: TensorFlow 2.8 base image vs Deep Learning
-
Zero Shot Text Classification Under the hood
-
Side Quest Devblog #1: These Fakes are getting Deep
-
A note from our sponsor - SaaSHub
www.saashub.com | 20 May 2024
Index
What are some of the best open-source Tensorflow projects? This list will help you:
Project | Stars | |
---|---|---|
1 | tensorflow | 182,857 |
2 | transformers | 126,170 |
3 | Keras | 61,044 |
4 | Real-Time-Voice-Cloning | 50,951 |
5 | TensorFlow-Examples | 43,247 |
6 | bert | 37,119 |
7 | gold-miner | 33,424 |
8 | PhotoPrism | 32,913 |
9 | Ray | 31,414 |
10 | data-science-ipython-notebooks | 26,545 |
11 | netron | 26,355 |
12 | handson-ml | 25,099 |
13 | spleeter | 25,036 |
14 | DeepSpeech | 24,422 |
15 | Mask_RCNN | 24,201 |
16 | d2l-en | 21,922 |
17 | ncnn | 19,352 |
18 | datasets | 18,523 |
19 | face-api.js | 16,142 |
20 | best-of-ml-python | 15,672 |
21 | frigate | 15,006 |
22 | supervision | 14,673 |
23 | ivy | 14,024 |
Sponsored