[R] New sparsity research (oBERT) enabled 175X increase in CPU performance for MLPerf submission

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • deepsparse

    Sparsity-aware deep learning inference runtime for CPUs

  • Utilizing the oBERT research we published at Neural Magic, along with some further iteration, we've enabled a 175X increase in NLP performance while retaining 99% accuracy on the MLPerf question-answering task. A combination of distillation, layer dropping, quantization, and unstructured pruning with oBERT delivered these gains through the DeepSparse Engine. All of our contributions and research are open source or free to use. Read the oBERT paper on arXiv, try out the research in SparseML, and dive into the write-up to learn how we achieved these results and how to apply them to your own use cases. (Illustrative, hedged sketches of DeepSparse inference and SparseML recipe-based pruning follow the project list below.)

  • sparseml

    Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

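As a rough illustration of how these sparse models are consumed at inference time, here is a minimal DeepSparse question-answering sketch. The SparseZoo stub is an illustrative placeholder rather than the exact MLPerf submission model; browse SparseZoo for the current oBERT question-answering stubs.

    # Minimal sketch: sparse-quantized question answering on CPU with DeepSparse.
    # The SparseZoo stub below is a placeholder; substitute the exact oBERT stub
    # from SparseZoo (or a local ONNX export) for the MLPerf-style model.
    from deepsparse import Pipeline

    qa = Pipeline.create(
        task="question-answering",
        model_path="zoo:nlp/question_answering/obert-base/pytorch/huggingface/"
                   "squad/pruned90_quant-none",  # placeholder stub
    )

    result = qa(
        question="What techniques enabled the CPU speedups?",
        context="Compound sparsification combines distillation, layer dropping, "
                "quantization, and unstructured pruning with oBERT.",
    )
    print(result.answer, result.score)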

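On the training side, here is a minimal sketch of applying a SparseML recipe through the documented ScheduledModifierManager workflow. The tiny model, data, and recipe hyperparameters are placeholders for illustration only; they are not the oBERT recipes behind the MLPerf results, which are published with the paper's research code and in SparseZoo.

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from sparseml.pytorch.optim import ScheduledModifierManager

    # Illustrative gradual magnitude pruning recipe (placeholder hyperparameters,
    # not the oBERT MLPerf recipe).
    RECIPE = """
    modifiers:
      - !GMPruningModifier
        start_epoch: 0.0
        end_epoch: 2.0
        update_frequency: 1.0
        init_sparsity: 0.05
        final_sparsity: 0.90
        params: __ALL_PRUNABLE__
    """

    # Tiny stand-in model and data so the sketch runs end to end.
    model = torch.nn.Sequential(
        torch.nn.Linear(16, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, 2),
    )
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.CrossEntropyLoss()
    data = DataLoader(
        TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,))),
        batch_size=8,
    )

    with open("pruning_recipe.yaml", "w") as f:
        f.write(RECIPE)

    # The manager wraps the optimizer so pruning masks are applied on schedule
    # during an otherwise ordinary training loop.
    manager = ScheduledModifierManager.from_yaml("pruning_recipe.yaml")
    optimizer = manager.modify(model, optimizer, steps_per_epoch=len(data))

    for epoch in range(3):
        for inputs, labels in data:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), labels)
            loss.backward()
            optimizer.step()

    manager.finalize(model)  # remove sparsification hooks once training is done

In practice, the pruned (and quantized) model would then be exported to ONNX and served with DeepSparse as in the sketch above.
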
NOTE: The number of mentions on this list reflects mentions in common posts plus user-suggested alternatives, so a higher number indicates a more popular project.


Related posts

  • [R] BERT-Large: Prune Once for DistilBERT Inference Performance

    2 projects | /r/MachineLearning | 16 Jul 2022
  • [R] How well do sparse ImageNet models transfer? Prune once and deploy anywhere for inference performance speedups! (arxiv link in comments)

    2 projects | /r/MachineLearning | 26 Jun 2022
  • [P] Compound sparsification: using pruning, quantization, and layer dropping to improve BERT performance

    3 projects | /r/MachineLearning | 20 Oct 2021
  • Giving Odin Intelligence

    5 projects | dev.to | 21 May 2024
  • Easily classify dog and cat breeds with YoLoV5

    1 project | dev.to | 15 Apr 2024