Top 19 C++ Neon Projects
-
simdjson
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
-
mace
MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.
-
xsimd
C++ wrappers for SIMD intrinsics and parallelized, optimized mathematical functions (SSE, AVX, AVX512, NEON, SVE)
-
Simd
C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for x86/x64, VMX (Altivec) and VSX (Power7) for PowerPC, NEON for ARM. (by ermig1979)
-
StringZilla
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores, etc 🦖
-
DirectXMath
DirectXMath is an all inline SIMD C++ linear algebra library for use in games and graphics apps
-
fast_float
Fast and exact implementation of the C++ from_chars functions for number types: 4x to 10x faster than strtod, part of GCC 12 and WebKit/Safari
-
simdutf
Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.
-
MIPP
MIPP is a portable wrapper for SIMD instructions written in C++11. It supports NEON, SSE, AVX, AVX-512 and SVE (length specific).
Project mention: Tips on adding JSON output to your command line utility. (2021) | news.ycombinator.com | 2024-04-20
It's also supported by simdjson [0] (which has a lot of language bindings [1]):
> Multithreaded processing of gigantic Newline-Delimited JSON (ndjson) and related formats at 3.5 GB/s
[0] https://simdjson.org/
[1] https://github.com/simdjson/simdjson?tab=readme-ov-file#bind...
Project mention: Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times for AMD Zen 4 | news.ycombinator.com | 2024-03-31
The bf16 dot instruction replaces 6 instructions: https://github.com/google/highway/blob/master/hwy/ops/x86_12...
Project mention: Creating Automatic Subtitles for Videos with Python, Faster-Whisper, FFmpeg, Streamlit, Pillow | dev.to | 2024-04-29
https://github.com/xtensor-stack/xsimd
GH topics > HashMap:
I was curious about these libraries a few weeks ago and did some searching. Is there one that's got a clearly dominating set of users or contributors?
I don't know what a good way to compare these might be, other than perhaps activity/contributor count.
[1] https://github.com/simd-everywhere/simde
[2] https://github.com/ermig1979/Simd
[3] https://github.com/google/highway
[4] https://gitlab.com/libeigen/eigen
[5] https://github.com/shibatch/sleef
Project mention: Measuring energy usage: regular code vs. SIMD code | news.ycombinator.com | 2024-02-19
The 3.5x energy-efficiency gap between serial and SIMD code becomes even larger when
A. you do byte-level processing instead of float words;
B. you use embedded, IoT, and other low-energy devices.
A few years ago I compared Nvidia Jetson Xavier (long before the Orin release), an Intel-based MacBook Pro with Core i9, and AVX-512 capable CPUs on substring search benchmarks.
On Xavier one can quite easily disable/enable cores and reconfigure power usage. At peak I got to 4.2 GB/J, which was an 8.3x improvement in efficiency over LibC in substring search operations. The comparison table is still available in the older README: https://github.com/ashvardanian/StringZilla/tree/v2.0.2?tab=...
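The byte-level case can be illustrated without any vendor intrinsics. The sketch below is a generic SWAR (SIMD within a register) byte search, not StringZilla's actual code; the function name `swar_find_byte` is made up for illustration. It tests eight bytes per iteration with the classic zero-byte bit trick and falls back to a scalar scan to pin down the exact position.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the index of the first `needle` byte in
// data[0..n), or n if the byte does not occur.
std::size_t swar_find_byte(const char* data, std::size_t n, char needle) {
    const std::uint64_t ones  = 0x0101010101010101ULL;
    const std::uint64_t highs = 0x8080808080808080ULL;
    // Broadcast the needle into all eight byte lanes of a 64-bit word.
    const std::uint64_t pattern = ones * static_cast<unsigned char>(needle);

    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        std::uint64_t word;
        std::memcpy(&word, data + i, 8);  // safe unaligned load
        // Lanes equal to the needle become zero after the XOR.
        const std::uint64_t x = word ^ pattern;
        // Nonzero if and only if at least one lane of x is zero.
        if ((x - ones) & ~x & highs) break;
    }
    // Scalar scan: locates the match inside the flagged word, or handles
    // the tail of fewer than eight bytes.
    for (; i < n; ++i) {
        if (data[i] == needle) return i;
    }
    return n;
}
```

Because the exact position is found by the scalar loop, the sketch does not depend on byte order. A tuned implementation would usually extract the position from the mask directly instead.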
...
can_ada is just the python bindings, largely generated via pybind11.
The actual project is at https://github.com/ada-url/ada
IIRC all of the simdutf implementations use a lot of lookup tables except for the AVX512 and RVV backends.
Here is e.g. the NEON code: https://github.com/simdutf/simdutf/blob/1b8ca3d1072a8e2e1026...
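To give a rough idea of what a lookup-table approach looks like, here is a minimal scalar sketch. It is not simdutf's code; the table and the function name `count_code_points` are assumptions made for illustration, and the input is assumed to be valid UTF-8. A 16-entry table indexed by the high nibble classifies each byte. In a SIMD backend the same kind of table can be applied to 16 bytes at once with a byte-shuffle instruction (for example `vqtbl1q_u8` on AArch64 NEON or `_mm_shuffle_epi8` on SSSE3).

```cpp
#include <cstddef>
#include <cstdint>

// Indexed by the high nibble of a byte: 1 if the byte starts a code point,
// 0 if it is a continuation byte (bit pattern 10xxxxxx).
static const std::uint8_t kStartsCodePoint[16] = {
    1, 1, 1, 1, 1, 1, 1, 1,  // 0x0-0x7: ASCII
    0, 0, 0, 0,              // 0x8-0xB: continuation bytes
    1, 1,                    // 0xC-0xD: lead byte of a 2-byte sequence
    1,                       // 0xE:     lead byte of a 3-byte sequence
    1                        // 0xF:     lead byte of a 4-byte sequence
};

// Hypothetical helper: counts code points in valid UTF-8 input by summing
// the table entries for every byte.
std::size_t count_code_points(const unsigned char* data, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        count += kStartsCodePoint[data[i] >> 4];
    }
    return count;
}
```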
neither proposing nor taking a position on this possible addition)
> ... For completeness we would also like to add that a serious issue is that C still lacks vector operations.
Those are good points. The authors don't take a stance on it, but I do think that syntax for packed structs should be standardized. IMO, so should syntax for inline assembly (both as optional features). These are already common extensions; this is exactly what they should standardize. The additions of "typeof" and #embed are also good examples of this (they had been talking about adding #embed since 1995 [1]).
As for vector instructions, I'm unsure how it could be implemented in a standard way, but I'm not against it. Maybe something like this [2], but with the syntax changed for C instead of C++.
[1]: https://groups.google.com/g/comp.std.c/c/zWFEXDvyTwM
[2]: https://github.com/VcDevel/std-simd
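For a sense of what language-level vector operations can look like today, here is a minimal sketch using the `vector_size` attribute, a GCC/Clang extension rather than anything standardized, and not the std-simd API. The type name `f32x4` and the function `sum_of_products` are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

// GCC/Clang extension: a 16-byte vector holding four floats.
typedef float f32x4 __attribute__((vector_size(16)));

// Hypothetical helper: dot product that handles four lanes per step.
float sum_of_products(const float* a, const float* b, std::size_t n) {
    f32x4 acc = {0.0f, 0.0f, 0.0f, 0.0f};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        f32x4 va, vb;
        std::memcpy(&va, a + i, sizeof va);  // unaligned-safe loads
        std::memcpy(&vb, b + i, sizeof vb);
        acc += va * vb;                      // element-wise multiply and add
    }
    // Horizontal sum of the four lanes, then the scalar tail.
    float total = acc[0] + acc[1] + acc[2] + acc[3];
    for (; i < n; ++i) total += a[i] * b[i];
    return total;
}
```

The compiler lowers the element-wise operators to whatever the target offers (NEON, SSE, or plain scalar code), which is roughly the portability a standardized syntax would aim for.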
I've also run into this thinking, and have been looking to solve it in codebases I'm working on.
I've run across https://github.com/aff3ct/MIPP but have not worked with it extensively yet. It looks to be a solution to the problem of rewriting an X parallel pipeline for Y SIMD extensions.
Perhaps something like this, or languages introducing something similar into their standard libraries/modules would be a solution.
None of this, of course, solves run-time capability detection or the growing binary size needed to support it.
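One common pattern for the run-time detection part is to resolve a function pointer once and call through it afterwards. The sketch below assumes GCC or Clang on x86 for the `__builtin_cpu_supports` probe and is not taken from MIPP or any library listed here; all function names are hypothetical, and `sum_wide` is a plain C++ stand-in for a real vectorized kernel.

```cpp
#include <cstddef>
#include <cstdint>

using sum_fn = std::uint64_t (*)(const std::uint8_t*, std::size_t);

// Baseline kernel that runs on any CPU.
std::uint64_t sum_scalar(const std::uint8_t* p, std::size_t n) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += p[i];
    return total;
}

// Stand-in for a hand-vectorized kernel: four independent accumulators.
std::uint64_t sum_wide(const std::uint8_t* p, std::size_t n) {
    std::uint64_t lanes[4] = {0, 0, 0, 0};
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        for (std::size_t k = 0; k < 4; ++k) lanes[k] += p[i + k];
    }
    std::uint64_t total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    for (; i < n; ++i) total += p[i];
    return total;
}

// Assumption: only x86 with GCC/Clang gets a real run-time probe here.
bool cpu_has_wide_simd() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#elif defined(__ARM_NEON)
    return true;   // compile-time answer: the build already targets NEON
#else
    return false;  // unknown platform: stay on the scalar path
#endif
}

sum_fn resolve_sum() {
    return cpu_has_wide_simd() ? sum_wide : sum_scalar;
}

// Public entry point: the probe runs once, on the first call.
std::uint64_t sum_bytes(const std::uint8_t* p, std::size_t n) {
    static const sum_fn impl = resolve_sum();
    return impl(p, n);
}
```

The binary-size concern remains: every kernel variant still has to be compiled in, and the dispatcher only picks among them.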
C++ Neon related posts
-
Decoding UTF8 with Parallel Extract
-
Glibc Buffer Overflow in Iconv
-
Vectorizing Unicode conversions on real RISC-V hardware
-
Cray-1 performance vs. modern CPUs
-
SIMD Everywhere Optimization from ARM Neon to RISC-V Vector Extensions
-
The Case of the Missing SIMD Code
-
[Preprint] Transcoding Unicode Characters with AVX-512 Instructions
Index
What are some of the best open-source Neon projects in C++? This list will help you:
# | Project | Stars |
---|---|---|
1 | simdjson | 18,496 |
2 | mace | 4,882 |
3 | highway | 3,673 |
4 | CTranslate2 | 2,841 |
5 | xsimd | 2,052 |
6 | Simd | 1,979 |
7 | StringZilla | 1,811 |
8 | DirectXMath | 1,491 |
9 | Vc | 1,420 |
10 | fast_float | 1,284 |
11 | sse2neon | 1,230 |
12 | ada | 1,221 |
13 | libsimdpp | 1,193 |
14 | simdutf | 965 |
15 | eve | 859 |
16 | std-simd | 544 |
17 | MIPP | 464 |
18 | hlslpp | 452 |
19 | fractals | 1 |