C++ Utf8

Open-source C++ projects categorized as Utf8

Top 6 C++ Utf8 Projects

  • ImGuiColorTextEdit

    Colorizing text editor for ImGui

  • simdutf

    Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js and Bun.

  • Project mention: Decoding UTF8 with Parallel Extract | news.ycombinator.com | 2024-05-05

    IIRC all of the simdutf implementations use a lot of lookup tables except for the AVX512 and RVV backends.

    Here is e.g. the NEON code: https://github.com/simdutf/simdutf/blob/1b8ca3d1072a8e2e1026...

  • Rapidcsv

    C++ CSV parser library

  • tiny-utf8

    Unicode (UTF-8) capable std::string

  • uni-algo

    Unicode Algorithms Implementation for C/C++

  • Project mention: uni-algo: Unicode Algorithms Implementation for C/C++ | news.ycombinator.com | 2024-03-25
  • hypergrep

    Recursively search directories for a regex pattern

  • Project mention: Ugrep – a more powerful, ultra fast, user-friendly, compatible grep | news.ycombinator.com | 2023-12-30

    Another issue with Hyperscan is that if you enable HS_FLAG_UTF8[1], which hypergrep does[2,3], and then search invalid UTF-8, then the result is UB.

    > This flag instructs Hyperscan to treat the pattern as a sequence of UTF-8 characters. The results of scanning invalid UTF-8 sequences with a Hyperscan library that has been compiled with one or more patterns using this flag are undefined.

    That's another issue you'll need to grapple with if you use Hyperscan. PCRE2 used to have this issue[4], but they've since defined the semantics of searching invalid UTF-8 with Unicode mode enabled. ripgrep 14 uses that new mode, but I haven't updated that FAQ answer yet.

    [1]: https://intel.github.io/hyperscan/dev-reference/api_files.ht...

    [2]: https://github.com/p-ranav/hypergrep/blob/ee85b713aa84e0050a...

    [3]: https://github.com/p-ranav/hypergrep/blob/ee85b713aa84e0050a...

    [4]: https://github.com/BurntSushi/ripgrep/blob/master/FAQ.md#why...
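
    One way to avoid the undefined results described in the mention above is to check that input is well-formed UTF-8 before passing it to a matcher compiled in a UTF-8 mode. The block below is a minimal scalar sketch of such a check, assuming the whole input is available in memory. It is not code from hypergrep, Hyperscan, or ripgrep, and the function name is_valid_utf8 is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not part of any library listed here).
// Returns true if `s` is well-formed UTF-8: it rejects stray or missing
// continuation bytes, overlong encodings, UTF-16 surrogates, and code
// points above U+10FFFF.
bool is_valid_utf8(std::string_view s) {
    const auto* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    std::size_t i = 0;

    while (i < n) {
        const unsigned char lead = p[i];
        std::size_t len = 0;     // total bytes in this sequence
        std::uint32_t cp = 0;    // decoded code point
        std::uint32_t min = 0;   // smallest code point allowed for this length

        if (lead < 0x80) {                 // ASCII, one byte
            ++i;
            continue;
        } else if ((lead & 0xE0) == 0xC0) { // 110xxxxx
            len = 2; cp = lead & 0x1F; min = 0x80;
        } else if ((lead & 0xF0) == 0xE0) { // 1110xxxx
            len = 3; cp = lead & 0x0F; min = 0x800;
        } else if ((lead & 0xF8) == 0xF0) { // 11110xxx
            len = 4; cp = lead & 0x07; min = 0x10000;
        } else {
            return false;                  // stray continuation or invalid lead byte
        }

        if (n - i < len) return false;     // sequence truncated at end of input

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = p[i + k];
            if ((c & 0xC0) != 0x80) return false; // not a 10xxxxxx continuation
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min) return false;                   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false; // surrogate range
        if (cp > 0x10FFFF) return false;              // beyond Unicode range

        i += len;
    }
    return true;
}
```

    A caller could run this check on each buffer and skip, or fall back to a byte-oriented search for, any buffer that fails. For large inputs a vectorized validator is likely a better fit; simdutf, listed above, provides simdutf::validate_utf8 for that purpose.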

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

C++ Utf8 related posts

  • Decoding UTF8 with Parallel Extract

    1 project | news.ycombinator.com | 5 May 2024
  • Glibc Buffer Overflow in Iconv

    1 project | news.ycombinator.com | 21 Apr 2024
  • uni-algo: Unicode Algorithms Implementation for C/C++

    1 project | news.ycombinator.com | 25 Mar 2024
  • Vectorizing Unicode conversions on real RISC-V hardware

    1 project | news.ycombinator.com | 27 Jan 2024
  • Cray-1 performance vs. modern CPUs

    4 projects | news.ycombinator.com | 25 Dec 2023
  • [Preprint] Transcoding Unicode Characters with AVX-512 Instructions

    1 project | /r/asm | 29 Mar 2023
  • Why would a language not natively support SIMD?

    1 project | /r/C_Programming | 17 Feb 2023

Index

What are some of the best open-source Utf8 projects in C++? This list will help you:

#  Project             Stars
1  ImGuiColorTextEdit  1,353
2  simdutf               990
3  Rapidcsv              824
4  tiny-utf8             538
5  uni-algo              250
6  hypergrep             163
