Parser generators vs. handwritten parsers: surveying major languages in 2021

llvm-project

354 25,962 10.0 C++

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.

It seems to me that the parsing code in clang is distributed over multiple files which together are way more than 3000 lines: https://github.com/llvm/llvm-project/tree/llvmorg-12.0.1/cla...

Lark

35 4,519 7.5 Python

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.

I know SPARK's docstring use influenced PLY.
PLY doesn't use Earley, but "Earley" does come up in the show notes of an interview with Beazley, PLY's author, at https://www.pythonpodcast.com/episode-95-parsing-and-parsers... . No transcript, and I'm not going to listen to it just to figure out the context.
https://github.com/lark-parser/lark "implements both Earley(SPPF) and LALR(1)".
Kegler, the author of that timeline I linked to, is the author of Marpa. Home page is http://savage.net.au/Marpa.html . The most recent HN comments about it are from a year ago, at https://news.ycombinator.com/item?id=24321395 .

adama-lang

26 104 9.9 Java

A headless spreadsheet document container service.

When I switched from ANTLR to a handwritten parser for Adama (http://www.adama-lang.org/), I felt way better about things. I was able to get sane error messages, and I could better annotate my syntax tree with comments and line/char numbers.
A killer feature for a parser generator would be the ability to auto-generate a pretty printer, which requires stuffing comments into the tree as a "meta token".
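
To make the point about positions and comments concrete, here is a minimal sketch of a handwritten recursive-descent parser in Python. It is not Adama's parser: the toy grammar (integers, `+`, parentheses, `#` comments) and every name in it (`Token`, `Node`, `Parser`, `ParseError`) are assumptions made up for illustration. Each token records its line and column, and comments are carried on the next token so they end up in the tree, where a pretty printer could find them.

```python
import re
from dataclasses import dataclass, field

class ParseError(Exception):
    """Raised with a 'line:col: message' string."""

@dataclass
class Token:
    kind: str   # "num", "op", or "eof"
    text: str
    line: int
    col: int
    comments: list = field(default_factory=list)  # comments seen just before this token

@dataclass
class Node:
    kind: str   # "num" or "add"
    value: object
    line: int
    col: int
    comments: list
    children: tuple = ()

TOKEN_RE = re.compile(
    r"(?P<ws>[ \t]+)|(?P<nl>\n)|(?P<comment>#[^\n]*)|(?P<num>\d+)|(?P<op>[+()])"
)

def tokenize(src):
    tokens, pending, line, col, pos = [], [], 1, 1, 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise ParseError(f"{line}:{col}: unexpected character {src[pos]!r}")
        kind, text = m.lastgroup, m.group()
        if kind == "nl":
            line, col = line + 1, 1
        else:
            if kind == "comment":
                pending.append(text)      # hold until the next real token
            elif kind != "ws":
                tokens.append(Token(kind, text, line, col, pending))
                pending = []
            col += len(text)
        pos = m.end()
    tokens.append(Token("eof", "", line, col, pending))
    return tokens

def describe(tok):
    return repr(tok.text) if tok.text else "end of input"

class Parser:
    def __init__(self, src):
        self.tokens = tokenize(src)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        if tok.kind != "eof":             # never move past the eof token
            self.i += 1
        return tok

    def parse(self):
        node = self.expr()
        tok = self.peek()
        if tok.kind != "eof":
            raise ParseError(f"{tok.line}:{tok.col}: expected end of input, found {describe(tok)}")
        return node

    def expr(self):                       # expr := atom ('+' atom)*
        left = self.atom()
        while self.peek().text == "+":
            op = self.advance()
            right = self.atom()
            left = Node("add", "+", op.line, op.col, op.comments, (left, right))
        return left

    def atom(self):                       # atom := NUM | '(' expr ')'
        tok = self.advance()
        if tok.kind == "num":
            return Node("num", int(tok.text), tok.line, tok.col, tok.comments)
        if tok.text == "(":
            inner = self.expr()
            close = self.advance()
            if close.text != ")":
                raise ParseError(f"{close.line}:{close.col}: expected ')', found {describe(close)}")
            return inner
        raise ParseError(f"{tok.line}:{tok.col}: expected a number or '(', found {describe(tok)}")
```

With this sketch, `Parser("1 + )").parse()` raises a `ParseError` whose message starts with `1:5:`, and parsing `"# total\n1 + 2"` yields a tree whose first leaf carries `["# total"]` in its `comments` field.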

IParse

5 11 3.3 C++

IParse: an interpreting parser written in C++

I implemented an unparse function in IParse, which is not a parser generator but a parser that interprets a grammar. See for example https://github.com/FransFaase/IParse/blob/master/software/c_... where symbols starting with a backslash act as a kind of whitespace terminal during the unparse. For example, \inc increments the indentation, while \dec decrements it. The \s indicates that a space should be included at the given location.
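
The idea of layout directives embedded in the grammar can be sketched in a few lines of Python. This is not IParse's implementation or its grammar format: the `render` function, the flat list input, and the `\nl` newline directive are assumptions introduced for illustration. Only `\inc`, `\dec` and `\s` come from the comment above.

```python
def render(items, indent_unit="    "):
    """Turn a flat sequence of tokens and layout directives into text.

    Directives (only the first three are named in the comment above;
    the newline directive is a hypothetical addition for this sketch):
      \\inc  increase the indentation level
      \\dec  decrease the indentation level
      \\s    emit a single space
      \\nl   end the current line
    """
    lines = []           # finished lines
    parts = []           # pieces of the line being built
    depth = 0            # current indentation level
    for item in items:
        if item == r"\inc":
            depth += 1
        elif item == r"\dec":
            depth = max(depth - 1, 0)
        elif item == r"\s":
            parts.append(" ")
        elif item == r"\nl":
            lines.append("".join(parts).rstrip())
            parts = []
        else:
            if not parts:                      # first piece on this line
                parts.append(indent_unit * depth)
            parts.append(item)
    if parts:
        lines.append("".join(parts).rstrip())
    return "\n".join(lines)


# Usage: the directives travel alongside the ordinary tokens.
items = ["if", r"\s", "(", "x", ")", r"\s", "{", r"\inc", r"\nl",
         "y", "++", ";", r"\dec", r"\nl", "}"]
print(render(items))
```

The indentation level only takes effect when the first token of a new line is written, which is why `\inc` can appear before the line break and `\dec` before the closing brace.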

ruby

183 21,592 10.0 Ruby

The Ruby Programming Language

The Ruby yacc file is scary to look at: 13+ thousand lines in a single file.
Would it be better as a hand-rolled parser, where they could have abstracted and organized some things, or does it all make sense in its current format if you are familiar with it?
https://github.com/ruby/ruby/blob/v3_0_2/parse.y

nearley

3 3,557 0.0 JavaScript

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.

Fast Parse

4 1,079 4.6 Scala

Writing Fast Parsers Fast in Scala

Agreed! I would say that parser combinators are the sweet spot and the right choice in most cases.
Scala has them as well, e.g.: https://com-lihaoyi.github.io/fastparse/
And the good thing is, you don't have to learn a completely new language/syntax, you can use the host language's syntax and you have full IDE support as well.
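
A minimal sketch of the parser-combinator idea, written here in Python rather than Scala, shows why no separate grammar language is needed. It does not use or imitate the fastparse API: every name (`regex`, `seq`, `alt`, `many`, `fmap`, `lazy`, `evaluate`) is an assumption chosen for illustration. A parser is just a function from `(text, pos)` to `(value, new_pos)` or `None`, and grammars are built by composing such functions.

```python
import re

def regex(pattern):
    compiled = re.compile(pattern)
    def parser(text, pos):
        m = compiled.match(text, pos)
        return (m.group(), m.end()) if m else None
    return parser

def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parser(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parser

def alt(*parsers):
    """Try each parser in order; return the first success."""
    def parser(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parser

def many(p):
    """Apply p zero or more times, collecting the results."""
    def parser(text, pos):
        values = []
        while True:
            result = p(text, pos)
            if result is None or result[1] == pos:   # stop on failure or no progress
                return values, pos
            value, pos = result
            values.append(value)
    return parser

def fmap(p, fn):
    """Transform the value produced by p."""
    def parser(text, pos):
        result = p(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parser

def lazy(thunk):
    """Defer lookup of a parser so that grammars can be recursive."""
    return lambda text, pos: thunk()(text, pos)

def token(pattern):
    """Match pattern after skipping leading whitespace."""
    return fmap(seq(regex(r"\s*"), regex(pattern)), lambda vs: vs[1])

# Grammar: expr := atom ('+' atom)*    atom := NUMBER | '(' expr ')'
number = fmap(token(r"\d+"), int)
parens = fmap(seq(token(r"\("), lazy(lambda: expr), token(r"\)")), lambda vs: vs[1])
atom = alt(number, parens)
expr = fmap(seq(atom, many(seq(token(r"\+"), atom))),
            lambda vs: vs[0] + sum(v for _, v in vs[1]))
whole = fmap(seq(expr, regex(r"\s*$")), lambda vs: vs[0])

def evaluate(text):
    result = whole(text, 0)
    if result is None:
        raise ValueError(f"cannot parse {text!r}")
    return result[0]
```

For example, `evaluate("1 + (2 + 3) + 4")` returns `10`. The grammar itself is ordinary Python values and function calls, which is what gives editor and IDE support for free.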

dmd

148 2,900 9.9 D

dmd D Programming Language compiler

Just read the code for an existing one like:
https://github.com/dlang/dmd/blob/master/src/dmd/cparse.d
which is a C parser. It's not hard to follow.

Crate

6 3,970 9.9 Java

CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.
Tags: Parsing, Parser, parsing-library, Language, HacktoberFest
Post date: 21 Aug 2021
