SQLite Internals: How the Most Used Database Works

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

InfluxDB - Power Real-Time Data Analytics at Scale
Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
www.influxdata.com
featured
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com
featured
  • sqlite-parser

    An ANTLR4 grammar for SQLite statements. (by bkiers)

  • > ...than it would be to learn the exact syntax and quirks and possibly bugs of someone else's implementation...

    Yup. Also, having deep knowledge of the language is required.

    SQLite's grammar is neat. Creating a compatible parser would make a fun project. Here's a pretty good example: https://github.com/bkiers/sqlite-parser (Actual ANTLR 4 grammar: https://github.com/bkiers/sqlite-parser/blob/master/src/main... )

    Postgres, which tries to be compliant with the latest standards, however...

    SQL-2016 is a beast. Not to mention all the dialects.

    I'm updating my personal (soon to be FOSS) grammar from ANTLR 3 LL(k) to ANTLR 4 ALL().

    I've long had a working knowledge of SQL-92, with some SQL-1999 (eg common table expressions).

    But the new structures and extensions are a bit overwhelming.

    Fortunately, ANTLR project has ~dozen FOSS grammars to learn from. https://github.com/antlr/grammars-v4/tree/master/sql

    They mostly mechanically translate BNFs to LL(k) with some ALL(). Meaning few take advantage of left-recursion. https://github.com/antlr/antlr4/blob/master/doc/left-recursi...

    Honestly, I struggled to understand these grammars. Plus, not being conversant with the SQL-2016 was a huge impediment. Just finding a succinct corbis of test cases was a huge hurdle for me.

    Fortunately, the H2 Database project is a great resource. https://github.com/h2database/h2database/tree/master/h2/src/...

    Now for the exciting conclusion...

    My ANTLR grammar which passes all of H2's tests looks nothing like any of the official or product specific BNFs.

    Further, I found discrepancy between the product specific BNFs and their implementations.

    So a lot of trial & error is required for a "real world" parser. Which would explain why the professional SQL parsing tools charge money.

    I still think creating a parser for SQLite is a great project.

  • grammars-v4

    Grammars written for ANTLR v4; expectation that the grammars are free of actions.

  • > ...than it would be to learn the exact syntax and quirks and possibly bugs of someone else's implementation...

    Yup. Also, having deep knowledge of the language is required.

    SQLite's grammar is neat. Creating a compatible parser would make a fun project. Here's a pretty good example: https://github.com/bkiers/sqlite-parser (Actual ANTLR 4 grammar: https://github.com/bkiers/sqlite-parser/blob/master/src/main... )

    Postgres, which tries to be compliant with the latest standards, however...

    SQL-2016 is a beast. Not to mention all the dialects.

    I'm updating my personal (soon to be FOSS) grammar from ANTLR 3 LL(k) to ANTLR 4 ALL().

    I've long had a working knowledge of SQL-92, with some SQL-1999 (eg common table expressions).

    But the new structures and extensions are a bit overwhelming.

    Fortunately, ANTLR project has ~dozen FOSS grammars to learn from. https://github.com/antlr/grammars-v4/tree/master/sql

    They mostly mechanically translate BNFs to LL(k) with some ALL(). Meaning few take advantage of left-recursion. https://github.com/antlr/antlr4/blob/master/doc/left-recursi...

    Honestly, I struggled to understand these grammars. Plus, not being conversant with the SQL-2016 was a huge impediment. Just finding a succinct corbis of test cases was a huge hurdle for me.

    Fortunately, the H2 Database project is a great resource. https://github.com/h2database/h2database/tree/master/h2/src/...

    Now for the exciting conclusion...

    My ANTLR grammar which passes all of H2's tests looks nothing like any of the official or product specific BNFs.

    Further, I found discrepancy between the product specific BNFs and their implementations.

    So a lot of trial & error is required for a "real world" parser. Which would explain why the professional SQL parsing tools charge money.

    I still think creating a parser for SQLite is a great project.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • ANTLR

    ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

  • > ...than it would be to learn the exact syntax and quirks and possibly bugs of someone else's implementation...

    Yup. Also, having deep knowledge of the language is required.

    SQLite's grammar is neat. Creating a compatible parser would make a fun project. Here's a pretty good example: https://github.com/bkiers/sqlite-parser (Actual ANTLR 4 grammar: https://github.com/bkiers/sqlite-parser/blob/master/src/main... )

    Postgres, which tries to be compliant with the latest standards, however...

    SQL-2016 is a beast. Not to mention all the dialects.

    I'm updating my personal (soon to be FOSS) grammar from ANTLR 3 LL(k) to ANTLR 4 ALL().

    I've long had a working knowledge of SQL-92, with some SQL-1999 (eg common table expressions).

    But the new structures and extensions are a bit overwhelming.

    Fortunately, ANTLR project has ~dozen FOSS grammars to learn from. https://github.com/antlr/grammars-v4/tree/master/sql

    They mostly mechanically translate BNFs to LL(k) with some ALL(). Meaning few take advantage of left-recursion. https://github.com/antlr/antlr4/blob/master/doc/left-recursi...

    Honestly, I struggled to understand these grammars. Plus, not being conversant with the SQL-2016 was a huge impediment. Just finding a succinct corbis of test cases was a huge hurdle for me.

    Fortunately, the H2 Database project is a great resource. https://github.com/h2database/h2database/tree/master/h2/src/...

    Now for the exciting conclusion...

    My ANTLR grammar which passes all of H2's tests looks nothing like any of the official or product specific BNFs.

    Further, I found discrepancy between the product specific BNFs and their implementations.

    So a lot of trial & error is required for a "real world" parser. Which would explain why the professional SQL parsing tools charge money.

    I still think creating a parser for SQLite is a great project.

  • H2

    H2 is an embeddable RDBMS written in Java.

  • > ...than it would be to learn the exact syntax and quirks and possibly bugs of someone else's implementation...

    Yup. Also, having deep knowledge of the language is required.

    SQLite's grammar is neat. Creating a compatible parser would make a fun project. Here's a pretty good example: https://github.com/bkiers/sqlite-parser (Actual ANTLR 4 grammar: https://github.com/bkiers/sqlite-parser/blob/master/src/main... )

    Postgres, which tries to be compliant with the latest standards, however...

    SQL-2016 is a beast. Not to mention all the dialects.

    I'm updating my personal (soon to be FOSS) grammar from ANTLR 3 LL(k) to ANTLR 4 ALL().

    I've long had a working knowledge of SQL-92, with some SQL-1999 (eg common table expressions).

    But the new structures and extensions are a bit overwhelming.

    Fortunately, ANTLR project has ~dozen FOSS grammars to learn from. https://github.com/antlr/grammars-v4/tree/master/sql

    They mostly mechanically translate BNFs to LL(k) with some ALL(). Meaning few take advantage of left-recursion. https://github.com/antlr/antlr4/blob/master/doc/left-recursi...

    Honestly, I struggled to understand these grammars. Plus, not being conversant with the SQL-2016 was a huge impediment. Just finding a succinct corbis of test cases was a huge hurdle for me.

    Fortunately, the H2 Database project is a great resource. https://github.com/h2database/h2database/tree/master/h2/src/...

    Now for the exciting conclusion...

    My ANTLR grammar which passes all of H2's tests looks nothing like any of the official or product specific BNFs.

    Further, I found discrepancy between the product specific BNFs and their implementations.

    So a lot of trial & error is required for a "real world" parser. Which would explain why the professional SQL parsing tools charge money.

    I still think creating a parser for SQLite is a great project.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts

  • Are Functional Programming Languages the best option for Crafting a Compiler?

    4 projects | /r/Compilers | 5 Sep 2021
  • SQL-Parsing

    2 projects | /r/SQL | 25 Jun 2023
  • How should I prepare for AI-driven changes in the industry as a Software Engineering Manager

    2 projects | /r/ExperiencedDevs | 3 May 2023
  • Spring, SchemaSpy DB docs, and GitHub Pages

    8 projects | dev.to | 12 Mar 2023
  • Analysing Github Stars - Extracting and analyzing data from Github using Apache NiFi®, Apache Kafka® and Apache Druid®

    8 projects | dev.to | 11 Jan 2023