Transforming free-form geospatial directions into addresses - SOTA?

This page summarizes the projects mentioned and recommended in the original post on /r/LanguageTechnology.

  • libpostal

    A C library for parsing/normalizing street addresses around the world. Powered by statistical NLP and open geo data.

  • I know of https://github.com/openvenues/libpostal, which handles typos and omissions in addresses, but I am looking at fuzzier descriptions of a location.

  • duckling

    Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.

  • To understand what relative distance and direction are indicated from the reference point, I'd look into something like Facebook & Wit.AI's Duckling, plus a custom classifier to identify whether the location is at the reference point ("corner of") or some distance from it ("200 meters southwest"). If you can parse out a distance and direction, then plotting the point is just logic.

  • spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

  • If you've got a specific area you're looking at, and already have street data, you could:

    1. Follow the ArcGIS blog's directions, creating intersection features.
    2. Train a classifier (or a specific NER entity type; spaCy would be a good package for that) on the types of cross-street references you're finding in your text. You can see some of the relevant tokens in the examples you provided ("corner of", "along", and I'd imagine "intersection of", etc.). Even simple string lookups could help you bootstrap the training data.
    3. Use some sort of embedding similarity to compare the hit terms to potential cross-streets.
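
The two suggestions above (simple string lookups for cues like "corner of", and "all logic" once a distance and direction are parsed) can be sketched with the Python standard library alone. This is a minimal illustration, not the commenters' implementation: the regular expressions, the function names `parse_description` and `destination_point`, and the eight-point compass table are all assumptions made for the example, and a trained classifier or a tool such as Duckling would replace the hand-written patterns in practice. The offset is computed with the standard great-circle destination formula on a spherical Earth, which should be adequate for offsets of a few hundred meters.

```python
import math
import re

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

# Compass words mapped to bearings in degrees clockwise from north.
BEARINGS = {
    "north": 0.0, "northeast": 45.0, "east": 90.0, "southeast": 135.0,
    "south": 180.0, "southwest": 225.0, "west": 270.0, "northwest": 315.0,
}

# Hypothetical bootstrap patterns (assumptions, not taken from any library).
OFFSET_RE = re.compile(
    r"(?P<dist>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>kilometers?|kilometres?|km|meters?|metres?|m)\s+"
    r"(?P<dir>northeast|northwest|southeast|southwest|north|south|east|west)"
    r"\s+of\s+(?P<ref>.+)",
    re.IGNORECASE,
)
CORNER_RE = re.compile(
    r"(?:corner|intersection)\s+of\s+(?P<a>.+?)\s+(?:and|&)\s+(?P<b>.+)",
    re.IGNORECASE,
)


def parse_description(text):
    """Classify a description as an offset from a reference or an intersection."""
    m = OFFSET_RE.search(text)
    if m:
        dist = float(m["dist"])
        if m["unit"].lower().startswith("k"):
            dist *= 1000  # convert kilometers to meters
        return {
            "kind": "offset",
            "distance_m": dist,
            "bearing_deg": BEARINGS[m["dir"].lower()],
            "reference": m["ref"].strip(),
        }
    m = CORNER_RE.search(text)
    if m:
        return {"kind": "intersection",
                "streets": (m["a"].strip(), m["b"].strip())}
    return None  # no known cue found


def destination_point(lat, lon, distance_m, bearing_deg):
    """Move distance_m from (lat, lon) along bearing_deg; returns degrees."""
    phi1, lam1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance in radians
    phi2 = math.asin(math.sin(phi1) * math.cos(delta)
                     + math.cos(phi1) * math.sin(delta) * math.cos(theta))
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lam2)
```

In use, the reference string returned by `parse_description` would first be geocoded (for example via the intersection features from step 1) to get a latitude and longitude, which are then passed to `destination_point` together with the parsed distance and bearing. The sketch does not normalize longitudes that cross the antimeridian.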
