Launch HN: Patterns (YC S21) – A much faster way to build and deploy data apps

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • patterns-components

    Patterns Open-source Components

  • Hey HN, I’m Ken, co-founder of Patterns (https://www.patterns.app/) with my friend Chris. Patterns gets rid of repetitive gruntwork when building and deploying data applications. We abstract away the micro-management of compute, storage, orchestration, and visualization, letting you focus on your specific app’s logic. Our goal is to give you a 10x productivity boost when building these things. Basically, we’re Heroku for AI apps. There’s a demo video here: https://www.patterns.app/videos/homepage/demo4k.mp4.

    We built Patterns because of our frustration trying to ship data and AI projects. We are data scientists and engineers and have built data stacks over the past 10 years for a wide variety of companies—from small startups to large enterprises across FinTech, Ecommerce, and SaaS. In every situation, we’ve been let down by the tools available in the market.

    Every data team spends immense time and resources reinventing the wheel because none of the existing tools work end-to-end (and getting 5 different tools to work together properly is almost as much work as writing them all yourself). ML tools focus on just modeling; notebook tools are brittle, hard to maintain, and don’t help with ETL or operationalization; and orchestration tools don’t integrate well with the development process.

    As a result, when we worked on data applications—things like a trading bot side-project, a risk scoring model at a startup, and a PLG (product-led growth) automation at a big company—we spent 90% of our time doing things that weren’t specific to the app itself: getting and cleaning data, building connections to external systems and software, and orchestrating and productionizing. We built Patterns to address these issues and make developing data and AI apps a much better experience.

    At its core, Patterns is a reactive (i.e. automatically updating) graph architecture with powerful node abstractions: Python, SQL, Table, Chart, Webhook, etc. You build your app as a graph using the node types that make sense, and write whatever custom code you need to implement your specific app.

    We built this architecture for modularity, composability, and testability, with structurally-typed data interfaces. This lets you build and deploy data automations and pipelines quickly and safely. You write and add your own code as you need it, taking advantage of a library of forkable open-source components—see https://www.patterns.app/marketplace/components and https://github.com/patterns-app/patterns-components.git .

    Patterns apps are fully defined by files and code, so you can check them into Git the same way you would anything else—but we also provide an editable UI representation for each app. You work at either level, depending on what’s convenient, and your changes propagate automatically to the other level with two-way consistency.

    One surprising thing we’ve learned while building this is that the problem actually gets simpler when you broaden the scope. Individual parts of the data stack that are huge challenges in isolation—data observability, lineage, versioning, error handling, productionizing—become much easier when you have a unified “operating system”.

    Our customers include SaaS and ecommerce co’s building customer data platforms, fintech companies building lending and risk engines, and AI companies building prompt engineering pipelines.

    Here are some apps we think you might like and can clone:

  • dcp

    Universal data copy (by kvh)

  • 3. Sales lead enrichment, scoring, and routing: (https://studio.patterns.app/graph/9e11ml5wchab3r9167kk/lead-...

    Oh, and we have two Hacker News specials. Our Getting Started Tutorial features a Hacker News semantic search and alerting bot (https://www.patterns.app/docs/quick-start). We also built a template app that uses an LLM from Cohere.ai to classify HN stories into categories like AI, Programming, Crypto, etc. (https://studio.patterns.app/graph/n996ii6owwi5djujyfki/hn-co...).

    Long-term, we want to build a collaborative ecosystem of reusable components and apps. To enable this, we’ve created abstractions over both data infrastructure (https://github.com/kvh/dcp.git) and “structurally-typed data interfaces” (https://github.com/kvh/common-model.git), along with a protocol for running data operations in Python or SQL (other languages soon) in a standard way across any cloud database or compute engine.

    Thanks for reading this—we hope you’ll take a look! Patterns is an idea I’ve had in my head for over a decade now, and I feel blessed to have the chance to finally build it out with the best co-founder on the planet (thanks Chris!) and a world-class engineering team.

    We’re still early beta and have a long road ahead, but we’re ready to be tried and eager for your feedback!
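
To make the "reactive graph" idea from the post more concrete, here is a minimal sketch of a graph whose downstream nodes re-run automatically when an upstream value changes. It is only an illustration of the general technique, not Patterns' implementation: the `ReactiveGraph` class, its methods, and the node names are all hypothetical, and propagation is a naive depth-first walk with no scheduling or deduplication.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Sequence


class ReactiveGraph:
    """Toy reactive graph (hypothetical, not the Patterns API)."""

    def __init__(self) -> None:
        self.funcs: Dict[str, Callable[..., Any]] = {}
        self.inputs: Dict[str, List[str]] = {}
        self.downstream: Dict[str, List[str]] = defaultdict(list)
        self.values: Dict[str, Any] = {}

    def add_node(self, name: str, func: Callable[..., Any],
                 inputs: Sequence[str] = ()) -> None:
        """Register a computed node and the names of the nodes it reads."""
        self.funcs[name] = func
        self.inputs[name] = list(inputs)
        for upstream in inputs:
            self.downstream[upstream].append(name)

    def set_value(self, name: str, value: Any) -> None:
        """Write a source value, then re-run everything that depends on it."""
        self.values[name] = value
        self._propagate(name)

    def _propagate(self, name: str) -> None:
        # Naive depth-first propagation: a node runs once all inputs exist.
        for child in self.downstream[name]:
            deps = self.inputs[child]
            if all(dep in self.values for dep in deps):
                args = [self.values[dep] for dep in deps]
                self.values[child] = self.funcs[child](*args)
                self._propagate(child)


graph = ReactiveGraph()
graph.add_node("cleaned", lambda rows: [r for r in rows if r["amount"] > 0],
               inputs=["raw_orders"])
graph.add_node("total", lambda rows: sum(r["amount"] for r in rows),
               inputs=["cleaned"])

graph.set_value("raw_orders", [{"amount": 10}, {"amount": -3}, {"amount": 5}])
print(graph.values["total"])  # 15
```

Writing a new value to `raw_orders` recomputes `cleaned` and then `total` without any explicit orchestration call, which is the behavior the post calls "automatically updating".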

  • common-model
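
The post describes common-model as an abstraction for "structurally-typed data interfaces". As a rough illustration of what structural typing can mean for tabular data, the sketch below accepts any rows that have the expected fields with the expected types, regardless of where they came from. This is not common-model's actual API: `Schema`, `conforms`, `TRANSACTION`, and the sample rows are assumptions made up for the example.

```python
from typing import Any, Dict, List

# A schema here is just field name -> expected Python type (hypothetical).
Schema = Dict[str, type]


def conforms(rows: List[Dict[str, Any]], expected: Schema) -> bool:
    """Return True if every row has each expected field with the expected type.

    Extra fields are allowed: only the shape of the data matters,
    not a declared name or source.
    """
    return all(
        field in row and isinstance(row[field], field_type)
        for row in rows
        for field, field_type in expected.items()
    )


TRANSACTION: Schema = {"id": str, "amount": float}

payment_rows = [{"id": "p_1", "amount": 12.5, "currency": "usd"}]
bad_rows = [{"id": "p_2", "amount": "12.5"}]  # amount is a string

print(conforms(payment_rows, TRANSACTION))  # True
print(conforms(bad_rows, TRANSACTION))      # False
```

A component written against `TRANSACTION` could then accept output from any upstream node whose rows have that shape, which is one way to get the composability the post mentions.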

  • orchest

    Build data pipelines, the easy way 🛠️

  • First want to say congrats to the Patterns team for creating a gorgeous looking tool. Very minimal and approachable. Massive kudos!

    Disclaimer: we're building something very similar and I'm curious about a couple of things.

    One of the questions our users have asked us often is how to minimize the dependence on "product specific" components/nodes/steps. For example, if you write CI for GitHub Actions you may use a bunch of GitHub Action references.

    Looking at the `graph.yml` in some of the examples you shared you use a similar approach (e.g. patterns/openai-completion@v4). That means that whenever you depend on such components your automation/data pipeline becomes more tied to the specific tool (GitHub Actions/Patterns), effectively locking in users.

    How are you helping users feel comfortable with that problem (I don't want to invest in something that's not portable)? It's something we've struggled with ourselves as we're expanding the "out of the box" capabilities you get.

    Furthermore, would have loved to see this as an open source project. But I guess the second best thing to open source is some open source contributions and `dcp` and `common-model` look quite interesting!

    For those who are curious, I'm one of the authors of https://github.com/orchest/orchest

  • windmill

    Open-source developer platform to turn scripts into workflows and UIs. Fastest workflow engine (5x vs Airflow). Open-source alternative to Airplane and Retool.

  • I am working on something adjacent to this problem. We focus less on data pipelines and more on automation, but we also have an abstraction for flows that one can use to build data pipelines. The lock-in issue was something I thought a lot about, and I ended up deciding that our generic steps should just be plain code in TypeScript/Python/Go/Bash; the only requirement is that those code snippets have a main function and return a result. We built https://hub.windmill.dev, where users can share their scripts directly, and we have a team of moderators who approve the ones to integrate directly into the main product. The goal with those snippets is that they are generic enough to be reused outside of Windmill, and the Python ones might even work straight out of the box for orchest.

    nb: author of https://github.com/windmill-labs/windmill
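
As an illustration of the convention described in this comment, a portable step can be nothing more than a module with a `main` function that takes parameters and returns a result. The sketch below assumes only what the comment states; the parameter names and the returned dictionary are hypothetical, and nothing in it imports or depends on a specific tool.

```python
def main(name: str = "world", repeat: int = 1) -> dict:
    """A plain-code step: typed parameters in, a JSON-friendly result out."""
    greeting = " ".join([f"Hello, {name}!"] * repeat)
    return {"greeting": greeting, "length": len(greeting)}


if __name__ == "__main__":
    # Any runner (or a developer at a terminal) can call the step directly.
    print(main("HN", repeat=2))
```

Because the step is an ordinary function, it can be unit-tested or called from another orchestrator without modification, which is the portability argument the comment makes.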

  • getting-started

    This repository is a getting started guide to Singer. (by singer-io)

  • Thanks for chipping in.

    I’ve been leaning towards this direction. I think I/O is the biggest part that still needs fixing in the case of plain code steps: input being data/stream plus parameterization/config, and output being some sort of typed data/stream.

    My “let’s not reinvent the wheel” alarm is going off when I write that, though. Examples that come to mind are text-based (Unix / https://scale.com/blog/text-universal-interface) but also the Singer tap protocol (https://github.com/singer-io/getting-started/blob/master/doc...). And config obviously has many standard forms, like ini, yaml, json, environment key-value pairs, and more.

    At the same time, text feels horribly inefficient as encoding for some of the data objects being passed around in these flows. More specialized and optimized binary formats come to mind (Arrow, HDF5, Protobuf).

    Plenty of directions to explore, each with their own advantages and disadvantages. I wonder which direction is favored by users of tools like ours. Will be good to poll (do they even care?).

    PS Windmill looks equally impressive! Nice job
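
For readers unfamiliar with the Singer approach mentioned above, here is a minimal sketch of a step that writes its output as newline-delimited JSON messages. It follows the Singer spec's SCHEMA, RECORD, and STATE message types as commonly documented; the `emit_stream` helper, the `users` stream, and the contents of the STATE value are assumptions for illustration, not part of any tool discussed in the thread.

```python
import io
import json
from typing import Any, Dict, Iterable, List, TextIO


def write_message(message: Dict[str, Any], out: TextIO) -> None:
    """Write one message as a single line of JSON."""
    out.write(json.dumps(message) + "\n")


def emit_stream(stream: str, schema: Dict[str, Any], key_properties: List[str],
                records: Iterable[Dict[str, Any]], out: TextIO) -> None:
    """Emit a SCHEMA message, one RECORD per row, then a STATE bookmark."""
    write_message({"type": "SCHEMA", "stream": stream, "schema": schema,
                   "key_properties": key_properties}, out)
    count = 0
    for record in records:
        write_message({"type": "RECORD", "stream": stream, "record": record}, out)
        count += 1
    # STATE values are free-form; a row count is used here as a stand-in.
    write_message({"type": "STATE", "value": {stream: count}}, out)


buffer = io.StringIO()
emit_stream(
    stream="users",
    schema={"type": "object", "properties": {"id": {"type": "integer"},
                                             "name": {"type": "string"}}},
    key_properties=["id"],
    records=[{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}],
    out=buffer,
)
message_types = [json.loads(line)["type"] for line in buffer.getvalue().splitlines()]
print(message_types)  # ['SCHEMA', 'RECORD', 'RECORD', 'STATE']
```

The schema travels with the data, so a downstream step gets typed input, but every record is still serialized as text, which is the inefficiency the comment raises when it points to Arrow, HDF5, and Protobuf.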

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
