OSS framework for voice first multimodal assistants

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • bolna

    End-to-end platform for building voice first multimodal agents

  • Demo (https://www.youtube.com/watch?v=OSrOmyR7oQs)

    1. Open-source orchestration: We're open-sourcing our orchestration so you can quickly set up and create LLM-based, voice-driven conversational applications: https://github.com/bolna-ai/bolna/

    2. Hosted API platform: We're exposing our managed solution via APIs for building voice-driven applications: https://docs.bolna.dev/api-reference/introduction

    3. Normal LLM telemetry tools don't give visibility into audio bytes going in and out of the system across multiple models, so we've built our own observability layer, fully integrated with the dashboard.

    4. Three modes for creating agents: Lite (intent-classification based; useful for basic calls and really pocket-friendly), Normal (<2 sec latency, but only one LLM call, so it's cheaper than Nitro), and Nitro (<1 sec latency, but multiple LLM calls, so it's really expensive).

    5. Follow-up tasks like webhook integration, summarisation, and extraction.

    6. Modular and extensible architecture, which means connecting two different LLMs on parallel paths (for example, code and English, to automate LeetCode screening interviews) is really easy, although you'll initially need some hacking until we're able to release that to both the hosted and open-source versions.

    7. Vector and scalar caches to reduce the overall cost (by 3x compared to Deepgram + Mixtral + ElevenLabs) and latency by 300-500 ms.
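
The post does not say how the vector and scalar caches in item 7 are implemented. As a rough illustration only, the sketch below assumes "scalar" means an exact-match lookup on normalized text and "vector" means a nearest-neighbour lookup over embeddings. The names `ResponseCache`, `toy_embed`, and the 0.8 threshold are hypothetical, and the bag-of-words embedding is a stand-in for a real embedding model; none of this is taken from the Bolna codebase.

```python
import math
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words count vector (not a real model)."""
    return Counter(normalize(text).split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ResponseCache:
    """Hypothetical two-tier cache: exact match first, then similarity."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # assumed similarity cut-off
        self.scalar = {}            # normalized text -> response
        self.vectors = []           # list of (embedding, response)

    def put(self, text: str, response: str) -> None:
        self.scalar[normalize(text)] = response
        self.vectors.append((toy_embed(text), response))

    def get(self, text: str):
        """Return (response, tier), where tier is 'scalar', 'vector', or 'miss'."""
        key = normalize(text)
        if key in self.scalar:
            return self.scalar[key], "scalar"
        query = toy_embed(text)
        best_score, best_response = 0.0, None
        for emb, resp in self.vectors:
            score = cosine(query, emb)
            if score > best_score:
                best_score, best_response = score, resp
        if best_response is not None and best_score >= self.threshold:
            return best_response, "vector"
        return None, "miss"
```

In a setup like this, a cache hit would skip the LLM call (and possibly the TTS call) for repeated or near-repeated utterances, which is one plausible source of the cost and latency savings the post mentions. Only a miss would go through the full pipeline.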

    Over the next few weeks we'll be doing a lot of small releases here, starting with a speech-language-model-only pipeline integrating Gazelle.

    We'd love to welcome you guys to our community, get your feedback, and together build "LangChain for voice-first AI assistants".


Related posts

  • Show HN: An Open source platform for building voice first multimodal agents

    1 project | news.ycombinator.com | 15 May 2024
  • Open source projects for connecting LLM + TTS and STT

    1 project | news.ycombinator.com | 1 Apr 2024
  • Show HN: OSS voice based conversational API with <1sec latency and other nuances

    1 project | news.ycombinator.com | 27 Mar 2024
  • RAG, fine-tuning, API calling and gptscript for Llama 3 running locally

    2 projects | news.ycombinator.com | 24 May 2024
  • Systematically Improving Your RAG

    1 project | news.ycombinator.com | 22 May 2024