Meta AI releases Code Llama 70B

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • codellama

    Inference code for CodeLlama models

  • The GitHub repo [0] hasn't been fully updated, but it links to a paper [1] that describes how the smaller Code Llama models were trained. It's a good guess that this model is similar.

    [0] https://github.com/facebookresearch/codellama

  • can-ai-code

    Self-evaluating interview for AI coders

  • This is a completely fair but open question. Not to be a typical HN user, but when you say SOTA local, the real question is which benchmarks you care about when evaluating: size, operability, complexity, explainability, etc.

    Working out which copilot models perform best has been a deep exercise for me, and it has really made me examine my own coding style: what I find important, and what I look out for when investigating models and evaluating interview candidates.

    I think the three benchmarks & leaderboards most people go to are:

    https://huggingface.co/spaces/bigcode/bigcode-models-leaderb... - the most widely understood, broad language-capability leaderboard, which relies on well-understood evaluations and benchmarks.

    https://huggingface.co/spaces/mike-ravkine/can-ai-code-resul... - Also comprehensive, but primarily assesses Python and JavaScript.

    https://evalplus.github.io/leaderboard.html - which I think is a better take for comparing models you intend to run locally, as you can evaluate performance, operability and size in one visualisation.

    Best of luck and I would love to know which models & benchmarks you choose and why.

  • continue

    ⏩ Open-source VS Code and JetBrains extensions that enable you to easily create your own modular AI software development system

  • Continue doesn’t support tab completion like Copilot yet.

    A pull/merge request is being worked on: https://github.com/continuedev/continue/pull/758

  • text-generation-webui

    A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.

  • You can download it and run it with [this](https://github.com/oobabooga/text-generation-webui). There's an API mode that you could leverage from your VS Code extension.
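The API mode mentioned above can be exercised with nothing but the standard library. A minimal sketch, assuming the server was launched with text-generation-webui's `--api` flag and exposes its OpenAI-compatible `/v1/completions` endpoint on the default port 5000 (the endpoint path and port are assumptions about your local setup, not facts from the comment):

```python
# Hedged sketch: calling a local text-generation-webui server, e.g. from the
# backend of a VS Code extension. Assumes the server was started with --api
# and serves an OpenAI-compatible /v1/completions endpoint on port 5000.
import json
import urllib.request


def build_completion_request(prompt: str, max_tokens: int = 128) -> dict:
    """Build the JSON body for a /v1/completions call."""
    return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.2}


def complete(prompt: str, host: str = "http://127.0.0.1:5000") -> str:
    """POST the prompt to the local server and return the generated text."""
    body = json.dumps(build_completion_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```

From an editor extension you would call `complete()` with the text around the cursor; any HTTP client works, since the payload is plain JSON.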

  • llama.cpp

    LLM inference in C/C++

  • M3 Max is actually less than ideal because its memory bandwidth peaks at 400 GB/s. What you really want is an M1 or M2 Ultra, which offers up to 800 GB/s (for comparison, an RTX 3090 runs at 936 GB/s). A Mac Studio suitable for running 70B models at speeds fast enough for realtime chat can be had for ~$3K.

    The downside of Apple's hardware at the moment is that the training ecosystem is very much focused on CUDA; llama.cpp has an open issue about Metal-accelerated training: https://github.com/ggerganov/llama.cpp/issues/3799 - but no work on it so far. This is likely because training at any significant sizes requires enough juice that it's pretty much always better to do it in the cloud currently, where, again, CUDA is the well-established ecosystem, and it's cheaper and easier for datacenter operators to scale. But, in principle, much faster training on Apple hardware should be possible, and eventually someone will get it done.
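The bandwidth figures above translate directly into rough decode speed: generating each token streams the full set of weights through memory once, so tokens/sec ≈ bandwidth ÷ model size. A back-of-the-envelope sketch (the ~0.5 bytes per parameter for 4-bit quantization is my assumption, not from the comment):

```python
# Rough decode-speed estimate for memory-bandwidth-bound LLM inference:
# each generated token reads every weight once, so
#   tokens/sec ~= memory bandwidth (GB/s) / model size (GB).
def est_tokens_per_sec(bandwidth_gb_s: float,
                       params_billions: float,
                       bytes_per_param: float) -> float:
    model_gb = params_billions * bytes_per_param  # total weight bytes in GB
    return bandwidth_gb_s / model_gb


# 70B model at 4-bit quantization (~0.5 bytes/param, i.e. ~35 GB of weights):
m3_max = est_tokens_per_sec(400, 70, 0.5)   # ~11 tokens/sec
ultra = est_tokens_per_sec(800, 70, 0.5)    # ~23 tokens/sec
```

This is why doubling bandwidth from an M3 Max to an M2 Ultra roughly doubles chat speed, and why a ~35 GB quantized 70B model is comfortably "realtime" on the Ultra but borderline on the Max.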

  • llama

    Inference code for Llama models

  • https://github.com/facebookresearch/llama/pull/947/

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Related posts

  • Continue will generate, refactor, and explain entire sections of code

    2 projects | news.ycombinator.com | 18 Dec 2023
  • VSC Continue.dev with own Rest API

    4 projects | /r/LocalLLaMA | 11 Dec 2023
  • What is your motive for running open-source models, instead of just using a ready-made solution like GPT-4?

    1 project | /r/LocalLLaMA | 10 Dec 2023
  • How helpful are LLMs with MATLAB?

    1 project | /r/matlab | 9 Nov 2023
  • How are people using open source LLMs in production apps?

    2 projects | /r/LocalLLaMA | 27 Oct 2023