[D] Tokenizers Truncation during Fine-tuning with Large Texts

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning.

  • trl

    Train transformer language models with reinforcement learning.

  • SFTTrainer from Hugging Face

  • llama-recipes

    Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps that showcase Meta Llama 3 for WhatsApp & Messenger.

  • Llama-recipes

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

