anthropic-tokenizer VS llm_utils

Compare anthropic-tokenizer vs llm_utils and see what are their differences.

anthropic-tokenizer

Approximation of the Claude 3 tokenizer by inspecting the generation stream (by javirandor)

llm_utils

Utilities for llama.cpp, OpenAI, Anthropic, mistral-rs. (by ShelbyJenkins)
                 anthropic-tokenizer   llm_utils
Mentions         3                     2
Stars            87                    24
Growth           -                     -
Activity         7.8                   6.0
Last commit      6 days ago            22 days ago
Language         Python                Rust
License          MIT License           MIT License
The number of mentions indicates the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

anthropic-tokenizer

Posts with mentions or reviews of anthropic-tokenizer. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-06-17.

llm_utils

Posts with mentions or reviews of llm_utils. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-06-17.
  • Show HN: Token price calculator for 400+ LLMs
    12 projects | news.ycombinator.com | 17 Jun 2024
    > tiktoken.encoding_for_model(model)

    Calling this where model == 'gpt-4o' will encode with o200k_base, no?

    But yes, I do agree with you. I had a hard time implementing non-tiktoken tokenizers for my project, and I ended up manually adding tokenizer.json files to my repo.[1] The other option is downloading from HF, but the official repos where a model's tokenizer.json lives require agreeing to their terms to access, so the user needs an HF key and has to accept the terms. Not a good experience for a consumer of the package.

    > Message frame tokens?

    Do you mean the chat template tokens? Oh, that's another good point. Yeah, it only counts OpenAI prompt tokens. I solved this by implementing a Jinja templating engine to create the full prompt. [2] Granted, both llama.cpp and mistral-rs do this on the backend, so it's purely for counting tokens. I guess it would make sense to add a function to convert tokens to dollars.

    [1] https://github.com/ShelbyJenkins/llm_utils/tree/main/src/mod...
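The comment above describes two steps: render the full chat prompt (including the template's framing tokens), count its tokens, then convert the count to a price. A minimal Python sketch of the counting-and-pricing half follows; the per-message overhead constants echo the commonly cited OpenAI cookbook approximation, and the whitespace tokenizer is an illustrative stand-in, not the llm_utils implementation.

```python
# Assumed framing overhead per chat message and per reply priming;
# these constants are an approximation, not official values.
TOKENS_PER_MESSAGE = 3  # framing tokens wrapped around each message
TOKENS_PER_REPLY = 3    # priming tokens for the assistant's reply

def count_prompt_tokens(messages, count_text_tokens):
    """Approximate total prompt tokens for a list of chat messages."""
    total = TOKENS_PER_REPLY
    for msg in messages:
        total += TOKENS_PER_MESSAGE
        total += count_text_tokens(msg["role"])
        total += count_text_tokens(msg["content"])
    return total

def tokens_to_dollars(n_tokens, usd_per_million_tokens):
    """Convert a token count to cost, given a price per 1M tokens."""
    return n_tokens * usd_per_million_tokens / 1_000_000

# Stand-in tokenizer: split on whitespace. A real implementation would
# use tiktoken or the model's bundled tokenizer.json.
naive_count = lambda text: len(text.split())

msgs = [{"role": "user", "content": "hello world"}]
n = count_prompt_tokens(msgs, naive_count)
print(n)  # 3 reply + 3 framing + 1 role + 2 content = 9
print(tokens_to_dollars(n, 5.0))
```

Swapping `naive_count` for a real tokenizer's encode-and-count function is all that changes for production use; the framing constants would need to be calibrated per model family.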

What are some alternatives?

When comparing anthropic-tokenizer and llm_utils you can also consider the following projects:

tokencost - Easy token price estimates for 400+ LLMs

