Tokenizers
Adapted from the Griptape AI Framework documentation.
Overview
Tokenizers are used throughout Gen AI Builder to calculate the number of tokens in a piece of text. They are particularly useful for ensuring that LLM token limits are not exceeded.
Tokenizers are a low-level abstraction that you will rarely interact with directly.
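For example, you can check how much of a model's input budget a prompt would leave before sending it. The sketch below is a minimal illustration using the OpenAiTokenizer shown in the next section; the fits_in_context helper is hypothetical, not part of the library.
from griptape.tokenizers import OpenAiTokenizer

def fits_in_context(tokenizer: OpenAiTokenizer, text: str) -> bool:
    # count_input_tokens_left is floored at 0 when the text meets or exceeds
    # the model's input limit, so a positive value means the prompt fits.
    return tokenizer.count_input_tokens_left(text) > 0

tokenizer = OpenAiTokenizer(model="gpt-4.1")
print(fits_in_context(tokenizer, "Hello world!"))  # True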
Tokenizers
OpenAI
from griptape.tokenizers import OpenAiTokenizer

tokenizer = OpenAiTokenizer(model="gpt-4.1")

# Number of tokens in the text.
print(tokenizer.count_tokens("Hello world!"))
# Input tokens remaining after accounting for the text.
print(tokenizer.count_input_tokens_left("Hello world!"))
# Output tokens remaining after accounting for the text.
print(tokenizer.count_output_tokens_left("Hello world!"))
3
127989
4093
Cohere
import os

from cohere import Client
from griptape.tokenizers import CohereTokenizer

tokenizer = CohereTokenizer(model="command", client=Client(os.environ["COHERE_API_KEY"]))

print(tokenizer.count_tokens("Hello world!"))
print(tokenizer.count_input_tokens_left("Hello world!"))
print(tokenizer.count_output_tokens_left("Hello world!"))
3
4093
4093
Anthropic
from griptape.tokenizers import AnthropicTokenizer

tokenizer = AnthropicTokenizer(model="claude-3-opus-20240229")

print(tokenizer.count_tokens("Hello world!"))
print(tokenizer.count_input_tokens_left("Hello world!"))
print(tokenizer.count_output_tokens_left("Hello world!"))
10
199990
4086
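Note that token counts are provider-specific: for the same "Hello world!" text, the OpenAI tokenizer above reports 3 tokens while Anthropic's reports 10, since each provider uses its own vocabulary and counting method.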
Google
import os

from griptape.tokenizers import GoogleTokenizer

tokenizer = GoogleTokenizer(model="gemini-2.0-flash", api_key=os.environ["GOOGLE_API_KEY"])

print(tokenizer.count_tokens("Hello world!"))
print(tokenizer.count_input_tokens_left("Hello world!"))
print(tokenizer.count_output_tokens_left("Hello world!"))
3
1048573
8189
Hugging Face
from griptape.tokenizers import HuggingFaceTokenizer

tokenizer = HuggingFaceTokenizer(
    model="sentence-transformers/all-MiniLM-L6-v2",
    max_output_tokens=512,
)

print(tokenizer.count_tokens("Hello world!"))
print(tokenizer.count_input_tokens_left("Hello world!"))
print(tokenizer.count_output_tokens_left("Hello world!"))
5
507
507
Amazon Bedrock
from griptape.tokenizers import AmazonBedrockTokenizer

tokenizer = AmazonBedrockTokenizer(model="amazon.titan-text-express-v1")

print(tokenizer.count_tokens("Hello world!"))
print(tokenizer.count_input_tokens_left("Hello world!"))
print(tokenizer.count_output_tokens_left("Hello world!"))
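Because every tokenizer above exposes the same three counting methods, code can be written against the shared interface and accept any of them. A minimal sketch, assuming BaseTokenizer is the common base class exported by griptape.tokenizers; the report helper is hypothetical.
from griptape.tokenizers import AnthropicTokenizer, BaseTokenizer, OpenAiTokenizer

def report(tokenizer: BaseTokenizer, text: str) -> None:
    # Works with any provider's tokenizer, since they share one interface.
    print(type(tokenizer).__name__, tokenizer.count_tokens(text))

for tokenizer in [
    OpenAiTokenizer(model="gpt-4.1"),
    AnthropicTokenizer(model="claude-3-opus-20240229"),
]:
    report(tokenizer, "Hello world!")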