Returns the number of tokens in a text — get_token_count • rtiktoken

Returns the number of tokens in a text

Usage

get_token_count(text, model)

Arguments

text: a character string to tokenize; can also be a character vector
model: the model to use for tokenization, given either as a model name (e.g., gpt-4o) or as a tokenizer name (e.g., o200k_base). See also available tokenizers.

Value

an integer vector with the number of tokens in each element of text

See also

model_to_tokenizer(), get_tokens()

Examples

get_token_count("Hello World", "gpt-4o")
#> [1] 2
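
The arguments section says that text can be a vector and that model accepts either a model name or a tokenizer name. The following is a minimal sketch of both cases, assuming the rtiktoken package is installed and that the result has one count per input element. The sample strings and variable names are illustrative only, and no outputs are shown because they were not taken from the original page.

```r
library(rtiktoken)

# Illustrative input: a character vector with two elements
texts <- c("Hello World", "The quick brown fox jumps over the lazy dog")

# Pass a model name; the matching tokenizer is looked up for you
counts_by_model <- get_token_count(texts, "gpt-4o")

# Pass a tokenizer name directly instead of a model name
counts_by_tokenizer <- get_token_count(texts, "o200k_base")

# Assumption: one count is returned per element of `texts`
length(counts_by_model) == length(texts)
```

To check which tokenizer a model name resolves to before counting, model_to_tokenizer() from the See also section is the documented place to look.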