Converts text to tokens

Usage

get_tokens(text, model)

Arguments

text

A character string to encode into tokens; may also be a character vector.

model

The model to use for tokenization: either a model name (e.g., gpt-4o) or a tokenizer name (e.g., o200k_base). See also available tokenizers.

Value

An integer vector of tokens for the given text.

Examples

get_tokens("Hello World", "gpt-4o")
#> [1] 13225  5922
get_tokens("Hello World", "o200k_base")
#> [1] 13225  5922
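
# A further sketch, not part of the original examples: it uses only
# get_tokens() and base R, and assumes the package is already attached.
# Because the result is an integer vector, length() gives the token count.
tokens <- get_tokens("Hello World", "gpt-4o")
length(tokens)
#> [1] 2
# Going by the outputs above, the model name gpt-4o and the tokenizer name
# o200k_base produce the same tokens for this input.
identical(tokens, get_tokens("Hello World", "o200k_base"))
#> [1] TRUE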