Token to Character Mapping Guideline for Now Assist (LLM Model)

Issue

How many characters are there per token when sending a request to the Now Assist LLM model?

Release

All releases

Resolution

Tokens are units of measurement representing the amount of information in both the request and response of a large language model (LLM).

Each LLM has a maximum token limit, which is the sum of request tokens, response tokens, and buffer tokens. This limit is set by the model provider.

When making a request to the LLM, the prompt and context are tokenized to determine the number of request tokens consumed. If the request exceeds the system's limit, some information may be truncated.

There is no fixed token-per-character ratio for the Now Assist LLM because it uses subword tokenization, where a single character can be a token or part of a larger token. For example, a common word like "token" might be represented by a single token, while "summarization" could be broken into multiple tokens. Instead of a direct character-to-token conversion, a prompt is tokenized into a series of tokens that represent words, sub-words, and punctuation. The LLM then processes these tokens to generate a response, which also consists of a new set of tokens.

The platform uses the script include named "TokenCalculator" to calculate the number of tokens:

<instance>/nav_to.do?uri=sys_script_include.do?sys_id=434703e2c3632110634cdb450a40dd8c

This API can be used as a guideline for the chunking/truncation strategy used in the platform. See the function getTokenCount, which calls the function _calculateTokenNowLLM when the provider is specified as the NOW LLM model.
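As a toy illustration of the subword behavior described above, the following sketch applies greedy longest-match tokenization over a tiny, invented vocabulary. This is not the platform's implementation; real LLM tokenizers (for example, BPE-based ones) use large learned vocabularies.

```javascript
// Toy subword tokenizer: greedy longest-match against a made-up vocabulary.
// Illustration only; the vocabulary and function are hypothetical.
var vocab = ['token', 'summar', 'ization'];

function toyTokenize(word) {
    var tokens = [];
    var i = 0;
    while (i < word.length) {
        var match = null;
        // Try the longest remaining substring first.
        for (var len = word.length - i; len > 0 && !match; len--) {
            var piece = word.slice(i, i + len);
            if (vocab.indexOf(piece) !== -1) {
                match = piece;
            }
        }
        // Unknown characters fall back to single-character tokens.
        if (!match) {
            match = word.charAt(i);
        }
        tokens.push(match);
        i += match.length;
    }
    return tokens;
}

console.log(toyTokenize('token'));         // ['token']             -> 1 token
console.log(toyTokenize('summarization')); // ['summar', 'ization'] -> 2 tokens
```

This mirrors the example in the text: "token" maps to a single token, while "summarization" is split into multiple subword tokens, so character count alone does not determine token count.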
At the time of writing, this calculation uses 0.8 words per token for the NOW LLM model.

Please find an example of a test script:

var msg = 'This is my message to see token breakdown';
var tknCalc = new sn_generative_ai.TokenCalculator();
var tokenBreakdown = tknCalc.getTokenCount(msg, 'Now LLM', 0);
gs.print(tokenBreakdown);

Output:
*** Script: 10

An analysis was performed internally using the Apriel model to assess the word-to-token and character-to-token ratios and provide a guideline. On internal datasets covering Agent Assist use cases (Case Summarization, Chat Summarization), Agentic Workflow datasets, Text2Code datasets, and Requestor Assist datasets (Conversational Catalog), the character-to-token ratio ranged from 3.5 to 4, and the word-to-token ratio was around 0.5.

Note: This analysis is a guideline only and should not be considered absolute.

Related Links

Useful Articles:
KB2038552: Token Limits in Now Assist/Generative AI products
https://support.servicenow.com/kb?id=kb_article_view&sysparm_article=KB2038552
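Putting the guideline figures from this article together, a rough standalone estimator might look like the sketch below. The function name is hypothetical and this is a heuristic only; authoritative counts come from the platform's TokenCalculator script include.

```javascript
// Hypothetical helper, not a platform API: rough token estimates based on
// the guideline ratios in this article (0.8 words per token for the NOW LLM
// calculation, and 3.5-4 characters per token from the internal analysis).
function estimateTokens(text) {
    var words = text.trim().split(/\s+/).length;
    var chars = text.length;
    return {
        byWords: Math.ceil(words / 0.8),    // word-based calculation
        byCharsLow: Math.ceil(chars / 4),   // 4 chars/token -> fewer tokens
        byCharsHigh: Math.ceil(chars / 3.5) // 3.5 chars/token -> more tokens
    };
}

var est = estimateTokens('This is my message to see token breakdown');
console.log(est); // byWords: 10, byCharsLow: 11, byCharsHigh: 12
```

For this sample message (8 words, 41 characters), the word-based estimate of 10 matches the output of the TokenCalculator test script above, while the character-based range (11 to 12) lands close by. Treat all of these as guidelines, not exact counts.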