Usage and billing
What are message tokens?
Message tokens are consumed each time you send a message to a bot and receive a response. Each message typically uses between 1,000 and 4,000 message tokens (sometimes more), depending on the question and the bot's response. At an average of roughly 2,000 tokens per message, 200,000 message tokens equate to about 100 messages.
Generally, as a conversation gets longer, each message uses more message tokens, because the entire chat history is sent to the model every time.
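As a rough illustration of why this happens, here is a sketch in Python. The per-message figure of 500 tokens is an assumption for the example, not a billing value: because the full history is resent as input on every turn, total token usage grows faster than linearly with conversation length.

```python
# Sketch: why longer conversations use more tokens per message.
# Assumption (hypothetical): every user message and every bot reply
# is about 500 raw tokens, and the whole history is resent each turn.
TOKENS_PER_TURN = 500

def tokens_for_turn(turn: int) -> int:
    """Tokens used by one exchange: full history + new message in, one reply out."""
    history = 2 * (turn - 1) * TOKENS_PER_TURN    # all prior messages and replies
    input_tokens = history + TOKENS_PER_TURN      # plus the new message
    output_tokens = TOKENS_PER_TURN               # the bot's reply
    return input_tokens + output_tokens

# Turn 1 sends 500 tokens in / 500 out; by turn 5 it is 4,500 in / 500 out.
print([tokens_for_turn(t) for t in (1, 3, 5)])  # [1000, 3000, 5000]
```

Real usage varies with message length, but the shape is the same: the input side keeps growing while the output side stays roughly constant.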
Models and message tokens
Depending on the model you choose for your bot, the number of message tokens used is calculated differently. This lets us standardise the cost of the available models relative to GPT-3.5 Turbo. The token multipliers for each model are listed below:
Name | Input modifier | Output modifier |
---|---|---|
GPT-4o Mini (our default) | x 0.3 | x 1.2 |
Claude 3 - Haiku | x 0.5 | x 2.5 |
Mistral - Open Mistral 7b | x 0.5 | x 0.5 |
Google - Gemini 1.5 Flash | x 0.7 | x 2.1 |
Cohere - Command R | x 1 | x 3 |
GPT-3.5 Turbo | x 1 | x 3 |
Mistral - Open Mixtral 8x7b | x 1.4 | x 1.4 |
Mistral - Mistral Small | x 2 | x 6 |
Claude 3.5 - Haiku | x 2 | x 10 |
Mistral - Open Mixtral 8x22b | x 4 | x 4 |
Mistral - Mistral Medium | x 5.4 | x 16.2 |
Cohere - Command R+ | x 6 | x 30 |
Claude 3 - Sonnet | x 6 | x 30 |
Claude 3.5 - Sonnet | x 6 | x 30 |
Google - Gemini 1.5 Pro | x 7 | x 21 |
Mistral - Mistral Large | x 8 | x 24 |
GPT-4o | x 10 | x 30 |
GPT-4 Turbo 128k | x 20 | x 60 |
Claude 3 - Opus | x 30 | x 150 |
GPT-4 | x 60 | x 120 |
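The table above can be read as a simple per-model billing formula: message tokens charged = raw input tokens x input modifier + raw output tokens x output modifier. A minimal sketch (the function and dictionary names are ours for illustration, not part of the product):

```python
# A few (input modifier, output modifier) pairs from the table above.
MULTIPLIERS = {
    "GPT-4o Mini": (0.3, 1.2),
    "GPT-3.5 Turbo": (1, 3),
    "GPT-4 Turbo 128k": (20, 60),
    "Claude 3 - Opus": (30, 150),
}

def message_tokens(model: str, input_tokens: int, output_tokens: int) -> float:
    """Message tokens billed for one exchange with the given model."""
    in_mod, out_mod = MULTIPLIERS[model]
    return input_tokens * in_mod + output_tokens * out_mod

# An exchange of 1,000 raw input tokens and 500 raw output tokens:
print(message_tokens("GPT-3.5 Turbo", 1000, 500))     # 2500
print(message_tokens("GPT-4 Turbo 128k", 1000, 500))  # 50000
```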
Input and output tokens
Many model providers charge different amounts for input tokens and output tokens. Input tokens are often cheaper and include the prompt, the message, and anything else that is sent to the model when you chat with a bot. Output tokens are often more expensive and consist of the responses generated by your bots.
What does this all mean?
In simple terms, the number of message tokens your bot consumes depends on the model you choose. For example, if you choose GPT-4 Turbo 128k, each input token is billed as 20 message tokens and each output token as 60, compared with 1 and 3 for GPT-3.5 Turbo. This reflects the underlying cost of GPT-4 Turbo 128k being much higher than that of GPT-3.5 Turbo.
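To make the difference concrete, here is a small sketch of how far an allowance stretches under two models from the table. The 200,000-token allowance and the average message size are assumptions for the example:

```python
# Assumption: a 200,000-message-token allowance and an average exchange
# of 1,000 raw input tokens and 500 raw output tokens per message.
ALLOWANCE = 200_000
AVG_INPUT, AVG_OUTPUT = 1000, 500

# (input modifier, output modifier) pairs from the table above.
MODELS = {"GPT-3.5 Turbo": (1, 3), "GPT-4 Turbo 128k": (20, 60)}

for name, (in_mod, out_mod) in MODELS.items():
    per_message = AVG_INPUT * in_mod + AVG_OUTPUT * out_mod
    print(f"{name}: ~{ALLOWANCE // per_message} messages")
# GPT-3.5 Turbo: ~80 messages; GPT-4 Turbo 128k: ~4 messages
```

The same allowance goes roughly twenty times further on the cheaper model, which is exactly the ratio of the multipliers.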
Updating model multipliers
The cost of models changes frequently; they usually get cheaper, and we always aim to pass these cost savings on to customers. As the models get more affordable, we will change the multipliers to reflect the new pricing, meaning your tokens go further.