Images & PDFs
How to send images and PDFs to OpenRouter
OpenRouter supports sending images and PDFs via the API. This guide will show you how to work with both file types using our API.
Both images and PDFs also work in the chat room.
Image Inputs
Requests with images, to multimodel models, are available via the /api/v1/chat/completions
API with a multi-part messages
parameter. The image_url
can either be a URL or a base64-encoded image. Note that multiple images can be sent in separate content array entries. The number of images you can send in a single request varies per provider and per model. Due to how the content is parsed, we recommend sending the text prompt first, then the images. If the images must come first, we recommend putting it in the system prompt.
Using Image URLs
Here’s how to send an image using a URL:
Using Base64 Encoded Images
For locally stored images, you can send them using base64 encoding. Here’s how to do it:
Supported image content types are:
image/png
image/jpeg
image/webp
PDF Support
OpenRouter supports PDF processing through the /api/v1/chat/completions
API. PDFs can be sent as base64-encoded data URLs in the messages array, via the file content type. This feature works on any model on OpenRouter.
When a model supports file input natively, the PDF is passed directly to the model. When the model does not support file input natively, OpenRouter will parse the file and pass the parsed results to the requested model.
Note that multiple PDFs can be sent in separate content array entries. The number of PDFs you can send in a single request varies per provider and per model. Due to how the content is parsed, we recommend sending the text prompt first, then the PDF. If the PDF must come first, we recommend putting it in the system prompt.
Processing PDFs
Here’s how to send and process a PDF:
Pricing
OpenRouter provides several PDF processing engines:
"mistral-ocr"
: Best for scanned documents or PDFs with images ($2 per 1,000 pages)."pdf-text"
: Best for well-structured PDFs with clear text content (Free)."native"
: Only available for models that support file input natively (charged as input tokens).
If you don’t explicitly specify an engine, OpenRouter will default first to the model’s native file processing capabilities, and if that’s not available, we will use the "mistral-ocr"
engine.
To select an engine, use the plugin configuration:
Skip Parsing Costs
When you send a PDF to the API, the response may include file annotations in the assistant’s message. These annotations contain structured information about the PDF document that was parsed. By sending these annotations back in subsequent requests, you can avoid re-parsing the same PDF document multiple times, which saves both processing time and costs.
Here’s how to reuse file annotations:
When you include the file annotations from a previous response in your
subsequent requests, OpenRouter will use this pre-parsed information instead
of re-parsing the PDF, which saves processing time and costs. This is
especially beneficial for large documents or when using the mistral-ocr
engine which incurs additional costs.
Response Format
The API will return a response in the following format: