Diffstat (limited to 'docs/tools/mcp-server.md')
| -rw-r--r-- | docs/tools/mcp-server.md | 50 |
1 files changed, 50 insertions, 0 deletions
diff --git a/docs/tools/mcp-server.md b/docs/tools/mcp-server.md
index 050e10e8..1222c693 100644
--- a/docs/tools/mcp-server.md
+++ b/docs/tools/mcp-server.md
@@ -571,6 +571,56 @@ The MCP integration tracks several states:
 
 This comprehensive integration makes MCP servers a powerful way to extend the Gemini CLI's capabilities while maintaining security, reliability, and ease of use.
 
+## Returning Rich Content from Tools
+
+MCP tools are not limited to returning simple text. You can return rich, multi-part content, including text, images, audio, and other binary data in a single tool response. This allows you to build powerful tools that can provide diverse information to the model in a single turn.
+
+All data returned from the tool is processed and sent to the model as context for its next generation, enabling it to reason about or summarize the provided information.
+
+### How It Works
+
+To return rich content, your tool's response must adhere to the MCP specification for a [`CallToolResult`](https://modelcontextprotocol.io/specification/2025-06-18/server/tools#tool-result). The `content` field of the result should be an array of `ContentBlock` objects. The Gemini CLI will correctly process this array, separating text from binary data and packaging it for the model.
+
+You can mix and match different content block types in the `content` array. The supported block types include:
+
+- `text`
+- `image`
+- `audio`
+- `resource` (embedded content)
+- `resource_link`
+
+### Example: Returning Text and an Image
+
+Here is an example of a valid JSON response from an MCP tool that returns both a text description and an image:
+
+```json
+{
+  "content": [
+    {
+      "type": "text",
+      "text": "Here is the logo you requested."
+    },
+    {
+      "type": "image",
+      "data": "BASE64_ENCODED_IMAGE_DATA_HERE",
+      "mimeType": "image/png"
+    },
+    {
+      "type": "text",
+      "text": "The logo was created in 2025."
+    }
+  ]
+}
+```
+
+When the Gemini CLI receives this response, it will:
+
+1. Extract all the text and combine it into a single `functionResponse` part for the model.
+2. Present the image data as a separate `inlineData` part.
+3. Provide a clean, user-friendly summary in the CLI, indicating that both text and an image were received.
+
+This enables you to build sophisticated tools that can provide rich, multi-modal context to the Gemini model.
+
 ## MCP Prompts as Slash Commands
 
 In addition to tools, MCP servers can expose predefined prompts that can be executed as slash commands within the Gemini CLI. This allows you to create shortcuts for common or complex queries that can be easily invoked by name.
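Note on the "Returning Rich Content from Tools" section added by this diff: the first two processing steps it lists (combine text into one `functionResponse` part, emit binary data as separate `inlineData` parts) can be illustrated roughly as follows. This is a minimal Python sketch under stated assumptions, not the Gemini CLI's actual code. The function name `split_tool_result`, the tool name `get_logo`, the newline used to join text blocks, and the `{"output": ...}` response shape are all hypothetical; only the `functionResponse` and `inlineData` part names and the content block fields come from the diff. The CLI summary in step 3 is not modeled.

```python
from typing import Any


def split_tool_result(tool_name: str, result: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn a CallToolResult-shaped dict into model-facing parts (illustrative only)."""
    texts: list[str] = []
    binary_parts: list[dict[str, Any]] = []

    for block in result.get("content", []):
        kind = block.get("type")
        if kind == "text":
            texts.append(block["text"])
        elif kind in ("image", "audio"):
            # Image and audio blocks carry base64 data plus a MIME type.
            binary_parts.append(
                {"inlineData": {"mimeType": block["mimeType"], "data": block["data"]}}
            )
        # resource and resource_link blocks are skipped in this sketch.

    # Assumption: text blocks are joined with newlines under an "output" key.
    function_response = {
        "functionResponse": {
            "name": tool_name,
            "response": {"output": "\n".join(texts)},
        }
    }
    return [function_response, *binary_parts]


# The example response from the diff, with the placeholder image payload.
example = {
    "content": [
        {"type": "text", "text": "Here is the logo you requested."},
        {
            "type": "image",
            "data": "BASE64_ENCODED_IMAGE_DATA_HERE",
            "mimeType": "image/png",
        },
        {"type": "text", "text": "The logo was created in 2025."},
    ]
}

parts = split_tool_result("get_logo", example)
```

For the example response, `parts` holds one `functionResponse` entry containing both text blocks, followed by one `inlineData` entry for the image.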
