| field | value | date |
|---|---|---|
| author | cperry-goog <[email protected]> | 2025-05-15 20:04:33 -0700 |
| committer | GitHub <[email protected]> | 2025-05-15 20:04:33 -0700 |
| commit | 58ef39e2a964386a1026ba68419e4d64c4612551 | |
| tree | 5c00113b2a92a33ee9bc4f0d4dc03782d3b342b2 /docs/architecture.md | |
| parent | 3674fb0c7e230651f1f33c4d46b24ca003dd532a | |
Docs: Add initial project documentation structure and content (#368)
Co-authored-by: Taylor Mullen <[email protected]>
Diffstat (limited to 'docs/architecture.md')
| -rw-r--r-- | docs/architecture.md | 76 |
1 files changed, 76 insertions, 0 deletions
diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 00000000..37523f3a
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,76 @@
+# Gemini CLI Architecture Overview
+
+This document provides a high-level overview of the Gemini CLI's architecture. Understanding the main components and their interactions can be helpful for both users and developers.
+
+## Core Components
+
+The Gemini CLI is primarily composed of two main packages, along with a suite of tools that the system utilizes:
+
+1. **CLI Package (`packages/cli`):**
+
+   - **Purpose:** This is the user-facing component. It provides the interactive command-line interface (REPL), handles user input, displays output from Gemini, and manages the overall user experience.
+   - **Key Features:**
+     - Input processing (parsing commands, text prompts).
+     - History management.
+     - Display rendering (including Markdown, code highlighting, and tool messages).
+     - Theme and UI customization.
+     - Communication with the Server package.
+     - Manages user configuration settings specific to the CLI.
+
+2. **Server Package (`packages/server`):**
+
+   - **Purpose:** This acts as the backend for the CLI. It receives requests from the CLI, orchestrates interactions with the Gemini API, and manages the execution of available tools.
+   - **Key Features:**
+     - API client for communicating with the Google Gemini API.
+     - Prompt construction and management.
+     - Tool registration and execution logic.
+     - State management for conversations or sessions.
+     - Manages server-side configuration.
+
+3. **Tools (`packages/server/src/tools/`):**
+   - **Purpose:** These are individual modules that extend the capabilities of the Gemini model, allowing it to interact with the local environment (e.g., file system, shell commands, web fetching).
+   - **Interaction:** The Server package invokes these tools based on requests from the Gemini model. The CLI then displays the results of tool execution.
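To make the "tool registration and execution logic" responsibility of the Server package more concrete, here is a minimal TypeScript sketch. It is an illustration only: `Tool`, `ToolRegistry`, `requiresConfirmation`, and the in-memory `fakeFiles` store are hypothetical names chosen for this example and are not taken from the repository. Only the tool name `read_file` appears in the document itself.

```typescript
// Hypothetical sketch: none of these types are taken from the Gemini CLI source.

// A tool is a named capability that the model can ask the server to run.
interface Tool {
  name: string;
  // True when the tool can modify files or run shell commands.
  requiresConfirmation: boolean;
  execute(args: Record<string, string>): string;
}

// The server side owns registration and lookup of tools.
class ToolRegistry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  names(): string[] {
    return Array.from(this.tools.keys());
  }
}

// An in-memory stand-in for the file system, so the sketch has no side effects.
const fakeFiles: Record<string, string> = { "notes.txt": "hello" };

const registry = new ToolRegistry();
registry.register({
  name: "read_file",
  requiresConfirmation: false, // read-only, so no approval step in this sketch
  execute: (args) => fakeFiles[args["path"]] ?? `No such file: ${args["path"]}`,
});
```

Keeping the registry as a plain name-to-tool map is one way to support the extensibility goal: adding a capability is a single `register` call, and the frontend never needs to know which tools exist.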
+
+## Interaction Flow
+
+A typical interaction with the Gemini CLI follows this general flow:
+
+1. **User Input:** The user types a prompt or command into the CLI (`packages/cli`).
+2. **Request to Server:** The CLI package sends the user's input to the Server package (`packages/server`).
+3. **Server Processes Request:** The Server package:
+   - Constructs an appropriate prompt for the Gemini API, possibly including conversation history and available tool definitions.
+   - Sends the prompt to the Gemini API.
+4. **Gemini API Response:** The Gemini API processes the prompt and returns a response. This response might be a direct answer or a request to use one of the available tools.
+5. **Tool Execution (if applicable):**
+   - If the Gemini API requests a tool, the Server package prepares to execute it.
+   - **User Confirmation for Potentially Impactful Tools:** If the requested tool can modify the file system (e.g., file edits, writes) or execute shell commands, the CLI (`packages/cli`) displays a confirmation prompt to the user. This prompt details the tool and its arguments, and the user must approve the execution. Read-only operations (e.g., reading files, listing directories) may not always require this explicit confirmation step.
+   - If confirmed (or if confirmation is not required for the specific tool), the Server package identifies and executes the relevant tool (e.g., `read_file`, `execute_bash_command`).
+   - The tool performs its action (e.g., reads a file from the disk).
+   - The result of the tool execution is sent back to the Gemini API by the Server.
+   - The Gemini API processes the tool result and generates a final response.
+6. **Response to CLI:** The Server package sends the final response (or intermediate tool messages) back to the CLI package.
+7. **Display to User:** The CLI package formats and displays the response to the user in the terminal.
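The flow above, including the confirmation gate in step 5, can be sketched as a single function. This is a simplified illustration under stated assumptions, not the project's implementation: `ModelReply`, `ToolSpec`, `Model`, `Confirm`, and `handleTurn` are hypothetical names, the model is a plain synchronous function standing in for the Gemini API, and only one tool call per turn is handled.

```typescript
// Hypothetical sketch of the interaction flow; names and shapes are illustrative only.

// The model either answers directly or asks for a tool to be run (step 4).
type ModelReply =
  | { kind: "answer"; text: string }
  | { kind: "tool_call"; tool: string; args: Record<string, string> };

interface ToolSpec {
  mutating: boolean; // can modify files or run shell commands
  run(args: Record<string, string>): string;
}

// Stand-ins for the Gemini API and for the CLI's confirmation prompt.
type Model = (prompt: string, toolResult?: string) => ModelReply;
type Confirm = (tool: string, args: Record<string, string>) => boolean;

function handleTurn(
  prompt: string,
  model: Model,
  tools: Record<string, ToolSpec>,
  confirm: Confirm,
): string {
  const first = model(prompt); // steps 3 and 4: send prompt, get reply
  if (first.kind === "answer") return first.text;

  const tool = tools[first.tool];
  if (!tool) return `Unknown tool: ${first.tool}`;

  // Step 5: potentially impactful tools need explicit user approval.
  if (tool.mutating && !confirm(first.tool, first.args)) {
    return "Tool execution was declined by the user.";
  }

  const result = tool.run(first.args);
  const final = model(prompt, result); // the tool result goes back to the model
  return final.kind === "answer" ? final.text : "Model requested another tool.";
}

// Usage: a fake model that first requests read_file, then answers with the result.
const fakeModel: Model = (_prompt, toolResult) =>
  toolResult === undefined
    ? { kind: "tool_call", tool: "read_file", args: { path: "a.txt" } }
    : { kind: "answer", text: `The file says: ${toolResult}` };

const tools: Record<string, ToolSpec> = {
  read_file: { mutating: false, run: () => "hi" },
  write_file: { mutating: true, run: () => "written" },
};

// The confirm callback always declines, but read_file is read-only and never asks.
const reply = handleTurn("What is in a.txt?", fakeModel, tools, () => false);
// reply === "The file says: hi"
```

Passing the confirmation step in as a callback mirrors the split described in the document: the server-side logic decides when approval is needed, while the frontend decides how to ask for it.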
+
+## Diagram (Conceptual)
+
+```mermaid
+graph TD
+    User[User via Terminal] -- Input --> CLI[packages/cli]
+    CLI -- Request --> Server[packages/server]
+    Server -- Prompt/Tool Info --> GeminiAPI[Gemini API]
+    GeminiAPI -- Response/Tool Call --> Server
+    Server -- Tool Details --> CLI
+    CLI -- User Confirms --> Server
+    Server -- Execute Tool --> Tools[Tools e.g., read_file, shell]
+    Tools -- Tool Result --> Server
+    Server -- Final Response --> CLI
+    CLI -- Output --> User
+```
+
+## Key Design Principles
+
+- **Modularity:** Separating the CLI (frontend) from the Server (backend) allows for independent development and potential future extensions (e.g., different frontends for the same server).
+- **Extensibility:** The tool system is designed to be extensible, allowing new capabilities to be added.
+- **User Experience:** The CLI focuses on providing a rich and interactive terminal experience.
+
+This overview should provide a foundational understanding of the Gemini CLI's architecture. For more detailed information, refer to the specific documentation for each package and the development guides.
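The modularity principle (different frontends for the same server) can be illustrated with a short TypeScript sketch. Everything here is hypothetical: `Backend`, `Frontend`, `EchoBackend`, `TerminalFrontend`, `JsonFrontend`, and `runOnce` are names invented for this example, and the backend simply echoes its input instead of calling any API.

```typescript
// Hypothetical sketch: the interface and class names are illustrative only.

// The backend exposes one narrow entry point to any frontend.
interface Backend {
  send(input: string): string;
}

// A frontend only decides how to present what the backend returns.
interface Frontend {
  show(response: string): string;
}

// Stand-in backend that echoes the input rather than calling a real API.
class EchoBackend implements Backend {
  send(input: string): string {
    return `echo: ${input}`;
  }
}

class TerminalFrontend implements Frontend {
  show(response: string): string {
    return `> ${response}`;
  }
}

class JsonFrontend implements Frontend {
  show(response: string): string {
    return JSON.stringify({ response });
  }
}

// The same backend instance serves both frontends without modification.
function runOnce(input: string, backend: Backend, frontend: Frontend): string {
  return frontend.show(backend.send(input));
}

const backend = new EchoBackend();
const terminalOut = runOnce("hi", backend, new TerminalFrontend());
const jsonOut = runOnce("hi", backend, new JsonFrontend());
// terminalOut === "> echo: hi"
// jsonOut === '{"response":"echo: hi"}'
```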
