feat(slides): add slides.createFromJson — agent-friendly blueprint-to-slides tool#348
feat(slides): add slides.createFromJson — agent-friendly blueprint-to-slides tool#348n0012 wants to merge 22 commits intogemini-cli-extensions:mainfrom
Conversation
…eFromJson
slides.createFromJson is a high-level agent-friendly tool that accepts a
JSON blueprint and translates it into Slides API batch requests in one call.
Avoids requiring agents to know the raw Slides API request format.
Blueprint format:
{ "slides": [{ "elements": [...] }, ...] } — multiple slides
{ "elements": [...] } — single slide (shorthand)
Each element supports: type (text|shape|image), position (x,y,w,h in PT on
a 720x405 grid), content, shape_type, url, layer (z-index), and a style
object with: size, bold, italic, underline, strikethrough, font_family,
align, vertical_align, color, bg_color, border_color, border_weight,
no_border, bold_phrases, bold_until, links.
Hardcoded Google Sans replaced with configurable font_family (defaults to
Arial for cross-platform compatibility).
Layers are sorted before rendering: shapes → images → text, then by layer
value, so background shapes reliably render beneath text.
Also adds:
- slides.create: create a blank presentation (returns ID + URL)
- slides.batchUpdate: execute raw Slides API request arrays (escape hatch
for operations not covered by createFromJson)
Complements gemini-cli-extensions#237 which adds further write tools (addSlide, insertText,
replaceText, deleteSlide). The batchUpdate and create implementations here
are intentionally minimal; gemini-cli-extensions#237's versions may supersede them on merge.
There was a problem hiding this comment.
Code Review
This pull request introduces new Google Slides tools for creating presentations, performing batch updates, and generating slides from JSON blueprints. The review feedback highlights the need for consistency by using the registerTool wrapper to respect feature flags and suggests enhancing input schemas to support structured objects. Additionally, the feedback addresses a potential crash in the createFromJson service method and recommends adjusting the slide insertion logic to append new slides to the end of a presentation.
| slidesService.getSlideThumbnail, | ||
| ); | ||
|
|
||
| server.registerTool( |
There was a problem hiding this comment.
The slides.create tool is being registered using server.registerTool directly, which bypasses the registerTool wrapper defined on line 154. This wrapper is responsible for checking if the tool is enabled via feature flags (WORKSPACE_FEATURE_OVERRIDES). Using the wrapper ensures consistency and allows users to disable these tools if needed.
| server.registerTool( | |
| registerTool( |
| ); | ||
|
|
||
| server.registerTool( | ||
| 'slides.batchUpdate', |
| }); | ||
|
|
||
| server.registerTool( | ||
| 'slides.createFromJson', |
| requests: z | ||
| .string() | ||
| .describe( | ||
| 'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.', | ||
| ), |
There was a problem hiding this comment.
The requests field is restricted to a string, but the SlidesService.batchUpdate implementation (line 329) and most MCP clients support passing structured arrays directly. Allowing both a JSON string and an array of objects provides a better experience for AI agents.
| requests: z | |
| .string() | |
| .describe( | |
| 'JSON string of an array of Slides API request objects (e.g., [{"createSlide":{}}, {"createShape":{...}}]). Will be parsed server-side.', | |
| ), | |
| requests: z | |
| .union([z.string(), z.array(z.any())]) | |
| .describe( | |
| 'An array of Slides API request objects or a JSON string of that array (e.g., [{"createSlide":{}}, {"createShape":{...}}]).', | |
| ), |
| slideJson: z | ||
| .string() | ||
| .describe( | ||
| 'JSON string of the slide blueprint. Use {"slides":[{"elements":[...]},...]} for multiple slides or {"elements":[...]} for one slide. Will be parsed server-side.', | ||
| ), |
There was a problem hiding this comment.
The slideElementSchema defined on lines 493-591 is currently unused. It should be applied to the slideJson input schema to provide the AI agent with structured validation and clear documentation of the expected blueprint format. Additionally, allowing both objects and strings makes the tool more robust.
slideJson: z
.union([
z.object({
slides: z.array(z.object({ elements: z.array(slideElementSchema) })),
}),
z.object({
elements: z.array(slideElementSchema),
}),
z.string(),
])
.describe(
'The slide blueprint. Use {"slides":[{"elements":[...]}]} for multiple slides or {"elements":[...]} for one slide. Can be a JSON string or object.',
),| const slideDefs = (slideJson as any).slides | ||
| ? (slideJson as any).slides | ||
| : [{ elements: (slideJson as any).elements || [] }]; |
There was a problem hiding this comment.
If the slides format is used but an individual slide object is missing the elements property (e.g., { "slides": [{}] }), slideDefs[i].elements will be undefined. This will cause a crash in buildSlideRequests when it attempts to spread or iterate over elements (line 421).
| const slideDefs = (slideJson as any).slides | |
| ? (slideJson as any).slides | |
| : [{ elements: (slideJson as any).elements || [] }]; | |
| const slideDefs = (slideJson as any).slides | |
| ? (slideJson as any).slides.map((s: any) => ({ ...s, elements: s.elements || [] })) | |
| : [{ elements: (slideJson as any).elements || [] }]; |
| requests.push({ | ||
| createSlide: { | ||
| objectId: slideId, | ||
| insertionIndex: i + 1, |
There was a problem hiding this comment.
Hardcoding insertionIndex: i + 1 causes new slides to always be inserted at the beginning of the presentation (after the first slide). For an 'add slides' tool, the expected behavior is usually to append slides to the end. Omitting insertionIndex entirely will cause the Slides API to append the new slides to the end of the presentation.
| insertionIndex: i + 1, | |
| slideLayoutReference: { predefinedLayout: 'BLANK' }, |
…late image URLs
Three gaps vs the Python reference implementation:
indent: expose indentStart in updateParagraphStyle so agents can
express bullet/sub-item indentation without raw batchUpdate calls.
Style property: indent (number, points).
vertical_align on shapes: contentAlignment was only applied to text
elements. Shapes (e.g. a labelled colored box) also accept
contentAlignment — added to the shape updateShapeProperties block
so vertical_align works on both element types.
Image URL sanitization: when an LLM outputs a template URL with
unresolved placeholders (e.g. https://.../{{color}}/{{icon}}.png),
the Slides API call fails. Detect '{' or '%7B' in the URL and fall
back to a neutral info icon rather than breaking the whole slide.
|
test with new tool that can batch create slides is slides.createFromJson |
Adds an optional 'theme' parameter with built-in named palettes:
- google — Google Blue/Green + Google Sans
- minimal — dark grey headers + Arial
- dark — dark background + blue accent + Arial
When a theme is active:
- Color fields in style accept named aliases instead of RGB objects:
primary, primary_text, secondary, secondary_text, surface,
surface_alt, text, text_muted, background
- font_family defaults to the theme font (pass 'theme' to opt in explicitly)
- text color defaults to theme.text when not specified
Fully backward compatible — existing blueprints with explicit RGB values
and no theme parameter behave identically to before.
Theme changes:
- Add 'light' theme (default): near-black #202124 headers,
Google Blue 600 #1A73E8 accent, Arial, white background.
Safe cross-platform — no Google-specific fonts required.
- Fix 'google' theme to match official Google brand palette:
primary: #1A73E8 (Google Blue 600, was #1C549E — wrong)
secondary: #34A853 (Google Green, was #198A3D — wrong)
surface: #E8F0FE (Blue 50)
surfaceAlt:#E6F4EA (Green 50)
text: #1F1F1F (gm3-sys-color-on-surface)
textMuted: #444746 (gm3-sys-color-on-surface-variant)
font: Google Sans (unchanged)
- Fix 'dark' theme primary to #0B57D0 (Material 3 gm3-sys-color-primary)
- 'light' is now the default when no theme param is provided,
giving new users sensible font and color defaults out of the box.
Backward compat: existing blueprints with explicit RGB values
get theme.text as text color default (was hardcoded black).
Adds concise layout guidance to the tool description so agents understand the 720x405 grid, layer ordering, padding conventions, font size hierarchy, and theme alias usage without needing a separate skill loaded. Principles-based, not rigid recipes.
Replaces prescriptive layout recipe with design intent guidance: - Lead with 'let content drive layout' not 'header bars go here' - Deck consistency comes from theme/margin/type system, not templates - Technical notes (layers, text clipping) kept as reference - Opinionated layout patterns live in slides-designer skill instead
Retrieves and writes per-slide speaker notes. getSpeakerNotes returns an array — one entry per slide — with slideIndex, slideObjectId, speakerNotesObjectId, and notes text. updateSpeakerNotes replaces notes on a single slide by objectId. Approach adapted from gemini-cli-extensions#235 by @stefanoamorelli (MIT). Key difference: we target slides by objectId rather than index, matching the pattern used by getMetadata and createFromJson throughout this PR.
…cription Two observed failure modes addressed: 1. Theme ignored: agent set theme:'google' but hardcoded RGB values, so theme had no visual effect. Description now leads with 'HOW THEMES WORK — IMPORTANT' and explicitly states that hardcoding RGB bypasses the theme entirely. Color aliases are the mechanism. 2. Speaker notes skipped: agent built the deck but never called updateSpeakerNotes because the prompt didn't mention notes. Description now instructs: for any professional presentation, write speaker notes for every slide after creation.
speaker_notes can now be included per-slide in the JSON blueprint:
{ slides: [{ speaker_notes: '...', elements: [...] }] }
The server writes notes automatically after creating slides — no
separate getSpeakerNotes/updateSpeakerNotes calls needed. This
removes the most common failure mode (agent creates slides but
never calls updateSpeakerNotes because it's a separate step).
Also updates tool description to reflect inline notes and
reinforces that theme aliases are the mechanism for brand colors.
Model was ignoring optional speaker_notes — field existed but was never populated across three test runs. Changed schema description to REQUIRED with 'Omitting speaker_notes produces an unprofessional deck' framing. The server still handles missing notes gracefully (no crash), but the model should now always include them.
The 'google' theme was producing blue-and-green-only decks that didn't look authentically Google. Three changes: 1. Headers now use #202124 (near-black) instead of #1A73E8. Real Google decks use dark headers with brand colors as accents, not solid blue bars on every slide. 2. Added all four Google brand colors as accent aliases: 'blue' #4285F4, 'red' #EA4335, 'yellow' #FBBC05, 'green' #34A853 (also accessible as accent1-accent4). 3. Tool description now says 'use all four across a deck for an authentic Google look' to prevent agents defaulting to blue-only. Theme interface extended with optional accent1-accent4 fields. Backward compatible — existing blueprints using 'primary'/'secondary' still resolve correctly.
Adds explicit instruction to include the signature Google four-color bar (blue/red/yellow/green, 4×180pt rectangles at y=397) on every slide in a Google-themed deck. Also adds semantic color aliases 'blue', 'red', 'yellow', 'green' that resolve to the four brand colors so agents can reference them by name.
Dark theme had both primary and secondary as blue — every slide was blue-on-dark with zero variety. Replaced with Material Design 300-weight colors that are light enough for dark backgrounds: Blue 300 #8AB4F8, Red 300 #F28B82, Yellow 300 #FDD663, Green 300 #81C995 Light theme: added accent1-4 as muted professional tones (Blue, Red 600, Amber, Green 700) for agents that want variety without switching to the Google brand palette. All three themes now have four accent colors available via the blue/red/yellow/green and accent1-4 aliases.
Adds 'LESS IS MORE' principle to tool description: color is for emphasis not decoration, most slides should be mostly white with dark text. Removes 'four-color bar on every slide' instruction (was producing template-heavy output). Theme description simplified.
When createFromJson returns with no speaker_notes provided, the response now includes: speakerNotesStatus: 'MISSING' action_required: 'Call slides.updateSpeakerNotes for each slideId...' This nudges the agent to make a second pass for notes instead of considering the job done. Addresses the failure mode where Gemini drops speaker_notes from the JSON blueprint because it's focused on layout complexity. Tool description updated to describe both approaches: inline (simpler) or two-pass (focus on layout first, notes second).
One-sentence notes don't make a usable talk track. Guided to write 4-6 sentences per slide: opening line, key points, transition.
Google Slides creates a blank first slide (objectId 'p') with every new presentation. createFromJson now deletes it automatically after building the real slides, so the deck starts clean. Fails silently if the slide doesn't exist (not a new presentation).
Dark theme had no real usage and added complexity. Light theme updated to use same Google brand colors (just Arial instead of Google Sans). Two themes: light (default, Arial) and google (Google Sans). Both share the same palette and layout philosophy.
One theme. Google brand palette is always active — no parameter needed. Color aliases resolve automatically. Removes light/dark themes and the theme enum from the schema. Users can still use hardcoded RGB for one-off colors.
Server now automatically adds to every slide created by createFromJson: - 'Google Cloud' text (10pt, text_muted) at bottom-left (x=36, y=375) - 'Proprietary & Confidential' text (10pt, right-aligned) at bottom-right - Four-color brand bar (red/yellow/blue/green, 4×180pt, h=8) at y=397 Agents no longer need to include these in the blueprint JSON. Ensures consistent positioning across all slides — fixes the offset/missing footer inconsistency from agent-generated footers.
Footer text ('Google Cloud', 'Proprietary & Confidential') and
four-color brand bar were hardcoded into createFromJson. This locked
every user into Google branding. Moved to the slides-designer skill
where brand decisions belong. The server stays generic — a different
skill can teach different branding.
Brand-neutral template skill for slides.createFromJson with
placeholder branding ('YOUR COMPANY', generic colors). Users
copy and customize for their organization. Includes all 5 layout
patterns, design philosophy, footer snippet, and tips.
The Google Cloud version lives in n0012/ai-skills/slides-designer
as a filled-in example of this template.
What this adds
slides.createFromJson— a high-level tool designed specifically for AI agents that need to build presentations without knowing the raw Slides API format.Agents pass a JSON blueprint; the server translates it into a Slides API
batchUpdatein one round trip.Blueprint format
Element schema
Each element:
type(text|shape|image),position({x,y,w,h}in points on a 720×405 grid), optionalcontent,shape_type,url,layer(z-index).styleobject (all optional):sizebold,italic,underline,strikethroughfont_family"Arial"for cross-platform compatibilityalignSTART | CENTER | ENDvertical_alignTOP | MIDDLE | BOTTOMcolor{red,green,blue}(0-1)bg_color{red,green,blue}(0-1)border_color/border_weightno_borderbold_phrasesbold_untillinks[{text, url}]inline hyperlinks by matching textHow layers work
Elements are sorted before rendering: shapes → images → text, then by
layervalue within each group. This ensures background shapes reliably render beneath text without requiring agents to manually sequence API calls.Also included
slides.create: create a blank presentation, returns{presentationId, title, url}slides.batchUpdate: raw Slides API request array passthrough — escape hatch for operationscreateFromJsondoesn't coverRelationship to #237
This PR complements #237. The
slides.createandslides.batchUpdateimplementations here are intentionally minimal; #237's versions (with seeding, rollback logic, and folder support) are more complete and should take precedence on merge.slides.createFromJsonis not present in #237 and is the primary new contribution here.Happy to rebase on top of #237 once it merges to reduce overlap.
Validation
tsc --noEmit)