GEMINI OMNI PROMPTS

Glossary · 32 terms · Source-cited

Gemini Omni Glossary

The working vocabulary for prompting Gemini Omni Flash: camera moves, style references, conversational editing patterns, Omni-specific syntax, and known failure modes. Each one is sourced from DeepMind's official guide or verified community testing.

8 terms

Camera motion verbs

Verbs Omni parses literally. Used in place of "shot list" notation — describe the move in plain English and Omni applies it.

Push / Push in
Camera advances toward the subject.

Equivalent to a dolly-in. Omni interprets "push in" / "punch in" / "slow push" as a forward-moving camera maintaining focus on the subject. Use "fast push" for whip-style energy or "slow push" for tension.

Source: DeepMind prompt guide

Pull-back / Pull-up / Pull-down
Camera reverses away from subject; "up/down" adds vertical motion.

"Pull-back" reveals more of the scene by retreating. "Pull-up" combines pull-back with vertical rise (drone-style). Common opening for landscape and reveal shots.

Source: DeepMind prompt guide

Orbit / Sweep around / Circle
Camera revolves around the subject in a horizontal plane.

Use for hero shots, product showcases, and 360 reveals. Combine with "while subject stays still" to lock subject behavior. Specify speed ("slow orbit", "fast sweep").

Source: DeepMind prompt guide

Pan / Vertical pan
Camera rotates horizontally (or vertically, a tilt in traditional film terms) from a fixed position.

Different from tracking — camera does not move through space, only rotates. "Pan left to right" / "vertical pan upward". Use for scanning a static scene.

Static / Locked off / Oner
Camera does not move; subject moves through frame.

"Locked off" and "static" are synonymous. "Oner" implies a single continuous unbroken take. Use when subject motion or staging carries the shot.

Source: DeepMind prompt guide

Dolly zoom
Camera pushes in while zooming out (or vice versa). The Hitchcock / Vertigo effect.

Background appears to warp while subject stays the same size. Use sparingly — strong narrative cue. Specify "subtle dolly zoom" to avoid melodrama.

Reveal / Ascend revealing
Camera move that uncovers part of the scene that was hidden.

"Pull-back and rotate revealing X" / "ascend revealing the city below". Omni parses "reveal" as an intent, not just a motion — the destination matters.

Source: DeepMind prompt guide

Tracking / Side tracking
Camera moves parallel to a moving subject.

Used to follow walking, running, vehicles. "Side tracking" follows from the left or right side. Specify "smooth tracking" for stabilized look, "handheld tracking" for organic jitter.

5 terms

Style references

Treated as camera/medium descriptors. Omni renders the chosen "lens character" into the output — grain, color, lens distortion.

Natural smartphone zoom
Slightly soft, digital-feeling zoom characteristic of phone cameras.

Use when shooting product/lifestyle that should feel candid rather than cinematic. Pairs well with "vertical 9:16".

Source: DeepMind prompt guide

Film camera (warm, slight grain)
Adds film-stock texture: warm color cast, fine grain, soft highlights.

Use for cinematic, narrative, hero shots. Specify "16mm film" for grittier grain, "35mm film" for cleaner look.

Source: DeepMind prompt guide

Webcam style
Compressed, slightly soft footage typical of laptop cameras.

Useful for video-call mockups, talking-head content. Combine with "looking at camera" for natural eye contact.

Handheld (micro-jitter, organic)
Adds subtle camera shake from a human operator.

Use for documentary feel or run-and-gun energy. Specify "stabilized handheld" if you want minimal shake, "shaky handheld" for chaos.

Source: DeepMind prompt guide

One continuous shot
No edits, no cuts — Omni treats this as a hard constraint.

Locking this phrase into your opening sentence is one of the highest-leverage moves: Omni respects it strictly. Pair with subject and camera motion verbs to fill the duration.

Source: Medium — Gemini Omni Prompt Playbook

4 terms

Composition & duration

How Omni reads frame, time, and output format. Lock these in the first sentence.

16:9 — Cinematic / Standard
Default cinematic widescreen. Best for hero shots, landscapes, narrative.

Pairs with 8-10 second duration for mood pieces. Most YouTube long-form is 16:9.

9:16 — Vertical / Shorts
For Reels, TikTok, YouTube Shorts. Subject should fill the frame.

Pairs with 5-7 second duration. Subject framing matters more here than in 16:9.

1:1 — Square / Social
For Instagram feed, product hero shots.

Pairs with 5-8 second duration. Works for product showcases and slow-motion impact shots.
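The aspect-ratio and duration pairings above can be kept in a small lookup so the opening sentence always states both. This is a minimal Python sketch for illustration: `SUGGESTED_SECONDS`, `in_suggested_range`, and `opening_sentence` are hypothetical names, and the sentence wording is an assumption, not documented Omni syntax.

```python
# Suggested duration ranges in seconds, copied from the three entries above.
SUGGESTED_SECONDS = {
    "16:9": (8, 10),
    "9:16": (5, 7),
    "1:1": (5, 8),
}


def in_suggested_range(aspect_ratio: str, seconds: int) -> bool:
    """Return True if the duration falls inside the glossary's suggested range."""
    if aspect_ratio not in SUGGESTED_SECONDS:
        raise ValueError(f"unknown aspect ratio: {aspect_ratio}")
    low, high = SUGGESTED_SECONDS[aspect_ratio]
    return low <= seconds <= high


def opening_sentence(aspect_ratio: str, seconds: int,
                     shot: str = "one continuous shot") -> str:
    """Build a first sentence that locks format, duration, and shot type.

    The phrasing is an illustrative assumption; adjust it to taste.
    """
    return f"A {seconds}-second {aspect_ratio} video, {shot}."


if __name__ == "__main__":
    print(opening_sentence("9:16", 6))
    print(in_suggested_range("9:16", 6))
```

The ranges are treated as suggestions rather than limits, so the check only reports whether a duration is inside the range and does not reject anything.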

10-second hard cap (Gemini Omni Flash)
Omni Flash cannot produce a single clip longer than 10 seconds.

For longer videos, chain multiple clips using conversational editing or Flow workspace. Specify duration in your opening sentence; Omni respects it.

Source: TechCrunch launch coverage
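Planning a longer video under the 10-second cap is simple arithmetic: divide the target runtime into the fewest clips that each fit the cap. The Python sketch below shows one way to do that. `plan_clips` is a hypothetical helper, and spreading the time evenly across clips is a design assumption, not something Omni requires.

```python
import math

MAX_CLIP_SECONDS = 10  # per-clip cap for Omni Flash, as stated above


def plan_clips(total_seconds: int) -> list[int]:
    """Split a target runtime into the fewest clips of at most 10 seconds.

    Durations are balanced so no clip is more than one second longer
    than any other (an assumption made for even pacing).
    """
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    count = math.ceil(total_seconds / MAX_CLIP_SECONDS)
    base, extra = divmod(total_seconds, count)
    # The first `extra` clips absorb the leftover seconds, one each.
    return [base + 1] * extra + [base] * (count - extra)


if __name__ == "__main__":
    print(plan_clips(25))
```

For a 25-second target this yields three clips of 9, 8, and 8 seconds, each of which would then be generated separately and chained.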

4 terms

Conversational editing

Omni's flagship feature — refine a video over multiple turns. The vocabulary here protects consistency across edits.

Conversational editing chain
A sequence of follow-up turns refining the same generated video — "now change X, keep Y identical".

Omni's differentiator vs Veo / Sora. Edits are applied in the same scene context rather than re-generating from scratch. Order matters — each turn builds on the prior frame state.

Source: Atlas Cloud — Conversational video editing

Keep X identical lock
Explicit phrase in every follow-up turn listing what should NOT change. Without it, Omni re-styles the whole scene.

Pattern: `[Change instruction]. Keep [X, Y, Z] exactly the same.` Documented as the highest-leverage discipline for multi-turn editing.

Source: Atlas Cloud features overview
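The keep-identical pattern is easy to forget on later turns, so it can help to generate every follow-up from one template. A minimal Python sketch, assuming the follow-up is sent as plain text; `follow_up_turn` is a hypothetical helper, not part of any Google SDK.

```python
def follow_up_turn(change: str, keep: list[str]) -> str:
    """Render `[Change instruction]. Keep [X, Y, Z] exactly the same.`"""
    if not keep:
        # Refuse to build a turn with no lock; the glossary warns that
        # Omni re-styles the whole scene when the lock is missing.
        raise ValueError("list at least one element to keep")
    change = change.strip().rstrip(".")
    return f"{change}. Keep {', '.join(keep)} exactly the same."


if __name__ == "__main__":
    print(follow_up_turn(
        "Change the jacket to red",
        ["the lighting", "the camera move", "the background"],
    ))
```

Raising on an empty keep list is a deliberate choice here: it turns a forgotten lock into an error before the turn is ever sent.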

Trigger pattern
Conditional change inside a single clip — "When [action], [transformation]".

Used in Google's viral demos: mirror-arm-transformation, bubble sculpture, origami ships. Pattern: `[Base scene]. When [specific trigger action], [specific transformation]. Keep [list] identical.`

Source: DeepMind prompt guide
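The trigger pattern has three slots plus a keep list, which maps naturally onto a small template function. The Python sketch below is illustrative: `trigger_prompt` is a hypothetical name, and the paper-boat scene is an invented example, not one of Google's demos.

```python
def trigger_prompt(base_scene: str, trigger: str,
                   transformation: str, keep: list[str]) -> str:
    """Render `[Base scene]. When [trigger], [transformation]. Keep [list] identical.`"""
    base = base_scene.strip().rstrip(".")
    trigger = trigger.strip().rstrip(",.")
    transformation = transformation.strip().rstrip(".")
    prompt = f"{base}. When {trigger}, {transformation}."
    if keep:
        prompt += f" Keep {', '.join(keep)} identical."
    return prompt


if __name__ == "__main__":
    print(trigger_prompt(
        "A paper boat floats on a still pond",
        "the boat touches the lily pad",
        "it unfolds into a paper crane",
        ["the pond", "the lighting"],
    ))
```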

Multi-turn consistency score (3/5)
Atlas Cloud's observed character-consistency score across 4+ shots — drift begins past shot 4.

Use `@character_name` with a reference image for best results in shots 1-3; accept noticeable drift beyond shot 4.

Source: Atlas Cloud multi-turn consistency review

6 terms

Gemini Omni-specific syntax

Tokens and features unique to Omni (vs Veo / Sora / Runway). These do not work in other models.

@username summon (Avatar)
Special syntax `@your_username` injects a recorded reference of the user into the generated video.

Tied to the Avatar feature. Requires age ≥ 18, identity verification, clear reference recording (eyes/nose/mouth visible, no sunglasses or masks). Not available in EEA / Switzerland / UK at launch.

Source: Google Gemini Avatar help

Avatar feature
Lets you summon your own face/voice into generated video via @username.

Every video carries a mandatory SynthID watermark embedded in the pixels. English-only at launch (May 2026). The Chrome Unboxed hands-on review reports surprisingly accurate lip-sync and micro-expressions.

Source: Chrome Unboxed hands-on review

Reusable character / @character_name
A named character anchored to a reference image, reusable across multiple prompts.

Different from @username — used for non-self characters. Best results within first 3 shots. Combine with `keep [character] identical` for stronger anchoring.

Source: Medium — Gemini Omni Prompt Playbook

SynthID watermark
Invisible pixel-level identifier embedded in every Omni-generated video.

Cannot be disabled. Persists through compression, format conversion, and minor editing. Used to identify AI-generated content. Required by Google for the Avatar feature in particular.

Source: Google Gemini Avatar help

Gemini Omni Flash
The first publicly available Omni model. Fast tier, 10-second clip cap.

Available in the Gemini app (paid AI Plus / Pro / Ultra), Google Flow, and free on YouTube Shorts / Create. Slower/heavier Omni tiers expected to follow.

Source: Google blog — Introducing Gemini Omni

Google Flow
Google's full editing workspace where Omni outputs can be chained into longer sequences.

Where multi-clip productions are assembled. Chaining works around the 10-second cap for the finished sequence; each individual clip is still limited to 10 seconds.

5 terms

Failure modes (be aware of these)

Omni weaknesses verified by hands-on community testing. Frame your prompts to avoid these failure surfaces.

Text rendering degradation
Any onscreen text, labels, signs, or brand logos degrade in Omni output.

Avoid mentioning text overlays in your prompt. If text is unavoidable, expect garbled letters. Verified across multiple reviews.

Source: PixVerse hands-on review

Hand articulation drift
Hands gripping objects, typing, signing — fine articulation breaks.

Frame shots to avoid close-ups on hands, or accept some imperfection. Worse with complex objects (musical instruments, tools).

Multi-shot consistency cap
Character identity drifts past shot 4 in conversational editing chains.

Plan productions in chunks of ≤ 4 shots. Use @character_name to anchor identity; expect to regenerate beyond shot 4.

Source: Atlas Cloud multi-turn review
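Planning in chunks of at most four shots amounts to slicing the shot list before any prompting starts. A minimal Python sketch, with `chunk_shots` as a hypothetical helper and the shot names as made-up placeholders:

```python
MAX_SHOTS_PER_CHAIN = 4  # drift threshold reported in the entry above


def chunk_shots(shots: list[str],
                size: int = MAX_SHOTS_PER_CHAIN) -> list[list[str]]:
    """Group a shot list into consecutive chains of at most `size` shots."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [shots[i:i + size] for i in range(0, len(shots), size)]


if __name__ == "__main__":
    plan = ["intro", "walk", "door", "reveal", "close-up", "exit"]
    for chain in chunk_shots(plan):
        print(chain)
```

Each chain would start a fresh conversation re-anchored with the character reference, on the assumption that drift accumulates within a chain.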

Prompt-length cap (~50 words)
Prompts longer than ~50 words dilute focus and reduce output quality.

Be specific but concise. If the prompt grows, split into a base scene + a trigger / Keep-identical follow-up turn.

Source: Seaart analysis
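A quick word count before sending catches prompts that have grown past the guideline. The Python sketch below counts whitespace-separated words, which is an assumption: the source does not say how the roughly 50 words are measured. `SOFT_WORD_LIMIT`, `word_count`, and `is_over_limit` are hypothetical names.

```python
SOFT_WORD_LIMIT = 50  # approximate guideline from the entry above, not a hard limit


def word_count(prompt: str) -> int:
    """Count whitespace-separated words in a prompt."""
    return len(prompt.split())


def is_over_limit(prompt: str, limit: int = SOFT_WORD_LIMIT) -> bool:
    """Return True when the prompt exceeds the soft word limit."""
    return word_count(prompt) > limit


if __name__ == "__main__":
    draft = "A 6-second 9:16 video, one continuous shot. Slow push in on a ceramic mug."
    print(word_count(draft), is_over_limit(draft))
```

When the check trips, the split suggested above applies: keep the base scene in the first turn and move the rest into a follow-up.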

IP / brand name risk
Referencing copyrighted characters or brand names invites filtering or legal exposure.

Avoid "Studio Ghibli style" — use "hand-painted watercolor animation". Avoid "shot on DJI Mavic Pro" — Omni does not parse camera brands; use motion verbs instead.
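Substitutions like the one above can be applied mechanically before a prompt is sent. The Python sketch below uses a small rewrite table. `REWRITES` and `neutralize` are hypothetical names, and the table holds only the single pairing given in this entry; any further entries would be your own.

```python
import re

# Risky phrase -> neutral description. Only the pairing from this entry is listed.
REWRITES = {
    "Studio Ghibli style": "hand-painted watercolor animation",
}


def neutralize(prompt: str) -> str:
    """Replace known risky phrases with neutral descriptions, ignoring case."""
    for risky, neutral in REWRITES.items():
        # A function replacement keeps `neutral` literal even if it
        # ever contains backslashes.
        prompt = re.sub(re.escape(risky), lambda _m, n=neutral: n,
                        prompt, flags=re.IGNORECASE)
    return prompt


if __name__ == "__main__":
    print(neutralize("A forest spirit at dusk, studio ghibli style"))
```

A literal phrase table only catches exact wordings, so it works as a safety net and does not replace reading the prompt.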