Blog / 5 min read · 2026-05-24
Camera vocabulary Gemini Omni parses literally — verbs, lenses, and what to avoid
A working list of camera motion verbs and lens descriptors Gemini Omni interprets correctly. Plus the brand-name traps that look like camera language but actually do nothing.
Most Gemini Omni tutorials list “good prompt examples” without separating which words the model actually parses and which are decoration. This post is the working vocabulary: terms Omni reliably interprets, structured by what they do.
If you’re new to Omni, read the base formula in the field guide first. Then come back here for the verb list.
The six motion verb families
Omni’s prompt guide enumerates these six camera motion verb families. Each maps to a specific behavior you can rely on.
Push (forward motion toward subject)
- push in / slow push / fast push: dolly-in equivalent
- punch in: sharper, faster push
- dolly zoom: combined push-in + zoom-out for the Hitchcock effect
Use for: building intensity, isolating the subject, creating intimacy. Pair with a subject doing nothing — let the camera do the work.
Pull (reverse motion away from subject)
- pull-back / slow pull-back
- pull-up: pull-back + vertical rise (drone reveal)
- pull-down: pull-back + descend
Use for: revealing context, opening a scene, mood-shifting from intimate to expansive.
Reveal (motion that uncovers hidden information)
- ascend revealing the city below
- pull-back and rotate revealing X
- tilt up revealing the spire
Different from a pure pull: the destination after “revealing” is the syntactic anchor. Specify what’s being revealed for stronger results.
Orbit (camera revolves around subject)
- orbit around / slow orbit / 360 orbit
- sweep around
- circle the subject
Use for: hero product shots, character introductions, dramatic reveals. Combine with “while subject stays still” to lock subject behavior.
Pan (rotation from a fixed position)
- pan left to right / pan right to left
- vertical pan upward / vertical pan downward
Critical distinction from tracking: pan is rotation only, no spatial movement. Use for scanning a static scene.
Static (no camera movement)
- static shot / locked off / fixed camera
- oner: implies a single continuous take
- one continuous shot: explicit “no edits” instruction
Use when subject motion or staging carries the shot. The “one continuous shot” phrase is particularly load-bearing in Omni — it acts as a hard constraint, not a suggestion.
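As a rough sketch, the six families above can be kept as a lookup table and used to check which families a draft prompt invokes. The phrase lists are copied from this post. The function name `motion_families` and the matching rule (case-insensitive, whole-phrase) are assumptions made for illustration, not anything Omni exposes.

```python
import re

# Phrases copied from the six families listed in this post.
MOTION_FAMILIES = {
    "push": ["push in", "slow push", "fast push", "punch in", "dolly zoom"],
    "pull": ["pull-back", "pull-up", "pull-down"],
    "reveal": ["revealing"],
    "orbit": ["orbit", "sweep around", "circle the subject"],
    "pan": ["pan left to right", "pan right to left", "vertical pan"],
    "static": ["static shot", "locked off", "fixed camera", "oner",
               "one continuous shot"],
}


def motion_families(prompt: str) -> list[str]:
    """Return the families whose phrases appear in the prompt (hypothetical helper)."""
    text = prompt.lower()
    found = []
    for family, phrases in MOTION_FAMILIES.items():
        # Word boundaries keep "oner" from matching inside "prisoner".
        if any(re.search(rf"\b{re.escape(p)}\b", text) for p in phrases):
            found.append(family)
    return found


print(motion_families("Pull-back and rotate revealing the harbor."))
```

A prompt that opens with “one continuous shot” will report the static family, because this post lists that phrase there. More than two families in one result is a hint that the prompt is asking the camera to do too much.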
Style references Omni renders
These describe the medium the camera represents rather than the camera move:
- natural smartphone zoom: slightly soft, digital
- film camera, warm, slight grain: film-stock texture
- 16mm film: coarser grain, vintage palette
- 35mm film: cleaner cinematic look
- webcam style: compressed, slightly soft (good for video-call mockups)
- handheld, micro-jitter, organic: humanized motion
- stabilized handheld: minimal shake
- shaky handheld: chaotic, found-footage
Pair one style reference with one motion verb. Stacking many (“16mm film, handheld, dolly zoom, anamorphic flare”) dilutes focus.
Composition + duration locks
Lock these in your opening sentence:
Create a [duration]-second [aspect-ratio] [genre] video in one continuous shot.
- 8-second 16:9 for standard cinematic
- 5-second 9:16 for Reels / Shorts / TikTok
- 10-second 1:1 for square social (max Omni duration)
- 8-second 2.39:1 for anamorphic widescreen
Omni respects these strictly when specified up front. Stuffing them in metadata or at the end of the prompt is less reliable.
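If you generate prompts from a script, the opening sentence can be templated so the locks always land up front. This is a minimal sketch: the 10-second cap comes from the list above, while the function name `opening_line` and the aspect-ratio pattern are assumptions for illustration.

```python
import re

MAX_DURATION_S = 10  # this post lists 10 seconds as the maximum Omni duration


def opening_line(duration_s: int, aspect_ratio: str, genre: str) -> str:
    """Build the opening sentence that locks duration, aspect ratio, and genre."""
    if not 1 <= duration_s <= MAX_DURATION_S:
        raise ValueError(f"duration must be 1-{MAX_DURATION_S} seconds")
    # Accepts ratios such as 16:9, 9:16, 1:1, 2.39:1 (assumed format check).
    if not re.fullmatch(r"\d+(\.\d+)?:\d+", aspect_ratio):
        raise ValueError(f"unrecognized aspect ratio: {aspect_ratio!r}")
    # Within 1-10, only "eight" starts with a vowel sound.
    article = "an" if duration_s == 8 else "a"
    return (f"Create {article} {duration_s}-second {aspect_ratio} {genre} "
            "video in one continuous shot.")


print(opening_line(8, "16:9", "cinematic"))
```

Failing early on an out-of-range duration is a design choice here: it surfaces the problem in your script instead of leaving it to the model.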
What NOT to write
These look like camera language but actually do nothing in Omni:
Camera brand names
- shot on DJI Mavic Pro ❌
- RED Komodo footage ❌
- BMPCC 6K Pro ❌
Omni doesn’t parse hardware brands. Use motion verbs and style references instead. “Drone-style aerial pull-up over the cliff” beats “shot on DJI Mavic 3 Cine” every time.
Adjective stacks
- cinematic dramatic moody atmospheric epic stunning beautiful breathtaking ❌
DeepMind’s prompt guide explicitly says Omni “doesn’t need to be as prescriptive as Veo” — natural language with one clear cinematic anchor beats adjective lists.
IP-bound style names
- Studio Ghibli style ⚠️: works but raises IP flags
- Pixar 3D animation style ⚠️: same
Use medium descriptors instead: hand-painted watercolor animation, 3D animated, soft rim lighting, warm palette. Same look, no legal exposure.
Prompts over 50 words
Per Seaart’s analysis, prompts longer than ~50 words dilute focus. Be specific but concise. If your prompt grows, split into a base scene + a follow-up turn with the trigger pattern.
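The four checks in this section are mechanical enough to run as a small linter before you submit a prompt. This is a sketch under stated assumptions: the brand, style, and adjective lists contain only the examples from this post, the "three or more filler adjectives in a row" threshold is my own guess at what counts as a stack, and `lint_prompt` is a hypothetical helper name.

```python
import re

# Only the examples named in this post; extend for your own use.
BRAND_PATTERNS = [r"\bdji\b", r"\bred komodo\b", r"\bbmpcc\b"]
IP_STYLE_PATTERNS = [r"\bstudio ghibli\b", r"\bpixar\b"]
FILLER_ADJECTIVES = {
    "cinematic", "dramatic", "moody", "atmospheric",
    "epic", "stunning", "beautiful", "breathtaking",
}
WORD_LIMIT = 50          # the ~50-word guideline cited above
MAX_ADJECTIVE_RUN = 2    # assumption: 3+ in a row counts as a stack


def longest_filler_run(words: list[str]) -> int:
    """Length of the longest run of consecutive filler adjectives."""
    longest = current = 0
    for word in words:
        if word.strip(".,;:!?\"'()") in FILLER_ADJECTIVES:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def lint_prompt(prompt: str) -> list[str]:
    """Return a warning label for each pattern this section advises against."""
    text = prompt.lower()
    words = text.split()
    warnings = []
    if any(re.search(p, text) for p in BRAND_PATTERNS):
        warnings.append("camera brand name")
    if any(re.search(p, text) for p in IP_STYLE_PATTERNS):
        warnings.append("IP-bound style name")
    if longest_filler_run(words) > MAX_ADJECTIVE_RUN:
        warnings.append("adjective stack")
    if len(words) > WORD_LIMIT:
        warnings.append("over 50 words")
    return warnings


print(lint_prompt("Epic stunning beautiful aerial view, shot on DJI Mavic Pro"))
```

A single “cinematic” in the opening line passes, since one anchor word is not a stack.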
Quick reference: a working prompt
Create an 8-second 16:9 cinematic video in one continuous shot.
A woman in a red coat walks through a snowy Tokyo alley at night.
Slow handheld tracking from her side. 35mm film, warm tungsten lighting from windows.
Parses to: 8 sec duration ✓, 16:9 aspect ✓, one continuous shot ✓, subject ✓, action ✓, location ✓, time of day ✓, camera motion verb (tracking) ✓, style reference (35mm film) ✓, lighting ✓. Under 50 words. No brand names. No adjective stacks.
That’s the shape Omni produces reliably.
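For scripted workflows, the same shape can be held as four parts and rendered in a fixed order, so the opening lock always comes first. This is a minimal sketch: the `ShotPrompt` class and its field names are hypothetical, chosen only to mirror the three lines of the working prompt.

```python
from dataclasses import dataclass


@dataclass
class ShotPrompt:
    """Hypothetical container for the parts of the working prompt."""
    opening: str  # duration, aspect ratio, genre, "one continuous shot"
    scene: str    # subject, action, location, time of day
    camera: str   # one motion verb
    style: str    # one style reference plus lighting

    def render(self) -> str:
        # Opening lock first, then scene, then camera and style on one line.
        return "\n".join([self.opening, self.scene, f"{self.camera} {self.style}"])

    def word_count(self) -> int:
        return len(self.render().split())


prompt = ShotPrompt(
    opening="Create an 8-second 16:9 cinematic video in one continuous shot.",
    scene="A woman in a red coat walks through a snowy Tokyo alley at night.",
    camera="Slow handheld tracking from her side.",
    style="35mm film, warm tungsten lighting from windows.",
)
print(prompt.render())
print(prompt.word_count())
```

Keeping the parts separate makes it easy to swap the camera line or the style line between takes while leaving the scene untouched.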
Related
- Full field guide — base formula and opening-line trick
- “Keep X identical” lock — for follow-up turns
- Trigger pattern for VFX — when you need a transformation mid-clip
- Glossary — full term-by-term reference
Sources