Most AI video prompts describe what to show — "a woman walking through a forest" — without saying anything about how to frame it. That's like describing a photograph without mentioning whether it's a close-up or an aerial shot.
Camera vocabulary is one of the most underused levers in AI video prompting. The same scene described from a worm's eye view versus a bird's eye view generates completely different emotional content. A Dutch angle signals psychological unease. A dolly-in creates intimacy. A crane shot builds epic scale.
This guide covers the three layers of camera language — angles, shot sizes, and movements — plus how to combine them into prompts that AI video models actually follow.
Three Layers of Camera Language
Most guides conflate these. They're distinct:
- Camera angle: the camera's height, tilt, and perspective relative to the subject (eye level, low angle, Dutch angle)
- Shot size: how much of the subject the frame includes (wide shot, close-up, extreme close-up)
- Camera movement: how the camera moves during the shot (dolly, pan, tracking, orbit)
Effective prompts typically combine all three: "Low angle close-up, the camera slowly dollying in toward the figure as they reach for the door handle."
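For anyone generating prompts in bulk, the three layers can be kept as separate fields and joined into one sentence. The following is a minimal Python sketch, not part of any video model's API; `CameraSpec` and `build_prompt` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraSpec:
    """Hypothetical container for the three layers of camera language."""
    angle: str      # e.g. "low angle"
    shot_size: str  # e.g. "close-up"
    movement: str   # e.g. "the camera slowly dollying in toward the figure"


def build_prompt(spec: CameraSpec, action: str) -> str:
    """Join angle, shot size, movement, and the scene action into one sentence."""
    framing = f"{spec.angle} {spec.shot_size}".strip()
    # Capitalize only the first letter so terms like "POV" keep their casing.
    framing = framing[:1].upper() + framing[1:]
    sentence = f"{framing}, {spec.movement} {action}".strip()
    return sentence.rstrip(".") + "."


spec = CameraSpec(
    angle="low angle",
    shot_size="close-up",
    movement="the camera slowly dollying in toward the figure",
)
print(build_prompt(spec, "as they reach for the door handle"))
```

Keeping the layers as separate fields makes it easy to swap one, for example the movement, while holding the other two constant and comparing the results.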
Camera Angles: Vertical Position and Perspective

Eye Level
Camera placed at the subject's eye height. The neutral default. Creates familiarity and equality — neither power nor vulnerability. Use it when you want the viewer to feel present with the subject.
Prompt phrase: "Eye-level medium shot, the camera level with her gaze as she sits across the table."
Low Angle
Camera below the subject, pointing upward. The subject appears powerful, dominant, or imposing. Background becomes sky or ceiling, adding grandeur.
Prompt phrase: "Low angle shot looking upward at the figure on the rooftop edge, the city lights glowing behind them, the camera tilting up to emphasize their silhouette."
High Angle
Camera above the subject, pointing downward (30–60 degrees, not straight down). The subject appears smaller, vulnerable, or surveilled.
Prompt phrase: "High angle shot looking down at a lone figure in the center of an empty plaza, emphasizing their smallness against the geometry of the space."
Bird's Eye View
Camera directly overhead at 90 degrees. Reveals spatial patterns and scale. Creates a detached, god-like perspective — useful for establishing geography or emphasizing symmetry.
Prompt phrase: "Bird's eye view looking straight down onto a busy intersection at night, taxi headlights tracing light trails across wet asphalt."
Worm's Eye View
Camera at or near ground level, pointing straight up. Subjects appear towering. Creates awe, scale, or disorientation.
Prompt phrase: "Worm's eye view from ground level looking straight up at the skyscraper, its glass face reflecting clouds as the structure converges toward a narrow strip of sky."
Dutch Angle
Camera tilted on its roll axis — the horizon line becomes diagonal. The visual instability signals something is wrong, tense, or psychologically off-balance.
Prompt phrase: "Dutch angle, camera tilted 20 degrees clockwise, a figure walking down a flickering corridor, the diagonal walls amplifying a growing sense of unease."
LTX-Video tip: If "Dutch angle" alone doesn't stick, reinforce it with synonyms: "a disorienting angled shot with a tilted horizon, camera canted to the left."
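The reinforcement idea can be automated for prompts that are built in code. This is a rough Python sketch under stated assumptions: the `REINFORCEMENTS` table and `reinforce` function are hypothetical, and the only entry is the phrasing quoted in the tip above.

```python
# Hypothetical lookup of fallback phrasings for camera terms a model may ignore.
# The single entry here is the synonym phrasing quoted in the tip.
REINFORCEMENTS = {
    "dutch angle": (
        "a disorienting angled shot with a tilted horizon, "
        "camera canted to the left"
    ),
}


def reinforce(prompt: str) -> str:
    """Append a fallback phrasing for each known camera term found in the prompt."""
    lowered = prompt.lower()
    extras = [
        phrase
        for term, phrase in REINFORCEMENTS.items()
        if term in lowered and phrase not in prompt
    ]
    if not extras:
        return prompt  # nothing to add, or already reinforced
    return prompt.rstrip(". ") + ", " + ", ".join(extras) + "."


print(reinforce("Dutch angle, a figure walking down a flickering corridor."))
```

If the original prompt already specifies a tilt direction, adjust the fallback phrase so the two directions do not contradict each other.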
Over-the-Shoulder (OTS)
Camera positioned just behind one character's shoulder, framing the other person in the mid-ground. Establishes spatial relationship and intimacy between characters.
Prompt phrase: "Over-the-shoulder of the detective, camera behind their right shoulder, framing the suspect across the interrogation table under a harsh overhead light."
POV (Point of View)
Camera represents the literal eyes of a character. Immediate, first-person immersion.
Prompt phrase: "POV shot from the perspective of someone hiking through dense forest, hands occasionally visible brushing aside branches, dappled light ahead."
Shot Sizes at a Glance
| Shot Name | What It Shows | Best For |
|---|---|---|
| Extreme Wide Shot | Vast environment, subject tiny | Scale, isolation, world-building |
| Wide Shot | Full body + surroundings | Subject in environment, context |
| Medium Shot | Waist up | Dialogue, character introduction |
| Close-Up | Face and neck | Emotion, reaction |
| Extreme Close-Up | Single feature (eye, hand, object) | Intensity, detail, emphasis |
| Establishing Shot | Location overview, usually framed wide or extreme wide | Scene opening, spatial context |
Camera Movements

Static / Locked-Off
Camera does not move. All motion happens within the frame. AI video models often add camera movement unprompted, so state it explicitly: "The camera is entirely motionless for the duration of the scene."
Pan (Left / Right)
Camera rotates horizontally from a fixed point. Reveals a wider scene or follows action side to side.
Prompt: "The camera pans slowly left, sweeping across the rooftops to reveal the harbor beyond."
Tilt (Up / Down)
Camera rotates vertically from a fixed point. Tilting up builds anticipation; tilting in either direction reveals height and depth.
Prompt: "A tilt upward begins at the base of the lighthouse, rising to reveal its beam cutting through heavy fog."
Dolly / Push-In / Pull-Back
The entire camera physically moves forward or backward. Unlike a zoom, which only magnifies the image, a dolly changes perspective: foreground and background shift relative to each other. Push-in creates intimacy and intensity. Pull-back reveals context and isolation.
Push-in: "A slow dolly forward gradually fills the frame with her face as the background softly wraps around her."
Pull-back: "The camera pulls back steadily, revealing the figure has been standing alone in an enormous empty stadium."
Tracking Shot
Camera follows a moving subject, matching their speed and direction.
Prompt: "A tracking shot follows the runner from behind, the camera keeping pace at shoulder height, the crowd blurring on either side."
Orbit / Arc
Camera circles around a stationary subject, keeping it centered.
Prompt: "The camera orbits slowly clockwise around the statue, neon reflections shifting across its surface as the city background rotates behind it."
Crane / Boom
Camera sweeps vertically — rising from ground level to aerial height, or descending from above.
Prompt: "A crane shot sweeps upward from a close-up on the commander's face, rising to reveal the full battlefield stretching toward the horizon."
Handheld
Camera is carried by the operator, so the frame has natural shake and organic movement. Signals realism, urgency, and immediacy.
Prompt: "A handheld camera follows close behind the medic, its natural jitter conveying the chaos of the scene."
Combining Angle + Shot Size + Movement
The most effective prompts layer all three. Here are four copy-paste examples ready to use in LTX-Video or any AI video model:
Cinematic tension:
"Low angle close-up, the camera slowly dollying in toward the figure as they reach for the door handle, the ceiling looming above and casting long shadows, natural light from a single bare bulb."
Epic scale:
"A wide establishing shot from a crane rising upward, beginning at street level and ascending above the cityscape until the full sweep of the harbor becomes visible at golden hour, the camera movement smooth and unhurried."
Intimacy:
"Eye-level medium close-up, the camera gently pushing in as the conversation grows quieter, shallow focus keeping the subject sharp while the background softens to an impressionistic blur."
Unease:
"Dutch angle medium shot, camera tilted 15 degrees, a figure walking toward the camera in an empty parking structure at night, the diagonal geometry of the concrete walls reinforcing their disorientation."
Putting These to Work in LTX-Video
The LTX-Video prompting guide recommends writing camera instructions as part of a flowing paragraph rather than a list. Start with the shot and movement, then describe what happens during the motion. Runway's official camera terms reference is also a useful cross-check for which terms AI models recognize.
LTX-23 runs on LTX-Video and responds well to this vocabulary. The model is built for cinematically coherent output, so camera terms like "slow dolly forward," "orbiting around," and "crane rising to aerial height" tend to be followed closely when paired with clear scene descriptions.
Quick Reference: What to Avoid
- Conflicting instructions: "Static handheld shot" asks for a motionless camera and a moving one at once. Pick one.
- Invisible features: Don't describe what would only be visible from a different angle than the one you specified.
- Vague scale: "Wide shot" is underspecified. "Extreme wide shot showing the full landscape with the figure as a small shape on the horizon" is not.
- No endpoint: For movement shots, describe what the scene looks like after the motion completes. This helps the model understand the trajectory.
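Of the four pitfalls above, conflicting instructions is the one that lends itself to a mechanical check; the others need human judgment. The sketch below is a simple keyword heuristic in Python, with a hypothetical `find_conflicts` function and deliberately short, illustrative term lists. It will miss some conflicts and may flag unrelated uses of words like "static".

```python
import re

# Illustrative patterns only, not an exhaustive camera vocabulary.
# "tilt" is left out on purpose: "camera tilted 15 degrees" describes
# a Dutch angle, which is compatible with a static shot.
STATIC_PATTERN = re.compile(r"\b(static|locked-off|motionless)\b")
MOVING_PATTERN = re.compile(
    r"\b(handheld|dolly|dollying|pans|panning|tracking|orbits|orbiting|crane)\b"
)


def find_conflicts(prompt: str) -> list[str]:
    """Return warnings when a prompt asks for both a static and a moving camera."""
    text = prompt.lower()
    static_hits = sorted(set(STATIC_PATTERN.findall(text)))
    moving_hits = sorted(set(MOVING_PATTERN.findall(text)))
    if static_hits and moving_hits:
        return [
            f"static term(s) {static_hits} conflict with "
            f"movement term(s) {moving_hits}"
        ]
    return []


print(find_conflicts("Static handheld shot of a market stall"))
print(find_conflicts("A tracking shot follows the runner from behind"))
```

Running a check like this before submitting a batch of prompts catches the obvious contradictions; a quick manual read is still needed for invisible features, vague scale, and missing endpoints.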
Verdict
Camera vocabulary is the fastest way to improve AI video output without changing your subject or story. Fifteen minutes learning these terms pays off in every prompt you write from here on.
Start with three: a shot size, a camera angle, and one camera movement. Combine them, describe the scene, and add the endpoint. That's the formula that consistently produces more intentional results.
