AI Analysis
VidStitch uses AI to analyze your script or transcript and decide where clips are placed and which visuals are sourced. This document explains how that analysis works.
AI Technology
VidStitch is powered by Google Gemini 2.5 Flash, a state-of-the-art multimodal AI model that excels at:
- Natural language understanding
- Context analysis
- Visual-text matching
- Temporal reasoning
How Analysis Works
Content Understanding
The AI processes your content in several ways:
- Semantic Analysis - Understanding meaning, not just words
- Entity Extraction - Identifying people, places, events
- Temporal Mapping - Understanding time references
- Thematic Grouping - Recognizing related concepts
Example Analysis
Input Script:
AI Understanding:
- Entity: Roman Colosseum (landmark, Rome, Italy)
- Entity: Emperor Titus (historical figure, Roman)
- Date: 80 AD (ancient history)
- Event: Completion/construction
- Theme: Ancient Roman architecture
Analysis Types by Workflow
B-roll Clips Analysis
The AI identifies:
- Optimal insertion moments (natural pauses, topic transitions)
- Contextual relevance (what B-roll fits the narration)
- Pacing considerations (avoiding rapid cuts)
- Content type awareness (documentary vs entertainment)
VidStitch AI Analysis
The AI determines:
- Visual search strategies
- Content categorization
- Sentence-level visual mapping
- Quality scoring for source materials
V5 Story Analysis
The AI performs:
- Source video scene cataloging
- Script-to-scene matching
- Coverage gap identification
- Transition planning
Moment Detection
For B-roll insertion, AI detects ideal moments based on:
Natural Pause Points
- End of sentences
- Topic transitions
- Speaker changes
Content Signals
- Descriptive language ("The vast landscape...")
- Time references ("In 1492...")
- Location mentions ("In Paris, France...")
Exclusion Criteria
AI avoids suggesting insertions during:
- Direct quotes
- Critical explanations
- Emotional peaks
- Question-answer sequences
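The signals and exclusions above can be illustrated with a small rule-based stand-in. This is only a sketch: VidStitch performs this step with an AI model, not with regular expressions, and the `Sentence` type, its `end_time` field (assumed to be in seconds), and the patterns below are hypothetical. The location pattern in particular is crude and will match any capitalized word after "in".

```python
import re
from dataclasses import dataclass

# Crude, illustrative patterns; not how VidStitch detects these signals.
YEAR = re.compile(r"\b(?:1\d{3}|20\d{2})\b")   # e.g. "In 1492..."
LOCATION = re.compile(r"\b[Ii]n [A-Z][a-z]+")  # e.g. "In Paris, France..."
QUOTE = re.compile(r"[\"\u201c\u201d]")        # straight or curly double quotes


@dataclass
class Sentence:
    text: str
    end_time: float  # assumed: seconds from the start of the narration


def candidate_moments(sentences):
    """Return (end_time, signals) for sentences that end at a usable pause."""
    moments = []
    for s in sentences:
        # Exclusion criteria: skip direct quotes and questions.
        if QUOTE.search(s.text) or s.text.rstrip().endswith("?"):
            continue
        signals = []
        if YEAR.search(s.text):
            signals.append("time reference")
        if LOCATION.search(s.text):
            signals.append("location mention")
        if signals:
            moments.append((s.end_time, signals))
    return moments


script = [
    Sentence("In 1492 Columbus sailed west.", 4.0),
    Sentence('He said, "We must go."', 7.5),
    Sentence("In Paris, France, the tower rises.", 12.0),
    Sentence("Why did it matter?", 14.0),
]
print(candidate_moments(script))
```

Each candidate is anchored at the end of its sentence, which matches the "end of sentences" pause point listed above.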
Visual Matching
Query Generation
For each script segment, AI generates targeted search queries:
Script: "The Eiffel Tower was built for the 1889 World's Fair"
Generated Queries:
1. "Eiffel Tower Paris daytime"
2. "Eiffel Tower construction historical"
3. "1889 World's Fair Paris"
4. "Eiffel Tower aerial view"
Source Selection
AI ranks potential sources by:
- Relevance score (0-100)
- Visual quality indicators
- Duration suitability
- Content appropriateness
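One way to picture how these four criteria could combine is the sketch below. The document does not state how VidStitch weighs them, so the ordering here (filter on appropriateness and duration, then sort by relevance with quality as a tie-breaker) is an assumption, as are the `Candidate` fields and the `rank_candidates` name.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    video_id: str
    relevance: int      # 0-100, as in the list above
    quality: float      # hypothetical 0.0-1.0 visual quality indicator
    duration_s: float   # clip length in seconds
    appropriate: bool   # hypothetical content-appropriateness flag


def rank_candidates(candidates, needed_s):
    """Drop unusable clips, then order the rest best-first."""
    usable = [
        c for c in candidates
        if c.appropriate and c.duration_s >= needed_s
    ]
    # Assumed ordering: relevance first, quality breaks ties.
    return sorted(usable, key=lambda c: (c.relevance, c.quality), reverse=True)


clips = [
    Candidate("a", relevance=80, quality=0.9, duration_s=8.0, appropriate=True),
    Candidate("b", relevance=92, quality=0.6, duration_s=6.0, appropriate=True),
    Candidate("c", relevance=95, quality=0.8, duration_s=3.0, appropriate=True),
    Candidate("d", relevance=99, quality=0.9, duration_s=9.0, appropriate=False),
]
print([c.video_id for c in rank_candidates(clips, needed_s=5.0)])
```

Clip `c` is dropped for being too short and `d` for failing the appropriateness check, so high relevance alone does not guarantee selection in this sketch.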
Quality Scoring
Each AI decision includes a confidence score:
| Score | Meaning | Action |
|---|---|---|
| 90-100 | Excellent match | Auto-approve |
| 70-89 | Good match | Review recommended |
| 50-69 | Acceptable | Manual review |
| < 50 | Poor match | Likely rejected |
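If you post-process scores yourself, the bands in the table can be expressed as a small lookup. The function name `action_for_score` is hypothetical, and the sketch assumes scores are numeric values between 0 and 100.

```python
def action_for_score(score: float) -> str:
    """Map a 0-100 confidence score to the action from the table above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "Auto-approve"
    if score >= 70:
        return "Review recommended"
    if score >= 50:
        return "Manual review"
    return "Likely rejected"


for score in (95, 87, 55, 30):
    print(score, action_for_score(score))
```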
Content Modes
Documentary Mode
AI prioritizes:
- Educational accuracy
- Historical authenticity
- Informative visuals
- Longer clip durations
News Mode
AI prioritizes:
- Current/recent footage
- Fast-paced editing
- Multiple visual changes
- Contemporary sources
Sermon Mode
AI prioritizes:
- Respectful imagery
- Biblical/spiritual context
- Avoiding speaker interruption
- Contemplative pacing
Improving AI Results
Write Specific Content
Poor: "Many things happened during that time."
Good: "The Industrial Revolution transformed British factories between 1760 and 1840."
Include Visual Subjects
Poor: "It was important for many reasons."
Good: "The steam engine revolutionized transportation across railway networks."
Use Proper Nouns
Poor: "The leader gave a famous speech."
Good: "Winston Churchill delivered his 'We shall fight on the beaches' speech."
AI Limitations
What AI Cannot Do
- Source copyrighted/restricted content
- Create fictional imagery
- Understand sarcasm reliably
- Handle heavy metaphors
- Process non-English content (currently)
Edge Cases
The AI may struggle with:
- Very abstract concepts
- Highly specialized jargon
- Regional/local references
- Recent events (training cutoff)
Cost and Credits
AI analysis consumes credits based on:
- Script/transcript length
- Analysis complexity
- Number of segments
- Visual sourcing (VidStitch AI)
Typical costs:
| Operation | Credit Cost |
|---|---|
| Short analysis (<5 min) | 1-2 credits |
| Standard analysis (5-15 min) | 2-5 credits |
| Long analysis (15+ min) | 5-10 credits |
| Visual sourcing | Additional 1-5 credits |
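For rough budgeting, the typical costs can be turned into a range estimate. This is a sketch, not the billing logic: actual usage also depends on analysis complexity and segment count, and the function name `estimate_credits` is hypothetical. The table lists 15 minutes under both "5-15 min" and "15+ min"; the sketch assumes exactly 15 minutes counts as a long analysis.

```python
def estimate_credits(duration_minutes: float, visual_sourcing: bool = False):
    """Return a (low, high) credit range based on the typical costs listed."""
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    if duration_minutes < 5:
        low, high = 1, 2     # short analysis
    elif duration_minutes < 15:
        low, high = 2, 5     # standard analysis
    else:
        low, high = 5, 10    # long analysis (assumed to include exactly 15 min)
    if visual_sourcing:
        # Visual sourcing adds 1-5 credits on top of the analysis.
        low, high = low + 1, high + 5
    return low, high


print(estimate_credits(3))                         # short script
print(estimate_credits(10, visual_sourcing=True))  # standard script with sourcing
```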
Transparency
VidStitch AI provides reasoning for its decisions:
Placement Reasoning
{
  "time": 45.5,
  "reason": "Topic transition to Roman architecture",
  "confidence": 87,
  "context": "Speaker moves from history to visual description"
}
Source Reasoning
{
  "segment": "The Colosseum stands in Rome",
  "query": "Colosseum Rome exterior daytime",
  "selected": "video_id_123",
  "reason": "Clear exterior shot matching description"
}
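If you consume these records programmatically, a minimal sketch like the following loads both examples into typed objects. It assumes the records contain exactly the fields shown above; the class names are hypothetical, and how you obtain the JSON from VidStitch is not covered here.

```python
import json
from dataclasses import dataclass


@dataclass
class PlacementReasoning:
    time: float
    reason: str
    confidence: int
    context: str


@dataclass
class SourceReasoning:
    segment: str
    query: str
    selected: str
    reason: str


PLACEMENT_JSON = """
{
  "time": 45.5,
  "reason": "Topic transition to Roman architecture",
  "confidence": 87,
  "context": "Speaker moves from history to visual description"
}
"""

SOURCE_JSON = """
{
  "segment": "The Colosseum stands in Rome",
  "query": "Colosseum Rome exterior daytime",
  "selected": "video_id_123",
  "reason": "Clear exterior shot matching description"
}
"""

# Unpacking raises TypeError if a record has missing or unexpected keys,
# which surfaces format changes early.
placement = PlacementReasoning(**json.loads(PLACEMENT_JSON))
source = SourceReasoning(**json.loads(SOURCE_JSON))

print(f"{placement.time}: {placement.reason} (confidence {placement.confidence})")
print(f"{source.segment!r} -> {source.selected}: {source.reason}")
```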
Next Steps
- Writing Effective Scripts - Optimize for AI
- Rendering Options - Output your video
- Troubleshooting - AI issues