- Extracted visual_stats per scene (face count, size, objects, duration, density)
- Classified 1130 scenes into 18 types (establishing/close_up/medium/long/two/group × dialogue/sparse/silent)
- All from existing data, no LLM needed
- Scene type stored in cut chunk metadata
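
For illustration, a rule-based classifier of this kind might look roughly like the sketch below. Only the five stat names and the 6 × 3 type grid come from the notes above. Everything else is an assumption: the `VisualStats` field names and units, every threshold, the `shot_dialogue` label format, and the `scene_type` metadata key are hypothetical, and the actual rules may differ.

```python
from dataclasses import dataclass
from itertools import product

SHOT_TYPES = ("establishing", "close_up", "medium", "long", "two", "group")
DIALOGUE_LEVELS = ("dialogue", "sparse", "silent")

# 6 shot types x 3 dialogue levels = 18 scene types.
# The "shot_dialogue" label format is an assumption.
ALL_SCENE_TYPES = [f"{s}_{d}" for s, d in product(SHOT_TYPES, DIALOGUE_LEVELS)]


@dataclass
class VisualStats:
    """Per-scene stats; field names and units are assumed, not from the source."""
    face_count: int      # number of detected faces
    face_size: float     # assumed: largest face area as a fraction of the frame
    object_count: int    # detected objects (carried along, unused by these rules)
    duration: float      # seconds (carried along, unused by these rules)
    density: float       # assumed: spoken words per second


def shot_type(stats: VisualStats) -> str:
    """Pick a shot type from face count and face size (hypothetical thresholds)."""
    if stats.face_count == 0:
        return "establishing"
    if stats.face_count == 2:
        return "two"
    if stats.face_count >= 3:
        return "group"
    # Exactly one face: bucket by how much of the frame it fills.
    if stats.face_size >= 0.15:
        return "close_up"
    if stats.face_size >= 0.04:
        return "medium"
    return "long"


def dialogue_level(stats: VisualStats) -> str:
    """Pick a dialogue level from speech density (hypothetical thresholds)."""
    if stats.density >= 1.0:
        return "dialogue"
    if stats.density > 0.0:
        return "sparse"
    return "silent"


def classify_scene(stats: VisualStats) -> str:
    """Combine both axes into one of the 18 scene type labels."""
    return f"{shot_type(stats)}_{dialogue_level(stats)}"


# Hypothetical usage: store the label in a cut chunk's metadata dict.
chunk_metadata = {"scene_id": 1}
chunk_metadata["scene_type"] = classify_scene(
    VisualStats(face_count=1, face_size=0.30, object_count=2, duration=3.0, density=2.0)
)
```

Because the rules are plain comparisons over stats that already exist, the whole pass is deterministic and cheap to rerun whenever a threshold changes.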