This study has two parts: 6 video comparisons (blind A/B test, ~10 min) and 15 prompt evaluations (scored on 8 dimensions, ~20 min). You may leave and resume at any time.