The unsolved problem of AI character consistency
Talk to anyone who's tried to use AI to build a "digital influencer" or a recurring brand persona, and you'll hear the same complaint: the character keeps drifting. It looks right in the first generation. It looks right-ish in the second. By the fifth post the nose is different, the jawline has shifted, and the hair has changed colour.
Single-image reference workflows can carry identity for a moment. They can't carry it across a 30-post content calendar.
Why one reference image isn't enough
A diffusion model trained on millions of faces sees your reference image as a hint, not a constraint. From one angle, it has to infer what the back of the head looks like, what the side profile looks like, and what the character looks like when smiling. It guesses, and it guesses differently every time.
The fix: don't make it guess.
Multi-angle character refs
In Vephon 2.0, every persona supports a library of character references (front, three-quarter, side, back, closeup, full body). Each reference is tagged with an expression (neutral, smile, serious, laugh, surprise, sad, focus, custom) and can carry an optional outfit and lighting note.
When the image model gets a request, the orchestrator picks the two to four references that best match the requested angle and expression and forwards them as conditioning inputs alongside the prompt. With Nano Banana Pro accepting up to 14 reference images per request, there is room for that reference set plus scene and prop inputs, so identity stays locked even across radical scene changes: beach to boardroom to back-lit interior.
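The selection step described above can be pictured with a small sketch. This is not Vephon's actual orchestrator: the `CharacterRef` type, the `pick_references` function, the scoring weights (angle match counts double an expression match) and the rule of always leading with the primary reference are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterRef:
    """Hypothetical record for one tagged character reference."""
    ref_id: str
    angle: str        # e.g. "front", "three-quarter", "side", "closeup"
    expression: str   # e.g. "neutral", "smile"
    primary: bool = False


def match_score(ref, angle, expression):
    # Assumed weighting: a matching angle is worth 2, a matching expression 1.
    return 2 * (ref.angle == angle) + (ref.expression == expression)


def pick_references(library, angle, expression, min_refs=2, max_refs=4):
    """Return between min_refs and max_refs references for one request."""
    # Best matches first; the primary reference wins ties.
    ranked = sorted(
        library,
        key=lambda r: (-match_score(r, angle, expression), not r.primary),
    )
    # Assumption: the primary reference always leads as the identity anchor.
    chosen = [r for r in library if r.primary][:1]
    # Add every reference that matches on angle or expression, up to max_refs.
    for ref in ranked:
        if len(chosen) >= max_refs:
            break
        if ref not in chosen and match_score(ref, angle, expression) > 0:
            chosen.append(ref)
    # If too few matched, pad with the next-best references up to min_refs.
    for ref in ranked:
        if len(chosen) >= min_refs:
            break
        if ref not in chosen:
            chosen.append(ref)
    return chosen


LIBRARY = [
    CharacterRef("front-neutral", "front", "neutral", primary=True),
    CharacterRef("three-quarter-neutral", "three-quarter", "neutral"),
    CharacterRef("side-neutral", "side", "neutral"),
    CharacterRef("full-body-neutral", "full-body", "neutral"),
    CharacterRef("closeup-smile", "closeup", "smile"),
]

picked = pick_references(LIBRARY, angle="side", expression="smile")
print([r.ref_id for r in picked])
# ['front-neutral', 'side-neutral', 'closeup-smile']
```

In this sketch a "profile shot, slight smile" request pulls the side reference for geometry and the smiling closeup for expression, with the primary front shot holding the overall identity. A production scorer would presumably also weigh outfit and lighting notes.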
What this unlocks
Three things become possible that weren't before:
- Long-form campaigns. A 30-day content calendar can star the same digital persona without face drift.
- Specific shot direction. "Profile shot, slight smile, soft side-light" actually picks the right reference set instead of guessing.
- Outfit and prop continuity. Combine character refs with the #asset system (props, locations, backgrounds) for a fully identity-locked composition.
How to build a good cast member
A rule of thumb: 3–6 references is the sweet spot. More than that and you start crowding the model's context; fewer than three and identity drift creeps back in. We recommend:
- 1× front (neutral expression, primary)
- 1× three-quarter (neutral)
- 1× side profile
- 1× full body (signature outfit)
- 1× closeup (smile)
- Optional: 1× back or alternative outfit
Tag the front-neutral as primary so it's always the anchor.
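The checklist above is easy to turn into a quick sanity check before a persona goes into production. The sketch below is hypothetical: `CastEntry` and `check_cast` are illustrative names rather than part of any Vephon API, and the thresholds simply encode the 3–6 rule of thumb and the recommended angles from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CastEntry:
    """Hypothetical record for one reference in a cast member's library."""
    angle: str
    expression: str = "neutral"
    primary: bool = False


# Angles from the recommended set; "back" is optional, so it is not checked.
RECOMMENDED_ANGLES = ("front", "three-quarter", "side", "full-body", "closeup")


def check_cast(refs):
    """Return a list of warnings; an empty list means the cast looks healthy."""
    warnings = []
    if len(refs) < 3:
        warnings.append("fewer than 3 references: identity drift is likely")
    elif len(refs) > 6:
        warnings.append("more than 6 references: may crowd the model's context")

    primaries = [r for r in refs if r.primary]
    if len(primaries) != 1:
        warnings.append(f"expected exactly 1 primary reference, found {len(primaries)}")
    elif (primaries[0].angle, primaries[0].expression) != ("front", "neutral"):
        warnings.append("primary reference is not the front-neutral shot")

    present = {r.angle for r in refs}
    for angle in RECOMMENDED_ANGLES:
        if angle not in present:
            warnings.append(f"missing recommended angle: {angle}")
    return warnings


recommended_cast = [
    CastEntry("front", "neutral", primary=True),
    CastEntry("three-quarter", "neutral"),
    CastEntry("side"),
    CastEntry("full-body"),
    CastEntry("closeup", "smile"),
]
print(check_cast(recommended_cast))
# []
```

Running the same check on a thin library, say a front shot and a side profile only, would flag both the low reference count and the missing angles.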
Mark them once. Generate forever.
Once your character refs are in place, every subsequent generation draws from them automatically: Smart Edit, Multi-Platform Variants, content jobs, scene compositions. The cast member you built becomes a real asset you can reuse indefinitely.