ARKit (iPhone face tracking) gives you 52 blendshapes vs the 10-15 a typical webcam tracker detects. The result: micro-expressions like raised eyebrows, lip corners pulling, and authentic surprise faces. Things webcam VTubers can't do.
If you VTube with a webcam, you've probably noticed your character's face looks "okay" but never quite ALIVE. There's a reason: webcam-based face trackers (OpenSeeFace, MediaPipe) detect maybe 10-15 facial movements. Apple's ARKit on iPhone X and newer detects 52.
What ARKit actually tracks
The 52 ARKit blendshapes cover micro-movements webcams miss entirely:
- Inner brow raise (the surprised "oh!" face)
- Outer brow raise (the questioning lift)
- Brow squeeze (concentration)
- Cheek puff (annoyance / chubby cheeks)
- Lip corner pull L+R (asymmetric smile)
- Lip funnel (kissy face)
- Tongue out (yes, anime VTubers love this one)
- Eye look up/down/left/right (independent per eye)
- ...and 30+ more granular controls (jaw, nose sneer, mouth press, eye squint, and so on)
Webcam trackers usually cover: blink, mouth open, head turn, head tilt. Call it 8-10 axes against ARKit's 52.
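If you're curious what those coefficients look like at the API level, here's a minimal Swift sketch reading a few of the 52 values straight from Apple's ARFaceAnchor. The blendshape keys are Apple's real ones; the FaceTracker class and print loop are our own illustration:

```swift
import ARKit

// Minimal sketch: pull a few of the 52 blendshape coefficients from
// ARKit's face anchor. Each value runs 0.0 (neutral) to 1.0 (fully
// expressed). Apps like VTube Studio read these every frame and
// forward them to your Live2D parameters.
final class FaceTracker: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        // false on devices without face-tracking hardware
        guard ARFaceTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        guard let face = anchors.compactMap({ $0 as? ARFaceAnchor }).first else { return }
        let shapes = face.blendShapes
        let innerBrow  = shapes[.browInnerUp]?.floatValue ?? 0     // surprised "oh!"
        let smileLeft  = shapes[.mouthSmileLeft]?.floatValue ?? 0  // left and right lip corners
        let smileRight = shapes[.mouthSmileRight]?.floatValue ?? 0 // are tracked separately
        let tongue     = shapes[.tongueOut]?.floatValue ?? 0
        print(innerBrow, smileLeft, smileRight, tongue)
    }
}
```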
Why this matters for retention
Viewers stay tuned when a streamer is expressive. Subtle eyebrow lifts during excited moments, lips parting before a laugh, eye darts when reading chat: these are unconscious cues humans pick up on. When your Live2D character DOESN'T do them, viewers feel something is "off" without knowing why.
That difference shows up in long streams: with a proper ARKit rig, streams over 90 minutes see 15-30% better viewer retention than webcam-only tracking.
Setup: cheapest path to ARKit tracking
- iPhone X or newer (any Face ID model works); a refurbished iPhone X starts around $80 in 2026
- VTube Studio app on iPhone ($10 one-time)
- Phone tripod / desk mount at eye level, 30-50 cm away
- Wi-Fi connection on the same network as your PC
- Live2D rig with ARKit blendshape mapping: most premium rigs include this; basic rigs don't (the sketch after this list shows what that mapping looks like)
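What does "ARKit blendshape mapping" mean concretely? Each incoming blendshape drives one or more Live2D parameters. A hypothetical sketch of the kind of table a rigger sets up, using real ARKit blendshape keys on the left and Live2D Cubism's standard parameter IDs on the right (actual rigs usually map most of the 52 and add custom parameters and sensitivity curves):

```swift
// Hypothetical ARKit → Live2D mapping table. Left side: real ARKit
// blendshape keys. Right side: Live2D Cubism standard parameter IDs.
// A production rig maps far more shapes, with smoothing and
// per-shape sensitivity tuning on top.
let arkitToLive2D: [String: String] = [
    "browInnerUp":    "ParamBrowLY",     // inner brow raise
    "eyeBlinkLeft":   "ParamEyeLOpen",   // inverted: open = 1 - blink
    "jawOpen":        "ParamMouthOpenY", // mouth open
    "mouthSmileLeft": "ParamMouthForm",  // smile shape
    "cheekPuff":      "ParamCheek",      // puffed cheeks
]
```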
Without an iPhone: ARKit alternatives
If you don't have an iPhone, three options approximate ARKit but never match it:
- MeowFace (Android) — uses Android face mesh, ~30 blendshapes. Better than webcam, worse than iPhone.
- iFacialMocap PC app + good webcam — extracts ~20 blendshapes from a quality webcam (Logitech Brio recommended). Decent.
- OpenSeeFace — open source, free, ~12 axes. Good baseline.
Does YOUR rig support ARKit?
Open VTube Studio with your model loaded, then go to Settings → Tracking. If you see "ARKit Blendshapes" with most checkboxes mappable, your rig supports it. If you only see basic blink/mouth toggles, the rigger didn't do the ARKit work.
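If you'd rather check programmatically, VTube Studio also exposes a public WebSocket API (default ws://localhost:8001, enabled in its plugin settings) that can list the tracking input parameters available to the loaded model. A rough Swift sketch, assuming the API's documented InputParameterListRequest message and skipping the authentication handshake the real API requires:

```swift
import Foundation

// Rough sketch: ask VTube Studio's public API which tracking input
// parameters exist. Assumes the API is enabled in VTube Studio;
// the real protocol also requires an auth token handshake
// (AuthenticationTokenRequest / AuthenticationRequest), omitted here.
let task = URLSession.shared.webSocketTask(with: URL(string: "ws://localhost:8001")!)
task.resume()

let request = """
{"apiName":"VTubeStudioPublicAPI","apiVersion":"1.0",\
"requestID":"arkit-check","messageType":"InputParameterListRequest"}
"""
task.send(.string(request)) { error in
    if let error = error { print("send failed:", error); return }
    task.receive { result in
        // The JSON reply lists default and custom input parameters;
        // an ARKit-mapped rig exposes far more than blink/mouth basics.
        if case .success(.string(let reply)) = result { print(reply) }
    }
}
RunLoop.main.run() // keep the process alive for the async reply
```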
Most riggers charge $50-150 extra to add ARKit support to an existing rig, which is easier and cheaper than commissioning a whole new model. Or commission an ARKit-ready rig from the start.
What AnimArts ships
Every Live2D rig from Standard tier and up includes full 52-blendshape ARKit mapping by default. We test it with the actual VTube Studio iPhone app before delivery — no "oh, you need to remap that" surprises.
If you bring an existing rig that lacks ARKit, we offer ARKit retrofit at $80 for most rigs.
Get an ARKit-ready Live2D commission →
Bottom line
An iPhone X + ARKit-rigged Live2D model is the single biggest "looks alive" upgrade you can give your VTuber persona. Total cost (a used iPhone at ~$80, the $10 app, an $80 retrofit, plus a mount) starts around $200. The retention boost on long streams pays that back in 1-2 months for most monetised VTubers.
Ready to Get Started?
Get a personalized quote for your project. We respond within 24 hours.