Customer Problem
Shopping online for clothes is still a guessing game. You can’t see how an outfit looks on you, can’t tell if two items work together, and can’t get real advice without walking into a store.
Viostyle turns that guesswork into a conversation: upload a photo, get a realistic avatar, try on any outfit virtually, and get AI-powered styling recommendations tailored to your skin tone, preferences and wardrobe.
Architecture & Models
Viostyle chains 4 different AI models together, each handling one or more specific tasks in the pipeline:
- Avatar generation: FireRed Image Edit (fal.ai) — user’s face photo + default clothing template → realistic full-body avatar at 896×1344px. 24-step guided generation with face preservation.
- Outfit recommendations: Doubao LLM (ByteDance) via Volcengine ARK API — system prompt injects skin tone, preferences, scene rules and color theory. Outputs structured JSON, frontend renders as swipeable cards.
- Color season analysis: Doubao vision model — face photo input → warm/cool/neutral classification with confidence score. Pure LLM visual inference, no separate CV pipeline.
- Virtual try-on (garments): FASHN v1.6 (fal.ai) for fast single-garment swaps; IDM-VTON for higher-quality 30-step diffusion. Supports tops, bottoms and one-pieces.
- Virtual try-on (accessories & outfits): FireRed Image Edit for prompt-guided editing — handles bags, layered styling and full outfit changes in one shot.
- Multi-modal chat: users can send photos to the AI stylist, which identifies items in-image via Doubao vision.
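The try-on routing described in the list above can be sketched roughly as follows. This is a minimal illustration, not the production code: the `Category` enum, the `TryOnRequest` shape, the `high_quality` flag and the model label strings are all hypothetical names chosen for this example, and the labels are not real endpoint IDs. It assumes that a single top, bottom or one-piece goes to a garment try-on model, and that anything involving accessories or several items goes to prompt-guided editing.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    TOP = "top"
    BOTTOM = "bottom"
    ONE_PIECE = "one-piece"
    ACCESSORY = "accessory"


# Hypothetical labels for illustration only; these are not real endpoint IDs.
FAST_GARMENT = "fashn-v1.6"
QUALITY_GARMENT = "idm-vton"
PROMPT_EDIT = "firered-image-edit"

# Categories the garment try-on models are described as supporting.
GARMENTS = {Category.TOP, Category.BOTTOM, Category.ONE_PIECE}


@dataclass(frozen=True)
class TryOnRequest:
    items: tuple[Category, ...]
    high_quality: bool = False  # assumed user or system preference


def pick_model(request: TryOnRequest) -> str:
    """Return the model label a try-on request would be routed to."""
    if not request.items:
        raise ValueError("a try-on request needs at least one item")
    single_garment = len(request.items) == 1 and request.items[0] in GARMENTS
    if single_garment:
        # Single-garment swap: trade speed against quality.
        return QUALITY_GARMENT if request.high_quality else FAST_GARMENT
    # Accessories, layered styling and full outfits use prompt-guided editing.
    return PROMPT_EDIT
```

For example, `pick_model(TryOnRequest((Category.TOP,)))` selects the fast garment model, while a request containing a top and a bag falls through to prompt-guided editing. Keeping the decision in one small function makes it easy to change the routing rule without touching the chat interface.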
What I Designed
I designed and shipped the entire product end-to-end: the model pipeline, the async UX for image generation (fal.ai queue → frontend polling), the conversational styling interface, and the wardrobe import flows from Taobao/Xiaohongshu/JD.
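The queue-then-poll pattern mentioned above might look something like the sketch below. It is written in Python for brevity, although the real polling runs in the frontend. The `fetch_status` callable, the status strings (`IN_QUEUE`, `IN_PROGRESS`, `COMPLETED`, `FAILED`), the response shape and the default interval and timeout are assumptions made for this example, not a description of the actual fal.ai client or of Viostyle's code.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_s: float = 1.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status repeatedly until the job completes, fails or times out.

    fetch_status is a hypothetical callable that returns the latest job
    status as a dict with a "status" key.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        state = status.get("status")
        if state == "COMPLETED":
            return status
        if state == "FAILED":
            raise RuntimeError(f"generation failed: {status.get('error', 'unknown')}")
        if clock() >= deadline:
            raise TimeoutError("generation did not finish before the timeout")
        # Still queued or running: wait before asking again.
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting, and the UI can show a placeholder or progress state while the loop is still running.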
The core UX challenge was making a multi-model pipeline feel like a single conversation. Each step (avatar → recommend → try-on) involves a different model with different latencies. I designed the interface so users never see the plumbing — they just see results.
Business Value
Viostyle demonstrates what AI-native consumer products look like when designed by someone who understands both users and models. It’s not a ChatGPT wrapper — it’s a purpose-built experience where 4 models work together invisibly so the user gets what they need: confidence before they buy.