Viostyle — Rainbow Wu

Project Overview

Shopping for clothes online is still a guessing game — you can’t see how an outfit looks on you, can’t tell if two pieces work together, and can’t get real advice without walking into a store.

Viostyle turns that guesswork into a conversation. Upload a photo, get a realistic digital avatar, try on any outfit virtually, and get AI styling recommendations tailored to your skin tone, body type, weather, and wardrobe — all in one chat interface.

Architecture & Models

Viostyle chains 5 AI models into a single conversational pipeline, each handling a specific task, alongside a real-time weather integration:

Avatar generation: FireRed Image Edit (fal.ai) — user’s face photo + clothing template → realistic full-body avatar at 896×1344px. 24-step guided diffusion with face/body preservation.
Outfit recommendations: Doubao LLM (ByteDance Volcengine ARK) — system prompt encodes skin tone rules, color theory, scene-based styling, and available wardrobe inventory. Outputs structured JSON that the frontend renders as swipeable outfit cards.
Virtual try-on (garments): FASHN v1.6 (fal.ai) for fast single-garment replacement (~10s). Supports tops, bottoms, and one-pieces with category-aware masking.
Virtual try-on (accessories & full outfits): FireRed Image Edit — prompt-guided editing at 640×960px, 12-step accelerated inference. Handles bags, shoes, layered styling, and background changes in one pass.
Multi-modal chat: Doubao vision model identifies clothing items in user-uploaded photos, enabling “try this outfit on me” from any image. It also powers skin-tone color season analysis (warm/cool/neutral).
Environment-aware: Real-time weather API integration — Vio proactively adjusts recommendations based on temperature, rain, and humidity without being asked.
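
The split between the two try-on paths above can be read as a small routing rule: a single top, bottom, or one-piece goes to the fast garment model, and everything else (accessories, layered outfits, background changes) goes to the prompt-guided editor. The sketch below is only an illustration of that rule. The model labels, category names, and the TryOnRequest type are hypothetical and are not taken from the Viostyle codebase.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration; real endpoint IDs are not given in this write-up.
FASHN = "fashn-v1.6"
FIRERED = "firered-image-edit"

# Categories the fast garment model is described as supporting.
GARMENT_CATEGORIES = frozenset({"top", "bottom", "one-piece"})


@dataclass(frozen=True)
class TryOnRequest:
    """Hypothetical request shape: item categories plus an optional background change."""
    categories: tuple
    change_background: bool = False


def choose_tryon_model(request: TryOnRequest) -> str:
    """Pick the fast path for one plain garment, the editing path for anything else."""
    single_garment = (
        len(request.categories) == 1
        and request.categories[0] in GARMENT_CATEGORIES
    )
    if single_garment and not request.change_background:
        return FASHN
    return FIRERED
```

Under these assumptions, a request for one top routes to the roughly 10-second garment path, while a top plus a bag, or any background change, falls through to the single-pass editing model.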

What I Designed

I designed and shipped the entire product end-to-end: the multi-model pipeline, the async UX for AI image generation (fal.ai queue → frontend polling with progressive status), the conversational styling interface, and wardrobe import flows from Taobao/Xiaohongshu/JD order data.

The core UX challenge was making a 5-model pipeline feel like a single conversation. Each step — avatar creation, outfit recommendation, virtual try-on — involves a different model with different latencies (2s for text, 10–20s for images). I designed the interface so users never see the plumbing: they just chat, get recommendations, and see themselves wearing the outfit.
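
To make the queue-and-poll idea concrete, here is a minimal sketch of a polling loop that reports each status change once and stops on completion, failure, or timeout. It is written in Python for readability, while the real client is a frontend. The state names ("queued", "running", "completed", "failed"), the status dictionary shape, and the fetch_status callback are assumptions for illustration and do not describe the fal.ai API.

```python
import time
from typing import Callable


def poll_job(
    fetch_status: Callable[[], dict],
    on_progress: Callable[[str], None],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the job finishes, surfacing each new state exactly once."""
    deadline = clock() + timeout_s
    last_state = None
    while True:
        status = fetch_status()  # hypothetical: returns {"state": ..., ...}
        state = status["state"]
        if state != last_state:
            on_progress(state)  # e.g. update the chat bubble text
            last_state = state
        if state == "completed":
            return status
        if state == "failed":
            raise RuntimeError(status.get("error", "job failed"))
        if clock() >= deadline:
            raise TimeoutError("image job did not finish in time")
        sleep(interval_s)
```

Reporting only state changes, rather than every poll, is one way to keep a 10–20s image job feeling like a few conversational updates instead of a spinner.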

Async queue architecture with frontend polling — no blocking, users see progressive status updates
Available-inventory-aware AI prompts — recommendations always map to items the system can actually show
Language-adaptive responses — same model serves English and Chinese users naturally
Weather-reactive styling — proactive suggestions without manual context switching
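
As a rough illustration of how inventory-aware and weather-reactive prompting might fit together, the sketch below builds a system prompt from the available items and current weather, then drops any recommended outfit that references an item the system cannot show. The prompt wording, the JSON schema, and the field names (id, name, temp_c, rain, outfits, item_ids) are all assumptions made for this example, not the actual Viostyle prompt or response format.

```python
import json


def build_system_prompt(inventory: list, weather: dict) -> str:
    """Assemble a prompt listing only showable items plus current conditions."""
    lines = ["You are a stylist. Recommend outfits using ONLY these item ids:"]
    for item in inventory:
        lines.append(f"- {item['id']}: {item['name']}")
    lines.append(f"Current weather: {weather['temp_c']} C, rain={weather['rain']}.")
    lines.append('Reply as JSON: {"outfits": [{"title": str, "item_ids": [str]}]}')
    return "\n".join(lines)


def keep_showable_outfits(raw_reply: str, inventory: list) -> list:
    """Keep only outfits whose items all exist in the inventory."""
    known_ids = {item["id"] for item in inventory}
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return []  # malformed model output: show nothing rather than guess
    if not isinstance(data, dict):
        return []
    showable = []
    for outfit in data.get("outfits", []):
        if not isinstance(outfit, dict):
            continue
        item_ids = outfit.get("item_ids")
        if isinstance(item_ids, list) and item_ids and set(item_ids) <= known_ids:
            showable.append(outfit)
    return showable
```

Validating after generation, in addition to constraining the prompt, is a design choice in this sketch: a prompt alone cannot guarantee that the model never names an item outside the inventory.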

Business Value

Viostyle demonstrates what AI-native consumer products look like when designed by someone who understands both users and models. It’s not a ChatGPT wrapper — it’s a purpose-built experience where 5 models work together invisibly so the user gets what they actually need: confidence before they buy.