    Kling V3 Omni vs Wan 2.6: 5 Prompt Test (9:16, 720p)

    This post runs the same 5 short ad-style prompts on two text-to-video models: Kling V3 Omni and Alibaba Wan 2.6. Each test uses 9:16, 720p, 5 seconds, audio off, and one simple camera move.

    Quick specs

    Item                 | Kling V3 Omni                                          | Wan 2.6
    Provider             | Kling                                                  | Alibaba
    Type                 | Text-to-video (also supports image + reference inputs) | Text-to-video (also supports image + reference inputs)
    Resolution options   | std (720p), pro (1080p)                                | 720P, 1080P
    Aspect ratios        | 16:9, 9:16, 1:1                                        | 16:9, 9:16, 1:1, 4:3, 3:4
    Duration tested here | 5 seconds                                              | 5 seconds
    Ratio tested here    | 9:16 (vertical)                                        | 9:16 (vertical)
    Audio tested here    | Off                                                    | Off

    Test setup (same for both models)

    • Goal: fast vertical clips that could work as product ads or UGC-style demos
    • Duration: 5 seconds per run
    • Ratio: 9:16
    • Resolution: 720p
    • Audio: off
    • Prompt style: 2-4 short sentences, one camera move
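
    The shared setup above can be written down once and turned into the per-model settings used in each test. This is a minimal sketch, not either provider's API: the key names are copied from the "Settings" lines in this post, while `SharedSetup`, `kling_settings`, `wan_settings`, and `settings_line` are hypothetical helper names, and the `sound=on` value is an assumption (only `sound=off` appears in these tests).

```python
from dataclasses import dataclass

# Resolution tier -> Kling mode, as listed in the Quick specs table.
KLING_MODE = {"720p": "std", "1080p": "pro"}


@dataclass(frozen=True)
class SharedSetup:
    """One setup shared by both models (hypothetical helper, not an API type)."""
    duration_s: int = 5
    ratio: str = "9:16"
    resolution: str = "720p"
    audio: bool = False


def kling_settings(setup: SharedSetup) -> dict:
    return {
        "mode": KLING_MODE[setup.resolution.lower()],
        "duration": f"{setup.duration_s}s",
        "ratio": setup.ratio,
        "sound": "on" if setup.audio else "off",  # "on" is assumed
        "scale": 0.5,  # copied as-is from the Settings lines
    }


def wan_settings(setup: SharedSetup) -> dict:
    return {
        "mode": "std",  # copied as-is from the Settings lines
        "duration": f"{setup.duration_s}s",
        "ratio": setup.ratio,
        "resolution": setup.resolution.upper(),  # the post writes "720P" for Wan
        "audioEnabled": setup.audio,
    }


def settings_line(model: str, settings: dict) -> str:
    """Render a dict in the same key=value form as the Settings lines."""
    def fmt(value):
        return str(value).lower() if isinstance(value, bool) else str(value)
    return f"{model} " + ", ".join(f"{k}={fmt(v)}" for k, v in settings.items())


setup = SharedSetup()
print(settings_line("Kling", kling_settings(setup)))
print(settings_line("Wan", wan_settings(setup)))
```

    Keeping one source of truth for the setup makes it harder for the two models to drift apart on duration, ratio, or audio between runs.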

    5 prompt results (Kling vs Wan)

    1) Perfume bottle hero shot (reflections)

    Prompt: 9:16 commercial product video. A premium matte-black perfume bottle on dark wet slate. Soft rim light, realistic reflections. Slow camera push-in with a gentle turntable rotation. Clean background, no text.

    Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.

    Clips (Kling V3 Omni | Wan 2.6): Prompt 1, perfume bottle hero shot.
    • Kling keeps a matte cylindrical bottle consistent across frames, with stable lighting and reflections.
    • Wan renders a glossier, rectangular glass bottle with strong highlights. Framing stays steady.
    • Both clips look ad-usable for a clean product hero shot.

    2) UGC hand demo (small object handling)

    Prompt: 9:16 UGC phone video in a bright kitchen. A hand opens a wireless earbuds case and takes one earbud out. Slight handheld shake, natural skin texture. Simple background, no text.

    Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.

    Clips (Kling V3 Omni | Wan 2.6): Prompt 2, earbuds case open + grab.
    • Kling stays stable across the open-and-grab sequence, with normal-looking hands in the sampled frames.
    • Wan looks coherent, but a couple of frames show small geometry changes on the earbud/case.
    • For UGC hands, keeping the action list short helps both models.

    3) Running shoes turntable (geometry consistency)

    Prompt: 9:16 studio product ad video of a pair of running shoes on a turntable. One smooth orbit around the shoes. Sharp fabric texture, clean highlights, soft shadow. Minimal background, no text.

    Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.

    Clips (Kling V3 Omni | Wan 2.6): Prompt 3, shoe turntable/orbit.
    • Kling keeps the shoe form consistent across frames and looks safe for a generic product spin.
    • Wan looks more stylized and detailed, but the midsole/shape shifts across frames and a brand-like side mark appears.
    • If the product must stay exact, watch for shape drift and accidental branding on footwear prompts.

    4) Stop-motion wrapper reveal (style lock)

    Prompt: 9:16 stop-motion paper cutout ad scene. A chocolate bar wrapper flips open and a paper chocolate square pops out. Handcrafted paper texture, simple loop-like motion. Clean composition, no text.

    Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.

    Clips (Kling V3 Omni | Wan 2.6): Prompt 4, paper cutout wrapper reveal.
    • Kling keeps the wrapper and chocolate piece coherent across frames, with a clean minimal look.
    • Wan shows readable wrapper text (“Chocolate”) even though the prompt asked for no text.
    • If you need brand safety, add a stronger negative prompt for text/logos and keep packaging generic.

    5) Busy neon subway (crowd + signage)

    Prompt: 9:16 cinematic handheld shot on a crowded subway platform at night. Neon lights reflect on a wet floor. People walk past the camera. One forward tracking move, realistic motion blur, no readable text.

    Settings: Kling mode=std, duration=5s, ratio=9:16, sound=off, scale=0.5. Wan mode=std, duration=5s, ratio=9:16, resolution=720P, audioEnabled=false.

    Clips (Kling V3 Omni | Wan 2.6): Prompt 5, crowded subway platform.
    • Kling lands a busy crowd scene, but the sampled frames show more chaos: heavier blur/ghosting and more visible signage.
    • Wan looks more composed and cinematic across frames, with steadier framing and fewer obviously readable signs.
    • For public scenes, always watch for readable signage and recognizable faces if the clip goes into a real ad.

    Verdict (based on these 5 tests)

    • If the goal is clean product shots and simple hand demos, Kling V3 Omni looks steadier and more “safe” across frames.
    • If the goal is a more cinematic vibe for environments (like the subway test), Wan 2.6 looks more composed in this set.
    • Both can surprise you with accidental text or brand-like marks. Negative prompts help, but reviewing frames before shipping is mandatory.
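
    For the frame review mentioned in the last point, one simple approach is to pull a handful of evenly spaced frames from each clip and check them for text, logos, and shape drift. The sketch below only computes which frame indices to inspect. The function name is hypothetical, and the 24 fps figure is an assumption, since this post does not state the output frame rate of either model.

```python
def sample_frame_indices(duration_s: float, fps: float, n_samples: int) -> list[int]:
    """Return up to n_samples frame indices spread evenly across a clip."""
    total_frames = int(duration_s * fps)
    if total_frames <= 0 or n_samples <= 0:
        return []
    n = min(n_samples, total_frames)
    # Use the midpoint of each of n equal segments, so the samples cover
    # the whole clip instead of clustering at the first or last frame.
    return [int((i + 0.5) * total_frames / n) for i in range(n)]


# A 5-second clip at an assumed 24 fps, reviewed at 5 points.
print(sample_frame_indices(duration_s=5, fps=24, n_samples=5))
# -> [12, 36, 60, 84, 108]
```

    The indices can then be fed to whatever frame extractor you already use. More samples are worth it for prompts with packaging or signage, where a stray word may only appear for a few frames.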

    Prompt tips that improved stability

    • Write 2-4 short sentences. One subject, one camera move.
    • Say “no text” and also add a negative prompt for text, watermark, and logos.
    • For hands: keep the action list to one clear action (open, grab, place). Avoid multi-step instructions.
    • For product spins: keep the background minimal and avoid brand names.
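
    The first two tips are easy to check mechanically before a prompt is submitted. The following is a rough sketch under stated assumptions: the sentence counter is a naive split on end punctuation, all function names are hypothetical, and how a negative prompt is actually passed to each model is not covered in this post, so the helper only builds the string.

```python
import re

# Terms taken from the tip above.
NEGATIVE_TERMS = ("text", "watermark", "logos")


def count_sentences(prompt: str) -> int:
    """Naive count: split on ., ! or ? followed by whitespace or end of string."""
    parts = re.split(r"[.!?]+(?:\s+|$)", prompt.strip())
    return len([p for p in parts if p.strip()])


def lint_prompt(prompt: str) -> list[str]:
    """Return a list of issues; an empty list means the prompt follows the tips."""
    issues = []
    n = count_sentences(prompt)
    if not 2 <= n <= 4:
        issues.append(f"expected 2-4 short sentences, found {n}")
    if not re.search(r"\bno (readable )?text\b", prompt.lower()):
        issues.append('missing an explicit "no text" instruction')
    return issues


def negative_prompt(extra: tuple = ()) -> str:
    """Join the default negative terms with any extra ones."""
    return ", ".join(NEGATIVE_TERMS + tuple(extra))


shoes = (
    "9:16 studio product ad video of a pair of running shoes on a turntable. "
    "One smooth orbit around the shoes. "
    "Sharp fabric texture, clean highlights, soft shadow. "
    "Minimal background, no text."
)
print(lint_prompt(shoes))             # -> []
print(lint_prompt("A shoe spins."))   # two issues: sentence count and missing "no text"
print(negative_prompt(("brand names",)))
```

    A check like this catches format slips only. It cannot tell whether the model will honor the instruction, which is why the frame review still matters.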

    Try the prompts

    Copy the prompts above, keep the settings the same (9:16, 5s, 720p, audio off), and change only one variable at a time (ratio, duration, or camera move). That way, any difference in the output can be traced to that one change.
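
    The one-variable-at-a-time approach can be sketched as a small variant generator. This is illustrative only: `one_at_a_time` is a hypothetical helper, and the alternative values are examples. In particular, the 10-second duration and the "orbit" camera move are assumptions, so check what each model actually supports before running them.

```python
def one_at_a_time(baseline: dict, alternatives: dict) -> list[dict]:
    """Build variants that each differ from the baseline in exactly one key."""
    variants = []
    for key, values in alternatives.items():
        if key not in baseline:
            raise KeyError(f"unknown variable: {key}")
        for value in values:
            if value == baseline[key]:
                continue  # identical to the baseline, nothing to compare
            variants.append({**baseline, key: value})
    return variants


baseline = {
    "ratio": "9:16",
    "duration": "5s",
    "resolution": "720p",
    "audio": "off",
    "camera_move": "slow push-in",
}

alternatives = {
    "ratio": ["16:9", "1:1"],   # both listed for each model in Quick specs
    "duration": ["10s"],        # assumed value, not tested in this post
    "camera_move": ["orbit"],   # assumed value
}

for variant in one_at_a_time(baseline, alternatives):
    changed = {k: v for k, v in variant.items() if baseline[k] != v}
    print(changed)
```

    Each variant is then run against the same prompt on both models and compared with the baseline clip.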