Tales Beyond Order

Intro

The project stands as a hybrid of AI-driven visuals and human craftsmanship. Instead of relying on AI as an automatic generator, it was used as a starting point, with each image manipulated, edited and reshaped to serve the story. The result is a film that retains a handcrafted, intentional aesthetic, proving that technology is a tool, not a replacement for creative vision. Many images and movements were created using hybrid techniques, allowing us to maintain full control over the narrative.

Composition work

The editing of the film was closer to visual collage, layering different elements to create a unique aesthetic. While AI tools like MidJourney played a role in generating imagery, the images were heavily altered and reworked. For example, MidJourney originally delivered polished, Disney-like characters with big expressive eyes. These eyes were edited out and replaced with small black dots, a deliberate symbol of the “soulless” nature of this world. Instead of depicting a camp with 20 tents, we show thousands of identical tents stretching endlessly, reinforcing the idea of a world where individuality is erased and experiences are generic and pre-defined.

Runway Gen-2 was limited to producing a maximum of four seconds of coherent body movement.

Miro World

Shortfilm

https://vimeo.com/999828074

AI Hand

Intro

The idea was to animate a Cinema4D hand without rigging by transferring real motion through a video-to-video pipeline in ComfyUI. The workflow: record my hand → extract pose with DW Pose → generate the final animation with WAN 2.2 VACE, using the first frame as a style reference.
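The capture step itself is nothing exotic: the recording just has to be split into numbered frames for the pose pass. A minimal sketch with OpenCV (the file names and the frames/ folder are placeholders):

```python
import os

import cv2  # pip install opencv-python

# Split the recorded hand video into numbered frames for pose extraction.
# "hand_recording.mp4" and the frames/ folder are placeholder names.
os.makedirs("frames", exist_ok=True)
cap = cv2.VideoCapture("hand_recording.mp4")
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:  # end of video
        break
    cv2.imwrite(f"frames/hand_{frame_idx:04d}.png", frame)
    frame_idx += 1
cap.release()
print(f"extracted {frame_idx} frames")
```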

Motion Capture

Pose Extraction

The source video is processed in ComfyUI with the ControlNet DW Pose Estimator to get frame-accurate skeletal cues for the wrist, palm and fingers. This gives me a clean motion signal—timing and gesture—without committing to any rig.
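In my setup this happens inside ComfyUI via the DW Pose Estimator node, but similar control maps can be produced standalone with the controlnet_aux annotators. A rough sketch using its OpenPose detector with hand keypoints enabled as a stand-in for DW Pose (paths are placeholders, and the call signature may differ between package versions):

```python
import os
from pathlib import Path

from controlnet_aux import OpenposeDetector  # pip install controlnet-aux
from PIL import Image

# Stand-in for the DW Pose Estimator node: an OpenPose annotator with hand
# keypoints enabled. Input/output folders are placeholder names.
detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")

os.makedirs("pose", exist_ok=True)
for frame_path in sorted(Path("frames").glob("hand_*.png")):
    frame = Image.open(frame_path)
    # include_hand=True adds the wrist/finger keypoints we care about.
    pose_map = detector(frame, include_body=True, include_hand=True, include_face=False)
    pose_map.save(Path("pose") / frame_path.name)
```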

Comfy

Video-to-video generation with WAN 2.2 VACE.

WAN 2.2 VACE (All-in-one Video Creation and Editing) is designed for frame-stable generation: it enforces temporal consistency across frames, so the style stays locked while the motion evolves smoothly. In my workflow, the first frame of the sequence was used as a style reference, ensuring that the yellow Cinema4D hand’s appearance stayed identical throughout. The DW-Pose sequence then guided the motion, so what you see is my real hand movement faithfully translated into the stylized model.
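For anyone who wants to experiment with VACE outside ComfyUI: diffusers ships a WanVACEPipeline for the published Wan 2.1 VACE checkpoints, which follows the same recipe of a reference image plus a control video. A rough sketch only; the checkpoint id, prompt and paths are illustrative, argument names should be checked against the current diffusers docs, and my actual setup is the ComfyUI graph described here:

```python
import torch
from diffusers import AutoencoderKLWan, WanVACEPipeline
from diffusers.utils import export_to_video, load_image, load_video

# Illustrative stand-in for the ComfyUI setup: Wan VACE via diffusers. The
# Wan 2.1 VACE checkpoint is used because it is the one published for diffusers.
model_id = "Wan-AI/Wan2.1-VACE-1.3B-diffusers"
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanVACEPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16).to("cuda")

pose_video = load_video("pose_sequence.mp4")        # DW-Pose control frames (placeholder path)
style_ref = load_image("c4d_hand_first_frame.png")  # first frame as style reference (placeholder)

frames = pipe(
    prompt="a yellow plastic 3D hand on a neutral background",  # illustrative prompt
    video=pose_video,              # motion guidance from the pose sequence
    reference_images=[style_ref],  # locks the look to the Cinema4D render
    num_frames=81,
    guidance_scale=5.0,
).frames[0]
export_to_video(frames, "hand_animation.mp4", fps=16)
```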

Result

Outro

Everything is generated end-to-end in ComfyUI. Thanks to WAN 2.2 VACE’s frame stability, the output avoids jitter and drifting style, while the real-world motion input keeps the animation expressive. The result is a natural hand animation built entirely from a lightweight AI pipeline—no traditional rigging required.
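Because the whole graph lives in ComfyUI, it can also run headlessly: export the workflow in API format and queue it against ComfyUI’s HTTP endpoint. A minimal sketch, assuming the graph was saved as hand_workflow_api.json and the server runs on the default local address:

```python
import json
import urllib.request

# Queue an exported ComfyUI workflow (saved via "Save (API Format)") against a
# locally running ComfyUI server. File name and server address are placeholders.
with open("hand_workflow_api.json") as f:
    workflow = json.load(f)

payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))  # response includes the prompt_id of the queued job
```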

Helpful Resources

What is ComfyUI?

ComfyUI is an open-source, node-based interface for diffusion models (Stable Diffusion, WAN and others) that lets you build custom image and video generation workflows by connecting modular nodes, giving full control and flexibility.
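In API form, a workflow is just a dictionary of nodes: each entry names a node class and wires its inputs either to constants or to another node’s output. A toy fragment to show the shape (the node types are real ComfyUI nodes; the values are arbitrary, not a complete graph):

```python
# A toy fragment of a ComfyUI workflow in API format. Keys are node ids,
# "class_type" names the node, and a list value like ["1", 1] means
# "output 1 of node 1". Values are illustrative only.
workflow_fragment = {
    "1": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "some_model.safetensors"},
    },
    "2": {
        "class_type": "CLIPTextEncode",
        # CheckpointLoaderSimple outputs (MODEL, CLIP, VAE); index 1 is the CLIP.
        "inputs": {"text": "a yellow 3D hand", "clip": ["1", 1]},
    },
}
```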

What is WAN 2.2?

WAN 2.2 is an open-source family of text-to-video and video-to-video diffusion models. Its VACE variant (All-in-one Video Creation and Editing) is designed to keep style stable and motion coherent across frames, making it especially suited for workflows that combine visual consistency with external motion guidance.
