Living Intelligence: Utilizing Advanced AI Image Generation for Narrative Storytelling

Research Team

Spencer Idenouye
Kevin Santos
James Rowan
Emerson Chan

Partners

Makers
NRC IRAP
CTO

Impact

  • Enabled real-time laser visuals controlled by motion capture.
  • Developed a reliable AI video stylization pipeline using LoRA and ControlNet.
  • Delivered repeatable workflows for immersive and narrative-driven media.

Developing a Refined Workflow for Stylizing Video with Stable Diffusion

The “Living Intelligence” project, a collaboration between Makers and SIRT, addressed the challenge of integrating AI-generated visuals into narrative video while maintaining consistency in character, emotion, and movement. Recognizing the limited control creators have over AI stylization, the team developed a refined visual effects pipeline using Stable Diffusion and related AI tools. The iterative approach involved meticulous video pre-processing, advanced AI styling with ControlNet for detail preservation, and the creation of custom LoRA models for precise character and style control. Post-processing techniques, including DaVinci Resolve’s DeFlicker and depth map-based masking in After Effects, ensured visual fidelity and flexible compositing.
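The write-up does not reproduce the underlying inference code. As a minimal sketch of the per-frame stylization step, assuming Hugging Face’s diffusers library, a depth ControlNet, and illustrative model IDs, prompts, and parameters (none of which are confirmed project settings), the core pass might look like this:

```python
# Illustrative sketch only: per-frame stylization with Stable Diffusion + ControlNet.
# Model IDs, prompts, and parameters are assumptions, not the project's actual settings.
import torch
from diffusers import StableDiffusionControlNetImg2ImgPipeline, ControlNetModel
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

frame = load_image("frames/frame_0001.png")   # pre-processed (sharpened) source frame
depth = load_image("depth/frame_0001.png")    # matching depth map for ControlNet guidance

styled = pipe(
    prompt="stylized portrait, painterly lighting",
    image=frame,                        # img2img input keeps the original composition
    control_image=depth,                # depth conditioning preserves structure and movement
    strength=0.5,                       # lower strength retains more of the actor's likeness
    controlnet_conditioning_scale=1.0,  # how strongly the depth map constrains the output
    num_inference_steps=30,
).images[0]

styled.save("styled/frame_0001.png")
```

Running this over a sharpened, batch-exported frame sequence mirrors the pipeline described above; the `strength` and conditioning scale are the main levers for trading stylization against actor likeness.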

The outcome is a validated workflow that enables controlled, cohesive, and high-quality stylized video content, empowering creators to harness AI’s artistic potential for compelling storytelling.

The Challenge of Elevating AI-Generated Visuals for Cohesive Storytelling

In narrative storytelling, integrating AI-generated imagery into video has presented significant challenges, particularly in maintaining character consistency, emotional expression, and seamless alignment with camera and character movements. While AI tools like Stable Diffusion offer unique stylized looks, there has been a notable lack of refined processes to control these artistic outputs, often leading to unpredictable “happy accidents” or “errors” that disrupt visual cohesion. The “Living Intelligence” project was conceived as a proof of concept to address this gap, aiming to develop a sophisticated visual effects pipeline that harnesses AI image generation and manipulation tools while ensuring artistic control and maintaining narrative integrity.

Developing Experimental Prototypes for Immersion and AI Stylization

The research team developed two distinct but complementary workflows:

  1. Subtractive Space Workflow (Laser Interaction)
    • Used OptiTrack motion capture to control TouchDesigner-driven laser visuals.
    • Aligned real-time mocap data with TouchDesigner’s coordinate system (see the coordinate-mapping sketch after this list).
    • Created a prototype showcasing interactive particle systems and laser beams driven by performer movement.
  2. Living Intelligence Workflow (AI Video Stylization)
    • Employed Stable Diffusion, ControlNet, and LoRA training to style video frames with artistic control.
    • Addressed flickering, masking, and actor facial fidelity through a series of best practices:
      • Batch processing with sharpened footage.
      • Depth map masking in After Effects.
      • Fine-tuned LoRA models to stylize individual characters (see the LoRA inference sketch after this list).
    • Documented workflows in tools like Kohya GUI and DaVinci Resolve.
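The coordinate alignment in the first workflow is documented internally rather than reproduced here. The sketch below shows the general idea of remapping OptiTrack floor positions into a normalized space that TouchDesigner visuals can consume; the axis conventions, capture-volume bounds, and target range are assumptions, not the project’s calibration values.

```python
# Illustrative sketch: remapping OptiTrack rigid-body positions into a normalized
# coordinate space for TouchDesigner visuals. Axis conventions, capture-volume
# bounds, and the target range are assumptions, not the project's calibration values.
from dataclasses import dataclass

@dataclass
class CaptureVolume:
    """Physical bounds of the mocap stage, in meters (assumed values)."""
    x_min: float = -3.0
    x_max: float = 3.0
    z_min: float = -3.0
    z_max: float = 3.0

def to_normalized(x_m: float, z_m: float, vol: CaptureVolume) -> tuple[float, float]:
    """Map a performer's floor position (meters) to [-1, 1] coordinates
    that a TouchDesigner network driving lasers or particles can consume."""
    u = 2.0 * (x_m - vol.x_min) / (vol.x_max - vol.x_min) - 1.0
    v = 2.0 * (z_m - vol.z_min) / (vol.z_max - vol.z_min) - 1.0
    # Clamp so out-of-volume samples don't push visuals off-canvas.
    return max(-1.0, min(1.0, u)), max(-1.0, min(1.0, v))

# Example: a performer standing 1.5 m right of centre, 0.75 m downstage.
print(to_normalized(1.5, -0.75, CaptureVolume()))   # -> (0.5, -0.25)
```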
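Likewise, the character LoRA step in the second workflow can be illustrated, under assumptions, with diffusers: a Kohya-trained LoRA is loaded onto a Stable Diffusion pipeline and weighted at inference time. The checkpoint, LoRA file name, trigger token, and strength below are placeholders rather than project assets.

```python
# Illustrative sketch: applying a character LoRA (e.g. trained via Kohya GUI) during
# Stable Diffusion inference. The base model, LoRA file, trigger token, and weight
# are placeholders, not the project's actual assets.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load the character-specific LoRA exported as safetensors.
pipe.load_lora_weights("loras/", weight_name="character_a.safetensors")

image = pipe(
    prompt="charA person, cinematic portrait, soft rim light",  # "charA" = assumed trigger token
    num_inference_steps=30,
    cross_attention_kwargs={"scale": 0.8},  # LoRA strength: lower = subtler stylistic influence
).images[0]

image.save("out/character_a_test.png")
```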

A Calibrated System for Real-Time Virtual Production

The “Living Intelligence” project successfully developed and validated a comprehensive framework and workflow for using diffusion models to stylize video frames while retaining critical elements like actor likeness and emotion. This process yielded methods to enhance image edge fidelity, precisely control stylistic influence, and effectively mask elements for compositing.

Key outcomes include:

  • One (1) Interactive Prototype: Real-time motion-captured visuals controlling laser and LED wall outputs in TouchDesigner.
  • One (1) AI Workflow Process: A repeatable video stylization pipeline that retains actor emotion, sharpens fidelity, and masks the background with depth maps (illustrated in the sketch below).
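The depth-map masking itself was done in After Effects; as a rough programmatic equivalent, the sketch below uses a per-frame depth map as a soft matte to separate the stylized foreground from the background plate. File names, the depth convention (near = bright), and threshold values are assumptions.

```python
# Illustrative sketch: using a per-frame depth map as a soft matte so the stylized
# actor can be composited over a separately treated background plate.
# File names, depth convention, and thresholds are assumptions for demonstration only.
import cv2
import numpy as np

styled = cv2.imread("styled/frame_0001.png").astype(np.float32) / 255.0
plate  = cv2.imread("plates/frame_0001.png").astype(np.float32) / 255.0
depth  = cv2.imread("depth/frame_0001.png", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0

# Treat nearer pixels (assumed brighter in this depth map) as foreground, and feather
# the edge with a smooth ramp instead of a hard cut to avoid crawling mattes over time.
near, far = 0.6, 0.4                                   # assumed depth thresholds
matte = np.clip((depth - far) / (near - far), 0.0, 1.0)
matte = cv2.GaussianBlur(matte, (9, 9), 0)[..., None]  # soften and add a channel axis

composite = styled * matte + plate * (1.0 - matte)
cv2.imwrite("comp/frame_0001.png", (composite * 255).astype(np.uint8))
```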

Supporting tools and practices were validated through testing and shared as internal documentation. Makers now has a technically sound foundation for both immersive event production and AI-enhanced post-production, positioning them to scale these innovations into future client-facing work.