Creative Tech Digest

Multi-ControlNet & Open Source AI Video Generation


Full breakdown + takeaways from using ControlNet to make a video2video workflow. I use NeRFs here, but it'll apply to any 3D rendered or live action input. Let's get into it!

Bilawal Sidhu
Feb 27, 2023

ControlNet continues to capture the imagination of the generative AI community — myself included! This post is a continuation of my deep dive into ControlNet and its implications for creators and entrepreneurs.

ICYMI, here’s the last post, where I used ControlNet to redecorate a 3D scan of a room. I plan to make more videos + posts like the one below, so stay tuned!

AI Room Makeover: Reskinning Reality With ControlNet, Stable Diffusion & EbSynth
Hey Creative Technologists! Today we’ll be covering an AI video experiment I created to learn and prepare a deep dive on ControlNet…

🪄 Honestly though, this wave of generative AI makes me feel like I'm 11 again, discovering the VFX & 3D animation software that gave me powers of digital sorcery to blend reality & imagination. It was fun to do this interview on my journey as a YouTube creator & product manager, where I go deeper 🙏🏾

No doubt generative AI will bring out the child in all of us. It’s like we haven’t yet put together all the primitives at our disposal in every possible combination, and we’re learning new things every day.

Plus, new primitives keep dropping. A few of you realize we need multi-ControlNet, make a feature request, and bam, it’s implemented in a few days. And we didn’t even have to write a PRD or sit through reviews 😉


Now on to the workflow to make 🔥 videos with the latest in open source AI!

Bilawal Sidhu @bilawalsidhu · Feb 25, 2023

Multi-ControlNet is a game changer for making an open source video2video pipeline. I spent some time hacking this NeRF2Depth2Image workflow using a combination of ControlNet methods + SD 1.5 + EbSynth. 🧵 Full breakdown of my workflow & detailed tips shared in the thread below ⬇

Subscribe to the Creative Tech Digest and get AI workflows like this right to your inbox

Here's an overview of the workflow we're going to deconstruct! At a high level: Capture video (used my iPhone) ➡️ Train NeRF (used Luma AI) ➡️ Animate & Render RGB + Depth ➡️ Multi-ControlNet (Depth + HED) ➡️ EbSynth ➡️ Blending & Compositing. Now let's break it down step by step:
For the input, I wanted to see if I could exploit the crispy depth maps you can get out of a Neural Radiance Field (NeRF) 3D scan.
- Left: 3D flythrough rendered from a NeRF (iPhone video ➡️ trained w/ Luma AI)
- Right: the corresponding depth map (notice the immaculate detail!)
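NeRF tools typically render depth as floating-point distance, while ControlNet's depth conditioning expects an 8-bit image with near surfaces bright. Here's a minimal NumPy sketch of that normalization step (the function name and inversion convention are my own illustration, not from the thread):

```python
import numpy as np

def depth_to_controlnet_map(depth, invert=True):
    """Normalize a raw float depth render to an 8-bit map.

    ControlNet's depth conditioning reads near = bright and
    far = dark, so we invert distance after normalizing.
    """
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)  # scale to 0..1
    if invert:
        d = 1.0 - d  # nearest point becomes brightest
    return (d * 255).astype(np.uint8)
```

Save the result as a grayscale PNG and feed it straight into the depth ControlNet module.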
Dialing in the look was easy with ControlNet + SD. I tested different methods and liked the combination of HED boundary + Depth the most. Almost went for the GTA look lol! Next, let's turn these into smooth video, then merge the strengths of different ControlNet methods together.
With the look dialed in, I ran all video frames through ControlNet's depth module, then cherry-picked a subset to serve as keyframes for EbSynth. If you flipbook 'em, this already looks pretty good! I suspect this is why Runway's Gen-1 output is lower FPS: it hides temporal artifacts.
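The cherry-picking is a judgment call, but a common starting point is evenly spaced keyframes, densified by hand wherever the scene changes fast. A tiny sketch of that baseline (function name and step size are illustrative, not the author's exact process):

```python
def pick_keyframes(num_frames, step=10):
    """Pick evenly spaced keyframe indices for EbSynth,
    always including the first and last frame so the whole
    clip can be interpolated. Assumes num_frames >= 1."""
    keys = list(range(0, num_frames, step))
    if keys[-1] != num_frames - 1:
        keys.append(num_frames - 1)
    return keys
```

For a 10-frame step at 30 fps that's a stylized keyframe roughly every third of a second; shrink the step for fast camera moves.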
Once you have your keyframes, you can use EbSynth to interpolate between them using your original video as a guide. But if you cut naively between them, you'll notice the results are pretty jumpy, because the contents of the scene still change a fair bit between keyframes. Case in point:
The simplest way to make this less jarring is to render overlapping segments of the keyframes from EbSynth and blend them together in your video editing tool of choice. 💡 Tip: render more keyframes and more overlap than you need. You can always refine/discard later.
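In a video editor this is just two overlapping clips with an opacity ramp; programmatically, the same blend looks roughly like this (the frame-list representation and linear ramp are my own illustration):

```python
import numpy as np

def crossfade(seg_a, seg_b, overlap):
    """Blend two EbSynth segments whose last/first `overlap`
    frames cover the same source frames, ramping opacity
    linearly from seg_a to seg_b across the overlap."""
    out = list(seg_a[:-overlap])
    for i in range(overlap):
        t = (i + 1) / (overlap + 1)  # 0 -> 1 toward seg_b
        out.append((1 - t) * seg_a[len(seg_a) - overlap + i] + t * seg_b[i])
    out.extend(seg_b[overlap:])
    return out
```

A longer overlap hides the style "pop" between keyframes better, at the cost of some smearing where the two stylizations disagree.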
ControlNet methods have their pros/cons based on your subject matter:
- Left: Depth does a good job picking up the 3D structure in a scene, but struggles with textures and thinner structures
- Right: HED boundary finds all the contrasty edges on the facade graffiti textures
To get a more coherent result, we can fuse these ControlNet methods. I took the crispy depth map from my NeRF scan and used it to composite the Depth + HED boundary passes together. This gives me the spatial "foundation" of the depth pass, then layers on the edge work from HED.
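The thread doesn't spell out the exact compositing math, but a depth-as-matte fuse can be sketched like this (the function name, 0..1 conventions, and the choice to favor the depth pass on near geometry are my assumptions):

```python
import numpy as np

def fuse_passes(depth_pass, hed_pass, depth_map):
    """Composite the depth-ControlNet render and the
    HED-ControlNet render using the NeRF depth map as a matte.

    depth_pass, hed_pass: HxWx3 floats in 0..1.
    depth_map: HxW floats in 0..1, near = 1. Near geometry keeps
    the solid structure of the depth pass; the rest takes the
    edge-rich HED pass.
    """
    matte = np.clip(depth_map, 0.0, 1.0)[..., None]  # broadcast over RGB
    return matte * depth_pass + (1.0 - matte) * hed_pass
```

In a compositor this is just a luma matte: depth pass on top, HED pass underneath, NeRF depth as the mask.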
And voilà! We have our final result below. I wanted a stylized "painterly" quality, so I experimented with blending modes and liked "overlay" for adding back higher-frequency detail from ControlNet's HED pass. I'm quite happy with the end result! Very clean.
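"Overlay" has a standard formula: it multiplies in the shadows and screens in the highlights, which is why it reads as layering high-frequency detail onto the base image. A NumPy version of that standard blend (not code from the thread):

```python
import numpy as np

def overlay(base, blend):
    """Photoshop-style 'overlay' blend for floats in 0..1:
    multiply where the base is dark, screen where it is
    bright. A flat 0.5 blend layer leaves the base unchanged."""
    return np.where(base < 0.5,
                    2.0 * base * blend,
                    1.0 - 2.0 * (1.0 - base) * (1.0 - blend))
```

Here the fused composite is the base and the HED pass is the blend layer, so edges push the darks darker and the lights lighter.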
Bonus tip: I also learned that you can use the z-depth pass as an inpainting mask inside automatic1111 to create some very cool/trippy effects. This one looks like a portal opening up to the painting world of Bob Ross :)
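automatic1111's inpainting takes a black-and-white mask where white regions get repainted, so thresholding the z-depth pass gives you exactly that "portal in the distance" matte. A sketch (the threshold value and near = 1 convention are my assumptions):

```python
import numpy as np

def depth_portal_mask(depth_map, threshold=0.5):
    """Turn a 0..1 depth map (near = 1) into an 8-bit inpainting
    mask: everything farther than `threshold` becomes white (255),
    i.e. gets repainted, opening a 'portal' in the far scene
    while near geometry is kept."""
    return (depth_map < threshold).astype(np.uint8) * 255
```

Slide the threshold toward the camera to grow the portal, or blur the mask slightly for a softer transition at the portal's rim.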

And that's a wrap! I’d love to keep sharing what I'm learning with the open source AI & creator community, so if you found this helpful I'd appreciate it if you:

1. RT this thread or share this article with your creative tech frenz

2. Follow me on Twitter for more dank content

3. And if you aren’t already, subscribe below to get these right to your inbox

Creative Technology Digest by Bilawal Sidhu

© 2023 Bilawal "Billyfx" Sidhu