Creative Tech Digest

Share this post

🧠From Uncanny Valley to Photorealism: The Game-Changing Leap of Generative AI Models. Have We Shattered the Visual Turing Test? 🤯

creativetechnologydigest.substack.com

🧠From Uncanny Valley to Photorealism: The Game-Changing Leap of Generative AI Models. Have We Shattered the Visual Turing Test? 🤯

GenAI models like Midjourney and Stable Diffusion have hit the threshold of photorealism. Let's explore what these breakthroughs mean for 3D, VFX and creation as we know it.

Bilawal Sidhu
Mar 26, 2023
9
4
Share
Share this post

🧠From Uncanny Valley to Photorealism: The Game-Changing Leap of Generative AI Models. Have We Shattered the Visual Turing Test? 🤯

creativetechnologydigest.substack.com

Hello Creative Technologists!

Today we’ll summarize the remarkable leap generative AI models have taken just in the last 3-6 months.

Then we’ll examine how Gen AI will infuse itself into 3D rendering and VFX workflows in mid to long term.

Creative Tech Digest
AI Room Makeover: Reskinning Reality With ControlNet, Stable Diffusion & EbSynth
Hey Creative Technologists! Today we’ll be covering an AI video experiment I created to learn and prepare a deep dive on ControlNet…
Read more
3 months ago · 8 likes · 2 comments · Bilawal Sidhu

But first, let’s summarize recent developments in generative AI:

  • For a better half of 2022, creators have been able to use tools like Stable Diffusion and DALL-E2 combined with a range of in-painting and post-processing tools to create images that look indistinguishable from reality.

  • Starting 2023, with the rise of ControlNet and composable LoRAs technical artists have been able exert greater artistic control over Stable Diffusion to get even more photorealistic results.

  • Midjourney v5 has now made that multi-step process a single, well defined text prompt. No need to wrangle multiple components and passes together.

  • Yet with it’s Discord interface, Midjourney lacks the creator-centric UX of Adobe Firefly, but touting immaculate dynamic range, with improved natural language prompting — it’s the best model out there, right now*

    *v5 wins prompt-for-prompt as of March 26, 2023 (dates are crucial in AI!)

    Okay! Now let’s dive into Midjourney & Stable Diffusion — and the massive advances they’ve made in coherent photorealistic image generation

Twitter avatar for @bilawalsidhu
Bilawal Sidhu @bilawalsidhu
Midjourney v5 has pushed into photorealism, a goal which has eluded the computer graphics industry for decades (!) 🤯 Insane progression, and all that by 11 people with a shared dream. 🧵 Let's explore what these breakthrough in Generative AI mean for 3D & VFX as we know it...
Image
Image
5:58 PM ∙ Mar 25, 2023
9,828Likes1,349Retweets

First off, Midjourney v5 is far more photorealistic out-of-the-box. Where as it's predecessor has a more painterly, stylized bent.

Click through the above to watch a thorough comparison of v5 vs v4 incase you want to go deeper. Otherwise, let's keep going...

Not to put too fine a point on it, but collaged below is a progression of the same prompt in Midjourney from v1 through v5. Astonishing progress/time:

𝕄𝔸𝕋𝕋 𝔽𝕏 (@mattfx) / Twitter

With v5, Midjourney has crossed the chasm of uncanniness, and is well into photorealistic territory.

And this feeling is resounding amongst professionals. Some might even say it's one for the history books!

Image

I mean how can generative AI *not* absolutely disrupt 3D engines like Unreal & Unity, or even Octane & Redshift

Just look at the quality of this Midjourney generation by Linus 🤯

Twitter avatar for @LinusEkenstam
Linus (●ᴗ●) @LinusEkenstam
Actually this is freaking awesome. Look at the reflections on the windshield and the hood... I got no clue how Midjourney does these things. Magic.
Deadpool wide angle pose on top of a car outside an apartment complex in the uk, canon, vlogger, --v 5 --q 2 --ar 3:2
12:11 AM ∙ Mar 23, 2023
689Likes45Retweets

Like, who knew it'd take generative AI to cross the uncanny valley, particularly for digital humans? No sub-surface scattering required!

You've got all you need to realize your ambitious Bollywood dreams:

Twitter avatar for @ekpraet
Prateek Arora @ekpraet
Daryaganj Dino Run > Pamplona Bull Run
Image
4:22 AM ∙ Mar 17, 2023
223Likes31Retweets

Virtual sets? Not a problem. These Midjourney generations easily surpass the quality of an Unreal Engine or Octane render.

I mean, just look at the high frequency detail in the chair, the knitting, the windows -- and good lord (!) the dynamic range is immaculate

Twitter avatar for @afzainizam
🍪Cookie Munster @afzainizam
#midjourneyv5 is really good for really detailed images. But there's still ROOM for improvement. This scene is a sunlit corner in a boho-inspired living room with a hanging macrame chair. #interiordesign #midjourneyart #midjourney
Image
Image
Image
Image
7:37 AM ∙ Mar 20, 2023
150Likes11Retweets

Obviously, the meme potential is exceedingly high too :)

Especially given MJ's new approach to prompting which allows us to compose complex scenes with multiple characters.

Fancy a Hogwarts rave circa 1998? No problemo:

Twitter avatar for @spacecasetay
Taylor Peterson 💫 @spacecasetay
The celebrity/character recognition of v5 in midjourney is wild - @bilawalsidhu is right 👀 Here's a rave at Hogwarts summer 1998 🧙‍♂️
Image
Image
Image
4:13 PM ∙ Mar 21, 2023
1,167Likes162Retweets

Product photography gets a huge boost too. Imagine products before you create them, or fine tune models with actual product photography to stage virtual shoots on demand.

Twitter avatar for @bilawalsidhu
Bilawal Sidhu @bilawalsidhu
v5: here's an action figure scene that looks straight out of a Hasbro toy commercial. The layout and tilt-shift style bokeh really sell the smaller scale of the scene. I expect product photography use cases will get a nice boost with v5, where proper scale is also crucial.
Image
3:02 PM ∙ Mar 16, 2023
39Likes1Retweet

Doing this in the past has required scanning assets, or modelling them from scratch, plus hours in 3D tools tweaking textures, lighting, animation, edits.

It's no surprise that Jensen Huang CEO of NVIDIA said “Every single pixel will be generated soon. Not rendered: generated”

Obviously, I've been saying this for a hot minute now, especially after playing with ControlNet:

Twitter avatar for @bilawalsidhu
Bilawal Sidhu @bilawalsidhu
I spent the weekend hacking an AI room makeover using ControlNet & Stable Diffusion, restyling my parent's "drawing room." How soon until this workflow is standard for home décor & interior design? ControlNet is a glimpse into the AI-infused future of 3D rendering & AR.
3:27 AM ∙ Feb 21, 2023
1,613Likes247Retweets

In the near term, we will see hybrid approaches that fuse the best of classical 3D + generative AI will reign supreme.

Run a lightweight 3D engine for the first pass, then run a generative filter on top to convert it into AAA quality.

Think NVIDIA's "DLSS" on steroids — and i’m not kidding when I say it’ll be like channel surfing realities:

Twitter avatar for @bilawalsidhu
Bilawal Sidhu @bilawalsidhu
🧠 AI experiment comparing #ControlNet and #Gen1. Video goes in ➡ Minecraft comes out. Results are wild, and it's only a matter of time till this tech runs at 60fps. Then it'll transform 3D and AR. How soon until we're channel surfing realities layered on top of the world?🧵
9:38 PM ∙ Mar 4, 2023
2,409Likes457Retweets

Key point is — 3D’s got a good foundation. But with the rise of generative AI, explicitly modelling reality seems overrated for many visualization use cases. Instead, a hybrid approach absolutely crushes it!

E.g. throw in an uncanny looking Unreal model, use it to drive the performance. Then run it through a generative AI “filter” and get out a much more photorealistic result. Minor temporal inconsistencies aside (which'll be solved!) the result is beyond Unreal:

Twitter avatar for @CoffeeVectors
CoffeeVectors @CoffeeVectors
Tests using a #metahuman from #UnrealEngine5 with #stablediffusion Multi-ControlNet. Applied diff. amounts of deflickering in #DaVinciResolve. Mainly seeing how it handles 3D camera moves. #aicinema #controlnet #aiia @UnrealEngine #aiphotography #MachineLearning #DeepLearning
7:27 PM ∙ Mar 6, 2023
1,584Likes317Retweets

Video is in it's infancy, but clearly the next target. Jon made this short film with a freaking iPhone + Midjourney + Runway.ml Gen-1

And it's all filmed in his apartment! This is James Cameron style virtual production ($$$) democratized.

Imagine where we'll be in +6 months...

Twitter avatar for @mrjonfinger
Jon Finger @mrjonfinger
Social Latency: I made another little experimental film with Ai. This time I filmed myself around my apartment with my phone and reimagined the footage with #gen1 . Voices generated with elevenlabs.io Ai text to speech. Characters and style designed with #midjourney
9:32 PM ∙ Mar 18, 2023
371Likes76Retweets

Obviously, Midjourney's David Holz has always been aiming towards transforming interactive content. Stability’s Emad Mostaque wants to run his visual engine at 60 fps. Just there is massive progress, yet many more will enter the fold.

To put it succinctly — Generative AI will first it'll transform ideation, then asset creation, then 3D engine embellishment, but eventually -- we'll be playing dreams in the cloud 🌥

And I for one, can't wait!

Image

Share

That's a wrap! If you enjoyed this deep dive on AI's impact on real-time 3D & offline VFX:

- RT this thread on Twitter to share with your audience (it really does help!)

- Follow @bilawalsidhu for more creative tech magic (and thanks for 15k!)

- Sign up to get these Creative Tech Digests sent neatly to your inbox:


Bilawal Sidhu signing out — see ya’ll in the next one!

9
4
Share
Share this post

🧠From Uncanny Valley to Photorealism: The Game-Changing Leap of Generative AI Models. Have We Shattered the Visual Turing Test? 🤯

creativetechnologydigest.substack.com
4 Comments
Punit Thakkar
Writes Hello Universe
Mar 26Liked by Bilawal Sidhu

Not only is Midjourney v5 capable of generating photoreal humans, it is capable of generating photoreal humans of multiple ethnicities! I used to try making Indian characters using MJ v3 and I would get the same faces over and over because of probable limitations in its training dataset. Now with v5, the variance in its character generation is really amazing. Check out a few of the examples I was able to generate:

https://futuretelescope.substack.com/p/my-money-where-my-mouth-is

Expand full comment
Reply
2 replies by Bilawal Sidhu and others
KBS Sidhu
Writes The KBS Chronicle
Mar 26Liked by Bilawal Sidhu

It isn’t just game-changing – it’s life-changing.

Expand full comment
Reply
2 more comments…
Top
New
Community

No posts

Ready for more?

© 2023 Bilawal "Billyfx" Sidhu
Privacy ∙ Terms ∙ Collection notice
Start WritingGet the app
Substack is the home for great writing