The State of AI in Video Production (and Research) & Where it Needs to Go [PART TWO]


In 2023, I wrote an article on my experience with using AI tools in video production.

At the time, the tools felt promising but were limited, fragmented, and often came across as more experimental than practical. The past three years, however, have transformed these tools, and my own workflow in the process.

But just as interesting has been the cultural shift in how professionals, particularly in UX and qualitative research, think about AI. Up until very recently, many researchers saw AI as a threat to the 'purity' of their craft. There was an understandable fear that these tools would dilute their findings or replace human interpretation altogether.

However, some of those fears have eroded, and AI tools have become critical as more and more is expected of filmmakers and product/user research teams.

Timelines have compressed, research budgets have tightened, and AI tools have quietly become less about research-replacement and more about capacity. AI has not eliminated the need for researchers and storytellers, but it has absorbed many of the technical burdens surrounding the work.

And as a result, it has quietly reshaped my production process.

To highlight this evolution first-hand, let's start with a very rough video production timeline (from my perspective as a UX research consultant/filmmaker), and let's pull some of the AI tools I'm using into the production process.

STEP ONE: Exploratory Call + Creative Planning

Because I work with brands across a wide range of industries—big tech, CPG, startups, nonprofits—I sometimes use an LLM like Claude or Gemini early in the process to orient myself around an unfamiliar space before exploratory calls with clients.

Doing this pre-work on the front end can often make for richer conversations and more productive exploratory/planning calls. It also accelerates the context-building required to provide my clients with measurable value.

STEP TWO: Pre-Production

Pre-production is where the complexity of a film can compound quickly. Here I have to pay attention to things like production calendars, storyboarding, recruiting, location logistics and equipment planning.

This is where a real-time discussion with Gemini Live would bring value, helping me think through first-pass storyboards, generate rough discussion guides as thought starters, or explore the logistical 'reality' of filming three in-homes in one day in NYC.

This work still requires a great deal of human judgment, but it offloads a solid chunk of the initial mental burden.

STEP THREE: Production (Filming)

On the production side, I feel like there has been a great deal of innovation in integrating AI tools into product hardware.

Companies like DJI, particularly with their Pocket 3 camera, made some big leaps here with some of their ActiveTrack features. ActiveTrack allows a camera to autonomously follow a selected moving subject without any human intervention. In essence, filmers can now have a second camera act like a sentry, and trust it to film and frame their subject without a dedicated operator. This redundancy and flexibility is an absolute game-changer for solo filmers, new filmers, researchers who are occupied doing other things, or small production teams.

ActiveTrack in action. Credit @mikeerogers

It doesn’t replace a skilled cam operator, but it expands what’s possible with limited resources, space, and time.

STEP FOUR: Post-Production (Editing)

This is where the biggest leap has happened. Companies like Adobe introduced Text-Based Editing to Premiere Pro. In essence, it's a tool that uses AI to generate transcripts of your video recordings. This allows you to cut, copy, and paste text to restructure your video's rough cut, with changes instantly reflected in the video timeline.

Text-Based editing in action. Credit @Justinserrandigital

These new tools allow you to edit your video like you would a Word document.
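To make the idea concrete, here is a minimal sketch of how editing text can drive a video cut. It is not how Premiere Pro actually implements the feature; the `Segment` class, the `build_rough_cut` function and the sample transcript are all hypothetical, and the only assumption is that each transcribed sentence carries the start and end time of the footage it came from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One transcribed sentence and the source timecodes (in seconds) it covers."""
    text: str
    start: float
    end: float


def build_rough_cut(segments, edited_text_order):
    """Turn an edited list of sentences into an ordered list of (start, end) cuts.

    Sentences left out of `edited_text_order` are dropped from the cut,
    and reordering the sentences reorders the clips.
    """
    by_text = {seg.text: seg for seg in segments}
    return [(by_text[t].start, by_text[t].end) for t in edited_text_order]


# Hypothetical transcript of a short interview clip.
transcript = [
    Segment("I usually shop on my phone.", 0.0, 2.5),
    Segment("Um, let me think.", 2.5, 4.0),
    Segment("Checkout is the part that frustrates me.", 4.0, 7.2),
]

# "Editing the document": delete the filler line and lead with the strongest quote.
edited = [
    "Checkout is the part that frustrates me.",
    "I usually shop on my phone.",
]

rough_cut = build_rough_cut(transcript, edited)
print(rough_cut)  # [(4.0, 7.2), (0.0, 2.5)]
```

Deleting a sentence removes its clip, and moving a sentence moves its clip, which is the whole appeal of the approach.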

Text-based editing in the late 2010s and early 2020s relied on tools like Descript and Transcriptive. They were effective, but they were clunky, error-prone and weren't integrated into established video editing suites. For critical projects with tight timelines, this often became more trouble than it was worth.

Now, this capability lives inside the editor. I can't emphasize enough how much of an impact this makes in the research space, where teams craft stories from many hours of footage at a time while expecting professional video outputs on very quick turnarounds.

The result? Editors spend far less time wrestling with their video footage and far more time shaping story and narrative around the insights.

LLMs like Claude and Gemini can also play a role here as creative 'interns', leaning on transcripts to group themes across interviews (with human oversight) and even helping to identify rough narrative arcs.

On the audio side, tools like Adobe Podcast can quickly edit out unnecessary background noise, level audio, and isolate voices so that videos sound like a professional podcast. Things have reached a point where sound projects that once took many hours can now take minutes, often with even better results.

What’s striking to me isn’t just how capable these tools have become... but how quietly they’ve integrated themselves into serious workflows. They are freeing up more time and energy for me to focus on story, narrative and engaging my target audience.

What's Missing

But... and this is the part you have likely been waiting for... these tools still have a very long way to go. And the big gap that exists is CONTEXT.

AI tools can summarize, cluster, structure, and generate beautiful content. What they cannot yet do is truly understand the context that is captured within film... especially in the way effective storytelling demands. While AI systems perform impressively with explicit information (and are beginning to produce cinema-quality generated footage), they struggle to convert prompts into meaningful implicit interpretation.

Film is not just a sequence of edited statements. It is a layered interplay between verbal and non-verbal communication: pacing, silence, tension, subtext, framing, and cultural nuance. These are all incredibly important elements of an effective film, even in the B2B work I do for market researchers.

Even with extensive training, the most advanced models still don't have this ability yet.

I'm starting to see venture-backed companies try to solve this challenge by hiring seasoned video editors to 'tag' stock video footage. I presume the eventual goal is for these 'contextualized video clips' to be fed into an AI tool that would enable storytellers and AI agents to craft video stories fully autonomously. I simply can't see this strategy working.

Video viewership growth over the past 5-10 years has largely been attributed to services like YouTube, Instagram and TikTok that platform individual creators. Many of these individual creators have succeeded in large part due to their unique backgrounds, storytelling styles and life experiences.

The broad-based contextualization approach to media that is currently happening will not drive lasting engagement, because it can only produce formulaic media, and does not offer the unique content that viewers crave.

Where it Needs to Go

Bringing this all back to my industry, there is a larger concern. In qualitative research, speed and accuracy exist in constant tension. The cost of misinterpretation can be very high. Slight deviations from core learnings can evolve into large-scale strategic errors, especially at the scale many of these large companies operate.

Processing and communicating those learnings in an engaging way requires a level of nuance AI tools simply aren't able to achieve yet.

Do you trust AI to recognize what is wrong with this Pepsi ad?

Effective research storytelling requires engaging your audience, combined with the analytical rigor needed to back up your research findings... while speaking in the language of your audience. That interpretation still requires a human touch.

Researchers and storytellers have their own biases. I am a Middle Eastern/Black American male whose upbringing spanned Western and Eastern cultures, and my background and the experiences that shaped my life give me a unique perspective on the world. No amount of training can create a model that can replicate my lived perspective.

While AI tools can help teams pump out research films more quickly, their handlers are unfortunately losing sight of the human element required to tell compelling stories. At the end of the day, a human brain is still required to make sense of the complexity and nuance inherent in research learnings.

As we rapidly adopt AI in our work, it has become more important than ever that we don't let these tools unnaturally skew/blur/bias the most critical parts of our work.

To build on this, video production is becoming more and more niche. Video creators are finding increasingly creative ways to engage niche audiences in their space. AI tools that apply a broad-based approach to contextualizing explicit information will miss this critical element.

AI tools that better understand and anticipate their users' creative intent, instead of attempting to replace them entirely, will likely separate themselves from the competition.

To do this... what if I could upload my existing film productions, favorite creators, storytellers and chat history to train a model? What if I could then leverage said model as a production assistant to help me create rough-cut films in my own personal style?

I can see AI heading in this direction in the near future.

The State of Artificial Intelligence in Video Production & Where it Needs to Go

I’ve been dabbling with AI video editing tools for the past 6 years. Some of the early builds were a huge pain to use, and some tragic failures with the software cost us weeks of productivity.

But in those instances where they work, they can have a transformative impact on your productivity.

A few examples...

Since 2019, I’ve switched entirely over to text-based video editing for my documentary work. How does it work?

I start by feeding my video footage into a text transcription service. The software then uses AI to recognize the language used in the video, then creates a text transcript that I can edit right alongside the video timeline.

Instead of sifting through hours of video footage (like you would fast forward and rewind through a Netflix movie), I CTRL-F search through the video transcript like a Word doc to track down the clips I need.
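As a rough illustration of that CTRL-F idea, here is a small sketch that searches a timestamped transcript and returns where in the footage each match sits. The transcript format (a list of start-time and text pairs) and the function names `find_clips` and `as_timecode` are assumptions made for this example, not the output or API of any particular transcription tool.

```python
def find_clips(transcript, keyword):
    """Return (start_seconds, text) for every transcript line containing the keyword."""
    needle = keyword.lower()
    return [(start, text) for start, text in transcript if needle in text.lower()]


def as_timecode(seconds):
    """Format seconds as HH:MM:SS for jumping to the spot in the timeline."""
    total = int(seconds)
    return f"{total // 3600:02d}:{(total % 3600) // 60:02d}:{total % 60:02d}"


# Hypothetical transcript lines: (start time in seconds, spoken text).
transcript = [
    (12.0, "I found the checkout page confusing."),
    (95.5, "The packaging was easy to open."),
    (3725.0, "Honestly, checkout took me three tries."),
]

for start, text in find_clips(transcript, "checkout"):
    print(as_timecode(start), text)
```

Searching for "checkout" here surfaces two moments more than an hour apart without scrubbing through any footage.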

In the past I had to take a full day out of a project to master the audio portion of a video documentary. Today, that can take as little as five minutes using some reliable AI audio mastering tools.

Only last year, there were video clips that were unusable because the video quality was too poor. Now, there are AI tools that can quickly ‘repair’ those video clips in incredible ways. Footage can be unblurred, and I can even remove distracting background imagery to keep the viewer's focus on the main subject.

AI is commoditizing many of the menial tasks associated with video editing. This is helping me focus more on the storytelling and creativity portion of the work, and can provide us with opportunities to deploy video in new ways.

But what's missing?

Many of the technical challenges associated with video editing have been addressed with AI. The entire industry is being flipped on its head by cloud-based video editing tools that are making editing simpler and more collaborative.

A few years ago, I dumped hours of video footage into early builds of Descript. Halfway through my edit, I noticed that the audio wasn't syncing properly with the video. After a ton of back and forth with the support team, I discovered that the software couldn't properly sync video footage with different frame rates. Something I took for granted with Premiere Pro.

When it works, I can edit videos 4x to 10x faster.

When it doesn’t, I have to scrap the rough draft and re-edit the entire video.

These kinds of issues can be especially stressful when under a tight delivery timeline.

But I keep coming back, and it keeps getting better.

However, what is really missing is the storytelling aspect. I’d love to see these new AI video companies help users easily craft engaging videos.

I'm not expecting Michael Bay levels of production here. Start simple.

People have been using AI tools for quite some time to proofread and improve their text-based content.

AI should soon be able to guide new video editors through the process of creating simple compelling videos that people actually want to watch.

These video editing tools are growing increasingly complex, and it can be a full-time job just to learn how to use them... let alone learn how to edit something interesting to watch.


Source: Deep AI Image Generator. Prompt: 'sharing a great story'

"Beauty without intelligence is like a masterpiece without a soul, captivating at first glance but ultimately devoid of meaning."

ChatGPT

Yes, there are some very serious growing pains associated with using some AI tools. But it shouldn’t stop you from exploring how they can help you in your industry.

These are just some elementary uses, but these tools are rapidly growing more elegant. I'm really excited to see how this space evolves... and how I evolve with it.

'Name One Thing You Are Awesome At'

'Name one thing you are awesome at.'

It was an icebreaker question used at SEEK to help loosen up the research and client teams. 

I don't think SEEK realized how much I kept on using it.

From 2013 to 2017, I co-directed a non-profit with my wife called Project Downtown Cincinnati. It focused on eliminating hunger in the downtown Cincinnati area. We’d make bagged lunches, create hygiene kits and have meaningful conversations with Cincinnati residents in need.

Most of our volunteers were college students from The University of Cincinnati and Xavier University. For most of them it was their first time doing volunteer work, and they came from other cities hoping to make new friends while sharing a common interest.

They were college students fresh from a new city. They were trying to break into a new social group. They were about to engage with a marginalized portion of the population for the first time ever. You could imagine the uncertainty flowing through their head.

To help with this, we'd do an icebreaker Q&A at the end of each volunteer shift. I'd typically Google fun icebreaker questions to ask, or tie in something relevant to the season.

This tradition soon became known as ‘The Circle’. The group would make a large circle, and almost everyone would have fun answering these questions. In the weeks we didn't have a good icebreaker ready, the question I would return to the most was 'Name one thing you are awesome at.'

It was almost always a hit and I would usually get three different types of answers:

#1 A volunteer would be excited to show off a particular talent they have.

#2 A volunteer would humbly admit they had some special talent.

#3 Occasionally, a volunteer would refuse to admit any skill, and when encouraged by the group to share something, it might be self-deprecating. Something along the lines of ‘I’m great at showing up late to volunteer.’

In professional scenarios I’m often reminded of situations similar to answer #3.

When I make market research films, I often encourage the lead researcher to narrate the videos that would be seen by their teams/stakeholders. This narration would help guide viewers through the findings and would add valuable context to the video. This is similar to the experience you get listening to David Attenborough narrate Planet Earth.

A large chunk of researchers would refuse for one particular reason: ‘No, I have a terrible voice.’

They felt comfortable presenting their findings live in front of a room of stakeholders. They were also comfortable speaking to large groups of people at conferences. But when it came time for video narration, it was far too uncomfortable to have their voice tied to it.

To keep the project rolling through script development, I'd begrudgingly volunteer my own voice as a placeholder for the video, until either a professional voiceover artist or another researcher on their team would take my place.

When the draft videos were finished and came up for review, the lead client would say, 'Your voice works great for this. Please voiceover the rest of them!' I was shocked. To my ears, I sound like a teenage skateboarder. Hardly the voice expected to sway the opinions of senior leaders.

This process would continue over and over. After a few years, a couple hundred research videos ended up having my 'skater voice' in the final shared version. A voice that started as a placeholder ended up ringing through the halls of Fortune 500s across the country.

I share this because these 'terrible voices' belong to some of the most talented researchers I've ever seen. They spent months planning, executing and synthesizing this research. No one knows the ins and outs of the research, or could tell the story of their insights, better than they do.

Their voice was also plenty good enough to narrate a research video. I'd tell them if it wasn't.

If they are awesome at unveiling insights, they can be just as awesome at narrating them too.

Name one thing YOU are awesome at.

Yousef Hussein is the Founder of UX Films ...and recently found out he is awesome at narrating videos...

Applying Pixar's 22 Rules Of Storytelling To Our Everyday Work

Ever since Toy Story dropped when I was 10 years old, I've been fascinated with Pixar movies. It's evident I'm not alone. They are one of the most successful creative enterprises ever, owning 14 of the top 50 highest-grossing animated films. Nearly all of their movies (24 of 26) are certified fresh on Rotten Tomatoes.

So what's their formula for success? In 2012, Emma Coats, a storyboard artist for Pixar, tweeted 22 guidelines that Pixar follows when crafting the story for their films.

I thought of ways these guidelines could be applied to our work. Many are nearly impossible to adapt to a business-related presentation or video, but here are a few that really resonated with me.

Rule #2: You gotta keep in mind what’s interesting to you as an audience, not what’s fun to do as a writer. They can be very different.

We often bias our presentations to reflect the topics we are interested in. It can be incredibly difficult to shut off that part of our brain and empathize with what can engage our audience.

Back when I first created research films, I spent a decent amount of my time adding ambient background music to reflect the mood of what I was trying to communicate. I assumed it added polish to the video, but it just ended up distracting the audience. Some of the best presentations require some serious ego checks.

Rule #13: Give your characters opinions. Passive/malleable might seem likable to you as you write, but it’s poison to the audience.

When pulling together films, researchers sometimes shy away from including that strongly opinionated or emotional respondent. There can be a fear of creating too many waves. When editing, I often struggle to strike a balance between engagement and compliance. In many instances, the engagement those particular respondents can create with the audience will outweigh the potential risks of alienating them with their strong opinions.

Rule #17: No work is ever wasted. If it’s not working, let go and move on – it’ll come back around to be useful later.

We often feel the need to shoehorn every little tidbit of information into our presentations. Sometimes that particular finding isn't relevant to your main story, and is bogging everything down or slowing the pace of your presentation. Some of the highest budget films and TV shows suffer from disengaged audiences because the director felt the need to squeeze every single piece of cool-looking video into the film *cough*Rings of Power*cough*.

Rule #18: You have to know yourself: the difference between doing your best & fussing. Story is testing, not refining.

This process contrasts with the design-by-committee many creatives encounter in the consulting space. When an approach is aligned on, focus on creating the best possible presentation with the constraints given to you. Be open to testing a new approach if it has the potential to produce the best possible output, instead of constantly re-tweaking a presentation to the point where it’s over-budget, over-timeline and overly-formulaic.

Rule #22: What’s the essence of your story? Most economical telling of it? If you know that, you can build out from there.

When creating research videos, we throw our story into a barebones script template. We start by identifying the key learning/insight spaces. We then flesh out the main supporting quotes, sprinkled with storytelling elements and supporting b-roll where needed. 

Finally, we go through a process of paring down any unnecessary video, word by word, to get to the essence of what needs to be communicated. 

There is certainly a time and place for extra details to help engineering and design teams. But if engagement is your goal, keep things to a minimum.

Why I ditched my insights notepad and picked up a video camera

I’ve worked extensively with data and digital content in various industry contexts, poring over information, drawing patterns and conclusions, and using this analysis to make recommendations to my clients. Projects would oftentimes take months to complete, cost clients hundreds of thousands of dollars and result in a 30-60 slide PowerPoint report shared with a C-level Director or VP.

Soon, I discovered that no one wanted to read these overwhelming reports (I don’t blame them). We would respond by condensing detailed findings into executive summaries to make them more easily digestible; but in doing so, we reduced their potential to add value or influence decision-making.

It is a paradox: companies are willing to invest massive amounts to generate insights, but often fall short of effectively leveraging this investment to drive change in their organizations. Why is this? One of the biggest challenges agencies face involves transitioning insights and ideas: from researcher, to decision-maker, to implementer.

Over time, I developed a stronger understanding of how to better engage clients and share the richness in our insight work: use a video documentary. If key decision-makers haven’t the time to read a report, then don’t ask them to – and instead put them in the same seat as the researcher, face to face with their consumers speaking to them directly. This way, we are asking them to have empathy through sharing our experiences as researchers.

Here are four benefits of transitioning to using a video documentary to share insights:

  1. Using video captures nuance in your findings that can’t be shared in a typical PowerPoint report. This includes the spectrum of emotions, the non-verbal communication and the context behind the insights and ideas identified.

  2. Great storytelling drives engagement with decision-makers within your organization. It is much easier to effectively share your findings through a well-edited short documentary. It stands out far better, and is much more impactful than the standard PowerPoint report.

  3. Great insights deserve to be shared without a filter. Hearing the insight from the consumer first-hand – and in the original language it was offered – invites stakeholders to experience for themselves an authentic ethnographic research process.

  4. It makes you look like a rock-star. It’s much easier to engage your team members with your research output in a ten to fifteen-minute documentary, than to ask them to sit through an hour of slides.

Yousef Hussein is the founder of UX Films

Five BIG Reasons Why VR is the Future in Consumer Research

Over the past few months, I’ve had the opportunity to use Virtual Reality (VR) to complement my consumer research video work. During this time, I’ve largely focused on figuring out all the technical challenges involved with producing videos that bring lasting value to my clients. I didn’t fully realize the distinct benefits of filming in VR until the first few videos were delivered to our partners.

I'm a tech geek and regularly trial new technology when in-field on research. This is done partially to justify my regular visits to Micro Center and B&H. However, it's also to potentially discover new ways of bringing my clients closer to the people they serve. It's rare that the latter reason becomes so apparent, so quickly.

With that in mind, here are five compelling reasons to consider using VR video to capture your next consumer research project:

VR provides context

I’ve cut important shots from my traditional video reels because there isn’t enough context in the footage to clearly articulate what the researcher is trying to communicate. It’s one thing when a respondent complains about a television show being “dumb”. It’s another when you can turn your head in VR to see what the respondent is talking about, then turn back and watch their body language.

YOU are in control of what YOU want to know

In a single unedited shot, a UX designer can see how a user is interacting with an app, a hardware engineer can observe how they are holding the device and the marketing team can observe the physical stimuli that are influencing the user's actions. Each member of the team can see what’s most relevant for them and glean what they need from the same video.

Smaller research teams but greater participation

There is often a false perception among researchers that to actually effect change within an organization, key stakeholders need to be present during the actual research. VR can change that. You no longer have to be physically present to be part of the experience. VR has the potential to recreate those impactful moments for stakeholders who could be thousands of miles away.

Furthermore, many VR cameras are small and unobtrusive, great for filming in small homes without overwhelming respondents. In one shot you can capture the interviewer, the interviewee and their surrounding environment in its entirety.

Novelty of VR creates opportunities to share richness vs simple sound bites

When producing traditional video, editors are often forced to piece together one- to two-second quick-hit clips to make a point. Why is that? Videos are ubiquitous and our attention spans have become shorter and shorter. Instead of giving video clips the time to truly connect you with the consumer, we are often encouraged to “skip to the good part!”

With VR, you are strapped in and fully immersed. It gives filmmakers the opportunity to share stories in unique ways. You are experiencing the insights versus just watching them.

The immersion

This one is hard to explain unless you've already had the opportunity to use a VR headset. Being transported into a completely different environment with multiple sensory cues creates a unique experience that traditional video cannot replicate.

With that said, VR is certainly not an end-all solution for all research challenges. Tried and true traditional video is still king for now, but VR will soon become a powerful tool in your research arsenal.

Yousef Hussein is the founder of UX Films