
AI Generates Trending Video Ideas

source link: https://towardsdatascience.com/ai-generates-trending-video-ideas-968f5cba8616?gi=ba41269cf0d1


Photo by Kon Karampelas on Unsplash


Using Recurrent Neural Networks to Inspire the Next Viral Video


Mar 13 · 7 min read

YouTube is a massive platform — videos that manage to gain the favor of the recommendation algorithm can get hundreds of millions of views. While content creators guess around in an attempt to create the next viral video, AI can generate as many trending video ideas as you’d like!

In this article, I’ll show how anyone can create and train a recurrent neural network to generate trending video ideas — in four lines of code!

First, a bit of light theory…

If you’re not interested in how Recurrent Neural Networks work, feel free to jump down to the implementation.

A Recurrent Neural Network (RNN) is a type of neural network that specializes in processing sequences. Given a seed “She walked her ___”, an RNN might predict “dog”. The trick with RNNs in text generation is using predictions as seeds for further predictions.


How an RNN might generate text, given the seed ‘She’. Bolded words are the output of the RNN.
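To make the “predictions as seeds” idea concrete, here is a minimal sketch of that loop in plain Python. The predict_next_word function is a hypothetical stand-in for a trained model; it isn’t part of any library used later in this article.

def generate(seed, predict_next_word, max_words=20):
    # start from the seed and repeatedly feed the growing sequence back in
    words = seed.split()
    for _ in range(max_words):
        next_word = predict_next_word(words)  # the model's guess, given everything so far
        if next_word is None:                 # e.g. an end-of-sequence token
            break
        words.append(next_word)               # the prediction becomes part of the next seed
    return ' '.join(words)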

One issue with standard neural networks, as applied to text generation, is that they have a fixed input and output size. For example, in a convolutional neural net trained on the MNIST dataset, each training and testing example can only be 784 values — no more, no less. While this is practical in tasks like image recognition, it certainly isn’t for natural language processing tasks, where the input and output may vary from a few characters to several sentences or more.

RNNs allow for variable-length inputs and outputs. RNNs can look like any of the configurations below, where red is the input, green is the RNN, and blue is the output:

Possible RNN input and output configurations, with inputs in red, the RNN in green, and outputs in blue.

Whereas standard and convolutional neural networks have a different set of weights and biases for each input value or pixel, recurrent neural networks have the same set of weights and biases for all inputs. An RNN usually has three sets of weights and biases — one between the input and the hidden layer (red to green), one between one hidden layer and the next (green to green), and one between a hidden layer and the output layer (green to blue).

Because the same set of weights and biases is used for each layer-to-layer link, the number of cells in a layer, including the inputs and outputs, can be adjusted very easily. And because there are so few parameters, the optimal weights and biases can be honed in on.
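As a rough sketch of that weight sharing (plain NumPy, with made-up sizes; this is not the architecture used later in the article), the same three weight matrices are reused at every time step, and only the hidden state changes:

import numpy as np

vocab_size, hidden_size = 100, 32                        # made-up dimensions
W_xh = np.random.randn(hidden_size, vocab_size) * 0.01   # input to hidden (red to green)
W_hh = np.random.randn(hidden_size, hidden_size) * 0.01  # hidden to hidden (green to green)
W_hy = np.random.randn(vocab_size, hidden_size) * 0.01   # hidden to output (green to blue)
b_h, b_y = np.zeros(hidden_size), np.zeros(vocab_size)

def step(x, h):
    # one time step: the same weights apply no matter how long the sequence is
    h = np.tanh(W_xh @ x + W_hh @ h + b_h)   # update the hidden state
    y = W_hy @ h + b_y                        # scores for the next token
    return y, h

h = np.zeros(hidden_size)
for x in np.eye(vocab_size)[[3, 17, 42]]:     # any number of one-hot inputs works
    y, h = step(x, h)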

So why is the RNN so good at generating text?

RNN text generation is based on the fundamental principle that each successive word in a sentence is written with the same idea in mind. This makes sense: as an author, the next word you put down is placed there with the same intent as the one before it.


Consider two sentences: the first is written such that each word is placed with the same intent. The second begins with the same intent, but because the intent keeps switching, the end result is nowhere near the original.

By applying the same RNN to each set of words, both the intent of the sentence (where it’s trying to go, what ideas it contains) and its phrasing are maintained.

If you want a more in-depth explanation of RNNs, check out some of these research papers.

Implementing the Viral Video Title Generator

All machine learning models require data. The dataset we will be using is the Trending YouTube Video Statistics dataset on Kaggle.

When loading and viewing the dataset, we can get an idea for how the data is structured:

import pandas as pd
data = pd.read_csv('/kaggle/input/youtube-new/USvideos.csv')
data.head()

The first rows of the dataset; there are more columns to the right, but we won’t need them.

We are interested in the title column — this will provide the data to train the RNN on. The dataset has 40,949 rows; that is not much compared to some larger datasets, but to keep the training time reasonable, let’s reduce the training data to 5,000 instances.

In addition, we should narrow down which categories the training data draws from.

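A quick way to see which categories appear in the data, and how often, is pandas’ value_counts (a small sketch, assuming data is the DataFrame loaded above):

# count how many trending videos fall into each category
print(data['category_id'].value_counts())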

Looking at the different categories, it becomes clear that some are dedicated to news, music videos, movie trailers, and so on, which don’t fit the context of an idea generator: news stories, song titles, and music video titles either can’t be meaningfully generated or wouldn’t make sense as video ideas. Category IDs 22, 23, and 24 are dedicated to comedy and shorter segments created by smaller content creators; these are more in line with what we want to generate.

The following code selects rows in data that belong to categories 22, 23, or 24 and puts them in a DataFrame called sub_data .

# keep only rows whose category_id is 22, 23, or 24
sub_data = data[data['category_id'].isin([22, 23, 24])]

The filtered sub_data DataFrame; there are more columns to the right that aren’t shown.

There are still 16,631 rows. To get down to 5,000, we will randomly shuffle the DataFrame a couple of times and then select the top 5,000 rows as training data. sklearn’s handy shuffle function can help:

from sklearn.utils import shuffle
sub_data = shuffle(shuffle(sub_data))
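As an aside, pandas can also do the shuffle-and-subset in one step with sample; this sketch is roughly equivalent to shuffling and then taking the first 5,000 rows:

# draw 5,000 random rows directly instead of shuffling and slicing
sub_data = sub_data.sample(n=5_000, random_state=42)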

To feed the data into the model, it must be in a text file, with each new training instance on a separate line. The following code does just that:

# write each of the 5,000 titles on its own line
with open('title.txt', 'w') as titles:
    for item in sub_data.head(5_000)['title']:
        titles.write(item)
        titles.write('\n')

Note that the .head(n) function selects the top n rows in a DataFrame.

To view title.txt, we can call print(open('title.txt', 'r').read()).

A portion of title.txt; the actual file is much larger.

Finally, the training file is ready. There are many powerful libraries that can implement RNNs, like Keras (TensorFlow) and PyTorch, but we’ll be using textgenrnn, a library that skips the complexity of choosing a network architecture. This module can be imported, trained, and used in 3 lines of code (4 if you count installing from pip), at the cost of some customizability.

!pip install textgenrnn

…installs the module in the Kaggle notebook environment. You can drop the ! if you’re installing from a command line rather than a notebook.

Training is simple:

from textgenrnn import textgenrnn
textgen = textgenrnn()
textgen.train_from_file('title.txt', num_epochs=50)

Since textgenrnn is built on a Keras RNN framework, it outputs the familiar Keras progress-tracking printout as it trains.

This takes about 2.5 hours to run through all 50 epochs.

textgen.generate(temperature=0.5)

…can be used to generate examples. ‘Temperature’ controls how adventurous the generated text is: higher temperatures produce more surprising, original output, while lower temperatures stay closer to the training data. It’s a balance between being creative and not straying too far from the nature of the task, much like the balance between overfitting and underfitting.
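If you want several titles at once, or want to compare temperatures side by side, generate also takes a count as its first argument (a sketch; the exact options may differ between textgenrnn versions):

# five titles at a conservative temperature
textgen.generate(5, temperature=0.5)

# the same model, pushed to be more adventurous
textgen.generate(5, temperature=1.0)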

Finally, the generated video titles!

To show the model’s progress over time, I’ll include three titles from (about) every 10 epochs, then leave you with a treasure trove of 50-epoch-model generated titles.

1 epoch (Loss: 1.9178) —

  • The Moment To Make Me Make More Cat To Be Coming To The The Moment | The Moment | The Moments
  • Keryn lost — Marlari Grace (Fi Wheel The Year Indieved)
  • Reading Omarakhondras | Now Cultu 1010–75

10 epochs (Loss: 0.9409) —

  • Grammy Dance of Series of Helping a Good Teass Shape | Will Smith and Season 5 Official Trailer
  • Cardi Book Ad — Dancing on TBS
  • Why Your Boyfriend In Handwarls

20 epochs (Loss: 0.5871) —

  • My Mom Buys My Outfits!
  • DINOSAUR YOGA CHALLENGE!!
  • The Movie — All of Tam | Lele Pons & Hulue & Jurassic Contineest for Anime | E!

30 epochs (Loss: 0.3069) —

  • Mirror-Polished Japanese Foil Ball Challenge Crushed in a Hydraulic Press-What’s Inside?
  • Why Justin Bieber Was The Worst SNL Guest | WWHL
  • The Most Famous Actor You’ve Never Seen

40 epochs (Loss: 0.1618) —

  • Will Smith & Joel Edgerton Answer the Web’s Most Searched Questions | WIRED
  • Adam and Jenna’s Cha Cha — Dancer Sharisons & Reveals Your Door ftta Answering Saffle Officers
  • Bravon Goes Sneaker Shopping At Seoul Charman’s Fabar Things 2

…and finally, the top five 50-epoch (Loss: 0.1561) generated titles!

  • MY BOY DO MY MAKEUP
  • 24 HOUR BOX FORT PRISON ESCAPE
  • Liam Payne Goes Sneaker Shopping
  • Star Wars: The Bachelor Finale
  • Disney Princess Pushing A Truck

Going further…

…this was a humorous example of the capabilities of RNNs. You may have noticed that as the number of epochs went up, the ideas became less and less original (overfitting). This has to do with our restricting the number of training examples. If you’d like to experiment with this on your own (and have a couple of hours of computing time to spare), you could try restricting only by category and not by the number of training examples, or use the entire dataset, as sketched below; the resulting generated titles will probably be much more interesting.
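For example, to train on every title in the three chosen categories instead of a 5,000-row subset, only the file-writing step changes (a sketch of the variation described above):

# write all of the filtered titles, not just the first 5,000
with open('title.txt', 'w') as titles:
    for item in sub_data['title']:
        titles.write(item)
        titles.write('\n')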

Thanks for reading!

If you enjoyed this, check out some of the other fun applications of RNNs out there.

