Predict Movie Earnings with Posters

source link: https://towardsdatascience.com/predict-movie-earnings-with-posters-786e9fd82bdc?gi=5ad6dbfa3401

Identifying the genre and earnings of a movie from its poster


Whether you have a summer blockbuster or a short film, what is the best way to capture your audience’s attention and interest? The two most prominent methods are obvious: posters and trailers.

Movie posters communicate important information about a movie such as the title, theme, characters and cast, as well as the producers involved. A poster also signals to the audience what genre of movie they are watching. So, given a movie poster, can a machine learning model tell which genre the movie belongs to?

Movie posters are also a crucial promotional tool, and a great poster design helps a movie appeal to as wide a viewership as possible. Given a movie poster, can we predict whether the movie is going to do well at the box office?

In this article, we will walk through the data preparation and use convolutional neural networks to build machine learning models that answer these questions.


We collected metadata for 45,466 movies from The Movie Database (TMDb). TMDb offers a wide variety of attributes, but for this experiment we are only interested in the following fields: 1) title, 2) genre, 3) poster, 4) popularity, 5) budget, 6) revenue.

Since a movie can fall into multiple genres, we pick only the first genre of each movie, so that every movie has exactly one genre label. Because we intend to predict whether a movie will do well at the box office, we use the revenue/budget ratio: a movie is making money if the ratio is greater than 1, and losing money otherwise.
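The single-genre label and the money-making flag can be sketched in pandas; the rows and column names below are hypothetical stand-ins for the TMDb fields, not the actual dataset:

```python
import pandas as pd

# Hypothetical rows standing in for the TMDb fields used in this article
df = pd.DataFrame({
    "title": ["Toy Story", "Low Budget Flop"],
    "genres": [["Animation", "Comedy"], ["Drama"]],
    "budget": [30_000_000, 5_000_000],
    "revenue": [373_554_033, 1_000_000],
})

# Keep only the first genre so each movie has exactly one label
df["genre"] = df["genres"].str[0]

# A movie is "making money" when revenue / budget is greater than 1
df["ratio"] = df["revenue"] / df["budget"]
df["makes_money"] = df["ratio"] > 1
```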

Here is the sample dataset loaded in Pandas data frame:

Data analysis and filtering

We won’t download all 45,466 images right away. Instead, we will do some analysis, filter out entries with data issues, and select the list of movie posters to download.

Firstly, we will remove those with missing information:

  • blank title after removing all non-alphanumeric characters
  • no genre
  • no poster URL
  • no budget
  • no revenue
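A minimal pandas sketch of these filters, assuming hypothetical rows and treating a zero budget or revenue as missing (an interpretation, since TMDb stores them as numbers):

```python
import re

import pandas as pd

# Hypothetical rows; the real data has 45,466 movies
df = pd.DataFrame({
    "title": ["Toy Story", "***", None],
    "genre": ["Animation", "Comedy", None],
    "poster_path": ["/a.jpg", None, "/c.jpg"],
    "budget": [30_000_000, 0, 10_000],
    "revenue": [373_554_033, 5_000, 0],
})

def alnum_only(title):
    # Strip all non-alphanumeric characters; None becomes an empty string
    return re.sub(r"[^0-9a-zA-Z]", "", title or "")

mask = (
    df["title"].map(alnum_only).str.len().gt(0)  # non-blank cleaned title
    & df["genre"].notna()                        # has a genre
    & df["poster_path"].notna()                  # has a poster URL
    & df["budget"].gt(0)                         # has a budget
    & df["revenue"].gt(0)                        # has revenue
)
filtered = df[mask]
```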

We are left with 40,727 movies after filtering out the undesirable data. Below is the distribution of the number of movies in each genre:


For our genre prediction task, we want to predict between 10 classes, so we will select the top 10 genres.


Hence, we select the 1,000 most popular movies in each genre, based on the popularity field. These are the movie posters we will download: 10,000 images across 10 genres.
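This selection can be sketched with a pandas sort and groupby; the toy data and cutoffs below are illustrative (the article keeps the top 1,000 movies in each of the 10 largest genres):

```python
import pandas as pd

# Toy data standing in for the 40,727 filtered movies
df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genre": ["Comedy", "Comedy", "Horror", "Horror"],
    "popularity": [9.0, 7.5, 8.2, 6.1],
})

TOP_GENRES = ["Comedy", "Horror"]  # stand-in for the 10 largest genres
N_PER_GENRE = 1                    # 1,000 in the article

# Keep the N most popular movies within each selected genre
top = (
    df[df["genre"].isin(TOP_GENRES)]
    .sort_values("popularity", ascending=False)
    .groupby("genre", sort=False)
    .head(N_PER_GENRE)
)
```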


Download the movie posters

In the data frame shown above, poster_path is the filename of the poster. To get the image URL for the Toy Story poster, we prepend http://image.tmdb.org/t/p/w185 to the poster path to get: http://image.tmdb.org/t/p/w185/rhIRbceoE9lR4veEXuwCC2wARtG.jpg.

We can simply download all the images with the Requests library. I would suggest adding a 1-second delay between each image download to avoid overloading the server. This code downloads and saves the images into the respective genre folders for the genre prediction task:
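A sketch of such a downloader using the Requests library; the `poster_url` and `download_posters` helpers and the folder layout are illustrative assumptions, not the article's exact code:

```python
import os
import time

import requests

BASE_URL = "http://image.tmdb.org/t/p/w185"

def poster_url(poster_path):
    # poster_path already starts with "/", so plain concatenation works
    return BASE_URL + poster_path

def download_posters(df, out_dir="posters"):
    # Save each poster into a folder named after its genre,
    # pausing 1 second between downloads to be polite to the server
    for _, row in df.iterrows():
        genre_dir = os.path.join(out_dir, row["genre"])
        os.makedirs(genre_dir, exist_ok=True)
        resp = requests.get(poster_url(row["poster_path"]))
        if resp.status_code == 200:
            fname = os.path.basename(row["poster_path"])
            with open(os.path.join(genre_dir, fname), "wb") as f:
                f.write(resp.content)
        time.sleep(1)
```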

Image processing

In order to make use of pretrained models, we first need to transform our rectangular posters into squares. Furthermore, to reduce the computation cost, the images are resized to 224 by 224. We identified 4 image processing methods that achieve these requirements:

  • PIL library resize
  • center crop
  • padding
  • random crop and resize


Method #1: PIL library resize

Use the PIL library to resize the images to 224x224.

from PIL import Image

image = Image.open(PATHOFIMAGE)
image = image.resize((224, 224), Image.BILINEAR)

The processed image is distorted after resizing:


Method #2: center crop

We will transform the images using PyTorch’s Torchvision.

from torchvision import datasets, transforms

do_transforms = transforms.Compose([
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
dataset = datasets.ImageFolder(PATH, transform=do_transforms)

With center crop, both the top and bottom of the image are cropped away:


Method #3: padding

As most movie posters are portrait-oriented, we decided to add black padding to the left and right. This avoids any distortion or cropping of the original poster image. Since black padding is all zeros in RGB, it has a minimal effect on our convolutional neural networks.

import numpy as np
from skimage.transform import resize

def resize_image_to_square(img, side, pad_cval=0, dtype=np.float64):
    h, w, ch = img.shape
    if h == w:
        padded = img.copy()
    elif h > w:
        # pad the left and right to form an h x h square
        padded = np.full((h, h, ch), pad_cval, dtype=dtype)
        l = int(h / 2 - w / 2)
        r = l + w
        padded[:, l:r, :] = img.copy()
    else:
        # pad the top and bottom to form a w x w square
        padded = np.full((w, w, ch), pad_cval, dtype=dtype)
        l = int(w / 2 - h / 2)
        r = l + h
        padded[l:r, :, :] = img.copy()
    resized_img = resize(padded, output_shape=(side, side))
    return resized_img

The processed image after applying Padding:


Method #4: random crop and resize

We will transform the images using PyTorch’s Torchvision.

from torchvision import datasets, transforms

input_size = 224  # target size after the random crop

do_transforms = transforms.Compose([
        transforms.RandomCrop((280, 280), padding=None, pad_if_needed=True, fill=0, padding_mode='constant'),
        transforms.Resize(input_size, interpolation=2),
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
dataset = datasets.ImageFolder(PATH, transform=do_transforms)

The processed image after random crop and resize:


Image processing results

To measure the accuracy of each image processing method, we used a pretrained ResNet18 to perform classification. We classify between the comedy and horror genres, as their posters are generally distinctly different. To ensure our comparison is fair, we did the following:

  • use the same set of movies for training and the same set for validation
  • fix the random seed
  • load the same pretrained ResNet18 from PyTorch’s Torchvision

Model accuracy with the different image processing methods is as follows:

  • PIL library resize: approximately 80%
  • Center crop: approximately 80%
  • Padding: approximately 85%
  • Random crop and resize: approximately 85%
