Using GenAI to Classify an Image as a Photo, Screenshot, or Meme

source link: https://www.raymondcamden.com/2024/01/18/using-genai-to-classify-an-image-as-a-photo-screenshot-or-meme
January 18, 2024

File this under the "I wasn't sure if it would work and it did" category. Recently, a friend on Facebook wondered if there was some way to take a collection of photos and figure out which were 'real' photos versus memes. I thought it could possibly be a good exercise for GenAI and decided to take a shot at it. As usual, I opened up Google's AI Studio and did a few initial tests:

Screenshot from AI Studio

I then removed that image and pasted in a few more to test. From what I could see, it worked well enough, so I took the source code from AI Studio and began working.

The Code

First, I grabbed eleven pictures from my collection, aiming for a mix of photos, memes, and screenshots. After downloading them, I renamed them so it would be quicker to see if the results were right. As I mentioned above, AI Studio gave me the code, but I modified it slightly so I could pass a directory of images:

import fs from 'fs/promises';
import 'dotenv/config';

import { GoogleGenerativeAI, HarmCategory, HarmBlockThreshold } from '@google/generative-ai';

const MODEL_NAME = "gemini-pro-vision";
const API_KEY = process.env.GOOGLE_AI_KEY;


async function detectPhoto(path) {
  const genAI = new GoogleGenerativeAI(API_KEY);
  const model = genAI.getGenerativeModel({ model: MODEL_NAME });

  const generationConfig = {
    temperature: 0.4,
    topK: 32,
    topP: 1,
    maxOutputTokens: 4096,
  };

  const safetySettings = [
    {
      category: HarmCategory.HARM_CATEGORY_HARASSMENT,
      threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
      threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
      threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
    {
      category: HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
      threshold: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    },
  ];

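  const parts = [
    {text: "Look at the following photo and tell me if it's a photo, a screenshot, or a meme. Answer with just one word.\n"},
    {
      inlineData: {
        // Note: the MIME type is hard-coded, so every file is sent as a JPEG.
        mimeType: "image/jpeg",
        // readFile() already returns a Buffer, so it can be base64-encoded directly.
        data: (await fs.readFile(path)).toString("base64")
      }
    },
    {text: "\n\n"},
  ];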

  const result = await model.generateContent({
    contents: [{ role: "user", parts }],
    generationConfig,
    safetySettings,
  });

  const response = result.response;
  return response.text();
}

const root = './source_for_detector/';
const files = await fs.readdir(root);
for (const file of files) {
  console.log(`Check to see if ${file} is a photo, meme, or screenshot...`);
  const result = await detectPhoto(root + file);
  console.log(result);
}
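
One thing to watch in the script above: it sends every file with `mimeType: "image/jpeg"`, whatever the file actually is. Screenshots in particular are often saved as PNGs. Here is a minimal sketch of how the type could be derived from the file extension instead. The helper name `mimeTypeFor` and the lookup table are my own assumptions, not part of the original script, and you should check the list of image types against the current API documentation before relying on it.

```javascript
// Hypothetical helper (not in the original script): map a file extension
// to the MIME type sent alongside the base64 image data.
// The set of extensions listed here is an assumption for illustration.
const MIME_TYPES = {
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.png': 'image/png',
  '.webp': 'image/webp',
};

function mimeTypeFor(filePath) {
  const dot = filePath.lastIndexOf('.');
  const ext = dot === -1 ? '' : filePath.slice(dot).toLowerCase();
  // Fall back to the value the original script hard-codes.
  return MIME_TYPES[ext] ?? 'image/jpeg';
}
```

With something like this in place, the `inlineData` part could use `mimeType: mimeTypeFor(path)` rather than the fixed string.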

It worked perfectly!

Terminal output from script
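
Since I renamed the files so I could eyeball the results, the check could also be automated. The sketch below is one possible way to do that, and it rests on two assumptions of mine: that each filename starts with its expected label (for example `meme_1.jpg`), and that the model's reply may come back with stray whitespace, capitalization, or punctuation. The function names are hypothetical and not part of the original script.

```javascript
const LABELS = ['photo', 'screenshot', 'meme'];

// Reduce a raw reply such as " Meme.\n" to one of the known labels.
// Anything that is not exactly one of the three words becomes 'unknown'.
function normalizeLabel(reply) {
  const cleaned = reply.toLowerCase().replace(/[^a-z]/g, '');
  return LABELS.includes(cleaned) ? cleaned : 'unknown';
}

// Assumed naming scheme: the expected label is the filename prefix,
// e.g. "meme_1.jpg" or "screenshot2.png".
function expectedLabel(fileName) {
  const lower = fileName.toLowerCase();
  return LABELS.find((label) => lower.startsWith(label)) ?? 'unknown';
}

// True only when the filename carries a known label and the reply matches it.
function isCorrect(fileName, reply) {
  const expected = expectedLabel(fileName);
  return expected !== 'unknown' && expected === normalizeLabel(reply);
}
```

Inside the loop, `isCorrect(file, result)` would then report whether each answer matched the filename. The matching is deliberately strict, so a longer reply like "It is a meme" counts as `unknown` rather than being guessed at.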

If you want a copy of the source, you can grab it here: https://github.com/cfjedimaster/ai-testingzone/tree/main/detect_meme_ss

The Photos

Ok, technically you can just head over to the GitHub repo to see these, but here are the source images. First, the 'regular' photos:

Cat laying on a desk next to a computer mouse
Display case that says 'invisible snake'
Picture from a football game
Two cats on a chair

Next, the screenshots:

Screenshot from Reddit app
Screenshot from walmart.com, Nebulon-B Frigate LEGO
Screenshot from OneNote, a list of shows to watch

And finally, the memes. Enjoy.

Time's Person of the Year - Godzilla
Vote Cobra
Who is Cobra Commander - I mean really...
Brace yourself - winter is coming. The entire thing. All at once. In one weekend.