Automatically generate multilingual alt attributes for your images with a LLM and rokka.io

The alt attribute in HTML img elements is crucial and not optional. However, creating them can be cumbersome for content editors and is often overlooked. A multimodal LLM can automatically generate them with sufficient accuracy.

Our image service rokka.io now supports this feature, thanks to OpenAI's Vision API.

In this blog post, we will also share the prompt we used, so you can recreate it for yourself. Alt attributes are vital for enhancing accessibility, and we are keen on promoting their wider use.

Why Alt Attributes?

Besides being mandatory for valid HTML, here's what ChatGPT says about them:

The alt attribute on img tags is used for accessibility, providing a textual description of the image for users who can't see it; as a fallback when the image can't be displayed; for SEO, helping search engines understand and index the image content; and to meet legal accessibility standards.

The Example

Man watering a hanging houseplant with a green watering can.

For the image shown above, rokka and the LLM produced the following accurate descriptions in English, German, and French:

en: Man watering a hanging houseplant with a green watering can.
de: Mann giesst hängende Zimmerpflanze mit grüner Gießkanne.
fr: Homme arrosant une plante suspendue avec un arrosoir vert.

How We Do It

First, we resize the image to a smaller resolution to save costs (width/height of 500 pixels works well). Then, we send that, along with the prompt below, to OpenAI's Vision API, parse the result, and store it in rokka's metadata of the image. That's all. And it works very well most of the time.

If it's implemented into a CMS, you maybe should provide an option for content editors to adjust the alt attribute in case of inaccuracies. We offer this for example in the Drupal rokka plugin, but it's seldom necessary.

The Prompt

You are an alt attribute creator for images. Please describe 
the image suitable for the alt tag in an HTML image element 
in as few words as possible.
Answer in the following format in the languages 
English (en), German (de), French (fr).

Example Output:
en: View from a train window at Aarau station platform with tracks.
de: Blick aus dem Zugfenster auf den Bahnsteig am Bahnhof Aarau mit Gleisen.
fr: Vue depuis la fenêtre du train sur le quai de la gare d'Aarau avec les voies.

Output:

As a response, you should now get a line-by-line description for each image.

Want to Try It Out?

The API calls are documented in the rokka documentation. Currently, paying rokka.io users can use this feature immediately. Contact us if you want to try it out on rokka.

And as mentioned, there's also a Drupal plugin for complete integration of rokka into your Drupal installation.

PS: We are aware that we don't use this feature on this site (not yet, we are working on it). We eat our own dogfood on some upcoming client sites, though.

Do you have a question, a comment, or just feeling inspired? Mention us or share this article on Mastodon, Twitter or LinkedIn.

Subscribe to blog updates using the RSS Feed.

Automatically generate multilingual alt attributes for your images with a LLM a...

Automatically generate multilingual alt attributes for your images with a LLM and rokka.io

Why Alt Attributes?

The Example

How We Do It

The Prompt

Want to Try It Out?

Recommend

iQOO 12 Anniversary Edition for India is coming on April 9 in Desert Red

冰岛户外品牌 66°North任命 Josefine Laigaard 为新任战略和业务发展总监，推动品牌加...

Transform Your Cybersecurity Training with OffSec’s Cyber Ranges

Re-Balancing Design and Development

近日，广汽集团与法国达索系统在广汽中心举行战略合作深化协议签约仪式。双方将在数字...

How to use xUnit to run unit testing in .NET and C#

DHC蝶翠诗宣布艺人陈意涵成为品牌防晒小金帽代言人。

牛仔品牌Lee与设计师 Suneet Varma 合作推出了“Denim Beyond Definition”女装系列。

Comments JavaScript API: Useful Tips and Tricks

Discord To Start Showing Ads This Week After History of Shunning Them - Slashdot

About Joyk