
source link: https://donghao.org/2024/04/19/multimodal-trials-solve-the-masked-language-problem-about-my-tiny-albef-implementation-episode-3/

Multimodal trials: solve the Masked Language problem about my tiny ALBEF implementation (episode 3)

I just wrote my own implementation of ALBEF. But when I evaluated it with some masked sentences, it failed.

I am using this image:

[Image: a photo of a chocolate cake]

When I asked “This is a chocolate <|mask|>”, it generated “This is a chocolate urn”. Quite strange.

Then I asked “This is a <|mask|> cake”, and it generated “This is a iph cake”. Totally wrong.

After checking my implementation of the dataset and training on a small part of CC3M, a week passed, and today I finally found the reason: tiktoken is a BPE tokenizer that uses sub-words as tokens, and these sub-words severely hurt the model. For example, the sub-words “urn” and “iph” appear so many times that the model uses them to fill the masked position in its predictions.
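
To see the effect, here is a minimal sketch (not the training code from this post; the encoding name "cl100k_base" is my assumption) that prints the BPE pieces tiktoken splits a sentence into:

```python
# Sketch: inspect how tiktoken's BPE splits text into sub-word pieces.
# The encoding name "cl100k_base" is an assumption, not necessarily the
# encoding used in the tiny ALBEF experiment.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "This is a chocolate cake"
ids = enc.encode(text)

# Decode each token id individually to see the sub-word pieces the model
# actually has to predict at the masked position.
print([enc.decode([i]) for i in ids])
```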

By replacing tiktoken with BertTokenizerFast (from the “transformers” package), the model now correctly generates “This is a chocolate cake”.
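
For comparison, here is a minimal sketch (assuming the standard "bert-base-uncased" vocabulary, which may differ from the one used here) showing that BertTokenizerFast has a dedicated [MASK] token, so the masked position corresponds to one WordPiece that the MLM head can fill in directly instead of a stray BPE fragment:

```python
# Sketch: BertTokenizerFast tokenizes with a WordPiece vocabulary and a
# dedicated [MASK] token (instead of a custom <|mask|> string).
# "bert-base-uncased" is an assumed checkpoint, not necessarily the one used.
from transformers import BertTokenizerFast

tok = BertTokenizerFast.from_pretrained("bert-base-uncased")

print(tok.tokenize("This is a chocolate [MASK]"))  # whole-word pieces plus [MASK]
print(tok.mask_token, tok.mask_token_id)           # the token the MLM head predicts at

ids = tok("This is a chocolate [MASK]")["input_ids"]   # plain list of token ids
print(tok.convert_ids_to_tokens(ids))                  # includes [CLS] and [SEP]
```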

April 19, 2024, 4:57 · RobinDong · machine learning · Multimodal, PyTorch