
Working of "word2vec" vectorizer to convert text to numbers

 3 weeks ago
source link: https://www.codeproject.com/Questions/5362926/Working-of-word2vec-vectorizer-to-convert-text-to

I worked on a predictive (classification) model. I used Word2Vec to convert the data in the textual columns to numeric form, and then ran the machine learning algorithms on the result.

I have the following doubts regarding the working of Word2Vec:

What I have tried:

1) When I check the vector representation of each word of a sentence, I get an array of 100 numbers (a 100-dimensional vector). What do all these numbers mean? I know that each number corresponds to a dimension, but what is a dimension in this context (with regard to the vector space)?
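To make question 1 concrete, here is a minimal sketch using made-up 4-dimensional vectors (hypothetical values, not real Word2Vec output, which here would have 100 dimensions). It assumes the common reading that a single coordinate has no standalone meaning and that what can be compared is how whole vectors sit relative to each other, for example via cosine similarity. The words and the helper name `cosine_similarity` are chosen purely for illustration.

```python
import math

# Hypothetical 4-dimensional vectors, invented for illustration only.
# A real model trained with 100 dimensions would give 100 numbers per word.
vectors = {
    "king":  [0.8, 0.1, 0.6, -0.2],
    "queen": [0.7, 0.2, 0.7, -0.1],
    "apple": [-0.5, 0.9, -0.1, 0.4],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Compare whole vectors rather than reading any single coordinate.
sim_royal = cosine_similarity(vectors["king"], vectors["queen"])
sim_fruit = cosine_similarity(vectors["king"], vectors["apple"])

print("king vs queen:", round(sim_royal, 3))
print("king vs apple:", round(sim_fruit, 3))
```

With these invented values, "king" comes out closer to "queen" than to "apple", even though no individual column was given a meaning.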

2) When the Word2Vec model is trained as a neural network, each word in a sentence is one-hot encoded and fed to the input layer. So the vector representations of the words being fed in would be something like [1 0 0 0 0 0 0] and [0 0 1 0 0 0 0].

The weights applied to these vectors are initialized randomly. The weighted sum of the inputs is passed to the next layer (the hidden layer). My doubt is: what is the point of assigning random weights to these word vectors when any weight multiplied by a 0 will remain 0 anyway?

How does the neural network pass information along when the input data is this sparse?
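The setup described in question 2 can be sketched in plain Python. This is only an illustration under stated assumptions: a toy vocabulary of 7 words (matching the 7-element one-hot vectors above), a hidden layer of size 3 instead of 100, and hypothetical names (`W`, `one_hot`, `hidden_layer`). It computes the weighted sum for one one-hot input so the effect of the zeros can be checked directly.

```python
import random

VOCAB_SIZE = 7    # matches the 7-element one-hot vectors in the question
HIDDEN_SIZE = 3   # assumed toy size; the model in the question uses 100

random.seed(0)
# Input-to-hidden weight matrix: one row of random weights per vocabulary word.
W = [[random.uniform(-0.5, 0.5) for _ in range(HIDDEN_SIZE)]
     for _ in range(VOCAB_SIZE)]

def one_hot(index, size=VOCAB_SIZE):
    """Return a list of zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

def hidden_layer(x, weights):
    """Weighted sum of inputs: h[j] = sum over i of x[i] * weights[i][j]."""
    return [sum(x[i] * weights[i][j] for i in range(len(x)))
            for j in range(len(weights[0]))]

x = one_hot(2)             # [0, 0, 1, 0, 0, 0, 0]
h = hidden_layer(x, W)

# Every row multiplied by 0 drops out; the row multiplied by 1 passes through.
print("hidden layer:", h)
print("row 2 of W:  ", W[2])
```

In this sketch the hidden-layer output is exactly row 2 of `W`: the zeros cancel the other rows, and the single 1 selects the row of weights belonging to the input word.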

