
1 Draft

Text representation: how can a computer understand the meaning of words?

Word vectors: words or phrases from the vocabulary of a given language are mapped to vectors of real numbers.
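
As a minimal sketch of what this mapping buys us, the snippet below stores a few word vectors in a dictionary and compares them with cosine similarity. The three-dimensional vectors and the name `word_vectors` are hypothetical values picked by hand for illustration; real embeddings are learned from data and usually have hundreds of dimensions.

```python
import math

# Hypothetical 3-dimensional vectors, chosen by hand for illustration only.
word_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


sim_king_queen = cosine_similarity(word_vectors["king"], word_vectors["queen"])
sim_king_apple = cosine_similarity(word_vectors["king"], word_vectors["apple"])
print(sim_king_queen > sim_king_apple)
```

With these made-up values, "king" sits closer to "queen" than to "apple", which is the kind of geometric relationship a useful embedding is expected to have.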

2 Traditional vector representation

Bag of Words (aka BoW)

BoW vectors do not encode any information about the meaning of a given word.
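
A small sketch of a bag-of-words encoder may make this limitation concrete. The helper names `build_vocabulary` and `bag_of_words` are illustrative, and whitespace tokenization is a simplifying assumption.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def bag_of_words(document, vocabulary):
    """Return the token-count vector of `document` over `vocabulary`."""
    counts = Counter(document.lower().split())
    return [counts.get(tok, 0) for tok in vocabulary]


docs = ["dog bites man", "man bites dog"]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]
print(vocab)
print(vectors)
```

Both sentences get the same vector because only counts are kept: word order is lost, and each word is just an independent column with no notion of similarity to any other word.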

Co-occurrence matrix
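
The following sketch counts how often two words appear within a fixed window of each other, which is one common way to build such a matrix. The function name `cooccurrence_counts`, the toy sentences, and the window size of 1 are assumptions made for illustration.

```python
from collections import defaultdict


def cooccurrence_counts(sentences, window=1):
    """Count (word, neighbour) pairs that occur within `window` positions."""
    counts = defaultdict(int)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(word, tokens[j])] += 1
    return counts


sentences = ["i like deep learning", "i like nlp", "i enjoy flying"]
counts = cooccurrence_counts(sentences, window=1)
print(counts[("i", "like")], counts[("i", "enjoy")])
```

The counts are symmetric by construction, so each row of the resulting matrix can serve as a sparse, high-dimensional representation of one word.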

SVD (singular value decomposition)
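
One way SVD is typically applied here is to factor a word-by-word count matrix and keep only the top few singular directions as dense word vectors. The sketch below assumes NumPy is available; the matrix `X` holds hand-entered window-1 counts for three toy sentences, and the choice of `k = 2` is arbitrary.

```python
import numpy as np

vocab = ["i", "like", "enjoy", "deep", "learning", "nlp", "flying"]

# Toy symmetric co-occurrence counts (window of 1) for the sentences
# "i like deep learning", "i like nlp", "i enjoy flying".
X = np.array([
    [0, 2, 1, 0, 0, 0, 0],  # i
    [2, 0, 0, 1, 0, 1, 0],  # like
    [1, 0, 0, 0, 0, 0, 1],  # enjoy
    [0, 1, 0, 0, 1, 0, 0],  # deep
    [0, 0, 0, 1, 0, 0, 0],  # learning
    [0, 1, 0, 0, 0, 0, 0],  # nlp
    [0, 0, 1, 0, 0, 0, 0],  # flying
], dtype=float)

# X = U @ diag(S) @ Vt, with singular values in S sorted in descending order.
U, S, Vt = np.linalg.svd(X, full_matrices=False)

k = 2  # number of dimensions to keep (arbitrary for this sketch)
embeddings = U[:, :k] * S[:k]  # one k-dimensional row per vocabulary word

for word, vec in zip(vocab, embeddings):
    print(word, vec)
```

Truncating to `k` dimensions trades reconstruction accuracy for compact vectors; on a realistic vocabulary the full decomposition becomes expensive, which is one motivation for the neural methods in the next section.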

3 Neural Embeddings

3.1 Word2Vec

Continuous bag-of-words (CBOW)

Continuous skip-gram
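
The two Word2Vec variants differ in the direction of prediction: CBOW predicts a center word from its surrounding context, while skip-gram predicts each context word from the center word. The sketch below only generates the training examples for each variant, not the network itself; the function names and the window size are illustrative assumptions.

```python
def context_windows(tokens, window=2):
    """Yield (center, context_words) for every position in `tokens`."""
    for i, center in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        yield center, left + right


def cbow_examples(tokens, window=2):
    """CBOW: input is the list of context words, target is the center word."""
    return [(context, center) for center, context in context_windows(tokens, window)]


def skipgram_examples(tokens, window=2):
    """Skip-gram: input is the center word, target is one context word."""
    return [
        (center, ctx)
        for center, context in context_windows(tokens, window)
        for ctx in context
    ]


tokens = "the quick brown fox jumps".split()
cbow = cbow_examples(tokens, window=1)
skipgram = skipgram_examples(tokens, window=1)
print(cbow[1])
print(skipgram[:3])
```

CBOW produces one example per position, whereas skip-gram produces one example per (center, context) pair, so skip-gram sees more training examples from the same text.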

GloVe

FastText
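
FastText represents a word through its character n-grams, with `<` and `>` added as word-boundary markers, so that rare or unseen words can share subword information. The helper below is a minimal sketch of that n-gram extraction only; the name `char_ngrams` and its default range are assumptions, and it is not the FastText library's own API.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word` wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


trigrams = char_ngrams("where", n_min=3, n_max=3)
print(trigrams)
```

In this scheme a word vector is built from the vectors of its n-grams, so the boundary markers matter: the trigram `her` inside "where" is distinct from the whole word `<her>`.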
