Natural Language Processing
- History of encoding
- LV1 One-hot vector = 0,0,0,1,0,0,0 (exactly one element is 1, i.e. "hot")
- LV2 TF-IDF = appears frequently globally (across all documents) => not important & appears frequently locally (within one document) => important
- LV3 Word2vec = dense vector (x, x, x) that supports arithmetic, e.g. king - distance(male, female) ≈ queen (includes some of the essence of TF-IDF)
- LV4 BERT
- BERT can't understand he/she (pronouns) on their own
- But it infers the meaning from the other surrounding words (word sense disambiguation)
https://ishitonton.hatenablog.com/entry/2018/11/25/200332
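A minimal sketch of LV1 and LV2 from the encoding history above, in plain Python. The toy corpus, the helper names (`one_hot`, `tf`, `idf`, `tf_idf`), and the unsmoothed `idf = log(N / df)` are assumptions for illustration only; libraries commonly use smoothed or normalized variants.

```python
import math

# Hypothetical toy corpus, already tokenized (an assumption for this sketch).
docs = [
    ["the", "king", "rules", "the", "land"],
    ["the", "queen", "rules", "the", "land"],
    ["the", "cat", "sleeps"],
]

# Fixed vocabulary order so every word maps to one vector position.
vocab = sorted({w for d in docs for w in d})


def one_hot(word):
    """LV1: all zeros except a single 1 at the word's vocabulary index."""
    return [1 if w == word else 0 for w in vocab]


def tf(word, doc):
    """Local frequency: share of the document's tokens that are `word`."""
    return doc.count(word) / len(doc)


def idf(word):
    """Global rarity: log(N / df), where df = number of docs containing `word`."""
    df = sum(1 for d in docs if word in d)
    return math.log(len(docs) / df)


def tf_idf(word, doc):
    """LV2: high when frequent locally and rare globally."""
    return tf(word, doc) * idf(word)


print(one_hot("king"))
for w in ["the", "rules", "king"]:
    print(w, round(tf_idf(w, docs[0]), 4))
```

With this corpus, "the" appears in every document, so its idf is log(3/3) = 0 and its TF-IDF is 0, while "king" appears in only one document and gets the highest score in it.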
- About embeddings
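- A field where linguistics and information science overlap
- Approach
- Want to understand the mechanism by which humans take in and produce language
- The brain's information processing can't be understood without something like neuroscience
- So, probe the mechanism through language, which is observable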