- Date Created: [[2020-10-02]]
- [[Linguistics]] [[Artificial Intelligence]] [[Natural Language Processing]]
- Source:
- 1 [[Computational Linguistics - Wikipedia]]
- 2 [[Article/What Is GPT-3? Everything Your Business Needs to Know About OpenAI’s Breakthrough AI Language Program | ZDNet]]
- ### Borrowing techniques used in gene mapping to better analyze languages [1]([[Computational Linguistics - Wikipedia]])
- ### Attention [2]([[Article/What Is GPT-3? Everything Your Business Needs to Know About OpenAI’s Breakthrough AI Language Program | ZDNet]])
- Per the article, attention replaces the older approach of cramming every sentence into one same-sized vector, no matter how long the sentence. The network instead compresses words into vectors of different sizes and searches across those vectors for the context that matters, so words can be encoded with more of their meaning.
- "`Every sentence was crammed into the same-sized vector, no matter how long the sentence.
Bengio and his team concluded that this rigid approach was a bottleneck. A language model should be able to search across many vectors of different lengths to find the words that optimize the conditional probability. And so they devised a way to let the neural net flexibly compress words into vectors of different sizes, as well as to allow the program to flexibly search across those vectors for the context that would matter. They called this attention. `"
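- A minimal sketch of the "search across vectors" idea, not taken from the article. It keeps one vector per word, so a longer sentence simply yields more vectors, and it weights each word by how well it matches a query. Assumptions: the word vectors and the query are hypothetical toy values, and scoring uses the scaled dot product popularized by Transformers rather than the additive scoring of the original work by Bengio's team.

```python
import math

def softmax(scores):
    # Turn raw scores into positive weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    # Score every word vector against the query (scaled dot product).
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The context is the weighted average of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Hypothetical toy embeddings: one 2-d vector per word.
words = ["the", "cat", "sat"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [-1.0, 2.0]  # hypothetical query that points toward "cat"

weights, context = attend(query, vectors, vectors)
for word, weight in zip(words, weights):
    print(word, round(weight, 3))
```

- The same `attend` function works unchanged on a sentence of any length, because nothing in it fixes the number of word vectors in advance.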
- ### Unsupervised learning [2]([[Article/What Is GPT-3? Everything Your Business Needs to Know About OpenAI’s Breakthrough AI Language Program | ZDNet]])
- Unsupervised learning is arguably a step closer to [strong AI](((jlyPVpS4M))): instead of being spoon-fed sample data that humans have meticulously curated and labeled, the machine is given raw data and has to learn its structure on its own.
- "`The focus up until that time for most language models had been supervised learning with what is known as labeled data. Given an input, a neural net is also given an example output as the objective version of the answer. `"
- "`But having the desired output carefully labeled can be a problem because it requires lots of curation of data, such as assembling example sentence pairs by human judgment, which is time-consuming and resource-intensive. `"
- "`Instead of being given a sentence pair, the network was given only single sentences and had to compress each one to a vector and decompress each one back to the original sentence. `"
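- A rough sketch of the compress-then-decompress objective from the last quote, assuming a tiny linear autoencoder with tied weights as a stand-in for the neural network the article describes. The 2-d points are hypothetical toy data standing in for sentences. The key point is that there are no labels: the target for each input is the input itself.

```python
def encode(w, x):
    # Compress the input vector into a single number (the bottleneck).
    return sum(wi * xi for wi, xi in zip(w, x))

def decode(w, z):
    # Decompress the single number back into a full-sized vector.
    return [wi * z for wi in w]

def reconstruction_loss(w, data):
    # Sum of squared differences between each input and its reconstruction.
    total = 0.0
    for x in data:
        x_hat = decode(w, encode(w, x))
        total += sum((xi - hi) ** 2 for xi, hi in zip(x, x_hat))
    return total

def train(data, w, lr=0.01, steps=200):
    # Plain full-batch gradient descent on the reconstruction loss.
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x in data:
            z = encode(w, x)
            err = [xi - wi * z for xi, wi in zip(x, w)]
            w_dot_err = sum(wi * ei for wi, ei in zip(w, err))
            for j in range(len(w)):
                grad[j] += -2.0 * (z * err[j] + w_dot_err * x[j])
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

# Hypothetical unlabeled data: points along the direction (0.6, 0.8).
data = [[t * 0.6, t * 0.8] for t in (-2.0, -1.0, 1.0, 2.0)]
w0 = [0.1, 0.1]  # arbitrary small starting weights

w = train(data, w0)
loss_before = reconstruction_loss(w0, data)
loss_after = reconstruction_loss(w, data)
print(loss_before, loss_after)
```

- Training only ever compares an input with its own reconstruction, which is why no human-curated sentence pairs are needed.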