General pattern
Consider the general pattern from the previous video:
- We started with a dataset and a predictive task.
- By training for this task, we get a numeric representation as a side-effect.
- When we look at the numeric representation, it seems that clusters appear. These clusters may be useful for other tasks.
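The three steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the code from the video: the toy sentence, the "predict the next letter" task, the two-dimensional embedding size, and the learning rate are all made up for this example.

```python
import numpy as np

# Step 1: a dataset and a predictive task (toy data, chosen for illustration).
text = "the cat sat on the mat and the dog sat on the log"
chars = sorted(set(text))
index = {c: i for i, c in enumerate(chars)}

# Task: given a letter, predict the letter that follows it.
x = np.array([index[c] for c in text[:-1]])
y = np.array([index[c] for c in text[1:]])

rng = np.random.default_rng(0)
vocab_size, dim = len(chars), 2  # dim=2 is an assumption, handy for plotting
E = rng.normal(scale=0.1, size=(vocab_size, dim))  # one embedding row per letter
W = rng.normal(scale=0.1, size=(dim, vocab_size))  # output layer


def loss_and_grads(E, W):
    """Cross-entropy loss of the next-letter model and its gradients."""
    n = len(x)
    h = E[x]                                   # look up embeddings, shape (n, dim)
    logits = h @ W                             # scores per next letter, (n, vocab)
    logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(n), y]).mean()

    d_logits = probs.copy()
    d_logits[np.arange(n), y] -= 1
    d_logits /= n
    dW = h.T @ d_logits
    dE = np.zeros_like(E)
    np.add.at(dE, x, d_logits @ W.T)           # accumulate gradients per letter
    return loss, dE, dW


# Step 2: train for the task with plain gradient descent.
initial_loss, _, _ = loss_and_grads(E, W)
for _ in range(1000):
    _, dE, dW = loss_and_grads(E, W)
    E -= 0.3 * dE
    W -= 0.3 * dW
final_loss, _, _ = loss_and_grads(E, W)

# Step 3: the rows of E are the numeric representation we got as a side effect.
letter_embeddings = {c: E[index[c]] for c in chars}
```

Plotting the two columns of `E` as a scatter chart is one way to look for clusters. With a corpus this small the picture will be noisy, so treat it as a demonstration of the mechanics rather than of meaningful structure.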
This raises the question of what happens when we train word embeddings instead of letter embeddings. We might still get clusters, but each point in the point cloud would represent a word, not a letter.
If you're working on algorithms that try to detect specific words in text, this can be very useful. The embedding for "cat" might end up being very similar to the embedding for "dog", so what a classifier learns from examples containing one word can carry over to the other.
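One common way to put a number on "very similar" is cosine similarity. The sketch below uses hand-written, hypothetical vectors purely for illustration; they are not output from any trained model, and real word embeddings typically have many more dimensions.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy vectors, invented for this example only.
toy_embeddings = {
    "cat": np.array([0.9, 0.8, 0.1]),
    "dog": np.array([0.8, 0.9, 0.2]),
    "car": np.array([0.1, 0.2, 0.9]),
}

cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])

print(f"cat vs dog: {cat_dog:.2f}")
print(f"cat vs car: {cat_car:.2f}")
```

In this made-up setup "cat" scores higher against "dog" than against "car". If trained embeddings behave the same way, a classifier that receives them as input sees "cat" and "dog" as nearby points.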