Calmcode - embeddings: results

Let's discuss the results of the letter embedding experiment.

1 2 3 4 5 6 7 8 9

Remember that embedding chart from the previous segment?

This is what the embedded space looks like before training. After training is done, it looks like this:

Notice how the vowels are all next to eachother? Just like the numbers?

General pattern

Let's go over the steps that happened here.

  1. We started with a dataset and a predictive task.
  2. By training for this task, we get a numeric representation as a side-effect.
  3. When we look at the numeric representation, it seems that clusters appear. These clusters may be useful for other tasks.

These clusters appear as a side effect of the training procedure, and it's a general "theme" when training any embeddings.