Sofar we've been discussing neural systems that train by predicting the next token or surrounding tokens. It had a diagram that looked like this:
![](/static/images/content/course/embeddings/06-flex.png)
But we're free to pick whatever diagram we like. We can also train embeddings by just training on a few general NLP tasks. As long as we can calculate a gradient, we can update the system!
![](/static/images/content/course/embeddings/06-flex-more.png)
In this diagram, we assume a sentence as input and that we have three classification labels that we'd like to predict. Also in this system, you'd have a layer in the middle that you can re-use as an embedding.