Calmcode - embeddings: multi modal

If you consider how flexible neural networks are, it's not a crazy idea to start training multi-modal models.

1 2 3 4 5 6 7 8 9

In the previous segment we discussed how you can attach a general label to also create an embedding.

A very general diagram of a neural network.

But let's now discuss a general network architecture that allows you to embed two different kinds of data into the same space. In the example below we'll associate text with an image.

Combine two different kinds of inputs via a label.

The image part would require some convolutional layers, but besides that we still end up with a gradient that can propogate backward. Updating all the weights along the way.