ELI5: Explain Like I'm 5

T-distributed stochastic neighbor embedding

So imagine you have a bunch of pictures of cute animals like puppies and kittens, and you want to put them in groups based on how similar they look. But every picture is really a huge list of numbers (one for each tiny colored dot), and puppies and kittens come in all sorts of shapes and sizes, so it's hard to tell which pictures are most similar just by staring at all those numbers.

That's where t-distributed stochastic neighbor embedding (t-SNE) comes in. It's like a sorting game for pictures. t-SNE takes all of your cute animal pictures and turns them into points on a graph (usually a flat, 2D scatter plot). Each point represents one picture, and points that end up close to each other usually belong to pictures that look alike.

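The t-SNE algorithm does this by first figuring out the distance between each pair of pictures. This distance is a measure of how different the two pictures are: a small distance means they look alike, and a big distance means they don't. t-SNE then turns each distance into a score for how likely the two pictures are to be neighbors.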
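To make the "distance between each pair of pictures" step concrete, here is a minimal sketch in Python with NumPy. It assumes each picture has already been turned into a row of numbers; the helper name pairwise_sq_distances and the three tiny made-up "pictures" are illustrative only, not part of any library.

```python
import numpy as np

def pairwise_sq_distances(X):
    """Squared Euclidean distance between every pair of rows in X."""
    sq_norms = np.sum(X ** 2, axis=1)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b, computed for all pairs at once
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point rounding
    return np.maximum(d2, 0.0)

# Three made-up "pictures", each described by just three numbers
pictures = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [0.0, 0.0, 1.0],
])

D2 = pairwise_sq_distances(pictures)
print(D2)
```

Row i, column j of D2 holds the squared distance between picture i and picture j, so the diagonal is all zeros (every picture is zero distance from itself).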

Then, t-SNE tries to put similar pictures close together on the graph by creating a kind of magnetism between them. The more similar the pictures are, the stronger the magnetism is between their points on the graph.
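The strength of that "magnetism" comes from a similarity score for each pair of pictures. Below is a simplified sketch of how distances might be turned into such scores with a bell-curve (Gaussian) weighting. It makes two assumptions that real t-SNE does not: it uses one fixed width, sigma, for every picture (real t-SNE picks a separate width per picture from a setting called perplexity), and it normalizes all pairs in one go. The function name gaussian_similarities is hypothetical.

```python
import numpy as np

def gaussian_similarities(D2, sigma=1.0):
    """Turn squared distances into pairwise weights that sum to 1.

    Simplification: one shared sigma for all points (an assumption of this
    sketch; real t-SNE tunes a sigma per point via the perplexity setting).
    """
    W = np.exp(-D2 / (2.0 * sigma ** 2))
    # A picture is not counted as its own neighbor
    np.fill_diagonal(W, 0.0)
    return W / W.sum()

# Squared distances for three made-up pictures:
# 0 and 1 are close, 1 and 2 are further, 0 and 2 are furthest
D2 = np.array([
    [0.0, 1.0, 16.0],
    [1.0, 0.0, 9.0],
    [16.0, 9.0, 0.0],
])

P = gaussian_similarities(D2, sigma=1.0)
print(P)
```

The closest pair gets the largest weight, which is the sense in which more similar pictures pull on each other more strongly.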

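But here's the tricky part: t-SNE also pushes points apart when they sit closer together on the graph than their pictures deserve. So if you have a picture of a puppy and a picture of a fish, their points should end up in different clumps since they're not very similar at all. One caution: t-SNE cares most about keeping close neighbors together, so how far apart two clumps land doesn't tell you exactly how different they are.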
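The pull-together and push-apart behavior can be sketched as a single "force" calculation. The code below follows the standard t-SNE gradient formula, in which each pair contributes (p - q) times a heavy-tailed weight, where p is the similarity of the two pictures and q is the similarity of their points on the graph. The function names and the hand-picked numbers in P and Y are made up for illustration, and a real implementation adds extras such as momentum that are left out here.

```python
import numpy as np

def student_t_similarities(Y):
    """Similarities between graph points, using a heavy-tailed Student-t curve."""
    sq = np.sum(Y ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Y @ Y.T, 0.0)
    inv = 1.0 / (1.0 + d2)
    np.fill_diagonal(inv, 0.0)
    return inv / inv.sum(), inv

def tsne_gradient(P, Y):
    """Gradient of the t-SNE cost with respect to the graph points Y."""
    Q, inv = student_t_similarities(Y)
    # P > Q: the points are too far apart on the graph, so they attract.
    # P < Q: the points are too close on the graph, so they repel.
    coeff = (P - Q) * inv
    # grad_i = 4 * sum_j coeff_ij * (y_i - y_j)
    return 4.0 * (coeff.sum(axis=1)[:, None] * Y - coeff @ Y)

# Hand-picked picture similarities: 0 and 1 look alike, 2 is the odd one out
P = np.array([
    [0.00, 0.40, 0.05],
    [0.40, 0.00, 0.05],
    [0.05, 0.05, 0.00],
])

# A bad starting layout: point 0 sits next to point 2, far from point 1
Y = np.array([
    [0.0, 0.0],
    [5.0, 0.0],
    [0.0, 1.0],
])

grad = tsne_gradient(P, Y)
Y_new = Y - 1.0 * grad  # one plain gradient-descent step

print(np.linalg.norm(Y_new[0] - Y_new[1]))  # distance between the similar pair
print(np.linalg.norm(Y_new[0] - Y_new[2]))  # distance between the dissimilar pair
```

After this one step, the two look-alike points have moved closer together and the odd one out has been pushed away, which is the tug-of-war the analogy describes.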

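t-SNE keeps nudging the points like this, a little at a time, until the graph settles down with similar pictures close together and dissimilar pictures in separate clumps. And voila! You've now got a neat visual representation of your cute animal pictures, grouped by how similar they look. Because the points start out in a partly random layout (that's the "stochastic" part), running t-SNE twice can give graphs that look a bit different.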
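In practice you rarely write the nudging loop yourself. Here is a small sketch using the TSNE class from scikit-learn, assuming that library is installed. The "puppies" and "kittens" are stand-ins: random numbers drawn around two different centers, not real pictures, and the perplexity value of 10 is just a reasonable guess for 60 points.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-in data: 30 "puppies" and 30 "kittens", each described by 10 numbers
puppies = rng.normal(loc=0.0, scale=1.0, size=(30, 10))
kittens = rng.normal(loc=8.0, scale=1.0, size=(30, 10))
pictures = np.vstack([puppies, kittens])

# perplexity must be smaller than the number of pictures;
# random_state fixes the random choices so reruns give the same graph
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(pictures)

print(embedding.shape)  # one (x, y) position per picture
```

Each row of embedding is the position of one picture on the graph, ready to be drawn as a scatter plot with whatever plotting tool you like.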