The Wasserstein metric measures how different two probability distributions are from one another. Imagine you have two piles of toys: one pile has lots of cars and the other has lots of dolls. As a first step, you could count how many cars and dolls are in each pile and compare the numbers. That tells you the piles are different, but not how much effort it would take to make them match, which is the question the Wasserstein metric answers.
The things being compared are distributions, which give a probability to each possible outcome. For example, if you rolled a die many times and recorded the results (how often a 1 was rolled, how often a 2 was rolled, and so on), you could build a distribution for that die. If you repeated this process with a second die, you would have two distributions to compare.
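The dice experiment above can be sketched in a few lines of Python. This is only an illustration: the function name `empirical_distribution`, the fixed seed, the number of rolls, and the "loaded" second die are all assumptions made for the example, not part of the original description.

```python
import random
from collections import Counter

def empirical_distribution(rolls, faces=range(1, 7)):
    """Return the fraction of rolls that landed on each face."""
    counts = Counter(rolls)
    total = len(rolls)
    return {face: counts[face] / total for face in faces}

rng = random.Random(0)  # fixed seed so the run is repeatable

# First die: every face is equally likely.
fair_rolls = [rng.randint(1, 6) for _ in range(10_000)]

# Second die (hypothetical): a 6 is three times as likely as any other face.
loaded_rolls = rng.choices(range(1, 7), weights=[1, 1, 1, 1, 1, 3], k=10_000)

fair_dist = empirical_distribution(fair_rolls)
loaded_dist = empirical_distribution(loaded_rolls)
```

Each result is a dictionary mapping the faces 1 to 6 to the share of rolls that showed that face, so the shares in each dictionary add up to 1.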
The Wasserstein metric calculates the minimum amount of "work" it would take to transform one distribution into the other. Imagine a set of toys spread out in piles across a room, and a picture showing how the piles should look instead. To match the picture, you would pick up toys and carry them from one spot to another. The work depends on two things: how many toys you move and how far you carry each one. The Wasserstein metric is the smallest total amount of work that makes the piles match the picture exactly.
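For outcomes that sit on a line, like the faces of a die, the minimum work can be computed by walking from left to right and keeping track of how much probability still has to be carried past each step. The sketch below assumes both distributions are given as lists over the same outcomes, that neighbouring outcomes are exactly 1 apart, and that each list sums to 1. The name `wasserstein_1d` is made up for this example.

```python
from itertools import accumulate

def wasserstein_1d(p, q):
    """Minimum work to turn distribution p into q.

    Assumes p and q list probabilities for the same outcomes,
    that neighbouring outcomes are 1 unit apart, and that each sums to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    # Surplus (or shortfall) of probability at each outcome.
    diffs = [a - b for a, b in zip(p, q)]
    # Running total = probability that must be carried past this outcome.
    carried = accumulate(diffs)
    # Each carried unit travels one step, so the work is the sum of the sizes.
    return sum(abs(c) for c in carried)

always_one = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # a die that always shows 1
always_two = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # a die that always shows 2
always_six = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]  # a die that always shows 6

near = wasserstein_1d(always_one, always_two)  # 1.0
far = wasserstein_1d(always_one, always_six)   # 5.0
```

In both comparisons all of the probability has to move, but moving it from 1 to 6 takes five times as much work as moving it from 1 to 2. That is the difference from simply comparing probabilities outcome by outcome: the distance travelled counts too.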
In summary, the Wasserstein metric measures the least amount of work needed to transform one probability distribution into another, counting both how much probability has to move and how far it has to go.