Imagine you have a bag full of different types of candies. You want to know how many red candies there are in the bag without taking them all out one by one. So, you decide to randomly pick ten candies from the bag and count how many of them are red.
However, when you look at the ten candies you picked, only one of them is red. With so few red candies in your handful, your estimate is shaky: picking just one more or one fewer red candy by chance would change your guess a lot.
This is where importance sampling comes in. Instead of picking evenly from the whole bag, you can pick more often from places where red candies are more likely to be. For example, you could look in the bag and see that there are more red candies on the top layer of the bag. So, you could take most of your ten candies from the top layer and count how many of them are red.
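This way, you are likely to pick more red candies. But because you deliberately looked where red candies are common, the raw count would overstate how many there are in the whole bag. To correct for this, you count each picked candy a little less the more your way of picking favored it, and a little more the less it was favored. With that correction, you get a better estimate of how many red candies there are in the bag. This is importance sampling: you sample from a distribution (one that favors the top layer) that is more likely to contain what you are looking for (red candies), then reweight the results to undo the bias.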
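The candy example can be sketched in Python. This is a minimal illustration under made-up assumptions, not a definitive implementation: the bag size, the number of red candies in each layer, the 50/50 split between layers (P_TOP), and the function names are all hypothetical. Candies are drawn with replacement to keep the arithmetic simple.

```python
import random
import statistics

# Hypothetical bag: 1000 candies, 50 of them red (5%).
# Top layer: 100 candies, 45 red. Rest of the bag: 900 candies, 5 red.
TOP = ["red"] * 45 + ["other"] * 55
REST = ["red"] * 5 + ["other"] * 895
BAG = TOP + REST
BAG_SIZE = len(BAG)

P_TOP = 0.5  # assumed proposal: chance that a draw comes from the top layer


def naive_estimate(rng, n=10):
    """Draw n candies uniformly from the whole bag and scale up the red count."""
    reds = sum(rng.choice(BAG) == "red" for _ in range(n))
    return BAG_SIZE * reds / n


def importance_estimate(rng, n=10):
    """Draw mostly from the top layer, then reweight each draw by p / q."""
    p = 1 / BAG_SIZE                        # chance of a given candy, uniform picking
    w_top = p / (P_TOP / len(TOP))          # weight for a top-layer candy (below 1)
    w_rest = p / ((1 - P_TOP) / len(REST))  # weight for any other candy (above 1)
    total = 0.0
    for _ in range(n):
        if rng.random() < P_TOP:
            candy, weight = rng.choice(TOP), w_top
        else:
            candy, weight = rng.choice(REST), w_rest
        if candy == "red":
            total += weight
    return BAG_SIZE * total / n


# Repeat each experiment many times to compare the spread of the two estimates.
rng = random.Random(0)
naive = [naive_estimate(rng) for _ in range(2000)]
weighted = [importance_estimate(rng) for _ in range(2000)]
print("naive:    mean", statistics.mean(naive), "stdev", statistics.stdev(naive))
print("weighted: mean", statistics.mean(weighted), "stdev", statistics.stdev(weighted))
```

With these made-up numbers, both estimates should average close to the true count of 50 red candies, but the reweighted one should typically be less spread out from one handful to the next. How much it helps depends on how well the way of picking matches where the red candies actually are.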
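In broader terms, importance sampling is a method for improving the accuracy of statistical simulations by drawing samples from a distribution that concentrates on the cases that matter most for the problem at hand, and then weighting each sample to account for how much more or less often it was drawn than it would have been under the original distribution. It is used in many fields, such as finance, physics, and engineering.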
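The same idea can be sketched for a simulation of a rare event. The setup here is an assumption chosen for illustration, not something from the example above: estimating the chance that a standard normal draw exceeds 4, using a normal distribution centered at 4 as the sampling distribution. The names THRESHOLD, naive_tail, and importance_tail are hypothetical.

```python
import math
import random

THRESHOLD = 4.0  # hypothetical rare event: a standard normal draw above 4


def naive_tail(rng, n):
    """Sample from N(0, 1) directly and count how often the event happens."""
    hits = sum(rng.gauss(0.0, 1.0) > THRESHOLD for _ in range(n))
    return hits / n


def importance_tail(rng, n):
    """Sample from N(THRESHOLD, 1) instead, and weight each hit by p(x) / q(x)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(THRESHOLD, 1.0)
        if x > THRESHOLD:
            # Density ratio of N(0, 1) to N(THRESHOLD, 1) evaluated at x.
            total += math.exp(-THRESHOLD * x + THRESHOLD ** 2 / 2)
    return total / n


rng = random.Random(0)
n = 20000
exact = 0.5 * math.erfc(THRESHOLD / math.sqrt(2))  # closed form, for comparison
print("exact:     ", exact)
print("naive:     ", naive_tail(rng, n))
print("importance:", importance_tail(rng, n))
```

Direct sampling sees the event so rarely that its estimate is often zero or far off at this sample size, while the reweighted estimate should usually land close to the exact value.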