I probably can't do better than Strassler's explanation, but here's what I usually tell people.
Particles are specific kinds of excitations of what we call fields. To create a particle, energy from some process is put into the field, and the particle then obeys certain rules, such as having a mass within a particular range.
However, energy can also pass through a field as an intermediary in a process. That is, the field can hang onto an amount of energy just to pass it on to a different field, but the intermediary doesn't have to obey the aforementioned kinds of rules. The intermediary is called a virtual particle in this case.
Virtual particles arise essentially from the way we calculate a prediction for the rate of some process, say, two electrons bouncing off each other. For reasons I won't get into here, we basically Taylor expand an exponential, which I'll write as e^S, where S represents all the interactions of the theory we are considering. So the rate of this process is calculated from something like Rate = 1 + S + S^2/2! + S^3/3! + ...
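If it helps to see that series in action, here is a minimal Python sketch. It assumes S is just a small ordinary number, which is only an analogy (in the real theory S is a much more complicated object), and the name truncated_exp and the value 0.1 are made up for illustration. It shows how stopping the series after a few powers of S already gets close to the full exponential when the interaction is weak.

```python
import math

def truncated_exp(s, order):
    """Sum the Taylor series of e^s up to and including the s^order term."""
    total = 0.0
    term = 1.0  # the n = 0 term, s^0 / 0!
    for n in range(order + 1):
        total += term
        term *= s / (n + 1)  # turn s^n / n! into s^(n+1) / (n+1)!
    return total

s = 0.1  # hypothetical stand-in for a weak interaction strength
for order in range(4):
    approx = truncated_exp(s, order)
    # Print the order, the truncated sum, and its distance from the exact e^s
    print(order, approx, abs(approx - math.exp(s)))
```

Each extra order adds one more power of S, and for a small S each new term matters less than the one before it, which is why keeping only the first few terms can be a reasonable approximation.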
Each power of S is a chance for the particles in question to exchange energy with another field, so more powers of S mean more complex interactions can happen. We add up all the possible ways we could have gotten from the initial state to the final state, through all the various routes that pass energy between fields like messengers, with a weight assigned to each possible way. For example, for electrons bouncing off each other, energy is exchanged in the following way: (electrons -> photons) x (photons -> electrons). The photon in the middle is never measurable, because it shows up only as a piece of the Taylor expansion we did for the purposes of the calculation. For this reason, I've heard some people describe virtual particles as nothing more than a math organization trick.
For more information on this man: https://youtu.be/jJPkLLwrzu8