Has your model seen humans in profile view? Has it seen armor? Has it seen Van Gogh-style paintings? If yes, then it can create a combination of those things.
For CSAM, it needs to know what porn looks like, what a child looks like, and what a naked pubescent body looks like in order to create it. It didn't make your Van Gogh painting from nothing; it had an idea of what those things were.
CSAM is in the training data. From a few months ago:
https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse