Some of it sounds OK, but most of it sounds like a series of plausible notes that don't actually go anywhere. Most generative models do fine at predicting the immediate next note, say, but they can't maintain macro-level structure unless you impose it as a constraint. What has worked best for me is generating music and then reinterpreting it with some manual post-processing to get it into a less jarring form.
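A toy sketch of that point, not how any real system works: a first-order Markov chain picks each note from the previous one only, so it wanders, while wrapping the same sampler in an AABA form forces repetition at the phrase level. The transition table, the function names, and the choice of AABA are all made up for illustration.

```python
import random

# Hypothetical toy transition table over note names (illustrative, not learned from data).
TRANSITIONS = {
    "C": ["D", "E", "G"],
    "D": ["C", "E", "F"],
    "E": ["D", "F", "G"],
    "F": ["E", "G", "A"],
    "G": ["C", "E", "A"],
    "A": ["G", "F", "B"],
    "B": ["C", "A"],
}

def generate_phrase(rng, start="C", length=8):
    """Sample notes one at a time; each note depends only on the previous one."""
    phrase = [start]
    while len(phrase) < length:
        phrase.append(rng.choice(TRANSITIONS[phrase[-1]]))
    return phrase

def generate_unconstrained(rng, phrases=4, length=8):
    """One long local walk: every step is plausible, but nothing ever repeats on purpose."""
    return generate_phrase(rng, length=phrases * length)

def generate_aaba(rng, length=8):
    """Impose macro structure as a hard constraint: sample two phrases, arrange as A A B A."""
    a = generate_phrase(rng, start="C", length=length)
    b = generate_phrase(rng, start="G", length=length)
    return a + a + b + a

rng = random.Random(0)  # fixed seed so runs are repeatable
wandering = generate_unconstrained(rng)
structured = generate_aaba(rng)
print(" ".join(wandering))
print(" ".join(structured))
```

Both outputs are locally identical in style; only the second has anything a listener could recognize as a returning theme, and that comes entirely from the hand-written constraint rather than from the sampler.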
A fairly recent project that is close to the state of the art: https://openai.com/blog/jukebox/
One of my own experiments: https://datasciencecastnet.home.blog/2021/05/13/whistlegen-g...