It's surely a difficult decision.

You either:

1.) train the model on public-domain or "free" content and get a mix of very old-fashioned writing and wildly varying quality, blended into an overall bad result,

or, you:

2.) train the model on copyrighted content and get "so-so-but-better" results (because the model still can't produce quality just by remixing quality), but then you can't release it, because you can't ensure that it won't reveal parts of its training content later on.

So... at most you can use it for PR and never open it up to anyone...

Yeah, really hard to guess what happened here.... /s