It kind of depends on your goal. In most cases, pregenerating your images is a much better option, especially if you want a decent resolution. Why do they have to be dynamic?
It would be a cool project, but you'd probably have to whitelist any context input.
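To make the whitelisting idea concrete, here's a minimal sketch in Python. Everything here (the allow-list contents, the fallback word, the prompt template) is a made-up assumption, not a real API; the point is just that free-form context never reaches the prompt directly.

```python
# Hypothetical sketch: sanitize free-form context against an allow-list
# before it ever reaches the image prompt. All names here are assumptions.
ALLOWED_CONTEXT = {"park", "nursery", "beach", "living room", "garden"}

def safe_context(user_context: str, fallback: str = "park") -> str:
    """Return the context word only if it is on the allow-list."""
    cleaned = user_context.strip().lower()
    return cleaned if cleaned in ALLOWED_CONTEXT else fallback

def build_prompt(user_context: str) -> str:
    """Build the image prompt from a whitelisted context word only."""
    return f"empty stage with {safe_context(user_context)} in the backdrop"
```

Anything off-list ("ignore previous instructions", random URLs, etc.) silently falls back to a safe default instead of being injected into the prompt.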
For dynamic images I would stick to an image generation model myself, for the best result with the least resources and complexity. I'd probably go for SDXL or an upscaled SD 1.5 model, because you need a fair amount of control to get this right.
You will have to craft a nice prompt, like "empty stage with [best suited whitelisted word from context] in the backdrop", guide it through a ControlNet to get somewhat consistent output, add one or two LoRAs for consistency in style (these can be randomized for more variation), and then just manually composite your product over the result with some outer glow.
Be aware though, image models really do not like to create images with "nothing" in the center. You may need some additional mask trickery to get it right.
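The compositing step at the end doesn't need an AI model at all. Here's a rough sketch with Pillow: paste a product cutout (an RGBA image with transparency) over a generated background, with a soft outer glow built by blurring the cutout's alpha channel. The image sizes, position, and glow radius are all placeholder assumptions.

```python
# Hypothetical compositing sketch using Pillow. The product is assumed
# to be an RGBA cutout; the glow is its alpha mask, blurred and tinted white.
from PIL import Image, ImageFilter

def composite_with_glow(background: Image.Image,
                        product: Image.Image,
                        position: tuple[int, int],
                        glow_radius: int = 12) -> Image.Image:
    canvas = background.convert("RGBA")
    # Build a white glow layer from the product's alpha mask, blurred outward.
    alpha = product.getchannel("A")
    glow = Image.new("RGBA", product.size, (255, 255, 255, 0))
    glow.putalpha(alpha.filter(ImageFilter.GaussianBlur(glow_radius)))
    canvas.alpha_composite(glow, dest=position)
    # Paste the product itself on top of its glow.
    canvas.alpha_composite(product, dest=position)
    return canvas

# Stand-ins: a flat background and a solid square instead of real images.
bg = Image.new("RGBA", (512, 512), (40, 40, 60, 255))
cart = Image.new("RGBA", (128, 128), (200, 30, 30, 255))
poster = composite_with_glow(bg, cart, (192, 192))
```

Note the glow is clipped to the product's bounding box here; for a glow that extends past the cutout's edges you'd paste the product onto a padded transparent canvas first.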
I want to generate a background image using an image generation AI model. My goal is to place my product into this AI-generated background.
My ultimate objective is to create social media posters promoting my product. For example, I have a product called "Baby Cart" and I want to place that baby cart onto the AI-generated background.
Product Image URL:
Output I need: https://namesakehome.com/cdn/shop/products/qhe5pvvn7q5rlfwyg...
Please let me know which tools and models would help me achieve this.