And stable-diffusion-webui before that, among others, yes.
When googling, txt2img and img2img (or txt2video and img2video for video) are useful terms, since each one names the whole use case in a single word. One could search for "img2video comfyui workflows", for example.
I thought it would help the conversation to provide these terms, since they had not been mentioned earlier in the thread.
Great posts! So far [2] is the only "claw" that has caught my interest, mostly because it isn't trying to do everything itself in some bespoke, NIH way.
I think about it like a series of waves in a pool. One end has wave generators (the lasers) spaced appropriately such that resulting waves hitting the other end interfere just right and create a unified wavefront (same phase, amplitude, frequency).
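To put rough numbers on the pool analogy, here is a minimal sketch that treats each wave generator as a unit-amplitude phasor and sums them at the far end. The function name `combined_intensity` and the emitter count are made up for illustration, and the model ignores spacing, distance and amplitude differences entirely; it only shows why matching phase matters.

```python
import cmath
import random


def combined_intensity(phases):
    """Intensity of the summed field for unit-amplitude emitters.

    `phases` holds one phase (in radians) per emitter, as seen at the far end.
    """
    field = sum(cmath.exp(1j * p) for p in phases)
    return abs(field) ** 2


n = 8  # illustrative number of emitters, not taken from any real system

# All emitters arrive in phase: the amplitudes add, so intensity scales as n**2.
aligned = [0.0] * n

# Emitters arrive with unrelated phases: the contributions partly cancel.
rng = random.Random(0)
scrambled = [rng.uniform(0.0, 2.0 * cmath.pi) for _ in range(n)]

print("aligned:  ", combined_intensity(aligned))
print("scrambled:", combined_intensity(scrambled))
```

With aligned phases the intensity grows with the square of the number of emitters, while with unrelated phases it only grows roughly linearly on average, which is the gap the "interfere just right" part is buying.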
They never parsed your prompt. The magic word reduces the probability that the token corresponding to the end of chain-of-thought will be emitted, which increases test-time compute.
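A toy illustration of that mechanism, with everything in it assumed: the three-token vocabulary, the logit values, and the `</think>` string standing in for whatever end-of-reasoning token a given model uses. It also pretends the stop probability is the same at every step, which real models do not do; the point is only that a lower stop probability means more tokens before the reasoning ends.

```python
import math


def softmax(logits):
    """Turn a dict of token -> logit into a dict of token -> probability."""
    peak = max(logits.values())
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def expected_steps(p_stop):
    """Mean number of steps until stopping if p_stop were constant (geometric)."""
    return 1.0 / p_stop


# Hypothetical next-token logits; "</think>" is a placeholder end-of-reasoning token.
base = {"</think>": 2.0, "so": 1.5, "wait": 1.0}

# Assumed effect of the prompt cue: the end-of-reasoning logit is pushed down.
nudged = dict(base)
nudged["</think>"] = 0.0

p_base = softmax(base)["</think>"]
p_nudged = softmax(nudged)["</think>"]

print(f"stop probability: {p_base:.3f} -> {p_nudged:.3f}")
print(f"expected steps:   {expected_steps(p_base):.1f} -> {expected_steps(p_nudged):.1f}")
```

Nothing here interprets the cue as an instruction; it just shifts one number in the distribution, and the longer reasoning falls out of that.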