
They have pretty good documentation too[1]. And it looks like we have day 1 support for all the major inference stacks, plus so many size choices. Quants are also up, because they have already worked with many community quant makers.

Not even going into performance yet, need to test first. But what a stellar release, for the attention to all these peripheral details alone. This should be the standard for major releases, instead of whatever Meta was doing with Llama 4 (hope Meta can surprise us at LlamaCon tomorrow, though).

[1] https://qwen.readthedocs.io/en/latest/



Second this. They patched all the major LLM frameworks (llama.cpp, transformers, vllm, sglang, ollama, etc.) for Qwen3 support weeks in advance, then released model weights everywhere at around the same time. Like a global movie release. Can't overstate this level of detail and effort.
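
To make that concrete, here is a minimal sketch of what day 1 transformers support means in practice: a Qwen3 checkpoint loads with the stock library, no patched fork needed. The repo id and size below are just examples, swap in whichever release you want.

    # Minimal sketch: loading a Qwen3 checkpoint with stock transformers.
    # The repo id/size is illustrative.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen3-8B"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )

    messages = [{"role": "user", "content": "Say hello in one sentence."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))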


Alibaba, I have a huge favor to ask if you're listening. You guys very obviously care about the community.

We need an answer to gpt-image-1. Can you please pair Qwen with Wan? That would literally change the art world forever.

gpt-image-1 is an almost wholesale replacement for ComfyUI and SD/Flux ControlNets. I can't overstate how big a deal it is. As such, OpenAI has leapt ahead and threatens to capture more of the market for AI images and video. The expense of designing and training a multimodal model presents challenges for the open source community, and it's unlikely that Black Forest Labs or an open effort can do it. It's really a place where only Alibaba can shine.

If we get an open weights multimodal image gen model that we can fine tune, then it's game over - open models will be 100% the future. If not, then the giants are going to start controlling media creation. It'll be the domain of OpenAI and Google alone. Firing a salvo here will keep media creation highly competitive.

So please, pretty please work on an LLM/Diffusion multimodal image gen model. It would change the world instantly.

And keep up the great work with Wan Video! It's easily going to surpass Kling and Veo. The controllability is already well worth the tradeoffs.


I don't know. AI image quality has gotten good, but it's still slop. We are forgetting what makes art, well, art.

I am not even an artist, but I see people using AI for photos, and the results were so horrendous pre ChatGPT image gen that I literally told one person: if you are going to use AI images, you might as well use ChatGPT for it.

That said, I would also like to get something like ChatGPT's image generation quality from an open source model. Though I think what we are really asking for is free labour from the Alibaba team.

We want them (or anyone) to create an open source tool that anyone can use, thus reducing OpenAI's monopoly. But that is not what most people are actually wishing for: they are wishing for this to drive prices down, so they can run it on their own hardware for very little cost, or through providers on OpenRouter and the like, for cheap image generation with good quality.

Earlier, people paid artists. Then people started using stock photos. Then AI image gen came, and now that ChatGPT has made AI images pretty good, people don't even want to pay ChatGPT that much money; they want to use it for literal cents.

Not sure how long this trend will continue. When DeepSeek R1 launched, I remember people being happy that it was open source, but 99% of people couldn't self-host it (I can't either, given its hardware requirements) and were still using the API. Yet just because it was open source, it pushed the price way down and forced others to cut theirs as well, a real cultural pricing shift in AI.

We are in this really weird spot as humans. We want to earn a lot of money, yet we don't want to pay anybody; we want free labour from open source. That disincentivizes open source, because now people see it as free labour, and they might be right.


On the other hand, ChatGPT image generation is a lot of fun to use. I'd never pay a human artist to make the meme tier images I use it for.


Even Katy Perry started using AI for her tour backdrop visuals and it looks... well, horrendous https://twitter.com/bklynb4by/status/1915514396421337171


> That would literally change the art world forever.

In what world? Some small percentage improvement, or who knows, and _that_ revolutionizes art? Not a few years ago, when all this started, but now, with this.

Wow.


It's pretty much expected that everything is "world shaking" in the modern tech world. Whether it's true or not is a different thing every time. I'm fairly certain even the 4o image gen model has shown weaknesses that other approaches didn't, but you know, newer means absolutely better and will change the world.


Forever, as in for a few weeks… ;-)


Oh boy, I had a smirk after reading this comment, because it's partially true.

When DeepSeek R1 came, it lit the markets on fire (at least the American ones), and many thought it would be the best forever, or at least for a long time.

Then came Grok 3, then Claude 3.7, then Gemini 2.5 Pro.

Now people comment that Gemini 2.5 Pro is going to stay on top forever. When DeepSeek came, there were articles on HN like: "Of course, open source is the future of AI." When Gemini 2.5 Pro came, there were articles like: "Of course, Google builds its own TPUs, and they have DeepMind, which specialized in reinforcement learning; of course they were going to get to the top."

We as humans are just trying to justify why a certain company built something more powerful than the others. But the fact is that AI is still a black box. People were literally saying, for Llama 4:

"I think llama 4 is going to be the best open source model, Zuck doesn't like to lose"

Nothing is forever; it's all opinions and current benchmarks. We want the thing that's best on the benchmarks, then we want an even better thing, and we justify why and how that better thing was built.

Every time I saw a new model rise, people said it would last forever.

And every time, something new beat it, and people forgot the last time somebody said "forever."

So yeah: DeepSeek R1 -> Grok 3 -> Claude 3.7 -> Gemini 2.5 Pro (current state of the art?), each transition just a few weeks apart, IIRC.

Your comment states a fact that people in AI keep forgetting.


> they have already worked with many community quant makers

I’m curious, who are the community quant makers?


I had Unsloth[1] and Bartowski[2] in mind. Both said on Reddit that Qwen had allowed them access to weights before release to ensure smooth sailing.

[1] https://huggingface.co/unsloth

[2] https://huggingface.co/bartowski
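
For anyone curious what those quants buy you, here's a sketch of running one of the community GGUF files locally via llama-cpp-python. The file name is illustrative; download an actual GGUF from either repo above.

    # Sketch: running a community GGUF quant with llama-cpp-python.
    # model_path is illustrative -- grab a real file from the Unsloth
    # or Bartowski repos linked above.
    from llama_cpp import Llama

    llm = Llama(model_path="Qwen3-8B-Q4_K_M.gguf", n_ctx=8192)

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Hello, who are you?"}],
        max_tokens=64,
    )
    print(out["choices"][0]["message"]["content"])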



nvm


I understand the context, I’m asking for names.



This cannot be stressed enough.


Well, the link to huggingface is broken at the moment.


It's up now: https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2...

The space loads eventually as well; might just be that HF is under a lot of load.


Yep, there now. Do wish they included ONNX though.
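
If you want ONNX in the meantime, Hugging Face's optimum exporter can in principle produce one from the released weights; whether it supports the Qwen3 architecture yet is unverified, so treat this as a sketch under that assumption (the repo id is also just an example).

    # Sketch: exporting a small checkpoint to ONNX with optimum.
    # Assumes optimum[onnxruntime] is installed and that the exporter
    # supports the Qwen3 architecture -- unverified at release time.
    from optimum.onnxruntime import ORTModelForCausalLM

    model_id = "Qwen/Qwen3-0.6B"  # illustrative repo id
    model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
    model.save_pretrained("qwen3-0.6b-onnx")  # writes model.onnx + config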


Thank you!!



