Nyx.gallery – AI-generated photography (nyx.gallery)
56 points by MasterScrat on Oct 12, 2022 | hide | past | favorite | 28 comments


Authors here!

Stable Diffusion and DALL·E have made it simpler to create new images. But getting photorealistic results is still a challenge…

This is a continuation of our work from 3 months ago on “This Food Does Not Exist” (https://news.ycombinator.com/item?id=32167704). We are now using both Stable Diffusion and GANs depending on the subjects we want to render.


These are AI-generated 'photorealistic' images, not 'AI-generated photography'. Photography is a particular process you are not involved in during the course of these AI activities, as far as I can tell. If you somehow had machine learning algorithms operating cameras, the term would be correct.


> These are AI-generated 'photorealistic' images, not 'AI-generated photography'.

Amen, brother. Also, "fauxtography" was RIGHT THERE.


I will repeat the word "fauxtography" whenever I have an opportunity.


Wow. Mind blown faux-sho.


You’re right; then again, my money’s on “ai photo” becoming common enough to acquire meaning on its own. We already have a glow-worm (not a worm), funny bone (not a bone), Baby Yoda (not the Yoda), and many more.


That other people are incorrect is no reason to volunteer to be incorrect too. In fact, I'd say it's the opposite of good to use the wrong terms when you know better.

I suppose 'AI Generated Photography' sounds better than 'yet another stable diffusion model', but one is correct and the other isn't. This has nothing to do with photography.


"ai fauxto" (or "phauxto"?)


With lowercase “ai fauxto” it’s almost like iPhoto [1]. The name is now available I guess…

[1]: https://en.m.wikipedia.org/wiki/IPhoto


These are photos of Latent Space taken with an Autoencoder Decoder camera :p


Models for the 256px food images were previously released here:

https://github.com/nyx-ai/stylegan2-flax-tpu


> But getting photorealistic results is still a challenge…

Understatement of the year. I wonder if/when we'll get there.


I was expecting to also see some photographs of traffic lights.


I see an uncharacteristically high number of malformed images. For example,

Six legged gecko: https://nyx.gallery/cdn-cgi/imagedelivery/16bz3hOZjq1MqWAKnm...

Dog with warped face: https://nyx.gallery/cdn-cgi/imagedelivery/16bz3hOZjq1MqWAKnm...

Bizarrely proportioned lion: https://nyx.gallery/cdn-cgi/imagedelivery/16bz3hOZjq1MqWAKnm...

Rabbit with fur and whisker artifacts, misplaced hind leg, and weird front paw: https://nyx.gallery/cdn-cgi/imagedelivery/16bz3hOZjq1MqWAKnm...


I liked a few weirdo pictures:

- Weird-grown broccoli (it should be more tree-like, not have stems going every which way): https://nyx.gallery/#c590eb57ef4e33ef93c147b1b705feee

- I don't know what kind of fish this weird-ass sushi is, but I wouldn't eat it: https://nyx.gallery/#8a77524d034b2733a34ae4aabe8a97f0

- A very unstable glass: https://nyx.gallery/#2dd470a9f3ab82f04c4a4391bb06df53

- If I ordered a "hot dog with catsup and mustard" and got this, with jalapenos, I would return it to the kitchen: https://nyx.gallery/#ab5a523bcc5da8689ad945af775f662b

I make fun, but this type of tech is pretty amazing to me.


Perhaps the ‘natural’ gecko is malformed, and the AI mind is showing us what they ought to look like.


I find Stable Diffusion is pretty good at generating single-subject images like this. It's really mind-blowing, and the novelty hasn't worn off for me yet.

But after dozens of attempts I still haven't managed to get it to show me a photograph of a duck eating a hoagie at Niagara Falls. I think it would be really interesting to try to find the simplest query that these tools cannot produce.


Because of too much use of proper nouns?


I had considered this; I'm not sure it's the issue, since Niagara Falls is pretty iconic. I'll give it a shot with surrogates in place of Niagara Falls and see what happens. The bigger difficulty, it seems, is getting a duck to eat a hoagie.

I can get a hoagie at Niagara Falls, I can get a duck at Niagara Falls, I can even get a hoagie and a duck together near some water, I can almost get a duck eating a hoagie (I've gotten the duck near the hoagie with its mouth open), but I can't get a duck eating a hoagie at Niagara Falls.

Update: Stable Diffusion 1.5 is getting there. I have a duck, hoagie, and Niagara Falls all together, and I think this picture might be a success: https://i.ibb.co/Mhrv0sD/1894667935-A-duck-eating-a-hoagie-a...
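The "find the simplest query these tools cannot produce" experiment above boils down to ablating one concept at a time (duck alone, hoagie alone, duck + hoagie, and so on). A minimal harness for enumerating those sub-prompts might look like this; the concept list and the space-joined prompt format are illustrative assumptions, not anything the commenter actually ran:

```python
from itertools import combinations

# Hypothetical concept decomposition of the full prompt.
CONCEPTS = ["a duck", "eating a hoagie", "at Niagara Falls"]

def sub_prompts(concepts):
    """Yield every non-empty combination of concepts, simplest first,
    so a generator can be probed on each piece in isolation."""
    for r in range(1, len(concepts) + 1):
        for combo in combinations(concepts, r):
            yield " ".join(combo)

prompts = list(sub_prompts(CONCEPTS))
# 3 single-concept + 3 pairwise + 1 full prompt = 7 variants to try
```

Feeding each variant to the same model with a fixed seed would show exactly which combination of concepts is where compositionality breaks down.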


So at this point, are the stock photo companies more or less dead, or do they still have a business?


The stock photo companies will simply pivot to AI; see companies like StockAI popping up [0]. Getty should also get into this business, but it seems they're going in the opposite direction, banning AI-generated images.

[0] https://www.stockai.com/


>> From FAQs: “we are using both diffusion models and GANs in combination with an extensive filtering and quality assessment pipeline that allows us to generate photorealistic images at scale.”

For every image that reaches the site, how many were generated but filtered out by the pipeline? For example, for every photo that reaches the site, were 1,000 generated that did not pass the quality assessment?


It really depends on the subjects. For example, cookies are quite forgiving – up to 30% of results are good. For things like coral or sushi we keep less than 1 in 10 (which is already much better than where we were at a few months ago!)

For now we still keep a close manual eye on the output, but the goal as we scale up is to fully automate this selection process. Right now we have a pipeline that ranks the outputs, and we select from the top results.
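The generate-rank-keep loop described here can be sketched roughly as follows. Note that `quality_score` is a hypothetical stand-in for whatever learned quality model the authors actually use (an aesthetic predictor, CLIP similarity to the prompt, an artifact detector, etc.), and the dict-based "images" are toy placeholders:

```python
def quality_score(image) -> float:
    # Placeholder scoring function; a real pipeline would run a
    # learned quality/aesthetic model here.
    return image["sharpness"]  # hypothetical field on a generated image

def select_best(candidates, keep_ratio=0.1):
    """Rank generated candidates and keep only the top fraction."""
    ranked = sorted(candidates, key=quality_score, reverse=True)
    keep = max(1, int(len(ranked) * keep_ratio))
    return ranked[:keep]

# Toy batch of 10 "generated images" with increasing quality.
batch = [{"id": i, "sharpness": i / 10} for i in range(10)]
best = select_best(batch, keep_ratio=0.1)  # keeps 1 in 10, as for sushi/coral
```

The manual step the authors mention then reduces to a human picking from `best` rather than from the full batch.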


As a meatspace photographer, I take some comfort that the photograph in column 3, row 25 has an issue. The mountain peak has abundant snow. Its reflection in the lake doesn't. There are similar snow reflection disparities in several of the mountains-reflected-in-water pix.


Reflections seem to be something these models struggle with. There are more examples where the reflection is not quite right:

      /\_
     /   \
    --------
     \  /
      \/


Current AI frequently fails with any hard expectation of structure, like when it tries to generate human bodies and doesn't grasp at all how bones work, bending limbs in unreal ways or adding extra ones, or creating heads with wrong proportions/spacing. It is pretty good at fading textures/colors, but it has little grasp of structure and "the big picture".


One thing I think would be super fascinating to have is a system that can take AI image training sets and reverse engineer which pictures were used to make the output. Take the bunny. I bet there were a lot of bunny pictures in the training set that looked very similar to the generated one. It would be interesting to have a system pick the one that is closest and display it next to it. It would show how original (or unoriginal) these images are.
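One common way to approximate this "closest training image" idea is nearest-neighbor search in an embedding space: embed every training image (e.g. with CLIP; the embedding step itself is not shown), then find the training embedding with the highest cosine similarity to the generated image's embedding. A minimal sketch with toy stand-in vectors:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_training_image(generated_vec, training_set):
    """Return the training entry most similar to the generated output."""
    return max(training_set, key=lambda item: cosine(item["vec"], generated_vec))

# Toy 3-dimensional "embeddings"; real ones would be 512+ dimensions.
training = [
    {"id": "bunny_001", "vec": [0.9, 0.1, 0.0]},
    {"id": "bunny_002", "vec": [0.1, 0.9, 0.2]},
]
match = nearest_training_image([0.8, 0.2, 0.1], training)
```

Whether the nearest neighbor actually looks similar to a human is a separate question: high embedding similarity measures shared semantics, not copied pixels.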


It's interesting to see what types of features these models don't distinguish well. For example, I've noticed that a lot of models have trouble giving ladybugs distinct spots. Instead, they usually end up with a big black splotch.



