Hacker News

I just can't understand how it manages to make this image http://www.pics.rs/i/2cuoM look like this image http://www.pics.rs/i/IEz0N, especially the last line of letters.


I guess the key reason why this works so well is that the lower resolution has been calculated from the higher resolution. So you have detailed information on how exactly the downscaling was calculated, which enables you to "cheat".

If the image had been downscaled by a completely different method, it probably wouldn't work out as nicely.

The ultimate test would be two different photographs taken at the same time - one with high and one with low resolution. Almost certainly this wouldn't work out as nicely. There is a huge difference between (a) having a low-res image due to physical effects, and (b) having a low-res image calculated by a known algorithm where all parameters are also known.


This is plain wrong. As others have pointed out, it is independent of the scaling method - it just looks for similar patches (at different sizes, mirror images, and perhaps even rotations - I don't remember) on the same image to find the best one that matches, and uses that information.

In this specific case, you have all the letters there to match against, so there's no surprise.

Note that it uses the entire image to look for matches. If you tried to enlarge only the last line, it would probably NOT look as good.
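A toy sketch of that patch search, in case it helps: exhaustive sum-of-squared-differences matching that also tries the mirrored patch. All names here are made up, and the real method searches across scales with much better machinery; this is only the skeleton.

```python
import numpy as np

def best_match(image, patch):
    """Find the window of `image` most similar to `patch`, also trying
    the horizontally mirrored patch.  Similarity = sum of squared
    differences (SSD); lower is better."""
    ph, pw = patch.shape
    best_score, best_pos, best_mirrored = np.inf, None, False
    for mirrored, cand in ((False, patch), (True, patch[:, ::-1])):
        for y in range(image.shape[0] - ph + 1):
            for x in range(image.shape[1] - pw + 1):
                ssd = np.sum((image[y:y+ph, x:x+pw] - cand) ** 2)
                if ssd < best_score:
                    best_score, best_pos, best_mirrored = ssd, (y, x), mirrored
    return best_score, best_pos, best_mirrored

# 8x8 toy image whose only structure is a mirrored copy of the query patch.
img = np.zeros((8, 8))
motif = np.arange(1.0, 10.0).reshape(3, 3)
img[5:8, 5:8] = motif[:, ::-1]

score, pos, mirrored = best_match(img, motif)
print(score, pos, mirrored)   # exact hit at (5, 5), found via the mirror
```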

There's a good reason that it works: many pictures include similar elements at different scales, which lets you infer things from one scale to the other. In fact, in the eighties there was a lot of hype about "fractal compression" based on the same principle; see e.g. http://www.cs.northwestern.edu/~agupta/_projects/image_proce... In the end it couldn't improve on JPEG and has been essentially forgotten - but it did match the JPEG coders of the time in terms of compression rate (while doing much worse on speed and memory requirements), and it could decompress pictures to a much larger geometry than the original while still looking good - technically, very similar to what is described in this paper. Everything old is new again.


Fractal compression also stalled because Iterated Systems held many patents on the technology and licensed it on terms too expensive for hobbyists. They should have released a low-end version as freeware/open-source and concentrated on the commercial licenses / support (similar to SQLite).


>This is plain wrong. As others have pointed out, it is independent of the scaling method - it just looks for similar patches (at different sizes, mirror images, and perhaps even rotations - I don't remember) on the same image to find the best one that matches, and uses that information.

Which is much harder to do when the scaling method isn't known.


It doesn't even have to be scaling. It will work just as well copying only from similar images (e.g., if this was a frame from a movie, they could use nearby frames to the same effect; or if it's a picture taken of the same thing from a different angle).


Very true, but in principle it's possible to completely characterize the downscaling process effected by a physical camera.


Good point. In this case it is essentially the same as using a known algorithm with unknown parameters.

Either way, this increases the space of possible upscaled images dramatically, which makes it a lot harder (if not impossible) to produce results of the demonstrated quality.


There is some discussion of this from 2012, when this was last posted. It looks to be guessing the letters (sometimes incorrectly) based on the larger ones present in the image: https://news.ycombinator.com/item?id=4241978


Thanks, I was about to dig that up. Neat to see this again. (I should check out their more recent research now.)


I think the idea is that an image has a certain amount of self-similarity. I.e. skin looks like skin. So the algorithm learns what skin is supposed to look like by looking at all the skin. So every wrinkle is improved by looking at every other wrinkle, as opposed to just the pixels in its immediate vicinity.

Now replace "wrinkle" with "patch" and sprinkle some crazy stats and machine learning to get the idea to work.
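Minus the crazy stats and machine learning, that idea can be sketched like this: build low-res/high-res correspondences from the image's own downscaled copy, then reuse the high-res blocks wherever the low-res side matches. Everything below is hypothetical (it matches single pixels instead of patches, purely for brevity); real methods match patches and blend overlapping candidates.

```python
import numpy as np

def downscale2(img):
    # 2x2 block averaging: a crude stand-in for whatever produced the low-res image.
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale2(low, example_high):
    """Example-based 2x upscale: each pixel of `low` is matched against the
    downscaled `example_high`, and the winning 2x2 high-res block is pasted in."""
    ex_low = downscale2(example_high)
    flat = ex_low.ravel()
    out = np.zeros((low.shape[0] * 2, low.shape[1] * 2))
    for y in range(low.shape[0]):
        for x in range(low.shape[1]):
            idx = np.argmin(np.abs(flat - low[y, x]))
            ey, ex = divmod(idx, ex_low.shape[1])
            out[2*y:2*y+2, 2*x:2*x+2] = example_high[2*ey:2*ey+2, 2*ex:2*ex+2]
    return out

# The example image is the image itself - self-similarity at work.
high = np.kron(np.eye(4), np.ones((2, 2)))   # 8x8 blocky diagonal
low = downscale2(high)                        # its 4x4 version
restored = upscale2(low, high)
print(np.array_equal(restored, high))         # prints True
```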


As far as I understand it, the letters on the bottom row are rebuilt from patterns found in other parts of the image - a few of them are false positives; see the original chart https://joyerickson.files.wordpress.com/2010/07/eye-chart.jp...


The general principle seems to be trading being at distance X from the truth with probability 1 for being at distance X/10 with probability 0.99 and at distance 10*X with probability 0.01.
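Plugging in numbers (taking X = 1; the probabilities are the hypothetical ones from the comment above, not from the paper):

```python
# Expected distance from the truth under each strategy, with X = 1.
X = 1.0
always = 1.0 * X                            # distance X with probability 1
gamble = 0.99 * (X / 10) + 0.01 * (10 * X)  # 0.099 + 0.1 = 0.199
print(always, gamble)                       # the gamble is ~5x closer on average
```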

This is consistent with another reply, which states that it guesses the true letter (sometimes incorrectly).



