
From the article:

   Since the third party service conducted rate-limiting based on IP
   address (stated in their docs), my solution was to put the code that
   hit their service into some client-side Javascript, and then send
   the results back to my server from each of the clients.

   This way, the requests would appear to come from thousands of
   different places, since each client would presumably have their own
   unique IP address, and none of them would individually be going over
   the rate limit.
Pretty sure the browser's Same Origin Policy forbids this. Think about it: if this worked, you'd be able to scrape inside corporate firewalls simply by having users visit your website from behind the firewall.


> Since the third party service conducted rate-limiting based on IP

By the way, that's one of my projects. You can use a basic Fibonacci-related algorithm to figure out (in a minimal number of requests) exactly what the rate limit is. This way, you can scrape at just under the limit. I am still working on the core library though. :|
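
The comment above does not spell out the algorithm, so the following is only one plausible reading, not the commenter's library: grow the probe rate along the Fibonacci sequence until the service rejects it, then binary-search between the last accepted and first rejected rates. The `accepts` callback and the `ceiling` parameter are hypothetical names; `accepts` stands in for "send requests at this rate and report whether they all got through", and it is assumed to be monotonic (accepted up to the limit, rejected above it). No claim is made that this uses the fewest possible probes.

```python
from typing import Callable


def find_rate_limit(accepts: Callable[[int], bool], ceiling: int = 1_000_000) -> int:
    """Return the largest rate in [0, ceiling] for which accepts(rate) is True.

    Assumes accepts is monotonic: True up to the real limit, False above it.
    """
    # Phase 1: probe at Fibonacci-spaced rates (1, 2, 3, 5, 8, ...)
    # until one is rejected or we pass the ceiling.
    lo = 0          # highest rate known to be accepted (0 = nothing confirmed yet)
    a, b = 1, 2     # current probe and the next Fibonacci step
    while a <= ceiling and accepts(a):
        lo = a
        a, b = b, a + b

    # Lowest rate known to be rejected, or ceiling + 1 if we never saw a rejection.
    hi = min(a, ceiling + 1)

    # Phase 2: binary search in the gap between lo (accepted) and hi (rejected).
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if accepts(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With a stand-in such as `find_rate_limit(lambda rate: rate <= 60)`, the search probes upward through the Fibonacci rates and then narrows in on 60. Against a real service each probe would cost at least one rate-limit window, and a rejected probe may trigger a temporary ban, so the callback would need to handle backoff on its own.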


Sounds pretty interesting! Be sure to share it when it's ready.


That's a great point. For most web services, this request would be blocked at the browser level by the Same Origin Policy. Fortunately for me, this site allowed client-side calls by returning an Access-Control-Allow-Origin: * header[1], which is specifically designed to allow this type of cross-domain access.
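
To make the header's role concrete, here is a simplified sketch of the check a browser applies before letting a page's script read a cross-origin response. The function name `cors_allows_read` and its parameters are made up for illustration, and the sketch ignores preflight requests and the other CORS headers.

```python
from typing import Mapping


def cors_allows_read(
    page_origin: str,
    response_headers: Mapping[str, str],
    with_credentials: bool = False,
) -> bool:
    """Simplified version of the browser's CORS check for a cross-origin response."""
    # Header names are case-insensitive, so normalize them first.
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}

    allow_origin = headers.get("access-control-allow-origin")
    if allow_origin is None:
        # No header at all: the Same Origin Policy applies and the read is denied.
        return False

    if not with_credentials:
        # Without cookies or HTTP auth, "*" or an exact origin match is enough.
        return allow_origin == "*" or allow_origin == page_origin

    # With credentials, the wildcard is not accepted: the server must echo the
    # exact origin and also send Access-Control-Allow-Credentials: true.
    return (
        allow_origin == page_origin
        and headers.get("access-control-allow-credentials") == "true"
    )
```

This also speaks to the firewall concern raised earlier in the thread: an internal server that never sends Access-Control-Allow-Origin fails this check, so a page from another origin cannot read its responses.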

[1]: http://en.wikipedia.org/wiki/Same_origin_policy#Cross-Origin...



