Hacker News | sapski's comments

Author of the underlying study here.

Facebook has been insisting that non-discrimination should be the responsibility of the purchasers, but we've shown over [0] and over [1] again [2] that even when the advertiser targets all groups proportionally (no misuse of advertising options), Facebook subselects who to show their ads to in a skewed way, leaving the advertiser and the users no recourse.

[0] https://arxiv.org/pdf/1904.02095.pdf

[1] https://arxiv.org/pdf/1912.04255.pdf

[2] https://sapiezynski.com/papers/sapiezynski2019algorithms.pdf


I just read half of the last paper (and skimmed the other half). It is surprising that the paper does not use the term "information theory" even once. The research basically bumps into the established fact that properly hiding or destroying information is really hard.

The advertising industry has known this since ancient times. Military signals intelligence has essentially been built with that tidbit at its core. Even weak proxies are formidable if you have bucketloads of them to choose from and combine.

Now, let's make one thing clear: I am not a FB apologist. In fact, I find the modern advertising systems abhorrent, immoral and outright vile. But even then, this article felt like the authors chose to miss the point. Hiding information is incredibly difficult - and conflating "information theory is damn hard" with "FB[ß] are evil and/or immoral" feels intellectually dishonest.

For what it's worth, I would actually love to see research _and_ well-sourced articles about the practical net effects of information-theoretic attacks, intentional or not, on human populations as observed through the various e-stalking platforms.

ß: I'm using "FB" here as a shortcut for FANG+MS+others.
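
A rough illustration of the "weak proxies" point above, not something taken from the paper. It assumes every proxy is an independent binary signal that guesses a hidden attribute correctly 55% of the time, and asks how often a simple majority vote over many such proxies is right. The function name and the 55% figure are made up for the example, and real proxies are correlated, so treat the numbers as an upper-bound intuition only.

```python
import math


def majority_vote_accuracy(n_proxies: int, p_correct: float) -> float:
    """Probability that a majority of n independent binary proxies is right.

    Assumption (hypothetical setup): each proxy is correct with probability
    p_correct (strictly between 0 and 1), independently of the others.
    """
    if n_proxies % 2 == 0:
        raise ValueError("use an odd number of proxies to avoid ties")
    total = 0.0
    # Sum the binomial probabilities of getting more than half the votes right.
    for k in range(n_proxies // 2 + 1, n_proxies + 1):
        # Work in log space so large binomial coefficients do not overflow.
        log_term = (
            math.lgamma(n_proxies + 1)
            - math.lgamma(k + 1)
            - math.lgamma(n_proxies - k + 1)
            + k * math.log(p_correct)
            + (n_proxies - k) * math.log(1.0 - p_correct)
        )
        total += math.exp(log_term)
    return total


if __name__ == "__main__":
    for n in (1, 11, 101, 1001):
        print(n, round(majority_vote_accuracy(n, 0.55), 4))
```

Under these assumptions a single proxy is barely better than a coin flip, while a thousand of them combined are right almost every time.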



Not true, you could only search for accounts using phone numbers people had entered themselves


But your phone number would still be in Facebook's database...


This article is not about Facebook's database; it's about phone numbers scraped through search, which ended up in a third-party database.


"numbers scraped" from Facebook's database. Of course this is about Facebook's database


Your feeling is correct. It's been shown by researchers (incl. me) and confirmed by Facebook that they're doing that: https://gizmodo.com/facebook-is-giving-advertisers-access-to...


I’ve confirmed this same type of behavior in several of Google’s products as well, as part of an experiment a couple of friends and I ran several months back, using fake personas, to see how feasible it’d be for one to simply exist w/o creating a digital footprint (let’s just say that our overall conclusions left me feeling very sad).


Here's a paper that describes it, confirmed with real profiles, not just speculation: https://hal.archives-ouvertes.fr/hal-01955327/document


Thank you, I appreciate the reference. I'll take a deep look at this.


I use https://github.com/andrea-cuttone/geoplotlib in my work and highly recommend checking it out.

The lib uses OpenStreetMap tiles as the background, can display scatter plots, heatmaps, and shapefiles, can calculate and display Voronoi tessellations, and does a ton of other things.


It's true: Skyhook, Google, Apple, and Microsoft have been doing it for a while. What's more, there are free databases that you can use to map WiFi routers to locations (for example wigle.net), but for some reason this is still not enough for Google to treat WiFi as equivalent to location. This also has consequences for age rating: if you explicitly require location access, you fall into a different age category than if you require "only" the WiFi permission.

You can control the scanning settings in settings -> WiFi -> advanced -> scanning always available. It's ON by default, but you can disable it there.

Apart from what you mention, what is new is the measurement of how many access points you actually need to know to track my location: it's costly to look up all the routers I see during a day, but we show that people spend the vast majority of their time close to a very small number of unique access points (~20 routers per person over 6 months).
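
To make the "small number of access points" idea concrete, here is a minimal sketch of one way to pick them. It is a plain greedy coverage heuristic, not necessarily the procedure used in the paper. It assumes the scan log has already been reduced to one set of visible BSSIDs per time bin; the function name, the `target_coverage` parameter, and the sample data are all hypothetical.

```python
from collections import Counter


def greedy_access_points(scans, target_coverage=0.9):
    """Greedily pick access points until enough time bins are covered.

    scans: list of sets of BSSIDs, one set per time bin (assumed input format).
    A time bin counts as covered once at least one chosen AP is visible in it.
    Bins with no visible APs are ignored entirely.
    """
    uncovered = [set(scan) for scan in scans if scan]
    total = len(uncovered)
    chosen = []
    covered = 0
    while uncovered and covered / total < target_coverage:
        # Count in how many still-uncovered bins each AP appears.
        counts = Counter(ap for scan in uncovered for ap in scan)
        # Pick the most frequent AP; break ties alphabetically for determinism.
        best = min(counts, key=lambda ap: (-counts[ap], ap))
        chosen.append(best)
        covered += counts[best]
        uncovered = [scan for scan in uncovered if best not in scan]
    return chosen


if __name__ == "__main__":
    sample = [
        {"home"}, {"home", "neighbor"}, {"home"},
        {"office"}, {"office", "cafe"}, {"gym"},
    ]
    print(greedy_access_points(sample, target_coverage=0.8))
```

Only the chosen access points would then need a location lookup, which is what keeps the cost low.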


Yes, but: 1) you can circumvent this problem by randomizing your MAC between probes, as Apple already does, and that doesn't help with the threat we present.

2) SSIDs are not unique - when it says "airport" it can be any airport. When you have access to the MAC of the device, you can pinpoint it uniquely - that's the threat we present.

3) with the threat you link, you theoretically might be able to recover some of the past locations of the user where they did connect to WiFi. With the threat we present you get the location history with time resolution of up to 20 seconds, whether the user connects to WiFi or not, and even if they disable WiFi, and you don't have to control any routers. I would say this constitutes a novelty.

=== EDIT ====

4) the link only mentions a theoretical possibility, we show that the threat is real based on real data collected over 6 months about multiple people.
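
For readers wondering how a scan log turns into a location history, here is a minimal sketch under stated assumptions. It assumes an app has logged timestamped lists of visible router MACs (BSSIDs) and that some of those routers have known coordinates, e.g. from a public database. The function name, the data layout, and the centroid estimate are illustrative choices, not the method from the paper.

```python
def infer_location_history(scans, known_aps):
    """Estimate a position for every scan that contains a known router.

    scans: list of (timestamp, iterable of BSSIDs) tuples (assumed format).
    known_aps: dict mapping BSSID -> (latitude, longitude).
    The estimate is the plain centroid of the known routers in the scan,
    which is a deliberate simplification.
    """
    history = []
    for timestamp, bssids in scans:
        coords = [known_aps[b] for b in bssids if b in known_aps]
        if not coords:
            continue  # no known router visible, so no estimate for this scan
        lat = sum(c[0] for c in coords) / len(coords)
        lon = sum(c[1] for c in coords) / len(coords)
        history.append((timestamp, (lat, lon)))
    return history


if __name__ == "__main__":
    # Made-up BSSIDs and coordinates, for illustration only.
    known = {
        "aa:bb:cc:00:00:01": (55.0, 12.0),
        "aa:bb:cc:00:00:02": (56.0, 13.0),
    }
    log = [
        (0, ["aa:bb:cc:00:00:01"]),
        (20, ["aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02"]),
        (40, ["ff:ff:ff:00:00:09"]),
    ]
    print(infer_location_history(log, known))
```

Nothing here requires the device to connect to a network or the attacker to control a router; seeing the router MACs in scan results is enough.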


They are also willing to share their knowledge: you can do 100 requests per day for free to https://developers.google.com/maps/documentation/business/ge..., or pay to get basically unlimited access (that's for WiFi routers and GSM towers, which work much better than IP addresses for the reasons you mentioned).


This is something different: they just know when you visit a location with a router that they control. We show that you don't need to control any routers to track people's location, as long as you have an app with the "WiFi information" permission (and most of the apps do have it).


(Assuming you are the author of the post)

Did you watch the network traffic that apps send home? I would be curious to know, of the top games in the app store that see wifi data, how many of them actually send it back to their servers.

I've been running mitmproxy for a project, giving me rare insight into the data that leaves my phone. It's amazing how often Android/iOS apps "phone home." Every few seconds, Apple and Google servers receive a request from my phone with fingerprint information sufficient to pinpoint my location on a map. Usually the current WiFi SSID is included in that.

It has me wondering if there is viability in a consumer-grade "man in the middle" router for auditing/filtering the traffic leaving the user's home network.


Good point, thanks! We didn't watch the traffic of these apps yet, we just point out that they have the ability to report it back, beyond the user's control.

I did however read through the privacy policies of the apps, and one of the top 20 with WiFi but not location permission mentioned collecting your location data.


Also, check out the app that shows the findings on your own data: https://play.google.com/store/apps/details?id=dk.dtu.compute...

