It's absurd how unusable Cloudflare is making the web when using a browser or IP address they consider "suspicious". I've lately been drowning in captchas for the crime of using Firefox. All in the interest of "bot protection", of course.
The real frustrating part is that Cloudflare's "definition" of suspicious keeps changing and expanding. VPN users, privacy-first browsers, uncommon IP ranges, they all get flagged. The people most likely to get caught by these systems are exactly the ones who care most about their privacy, and not the bots that they are apparently targeting.
So the stable state here is all humans eventually being locked out? (Bots are getting better every day; I doubt the same is true for all humans, including those with weird browsers or networks unwilling to install some dystopian Cloudflare "Internet passport".)
But hey, at least some bots are also not making it past Cloudflare!
Or else a player too big to be blocked moves into the space with a service that provides some/all of the privacy benefits, but declines to offer the other undesirable aspects of VPN (e.g. location shifting to circumvent local restrictions)
To the contrary, people running botnets or AI scrapers are likely going out of their way to mimic ordinary web traffic from consumer devices. Ultimately, these measures will only affect users who are trying to protect their privacy and security, and will be ineffective at stopping bots.
> The people most likely to get caught by these systems are exactly the ones who care most about their privacy, and not the bots that they are apparently targeting.
In my brief experience with abuse mitigation, connections coming from VPNs or unusual IP ranges were very significantly more likely to be associated with abuse.
It depends on your users. VPNs aren’t common at all, even though you hear about them a lot on Hacker News. For types of social sites where people got banned for abuse (forums) the first step to getting back on the forum was always to sign up for a VPN and try to reconnect. It got so bad that almost every new account connecting via VPN would reveal itself as a spammer, a banned member trying to return, or someone trying to sock puppet alternate accounts for some reason.
The worst offenders are Tor IP addresses. Anyone connecting from Tor was basically guaranteed to have bad intentions.
I heard from someone who dealt with a lot of e-mail abuse that the death threats, extortion, and other serious abuse almost always came from Protonmail or one of the other privacy-first providers that I can’t remember right now. He half-jokingly said they could likely block Protonmail entirely without impacting any real users.
It’s tough for people who want these things for privacy, but the sad reality is that these same privacy protections are favored by people who are trying to abuse services.
> In my brief experience with abuse mitigation, connections coming from VPNs or unusual IP ranges were very significantly more likely to be associated with abuse.
Correlating these factors with abuse implies that you already have methods of identifying abuse per se, independently of these factors. Is there no feasible way of just blocking the abuse itself when it begins, or developing much more proximate indicators to act on?
> The worst offenders are Tor IP addresses. Anyone connecting from Tor was basically guaranteed to have bad intentions.
Do you handle this by blocking known Tor exit node IPs entirely, or just adding hurdles to attempts to post from those IPs?
> It’s tough for people who want these things for privacy, but the sad reality is that these same privacy protections are favored by people who are trying to abuse services.
But naturally P(A|B) and P(B|A) are two different things.
How does the Tor network counter abuse? Like, say you're hosting a service on the Tor network, what does the Tor network offer if anything to defend against e.g. DDoS attacks?
Sure, but if the service keeps getting overwhelmed (financially or traffic-wise) or compromised (not even necessarily in the security sense but in the semantic purpose sense, like via spam floods on a message board) due to a lessened capability to combat abuse, then the user is worse off all over again, no?
All it would solve then is laundering Tor traffic from being probably malicious to being reputationally ambiguous. Though for a within-network service, that's probably assumed anyways - hard to run a Tor service if you assume all Tor users are malicious, that would be nonsensical.
Which VPNs are people using that actually care about the user's privacy? Most of them don't, sell their home IP to buyers, sell their DNS history to others, etc. Worse, some of them could require invasive MITM cert stuff most users will just click yes through.
I have yet to see a use case for VPNs for the casual internet audience, and for a tech savvy user, their better off renting through some datacenter or something, which at that point is hardly a VPN and more home IP obfuscation. All the same downsides, and at least you get real privacy.
I'm forced to use a VPN to occasionally check my US bank account, since a foreign IP address is obviously a harbinger of unspeakable evil (while the friendly Youtube advertised neighborhood VPN is obviously evidence of pure intentions).
ProtonVPN with bitcoin which you get from a monero swap is a good idea for complete privacy if you want port forwarding.
MullvadVPN is also another great one.
I have heard some good things about AirVPN, but I can absolutely attest for mullvad and to a degree ProtonVPN (Just with Proton, depending upon your threat model, do make the necessary precautions like buying with monero for example)
There are others, but mostly its the 2-3 that I trust.
How do you square "complete privacy" with the fact that you're authenticating to these VPNs with a persistent username or other credential and are then sending traffic through them, both from an IP address that might identify you, and to services that you authenticate against?
Best case, the VPN learns your residential IP and the names of every HTTPS host you connect to (if not your entire DNS traffic as well); worst case, they collude with any of the services you use (or some ad tracker they embed) and persistently deanonymize your account.
> How do you square "complete privacy" with the fact that you're authenticating to these VPNs with a persistent username or other credential and are then sending traffic through them, both from an IP address that might identify you, and to services that you authenticate against?
IIRC, Mullvad allows anonymous accounts, allows payment in cash and via other methods that don't link PII to the transaction, and claims not to log inbound connections.
>Most of them don't, sell their home IP to buyers, sell their DNS history to others, etc. Worse, some of them could require invasive MITM cert stuff most users will just click yes through.
Source? I haven't seen any evidence that the major paid VPN providers engage in any of those things. At best it's vague implications something shady is happening because one of the key people was previously at [shady organization].
I recently had the insane experience of filling out 15 consecutive captchas, after, I had checked out and entered my payment information into the payment processor widget. I just wanted to submit the order. I was logged in to their website, and the bank even needed a one time code for payment. If the bank is pretty sure I am human then your ecomm site can figure it out surely.
At least outside the US, there's 3DS as an (admittedly often high friction) high quality cardholder verification method, but in the US, that's of course considered much too consumer-hostile, so "select 87 overpasses" it is.
A while back I was buying tickets for a gondola for a trip in Europe and the checkout process failed during payment because their site didn't load their analytics/tracking stuff with proper error-handling, so when my ad-blocker prevented the tracking stuff, their checkout process failed to handle my CC's 2-factor auth and the checkout would fail. Had to contact my CC company and work with the gondola company to tell them what they're doing wrong so they could fix their website code. Pretty sad to know whoever built their stuff actually shipped a checkout flow (for a VERY popular tourist destination) without testing with ad-blockers enabled.
To be fair, this sometimes seems on the ad blocker. I've definitely seen mine accidentally nuke part of the payment Javascript (or maybe the 3DS iframe?) because some substring of it matched some common ad URL, which is obviously unrecoverable for the site itself.
Surprising really, because I'm a Firefox + Ublock Origin die hard and I never get Cloudflare captchas. Wonder what the difference is? I have CGNAT turned off, if that matters at all (probably not).
I could definitely imagine a public IPv4 with lots of good, logged-in Cloudflare traffic to act as a positive signal for their heuristics, possibly even overriding the Firefox penalty.
Most people are on a CGNAT these days, drowning in captchas is the new normal. You’re at the mercy of one of your neighbors not hosting a botnet from their home computer.
For better or for worse, CF's fingerprinting and traffic filtering is a lot more in-depth than just IP trend analysis. Kind of by necessity, exactly because of what you mention. So I'd think that's not as big a worry per se.
Yet here I am drowning in captchas every once in a while, so it's quite a big worry for me.
Maybe I just have to disable all ad blockers and Safari tracking prevention? Or I guess I could send a link to a scan of my photo ID in a custom request header like X-Please-Cloudflare-May-I-Use-Your-Open-Web?
> Yet here I am drowning in captchas every once in a while, so it's quite a big worry for me.
I think I was sufficiently clear that I was specifically talking about CGNAT-caused IP address tainting being an unreasonably emphasized worry, not the worry about their detections overall misfiring. Though I certainly don't hear much about people having issues with it (but then anecdotes are anecdotal).
> Or I guess I could send a link to a scan of my photo ID in a custom request header like X-Please-Cloudflare-May-I-Use-Your-Open-Web?
Sounds good, have you tried?
Not sure what's the point of these comically asinine rhetoricals.
Not even remotely true, I genuinely have no idea what you're talking about. The only time I get captcha'ed is when I sometimes VPN around, or do some custom browser stuff and etc. I'll even say I get captcha'ed less now than maybe 5 years ago.
Again, no clue what you’re talking about. The only time I had to deal with shit was when I was travelling a bit sketchy countries. I get that “Cloudfare is verifying your connection” loading screen from time to time, but there’s no captchas involved.
Super majority of people don’t use VPNs, or rare browsers, or avoid fingerprinting and etc. When you browse like regular you don’t notice the friction. That’s the selling point of companies like CF, because website owners don’t want to lose real traffic.
Every so often, usually after a firefox update, CF will get into a "I'm convinced your a bot" mode with me. I can get out of it by solving 20 CAPTCHAs.
It's probably just a higher rate of autonomous vehicles needing stop signs and buses identified at that moment, and cognitive bias causes you to only remember when that happens when you recently performed an update. /s
>It's probably just a higher rate of autonomous vehicles needing stop signs and buses identified at that moment
I can't tell whether you're serious but in case you are, this theory immediately falls apart when you realize waymo operates at night but there aren't any night photos.
My assumption is that CF has something like a SVM that it's feeding a bunch of datapoints into for bot detection. Go over some threshold and you end up in the CAPTCHA jail.
I'm certain the User-Agent is part of it. I know that for certain because a very reliable way I can trigger the CF stuff is this plugin with the wrong browser selected [1].
In what way would that not be fair? Their product giving false positives (unnecessary challenges for a normal browser humans commonly use) to real people is definitely their fault.
That sounds like it is working as intended, not a false positive. A false positive would mean it blocked you whereas a challenge means more information is needed. You aren't noticing all of the times it correctly decides you are human, only the times when it needs to "inconvenience" you for more information because you prioritize privacy, a key similarity with some bots.
I also like privacy. I use GrapheneOS. I compartmentalize my credit cards, emails, and phone numbers. I don't use Google products, and the list continues, but I don't complain about Cloudflare because it is painless and I understand the price I pay for privacy.
I also have home services accessible via my home website, running on my home server(s). I chose to have cloudflare to host my domain specifically for the easy bot blocking, and it blocks more than 2000 bots/day that otherwise would be trying to find vulnerabilities on my servers, which contain a lot of sensitive things. I've never had an issue personally accessing my services through cloudflare. Sometimes I have to do captchas to access my own things, and that's barely an inconvenience (I am aware the domain isn't necessary to access services, but it makes more sense for my setup and intents)
No, but it's entirely within TSA's hands to make that process as frictionless as possible.
(It's a different question whether zero friction is actually desired, or whether some security theater is actually part of the service being provided, but that's a different question.)
The "quality" of TSA's screening seems be pretty bad too given how many people have to go through secondary screening vs how many terrorist they catch (0?)
they caught 11 million by now (just as arbitrary as your 0 but probably more accurate since we haven’t had a large terrorist attack since they got the gig to serve and protect and before we lost thousands of lives…)
>they caught 11 million by now (just as arbitrary as your 0 but probably more accurate
Nice try but I used "caught", not "stopped", which requires they actually apprehended someone, not just prevented some hypothetical attack.
>since they got the gig to serve and protect and before we lost thousands of lives…)
You could easily reuse this argument for cloudflare: "if it wasn't for such invasive browser fingerprinting openai would be drowning in bajillion req/s from bots."
No, using a stupid authentication/verification method with lots of false positives is always on whoever deploys it.
Imagine an apartment building with a flimsy front door lock that breaks all the time, and the landlord only telling you that that can't be helped because of all the burglars.
If it's just as easy to spoof being Chrome as it is to spoof being Firefox, then it is indeed fair to blame Cloudflare if they give Firefox users more CAPTCHAs than Chrome users.
I... Don't think it does that? It shouldn't, anyway. How long has that been a thing? They've been hit pretty hard by the slop crew lately but I couldn't imagine it being so bad they require an up to date UA
It's going on since quite a while. Want to update some GNU software, or look up something? I have to switch the user agent to "curl" to be able to visit the sites.
I’ve been getting it in safari too. It’s ridiculous frankly. My residential ip must have been flagged or something. The part that’s really annoying is its trivial for bots to bypass.
I'm getting it on iCloud Private Relay all the time. It honestly makes it kind of useless.
Maybe that's the point? But then again, doesn't Cloudflare run part of it!? And wasn't there some "privacy-preserving captcha replacement" that iOS devices should already be opting me in to? So many questions, nobody there to answer them, because they can get away with it.
> The part that’s really annoying is its trivial for bots to bypass.
Not the ethical bots, though! My GPT-backed Openclaw staunchly refuses to go anywhere near a "I'm not a robot" button.
Cloudflare makes money on both sides. It makes money from Apple to run Private Relay and it makes money from website operators to block Private Relay. It hosts the websites of DDoS services and protects them from DDoS, too.
trying using firefox and then using a cellphone network for internet. sometimes i can't access a site, because i get infinite captcha. i know what a damn bus, stairwell, stop light or motorcycle looks like.
Arguably it didn’t see widespread commercial adoption for 30 years, and you wouldn’t expect fundamental design flaws regarding commercial incentives to manifest before that.
A flaw can be fundamental but not immediate. It's probably better to say it's a fundamental flaw of the open web, that is the system collapses as the number of bad actors increases, and there is no way to prevent bad actors and have the system keep the name as open web.
Once upon a time we had whois lookup for exactly that usecase (finding a domain's owner without visiting the site). Of course now nearly everyone has meaningless entries from some domain privacy service
These days I just close sites that show that "checking if you're a bot" shit. If this is how the web is going to be now, I don't care, I'll just not use it. I didn't need to see that article or post that badly anyways. I'm tired of paying the price for the sociopathic, greedy actions of others. It's especially bad for anyone who uses an open source OS like Linux or *BSD (to the extent many sites just block me automatically with a 403 Forbidden simply for using OpenBSD + Firefox, completely free pass if I try the same site from a Windows or Linux computer).
We use Cloudflare to protect our content, but at the same time our machines mostly run Linux / Firefox so it really is quite a frustrating relationship. It really bums me out how much of Turnstile boils down to these two questions:
is it Linux (or similar)?
is it Firefox?
If yes, to one or both, you're blocked! Clearly millions of dollars of engineering talent and petabytes of data collection should be able to come up with something more nuanced than this.
I'm building Safebox and Safecloud, where this won't be the case anymore. Not only will you have a decentralized hosting network that can sideload resources (e.g. via a browser extension that looks at your "integrity" attribute on websites) but also the websites will require you to be logged in with a HMAC-signed session ID (which means they don't need to do any I/O to reject your requests, and can do so quickly)... so the whole thing comes down to having a logged in account.
As far as server-to-server requests, they'll be coming from a growing network of cryptographically attested TPMs (Nitro in AWS, also available in GCP, IBM, Azure, Oracle etc.) so they'll just reject based on attestations also.
In short... the cryptographically attested web of trust will mean you won't need cloudflare. What you will need, however, to prevent sybil attacks, is age verification of accounts (e.g. Telegram ID is a proxy for that if you use Telegram for authentication).
Why would you assume it needs to be? You don’t think that websites on the Internet might not want to allow random bots and scrapers to waste their resources, and require people to have an account in order to access non-static resources on the website? You do realize that API keys exist, right?