I've been seeing people on Twitter mocking the project, but I need it...
You have no idea how much time I've wasted trying to block some products from pinging their home server with curious data stream, but failed to do so because I can't be bothered to sit my ass in front of Wireshark to sniff out all their DoH servers.
With this project, hopefully in the future I can just not put their domains in the TLS whitelist, even when they use DoH.
---
Off topic but BTW: "Fully Encrypted Traffic" is really a confusing term that can only be understood correctly if the context is correct too. How about calling those protocols "High Entropy" (HighE)? Which IMO is more specific than calling it "Fully Encrypted". It's just my two-cents suggestion.
TLS 1.3 with cert pinning and end-to-end encryption is making life hell for corporate compliance. Our Palo Alto firewalls are about as good as it gets but it's a constant battle to de-obfuscate traffic.
Google loves to mix traffic types (ad, telemetry, biz app) across protocols basically creating their own overlay which is a huge pain.
For Apple, we basically have to exempt the entire 17.0.0.0/8 as that is theirs and they pin everything.
Microsoft at least has several dynamic lists but I have my own list of their stuff that ISN'T on their lists (a few hundred IPs).
At this point I don't think there's any way to secure a network. You can't trust or verify much of the traffic.
Maybe using a synthetic NIC at the endpoint that is directly tied to the policy system is the only way forward.
> You have no idea how much time I've wasted trying to block some products from pinging their home server with curious data stream, but failed to do so because I can't be bothered to sit my ass in front of Wireshark to sniff out all their DoH servers.
You mean hardware products, right? In this case putting them in a separate VLAN would help. If you mean software running on your machine, you can set up a proxy and block all traffic not coming through it.
Mainly TVs, TV boxes, and phones with non-open-source ROMs, etc. These devices do need an Internet connection for normal operation, which is why it's so hard to block suspicious traffic from them.
I've had luck finding Pi-hole blocklists on GitHub for various products. If it's something quite common like a branded smart TV, someone will have already done the hard work of figuring out what IPs they're trying to dial home to.
> You have no idea how much time I've wasted trying to block some products from pinging their home server with curious data stream, but failed to do so because I can't be bothered to sit my ass in front of Wireshark to sniff out all their DoH servers.
I wonder if you can run this in observe only mode to analyze/log that traffic.
The strategy used to detect "Fully Encrypted Traffic" is indeed complex, but the protocols investigated by the paper (at least Shadowsocks and VMess; not really sure about Obfs4) work by transforming the traffic to make it "look like nothing".
So I still believe "High Entropy" is a better description than "Fully Encrypted Traffic".
I mean, you can pack the entire data stream in Base64 after sending it through a SHA256 pipeline, and it will still be "Fully Encrypted", but the entropy (in terms of traffic classification by content scanning) is not the same compared to doing it without the Base64 step.
Yes, "High Entropy" or even "HighE" is more accurate than "Fully Encrypted Traffic".
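To put rough numbers on the Base64 point, here is a minimal sketch. Assumptions: `os.urandom` stands in for real ciphertext (good cipher output should be statistically similar), and `shannon_entropy` is just a helper name made up for this example, not anything from the paper or the repo.

```python
import base64
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Empirical Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Assumption: random bytes as a stand-in for encrypted payload.
# 3000 is a multiple of 3, so the Base64 output has no '=' padding.
ciphertext = os.urandom(3000)
encoded = base64.b64encode(ciphertext)

raw_entropy = shannon_entropy(ciphertext)
b64_entropy = shannon_entropy(encoded)

print(f"raw bytes: {raw_entropy:.2f} bits/byte")
print(f"Base64:    {b64_entropy:.2f} bits/byte")
```

The Base64 version can never exceed 6 bits per byte because it only uses 64 distinct symbols, while the raw bytes sit close to 8. Same information, same encryption, very different picture for a content scanner.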
It's interesting that the filter was observed to operate at specific ranges of entropy in the paper, and the repo has it mostly reproduced here: https://github.com/apernet/OpenGFW/blob/1dce82745d0bc8b3813a...
But presumably to keep people on their toes, the real filter only operates some of the time. It's a cursed "hex ensemble".
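For anyone who wants to poke at the entropy-range idea, here is a rough sketch of a bits-per-byte check. Big caveats: the 3.4 and 4.6 thresholds are my recollection of the paper's popcount rule and are not verified against the linked OpenGFW source, the function names are made up, and the real filter reportedly combines this with other exemptions (printable ASCII, known protocol fingerprints) that are not shown here.

```python
import os

# Assumed thresholds (average set bits per byte); treat as illustrative only.
LOW, HIGH = 3.4, 4.6


def avg_popcount(payload: bytes) -> float:
    """Average number of 1-bits per byte in the payload."""
    if not payload:
        return 0.0
    return sum(bin(b).count("1") for b in payload) / len(payload)


def looks_high_entropy(payload: bytes) -> bool:
    """True if the bit density falls inside the assumed 'random-looking' band."""
    return LOW < avg_popcount(payload) < HIGH


if __name__ == "__main__":
    samples = {
        "random bytes": os.urandom(4096),
        "all zeros": b"\x00" * 4096,
        "all ones": b"\xff" * 4096,
    }
    for name, data in samples.items():
        print(f"{name}: {avg_popcount(data):.2f} bits/byte, "
              f"flagged={looks_high_entropy(data)}")
```

Uniformly random data averages about 4 set bits per byte, so it lands in the band, while heavily skewed payloads fall outside it.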
Correction: instead of `SHA256`, I should have typed `AES256`. The last time I wrote any encryption code was back in 2019... that's why I lost it... Sorry :)
Would be funny if this ends up like War Thunder: some random Chinese official making a pull request out of spite because some implementation isn't how the real GFW does it.
The GFW was not built by some random "Chinese officials"; it's powered by the IT industry, with contractors all the way down. There isn't a monolithic wall either. There are multiple generations of TCP/IP middleboxes sold to various levels of ISP over many years.
This project is like an open-source missile for regimes like Iran and North Korea. I do admire it, but some governments could abuse it to suppress freedom.
Unless your government is technically inept, they don't really need this project to have their own GFW, and an inept government won't be able to maintain it anyway.
Nah, it's an interesting piece of technology and a double-edged sword, like any other technology. A firewall can be used to censor information, but it can also be used to block ads or malicious traffic.
Wonderful. Now the Iranian regime can stop their payments to China for their use of this tech to block Iranians from accessing the open internet. Funny how open source can come full circle to prevent the very thing it set out to open.
I agree, there are some genuine use cases for such a product. SCHOOLS, for instance, to keep distractions to a minimum. I worry about malware though. Can anyone vet the team behind this??
Apernet is especially active in enabling circumvention of the GFW of China, so I'm pretty sure they are aware of, and have thought about, the consequences of this project. In fact I assume they developed it in order to test their circumvention methods.
For YouTube, if you use Android you can install NewPipe from F-Droid. Not only are there no ads, you can even bookmark videos and subscribe to channels without having a YouTube account or being logged in.
The focus here is on detecting the usage of VPN protocols based on signatures, entropy, and other characteristics (especially the ones that try to make themselves look like TLS traffic) rather than having a blocklist of hostnames as Pi-hole does.