Hacker News | arghwhat's comments

Well, this is about USB 3.2 Gen 2x2, which is a mess created by USB IF for good old, blue USB A connectors. Not USB-C complexity.

USB 3.2 Gen 2x2 is the very rarely supported 20Gb/s variant of USB 3, and making devices now that require that for full performance is a weird decision, with high-speed capable ports generally having wider support for either USB4 or Thunderbolt3+. I imagine the reason would be that some chip with an otherwise poor market fit got cheap...

Throwing this into the mix definitely doesn't improve the USB-C "what does this port support" conundrum, but this specific one predates USB-C and is not at all something you'd normally hit.


> Not USB-C complexity

3.2 Gen 2x2 (and the occasionally relevant 1x2 if you have a weak cable) are USB C only.

USB C ports and cables have 4 USB 3 "superspeed" lanes rather than two. When you use an A to C cable only one pair of those connects. The point of the "x2" modes is that they use the second pair of lanes that would otherwise go unused.

Except of course they don't always go unused. DisplayPort Alternate Mode sends DisplayPort over those two "unused" lanes, getting you USB 3 data alongside a half speed DisplayPort connection (or alternatively full speed DisplayPort on all four and only USB 2). And then of course Thunderbolt 3 and modern USB4/TBT4 use all four lanes and tunnel everything.


10 Gb/s Ethernet interfaces do not require 20 Gb/s USB ports for reaching maximum performance, they already reach that on 10 Gb/s USB ports, despite what the writer of TFA believes.

The main application of 20 Gb/s USB ports is to connect external NVMe SSDs, when faster USB 4 or Thunderbolt ports and SSDs are not available.

For an external NVMe SSD on USB, a 20 Gb/s USB port will double the throughput, unlike for a 10 Gb/s Ethernet interface where any improvements are completely negligible.

I do not think that 20 Gb/s USB Type C ports are "very rarely supported". Every mini-PC or desktop motherboard that I have bought during the last 10 years had at least one such USB port.

Such ports appear to be rare only on laptops, because most laptops have very few USB ports.


> 10 Gb/s Ethernet interfaces do not require 20 Gb/s USB ports for reaching maximum performance, they already reach that on 10 Gb/s USB ports, despite what the writer of TFA believes.

While this may be theoretically (almost) possible, I’m quite sure this is absolutely not the case in practice.

For example, see these benchmarks of one of the more recent USB to Ethernet chipsets [1], which can reach ~9.5 Gb/s on USB 3.2 Gen 2x2 but only ~6.2 to ~7.3 Gb/s on 3.2 Gen 2x1 laptops.

1. https://www.jeffgeerling.com/blog/2026/new-10-gbe-usb-adapte...

Edit: Haha, didn’t realise TFA was by the same author as these benchmarks, but he’s done a lot of testing and benchmarking of these kinds of devices over a long time, and it agrees with all the other benchmarking from other people I’ve seen too!


In Ethernet, "10 Gbps" refers to the actual Ethernet frame throughput. The raw physical coding rate is usually somewhere around 10.3125 Gbps to account for this.

In USB 3.2 Gen 2x1, the actual USB packet throughput is 9.697 Gbps and the "10 Gbps" refers to the raw encoding rate.

This difference means you are guaranteed to lose at least a few hundred Mbps off maximum performance. It's not really a practical concern, but it's not an error to say 10 Gb/s USB ports lack the bandwidth needed to support the maximum performance of a 10 Gbps USB Ethernet adapter.
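The gap described above can be checked with a few lines of arithmetic. This is only a sketch of the line-coding step: the helper name `payload_rate_gbps` is made up for illustration, and it ignores all packet and protocol overhead on both sides.

```python
def payload_rate_gbps(line_rate_gbps, data_bits, coded_bits):
    """Rate left for data after line coding (data_bits carried per coded_bits on the wire)."""
    return line_rate_gbps * data_bits / coded_bits

# 10GBASE-R: 10.3125 Gb/s on the wire with 64b/66b coding.
ethernet = payload_rate_gbps(10.3125, 64, 66)

# USB 3.2 Gen 2x1: 10 Gb/s on the wire with 128b/132b coding.
usb_gen2x1 = payload_rate_gbps(10.0, 128, 132)

print(f"Ethernet:        {ethernet:.3f} Gb/s")
print(f"USB 3.2 Gen 2x1: {usb_gen2x1:.3f} Gb/s")
print(f"Shortfall:       {(ethernet - usb_gen2x1) * 1000:.0f} Mb/s")
```

The shortfall comes out at roughly 300 Mb/s before any USB or Ethernet framing is counted, which is the "few hundred Mbps" mentioned above.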


Ethernet is duplex though. 20Gb/s is the max throughput a 10Gb NIC can achieve.

So is usb superspeed. The tx and rx don't flip around like low/full/high speed

>Every mini-PC or desktop motherboard that I have bought during the last 10 years had at least one such USB port.

Are you talking about USB 3.2 Gen 2x2 though? Because I've never seen any MiniPC with this port and as for motherboards, I checked my local retailer and only ~15% of currently sold ones have Gen 2x2 (mostly high-end ones).


Most of my mini-PCs have been Intel NUCs (or more recently an ASUS NUC). I also had some Gigabyte and Zotac mini-PCs and a few others from less well-known vendors. IIRC almost all had one such 20 Gb/s USB Type C port, unless they had one or two faster Thunderbolt ports.

With mini-PCs, I frequently use external SSDs, so I certainly used those ports at their full speed.

The only mini-PCs that I had in recent years without such a fast USB port were Arm-CPU based, as those are typically starved in fast peripheral interfaces in comparison with the Intel/AMD CPUs.


> 10 Gb/s Ethernet interfaces do not require 20 Gb/s USB ports for reaching maximum performance, they already reach that on 10 Gb/s USB ports, despite what the writer of TFA believes.

The first half is true, the second half is not. Remember overhead. You don't need 20 Gb/s, but you need to take into account the USB overhead.


If you read carefully (emphasis mine):

> The main problem is USB-C's bandwidth complexity - especially when paired with the Realtek RTL8159 Ethernet controller, which requires USB 3.2 Gen 2x2 (20 Gbps) to get the full rated 10 Gbps speeds

Jeff's statement wasn't that 10 Gb/s Ethernet requires 2x2. It's that that requirement comes from a very specific controller.


What about overhead? Can you truly do 10Gb/s networking on a 10Gb/s USB port? Would having such NIC on a 20Gb/s USB port not result in higher speeds?

Both 10 Gb/s Ethernet and 10 Gb/s USB have bit data rates that are 3% lower than 10 Gb/s, due to encoding (64/66 bits for Ethernet, 128/132 bits for USB).

So their maximum speed is approximately 9.7 Gb/s.

Then for Ethernet there is a protocol-dependent overhead, e.g. depending on whether TCP or UDP is used, and depending on whether standard packets or jumbo packets are used.

The TCP overhead can reach in the worst case up to close to another 3%, reducing the achievable TCP throughput to around 9.4 Gb/s.

The USB frames add some extra overhead, but it is normally not important in comparison with other factors that can reduce the throughput.

All that a 20 Gb/s USB port can do is to reduce the overhead of the USB frames, but that is a negligible improvement. Using jumbo Ethernet frames (which are 6 times bigger than standard frames), if both ends support them, is likely more useful for increasing the throughput, than using a 20 Gb/s USB port.
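To put rough numbers on the frame-size point, here is a back-of-the-envelope sketch. The assumptions are mine, not the commenter's: IPv4, TCP without options, standard Ethernet per-frame costs, and a link rate passed in as a parameter (10 Gb/s here). The function name is hypothetical, and USB framing is left out entirely.

```python
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble/SFD + MAC header + FCS + inter-frame gap, in bytes
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP headers without options, in bytes

def tcp_goodput_gbps(link_gbps, mtu):
    """Share of the link rate left for TCP payload at a given MTU."""
    payload = mtu - IP_TCP_HEADERS
    on_wire = mtu + ETH_OVERHEAD
    return link_gbps * payload / on_wire

standard = tcp_goodput_gbps(10.0, 1500)   # standard frames
jumbo = tcp_goodput_gbps(10.0, 9000)      # jumbo frames, 6x larger

print(f"MTU 1500: {standard:.2f} Gb/s")
print(f"MTU 9000: {jumbo:.2f} Gb/s")
```

Under these assumptions the per-frame cost drops from about 5% to under 1% when moving to jumbo frames.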


10 Gig ethernet is 10 Gbps usable rate (before packet overhead). The line rates are higher to accommodate this. For 10GBase-R, it's typically 10.3125 Gbps, with 64b/66b encoding. For 10GBase-T, it's 4 lanes with PAM-16 at 800 MBaud -> 12.8 Gbps raw.

USB uses 128b/132b encoding, so 10Gb/s USB ≈ 9.69Gb/s. You do then have USB framing overhead, but it's probably around 2% on typical 1500B ethernet frames. So all in, you are losing probably 5% or so to overhead.

I am of the opinion that 5Gbe is a much more sensible speed for a laptop adapter right now, as it uses half the power and can obviously run full whack on 10Gb/s USB, so you're looking at like 5Gbe vs ~9.4Gbe.


Stop insisting on Cat.6A (and related) copper cables for speeds beyond 1000BASE-T (maybe beyond 2.5G by now). Just use dumb multi-mode fiber; it's way easier technology-wise, and if you want power you can have that as well.

At distances where Cat.6A is even an option the demands on the fiber are very low. And it uses less power than the BASE-T PHY. The cable at least without integrated power is very thin as well, unless you can't respect it enough to not kink it, in which case you'd want a thicker one just to prevent you from being able to break the fiber.


In fact, just go for single-mode fiber. Looking on fs.com, single-mode cables are slightly cheaper, and the optics (for 10G) are $30 to MMF's $25.

And you get much better future proofing with SMF. And if you do need a long fast run, SMF is what you want.


5GBASE-T interfaces often use 3x less power than 10GBASE-T

The display controller and render device are completely distinct logical devices, even though they are often grouped in a "GPU". On mobile architectures they are quite far separated, leading to annoying problems surrounding what we on Linux call "split drm devices".

Updating plane properties such as to move the cursor plane around or disable it would by itself not block on render activities, as they are completely distinct blocks.

The render hardware could be powered down, but I doubt powering it up and compositing the cursor would take long enough to complete to cause any noticeable lag.

Under the Linux APIs, updates to the display controller are done through KMS atomic commits, and one mistake you could make display-server side would be to provide a fence in this atomic commit that the scheduler will use to wait on long-running GPU work before using the provided graphics buffers. Under this API, none of the changes - including mouse movements - would then be applied until that fence is signalled. Changing plane associations can lead to resource reallocations that can be a bit heavy.
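The fence pitfall can be illustrated with a toy model. This is not the real KMS or libdrm API: every class, method and property name below is hypothetical, and a real kernel would queue the commit and wait rather than return `False`. It only shows why bundling a cursor move with a fenced buffer holds the cursor back.

```python
class Fence:
    """Stand-in for a GPU fence; signalled once the render work finishes."""
    def __init__(self):
        self.signalled = False

class AtomicCommit:
    """A bundle of plane property updates that is applied all-or-nothing."""
    def __init__(self, updates, in_fence=None):
        self.updates = updates      # e.g. {"cursor.x": 100}
        self.in_fence = in_fence    # optional fence to wait on before applying

class DisplayController:
    def __init__(self):
        self.state = {}

    def try_apply(self, commit):
        """Apply the commit only if its fence (if any) has signalled."""
        if commit.in_fence is not None and not commit.in_fence.signalled:
            return False            # nothing in the commit is applied yet
        self.state.update(commit.updates)
        return True

display = DisplayController()
render_done = Fence()

# Cursor move bundled with a new primary buffer that is still being rendered.
bundled = AtomicCommit({"primary.fb": "frame_42", "cursor.x": 100}, in_fence=render_done)
blocked = display.try_apply(bundled)       # held back, so the cursor does not move

# The same cursor move on its own, with no fence attached.
cursor_only = AtomicCommit({"cursor.x": 100})
applied = display.try_apply(cursor_only)   # goes through immediately
```

In this model the cursor-only commit is applied right away, while the bundled one stays pending until `render_done.signalled` is set.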

Not sure if the kernel driver in macOS works anything remotely similar to this, and the driver could also just be dumb and block on unrelated things ("let's just wait another vblank to see this apply....", "as we only need one plane now let's power down hardware and wait for that to settle..."). It could also just be windowserver that waits for work to finish on its own, not providing any cursor updates in the meantime.

The reality is that it will take reverse engineering or looking at actual code to know what's going on.


Since this is but an iPhone crammed into a laptop, could this behaviour stem from the fact that iPhones generally need not render a cursor?

No, the cursor just uses an overlay plane, and mobile architectures usually have far more planes (sometimes even an arbitrarily configurable amount), and more flexible hardware compositioning overall than desktop GPUs for efficiency reasons.

EDIT: Also note that there is nothing new with the Neo here, as all Macs since the M1 have used the same chip architecture as the iPhone.

Desktop GPU designs did not focus on tiny efficiency gains, and often only have a primary plane, a single overlay plane (for e.g., a video), and a dedicated cursor plane. Some even have to share a single overlay plane between all connected displays. It's a recent thing for desktop GPUs to get more flexible in this area, in part to improve laptop battery life in the cases where the laptop is almost entirely idle.

(For those unaware, a "plane" here is the entity in the display controller you configure to show a rendered graphics buffer, in a particular location and with particular transforms. You commonly have one plane that just covers the whole screen, and then sometimes put dynamic content on top in other planes so you can avoid having to redraw the main buffer when smaller bits of it change, like a video player or cursor. You could also e.g., scroll by rendering an entire document in advance and then move the plane around to reveal parts of it.)


> Desktop GPU designs did not focus on tiny efficiency gains

I'm not sure they're all that tiny if you can squeeze out 70% of top end performance for 25% of the power draw :)


I think the implication is they focussed on the huge efficiency gains, and didn't focus on the small ones?

At least for Nvidia and AMD desktop GPUs/cards, the priorities are:

#1 (by far) "can it play [hot game of the moment] in 4K?"

#2 (maybe) "is the fan noisy in desktop mode?"

#3 (for nerds) "can I run LLMs on it?"

I don't think top efficiency in desktop mode is on anyone's list, and even if it were, it would be hard to come up with a design that uses hundreds of watts when running top-tier games or LLMs, but also uses as little as possible when idle?


No, rather that they focused on peak performance, i.e., "churn out the most raw throughput at 400W+", rather than "get as close as possible to 0W when idle or in common uses".

Very different metrics - the former is about optimizing your architecture for pushing the most operations, the latter is about being able to power as many things off as possible.


TL;DR: Low power draw for laptops and phones is about who can reduce to the lowest performance, with the most hardware turned completely off while still just barely performing the task at hand. Completely different ballgame.

Peak performance happens at peak power draw and is a matter of having as much hardware as possible pushing as many operations as possible at any given point without spontaneously combusting. Those who have the most advanced manufacturing process, or architecture with the most execution units and best able to keep all units busy wins.

Peak power efficiency is about being able to turn as much hardware off as possible, and having lower quiescent current ("leakage power"), with bigger, beefier chips naturally having higher quiescent currents.

What is talked about here is about gating hardware such that you can shave off milliwatts or microwatts when the system is completely idle, by taking tasks that otherwise use slightly larger blocks that would have to remain on and moving them to smaller, more dedicated blocks. For example, being able to play a video with most render capabilities powered down because the display server can take the output of a hardware video decoder block and feed it straight into a display controller plane.


I meant peak performance vs peak performance though. An M1 Ultra uses 25% of the energy of an RTX3090 but gets 70% of the FPS. And in synthetic tests the M1 Ultra can get as close as 95% (!).

Nvidia (and AMD and Intel) just suck at efficiency. There is no excuse for such a performance-per-watt delta. The same is true in CPU land.

Once the first ARM Steam Deck launches gamers will realize they've been had for a decade.


Something is being misunderstood here: it’s not an iPhone crammed into a laptop in a way that requires additional software work. A simple analogy that fits is that it's a MacBook with cheaper parts and an M0 series chip.

That's not quite true, as every SoC requires quite significant software bringup for an OS that makes other software engineering tasks seem minuscule in comparison - macOS and iOS just share enough common code that it's not as big of a deal for Apple.

Also not sure why you'd label it as "M0", as it trivially beats the M1 on several metrics.


Just a nit: Post-CRTs, there is no longer a "standard gamma curve", but many different transfer functions and many errors stem from misunderstanding this.

Even within "SDR"/"sRGB", many mistakes crop up from people erroneously mixing content encoded with the piecewise sRGB transfer function with content encoded according to a plain gamma 2.2 transfer function. And this is before we are getting into e.g., incorrect blending spaces or mismatched primaries.
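A small sketch of how the two decode curves differ, using the standard piecewise sRGB formula and a plain 2.2 power law. The function names are mine, and the sample code values are arbitrary.

```python
def srgb_to_linear(v):
    """Piecewise sRGB transfer function: encoded value in [0, 1] to linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4

def gamma22_to_linear(v):
    """Plain power-law decode with exponent 2.2."""
    return v ** 2.2

# Compare both decodes for a few 8-bit code values.
for code in (1, 8, 32, 128, 255):
    v = code / 255
    print(f"{code:3d}: sRGB={srgb_to_linear(v):.5f}  gamma2.2={gamma22_to_linear(v):.5f}")
```

The two agree at 0 and 1 but differ in between, and in relative terms the gap is largest near black, which is where decoding content with the wrong function tends to show.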

But yes, it is purely a matter of compression, with many options for exactly what dynamic range you need and how you want your content defined (e.g., sRGB, gamma2.2, scRGB, HLG, PQ, ...), with linear light primarily reserved as an intermediate space for color conversions and blending - something your display server and any software working with arbitrary color spaces will be using.


That is why I said "standard gamma curves", and not "standard gamma curve", as each standard specifies a slightly different curve, for various reasons.

Such differences in standards already existed in analog television, because, depending on how they were made, the CRTs also had slightly different transfer curves from grid voltage (where the video signal was applied) to anode current (which is proportional with the luminance of the pixel component), and the regional TV standards accounted for the dominant manufacturers of the CRTs sold in that region.


Grid voltage had no real impact, but the field rate of early monochrome broadcasts was locked to mains frequency, hence regional differences in frame rate.

NTSC was gamma 2.2, and PAL/SECAM was gamma 2.8, which was indeed initially partly caused by local manufacturing differences before international brands took over, but neither "standard" was really followed by anyone. In the end, concluding that it was all a total mess, we split the difference in the early 90's by formally defining both to gamma 2.4 in BT.709. As such, their curves are the same.

(Manufacturing derivation was outside the scope, as manufacturers did whatever was convenient or sold sets, going all over the place with their response curves regardless of what region they were from or targeted. This remains true today - see any new TV's standard color response.)


> Apple does have some good counter arguments. Where there is data, there will be bad actors who want that data - and I trust Apple far more to behave than I trust some random shell company being run by some secret service.

As an EU citizen, sharing your data between Apple and Google which puts said data under free-for-all US intelligence access - which is known to have "questionable" habits and give basically no rights or insight as a to-them foreign citizen - is effectively trusting "some secret service".

To be clear, I am not one to fear using Apple for intelligence reasons, but not through a pretense that my data is safe from it, and certainly not because I believe it would be safer than using, say, a service based in Germany or France.

I'm more concerned with data brokers trading personal data out in the open, collected with minimal control from all the other apps and webpages we use throughout our day.


I get sad reading the "What EU users are saying..." messages. A lot of them seem to be Apple fans that just bought their evasive narrative.

I fully understand that some Apple users are perfectly happy with Apple's closed garden, but they must understand that its primary and almost sole purpose is profit maximization and the counter-arguments to opening up are purely about avoiding any risk to said profit.

While there could very well have been technical considerations to be had, all their answers to DMA have been lies, non-compliance and malicious compliance, as they have no intention of discovering whether their margins relied on the walled garden or not.

I personally suspect that the impact would be quite small exactly because Apple users tend to enjoy and stay within the Apple experience (myself included when I used Apple products - there is no harm in preferring that user experience), but they don't intend to risk parting with as much as a cent regardless of what benefits users, and will happily burn money lobbying against that.

They do not have your interests in mind here, and their way of maliciously presenting this such that you as a user will be bothered and blame regulation for their "inability to deliver" is very much lobbying 101.


Go's selling point is most definitely performance, but relative to implementation effort of a given application. This is opposed to languages that focus more on maximum performance at any cost to implementation, or maximum convenience at any cost to performance.


It's not the kind of performance expectations where processor-specific SIMD support really matters (although nice if it's there)

Go's performance goals are to have much faster runtime than any interpreted language and faster build times than most compiled languages. It's a reasonable stance.


Basically a political answer that answers nothing.

It isn't performance compiling, as that is only surprising for those that never used 90's compiled languages like Modula-2, Object Pascal, Clipper and co.

It isn't performance of code execution, as even GCCGO could beat the reference implementation, unfortunately now stagnant since no one cares to update it beyond Go 1.18.

And to go back to the article, as pointed out there,

> The Go toolchain does not currently generate any AVX512 instructions.

Thus leaving performance on the table.


> > The Go toolchain does not currently generate any AVX512 instructions.
>
> Thus leaving performance on the table.

This just misunderstands the documentation. Compilers do not generally emit much of any AVX512, and the docs are just stating that the Go toolchain itself never emits it at the current time.

Use of AVX512 instead usually comes from explicit intrinsics or handwritten assembly in the compilers you're thinking of, and it is the same in Go. There isn't much AVX512 code in the standard library, but it is there and the GOAMD64=v4 flag means that it does not have to use runtime CPUID feature detection.

External Go libraries like simdjson-go also actively use AVX512. AVX512 does nothing for code not written with SIMD in mind anyway.


I have seen such detailed and tidy whiteboard diagrams, but the catch is that they never occur in active discussion. It doesn't make room for scribbling, and stopping a discussion for 5-10 minutes to draw slowly and nicely doesn't make sense...


True, no one can understand my whiteboard drawings the next day, not even I.


Perfect operational security!


Being able to push prints and use the printer with direct local connection, while simultaneously having remote monitoring and remote printing when cloud/internet works and is available.

This is not the case of "wanting to have their cake and eat it too", as there is nothing mutually exclusive about these things. It requires no "emulation" or hacks - having a local API open to query state and push print jobs to the queue, while the printer connects to the cloud to publish state and pull the next job, presents no conflict.
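As a rough illustration of why the two paths do not conflict, here is a toy model of a printer-side queue fed from both sources. Everything in it is hypothetical (class name, method names, job names); it does not reflect any vendor's firmware or API.

```python
from collections import deque

class PrinterQueue:
    """One job queue on the printer, fed from two independent sources."""
    def __init__(self):
        self.jobs = deque()

    def push_local(self, job):
        # Job submitted directly over the local network.
        self.jobs.append(("local", job))

    def pull_from_cloud(self, cloud_pending):
        # Printer-initiated: take whatever the cloud has waiting, when it is reachable.
        while cloud_pending:
            self.jobs.append(("cloud", cloud_pending.pop(0)))

    def status(self):
        # The same snapshot can answer a local query or be published to the cloud.
        return {"queued": [name for _, name in self.jobs]}

queue = PrinterQueue()
queue.push_local("bracket.gcode")
queue.pull_from_cloud(["vase.gcode"])
snapshot = queue.status()
```

Both sources end up in the same queue, and the status snapshot is the same regardless of who asks for it.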

Ultimaker has a similar feature set and had full local/cloud simultaneous integration. The only thing you "lost" by pushing a job locally was that when viewed in the cloud portal, the mini 3D model preview in the queue was missing, and only because they never bothered making the cloud solution pull it from the printer for local jobs.

But then they also did like Bambu and killed local printing entirely, because they are all enterprise-only now and want to sell you their higher Digital Factory subscriptions.


Thanks for confirming.

> Being able to push prints and use the printer with direct local connection, while simultaneously having remote monitoring and remote printing when cloud/internet works and is available.

So isn't an obvious approach to just cut Bambu out altogether and just create a FOSS cloud alternative, supporting the remote aspects that the users want to retain?

> This is not the case of "wanting to have their cake and eat it too", as there is nothing mutually exclusive about these things.

Nothing technically mutually exclusive, but isn't this exactly the choice that Bambu is enforcing? Which is crappy corporate enshittification behaviour, but something they can do if they so choose? (I'm not arguing in their favour - just trying to fully clarify the situation.)


I used the spaghetti-detective plugin/add-on for OctoPi when I got my printer; they also hit bandwidth of video streaming over web (part of the "monitoring" area). They seemingly have been absorbed into "obico" (the GitHub remains github.com/TheSpaghettiDetective). Every 3D printer software has options to replace these Bambu Cloud features. The process involves a fair bit of deep dive understanding, flashing firmwares, and troubleshooting bugs, and then you could in theory use the same machines with all the Bambu Cloud features, in a local environment.

My only gripe with the community approach is, why not replace them rather than attempt to use ANY servers they have? Jeff cleverly highlighted that all the slicers originate from Slic3r, there is always a point before Bambu.


> So isn't an obvious approach to just cut Bambu out altogether and just create a FOSS cloud alternative, supporting the remote aspects that the users want to retain?

Yes, you can do this with HomeAssistant and other tools.

> Nothing technically mutually exclusive, but isn't this exactly the choice that Bambu is enforcing? Which is crappy corporate enshittification behaviour, but something they can do if they so choose? (I'm not arguing in their favour - just trying to fully clarify the situation.)

Yep. There's an argument that the method they chose (attempted takedown of a repo derived from their plugin) is an AGPL license issue. My guess is that they will switch to a more advanced authentication strategy than "a User-Agent in open source code" and the enshittification on that side will just deepen.

I think people are right to be upset that Bambu initially offered both sides (local MQTT and their cloud) and subsequently made customers choose one or the other, but I've used Bambu printers offline plenty (to the point that I had to do the research to figure out why people were annoyed in the first place) and they still work really well; they didn't really hamstring the Developer mode (for example, you can still use all of the fancy Bambu-y features, like reading filament spool status, accessing the video stream over RTSP, etc.)


What "ai" has to do with it is that he didn't have to write a scraper and a clothing style ("vibe") categorizer to build a database of entries to process in order to pick a shop. They just prompted the "ai" (I really don't know why you're putting that in quotes), and it in turn did that for them.

Was it a technically impressive effort from the prompter? No. Are the tools created in the session somehow a massive technical achievement? No. But was it a very useful result? Yes. It took the kind of task that would likely never get done otherwise, and turned it into the kind of thing that got done on a whim.

Doesn't mean that your laptop needs "AI buttons" though.


Ah so what they meant was like 'vibe coded' a scraper? I thought they meant something like turning descriptions/reviews/photos of clothes into embeddings, as in like sentiment classification but way beyond that? Because the latter would be somewhat cool if it's actually achievable (I doubt it is tho…)

(I mean honestly the project idea[?] they posted sounds like daydreaming some science fiction scenarios, otherwise with all the hype and investment around chat bots, this way of shopping would definitely be mainstream already. If it weren't daydreams, that is. But if my grandma had wheels, she would've been a bicycle, no…?)


You could turn clothing descriptions into embeddings and have a fashion vector database, but doing that would mostly just net you the ability to find adjacent clothing, rather than the ability to navigate available clothing or clothing fitting certain requirements.

What was done is more like using the LLM as a personal assistant that does long manual labor to find what you might be looking for.
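A minimal sketch of what "finding adjacent clothing" means in practice: a nearest-neighbour lookup by cosine similarity. The three-dimensional vectors and item names are hand-written purely for illustration; real embeddings would come from a model and have hundreds of dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Made-up "embeddings", for illustration only.
catalog = {
    "wool overcoat":   [0.9, 0.1, 0.2],
    "trench coat":     [0.8, 0.2, 0.3],
    "graphic t-shirt": [0.1, 0.9, 0.1],
    "linen shorts":    [0.1, 0.6, 0.7],
}

def most_similar(name):
    """Return the catalog item whose vector is closest to the named item's."""
    query = catalog[name]
    scored = [(cosine(query, vec), other) for other, vec in catalog.items() if other != name]
    return max(scored)[1]

print(most_similar("wool overcoat"))
```

A lookup like this answers "what is similar to this item", but nothing in it can answer "what fits these requirements", which is the distinction being drawn here.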

This way of shopping is already a thing. "Hype and investment" goes into how the companies can monetize AI harder (ads! integrated LLM shopping! business development! premium pro max enterprise data policies!), it doesn't really focus much on how the individual can save time and money through non-flashy tasks.


> You could turn clothing descriptions into embeddings and have a fashion vector database

Well, that assumes descriptions are extremely accurate down to the last seam, which is not true. You'd be better off considering reviews and photos, esp. user provided photos, you also need to take into account the model/s in the photos are not necessarily shaped the same as you, so you need to somehow counter that bias in training. This is simply not a task achievable with current ml techniques, however again, feel free to prove otherwise.

(and ftr, I'm of course making a basic assumption that we're past the topic of markov chain/'llm' based chat bots at this point? Those are completely irrelevant to the goal of categorizing clothes based on some characteristics [i.e. the so-called 'vibes'])


I have never seen an IR-based one in any store myself. Bluetooth, and possibly some proprietary RF setup, seems popular.

