Not the original author, but batching is one very important trick for making inference efficient: you can reasonably run tens to low hundreds of requests in parallel (depending on model size and GPU size) with very little performance overhead.
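A minimal sketch of why this works, using a toy linear layer in pure Python as a stand-in for a real model's forward pass (the "model" and sizes here are made up for illustration): the weights are read once per pass, so stacking B inputs into one pass amortizes that cost, and the batched outputs match running each input alone.

```python
# Toy "model": a single linear layer y = x @ W. One batched pass over B inputs
# produces the same outputs as B separate passes while reading W only once,
# which is the memory-bandwidth win that makes batched inference cheap.

def forward(batch, W):
    """One forward pass over a batch of input vectors (list of lists)."""
    return [[sum(x[i] * W[i][j] for i in range(len(W)))
             for j in range(len(W[0]))]
            for x in batch]

W = [[1, 2], [3, 4]]               # 2x2 weight matrix
inputs = [[1, 0], [0, 1], [2, 2]]  # three independent "requests"

batched = forward(inputs, W)                      # one pass, weights read once
sequential = [forward([x], W)[0] for x in inputs] # three passes, weights read 3x

assert batched == sequential  # identical results, a fraction of the weight traffic
```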
True! Devin Review doesn’t make the kind of judgements you mention, it just does its best to find bugs and help you understand the code faster. I managed to review a PR on an airplane (without starlink) with it earlier this week lol
Yeah, I wasn’t alluding to Devin reviewing and merging the changelog. This was more of a general statement, since a lot of code review tools seem to get this part wrong.
A lot of energy is being spent on making reviews faster, when reviews are intentionally meant to scale sublinearly. The goal should be: how can we make the process more convenient and less error-prone?
Woah that's actually huge. I've been very interested in tangled from an atproto perspective but I had no idea it had that as well. Wonder why that isn't talked about more. Seems like an amazing feature to potentially pull some people away from GitHub/GitLab after they've been asking for years for a better stacking workflow.
I've been going through a lot of different git stacking tools recently and am currently quite liking git-branchless[1] with GitHub and mergify[2] for the merge queue, but it all definitely feels quite rough around the edges without first-party support. Especially when it comes to collaboration.
Jujutsu has also always just seemed a bit daunting to me, but this might be the push I needed to finally give both jj and tangled a proper try and likely move stuff over.
I was scared to learn but then a coworker taught me the 4 commands I care about (jj new, jj undo, jj edit, jj log) and now I can't imagine going back to plain git.
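For anyone curious, a rough sketch of that four-command loop (the subcommand names are real jj commands; the workflow shown is just one way to use them):

```shell
jj new             # start a new change on top of the current one
# ...edit files; the working copy IS the change, no staging or `git add`
jj log             # view the change graph (change IDs and descriptions)
jj edit <change>   # jump back to an earlier change and keep editing it
jj undo            # roll back the last jj operation if you make a mistake
```

The lack of a staging area is the part that makes this click: edits land in the current change automatically, so `new`/`edit`/`log`/`undo` cover most day-to-day work.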
Obviously the working tree should be a commit like any other! It just makes sense!
Codeium (now Windsurf) did this, and the plugins all still work with normal Windsurf login. The JetBrains plugin and maybe a few others are even still maintained! They get new models and bugfixes.
(I work at Windsurf but not really intended to be an ad I’m just yapping)
Windsurf is at least 10x better than Cursor in my opinion... I'm honestly still puzzled it doesn't seem to get as much buzz on HN! I had to literally cmd+F to find a reference here and this is the only comment ;-;
I think I remember one of the original ViT papers saying something about 2D embeddings on image patches not actually increasing performance on image recognition or segmentation, so it’s kind of interesting that it helps with text!
> We use standard learnable 1D position embeddings, since we have not observed significant performance gains from using more advanced 2D-aware position embeddings (Appendix D.4).
Although it looks like that was just ImageNet so maybe this isn't that surprising.
They seem to have used a fixed input resolution for each model, so the learnable 1D position embeddings are equivalent to learnable 2D position embeddings where every grid position gets its own embedding. It's when different images may have a different number of tokens per row that the correspondence between 1D index and 2D position gets broken and a 2D-aware position embedding can be expected to produce different results.
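The fixed-resolution equivalence is just the bijection between 1D token index and 2D grid position; a quick sketch (the grid width `W` here is an illustrative value, e.g. a 224x224 image with 16x16 patches):

```python
# With a fixed number of patches per row (W), 1D index <-> (row, col) is a
# bijection, so a learned table indexed by 1D position can represent any table
# indexed by (row, col). With a variable W, the same 1D index maps to different
# grid positions, and the equivalence breaks.

W = 14  # patches per row for a 224x224 image with 16x16 patches

def to_2d(idx, width):
    return divmod(idx, width)   # (row, col)

def to_1d(row, col, width):
    return row * width + col

# Bijection holds for a fixed width:
assert all(to_1d(*to_2d(i, W), W) == i for i in range(W * W))

# The same token index lands on different grid positions under another width,
# which is why variable-resolution inputs break the correspondence:
assert to_2d(20, 14) != to_2d(20, 16)
```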
Yes! The retransmission logic in Linux NFS is independent of transport (see the `retrans` option in `mount.nfs`).
Weirdly enough, this also means that if you're running over TCP you can have retransmits at both the NFS/SunRPC level and the TCP level, depending on your configuration.
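A hedged example of where those knobs live (the option names are from nfs(5); the values, server, and paths are placeholders, not recommendations):

```shell
# retrans/timeo control SunRPC-level retries regardless of transport;
# with proto=tcp, TCP does its own retransmission underneath them.
mount -t nfs -o proto=tcp,timeo=600,retrans=2 server:/export /mnt/nfs

# Per-mount options and RPC retry counters are visible via:
nfsstat -m
```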