Making bracket pair colorization faster (visualstudio.com)
715 points by feross on Sept 29, 2021 | 265 comments


The original author of the bracket colorizer extension gets a great shout-out in this post and was also involved in the discussions of building the feature.

Moreover, he has stated that he was tired of maintaining the extension and seems to be fully in favor of this move:

> Author of Bracket Pair Colorizer here.

> I follow this thread with interest, my extension is something that grew a bit out of control, and I grew tired of maintaining it.

The comments here suggesting that the Visual Studio Code team was somehow wrong for making this improvement, or that they otherwise wronged the extension author, are not reflective of the actual process or even of the extension author’s own feelings.

The blog post does a great job of describing why bracket colorization isn’t appropriate for the plug-in architecture anyway. Making it fast can only be accomplished within the core of the editor, which has more direct access to syntax parsing.

This is a win for everyone all around. The Code team did a great job with this feature and the associated blog post.


We see a lot of situations where things sort of die or fade away when the author(s) just don't have time. This seems like the ideal situation where a highly capable team takes over and the author is happy to see it.


The single most popular extension for VSCode - the one that adds Python support (https://marketplace.visualstudio.com/items?itemName=ms-pytho...) - is somewhat similar: it was originally written by Don Jayamanne, and later acquired by Microsoft. Although in that case, Don was hired by Microsoft as well, and continued working on it:

https://devblogs.microsoft.com/python/don-jayamanne-joins-mi...

This can still be seen on GitHub - if you look closely, the official repo at https://github.com/microsoft/vscode-python is a fork!

Curiously, some bits of code went kinda full circle in the process - vscode-python reused the (open source) Python debugger written at Microsoft for Python Tools for Visual Studio.


>Curiously, some bits of code went kinda full circle in the process

"Oh I see why. .. yeah that's a better idea."

Happens to me all the time ;)


Not in that sense. The VSCode extension didn't fork the debugger in question - it took the original code verbatim, and simply wrapped it into a DAP adapter.

Ironically, when we rewrote the Python debugger later, we did the same exact thing with pydevd (https://github.com/fabioz/PyDev.Debugger), and for the same reasons - why reinvent the wheel when you can take an existing one that's already better than anything you have? It's also better for the ecosystem, since improvements all flow upstream.


Code going full circle is the basis for multiplying coding estimates by pi: https://news.ycombinator.com/item?id=28667174


Well, if the author ever reads this post, thanks for making an extension that I used every single time I installed VS Code. Was a no-brainer every time. Glad to see it implemented by the core team.



In video game terms, this seems like getting upset about Valve with Counter Strike. Started as a Half-Life mod, became something much bigger.


I don’t think the blog post does a good job of letting us know they communicated ‘hey, we gun make this core’ before they decided to do it (and release this post).


I agree, but I think at least some of the animosity stems from how the author might not have been able to do this themselves if they wanted to:

> Without being limited by public API design, we could use (2,3)-trees, recursion-free tree-traversal, bit-arithmetic, incremental parsing, and other techniques to reduce the extension's worst-case update time-complexity (that is, the time required to process user input when a document has already been opened) from O(N + E) to O(log^3 N + E), with N being the document size and E the edit size, assuming the nesting level of bracket pairs is bounded by O(log N).

just hurts to see after having been burned by Apple's private APIs blocking legitimate app developers from doing something Apple will release next cycle themselves, even if it was to everyone's benefit and a net positive
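For a sense of what those bounds mean: an extension confined to the public API essentially has to rescan the document on every edit, which is where the O(N + E) term comes from. A hypothetical JavaScript sketch of that baseline (the function name and palette are mine, not VS Code's actual code):

```javascript
// Naive O(N) bracket colorization: on every edit, walk the whole document
// and assign each bracket a color from its nesting depth. The core rewrite
// avoids exactly this full rescan by keeping an incremental (2,3)-tree.
function colorizeBrackets(text, palette = ['yellow', 'magenta', 'cyan']) {
  const decorations = [];
  let depth = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (ch === '(' || ch === '[' || ch === '{') {
      decorations.push({ offset: i, color: palette[depth % palette.length] });
      depth++;
    } else if (ch === ')' || ch === ']' || ch === '}') {
      depth = Math.max(0, depth - 1); // tolerate unbalanced input
      decorations.push({ offset: i, color: palette[depth % palette.length] });
    }
  }
  return decorations;
}

// Both brackets of a pair end up with the same color:
colorizeBrackets('f(a[0])');
```

Fine for small files, but the whole-document walk is what makes large files painful, and the public decoration API adds further overhead on top.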


> I agree, but I think at least some of the animosity stems from how the author might not have been able to do this themselves if they wanted to

What animosity? The animosity other people have on behalf of the author who has clearly said he has no beef with this series of events? If the author wanted to make algorithmic changes like the VS code team have done, I'm sure they would have been more than happy to discuss this on the open source issue tracker (which is very active).

> just hurts to see after having been burned by Apple's private APIs blocking legitimate app developers from doing something Apple will release next cycle themselves,

The big difference here being that the app developers are happy that this has happened, and they could have attempted to do this work because even though the APIs are private, the source is available and they readily take pull requests.


> The big difference here being that the app developers are happy that this has happened

I don't see why this would make a difference. Say I'm working on a project. There's a hole in my project. And you offer something to fill the hole.

But your action can't then bind me to refrain from fixing my project. Once my project no longer has a hole in it, you've lost a line of business. But you've lost that line of business because it solved a problem that was correctly fixed. The situation now is obviously better than the situation before. If I needed permission from you to fix my own thing, how could that conceivably improve anything?


Yes, but I feel like extensions are not considered "apps" in a way that would make people feel they should be able to write anything they want to change the editor. That way lies madness.


> That way lies madness

That way lies Emacs ;)

That's the key difference between Emacs and most other editors: there's no limited API for extensions. There's a base C runtime, and all the lisp code on top is at the same level. There is no difference in access rights between a core Emacs package (coming with Emacs) and an additional, user installed package.

Of course, there is a lot of documentation, plus conventions and best practices, to support this. And all the code is accessible.


This article is especially interesting to me, as it shows how VS Code still doesn't have the "Emacs nature". Even though I'm a 30-year Emacs user, I do hesitate to recommend it to younger programmers because it's so alien, and VS Code has one of the essential characteristics of Emacs: the extension language and the implementation language are the same. But this article is a great example of how that alone isn't enough: extensions are limited to an extension API, rather than having full access to the application's internals. Maybe a good thing, if you're a mass-market product worried about malicious extensions. But I'll note that [rainbow-delimiters-mode](https://github.com/Fanael/rainbow-delimiters/) dates back to 2010, and has never noticeably slowed down loading or display of source files, even in languages with lots of delimiters like Lisp.


I think that it's not about malicious extensions. It's about compatibility. If there's no public API, then all API is public and any code change will break something. So you either break extensions or don't change code at all. With public API you can change code while keeping public API working. And if something must be changed, the damage is limited and controlled.


The Emacs maintainers seem fairly conservative about changes to the Emacs runtime, but they have managed a number of extensive changes. It even has JIT compiling now.

Partly this is because the nature of the Lisp language makes these changes easier, but it is also the case that many “extensions” are actually included with Emacs. There are over 1.5 million lines of Lisp code included in the Emacs repository, though most of them are not enabled by most users.

Other extensions come from a wide variety of sources (and of course many users write their own code), but over the last 10 years most of them have been moved into installable packages hosted on elpa.gnu.org. There is a little over a million lines of code there.

It takes just a couple of minutes to check out both git repositories (for Emacs and ELPA), so any time you want to change something in Emacs it is quite easy to search all of that and find out exactly what, if anything, you might break.


That's true in principle. In practice, I've seen Emacs undergo non-trivial changes and yet none of my personal Elisp code has ever broken. I'm not sure how that happens—perhaps the core abstractions are sufficiently simple that they don't need to change much themselves and only higher-level things change, so code written against the core abstractions doesn't get broken.


VSCode being open source, I wonder if you could build an ecosystem of extensions that don't use the extension API but have full access.


I gotta say, "That way lies madness" == "That way lies Emacs" gave me a good chuckle. But your point is good: there are ways to do it, but VSCode has already started down the road of security first, and the Emacs model would certainly not function in a sandbox.


This is similar to the original Firefox extension model: open access to everything. It later made internal changes harder, since internal stuff was exposed to extensions. But it did allow powerful things like Firebug, which could not be built with today's modern, more secure locked-down APIs.


This is similar to the original Firefox extension model…

(Ugh, the way autocorrect works sometimes..)


What happens when multiple packages want to modify the same code?


I have only the slightest passing knowledge of ELisp, but I believe it offers Aspect-Oriented Programming (AOP) facilities; most basically, chained decorators (i.e. replacing a function with your proxy for that function, where your definition of the function gets bound dynamically to the current definition of the function at eval-time; so the call your function makes to the “inner” function, could just be to another proxy wrapper function someone else already inserted.)


That is the most complicated description of it I have ever seen, but it is accurate.

In total there are ten ways to combine new advice with the existing set of methods, but the most commonly used are :before, :after, and :around. All the :before functions are called first, then the outermost :around function (which may or may not call the next :around function), then finally the :after functions.

Also it is common to define explicit hooks, which are just lists of functions that you will call at documented points. This is functionally identical to advice, except that it is also a good signal that the author intended you to do so.

The terminology is interesting too. Advice is deliberately modeled after the generic method combinators in Common Lisp. One of the authors of the Common Lisp spec, Gregor Kiczales, went on to define the term “Aspect–Oriented programming” and later developed AspectJ. Although the phrase appears nowhere in the documentation for Emacs Lisp, it is definitely appropriate!
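For readers who haven't seen it, the :before/:around/:after combination described above can be sketched in a few lines of JavaScript. The `advise` helper and its option names are made up for illustration; Emacs's real entry point is `advice-add`:

```javascript
// Illustrative sketch of Emacs-style advice combination, not a real API.
// All :before advice runs first, then the outermost :around layer (which
// decides whether to call inward), and finally the :after advice.
function advise(fn, { before = [], around = [], after = [] } = {}) {
  return (...args) => {
    before.forEach((b) => b(...args));
    // Fold the :around list so the first entry becomes the outermost layer.
    const core = around.reduceRight(
      (inner, a) => (...as) => a(inner, ...as),
      fn
    );
    const result = core(...args);
    after.forEach((a) => a(...args));
    return result;
  };
}

const greet = (name) => `hello, ${name}`;
const advised = advise(greet, {
  // An :around layer can rewrite arguments before calling inward:
  around: [(inner, name) => inner(name.toUpperCase())],
});
advised('world'); // "hello, WORLD"
```

The original `greet` is untouched; the advised version is a new function that wraps it, which is what makes the layering composable.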


I'd love to hear a more-concise description of what I just flailed around trying to describe; I'm always in the market for pithy explanations :)


Hmm. I don’t think concision is really the right way to go; below a certain length a description just becomes less precise and useful rather than better. I think that my own description should have had another paragraph or two, looking back at it.

Or maybe we could both be more concise by just telling people to go read The Art of the Metaobject Protocol.


Usually packages don't modify code in core Emacs or in other packages. Often, they use package-supplied "hooks" to add behavior to other packages. Failing that, there's the possibility of 'advising' functions to add/change behavior at certain points without actually monkeypatching. I've never seen a distributed package actually monkeypatch anything, though it is something you could do in your personal config.
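Hooks are simple enough to sketch in a few lines; this hypothetical JavaScript version mirrors Emacs's `add-hook`/`run-hooks` pattern (all names here are illustrative):

```javascript
// A hook is just a named list of functions that a package author promises
// to call at a documented point, in registration order.
const hooks = new Map();

function addHook(name, fn) {
  if (!hooks.has(name)) hooks.set(name, []);
  hooks.get(name).push(fn);
}

function runHooks(name, ...args) {
  for (const fn of hooks.get(name) ?? []) fn(...args);
}

// Two unrelated packages extend the same documented point without
// touching each other's code or the core:
const log = [];
addHook('before-save', (file) => log.push(`trim whitespace: ${file}`));
addHook('before-save', (file) => log.push(`reformat: ${file}`));
runHooks('before-save', 'init.js');
// log: ['trim whitespace: init.js', 'reformat: init.js']
```

Since a hook is an explicit extension point, it signals intent in a way that advising an arbitrary function does not.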


Yeah and even in personal config one could use el-patch[1], to make monkeypatching future proof.

[1] https://github.com/raxod502/el-patch


You know, I haven't thought much about this. I use a lot of third-party packages and have a lot of my own custom code, but things almost never interfere. In the few cases that crop up (for example, awkward interactions between org-mode's completion and the completion plugin system I use), it was so easy to add some code to my .emacs file to fix that problem that I had no real problems with it.


This is the only viable extension/modding architecture.


Tale as old as time:

VSCode vs Atom

WebExtensions vs XUL

Mod API vs patching game files

Vim vs emacs


Oddly, I don't know anyone who ever did more than demo Atom...


I used Atom before VSCode. I'm still in love with its minimalistic design, and somehow font rendering in Atom is better than in VSCode (still not sure why; I tried playing with settings and such, but nothing changed). It was the first prominent Sublime Text alternative built on JS, which I used every day: I quickly wrote my own extensions, hacked on it, and played with it. There were even interesting experiments like a Rust-based rendering engine[1] that drew text directly with graphics APIs instead of HTML and the DOM (the same old problem of making text rendering fast with web tech).

In my opinion, the ONE big thing that killed Atom was LSP[2]. VSCode was slightly better in everything, but good performance and LSP destroyed Atom and damaged all other editors (like Sublime).

But MS plays dirty here too. In VSCode, LSP works with many internal hacks. You can't get the same experience with LSP in other editors, because some features are part of the editor, not just LSP. But I think LSP is excellent and use it in Emacs and Vim too. For Rust, the language server is maintained by the Rust team and is the default "engine" for editors.

Anyway, I'm still using Atom today with a minimalistic theme and setup to edit markdown files with code and as a text editor with a friendly GUI overall.

1) https://github.com/atom-archive/xray

2) https://langserver.org/


VS Code is an amazing piece of technology, despite all the "Electron hate" (although changing to WebView2).

I recently commented on it here:

https://news.ycombinator.com/item?id=28556588

Microsoft engineers are top-notch and this is another feature I'll be using daily. Good stuff!


VS Code is the benchmark of what can be done with Electron - if you really, really care. But it's an extreme outlier.

The hate for Electron comes from how the average Electron application works. Not only does almost nobody care as much as the VS Code team does, but the very choice of Electron is usually itself an act of not caring.


But this is also why simply switching to OS-native APIs and compiled languages wouldn't help much for the average case. A team that doesn't care about performance in Electron also wouldn't care about performance in native applications, and performance isn't some magic pixie dust that automatically appears when choosing a different UI framework or programming language; it needs to be actively worked towards (some YMMV of course).


Ehh... there's more than a little YMMV: every ecosystem has common, least-effort paths with certain performance characteristics and those characteristics vary greatly depending on language.


> performance isn't some magic pixie dust that automatically appears when chosing a different UI framework or programming language

That's a little misleading. If you write the same program in idiomatic C++ and Python then it's almost guaranteed that the C++ version will be much much faster even before you have done any profiling or performance optimisation. So there is some magic pixie dust.


For a CRUD-style app, the difference in performance between Python and C++ is going to be absolutely negligible. The Python version might use more resources, and if you profile it, it might show a couple of hotspots, but you're still going to have a responsive desktop app if it's implemented properly.

The slowdown is architectural or design based. Sending one HTTP request and not updating your UI until the request has completed fully is going to have _way_ more of an impact on the perceived performance and responsiveness of a native app.


C++ doesn't give you asymptotic big-O algorithmic superpowers. No language does.


The average electron program has very little data to work with and not a lot of hard algorithms to run, yet still feels sluggish, that's the main complaint here


It's because of work done on the main thread when it should be done using Workers. It's more about a lack of proficiency with UI application architecture itself. Block the event loop in another language and you'll also get a laggy, unresponsive UI.


There are probably a lot of different reasons why Electron apps can get laggy and unresponsive when sloppily written.

However, I notice a significant difference in responsiveness between VS Code and Sublime Text -- enough that I changed my workflow back to Sublime Text because the very slight latency difference annoyed me. So I do think there is a baseline difference between the frameworks used by those two apps that no amount of optimization can overcome. It's sort of like the acceleration difference between a truck and a sedan: sure, powerful trucks can sometimes out-accelerate anemic sedans. But if you put a similar drivetrain in both vehicles, the vehicle with less mass (or memory footprint, in the framework analogy) is going to win.


Stacking abstractions is a great way to give yourself Big O problems. Fighting the height of the stack reduces them, and "Use C++" does tend to fight the height of the stack.


C++ doesn't fight the stack more than any good JIT. In fact, it may be less capable of noticing ways to inline functions, given that it's a statically compiled language. Its only current advantage with the stack is tail call optimization, which is coming to V8 very soon.

Funny that the C++ guy is talking about abstractions, where them V-Tables at?


They talked about the stack of libraries beneath you, not the execution stack. If an API call takes 10 microseconds because of several "premature optimization is the root of all evil" abstractions in between, then doing that just 10k times already nets you 100ms, which is a very noticeable stutter. So with that limitation you are now forced to create elaborate data structures with caching etc. to try to work around this slow API call. Doing the same querying in C++ without those abstractions, where each call takes 10 nanoseconds, means you no longer have to create complex data structures to work around the slow API call; even if you do it a million times it would only take 10 milliseconds and maybe drop a frame.
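The arithmetic in that comment, made explicit (the per-call costs are the commenter's hypothetical figures, not measurements):

```javascript
// 10k calls through a 10 µs abstracted API vs. 1M direct 10 ns calls.
const abstractedCalls = 10_000;
const abstractedCostMicros = 10; // 10 µs per call through the abstraction
const directCalls = 1_000_000;
const directCostNanos = 10;      // 10 ns per direct call

const abstractedTotalMs = (abstractedCalls * abstractedCostMicros) / 1_000;
const directTotalMs = (directCalls * directCostNanos) / 1_000_000;

console.log(abstractedTotalMs); // 100 (ms): a visible stutter
console.log(directTotalMs);     // 10 (ms): within a 16 ms frame budget
```

So a 1000x cheaper call more than absorbs doing 100x more of them, which is the point about abstraction overhead forcing workaround data structures.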


Nowadays JS is as fast as C++, if not faster.


Let me respond for him. He meant abstraction as in "stack of dependencies." Because everyone knows that's what abstraction means, right?! Because we all know NPM is a disaster as opposed to C++ dependency management which doesn't even exist in a standard form. /s


Microsoft has a C++ API you use to make programs for Windows. If you write in C++, you will code directly against that. If you write in Javascript, then you will code against someone else's API that they wrote to interface with the Microsoft API. If you write in Javascript another step up, then you will code your plugin against the VSCode API, which then uses Electron's wrapper around Microsoft's API to do stuff, if there aren't more libraries in between. This stack of abstractions turned out to be too slow to solve the problem mentioned in the article we are talking about, so to solve it they moved the entire thing up one step in the stack.

In these situations it is common to have a problem that is really easy to solve, but the API abstractions you have to work with don't support the operations at the speeds you need to solve it. If you have never experienced that, then you aren't working on performance-intensive projects where every bit counts, and your input on this topic doesn't matter. I have worked on performance-intensive libraries crossing programming-language boundaries, and the abstraction boundaries absolutely put a limit on the amount of performance you can get. Well-designed abstractions are less obstructive, but plenty of them aren't well designed, and even the well-designed ones have overhead.


Microsoft has a COM API. C++ has a stack of macros and abstractions for communicating with a COM API. If you code in C++ you code against an abstraction over an interface to the Microsoft API, but not the API itself.

To get rid of abstractions you almost should be writing more directly in some sort of COM+ language.

(To explain the punchline for those that don't catch the joke: COM+ was the codename/early name of what became .NET.)


Microsoft has a COM API but when I used to develop, I'd just call kernel32, user32, advapi32, and other system C APIs directly. COM is a POS imo. DirectX is a decently engineered class-based API. But the rest of them have a lot of flaws.

The funny part is OP is talking all about uOPS when if he was really hardcore (like I am) he'd know that these dlls in turn call ntdll, many calls in ntdll are undocumented but much faster than their wrappers in the other dlls. But no sane person is going to strictly do ntdll calls except for the most performance critical code.

OP just doesn't know enough about C and C++, he probably grew up on C++ and forgot about the old C apis. I used to reverse engineer and delve deep into the windows API. I know a little bit more about performance than the average high level programmer.

And ultimately, .NET does a fine job with performance. C++ coders crapping all over .NET should take a look at the Objective C API of Apple. It's the default and every Objective C call incurs overhead and is basically a wrapper around the undocumented C API. But I don't think anyone ever complained about this abstraction, because it's such a stupid and small amount of performance to harp about. The convenience outweighs the tiny little uOPS loss.


Ah, so this is all coming from a Microsoft C++ guy. Well better get to using those undocumented ntdll calls since you really need those uOps. It's not like the standard libraries of other languages is written in C or anything... ;-)


JS is almost close to the metal nowadays thanks to the stunningly optimized JIT.


Can you initialize a kernel with JS?


Of course. Have you not seen the VMs written in JS?

https://bellard.org/jslinux/


That is a VM running in a web browser. If I was thinking about a joke that was so comically far away from the metal, that’s pretty much what I would use as an example.

This is why it’s very difficult to have conversations. Most programmers really don’t even know how computers work.


Heh, I know plenty about how computers work. You didn't specify what hardware you wanted targeted. Emulated devices are just abstractions.

Here is a real kernel: https://news.ycombinator.com/item?id=7958194

It's so difficult to have conversations with engineers that don't know x86 assembly. They never stop complaining. If only I could NOP them! ;-)


Claims that JITs can produce code that's routinely faster than C++ have been around since Java. So far, the promised gains haven't really materialized. It appears that optimizations that can be gleaned from dynamically profiling a running program simply don't provide enough benefit to account for optimization opportunities lost because the JIT has to be "fast enough", and from language semantics that is inherently hard to compile to fast code (of which C++, quite intentionally, has little). V8 is not an exception.

Your specific example - inlining functions - is not particularly illustrative, since C++ can inline just fine across translation unit (and thus also static library) boundaries with link-time optimization. What it can't do is inline across shared library boundaries, but large C++ apps are usually mostly statically linked for redist anyway.


Except Windows loves COM, and it is all about DLLs and out-of-process IPC.

Those gains have materialized in distributed computing, where the 1% cases where C++ wins in micro-benchmarks don't really matter, when network latency, databases, load balancers and the whole lot come into play.


Java is not V8, it isn't Dalvik either. Java is a memory hungry turd that never delivered, agreed. But let's not forget that Google made their own runtime for Android and invested millions into V8.

https://news.ycombinator.com/item?id=17648139

> What it can't do is inline across shared library boundaries, but large C++ apps are usually mostly statically linked for redist anyway.

I'm guessing you've never looked inside System32 on Windows or /usr/*/lib on Unix. I don't see shared libraries as a bad thing, they are great for reducing disk and memory usage. Let's not make bullshit up about how most applications are statically compiled with all their dependencies ;-)


Java is not V8, of course, because one is a language, and the other one is a VM.

Java is also not JS. Semantics of Java are much better suited to effective compilation than that of JS, due to static typing and (in many cases) early binding. Consequently, modern Java VMs have the best JITs in the industry - faster than V8 - and they're still not on par with C++.

Most applications dynamically link to system libraries. On Linux, this is kinda fuzzy because package managers handle everything; and yes, I agree, on Linux the norm for distro-packaged software is dynamic linking. But stuff packaged as Flatpak etc is much more likely to be statically linked to anything other than libc. And idiomatic C++ tends to involve lots of templates, which are inherently "statically linked".

(Also, native code is broader than C/C++ - it includes e.g. Go, which drops even the libc dependency, and Rust.)

On Windows and macOS, though, where "app is a self-contained folder" has been the rule rather than the exception for a long time now, dependencies that don't come from the OS are often statically linked.


The JVM is hot garbage. It is slow to start up, a memory hog, and, like you mentioned previously, overabstracted; no amount of magic unboxing of primitives like int helps its performance. There are hundreds of different parameters one can set to control garbage collection and other performance characteristics. It's a job in itself dealing with the beast that is the JVM. YMMV, but every time I've dealt with Java I was disappointed. I'd much rather use C++, Julia, or pretty much any other AOT or JIT than Oracle/Sun's JVM.

You keep saving face. So you admit Linux is largely dynamically linked. Well Windows is too, even to C stdlib, look at how many versions of MSVC runtimes are in your system32 dir after just a few installs. Templates are preprocessor so obviously they are "statically linked."

Microsoft has a COM API (which a lot of C++ uses) but when I used to develop, I'd just call kernel32, user32, advapi32, and other system C APIs directly. COM is a POS imo. DirectX is a decently engineered class-based API. But the rest of them have a lot of flaws.

If you were really hardcore you'd know that these dlls in turn call ntdll, many calls in ntdll are undocumented but much faster than their wrappers in the other dlls. But no sane person is going to strictly do ntdll calls except for the most performance critical code.

I used to reverse engineer and delve deep into the windows API. I know a little bit more about performance than the average high level programmer.

And ultimately, .NET does a fine job with performance. C++ coders crapping all over .NET should take a look at the Objective C API of Apple. It's the default and every Objective C call incurs overhead and is basically a wrapper around the undocumented C API. But I don't think anyone ever complained about this abstraction, because it's such a stupid and small amount of performance to harp about. The convenience outweighs the tiny little uOPS loss.


JVM is a hog, but when it comes to raw compute perf, it's a very fast hog once it starts running. Do you have any examples of anything better (post-startup)?

Yes, I'm well aware that many system DLLs on Windows in turn call into NTDLL, where the actual syscalls are. And yes, I agree that it hardly matters in practice - but it was your premise that inlining across shared object / DLL boundaries is crucial! In practice, yes, it almost never is. And yes, .NET is perfectly fine perf-wise, and even JS is fast enough for most cases. I've actually spent most of my career writing C# and Python, after writing a bunch of C++, and I very much appreciate the productivity gains those abstractions offer.

But this is a very different point. Native code is still measurably faster where it matters, and JS/V8 can't keep up for very good reasons.


Well I guess we are finally in agreement. Right, there are few places where it matters these days.


So far, JS has beaten C++ in speed on a lot of topics.


At the very least, not in writing JS VMs it seems...


Please name three.


TCO is coming to V8? Is there a source for this?


We're talking about small, quick operations supporting a synchronous, interactive user interface here. When it comes to performance, asymptotics aren't everything.


JS has Workers. VS Code uses them wisely. Other Electron apps? Not so much.

Block the event loop in other languages and you'll also see shitty results..


Have fun implementing all the overhead just communicating with those workers. That's a serialisation pass, a deserialisation pass, and two event queues, all just so your application doesn't lock up.


https://developers.google.com/web/updates/2011/09/Workers-Ar...

For 10 years an ArrayBuffer has been able to be sent by reference but good effort on disinformation.


You still need to serialize/deserialize to put your object graph into the array, no?


Not if you use WebAssembly. You can simply pass an ArrayBuffer to be worked on; you postMessage it with a transfer list.

  transfer Optional
  An optional array of Transferable objects to transfer ownership of. If the ownership of an object is transferred, it becomes unusable in the context it was sent from and becomes available only to the worker it was sent to.

You can compile JS itself or pretty much any other language to WASM using Emscripten or the other LLVM toolkits. Looking at VSCode, it appears they use this technique for some of the heavy lifting.

JS is very performant if you know how to use this hybrid architecture. The C++ guys above are shitting all over JS when an Electron app can simply use C++ transpiled to WASM if they really wanted to. Electron isn't really for JS as a coding language. It is much more for the awesome cross-platform UI you get with HTML, CSS, and JS. A lot has been done to optimize rendering engines. The same techniques that render snappy web pages can be used within Electron. All the griping above is really just ignorance.

https://github.com/microsoft/vscode/search?q=wasm

https://github.com/microsoft/vscode/search?q=arraybuffer

https://developer.mozilla.org/en-US/docs/WebAssembly/Using_t...

https://developer.mozilla.org/en-US/docs/Web/API/Worker/post...


WASM isn't Javascript. WASM is a way to represent native code in Javascript that some Javascript runtimes then can use to run it as native code. If the Javascript runtime hasn't implemented a hack to run WASM as native code then WASM is ridiculously slow. If the VSCode team writes their code in C++, compiles it to WASM and then runs WASM and it is fast, then it was C++ that was fast and not Javascript.

If that is really how they got their performance then no wonder that few others actually managed to do it, because most teams wouldn't write their Electron app in C++ and then compile it to WASM. The difference is that Microsoft has a ton of C++ engineers so they could do it easily, but I doubt many Electron teams put out job postings to hire C++ people.


That's like saying if you use the _asm directive in any language, your entire project is now assembly. Subtlety is not your strong suit. One can use WASM for manually memory-managed critical paths while using raw JS for other parts. And you did not hear me: you can write WASM in JS. Transpiling. Amazing, huh?


And how fast is JS compiled to wasm? Are you claiming that it's also as fast as C++ compiled to wasm?


Yes, Typescript to WASM is just as fast as C++ to WASM.


Why use C++ when JS is faster? VSCode proves that. It's orders of magnitude faster than old VS, has more features, uses less resources, etc.


VSCode does not prove that at all. The main reason why it feels faster is because it's written from scratch to be async. VS is a legacy codebase going all the way to 1997 (if not earlier, in places), with lots of code still running on UI thread, and all extensions running in-process.


Let's wait thirty years for the legacy to accrue and then compare weenies, shall we?


[flagged]


Ah, I see. I appreciate the effort but maybe try being subtler, I was almost mad until you overdid it.


The big thing they don't tell you in CS class is that constant factors matter far more than asymptotic complexity most of the time. (There are a few exceptions though.)


a text editor (especially for code) is a good example of a case where both matter. it doesn't take much latency to make typing a very frustrating experience. the common case (editing small files) needs to be very fast. but we also need to gracefully handle very large files that involve nontrivial processing (10kloc+ c++ files do unfortunately exist). I gave up on atom several years ago when it slowed to a crawl opening a 4kloc c file. vs code can handle many multiples of that without breaking a sweat, so it's my current first choice.


Not in the case of a text editor. But feel free to hand-code your editor in assembly.


Not even Rust?


Perhaps if one truly transforms into a Rustacean.. ;-)


The base performance of a native app in most operating systems is higher than the base performance of your average electron app though, in my experience.


You're not wrong, but what about feature sets? The only native IDE I can think of is Xcode; the rest are either big Java ones or are lacking in features.


Yeah there's definitely some YMMV. An efficient baseline does continuously yield performance benefits, even if inefficiencies are layered on top.


It absolutely would help. It's practically tautological to say that you can create bad performance in any language, but that ignores the very real fact that some languages and systems just perform better for any given coding skill level. With native apps, you have to be extremely bad at coding to get bad performance. With electron, you have to be extremely good at coding to get good performance. VSCode is the exception, not the rule.

Even then, it's a pretty poor exception. The "good" performance of VSCode doesn't scale. It's a pretty barebones editor by itself, and once you load it down with extensions, it slows down quite significantly. The python extension, also written by Microsoft, is one that was bad enough to make me leave VSCode for good.


> the very choice of using Electron itself is usually an act of not caring.

Hmm, not sure I buy this; sure, it's not as fast as going native on every platform. But let's be honest: the alternative isn't 3-4 native applications built with care for each special little subgroup. It's a macOS-only app in the US market or a Windows-only app everywhere else.

Businesses don't have an unlimited amount of money to spend addressing every tiny market so I'd say the complainer's focus should be on building a better X-plat story if they really care rather than whinging that some application isn't built to make the most of the 0.1% of the addressable market they find themselves in.


Electron alternatives aren't platform specific, they're other xplat toolkits like JavaFX, Swing, WxWidgets, Qt, JUCE, Flutter...


Of those, Qt is the only one with acceptable performance and ootb visuals.


Here is a Java/C++ audio workstation software,

https://www.bitwig.com

And here is a must read book to do great UIs in Swing,

http://filthyrichclients.org

Here are examples of Java applications based on Microsoft's Fluent UI design,

https://www.pixelduke.com/java-javafx-theme-jmetro/


Performance is fine on all of these compared to Electron. Every non-native toolkit needs a lot of work to look good. But it's usually less work than native on every target.


Swing is hardly slower than Electron.

And what's unacceptable about wxWidgets performance?


Unless you need to run on mobile, Qt is the answer.


That’s the dream, and it stays a dream most of the time. Business concerns pop up again when it comes to building out teams for Qt based desktop apps.

Electron comes with an enormous, deep, absolutely stupid big pool of web developers. I genuinely can’t emphasize enough how big it is, and how easy it is to hire from.

Montevideo, Montenegro, Monterrey, doesn’t matter. Hot, not particularly expensive developers are ready to churn out React UIs in your area!


> Electron comes with an enormous, deep, absolutely stupid big pool of web developers.

Looks at the web.

As Linus Torvalds put it, it's worth writing the whole thing in C for no other reason than to prevent those 'developers' from 'contributing' to it.


Qt runs like shit compared to Electron, and its license prices are way too expensive. Moreover it's C++ so a deprecated, unsafe, un-cool and uninteresting dying language no one wants to write anymore.


Really? Make a fully featured modern DE/WM/Compositor in Electron that fits into 128MB of RAM and runs at 60fps on a Pentium III in daily usage then, with a suite of apps.

Or better yet, make a web engine from scratch, that is, write the rendering engine, compositing engine and so on using Electron primitives.

Because both of these work very well when using Qt as a base, see LXQt and KHTML.


You do know very well that all of this is going to happen someday. Everything will be re-written in JS just because it's possible. And numbers will show that those solutions beat C++ in speed and resource usage, like VSCode does.


What does VS Code beat in speed and resource usage? Be fair and compare it to other software with only colourization, Git support and extension support.

JS can get close to C++ in speed but the same program in JS will always need more memory.


I doubt that. C++ programs always have tons and tons of memory leaks.


Which you can fix if you're competent. If we're looking at bad programs, JS programs are famous for importing 15MB+ of code for simple functions.


While using a runtime written in C++.....


In a few years or so, the runtime itself will be in JS. It's unavoidable, this is the way things are.


Until JavaScript has the language features to be able to bootstrap itself, that won't happen.


As long as JS is jitted/interpreted at runtime, I don't even know how that would work!!


Correction: LXQt uses Openbox (not Qt) for windowing, and offers a choice between no compositor (by default) or one like picom (not Qt) or KWin (Qt Quick, JS-based, I wish it wasn't JS-based but it's too late to change that).


Qt Quick is not based on JS. JS is merely an option for Qt Quick.


That must be why WinUI is written in C++, Metal uses C++ shaders, or CUDA hardware is designed according to C++ semantics, then.


Let's just wait for JS bindings for all these things and let's see what competent people will use. You already know the answer, do you? Yes, it'll be the JS bindings.


Someone has to write those bindings, and they won't live forever.


As per parent post, C++ it is not cool and hip anymore, so it won't keep CADT programmers interested long enough to get the work done.


Well, 99% of most programming languages fit that as well.


That's why according to the parent, everything is going to be rewritten in JS very soon! /rollseyes


> The hate for Electron comes from how the average Electron application works.

A lot of it, given the comments I see in relevant threads, comes from the inefficiency of every electron app having its own copies of node and chromium in memory (and on disk, and being transferred over the network when installed/updated, though those are smaller issues than run-time performance). Though that is unavoidable given how it works unless particular LTS versions can be enforced so everything is tested against those making sharing easier/possible. I'm not sure how much of an issue it really is anyway: how many people are running several instances of Electron applications at once?

Also I get the impression that a fair amount of it comes from people who are parroting what appears to be a popular sentiment, without actually understanding or having skin in the game! This says something disappointing about technical communities.


Electron apps running at the same time: VSCode, Slack, Discord, Signal, Bitwarden, Teams... Most of the apps are made with Electron now. It's an exception when something isn't made in web technologies, nowadays.


emacs has been a usable and relatively fast editor built on top of a fairly slow interpreted language. In comparison, JS and V8 should be significantly faster. As you say, the issue is usually the layers upon layers of sometimes gratuitous abstractions built on top of some applications.


> built on top of a fairly slow interpreted language

Correction: built on top of a bytecode-compiled language that, a few years ago, became a native-compiled language (both AOT and JIT) - initially as an experiment/optional feature, and as of ~5 months ago, this has been merged into the main repo branch.


Of course; I'm a happy user of native-comp as well. Still, for most of its history emacs got by with a not exactly state-of-the-art interpreter.


emacs also has the advantage of 40 years of existence, and perhaps most importantly for performance: the ability (requirement?) to choose exactly what functionality you want to load.


VSCode did more in a few years than emacs did in 40 years.


Perhaps it’s because Electron opens the door to a ton of applications that wouldn’t have existed before, so the average skews.

A lot of people complain that choosing Electron is an “act of not caring”, but nobody actually does anything about it. Last I checked, Electron didn’t make other UI kits go away.

I’m not without criticism for Electron. But I begin from a position of good faith on the subject, and appreciate that a lot of these apps simply wouldn’t have existed otherwise.


What did Microsoft specifically do in case of VS Code to make it so well even with Electron?


Even with Electron ?

They put a lot of thought into it like you should when designing any application for any platform. They also have a team of very competent senior developers who have previous experience with developing IDEs like Eclipse.


They use the WASM runtime instead of just running Javascript.


And even VSCode had issues like https://github.com/microsoft/vscode/issues/22900


I don't use VS Code, but I like having it installed; I'll occasionally launch it just to watch it update itself and/or its extensions, and enjoy the dopamine hit that comes from the satisfaction of getting my software up-to-date.


This is the most hacker-news comment I have ever seen.


And yet Electron still doesn't have native file functionality on macOS titlebars.


Wait, is the hate transitioning to WebView2 or is VS Code? If the latter, do you have a source? WebView2 is a Windows-only API, so I don't think it'd be a good match for VS Code.


WebView2 will be coming to macOS then Linux in the not-so-distant future: https://github.com/MicrosoftEdge/WebView2Feedback/issues/645...

I'm not aware of any public plans to move VS Code to WebView2, but I would be surprised if it doesn't happen eventually.


I’ve always wanted something like this extension, but where the brackets aren’t colorized, but rather change size depending on their nesting level, with the outermost brackets becoming increasingly large as the total nesting level of the expression increases — exactly the way brackets do in maths.

Ideally, unlike in TeX, additions to nesting levels wouldn’t require a re-flow (i.e. typing a bracket wouldn’t make the code constantly shimmy around on the screen), but rather these would just be changes in the perceived size of these characters, while keeping them where they are on a monospace grid (since a lot of code assumes a pure monospace layout for indentation et al.)

In my imagination, this would work by just having a set of bracket/paren/brace/etc. graphemes in each font, that have increasingly-long ascenders/descenders, while the base size of of the “functional” part of the grapheme remains constant. As if there were five versions of the letter g, with an increasing length of the stem on which the lower hook rests.

As such, these magnified brackets would be rendered similarly to the way emoji are rendered in terminals: the grapheme would eventually “leak out” of its monospace box (in the emoji case to the sides; in this case above and below), to the point of eventually overlapping surrounding graphemes, rather than re-spacing the text to give it room.


Using the "Custom CSS" extension you can do this. It looks like the bracket colorization is applied via CSS classes:

bracket-highlighting-1 ... bracket-highlighting-6

unexpected-closing-bracket


The challenge here would be that nesting levels can be added inside or outside the previous level, so to prevent reflow, bracket size would be dependent on order of addition and could vary wildly between different "nests".


You could just adjust the height, not width of the bracket. There is usually plenty of space between lines to support this.


I have a strong memory of using an IDE that behaved like this - using size and font to communicate structure - at a job around 2005. I can't currently remember the name but think it started with 'source'?

If someone else remembers this and the name I'd appreciate it. I could be remembering wrong too about how far it went with such decoration.


SourceInsight


that probably wouldn't be sufficient to instantly grasp which bracket in a set of five opening brackets corresponded to its closing bracket. humans are not great at visually assessing minor differences in absolute size


Interestingly, rainbow-delimiters in emacs seems to be instantaneous even with elisp being a slow interpreted language. For better or worse, emacs does run "extensions" synchronously; this allows extremely fine-grained control of the buffer, but a slow extension can kill interactive performance.


> elisp being a slow interpreted language

The issue isn't about the speed of the language, but the code not having access to information it needs from the extension API so having to go long ways around to work it out. The new version may be in a faster base language, but the majority of its speed boost comes from having access to information that the editor already knows without having to mess around re-deriving it.


Have you tried rainbow-delimiters on a 42 kilo-line file? That's the benchmark in the OP.


Just tried. Colorization still seems almost instantaneous. Emacs generally does not like such a huge file though (reindenting the whole buffer is slow, for example).


Similar with `lisp_rainbow` via the slimv plugin in vim. It uses vim's built-in syntax coloring mechanism and doesn't slow down the editor.


That's what VSCode team did too -- moved the computational work from a plugin to core.


This is a well written and in depth article, it must have taken a lot of time and effort to write. Big thanks to the author and Microsoft for releasing it.


Deactivated the "Bracket Pair Colorizer" extension from CoenraadS, which I first installed years ago. Activated the internal bracket pair colorization.

First world problems: Bracket colors are different now and especially XML looks off... seems I need to fine tune that some day (but not today).


Or open an issue on GitHub with a few screenshots? VS Code devs are iterating!


As a colour-blind person, I can tell the colours are different, but I don't care/wouldn't know what they are: perfect!


You can change the colors in the settings.
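For reference, the built-in feature's bracket colors are ordinary theme colors, so a settings.json override along these lines should work (key names as shipped in VS Code 1.60; double-check against your version):

```json
{
  "workbench.colorCustomizations": {
    "editorBracketHighlight.foreground1": "#ffd700",
    "editorBracketHighlight.foreground2": "#da70d6",
    "editorBracketHighlight.foreground3": "#87cefa",
    "editorBracketHighlight.unexpectedBracket.foreground": "#ff0000"
  }
}
```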


This was such a joy to read. I rarely read every line of a long article -- usually I just sift through longer articles -- but that wasn't the case here. It is written in a very interesting way, describing the past, the known issues, and the solution. Kudos to the team behind this implementation, the original author, and the people who put forth this blog post!


For those just wanting to try it out, the feature can be enabled by adding the setting "editor.bracketPairColorization.enabled": true.


There's a gui option for this but it's a bit mysterious.

It shows enabled, but immediately under has an unchecked checkbox.


Looks like it formats the underlying setting attribute name in the UI by emboldening the final dot-separated word

editor.bracketPairColorization.enabled


This document is proof of why you need to learn algorithms, algorithmic complexity, compilers, and parsers :)


If you work on the internals of a compiler/editor, yeah.

For everybody else, no.


Basic knowledge that lets you do rudimentary big-O analysis is always useful. Unintended polynomial running time where linear is possible is a very common performance regression.
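A toy illustration of that kind of regression (unrelated to VS Code's code): deduplicating with Array.prototype.includes inside a loop is quadratic, while a Set makes the same job linear.

```javascript
// Accidentally quadratic: includes() rescans the output array for every item.
function dedupQuadratic(items) {
  const out = [];
  for (const x of items) {
    if (!out.includes(x)) out.push(x); // O(n) lookup inside an O(n) loop
  }
  return out;
}

// Linear: Set membership checks are O(1) on average.
function dedupLinear(items) {
  return [...new Set(items)];
}

console.log(dedupQuadratic([3, 1, 3, 2, 1])); // [3, 1, 2]
console.log(dedupLinear([3, 1, 3, 2, 1]));    // [3, 1, 2]
```

Both return the same result; only the hidden constant-versus-polynomial cost differs, which is exactly what a quick big-O glance catches.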


This year I started my leetcode grinding for interview purposes. But when I look back, I can tell that all this practice and analysis of solutions has made me a much better engineer than before. The most impactful thing in my 6+ year career was learning how to think effectively about my code.

Before that, I had some basic knowledge of CS theory - the master theorem, asymptotics, all these things - and was a typical skeptical developer thinking, "yeah, that's cool, but you don't actually need it at all; basic knowledge is enough."

Today, I will recommend that anyone go through basic algorithms and data structures courses (Sedgewick's book or course, Skiena's book too) and try to practice them.

Now I'm even trying to participate in leetcode and codeforces contests. In the end, after the first months of frustration about how stupid I was and how I couldn't find solutions to easy problems, it started to be fun, and I loved it.


I'm just amazed how much technical genius is needed just to work around their architecture - to achieve almost the same responsiveness that this feature had in emacs 20 years ago.


How is it implemented in emacs?


If the parent is referring to rainbow-delimiters, it seems that it uses so-called syntax tables, which are used for syntactic parsing of documents; they are created by the buffer's major mode and are shared between modules. They are used for all kinds of things, like indenting, cursor movement, etc.

At least that's my understanding, I'm a long time emacs user, but I don't have a very deep knowledge of the internals.


I wonder why VSCode hasn't yet adopted TreeSitter for syntax parsing. Seems like it solves at least part of the performance issue via incremental parsing


(author of the blog post here, personal opinion)

Tree-sitter is much more general (and I guess way more complex) and most likely cannot use some tricks we use for "simple" bracket pair parsing. For example, we can almost always re-use nested bracket pairs when characters are inserted/deleted, because the bracket pair language is so simple and it does not matter where a bracket pair is in the AST. But when parsing C# and adding a single opening bracket at the beginning of the file, I doubt namespace/class declarations stay namespace/class declarations.

Also, long lists in Tree-sitter seem to cause linearly growing incremental parsing time - that's why we use balanced (2,3) trees. You can try it out on their playground [1] by selecting JavaScript and adding some 100k `{}`s. On my machine, adding a single character takes 50ms. When there are 200k bracket pairs, it takes 100ms. When all these brackets are contained in a single bracket pair however, adding characters after this single root pair is fast again (<1ms). But to be honest, Tree-sitter is still mind-bogglingly fast.

[1] https://tree-sitter.github.io/tree-sitter/playground


On the first point, one idea would be to implement the simple bracket pairs language as a TS grammar, stand-alone and independent of any other syntax highlighting. The C# problem of making the syntax invalid and killing the brace highlights disappears.

The linear scanning behaviour you describe is due to the change in the left and right parse context for all of those pairs. Yes, it’s linear when you invalidate a subtree, but in the case of a simple bracket pairs language, the damage is limited to scanning through the children of top level brace pairs, and each child subtree is trivial to check.

From the IGLR paper:

> In a state-matching implementation, each node representing a nonterminal symbol contains a record of the configuration of the pushdown automaton (the ‘parse state’) when the node was shifted onto the stack. A subtree can be reused when both its left and right context are unchanged

Imagine { is prepended to abc(def, {ghi}). I believe the “abc” needs re-parsing, and so does the () subtree as their left parse state is now the “looking for }” state instead of “looking for ({[“ and “looking for )” respectively. But it’s limited to one level deeper — after you split the () subtree, every subtree inside it is still in the “looking for )” configuration on both sides. Specifically, the “def”, “,” and “{ghi}” are fully reused. There are only three possible parse states, so generally you get a lot of subtree reuse. You get even better subtree skipping performance by making long sequences without an brace into a single node, instead of eg tokenising by word and not grouping them. (In this case, “def, “ instead of splitting that.)

So realistically for your C# example, only the namespace node is split, and the linear scan is through all the top level brace pairs within the namespace but no deeper. You’ve described IGLR’s best case scenario for incremental parsing an initial brace insertion. Languages that don’t have namespaces would be worse off (but still not too bad). For more realistic languages than {}{}{}{}{}{}{}{}.js I think this approach would work very well.

If I liked brace pairs (I don’t) then I would implement this dead simple new grammar and turn it into a Neovim plugin. The existing nvim-ts-rainbow plugin uses queries on existing languages, so is very good at giving perfect/correct pairs and customising per-language, but exhibits the invalid syntax problem and also performance issues which appear to be resolved for most people, but may be back with bigger documents. (Edit — you would need a few variants to account for comments and strings. That makes it a bit more annoying.)


> when parsing C# and adding a single opening bracket at the beginning of the file, I doubt namespace/class declarations stay namespace/class declarations

It really depends on how you parse. In VS, at least, this works, in a sense that, while the resulting code is invalid, the editor can still correctly semantically highlight it, and provide code completions etc.


Interesting, I also meant this more generally. I'm surprised VSCode as a whole hasn't begun integrating TreeSitter for broader syntax parsing (beyond coloring brackets, braces, and parens).

Will have to look more into the linear growth you mention.


I can't wait until they do. TextMate grammars suck and other editors such as NeoVim are already starting to move to TreeSitter which will provide "proper" grammars for languages.


Nice, but I really want matching _variable_ colorization. Here is a beautiful demo [1]. I've yet to see it implemented in any serious editor.

[1]: https://evanbrooks.info/syntax-highlight/v2/


Jetbrains editors have that, called semantic highlighting. https://blog.jetbrains.com/pycharm/2017/01/make-sense-of-you...

It's on a per-variable basis, so apparently different from the VS Code thingy with the same name.

Also lots of extensions for similar stuff, often named "rainbow" something.


Do you mean what VSCode calls semantic highlighting?

https://code.visualstudio.com/api/language-extensions/semant...


No, the GP is talking about a unique color being assigned to each name (local, parameter etc) in a scope so that typos become visible at a glance. I used a Sublime Text extension for this once but haven't seen it built into an editor either.


I wonder if it would be possible to combine the two, e.g., have parameters always displayed in italics with each parameter having its distinct color.


Some commenters are upset but can't quite express why. I'll explain. The article's title is "Bracket pair colorization 10,000x faster", which implies that the previous implementation was using a slow algorithmic approach and someone made it 10k faster. Sure, the article explains the situation in detail, but the way it's laid out kinda screams (at least to me and some other people) "look at me, I'm so good that I improved the original implementation 10k times", when in reality the 10k speedup has nothing to do with algorithms but with the extension API. A more modest and fair approach would have been to name the article "Bracket pair colourization is now in core VS Code" or similar, and leave the 10k speedup as a footnote, rather than having the entire article revolve around it. It's just a matter of etiquette - it's not a situation where boasting is justified.


> Brackets are queried when rendering the viewport and thus querying them has to be really fast.

Then by querying them only when the document changes could make it another 10,000x faster ;)


Great improvement of performance for those who like to code blind!


It's amazing that modern software techniques are so slow that speeding up rainbow braces is necessary (let alone worthy of an article).

Get off my lawn (while I water it with my tears).


Going to incur even more downvotes from the MS crowd here, but seriously this is insane (and looks like the PR team are on my case).


> The core idea is to use a recursive descent parser to build an abstract syntax tree (AST) that describes the structure of all bracket pairs.

This is a really nice and clear blog post on an algorithmic challenge wonderfully addressed to a satisfactory conclusion (the rebalancing and node reuse etc work out so perfectly); must have been fun, thanks for sharing! I don't know / haven't thought much about text editors so a question for the authors or anyone who understands: from the mention of "the bracket pair AST", it seems that a separate AST is created just for this bracket-pair problem, is that right? I imagine there was/is already (at least one) another AST already being computed in the editor, for other reasons? If so, how did you decide between trying to do this with the same AST (making that one more complex), versus computing an additional AST just for this problem?
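Not the authors, but to make the quoted core idea concrete, here is a deliberately tiny sketch of a recursive descent parser for the bracket-pair language alone, plus a depth-based color assignment (all names are invented; the real implementation additionally handles tokenization, incrementality, and tree balancing):

```javascript
// Toy recursive descent parser for the bracket-pair language only: every
// non-bracket character is skipped, which is why this AST is so robust
// compared to a full language AST.
const PAIRS = { '(': ')', '[': ']', '{': '}' };

function parseBrackets(text, pos = 0, closing = null) {
  const children = [];
  while (pos < text.length) {
    const ch = text[pos];
    if (ch === closing) return { children, end: pos + 1 };
    if (PAIRS[ch]) {
      const inner = parseBrackets(text, pos + 1, PAIRS[ch]);
      children.push({ start: pos, end: inner.end, children: inner.children });
      pos = inner.end;
    } else {
      pos++; // plain text: irrelevant to the bracket AST
    }
  }
  return { children, end: pos };
}

// Assign each pair a color from its nesting depth, cycling through the
// finitely many themeable colors.
function colorize(nodes, depth = 0, out = []) {
  for (const n of nodes) {
    out.push({ offset: n.start, color: depth % 6 });
    out.push({ offset: n.end - 1, color: depth % 6 });
    colorize(n.children, depth + 1, out);
  }
  return out;
}

const ast = parseBrackets('f(a, [b, {c}])');
console.log(colorize(ast.children));
```

For `'f(a, [b, {c}])'` this emits color 0 for the parentheses, 1 for the square brackets, and 2 for the braces; answering my own question a little, one can see why a dedicated bracket AST is simpler to reuse incrementally than a full-language one.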


> it seems that a separate AST is created just for this bracket-pair problem, is that right?

That is right.

> I imagine there was/is already (at least one) another AST already being computed in the editor, for other reasons

Not in the renderer process.

The point is that more concrete ASTs that model more features of the language are less likely to have reusable nodes. When you have just bracket pairs, you can reuse any bracket pair that was not modified! However, when prepending a JS file with `[`, a class declaration probably does not parse as a class declaration anymore, so you cannot just reuse it and make it a child of the array expression (which you could do if it were just another bracket pair).


Hm so it's reusing the syntax highlighter's knowledge of comments and string literals? So then is the recursive descent parser mentioned language-independent or language-dependent?

This is cool, although I feel like the caveat of the ambiguity of < and > isn't the best experience, especially for TypeScript and C++ I imagine.


> So then is the recursive descent parser mentioned language-independent or language-dependent?

The parser is language-independent, but the tokenizer used by the parser looks up the tokens of syntax highlighting to decide if a bracket character is an opening/closing bracket or just text.
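A sketch of how such a lookup might fit together (function and token names invented for illustration, not the actual VS Code API): the bracket scanner only treats a character as a bracket when the highlighter classifies its position as plain source, so brackets inside strings or comments are skipped.

```javascript
// Hypothetical sketch: tokenTypeAt would come from the syntax highlighter's
// token data; here it is faked for illustration.
function bracketTokens(text, tokenTypeAt) {
  const brackets = [];
  for (let i = 0; i < text.length; i++) {
    if ('()[]{}'.includes(text[i]) && tokenTypeAt(i) === 'source') {
      brackets.push({ offset: i, char: text[i] });
    }
  }
  return brackets;
}

// Fake highlighter: offsets 2..4 are inside the string literal "(".
const text = 'f("(") {}';
const tokenTypeAt = i => (i >= 2 && i <= 4 ? 'string' : 'source');
console.log(bracketTokens(text, tokenTypeAt).map(b => b.char)); // ['(', ')', '{', '}']
```

The `(` inside the string is ignored, so the real `(`/`)` pair still matches up.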


Seems interesting they chose to limit it to a maximum of 6 colors. Is there some technical reason for that? I'm currently using "Bracket Pair Colorizer" with 8 colors, and even with that I can remember a couple of places where the limit was exceeded and the coloring started over from the beginning.


> Seems interesting they chose to limit it to a maximum of 6 colors.

Where do you see this? The article talks about nesting levels being bounded by O(log N) levels where N is the length of the document, and from what I can tell this is just for analysis and it supports arbitrary nesting levels… Oh, is your question about the actual colours used for the brackets, i.e. arbitrary levels of matching are supported, but the colours repeat every 6 levels (e.g. brackets at levels i and i+6 have the same colour)? If so this is a UX choice (I can't see any technical reason) and I imagine the considerations might be something like:

- There are only finitely many colours that can be usefully distinguished visually by the average person, so we need to pick some limit L,

- The limit L needs to be small enough that all L colours are easily distinguishable and memorable, but large enough that it would never be confusing whether a certain coloured bracket is at nesting level i or i ± L.

I guess the thinking may have been that 6 is large enough typically, e.g. it's likely to be clear from context whether a certain bracket (say blue) is at (say) level 2 or level 8.


If there is demand, please file an issue and we can discuss increasing this limit. However, these colors are themeable and thus there can only be finitely many.

There is no technical reason for this limit of 6.


I see, I'll definitely make one.


It's in the release notes [0].

> All colors are themeable and up to six colors can be configured.

[0] https://code.visualstudio.com/updates/v1_60#_high-performanc...


There are actually not that many simultaneously distinct and pleasing colors in RGB.


True, it's hard to make a decent theme for this purpose. I've made a tool [0] to help me with that using d3's chromatic scales.

[0] https://iaml.gitlab.io/utils/BracketTheme


> The feature can be enabled by adding the setting "editor.bracketPairColorization.enabled": true.

Interesting that this NEEDS to be enabled... or will it be set to `true` on new installs?

Seems like a super useful feature that should be on by default to me..


I think changing a user's layout suddenly without explanation is definitely a way to speedrun 'how to get all users' hate' in 10 seconds. It should be either 'prompt and let the user decide' or 'use new defaults on new installs'.


They’ve been prompting the user on new installs to set their preferences and themes, it’ll probably get rolled into that.


I think it's a good choice, since developers do not like people messing with their environments without asking. Especially if those people are Microsoft. The strategy of writing a really good, technical article explaining the feature, and getting it marketed on forums where developers can see it and get excited enough to try it seems savvy to me.

People who are resistant to change (and likely to complain) will just not be affected.

I wouldn't turn it on for existing users, but as you say, having it on for new installs does make sense though!


I think people don't spend enough time thinking how many of the problems they have are caused by the fact that programming languages are just text that need to be parsed by every single tool they use. If languages were (eg) AST-based, this would be a fairly trivial problem.


(Personal opinion)

Constructing or incrementally updating an AST is not cheap. In particular, most ASTs are very fragile, and there are characters that, when inserted at the wrong position, invalidate the entire AST (e.g. prepending a C-like document with `/*`). Even prepending a document with `{` could render all nodes of the AST invalid, as they might get a different meaning (a class declaration might now be parsed as a block statement).

The AST for the language of well-formed bracket pairs is very robust. There is no character that invalidates the entire AST. There are characters though that invalidate all tokens, but there is a separate asynchronous (slow) method that updates them in the background and only then incrementally the bracket pair AST.
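A toy parser for the bracket language (not VS Code's actual (2,3)-tree implementation) sketches why this AST is robust: every character either opens/closes a local pair or is ignored, and unbalanced brackets degrade locally instead of invalidating the whole tree:

```typescript
// Toy sketch, assuming a single bracket kind: parse a string into a tree of
// bracket pairs. No character can invalidate more than its local context.
interface Pair { open: number; close: number; children: Pair[] }

function parseBrackets(text: string): Pair[] {
  const roots: Pair[] = [];
  const stack: Pair[] = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === "{") {
      const pair: Pair = { open: i, close: -1, children: [] };
      (stack.length ? stack[stack.length - 1].children : roots).push(pair);
      stack.push(pair);
    } else if (text[i] === "}" && stack.length) {
      stack.pop()!.close = i;
    }
  }
  return roots; // pairs left on the stack simply stay unclosed (close === -1)
}
```

Prepending `{` to a document shifts offsets and adds one enclosing level, but every existing pair is still a pair, which is what makes incremental reuse possible.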


> Constructing or incrementally updating an AST is not cheap. In particular, most ASTs are very fragile, and there are characters that, when inserted at the wrong position, invalidate the entire AST (e.g. prepending a C-like document with `/*`). Even prepending a document with `{` could render all nodes of the AST invalid, as they might get a different meaning (a class declaration might now be parsed as a block statement).

I think you have a worldview where text is being edited and ASTs are constantly being replaced via parsing.

In an AST-based language, it doesn't necessarily have meaning to add some random character at some random spot, in the way that it does with text. With text, you can only recreate the entire AST by parsing, and discover that the parse is broken.

With an AST-based language (such as https://darklang.com), the editor could inform you that that character cannot be added in that place. Or it could allow you to put it there, informing you that that particular piece of text has no valid meaning, while the rest of the AST is still valid. (Darklang uses both approaches).


I don't see why it couldn't be cheap if the editor always enforced being structurally valid? Then AST transformations are as fast as modifying a tree. A comment node would just wrap the necessary items and could unwrap them in linear/constant time.

https://twitter.com/dm_0ney/status/1414742962442014720


It would also be much harder to render or edit.


My claim - which I am testing with https://darklang.com - is that code being stored as text is a source of significant friction. There are certainly trade-offs, but the value of the non-text representation is under-considered, as is the cost of the text representation.


In q/kdb, brackets for unary function application are optional; that is, f[x] is the same as f x. I found it very convenient after a while, and wish other languages followed. It is much more convenient to write f g h x than f[g[h[x]]].


This is the usual arrangement in functional programming languages - ML, Haskell etc all use juxtaposition to denote function application.

If you want something along these lines, but a bit closer to mainstream, with easy access to popular libraries etc, that would probably be F#.
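In a language without juxtaposition application, the closest you get is composing explicitly; a hypothetical TypeScript helper (the name `applyRight` is made up) recovers the q-style right-to-left reading:

```typescript
// Hypothetical sketch: applyRight(f, g, h)(x) computes f(g(h(x))), i.e. the
// q-style "f g h x" reading, without writing the nested f(g(h(x))) brackets.
const applyRight =
  <T>(...fns: Array<(x: T) => T>) =>
  (x: T): T =>
    fns.reduceRight((acc, f) => f(acc), x);

const inc = (n: number) => n + 1;
const double = (n: number) => n * 2;

applyRight(inc, double)(3); // inc(double(3)) = 7
```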


Not to diminish the creator's work; it looks really useful. But I tend to code using Allman-style bracketing. I find Allman bracketing and the corresponding indentation make my code more readable, without the need for much color.


Great illustration of how API-based extension schemes are fundamentally crippled.

Also, "Feel free to skip the sections on algorithmic complexities." repeated so many times sounds rather patronizing.


Slightly related question: is there a plug-in that can color my code based in scope instead of syntax?


There is one for emacs. Could be good inspo if someone wanted to make a VSCode version.

https://github.com/alphapapa/prism.el


Not exactly what you asked for, but indent-rainbow colors the indentation space per nesting level.

Also, I think the old bracket coloring extension has an option to underline the entire section of code between the bracket pair you're currently in.


Towards the end, they say:

> Even though JavaScript might not be the best language to write high performance code

Why not?


Performance is not predictable, and basically everything is cache-miss.

I love JS but this statement is absolutely true.


> the checker.ts file of the TypeScript project, which has more than 42k lines of code

What the actual f** ?

42k lines of code and they mention it casually without even thinking about explaining the reason behind this monstrosity?

And yes, I fully expect the comments below to contain stuff like "This is nothing, back in the day we had a 1M loc Java class."


The file is linked from the article, so you can look at it and judge for yourself.

https://raw.githubusercontent.com/microsoft/TypeScript/8362a...


Why on earth is it their job to explain the reason behind a large file in a completely different piece of software?


This is nothing, back in the day we had a 1M loc Java class.


They just picked a big real file and used it for benchmarking. Apparently it's not even from the same project, why would they comment on it?


Great news!


Not for the extension developer though. Good thing it was not monetized.


Igelau linked to this thread: https://github.com/microsoft/vscode/issues/128465#issuecomme...

In it, CoenraadS (the extension developer) writes

"I follow this thread with interest, my extension is something that grew a bit out of control, and I grew tired of maintaining it."

and

"If it could that would be great, and my extension could be deprecated completely."

There was also discussion with the author about updating the extension to prompt users to switch to the native functionality, and it appears CoenraadS was also involved in the process/design of implementing the feature in VS code.

This appears to be a shining example of doing it right, and giving the original extension author credit.


Well, he no longer has to maintain it, and an even better version of the feature now exists



Does anyone know of an equivalent for Visual Studio (not VSCode)?


Awesome! When is the leetcode problem coming so I can practice it for my next interview loop?


tl;dr: look for "editor.bracketPairColorization.enabled" in your user settings.


Background code analysis in Visual Studio grinds my Windows VM to a halt after every few keystrokes. Yet they spend time speeding up colorization of bracket pairs, which worked just fine for the last 10 releases or so.


Are you mixing up Visual Studio with VS Code? This pertains to the latter.


Seems so.

Is there no background code analysis in VS code? That would be a reason to switch.


In both VS and VSCode, semantic analysis is handled by specific language extensions (some of which might ship in the box, as C# is in VS and TypeScript is in VSCode). So it's impossible to answer this question without knowing which language you're using.


C#


VSCode uses OmniSharp for that, so at least it's different from VS.


> Yet they spend time speeding up colorization of bracket pairs, which worked just fine for the last 10 releases or so.

You must have missed the part where it is now 10,000x faster. Given your complaint about performance this should excite you!


If VS is too slow, you need to use it more for the background code analysis to amortize.


I don't get it.


So, what's the plan? Reimplement every popular extension that doesn't have good performance?

> the asynchronous communication between the renderer and the extension-host severely limits how fast bracket pair colorization can be when implemented as an extension. This limit cannot be overcome

Nope, I'm not convinced. If it can be done internally, then you should expose more internals until it's possible to do it with the public API.

I bet this could be made really performant if vscode had incremental parsing, like tree-sitter+Atom.


There were additional issues too, like access to token information discussed here

> This is another challenge of the Bracket Pair Colorization extension that affects performance negatively: it does not have access to these tokens and has to recompute them on its own. We thought long about how we could efficiently and reliably expose token information to extensions, but came to the conclusion that we cannot do this without a lot of implementation details leaking into the extension API. Because the extension still has to send over a list of color decorations for each bracket in the document, such an API alone would not even solve the performance problem.


> access to token information

This is perhaps the core difference between VS Code and Emacs (which a lot of people believe is being superseded by VS Code as the "extensible editor") - in Emacs, there is no such thing as limiting access to information. Outside of things that get hidden accidentally[0], any bit of elisp code can access everything.

It's not just a practical difference, but also a philosophical one: Emacs plugins are not designed to expose a narrow API, because it's impossible to enforce anyway. There's a structure to it, so nothing prevents one from creating good abstractions - those abstractions just have to be designed with extensibility and interoperability in mind[1], because the users (including other package developers) always have an option to just hook into, advise, override or replace any piece of code in your package.

Performance-wise, the impact of it varies. On the one hand, this level of flexibility prevents Emacs from making important breaking changes[2]. On the other hand, nothing ever has to wait for a better API design - if there's a way to make a feature faster by hooking to a dependency's internal, people will do just that, and keep doing that until the API blesses the use case.

--

[0] - Like state the C core doesn't expose, or some implicit state shared by a bunch of closures - though you can get at the latter if you override whatever is hiding the state.

[1] - E.g. by offering hooks as a blessed, well-defined and stable way to interact with internals to cover 90% of interoperability needs.

[2] - Like doing proper multithreading, or replacing Emacs Lisp with a more polished Lisp - though myself I feel ELisp is good enough as a language.


This is, in part, because VS Code plugins run in a separate process. Each extension API needs corresponding IPC code on both the extension host process and the renderer process.

This means that plugins which work great in Emacs, where there's no IPC overhead, might be pitifully slow in VS Code. Alternatively, plugins that work great in VS Code might bog down Emacs (yes, the Emacs plugin can spawn a new thread and work there, but I don't think that's the typical approach of plugin authors)

Note that extensions are able to directly manipulate the main application bundle a la Emacs (this is what https://marketplace.visualstudio.com/items?itemName=be5invis... does, for instance), but that is discouraged by way of a checksum failure putting "Unsupported" in the title bar. Of course, that checksum code could be removed by the extension too, but doing so without informing the user would be considered in bad taste.


This makes some sense, but I'm a bit confused on why it was so slow, to be honest. Seems that running emacs against the checker.ts example they gave, with rainbow-delimiters-mode is instant. Am I just comparing against a different type of mode?

That said, I do get the point on wanting things to be IPC based, but that feels like a large jump in complexity for most items. I'm very grateful for the model of extension in emacs, where you do have to learn complexities if you are building a complex plugin, but you can go very far before you get into the realm of complex plugin.


The idea is that in Emacs the procedure is a simple function call, so if the "business" logic isn't too expensive (and I presume the Emacs folks have done a good job of ensuring this is true), it will run pretty darn quick. In VS Code it's a whole IPC maneuver, so even if the "business" is fast, there can still be a lot of overhead that bites you when calls happen frequently.

The extension model is the same in VS Code, all the author needs to do is write basic JS (or TS). VS Code core does the heavy lifting of creating the extension host process to run that code and exposing the `vscode` proxy object to the extension's code, which enables communication between the extension and the renderer in a manner that appears identical to as if the `vscode` were a simple object. See the minimal hello world sample [0] for the basic case.

End of the day it's a tradeoff between latency and throughput, emacs chooses latency, vs code chooses throughput. Both have their ups and downs, largely dependent on the size and frequency of the task at hand.

[0] https://github.com/microsoft/vscode-extension-samples/blob/m...
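A rough cost model (all numbers hypothetical, not measured VS Code figures) shows why per-message IPC overhead dominates when many small calls cross the process boundary, and why batching decorations into one list helps without making the cost disappear:

```typescript
// Hypothetical cost model: crossing the extension-host/renderer boundary
// costs `overhead` per message plus `perItem` per payload item.
function costUnbatched(items: number, overhead: number, perItem: number): number {
  return items * (overhead + perItem); // one IPC round trip per decoration
}

function costBatched(items: number, overhead: number, perItem: number): number {
  return overhead + items * perItem; // one round trip carrying the whole list
}

costUnbatched(10000, 100, 1); // 1010000: overhead paid 10k times
costBatched(10000, 100, 1);   // 10100: overhead paid once
```

Batching amortizes the round-trip cost, but the per-item serialization cost remains, which is the part an in-core implementation avoids entirely.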


I don't think that is quite right, actually. There are facilities in emacs to do async things. Such that you can make a similar latency trade-off.

I agree the difference is emacs is done such that it is all exposed to all developers. There are no special parts, as it were. Such that a hello world looks like (defun my-ext () (message "hello, world")). If binding this to a key, you simply need to add (interactive) after the argument list.

Obviously, things ramp up quickly, but the point is it isn't some special framework to make extensions. It is just a function.


Importantly to my original point, if I didn't like the message your little function printed, and you didn't design your "extension" to be extensible (e.g. by putting "hello world" in a defvar or defcustom), I can just... redefine your function - and everyone else using it will pick my new definition. Or I could defadvice it to modify its behavior without changing its definition.

As you say, Emacs has no special framework for extensions - it's all just functions and variables, plus a bunch of higher-level concepts (e.g. minor and major modes, customize, autoloads) to pick and choose from.


The Emacs approach is a bit more closely followed by Atom, where extensions have full access to the window, are responsible for their own UI elements, and many "core" features are actually just extensions. Atom faced many of the same difficulties with this as Emacs -- when absolutely everything in the interface is "public", it becomes very hard to make changes without breaking userspace. But yes, it does allow for a much more diverse extension ecosystem.

Tradeoffs all the way down :) If we didn't have them we wouldn't be engineers...


The problem with Atom is that they didn't set a stable foundation for where to expose everything. That is, the window that they expose is itself under active development, if I understand correctly.

This is an odd shift of the word "stable" in software. For a long time, folks took stable to mean simply "doesn't crash." But, it also needs to mean "remains unchanged" if you want to use it as a foundation of other work.

Of course, I say that, and have to ack that the web itself has proven a major exception to this supposed rule. Nothing has been stable in that entire landscape, but it has continued to grow at an astonishing pace.


I am also in the camp that thinks elisp is fine.


> So, what's the plan? Reimplement every popular extension that doesn't have good performance?

Why not? If the extension is popular that suggests the functionality is something people want, and surely if you're working on the next version of something you'd be interested in things people want?

One hopes that they wouldn't unilaterally do this to things that are monetized, but otherwise it's hard not to see how it would be a win for everybody.


The problem is that now the core is more complex, and it still won't help if someone else wants to implement a slightly different feature.


I suppose that's how people ended up with Linux. A mish-mash of incoherent UX executions where each app uses a GUI that doesn't quite match your OS and the whole experience is subpar because no boundaries are ever enforced. Likewise, let's cram every popular extension into VS Code! It's what the users want!


> A mish-mash of incoherent UX executions where each app uses a GUI that doesn't quite match your OS and the whole experience is subpar because no boundaries are ever enforced.

This has nothing to do with, and isn't even exclusive to Linux. If you have an axe to grind, at least grind it well.


Sounds like somebody's stuck using Ubuntu 4.10


I've been using a TreeSitter-powered colouriser in Vim, it's fantastic and, like with everything TreeSitter related, extremely fast.

https://github.com/p00f/nvim-ts-rainbow


Hah, I just wondered how easy it would be to implement it using treesitter, thanks for the link!


As far as I know though, tree sitter has problems with really long lists and even incremental parsing gets slower linearly when extending the list.

Also, to my knowledge, tree sitter cannot move nodes around (which is also very hard for languages that are not as simple as the Dyck language [1]).

For bracket pairs, you can easily reuse all (...)-pairs in the following example, even though their height in the AST changes significantly: old text: (...)[(...)] new text: {{(...)}(...)}

[1] https://en.wikipedia.org/wiki/Dyck_language


That sounds awesome for VS Code users in general, but I guess it's sad news for the independent developer (CoenraadS) who made the pair colorization extension popular and famous.

With the editor's native performance 10,000 times better, I suspect the number of installs of their extension will plummet.

It's hard to compete with 1st party powers, as all the hoopla around app stores (and before that, the famous "Embrace, extend & extinguish" [1] strategy) show all the time.

That said, I'm not saying it's a hostile move, it's just ... interesting.

[1] https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis...


The original developer doesn't sound too worried and seems to have collaborated on the feature: https://github.com/CoenraadS/Bracket-Pair-Colorizer-2/issues...

Plus it helps him get rid of 399 open issues.


That's great, I did not research this beyond reading (most of, it got pretty technical!) the article.

In retrospect I guess I'm a bit disturbed that I even think about these things in terms of fame/popularity, when it should be solely about tool capabilities and power for its users. :) Shame on me, back to reading FSF doctrine.


I don't think it's wrong to assume that fame/popularity plays a role in open source. Open source developers have to get motivation from somewhere. Taking some pride in developing something popular is fine in my opinion.

That being said, my first impression from the article was quite the opposite of yours. I thought it was great news for CoenraadS. He developed something so useful, that it was integrated into the core. That's impressive! The plugin is obsolete now, but in this case it looks like "mission accomplished".

Of course this move would look different in other app/extension ecosystems where extensions are monetized and/or not open source.


Author of the blog post here.

We openly discussed various approaches with CoenraadS and other extension authors here: https://github.com/microsoft/vscode/issues/128465#issuecomme...


Quoting CoenraadS:

> Author of Bracket Pair Colorizer here.

> I follow this thread with interest, my extension is something that grew a bit out of control, and I grew tired of maintaining it.

> If there are some quick wins, I can still apply them, but I think my extension is so hacky it is easier to do 1.b or 1.c

Seems CoenraadS is completely in favor of the native implementation (and probably could have been called out in the blog post as few people will find this context).


Thanks for your awesome work! Can't wait to try it.


Isn't that kinda what you want though?

- Make an extension to solve a problem that probably should have been in the core

- Product owners notice and realise it should be in core

- Work with extension developers to get it in to core

- No longer need to work on extension


Sounds ideal for anyone who writes an extension to enhance their experience rather than because they find joy in writing and maintaining the extension.


What? If the independent developer didn't want his technology to be used, he would have patented it or made it closed source. This is not what EEE is about.


So Microsoft can never-ever integrate any feature implemented in any popular extension in VSCode, which is also mostly open source and free (in both meanings) I might add (except for some small things).

That's just silly.


I can see your viewpoint so I'm not sure why you are getting voted down.

> I follow this thread with interest, my extension is something that grew a bit out of control, and I grew tired of maintaining it.

The quote above is from the maintainer. Looks like he is happy to let it go!

Quote source: https://github.com/microsoft/vscode/issues/128465#issuecomme...



