No it wouldn't; garbage-collected systems programming languages had them 40 years ago, but as usual in Go, we ignore computing history.
Additionally, just having actual enumerations, Pascal/Algol style, not ML style, would already be an improvement over the iota/const hack.
The author of this blog is an internet commenter, not a representative of Go. I don’t see how such a post justifies saying that Go ignores computing history.
I think this is mistaken. Go already has a way to represent 'open' union types (interfaces), so all of these runtime problems have already been solved. What's missing is just the type system support to do exhaustive matching on the members of the union. With the addition of generics, 'all' that would be necessary is to make the following a legal variable definition:
    var foo interface {
        struct{ A int } | struct{ B string }
    }
It currently fails with the following error:
"cannot use type interface{struct{A int} | struct{B string}} outside a type constraint: interface contains type constraints"
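For contrast, the exact same union spelling is already legal today when used as a generic type constraint (Go 1.18+); here is a minimal sketch, where AorB and describe are illustrative names of my own:

```go
package main

import "fmt"

// The same union of struct types compiles today, but only inside a
// type constraint, never as the type of a variable.
type AorB interface {
	struct{ A int } | struct{ B string }
}

// describe accepts either member of the union, chosen at compile time.
func describe[T AorB](v T) string {
	return fmt.Sprintf("%v", v)
}

func main() {
	fmt.Println(describe(struct{ A int }{A: 1}))
	fmt.Println(describe(struct{ B string }{B: "hi"}))
}
```

The missing piece is exactly what the comment says: letting such an interface be the type of a runtime value, not just a compile-time constraint.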
Interfaces aren't bit-packed and they force storing all values as a separate allocation that the interface contains a pointer to (escape analysis may allow this separate value to be on the stack, along with the interface itself). I believe that Go used to have an optimization where values that fit in a pointer were stored directly in the interface value, but abandoned it, perhaps partly because of the GC 'is it a pointer or not' issue. In my view, some of what people want union types for is exactly efficient bit-packing that uses little or no additional storage, and they'd be unhappy with a 'union values are just interface values' implementation.
A separate allocation is not forced. The implementation could allocate a block of memory large enough to hold the two pointers for the interface value together with the largest of the types that implements the interface. (You can't do that with an open interface because there's no upper bound, but the idea here is to let you define closed interfaces.)
In cases where there is a lot of variance in the size of the different interface implementations, separate allocations could actually be more memory efficient than a tagged union. In any case, I'm not sure that memory efficiency is the main reason that people miss Rust-style enums in Go.
The problem with allocating bit-packed storage is that then you are into the issue where types don't agree on where any pointers are. Interface values solve this today because they are always mono-typed (an interface value always stores two pointers), so the runtime is never forced to know the current pointer-containing shape of a specific interface value. And the values that interface values 'contain' are also always a fixed type, so they can be allocated and maintained with existing GC mechanisms (including special allocation pools for objects without pointers, and so on).
I agree with you about the overall motivation for Rust-style enums. I just think it's surprisingly complex to get even the memory efficiency advantages, never mind anything more ambitious.
The bigger problem is mutability. Any pointers into the bit-packed enum storage become invalid as soon as you change its type. To solve this you can either prohibit pointers into bit-packed enum storage, which is very limiting, or introduce immutability into the language. Immutability is particularly difficult to add to Go, where default zero values emerge in unexpected places (such as the spare capacity of slices and the default state of named return values).
The problem with that is that in Go, I need to be able to put methods on those things, for reasons possibly unrelated to the interface in question. For that they need to be named. For that you might as well do what has worked since Go 1.0 and just put an unexported method in the interface and declare several instances of that interface in your package.
Honestly, interfaces with unexported methods are 90%+ of what people want; it's just not spelled the way they expect. And if you're not going to be happy except at absolutely 100% (a position I can and do respect), there's no point waiting for Go to get any better: I can guarantee you that no Go proposal for sum types will fix the fact that you'll be forced to have a "nil" value in the sum type.
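The unexported-method pattern described above looks roughly like this (all names here are illustrative, not from any real codebase):

```go
package main

import "fmt"

// shape is effectively "closed": code outside this package cannot
// implement it, because it cannot define the unexported isShape method.
type shape interface{ isShape() }

type circle struct{ radius float64 }
type square struct{ side float64 }

func (circle) isShape() {}
func (square) isShape() {}

func area(s shape) float64 {
	switch v := s.(type) {
	case circle:
		return 3.14159 * v.radius * v.radius
	case square:
		return v.side * v.side
	}
	// The forced "nil" member of the sum: a nil shape lands here.
	return 0
}

func main() {
	fmt.Println(area(square{side: 2}), area(circle{radius: 1}))
}
```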
You might want to layer some sugar on top to enable naming of the variants. I was just trying to keep my example as close as possible to valid Go code.
Even if you have to manually define separate named structs, you still have the benefit of exhaustivity checking in type switches. That's arguably the other 10% that people want.
Nobody cares about the categorical properties of the coproduct. But in any case, there's no theoretical issue here, since it would be just as if every enum had a 'Nil' variant implicitly defined.
Forget Result; just allow the type system to express non-nullable object references. Use the same layout, just let the compiler know when something is guaranteed to exist and force null-checking when it isn't.
This doesn't cover everything people might want to do with unions, but it covers the billion-dollar mistake and doesn't run against the grain of the entire language (as far as I know).
> doesn't run against the grain of the entire language
Not an expert, but my gut says maybe it runs against zero values? As in, "what's the zero value for a non-nullable reference?" Maybe the answer is something like "you can only use this type for parameters", but that seems very limiting.
Half of the language is already non-nullable, accomplished by allowing for zero values. Non-pointer variables are guaranteed to never be nil.
What is missing is the ability to have pointer variables and have the compiler ensure they will never be nil. I believe this was a design choice, not some technical limitation.
Like the sibling comment seems to be saying: a non-nil pointer would have to be set to some real (non-nil) pointer value anyway. So having a zero value does not seem to apply?
The important detail that has bogged down almost every union type discussion is "zero values". What would be the zero value of a union type?
If you've written Go you know the entire language is built around zero values, disabling it for some types is not an option.
You would be wrong. Any use of annotations complicates the language and the compiler.
As for the first point, this has been discussed many times without ever reaching any consensus; see https://github.com/golang/go/issues/19412
> At one level we easily do something that looks like a Result type in Go, especially now that we have generics. You make a generic struct that has private fields for an error, a value of type T, and a flag that says which is valid, and then give it some methods to set and get values and ask it which it currently contains. If you ask for a sort of value that's not valid, it panics. However, this struct necessarily has space for three fields, where the Rust enums (and generally union types) act more like C unions, only needing space for the largest type possible in them and sometimes a marker of what type is in the union right now.
That seems better than not having algebraic data types at all.
This is sorely needed to simplify error handling and get rid of nil pointer panics. I would love to see a linter written for Go after something like this is created, to ensure absolutely no naked pointer is ever returned.
I've been using Go full time since about 2013 and it's a massive issue; it has woken me and my teammates up at night, especially when junior engineers are involved. Why hope to catch these issues in code review if the type system/compiler can do it for you instead?
Some of the most common areas where nullable pointer types infest the code are when you have to deal with a lot of deserialization. This is due to the lack of a common built-in Optional type (I know you can define one easily, but you can't force the libraries you rely on to use that type).
The best we have now is https://github.com/uber-go/nilaway, and it's improving with time. It does static analysis, but it has a very difficult job to do, so right now it's super slow to run and prone to false positives.
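A hand-rolled option type of the kind mentioned above might look like this (a sketch; Some, None, and Get are my own names, not a standard API):

```go
package main

import "fmt"

// Optional signals "no value" with a boolean flag instead of a
// nullable pointer, so there is nothing to dereference and panic on.
type Optional[T any] struct {
	val T
	set bool
}

func Some[T any](v T) Optional[T] { return Optional[T]{val: v, set: true} }
func None[T any]() Optional[T]    { return Optional[T]{} }

// Get returns the value and whether it was set, comma-ok style.
func (o Optional[T]) Get() (T, bool) { return o.val, o.set }

func main() {
	if v, ok := Some(3).Get(); ok {
		fmt.Println("got", v)
	}
	_, ok := None[int]().Get()
	fmt.Println("none set:", ok)
}
```

The catch is exactly what the comment says: you can define this in ten lines, but you can't make the libraries you depend on return it instead of a pointer.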
> One core requirement for this is what Rust calls an Enum and what is broadly known as a Union type
Union types and enum types are not the same thing, and this misunderstanding invalidates the entire article. An enum type includes a marker which indicates which value it contains. The garbage collector would be able to read this tag value and know.
I don't know what to tell you, there are just multiple ways to use the same words.
What Rust calls enums are called tagged unions in most other languages. Enum usually refers to a simple collection of named values, what Rust calls fieldless enums or unit-only enums.
> are called tagged unions in most other languages
Not sure about that. Swift also calls them enums, in Java and Kotlin (and maybe Scala? I forget) they're sealed classes/interfaces and in PL theory and many typed FP languages they're called "sum types".
The "tagged" part is important. They are in fact not called "unions" in most languages, only C-derived languages. And in C, unions are untagged. Only later extensions added tagged union types. So I would contend that unqualified "union type" means an untagged union.
It sounds like you don't have exposure to other languages. C defaults to untagged unions but there are plenty of languages where a union is always tagged.
To put it another way, a Rust `enum` is not the same as C's untagged unions but it is still a union.
The article says the garbage collector can't know which variant the union type is. I don't know how to interpret that other than an (incorrect) assumption of untagged unions.
It is written a little confusingly but that's not what they're saying. They were saying you can't implement a union in user code with the current garbage collector because there's no way to tell it which variant your union is:
> let's ask why we can't implement such a union type today using Go's unsafe package to perform suitable manipulation of a suitable memory region.
They are aware that you could add union types to Go - their point is that it would require garbage collector modifications which may be difficult:
> The corollary to all of this is that adding union types to Go as a language feature wouldn't be merely a modest change in the compiler. It would also require a bunch of work in how such types interact with garbage collection, Go's memory allocation systems (which in the normal Go toolchain allocate things with pointers into separate memory arenas than things without them), and likely other places in the runtime.
The meaning of 'union' and 'enum' changes based on which programming language's terminology you use. Rust's enum is C's tagged union, with compiler support. C's enum is just a basic Rust enum without any fields. The list goes on like this.
Yes, the Go garbage collector would need to support some new memory layouts to make certain kinds of unions efficient. A union between a double, a pointer, and small integer types (as done in languages like JavaScript) might be a good start?
I've thought about this while working with Go more frequently over the past few months.
Metadata. The shapes that let Go know which generic 'profile' to use for a given function are metadata, just as the type of a thing whose reference is taken is compile-time metadata.
Structure. The precise layout of structures and other data fields is also compile-time metadata. They don't even need to remain the same between versions of Go, or even between builds if they're somehow randomized. That isn't how programmers think, though (at least any who were also trained in assembly). When I lay out a struct I do expect undersized fields to get padded, but I expect every field in order, and I'd prefer some way of forcing the issue for precise padding.
However, a 'union' of types is just syntax sugar. Give the programmer the above basics and add one more builtin: *reshape()*. reshape() would allow any similarly shaped structure to replace the type of the reshaped item. E.g., reshape({x, y, z uint64}, {x, y, z int64}, A, B) would convert a 192-bit chunk of three ints of one type to the other. It could also convert anything else similarly shaped.
That's a trivial example; what about some private structure from a library? I'd think the unsafe package's version should allow violation of the private field space, but the normal safe version might force the unexported fields to become 'pad' (inaccessible) space. I don't think this would alter garbage collection, as that process likely has to keep its own track of regions of memory and the places that point within them. That's runtime (maybe sometimes compile-time?) metadata which reshape() would have to work with.
Generics even _sort_ of do this already when a type in the list of allowed types is prefixed with ~: the compiler is allowed to use the passed thing as that type of value and then return it back the same as the input... and I really don't see why a reshape function couldn't do the same thing.
There is something else, though: reshape() likely needs to consume/claim the resource, since it wouldn't initialize anything. So maybe it needs to return the recast value to be assigned or passed somewhere, and further invalidate usage of the variable past that use. Alternately it could take a single variable and modify its type as part of its call (providing the value as its return would also be useful sometimes).