
Generators are such a game-changer for clean separation of data traversal from data processing. It's my favorite trick in python and I miss it dearly in other languages.

In languages without generators it's easy to write code that traverses some data structure in a hard-coded way and does something configurable on every node, but it's hard to write code that traverses a data structure in a configurable way and does some hard-coded thing on every element.

In Python it's trivial both ways and even both at once.



> it's hard to write code that traverses a data structure in a configurable way and does some hard-coded thing on every element

Wouldn’t this just be a function that operates on an iterator? I suppose generators make the creation of lazy iterators easier, but generally the solution for languages without generators is to have your traversal build a list. So you lose the laziness but the code remains simple. Then map your per-node processing to the list.


> generally the solution for languages without generators is to have your traversal build a list

Yes, but then you move from O(1) to O(n) in memory usage. And if you want to optimize it you have to restructure your code, when in python lists and generators are drop-in replacements.
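A minimal sketch of the drop-in point, assuming a hypothetical `total` consumer that only iterates over its input once. Swapping the brackets of the comprehension changes how much is held in memory without touching the consumer:

```python
def total(values):
    # The consumer only iterates, so it accepts a list or a generator.
    result = 0
    for v in values:
        result += v
    return result

n = 1000
as_list = [i * i for i in range(n)]  # materializes all n items up front
as_gen = (i * i for i in range(n))   # produces one item at a time

list_total = total(as_list)
gen_total = total(as_gen)
```

The swap only works when the consumer sticks to plain iteration; code that calls `len()`, indexes, or iterates a second time still needs the list.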


Also, producing a list doesn't work very well as a substitute for infinite generators.
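As a rough illustration (the `naturals` helper is made up for this sketch), an infinite generator can still be filtered and truncated lazily, which a list-building traversal cannot do:

```python
from itertools import islice

def naturals():
    # Infinite generator: never terminates on its own.
    n = 0
    while True:
        yield n
        n += 1

# Filter lazily, then take only the first five matches.
evens = (n for n in naturals() if n % 2 == 0)
first_evens = list(islice(evens, 5))
```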


Until you want to walk the generator twice -- stateful generators are a design flaw, in my opinion.
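A small sketch of that behavior, using a hypothetical `numbers` generator function. The generator object is exhausted after one pass, while calling the function again gives a fresh traversal:

```python
def numbers():
    yield from (1, 2, 3)

gen = numbers()
first_pass = list(gen)
second_pass = list(gen)  # the generator object is already exhausted

# Calling the generator function again starts a new traversal.
fresh_pass = list(numbers())
```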


C++20 coroutines have the potential to be this and much more. As with many things C++, adoption will be very slow (they are supported by the latest compiler versions, but there is no support in the standard libraries yet).


> It's my favorite trick in python and I miss it dearly in other languages.

Which other languages? Generator support is common:

https://en.m.wikipedia.org/wiki/Generator_(computer_programm...


This list really sort of stretches the imagination on what language level generator support looks like.

Certainly in Java you can use the `Iterator` interface to get generator-like behavior. But that's both a lot more complex and more verbose, primarily because you have to coordinate the `hasNext` method with the `next` method.

With python and yield syntax, that naturally falls out without a bunch of extra code juggling.
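To make the comparison concrete in one language, here is a sketch of the same countdown written twice in Python: once as a hand-written iterator class and once with `yield`. Both `Countdown` and `countdown` are invented for illustration, and Python's iterator protocol has no `hasNext`, so this is only an analogue of the Java situation:

```python
class Countdown:
    """Hand-written iterator: the traversal state lives in attributes."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # End-of-sequence has to be signalled explicitly.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    # Generator version: the state is just the local variable.
    while start > 0:
        yield start
        start -= 1


from_class = list(Countdown(3))
from_generator = list(countdown(3))
```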


Sure, some of those (C, Java, C++ templates) are pretty weak, but C#, Ruby, ES6, and lots of the other language or library implementations are not. It's far from a unique Python feature.


Java methods like Stream#filter and Stream#flatMap are easier than rolling your own iterator; you can even ask that work be done in parallel.


"In Haskell, with its lazy evaluation model, everything is a generator"


It’s also trivial and easy in Haskell — you just need an instance of `Foldable` or `Traversable` on your collection, and then you can fold or traverse it in a configurable way. Or for recursive structures, use https://hackage.haskell.org/package/recursion-schemes. Or even just pass a traversal function as an argument for maximum flexibility.


How absent is generator support in languages in general? As far as I know, even PHP has had yield for something like a decade, and JS for longer (at least as far back as 2006, when Yield Prolog got discussed on LTU[0]). Or is there something about the Python approach to generators that's distinctive?

[0] http://lambda-the-ultimate.org/node/1732


Languages tend to converge on good features, luckily.

That said, some generator implementations are not as complete as Python's, and the surrounding ecosystem generally doesn't have much support for them.

E.g., very few languages have an equivalent to itertools.
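For anyone who hasn't used it, a short sketch of the kind of composition itertools gives you. The word lists are made-up sample data; `chain`, `groupby`, and `islice` are standard library functions:

```python
from itertools import chain, groupby, islice

words = ["apple", "avocado", "banana", "blueberry", "cherry"]
more = ["cranberry", "date"]

# chain: iterate over two sequences as one, without copying them.
all_words = chain(words, more)

# groupby: group consecutive items sharing a key
# (the input here is already ordered by first letter).
grouped = {
    letter: list(group)
    for letter, group in groupby(all_words, key=lambda w: w[0])
}

# islice: take a bounded prefix of any iterable.
first_two = list(islice(words, 2))
```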


The ECMAScript 4 proposal included the `yield` keyword, but the proposal as a whole was abandoned in 2008. That's why the version names jump from ES3 to ES5. Some of the features then returned in later versions, including generators with ES6 in 2015.


This sounds interesting. An example would really be helpful, I have a vague idea of what you're talking about but have a hard time concretizing it.


You have (pseudo)code like this:

    def print_every_file_in(root_path):
        for path in children(root_path):
            if is_file(path):
                print("file: " + path + ", size:" + get_size(path))
            else:
                # recurse into the subdirectory
                print_every_file_in(path)

If you want to extract the printing part it's trivial in any language:

    def print_file(path):
        print("file: " + path + ", size:" + get_size(path))

    def print_every_file_in(root_path):
        for path in children(root_path):
            if is_file(path):
                print_file(path)
            else:
                print_every_file_in(path)

And you can easily parametrize what function to call on each node.
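A runnable sketch of that parametrization, assuming a hypothetical in-memory `TREE` (nested dicts standing in for directories, ints for file sizes) so it doesn't need a real filesystem. The names `for_every_file_in` and `visit` are made up for this example:

```python
# Hypothetical in-memory tree: dicts are directories, ints are file sizes.
TREE = {"a.txt": 10, "sub": {"b.txt": 20, "deep": {"c.txt": 30}}}

def for_every_file_in(tree, visit, prefix=""):
    # The traversal is hard-coded; the per-file action is the `visit` parameter.
    for name, node in tree.items():
        path = prefix + "/" + name
        if isinstance(node, dict):
            for_every_file_in(node, visit, path)
        else:
            visit(path, node)

# Any callable taking (path, size) can be plugged in.
seen = []
for_every_file_in(TREE, lambda path, size: seen.append((path, size)))
```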

But in Python it's equally trivial to extract the traversal:

    def every_file_in_path(root_path):
        for path in children(root_path):
            if is_file(path):
                yield path
            else:
                yield from every_file_in_path(path)

    def print_every_file_in(root_path):
        for file in every_file_in_path(root_path):
            print("file: " + file + ", size:" + get_size(file))

or even extract both:

    def print_every_file_in(root_path):
        for file in every_file_in_path(root_path):
            print_file(file)

which for me is a very clean way to implement this and it's nontrivial to do in languages without generators.
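For reference, a runnable sketch of the same idea against a real filesystem. It swaps the pseudocode helpers for standard library calls (`os.scandir` in place of `children`/`is_file`, `os.path.getsize` in place of `get_size`), and skipping symlinks is an assumption made here, not something from the pseudocode:

```python
import os

def every_file_in_path(root_path):
    # Lazily yield the path of every regular file under root_path.
    with os.scandir(root_path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                yield from every_file_in_path(entry.path)
            elif entry.is_file(follow_symlinks=False):
                yield entry.path

def print_file(path):
    print(f"file: {path}, size: {os.path.getsize(path)}")

def print_every_file_in(root_path):
    # Traversal and processing are both pluggable here.
    for path in every_file_in_path(root_path):
        print_file(path)
```

Calling `print_every_file_in(".")` prints one line per file under the current directory; any other consumer can reuse `every_file_in_path` unchanged.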



