1. 21
    1. 6

      It’s a real shame we didn’t get these in ES2015 because of Google. Regular folks in JavaScript could have used it, but I’m happy that at least WASM won’t be limited by the lack of tail calls, like the JVM, etc.

      1. 8

        My understanding is that Google implemented it, but there was a developer rebellion against the change, because it removes stack frames from stack traces, making it hard to debug code. Imperative programmers don’t like the feature, even if functional programmers do like it.

        WebAssembly resolved this conflict by adding new opcodes that make the tail call explicit. If you use the old opcodes, you get the old semantics, and no stack frames are removed from your stack trace.

        more info: https://stackoverflow.com/questions/42788139/es6-tail-recursion-optimisation-stack-overflow

        1. 6

          I am wondering if “because it removes stack frames” is the whole reason here.

          Really, these are two different language features:

          • Guaranteed Tail Call Optimization, where the compiler can reuse the stack frame if the stars are aligned
          • Explicit Tail Calls, where the programmer syntactically marks the places where they want a tail call

          The first one is what people like to ask, but that’s a horrible feature to have in the language! The second one is a great tool for both:

          • writing iterative algorithms in recursive style if you feel like it
          • implementing threaded-code interpreters

          Why is GTCO horrible? Because it changes the semantics of the code, but there’s nothing in the code signalling whether this is intentional or not, so it is very easily refactored away.
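
          A concrete sketch of that hazard in JavaScript (function names are made up, and since V8 never shipped TCO the “O(1)” behavior here is hypothetical): a routine refactor, like wrapping the body in try/finally to add tracing, silently moves the call out of tail position.

          ```javascript
          // With guaranteed TCO, this would run in O(1) stack space,
          // because the recursive call is in tail position.
          function countdown(n) {
            if (n === 0) return "done";
            return countdown(n - 1); // tail call
          }

          // The "same" function after an innocent refactor: the finally
          // block must run after the recursive call returns, so the call
          // is no longer in tail position, TCO silently stops applying,
          // and stack use is back to O(N). Nothing in the source warns you.
          const trace = [];
          function countdownTraced(n) {
            if (n === 0) return "done";
            try {
              return countdownTraced(n - 1); // NOT a tail call anymore
            } finally {
              trace.push(n);
            }
          }

          console.log(countdown(5));       // "done"
          console.log(countdownTraced(3)); // "done", but O(N) stack
          ```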

          1. 1

            Hm can you specify what semantics you’re referring to? In Golang or in general?

            1. 2

              If TCO kicks in, the function uses O(1) stack space, and O(N) otherwise. As stacks are short, O(N) in practice means an outright crash with a stack overflow.
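
              A runnable illustration on Node.js (the exact overflow depth varies by engine, but a million frames reliably exceeds it):

              ```javascript
              // Tail-recursive sum. V8 does not perform TCO, so each call
              // costs a stack frame: O(N) stack space in practice.
              function sumRec(n, acc = 0) {
                if (n === 0) return acc;
                return sumRec(n - 1, acc + n); // would be O(1) with TCO
              }

              // The same algorithm as a loop: O(1) stack space.
              function sumLoop(n) {
                let acc = 0;
                for (let i = n; i > 0; i--) acc += i;
                return acc;
              }

              console.log(sumRec(1000)); // 500500 -- shallow enough to work
              console.log(sumLoop(1e6)); // 500000500000 -- always works

              try {
                sumRec(1e6); // ~10^6 frames: overflows the stack on stock V8
              } catch (e) {
                // The crash-vs-result difference is exactly what TCO changes.
                console.log(e instanceof RangeError); // true on Node.js
              }
              ```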

              1. 1

                Mm, I see. I just wouldn’t call that semantics; it’s an implementation detail. Semantics is what the code means, and the code you write doesn’t say “every time this function is called, X bytes are allocated in a region of memory that grows upwards/downwards”. That’s an ABI thing, and tail call optimization is an optimization of that implementation detail.

                1. 1

                  Example, the C standard doesn’t mention call stack or frames at all. Whereas the amd64 ABI does.

                  1. 1

                    If a programming language supports Proper Tail Calls (don’t call it TCO) as a feature, then this gives you a way to perform a tail call with O(1) space complexity in a chain of tail calls. Stack space doesn’t get used up. Analyzing the space and time complexity of an algorithm using big-O notation is the domain of Computational Complexity Theory, a branch of computer science.

                    In the C++ standard, the space and time complexity of many parts of the standard library is specified (as part of the standard). For example, in std::vector, the time complexity of accessing an array element is O(1).

                    I know you wouldn’t call that “semantics”. But it is information that you need in order to formally reason about whether a particular program meets or does not meet its requirements.

                    1. 1

                      Yes, that’s apparent to everyone who has studied compsci and unrelated to this discussion. I asked “can you specify what semantics you’re referring to?”, out of curiosity, thinking it was related to programming language semantics.

                    2. 1

                      Proper Tail Calls (don’t call it TCO)

                      It’s ok to call it TCO.

                      It’s just a label. We all know it means the same thing.

                      Yes, in the context of implementing Scheme, it is more than a mere optimization. But we really don’t need to split hairs about the terminology.

                2. 1

                  That’s why the part about “stack being short” is important. The difference between “it crashes” and “it returns a result” is semantic, and it’s that difference that tailcalls give you on typical implementations. (If an implementation allocates all frames on the heap, then there’s no practical semantic difference between tco and no tco, just performance)

                  1. 1

                    I’d say “okay why not?” but the reason I commented in the first place is that semantics in PLT is a defined thing, and doesn’t concern ABI details at all. I understand that you didn’t mean it that way however.

                    https://en.wikipedia.org/wiki/Semantics_(computer_science)

                    1. 1

                      Dunno, total and partial functions seem pretty different to me, semantics-wise

                      1. 1

                        This just sounds like a difference of perspective. From the program’s point of view, a function that unconditionally crashes is not only total, but has any return type you like, and can be pretty useful to have! However, of course, as a user, it’s probably not ideal if the program unconditionally crashes.

        2. 4

          JSC does perform TCO in strict mode (the stack is observable for non-strict mode functions) and no one seems to mind. I do feel like in many cases people freak out about losing call stack information because any conversation about TCO starts off with an aggressive “a big downside is you will lose stack frames”, which primes people to over-weigh the downside.

        3. 2

          …weird. Seems obvious to just have a “count” in the stack frame of a function that is tail recursive, and make the recursive call increment it? Then you can have a stack trace that prints out:

          foo()
          bar()  [recurses 1391 times]
          baz()
          

          …which frankly seems preferable to the other kind. You add a single addition to all your tail recursive calls, but that’s still cheaper than making a new stack frame.

          1. 5

            I think you are only considering the case where a function tail-recursively calls itself.

            The proper tail call feature in JS will elide stack frames even when there is no recursion. There can also be groups of mutually recursive functions that call each other. If the pattern of mutual recursion is simple, you might get something like this:

              a -> b -> c -> a -> b -> c -> ...
            

            but if the tail calls are conditional then the pattern of calls could be more complex.

            In cases like this, simply printing out a count of the number of elided stack frames doesn’t show you the pattern of calls that occurred.

          2. 2

            Execution might go through different functions that are tail recursive. For example, consider two functions that call each other, even() and odd(). The stack without tail calls would be the sequence even, odd, even, odd, etc., whereas with tail calls they would share the same frame.
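
            A minimal sketch of that shape (illustrative; V8 will actually grow the stack here, since it doesn’t implement proper tail calls):

            ```javascript
            // Mutually recursive even/odd. Both recursive calls are in
            // tail position, so with proper tail calls the two functions
            // would alternate inside one reused stack frame; without
            // them, each call adds an even/odd frame to the stack.
            function isEven(n) {
              if (n === 0) return true;
              return isOdd(n - 1); // tail call into the partner function
            }

            function isOdd(n) {
              if (n === 0) return false;
              return isEven(n - 1); // tail call back
            }

            console.log(isEven(10)); // true
            console.log(isOdd(7));   // true
            ```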

            1. 1

              Hmmm. Trickier, though I would find it vaguely sensible to see:

              foo()
              even()  [recurses 1391 times]
              odd()   [recurses 1390 times]
              baz()
              

              or maybe even:

              foo()
              even() + odd()  [recurses 1391 times]
              baz()
              

              You could argue that “you now need to know what this new type of message means”, but honestly, how many people write mutually-recursive functions but balk at learning a bit more about how tail recursion works in their tool?

        4. 1

          I wonder how expensive it would be to maintain the stack trace information separately in a buffer on the heap. For a compiled language with clear debug and optimized modes, such as Rust, this could be compiled into debug builds and compiled out in optimized builds, but I guess JavaScript would need to decide at runtime, if at all (maybe with some toggle in the developer console).

          1. 3

            A common use case for proper tail calls is implementing a state machine, where you jump to the next state using a tail call. If the state machine is the core logic of a server, or of your text editor, then it could be running for days or weeks. The requirement for proper tail calls is to use constant storage no matter how long the chain of tail calls. Accumulating a stack trace in the heap doesn’t satisfy this requirement, and could cause a long running program to run out of heap space.
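
            In a language without proper tail calls, a common way to sketch this pattern is a trampoline: each state returns the next state function instead of tail-calling it, and a driver loop invokes states one at a time, so the chain runs in constant stack space. An illustrative JavaScript sketch (the state names are made up):

            ```javascript
            // Trampoline-driven state machine: a stand-in for what proper
            // tail calls provide directly. Each state returns either the
            // next state (a function) or a final result, and run() loops,
            // so the stack never grows no matter how many transitions occur.
            function run(state, ctx) {
              while (typeof state === "function") {
                state = state(ctx);
              }
              return state; // first non-function value is the result
            }

            // Hypothetical three-state machine that counts ctx.n down to 0.
            const stateA = (ctx) => (ctx.n > 0 ? stateB : "halted");
            const stateB = (ctx) => { ctx.n -= 1; return stateC; };
            const stateC = () => stateA; // unconditional transition

            console.log(run(stateA, { n: 1000000 })); // "halted", O(1) stack
            ```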

            The debugging requirement (understanding the history of what a program did, during debugging) is better met by using a time travel debugger, which automatically keeps a complete history, rather than just a stack trace. The amount of heap space used to record history becomes a debugger configuration, rather than history being all or nothing based on the build type.

            The best way to satisfy the needs and wants of both imperative and functional programmers, within a traditional imperative language with a well established tool chain, is to provide an explicit keyword for proper tail calls.

          2. 3

            The best approach is to use a constant-size ring buffer. The point of TCO is that you get recursion in constant space.

        5. 1

          IMO this doesn’t make it harder to debug code.

          It is also easily solved with a ring buffer.

        6. 1

          If that’s true, then this sheds some light onto this Rob Pike quote: “Our programmers are not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.”

    2. 1

      It’s so sad that they created this universal VM that you can implement any language on top of… except Scheme. If you like Scheme, go write a slow interpreter.

      1. 3

        Guile is making good progress towards (what should become) performant Scheme on Wasm. Their implementation assumes Wasm VMs will support both this tail calls extension and also the GC extension. Both extensions are implemented in at least V8 and making good progress in other Wasm hosts as well, so Scheme on Wasm should be possible soon.

    3. 1

      I became pretty sceptical of tail call optimization after I read Guido van Rossum’s reasoning for not having it in Python: http://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html

      1. 8

        The article is about tail call support in WebAssembly. This is important because it is useful when compiling programs into WebAssembly. In C++20, proper tail calls are used to implement coroutines, even though C++ doesn’t have proper tail calls as a user-accessible feature. Likewise in Swift, proper tail calls are used to implement async, even though Swift doesn’t provide this feature to users. The clang implementation of C/C++ has a non-portable ‘musttail’ attribute that marks a tail call, and it has been used for some stunning performance hacks, such as this:

        Parsing Protobuf at 2+GB/s: How I Learned To Love Tail Calls in C

        Zig, a new systems language, already has proper tail calls, and other systems languages will follow.

        Because proper tail calls are now important for high performance systems programming, they are no longer some weird niche academic functional programming thing, like they were in 2009 when Guido wrote that now very dated blog post.

      2. 2

        Tail call optimization is required for Scheme; it isn’t required for Python.

        It’s a completely different style of programming. Even if you prefer Python, it’s valid.

      3. 1

        This is only a problem if your language’s only method of error reporting is based on exceptions and stack traces.