Threads for deepchasm

  1.  

    England should drive on the right side of the road like everyone else, right?

    Except the NPV (net present value) of the switching costs ends up being more than the NPV of the payoff. Perhaps the cost of deprecating JavaScript will be higher, for decades, than just supporting it.
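
    A toy illustration of that framing, with entirely made-up numbers: the switch only makes sense if the discounted payoff outweighs the discounted switching cost.

    ```python
    # Made-up numbers, just to show the comparison being described above.
    def npv(cashflows, rate=0.05):
        return sum(cf / (1 + rate) ** year for year, cf in enumerate(cashflows))

    switching_cost = npv([-100, -40, -40])         # heavy up-front cost, then cleanup
    payoff = npv([0, 10, 10, 10, 10, 10])          # modest benefit, spread over years

    print("switch" if payoff + switching_cost > 0 else "keep the status quo")
    ```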

    In general, WASM means that LLVM targets from your favorite new breed of languages will run in a browser and other infrastructure.

    1. 2

      Still recovering from the COVID virus, but past the sleeping-all-day phase. Making a map of the house and yard. Using ChatGPT as a junior programmer. Plotting to take over the world.

      1. 1

        So, Nintendo is dead now?

        I really don’t need a console game system anyway. It’s not like my eyeballs lack for diversions.

        1. 3

          I don’t think so, because most normal customers are not affected. They buy a console with some games and play them. Playing games on their own computer is not a feature most users want/need.

          Also, this case is a bit different, because Nintendo isn’t choosing to be evil just to be evil. They currently want to prevent pirates from playing a not-yet-released game. I don’t think this will help, but I can clearly understand the reasoning.

          1. 1

            Hah. It’s just another day in the “Nintendo being assholes” news cycle.

            Even amongst the small group of people who care right now, a good chunk are going to stop caring as soon as the next installment of ${NOSTALGIC_NINTENDO_ONLY_FRANCHISE} gets announced.

            1. 1

              People were super mad at Sony a few years back, and they’re still around, so I think Nintendo will be fine.

            1. 5

              Very cool. However, while I’m not sure how exactly they define a transition, it would seem that their claimed transition from 2 to 6 goes via 5.

              1. 6

                The most straightforward definition of a transition is “a process during which a system starts at state X and ends at state Y”. Forbidding the system from passing through any of a set of states can be interesting, but that’d be an additional requirement. (Maybe there’s some way to define a transition that incorporates this requirement as an inherent property, but I can’t think of it.)

                Also, the state of the system includes the velocity and the acceleration of each node [1]. The transition from 2 to 6 goes through a state where the relative positions of the nodes are the same as (or very close to) those in state 5, but it does not pass through state 5 itself, because the system isn’t stationary at that moment.

                [1] Otherwise you can just swing the pendulum randomly at different speeds; the arms will accidentally line up from time to time, and you can claim to have achieved the transitions.
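
                To make that concrete, here is a small, purely hypothetical Python sketch (the names, tolerances, and three-arm example are mine, not from the video): a configuration only counts as “state 5” if the angles match and the system is essentially at rest.

                ```python
                # Hypothetical sketch: "being at a state" = angles match AND (near) zero velocity,
                # so arms that merely line up mid-swing don't count as passing through that state.
                from dataclasses import dataclass
                import math

                @dataclass
                class PendulumState:
                    angles: tuple[float, ...]       # one angle per arm, in radians
                    velocities: tuple[float, ...]   # angular velocities, in rad/s

                def matches_rest_state(state, target_angles, angle_tol=0.05, velocity_tol=0.05):
                    angles_close = all(abs(a - t) < angle_tol
                                       for a, t in zip(state.angles, target_angles))
                    at_rest = all(abs(v) < velocity_tol for v in state.velocities)
                    return angles_close and at_rest

                # Arms lined up like the target state, but still swinging: does NOT count.
                swinging = PendulumState(angles=(0.0, math.pi, 0.0), velocities=(1.2, -0.8, 0.4))
                # Same configuration, stationary: counts as passing through the state.
                stationary = PendulumState(angles=(0.0, math.pi, 0.0), velocities=(0.0, 0.0, 0.0))

                print(matches_rest_state(swinging, (0.0, math.pi, 0.0)))    # False
                print(matches_rest_state(stationary, (0.0, math.pi, 0.0)))  # True
                ```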

                1. 2

                  In this case, I am somewhat confused.

                  The first minute shows transitions between each of the states. For example, 5 -> 0 -> 3, if intermediate states are allowed.

                  What am I missing?

                  1. 3

                    What am I missing?

                    That it does them in one single fluid motion.

                    1. 2

                      Hmm, I suppose you mean that because all the transitions between 0 and X are shown, the transition from X to Y can simply be made by going from X to 0 and then from 0 to Y?

                      That is correct - but the other transitions shown are shorter. I imagine those are the shortest the creator found.

                1. 17

                  There will always be many responses of “the leopard has not, to date, eaten my face” and “risking a face is well worth it to have a leopard”. Sometimes the answer is simply to decide not to play. Signal will, in the fullness of time, sheepishly admit to a slew of security problems, sharing the complete message stream with various governments, and enact new monetization methods.

                  The leopard will always eat your face.

                  1. 13

                    Is this based on anything, or are you just saying you get the wrong “vibe” from Signal? If so, what would you suggest to replace it? Signal has done a better job at bringing secure, encrypted communications to the masses than any other group or app that I’m aware of, and its detractors always seem to have arguments like “it feels wrong”. And then they typically suggest replacing it with options that either don’t even encrypt basic metadata (like Matrix) or are wildly difficult to use and could never possibly gain mass adoption (like Briar).

                    1. 3

                      There are parts of the security design of Signal that are lacking (though it also innovated better security in quite a few parts). Signal also fails at some basic security practices: e.g., https://github.com/signalapp/Signal-Desktop/ has commits and releases that are not signed. I offered to help them specifically with that on their bug tracker, but nobody from Signal signalled interest, nor has it been fixed in the years since.

                      They declined other suggestions that would IMHO improve security, with incorrect technical reasoning. I think a few of these arguments are scientifically accurate and would not clash with their anti-federation stance. But IMHO the leopards would still eat your face even if anti-federation were the only issue.

                      https://github.com/simplex-chat/simplex-chat does read like it has massive security design improvements over some parts where Signal is lacking; however, I have neither reviewed it in detail nor tested it yet. It seems to satisfy your requirements for suggestions; can you confirm?

                    2. 4

                      Maybe. Signal’s already answered subpoenas in US court (see their writeup here), and they would be in an immense, project-ending amount of legal trouble if it turned out that they were technically able to provide more data but chose not to. I don’t want to accuse you of FUD, but I do want to point out your lack of evidence.

                      1. 1

                        Technically, OP didn’t say decrypted messages or past message stream, which leaves the stream of still-encrypted future messages (and their metadata). If the Signal organisation or individual employees are compelled by a court, or are otherwise under duress, to modify the servers to share this, then no technical mechanism is in place to prevent it. Such a protection mechanism would be practical to build.

                        Furthermore, getting the cleartext is also possible in a similar way. None of the Signal clients are protected against Signal being compelled to create an update that exfiltrates the cleartext of old stored or new messages. It could be prevented by a combination of security reviewers in multiple jurisdictions, verified reproducible builds, and an updater that fetches from an observed global append-only log (binary transparency).
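
                        Purely as an illustration of that last point (this is not Signal’s updater; the log and reviewer-feed formats here are invented), the kind of check such an updater could make before installing anything might look like:

                        ```python
                        # Illustrative only: refuse to install a binary unless its hash appears in an
                        # append-only transparency log AND matches what independent reviewers in other
                        # jurisdictions reproduced from source. Data formats here are made up.
                        import hashlib

                        def sha256_file(path):
                            h = hashlib.sha256()
                            with open(path, "rb") as f:
                                for block in iter(lambda: f.read(1 << 20), b""):
                                    h.update(block)
                            return h.hexdigest()

                        def ok_to_install(binary_path, log_hashes, reviewer_hashes, min_reviewers=2):
                            digest = sha256_file(binary_path)
                            in_log = digest in log_hashes                        # binary transparency
                            reproduced = sum(1 for h in reviewer_hashes if h == digest)
                            return in_log and reproduced >= min_reviewers        # reproducible builds

                        # Usage sketch (inputs fetched out of band):
                        # ok_to_install("update.bin", hashes_from_append_only_log, hashes_from_reviewers)
                        ```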

                        Is this sufficient evidence for you that it is possible?

                        AFAIK Signal has no willingness to accept help implementing either of these protections, nor willingness to implement them itself. I’m interested in contrary evidence.

                    1. 8

                      This looks very productive and reminds me of Python. If somebody from the Go core team reads this: please, please take inspiration from this!

                      1. 4

                        The syntax maps 1:1 onto Python. Python has long placed code being beautiful (clean; readable out loud) as a top design constraint. Zig does away with the range() and zip() constructions.

                        • Zig: `for (elems) |x|` vs Python: `for x in elems`.
                        • Zig: `for (a..b) |n|` vs Python: `for n in range(a, b)`, and the extension in Python of `for n in range(a, b, stepvalue)`.
                        • Zig: `for (elems, nats) |e, n|` vs Python: `for e, n in zip(elems, nats)`.
                        • Zig: `for (elems, nats, 0..) |e, n, idx|` is messy in Python. Usually Python: `for idx, (e, n) in enumerate(zip(elems, nats))`. One could do Python: `for e, n, idx in zip(elems, nats, itertools.count())`, but it seems less clear (see the runnable sketch after this list).
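
                        For reference, the Python half of those bullets as one runnable snippet (toy data; the variable names are just the ones from the bullets):

                        ```python
                        # Runnable version of the Python side of the comparison above.
                        import itertools

                        elems = ["a", "b", "c"]
                        nats = [10, 20, 30]

                        for x in elems:                                   # Zig: for (elems) |x|
                            print(x)

                        for n in range(2, 5):                             # Zig: for (2..5) |n|
                            print(n)

                        for e, n in zip(elems, nats):                     # Zig: for (elems, nats) |e, n|
                            print(e, n)

                        for idx, (e, n) in enumerate(zip(elems, nats)):   # Zig: for (elems, nats, 0..) |e, n, idx|
                            print(idx, e, n)

                        # The itertools.count() spelling from the last bullet also works:
                        for e, n, idx in zip(elems, nats, itertools.count()):
                            print(idx, e, n)
                        ```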

                        There are some languages, like GDScript, that back away from these constructs because they encourage a style of programming that creates intermediate lists. This is perceived as slower, though that perception can be wrong. For example, `for a in [i for i in thing if rare_condition(i)]` builds an intermediate list and is marginally faster than using an intermediate iterator. At least, it’s complicated.
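
                        Whether the list or the iterator wins really does depend on the workload and the Python version; a quick way to check on your own data is something like this (rare_condition and the sizes here are made up):

                        ```python
                        # Rough measurement sketch: intermediate list vs. generator for a rare filter.
                        import timeit

                        thing = list(range(200_000))

                        def rare_condition(i):
                            return i % 1000 == 0

                        def with_list():        # builds the whole filtered list first
                            return sum(a for a in [i for i in thing if rare_condition(i)])

                        def with_generator():   # yields filtered items lazily
                            return sum(a for a in (i for i in thing if rare_condition(i)))

                        # Which one is faster varies; measure rather than guess.
                        print("list comprehension:", timeit.timeit(with_list, number=20))
                        print("generator expr:    ", timeit.timeit(with_generator, number=20))
                        ```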

                        It would be good to have a simple nomenclature covering these, so that all languages could provide this functionality.

                        1. -2

                          Python’s popularity is a sign of the end times. Prove me wrong.

                      1. 6

                        An actual COBOL is likely to be more of an oddity and an emergency tool for ancient banks.

                        On the other hand, there is a time to really look at COBOL, the work-horse of a generation, and ask the important question: “What did it do right that we can steal?”

                          1. 1

                            His joke lost me when he essentially said “I hate syntax so much I want to invent a new language with even more syntax! And I want that syntax to be even harder to learn and reason about!”

                            There’s stuff we should take from COBOL. Its syntax, however, is an experiment which failed.

                        1. 2

                          TL;DR: Typing is a useful tool for making software cheap to maintain in the long term, which is what the author now needs.

                          Strong typing has a small cost for the individual developer and provides a benefit in coordinating code between authors or across long time spans. The author has changed jobs, moving from a goal of cheap in the short term to cheap in the long term, and now extols typed languages.

                          1. 1
                            • State representation, that is, the structure of the data, encompasses complex choices about the meaning and scope of a program.
                            • Algorithms also encompass complex choices about the meaning and scope of a program.

                            This leads to the novel and startling conclusion: Algorithms + Data Structures = Programs

                            More seriously, the art of development is about alignment of interests. Various forms of programming (functional, object oriented, declarative, imperative, etc.) are attempts to satisfy competing interests:

                            • The system can ship MVP quickly (initially cheap)
                            • The system can add new features without changing its architecture (longer term cheap)
                            • The system has enough information to create a fast binary (initially fast)
                            • The system can scale to an exponential growth of users (longer term fast)
                            • The system has restrictions to provide safe and complete execution (initially good)
                            • The system can be worked on by hundreds of independent teams (longer term good)

                            There are different interest groups of different strengths in organizations. Aligning those interests into an architecture that fully captures intent from the developers’ attention is still somewhere between an art and a craft.

                            1. 2

                              Sorry, your history is incorrect.

                              XML was heavily pushed by IBM during the late 1990s as a way to replace CSV files, fixed-width data files, and “just dump the record to disk” formats. It outperforms these in every measure but size. As the IBM model at the time was to provide consultants to a zoo of hardware and software, this allowed a mechanism to slowly wean these systems off legacy data formats. As is often the case, the usual suspects came out to push for XML as a contract negotiation system, a universal solution for some imagined problem, etc. For some time, IBM had some proprietary compression schemes lower in the network stack.

                              That said, XML has tricky corners and some line noise is valid XML.

                              1. 1

                                Still futzing around trying to capture something of COBOL. It is universally despised by computer science, and yet it added quite a lot of value during its time. I’m trying to capture the ideas of declaring all types in advance, specifying deeper than what we think of as ‘base types’, and adding active vocabulary to the documentation.
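
                                As a rough, hypothetical taste of what “specifying deeper than base types” means (a Python sketch, not the language I’m futzing with; the Pic class and field names are invented), think of COBOL’s PIC clauses:

                                ```python
                                # Hypothetical sketch: a COBOL-style "picture" constraint in Python.
                                # PIC 9(5)V99 roughly means 5 integer digits, an implied decimal point, 2 decimals.
                                from dataclasses import dataclass
                                from decimal import Decimal

                                @dataclass(frozen=True)
                                class Pic:
                                    digits: int      # digits before the implied decimal point
                                    decimals: int    # digits after it

                                    def check(self, value: Decimal) -> Decimal:
                                        quantized = value.quantize(Decimal(10) ** -self.decimals)
                                        if abs(quantized) >= Decimal(10) ** self.digits:
                                            raise ValueError(f"{value} does not fit PIC 9({self.digits})V{'9' * self.decimals}")
                                        return quantized

                                # Declare the field's shape in advance, COBOL-style:
                                CUSTOMER_BALANCE = Pic(digits=5, decimals=2)       # like PIC 9(5)V99

                                print(CUSTOMER_BALANCE.check(Decimal("123.456")))  # -> 123.46
                                ```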

                                The hardest part is getting past the inculcated ‘ick factor’ of COBOL and the real factor of COBOL having so many sharp edges, like JCL.

                                1. 6

                                  Glah! These editors are terrible. Where are my breadcrumbs (automatic saves between commits, for me not the team)? My debuggers have no evil or fuzzy schedulers to do optimally nasty swapping between jobs. I have no internationalization support to speak of, no type discovery from test cases, and no line/spec/test correlation.

                                  We need to cultivate an understanding of how great the possible is, lest we get complacent with the commonplace.

                                  1. 1

                                    Truly, old molehills make the best mountains.

                                    1. 3

                                      Sounds fun! I look forward to it.

                                      I appreciate when people take effort to improve the world.

                                      1. 2

                                        The most under-written project file is Contributing.md. Most examples never talk about where the bar is for suggestions, what kind of responsiveness to expect, and what size of commit to push. This lack of clarity often shows up as repositories with aging, unloved, unreviewed pull requests.

                                        1. 4

                                          Oddly, over time I have started enjoying dynamic typing more. At the risk of writing a competing article, here are my small praises:

                                          • There exists the “Good, Fast, Cheap” triangle. I tend to heavily optimize for “Cheap” so that the developer can explore the problem space quickly. O(n) doesn’t matter if you only run it twice before deciding the problem has more complexity and needs a different approach. Once the code stabilizes, start using more developer time to move the trade-offs for “more good, less cheap”.
                                          • Tricks that make code dynamically at run time can work better than decorators. For example, instead of calling `place_window(w, point, level)`, I might, even in the debugger, call `trace_place_window(w, point, level)`. The function did not exist before the call; it is just the original function wrapped in a tracing function (see the sketch after this list).
                                          • I wish my software discovered types better. For example, if I used a tuple of (int, string, string) more than once, it should recognize this is really a type and show it as such.
                                          • I truly hate that adapters fall out of date with what they wrap and are just piles of boilerplate.
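
                                          A minimal sketch of that trace_ trick from the second bullet (place_window and the class name here are invented for illustration, not from any real codebase):

                                          ```python
                                          # Sketch: looking up trace_<fn> builds <fn> wrapped in a tracer, on demand.
                                          import functools

                                          def place_window(w, point, level):
                                              return f"placed {w} at {point} on level {level}"

                                          class TraceNamespace:
                                              def __init__(self, namespace):
                                                  self._ns = namespace

                                              def __getattr__(self, name):
                                                  if not name.startswith("trace_"):
                                                      raise AttributeError(name)
                                                  fn = self._ns[name[len("trace_"):]]

                                                  @functools.wraps(fn)
                                                  def traced(*args, **kwargs):
                                                      print(f"-> {fn.__name__}{args}")
                                                      result = fn(*args, **kwargs)
                                                      print(f"<- {fn.__name__} returned {result!r}")
                                                      return result
                                                  return traced

                                          t = TraceNamespace(globals())
                                          # The traced variant "did not exist" until this call:
                                          t.trace_place_window("main", (10, 20), 3)
                                          ```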

                                          At the end of the day, we are here to solve problems. We ship code to solve problems. Test cases, template specifications, git logs, and even documentation are just costs that are worthwhile only if they reduce the net cost of shipping the code.

                                          1. 1

                                            Higher-level languages often do offer some guarantees even around UB, but not many. For example, the UB in your Java program may not crash the JVM. It may hang it, but not crash it.

                                            1. 2

                                              AFAIK Java has no undefined behaviour.

                                              1. 2

                                                There’s definitely less UB in Java, but not zero. The outcomes of misusing the functionality in the Unsafe class are most certainly undefined: https://blogs.oracle.com/javamagazine/post/the-unsafe-class-unsafe-at-any-speed

                                                Rust and C# have similar “UB only with unsafe” designs.

                                                And if you think that UB in Java can’t crash the JVM, try using Unsafe to scribble random bytes at random offsets in memory for a bit :) At a previous job, I was in charge of a component that used Java Unsafe and bugs in that component frequently caused programs to die with a segfault and without a Java stack trace.

                                                1. 1

                                                  Hmmm…. I no longer program in Java. For years after working at JavaSoft, I claimed I did not know Java.

                                                  You were able to get undefined behavior with poor synchronization. From 17.4.8: “We use f|d to denote the function given by restricting the domain of f to d. For all x in d, f|d(x) = f(x), and for all x not in d, f|d(x) is undefined.”

                                                  There are others I didn’t get burned by: https://itecnotes.com/software/java-undefined-behaviour-in-java/

                                                  The existence of a test suite (then a new thing to include with a language) did cut down the undefined stuff. We had a policy of ‘excusing’ any test failure and letting people still brand a release as Java. To release the next update, they had to pass every compatibility test they had previously failed.

                                                  Still, we ran out of Russian software professors desperately working from the spec when it was too dangerous to ship them workstations. The lack of undefined behavior is merely a pleasant fantasy.

                                              1. 4

                                                This is the triumph of the slowly rising tide of open source.

                                                The lifecycle is clear. A freely available program (git) gets effectively captured by a company built to support the community (GitHub). The company is purchased by a larger company (Microsoft) for which updates are not in its best interest. Other large companies (e.g., Facebook), usually competitors, create a fork for internal use. For whatever reason, the fork is later released to the public. The cycle of life continues.

                                                1. 13

                                                  The fork is actually of Hg, and the reason it exists is that companies of Facebook’s size, with a tech stack at Facebook’s scale, end up needing a particular type of tooling. Google has something similar internally as well that hasn’t been open sourced.

                                                  This particular thing is not core to the company’s value proposition, however, despite being necessary tooling, so offering it as open source has almost no downside.

                                                  1. 9

                                                    Company purchased by a larger company (MicroSoft) where updates are not in its best interest

                                                    GitHub has had lots of updates since the Microsoft acquisition, no?

                                                    1. 2

                                                      Yes, such as adding achievements and a proprietary service made using open-source code.

                                                      1. 1

                                                        It has had many. For example, it provides more barriers to new contributors by refusing to run continuous integration scripts on pull requests. It provides better integration for VSCode, with authentication methods that cause more work for new contributors. Every day brings a new elaborate system to replace a previous step.

                                                        Lots of updates.

                                                    1. 2

                                                      TreeSitter grammars. I have fallen into that old trap of trying to make a new programming language for fun. This one is based on: Python wakes up with a hangover next to COBOL!

                                                      1. 13

                                                        This is an interesting observation and I think the message is valid, but I don’t think the benchmark results paint an accurate picture. The author is completely utilizing the computer’s disk I/O bandwidth, but the CPU cores are only utilised at (I’ll take a guess) 12.5%, assuming 8 hyper threads available. I think this distinction is particularly important in the big data case mentioned at the end.

                                                        1. 9

                                                          It may be far worse. As Python has no “@associative” built-in to mark the counting with, it is probably running on a single core, with that core handling the overhead of data movement as well.

                                                          That said, it is a guess. My humble view is that we are no longer capable of optimizing without careful analysis of actual profiling data. Our guesses and logic are no match for chip pipeline and thread oddness.

                                                          1. 2

                                                            Yes, there are definitely straightforward ways to parallelize this particular problem. That’s part of the reason I put “big data” in quotes. Often data just isn’t “big” enough to worry about it – if I can get the answer I want out of a 400MB or even 4GB input file in a few seconds, I’ll probably stop there. However, if I have to write a more general tool that I’ll use over and over, I’d parallelize it if it was slow (for some value of slow).
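
                                                            For the record, one straightforward shape the parallel version could take in Python (a rough sketch of the kind of thing I mean, not what the article measures; it assumes line-oriented text and plain whitespace tokenization):

                                                            ```python
                                                            # Rough sketch: parallel word count by splitting the input into line chunks
                                                            # and merging per-chunk Counters (counting is associative, so merging is safe).
                                                            import sys
                                                            from collections import Counter
                                                            from multiprocessing import Pool

                                                            def count_chunk(lines):
                                                                c = Counter()
                                                                for line in lines:
                                                                    c.update(line.lower().split())
                                                                return c

                                                            def parallel_word_count(path, workers=8, chunk_lines=100_000):
                                                                with open(path, encoding="utf-8", errors="replace") as f:
                                                                    all_lines = f.readlines()
                                                                chunks = [all_lines[i:i + chunk_lines]
                                                                          for i in range(0, len(all_lines), chunk_lines)]
                                                                with Pool(workers) as pool:
                                                                    partials = pool.map(count_chunk, chunks)
                                                                total = Counter()
                                                                for partial in partials:
                                                                    total += partial
                                                                return total

                                                            if __name__ == "__main__":
                                                                counts = parallel_word_count(sys.argv[1])
                                                                for word, n in counts.most_common(10):
                                                                    print(n, word)
                                                            ```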

                                                            1. 2

                                                              Maybe it relates to the international nature of Internet communication/English as a second language effects, but a lot of readers seem to not grasp the use of scare quotes for “non-standard” or “difficult to interpret” or “context dependent/no standard even exists, really”. FWIW, I also love them as a pithy “author understands interpretational risk” signal. I don’t have any better answer than more elaborate text – like “some value of”/“in context”/etc. qualifier constructions, but the resulting more complex grammar can be a problem for the very same readers.

                                                              This is all especially an issue in performance of anything where almost everything “all depends” upon, well, quite a lot. E.g., these days (post Nehalem Intel/L3) DIMM reads themselves depend quite a lot on whether done in parallel or serial. So, I have an NVMe that goes 7 GiB/s while your DIMM estimate was 10 G, but I’d bet your parallel DIMM reads are 25..30. Big “scalable” Intel multi-core could see a range of 8 to 50 GiB/s (or more, even a whole order of magnitude). Personally, it annoys me that I am sometimes forced to go parallel (not necessarily multi-threaded) with all its attendant complexity on an otherwise very serial problem just to saturate a serial memory bus (yes, I know NUMA can be a thing). As this paragraph maybe shows, qualifying everything gets indigestible fast. ;-)

                                                              Anyway, @enobayram, to add a little more detail on your point: this is one way to parallelize the problem in Nim. Note that scaling up by just doing 10X something like the KJ bible keeps hash tables artificially small relative to natural-language vocabulary use, which matters since fitting each vocabulary-scaled histogram in the private (i.e. maybe non-hyper-threaded) L2 can make a big perf difference at some data scales. Whether hyper-threaded CPU resource sharing helps also “all depends”. At least on Linux, something like the taskset command might aid in reasoning about that.