stavros 5 years ago

> αcτµαlly pδrταblε εxεcµταblε

As a Greek, if you do this, I hate you. Why the hell do you have to make me read "actmally pdrtable execmtable"? At least this is one of the less offensive cases.

EDIT: Solidarity to our Cyrillic friends!

jart 5 years ago

Author here. I wanted to honor Greece for the amazing cultural impact they've had, similar to how mathematics honors Greece. We got a lot of comments like this in the last thread. What dang said about it was really smart: https://news.ycombinator.com/item?id=24264514

  • JxLS-cpgbe0 5 years ago

    What dang said about it was not smart.

    > it's good for readers to have to work a little

    Unless they're using assistive technologies. In that case it's a nightmare. Don't make your users work.

    > it's not hard for any HN reader to do the bit of work to figure it out

    Unless they're using assistive technologies. Or just want to read it without work.

    Or, say searching for it. This post comes up. The one you linked to doesn't.

    https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

    Respect to you for wanting to honor Greece. I think using the letters* correctly would honor them more. (thanks for the correction)

    • StavrosK 5 years ago

      > I think using Cyrillic correctly would honor them more.

      (Greece doesn't use Cyrillic but I agree with you otherwise)

    • ClawsOnPaws 5 years ago

      Yes. My screen reader, at least Voiceover on my phone, had a stroke reading that. I had to navigate letter by letter and guess what it meant. But it's also quite common so I'm used to doing that regardless.

  • ActorNightly 5 years ago

    I know this is a loaded question, but are there any resources you can point to for learning the Linux syscall stuff, or perhaps writing a C compiler from scratch? I thought I had a fairly good grasp of this stuff but after looking through the cosmopolitan code, I realized I'm not even close.

    • jart 5 years ago

      Rui is writing a book for the chibicc compiler in the cosmo codebase. I should probably write a book on system interfaces since there's no school for it. I had to go straight to the primary materials, i.e. the source to pretty much every existing kernel and libc along with the historical ones in order to understand the origin of influence. That's what helped me have a razor sharp focus on the commonalities which made this project possible.

      So I'd say that the SVR4 source code would be a good place for you to start. It's like ambrosia and once you've read it you can always tell by reading modern code which developers have and haven't seen it. There's also the Lions' Commentary on Unix. I highly recommend W. Richard Stevens. The last book on the required reading list is BOFH.

    • matheusmoreira 5 years ago

      > learning the linux syscall stuff

      I've been studying this for a while. Turns out Linux has an amazing interface. It's stable and language-agnostic. All you need to do is put the values in specific registers and execute a special instruction. The result comes back in one of those same registers.

      The high level documentation is here:

      https://man7.org/linux/man-pages/man2/syscall.2.html

      https://man7.org/linux/man-pages/man2/syscalls.2.html

      https://www.kernel.org/doc/Documentation/ABI/stable/syscalls

      https://www.kernel.org/doc/Documentation/ABI/stable/vdso

      On Windows there is a similar interface but it is not stable. The system call numbers can change. Developers are supposed to use the good old Microsoft DLLs in order to get anything done. Just like how everyone uses libc on other systems.

      Linux is different. The system call binary interface is the Linux interface. So it's actually possible to trash all of GNU and rewrite the entire Linux user space in Rust or Lisp or whatever. It doesn't have to be written in C. It doesn't even have to be POSIX compliant. Could be GUI-focused!

      All you need to make any x86_64 Linux system call is this code:

        long
        system_call(long number, long _1, long _2, long _3, long _4, long _5, long _6)
        {
            register long rax __asm__("rax") = number;
            register long rdi __asm__("rdi") = _1;
            register long rsi __asm__("rsi") = _2;
            register long rdx __asm__("rdx") = _3;
            register long r10 __asm__("r10") = _4;
            register long r8  __asm__("r8")  = _5;
            register long r9  __asm__("r9")  = _6;
      
            /* r8, r9 and r10 may be clobbered but can't be in the clobbers list
               because the compiler won't use clobbered registers as inputs.
               So they're placed in the outputs list instead. */
            __asm__ volatile
            ("syscall"
      
             : "+r" (rax),
               "+r" (r8), "+r" (r9), "+r" (r10)
             : "r" (rdi), "r" (rsi), "r" (rdx)
             : "rcx", "r11", "cc", "memory");
      
            return rax;
        }
      

      This is all you need to do anything. You can perform I/O. You can allocate memory. You can obtain your terminal's dimensions. You can perform ioctls on your laptop's camera. You could make a new programming language today and all it really needs to be complete is this single function. What if instead of having this function the compiler could simply emit code that conforms to this binary interface? The language could have a system_call keyword that generates Linux system call code!
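
      A minimal sketch of the I/O claim, assuming x86_64 Linux and GCC or Clang inline assembly. The names raw_syscall3 and raw_write are hypothetical, not part of liblinux or any libc; the numbers 1 (write) and 39 (getpid) are the x86_64 Linux system call numbers. It also shows that a failed call comes back as a negative error code in the same register:

```c
/* Hypothetical three-argument helper; x86_64 Linux only.
   The number goes in rax, the arguments in rdi, rsi and rdx,
   and the result comes back in rax. */
static long raw_syscall3(long number, long a, long b, long c)
{
    long result;
    __asm__ volatile ("syscall"
                      : "=a" (result)
                      : "0" (number), "D" (a), "S" (b), "d" (c)
                      : "rcx", "r11", "cc", "memory");
    return result;
}

/* write is system call number 1 on x86_64 Linux. */
static long raw_write(int fd, const void *buffer, unsigned long length)
{
    return raw_syscall3(1, fd, (long) buffer, (long) length);
}
```

      Calling raw_write(1, "hello\n", 6) should print to standard output without touching libc, while raw_write(-1, "x", 1) returns the raw value -9 (the negated EBADF) instead of setting errno.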

      Once I realized this I tried to turn it into a library called liblinux... I stopped working on it when I found out the kernel already has an awesome single file header you can include that lets you build freestanding Linux executables for a ton of architectures. They use it on the kernel to build their own tools!

      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

      It even includes process entry point code! Linux copies the argument and environment vectors to the stack before entering the executable. The process start up code obtains those pointers and passes them to the main function. It also ensures the exit system call is called.

      The process entry point is usually called _start because that's what linkers look for by default. In reality the ELF header has a pointer to the program's entry point, the actual symbol doesn't matter. You can tell the linker to set it to any other address or symbol. Also note that it's an entry point, not a function. There is no return address. Allowing that code to terminate results in a segmentation violation. Hence the need to ensure exit is called before that happens.

      The only feature that seems to be missing is support for the table of auxiliary values:

      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

      The auxiliary values are placed on the stack immediately after the environment vector. So all you need to do to find this pointer is walk the environment vector until you pass its terminating null pointer. I wrote this code and it works:

      https://github.com/matheusmoreira/liblinux/blob/master/start...

        struct auxiliary { Elf64_Off type; Elf64_Off value; };
      
        static void *after(void *vector)
        {
            void **pointer = (void **) vector;
            while (*pointer++ != 0);
            return pointer;
        }
      
        int liblinux_start(void *stack_pointer)
        {
            long count;
            char **arguments;
            char **environment;
            struct auxiliary *values;
      
            count = *((long *) stack_pointer);
            arguments = ((char **) stack_pointer) + 1;
            environment = arguments + count + 1;
            values = after(environment);
      
            return start(count, arguments, environment, values);
        }
      

      You can just loop over the pointer to the structure until you find one with type equal to AT_NULL. Example here:

      https://github.com/matheusmoreira/liblinux/blob/master/examp...
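
      The AT_NULL loop described above can be sketched roughly like this. The helper name find_auxiliary is hypothetical and is not taken from liblinux; the struct mirrors the one shown earlier, and Elf64_Off, AT_NULL and AT_PAGESZ are assumed to come from the Linux <elf.h> header:

```c
#include <elf.h>   /* Elf64_Off, AT_NULL, AT_PAGESZ */

struct auxiliary { Elf64_Off type; Elf64_Off value; };

/* Hypothetical helper: walk the auxiliary table until the AT_NULL
   terminator and return the first entry with the requested type,
   or a null pointer if there is none. */
static const struct auxiliary *
find_auxiliary(const struct auxiliary *values, Elf64_Off type)
{
    for (; values->type != AT_NULL; ++values)
        if (values->type == type)
            return values;
    return 0;
}
```

      With the values pointer computed by liblinux_start, find_auxiliary(values, AT_PAGESZ)->value would be the page size the kernel reported, assuming that entry is present.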

      • jart 5 years ago

        Author here. You would like this project: https://chromium.googlesource.com/linux-syscall-support/ Thank you for reminding me of the joy I felt when I discovered this. I feel like you should publish this and post it on Hacker News. Because too many people who post here hold the viewpoint that SYSCALL is evil and you must link the platform libc dynamic shared object or else you're a very horrible person who deserves to have their binaries broken like Apple did to Go. But they wouldn't feel that way, if they could just see the beauty you described.

        • matheusmoreira 5 years ago

          Thanks! Your projects are so inspiring. I too felt great joy discovering all this. Every time I see someone asking about system calls I respond by writing about everything I know. I usually don't get many replies... So happy to see another person who understands.

          > You would like this project: https://chromium.googlesource.com/linux-syscall-support/

          Yes, I would! I saw references to this library in your source code, specifically your jump slots implementation. I had no idea Chromium had this and I've been meaning to explore it later. I'm gonna do it now.

          > Because too many people who post here hold the viewpoint that SYSCALL is evil and you must link the platform libc dynamic shared object or else you're a very horrible person who deserves to have their binaries broken like Apple did to Go.

          I know what you mean! Using system calls directly is heavily discouraged by libc maintainers and even users. Using calls like clone will actually screw up the global state maintained by glibc's threads implementation. It gets to the point where they don't even offer wrappers for system calls they don't want to support. I don't like it... What's the point of an amazing system call that lets you choose exactly which resources you want to share with a child task if all it's ever used for is some POSIX threads implementation?

          Even the Linux manuals do this for some reason: the documentation I linked in my above post actually describes the glibc stuff as if it were part of the kernel and leaves the actual binary interfaces as an afterthought. Linux manuals also inexplicably host documentation for systemd instead of a generic description of how a Linux init system is supposed to interface with the kernel. It makes no sense to me!

          I even asked Greg Kroah-Hartman about it on Reddit:

          https://old.reddit.com/r/linux/comments/fx5e4v/im_greg_kroah...

          I actually think using the system call interface is better than using the C library. No thread local errno business, no global state anywhere, no buffering unless you do it explicitly, no C standard to keep in mind... It's just so simple it's amazing. It's also stable unlike other operating systems which ship user space libraries as the actual interface. On Linux there's no reason not to use it!

          > I feel like you should publish this and post it on Hacker News.

          I wrote a liblinux library, the README describes part of my journey learning about this system call stuff. Lots of LWN sources!

          https://github.com/matheusmoreira/liblinux/blob/master/READM...

          I've been thinking about expanding on it in order to describe everything I know about the Linux system call interface. You really think I should publish this?

          The reason I didn't post liblinux here is it's in a very incomplete state and actually less practical than the kernel nolibc.h file. I only discovered the header much later into development and figured there was no point anymore since the kernel had a much better solution not only available but in actual use. I ended up rewriting autoconf in pure makefiles instead...

          • jart 5 years ago

            We're pretty much on the same page. I'm not sure if I share your enthusiasm for clone(), but I think the canonical Linux interface is what's going to save us from the dark patterns we see in userspace. I think everyone should learn how to use raw system calls. Because the first thought that's going to cross their mind is "wow I thought my C library was doing all these things" and then they're going to want a C library that offers more value than putting a number in the eax register.

            For example if you want to call fork() using asm() then on Linux it's simple:

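                #include <errno.h>

                int fork(void) {
                  int ax;
                  /* 57 is the fork system call number on x86_64 Linux */
                  asm volatile("syscall" : "=a"(ax) : "0"(57) : "rcx", "r11", "memory", "cc");
                  /* unsigned compare: only raw results in [-4095, -1] are errors */
                  if ((unsigned)ax > -4096u) errno = -ax, ax = -1;
                  return ax;
                }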
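
            The error convention in that last step can be pulled out into a helper. This is only a sketch under the assumption, taken from the snippet above, that a raw Linux result in the range -4095 to -1 is a negated errno value and anything else is a success. The name syscall_result is hypothetical:

```c
#include <errno.h>

/* Hypothetical helper: translate a raw Linux system call result into
   the libc convention (-1 plus errno on failure). The unsigned compare
   is true only for raw values in [-4095, -1]; a plain signed
   "raw > -4096" would also be true for every successful result. */
static long syscall_result(long raw)
{
    if ((unsigned long) raw > -4096UL) {
        errno = (int) -raw;
        return -1;
    }
    return raw;
}
```

            For example, syscall_result(1234) passes a child pid through unchanged, while syscall_result(-2) returns -1 and sets errno to ENOENT.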
            

            But if you want to support XNU, FreeBSD, OpenBSD, and NetBSD too, it gets a little trickier:

                int fork(void) {
                  char cf;
                  int ax, dx;
                  ax = IsLinux() ? 57 : 2;
                  if (IsXnu()) ax |= 0x2000000;
                  asm volatile("clc\n\t"
                               "syscall"
                               : "+a"(ax), "=d"(dx), "=@ccc"(cf)
                               : "1"(0)
                               : "rcx", "r11", "memory", "cc");
                  if (cf) ax = -ax;
                  if ((unsigned)ax > -4096u) errno = -ax, ax = -1;
                  if (ax != -1) ax &= dx - 1;
                  return ax;
                }
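
            As I read it, the two extra steps in the portable version can be modeled in plain C. This is a sketch, not Cosmopolitan's code: normalize_fork is a hypothetical name, and the idea that the non-Linux kernels report errors as a positive code with the carry flag set, and mark the child by leaving 1 in rdx, is inferred from the snippet above rather than stated in it:

```c
/* Hypothetical model: takes the values left in rax and rdx plus the
   carry flag, and returns a Linux-style raw result. */
static int normalize_fork(int ax, int dx, int carry)
{
    if (carry)
        return -ax;          /* positive error code with carry set becomes negative */
    return ax & (dx - 1);    /* dx == 0 keeps ax, dx == 1 clears it to 0 */
}
```

            On Linux the inputs are always dx == 0 and carry == 0, so the value passes through untouched; the mask only matters where the child is signaled through the second register.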
            

            Cosmopolitan abstracts stuff like that for you, but right now that's only if you're the kind of person who doesn't need threads. I imagine you are, since folks who do smart things with multiprocessing models like Go and Chromium usually don't want C libraries potentially stepping on their toes. Oh gosh threads. The day I figure out how to do those will be the day the whole world will want to use this thing. But I want people who use Cosmopolitan Libc to know what value it's providing them. I think the best way to do that is by raising awareness of the systems engineering fundamentals like this. Because that's something you're right to point out that the Linux community leadership has room for improvement on.

            • matheusmoreira 5 years ago

              I remember simple use cases for clone() such as spawning child processes with just enough shared resources to execve(). I remember reading a lot of old emails from Torvalds about it, can't find them anymore.

              I used to value portability but now I believe in using Linux everywhere and for everything. I like OpenBSD too but Linux is the stable one you can build anything on. What I wanted to eventually accomplish is a 100% freestanding Linux user space with no libraries at all. Maybe boot straight into the program I want to use, just like we can pass init=/usr/bin/bash in the kernel command line. How far could this go? Using nothing but system calls it's actually possible to get a framebuffer and use software rendering to draw some graphics. I'm guessing pretty far.

              By starting from scratch like this it's possible to fix all the historical problems with our systems. For example, I think it's unacceptable when libraries keep global state. This can't be fixed without getting rid of libc and its buffers and caches and errno. Removing this cruft would actually simplify a threads implementation. And then there's completely insane stuff like .init and .fini sections:

              https://blogs.oracle.com/solaris/init-and-fini-processing-wh...

              A similar statically-linked user space project I found years ago:

              https://github.com/arsv/minibase

              • jart 5 years ago

                That seems kind of contradictory. Your biases are tuned towards what works for big codebases but you're taking a first principles approach. A simplified threads implementation is called Java. Plus it gives you cool classes like Phaser. Threads will always be a horror show with C/C++. There's a reason why Linux is the only kernel that implements clone(). It's controversial.

  • soheil 5 years ago

    Aren't the Greek symbols used in math void of implicit meaning? You're taking a meaningful English sentence and replacing its letters with Greek letters, while making it extremely difficult for people with disabilities who rely on screen readers. Those two things are not the same.

  • stavros 5 years ago

    Ah, I don't want to make a fuss about it (my comment was tongue-in-cheek), it's really not a big deal, but it is annoying to spend 2-3 seconds trying to figure out if you're having a stroke, and then some more trying to suss out what the sentence is actually trying to say.

    If you want to honor Greece, use the letters as they're meant to be used! "Acτuaλλy πoρτabλe εxecuταbλe" would be much better (though I've intentionally tried to give English readers a stroke with this one :)!

    • user-the-name 5 years ago

      The entire project is built on not using things the way they are meant to be used, though. The name is kind of doing the exact same thing the code is.

      • stavros 5 years ago

        Though, oddly enough, the English letters are used exactly how they're meant to be used :P

      • anoncake 5 years ago

        So the code is "valid" on all platforms, but actually crashes on the "Greek" one?

  • zem 5 years ago

    I have to admit, every previous time I saw this linked I didn't bother clicking through, because from the title I thought it was a post mocking the concept of portable executables.

    • TeMPOraL 5 years ago

      I'm guessing this is the unfortunate consequence of the pattern "actually, " becoming a pejorative meme in the past year or so.

      • zem 5 years ago

        no, it's that the greek letters reminded me of the twitter "Im MoCkInG SoMeThInG sTuPid" format

  • defgeneric 5 years ago

    It's actually not smart at all. Replacing the letters in the Roman alphabet with Greek letters based on superficial resemblance is not any different from replacing the "R" with "Я" when writing about anything Russian-related (you see it stupidly used in book covers, t-shirts, etc).

    How does this do anything to honor the cultural legacy of Greece? Perhaps we could honor the legacy of 19th century mathematics by using Fraktur characters when they resemble Latin ones?

    When people who can read Greek are telling you it's bad taste maybe take their word for it! Not dang.

    What you're really saying is that the Greek alphabet (and by extension its language community) is so insignificant compared to Latin that the cost of potential misrecognition is so low that it can be disregarded. This is chauvinism, not "honoring Greek mathematics"!

    • ulzeraj 5 years ago

      Word. I'm still trying to find out who Doidld Tyatsmr is and why is he so hated in the US.

  • simonebrunozzi 5 years ago

    > The quality of this post is so high that it doesn't feel right to override any aspect of what the author created, including quirks like the title.

    I agree with dang's feelings/thoughts about the issue.

    Perhaps a solution would be to add the "normal" spelling in parentheses, after the one in the Greek alphabet?

  • simonebrunozzi 5 years ago

    By the way, Justine: great work. Besides the obvious HN recognition, I wanted to tell you explicitly as well.

    What are you going to work on in the near future? Curious to hear about it. If you don't want to post in public, $my_hn_username at gmail

  • climech 5 years ago

    I appreciate the good intentions, but confusing Greek readers doesn't seem to me like a good way to honor the cultural impact of Greece.

yosefk 5 years ago

As a Russian speaker, I LOVE these things. Both ways (Russian letters abused to spell English words and vice versa.) In fact I miss old phones with just English keyboards where abuse to spell Russian words (eg CCCP) was an art form, for a brief period.

jcburnham 5 years ago

Especially since if you want to port the title to Greek lettering, you have upsilon and omicron for u and o:

αcτυαlly pοrταblε εxεcυταblε

  • stavros 5 years ago

    Omicron looks exactly the same as the English o (it's not visually distinguishable in most typefaces) so it doesn't matter much, but upsilon is an "ee" sound usually, not an "oo" like in "actually" and "executable", so it wouldn't work exactly. It would read "actially execeetable".

    EDIT: For completeness, the full transliteration (or as close to it) would be "άξουαλι πόρταμπολ εξεκιούταμπολ". The extra "o" in "portabol" and "execiutabol" is actually a schwa, I think, so it can be omitted.

    • jcburnham 5 years ago

      Upsilon is admittedly an "i" sound in Modern Greek, but in Attic Greek (which is what I studied, sorry) it did have the "oo" sound.

      Edited: I missed the pi and rho completely though, my bad

      • StavrosK 5 years ago

        Ah yes, you are correct!

Tenal 5 years ago

Deep breaths. Deep, slow breaths. You're going to be just fine.

soheil 5 years ago

Because it makes you look like you know what you're doing, not too different from obfuscating JavaScript for the sake of security, which does kind of work, at least against the lowest-common-denominator type of attacks. This kind of works too, by having people think you're more of a genius than previously thought because you can turn boring English letters into something exotic, which appeals to the ignorance of the masses [0].

[0] https://en.wikipedia.org/wiki/Argument_from_ignorance

  • blueline 5 years ago

    you might be overthinking this