Hacker News | killercup's comments

Because this code block looks quite complex, I want to add that it can also be just

    smol::block_on(async {
        println!("I'm async!");
    });
(I thought tokio had a helper like this too but could only find `tokio::runtime::Runtime::new().unwrap().block_on(async { println!("I'm async!"); });`.)


Needing a helper library for something as simple as async so you don't go mental is really not good enough. I see the same thing with error handling - every Rust project I see imports a helper because it's too clunky otherwise.


If you don't want to pull in a helper library to run async code in a sync context, then why pull in an async library at all?

Rust is not a batteries-included language like Python. There are lots of libraries that are very commonly used in most projects (serde, thiserror, and itertools are in almost all of mine), but this is a conscious choice. They say in Python that the stdlib is where projects go to die. I'd rather have the flexibility of choosing my dependencies, even for stuff I have to use in every project.


The problem is that a large number of popular libraries have converted to async, 95+% of them to Tokio.

So you are stuck with smaller, less battle-tested options if you'd rather not pull in 100+ crates of dependencies that do nothing but inflate build times and file sizes (for your particular use case).

Example: reqwest vs ureq


OK, but like, can we just be honest then that the problem here is that your build times go up? People act like it's an insurmountable problem rather than just a trivial trade-off where, yes, your build times will go up because of some extra dependencies on an async runtime.

Increased build times are not great but holy shit the way people talk you'd never know that that's the actual trade-off here, an extra 3 seconds on a clean build.


Adding dependencies is not easy in some organizations. You have to trust them, in addition to waiting 3 seconds to compile them.


OK, in some niche scenarios I can see the cost being larger, but I think this is totally overstated.


Usually you're reaching for block_on because a library you want to use is async. Almost certainly that library will already be depending on an async runtime, so by pulling it in yourself you're not adding additional dependencies.


Then use the code I posted if you don't want a helper library. Or just wrap it in your own function if it's too complex for your tastes.


Ripgrep includes one of the most prominent algorithms from Hyperscan internally for some expressions.

Longer story: Ripgrep uses Rust's regex library, which uses the Aho-Corasick library. That does not just provide the algorithm it is named after, but also "packed" ones using SIMD, including a Rust rewrite of the [Teddy algorithm][1] from the Hyperscan project.

[1]: https://github.com/BurntSushi/aho-corasick/tree/4499d7fdb41c...


This post by the author is a great introduction to the techniques used in ripgrep: https://blog.burntsushi.net/ripgrep/


That's awesome!


> If the work is done I expect Rust to be faster than C and C++ for the same reason that C++ can sometimes be faster than C: a more advanced type system can allow better optimization in many cases.

Every now and then I check in on whether LLVM can deal with rustc spamming "noalias" on all references. You can find the latest change in [1]. While in theory this unlocks _a ton_ of optimizations, noalias is used very rarely in C/C++ code so these compiler passes are not exercised a lot by existing LLVM tests and/or not realized in full.

[1] https://github.com/rust-lang/rust/pull/82834


A more advanced type system may also drive you into a corner where you make bad decisions wrt performance.


Only so long as it be mandatory to obey.

Rust was specifically designed so that one can ignore all the restrictions if one be confined and tell the compiler “I know better, and I know it is safe.”.

Of course, if one be wrong in such confidence, u.b. lurks around the corner.

I suppose one big problem with Rust is that it's less specified than in C what is and isn't safe, so it's harder to be so confident.


Can you give an example?


Not amelius, but one case that happened to me is that Rust requires wrapping a `T` in a `RefCell` if two closures use it as `&mut T`. This happens even if you the caller know that the closures are invoked from a single thread and do not invoke each other, and thus only one `&mut T` will be in effect at any time. This is because closures are effectively structs with the captures as fields, so both struct values (and thus both `&mut` borrows) exist at the same time even though their respective fields are not used at the same time.

Not only do you have to use `RefCell`, but you now also have panicking code when the `RefCell` borrow "fails", even though you know it can't. rustc is also not smart enough to notice the exclusivity at compile time and elide the RefCell borrow-flag test and the panic branch.

    fn foo(mut cb1: impl FnMut(), mut cb2: impl FnMut()) {
        for _ in 0..10 {
            cb1();
            cb2();
        }
    }

    let mut x = String::new();
    foo(|| x.push_str("cb1,"), || x.push_str("cb2,"));
https://play.rust-lang.org/?version=stable&mode=debug&editio...

Fixed using `RefCell`: https://play.rust-lang.org/?version=stable&mode=release&edit... . Inspect the ASM and trace the use of the string constant "already borrowed"; you'll see it being used for the borrow flag test because it wasn't elided.

The equivalent non-Rust program could use String pointers for the two closures. I'm not sure whether they could be noalias or not, but at the very least they wouldn't need to generate any panicking code.


FWIW, another option is to use `std::cell::Cell`. That only allows replacing the value rather than borrowing it in place, so you would have to take the value out and put it back after you're done with it, which also results in unnecessary code generation. But there'd be no branch and no panic, so the impact should be less than RefCell. There's also no borrow flag to take up space (not that it really matters when this is a single value on the stack).


This is a good point. I always forget Cell can be used with non-Copy types too.


The "used from a single thread" aspect is a red herring: RefCell can only be used from a single thread anyway, and the compiler enforces this statically.

The "state" value in a RefCell is overhead, although it's fairly minor given that it doesn't need any synchronization to access. The extra panic branches are probably the largest overhead.

That said, these overheads stem from Rust's safety guarantees rather than its strong type system: you can have a language with a strong type system that does not do these checks.

Furthermore, there are of course ways to avoid this overhead within safe Rust: if you can use the type system to prove that the cell cannot be borrowed at the same time, then you don't need to do the checks, and in that sense a strong type system can actually help avoid overheads that were introduced by being a safe language.


>That said, these overheads stem from Rust's safety guarantees rather than its strong type system: you can have a language with a strong type system that does not do these checks.

The difference in semantics between a `&mut T` and a `*mut T` is a type system one. `&mut T` requires that two do not exist at the same time, regardless of whether they are used at the same time or not; this is the contract of the type.

>Furthermore, there are of course ways to avoid this overhead within safe Rust: if you can use the type system to prove that the cell cannot be borrowed at the same time, then you don't need to do the checks, and in that sense a strong type system can actually help avoid overheads that were introduced by being a safe language.

Correct, which is why I made the effort of pointing out that rustc is not smart enough to do it, not that it's impossible to do it.


This may be splitting hairs a bit, because we all agree that this is a good example where using Rust in this straightforward manner leads to suboptimal performance. But I agree with the grand parent that this is mainly an issue with safety, not with the type system itself.

To show why, consider two alternative languages.

“Weak Rust”: an equally safe Rust with a weaker type system. It might not distinguish & and &mut, but it would still need those checks, because you might use those shared references to break a data structure invariant. It would have to detect such unsafe usage at runtime and raise the equivalent of Java’s ConcurrentModificationException.

“Unsafe Rust”: a less safe Rust with an equally strong type system. It wouldn’t need to do those checks. In fact, that’s basically C++.


Just use `UnsafeCell` instead of `RefCell` [1]: It has zero overhead, but you have to be sure that there's really no simultaneous write/write or read/write access – just like using raw pointers in C or C++.

[1] https://doc.rust-lang.org/beta/std/cell/struct.UnsafeCell.ht...


Yes, I'm not averse to using `unsafe`, but one has to justify it on a case-by-case basis. Eg if you're doing this in a library, then keep in mind that some users are very adamant about using unsafe-free crates, so you may prefer to take the hit.


> then keep in mind that some users are very adamant about using unsafe-free crates

Couldn’t you just put the use of unsafe as a default and add a feature flag to force the safe (but slower) behavior. Then you get the best of both worlds: those who don’t care get performance for “free”, while those who care can force it when they want.


If anything you'd have to go the opposite way: use safe by default and add the option to turn off runtime checks like bounds checks on slice access. Because when you write safe code, you tell the compiler about the invariants of your code, while with unsafe code, you keep them in your mind yourself. They might not even translate to any safe Rust constructs at all. E.g. if you pass a pointer in C, what is the recipient of the pointer supposed to do with it? Is the memory content initialized? Who is responsible for deallocation? On the other hand, if the compiler is told invariants in terms of safe code, it's easy to avoid any runtime checks for them.


The users I was thinking of were more along the lines of people that run cargo-geiger etc, which just looks for "unsafe" in the source rather than anything dynamic based on selected features.


Apart from Rust not being smart enough to see what you're doing in this sort of situation, the access pattern you're trying to use would actually result in undefined behavior with &mut pointers (or, I assume, C restrict pointers) because of the aliasing guarantees. For example, one optimization you could imagine the compiler actually doing would result in the following:

    let mut x = String::new();

    // Store x in a local pointer; no one else is touching it,
    // because we have aliasing guarantees.
    let str1 = x;
    let str2 = x;

    for _ in 0..10 {
        str1.push_str("cb1,");
        str2.push_str("cb2,");
    }

    // Restore x to the original variable before our aliasing
    // guarantee goes away.
    x = str1;
    x = str2;
And you just:

- Leaked str1

- Created an x that just says cb2 repeatedly instead of alternating between cb1 and cb2.

Obviously it's possible to fix this problem by having different guarantees on pointers (C's pointers, rust raw pointers), but it's not clear that the occasional overhead of some metadata tracking (refcell) isn't actually going to be more performant than the constant overhead of not having aliasing guarantees everywhere else. The most performant would be obviously having both, but as we've seen with C asking programmers to go around marking pointers as restrict is too much work for too little benefit.


>the access pattern you're trying to use would actually result in undefined behavior with &mut pointers (or I assume C restrict pointers) [...] Obviously it's possible to fix this problem by having different guarantees on pointers (C's pointers, rust raw pointers)

Yes, I said as much in the last paragraph.

>but it's not clear that the occasional overhead of some metadata tracking (refcell) isn't actually going to be more performant than the constant overhead of not having aliasing guarantees everywhere else.

Generating panicking code where it's not needed is bad in general. It adds unwinding (unless disabled), collects backtraces (unavoidable), and often pulls in the std::fmt machinery.

Yes, in general it's almost certainly true that non-aliasing pointers produce more benefits regardless. My comment was in the context of the very specific example it gave.


You could rewrite this as an iterator, which would avoid this problem and make the code much nicer.


The hypothetical `foo` is a third-party library function.

(The real code which I reduced to this example is https://github.com/Arnavion/k8s-openapi/blob/1fcfe4b34a1f4f1... , and the callee does happen to be another crate in my control. While this can't be reduced to something like an Iterator, it can be resolved by making the callee take a trait with two `&mut self` methods instead of taking two closures. That still requires changing the callee, of course.)


In C/C++: marking every argument as const throughout a deep call chain, only to later find some edge case where you need to mutate one member far down the call stack, where doing so would not have broken the top-level contract of the function. That forces you to do expensive copies instead.


This is why I think Rust could eventually be faster than C and C++ for a lot of things. The work has to be done though. You're right that noalias enabled optimizations are neglected because you can rarely use them in C code.

On the Rust side I think the language needs some way to annotate if's as likely/unlikely. This doesn't matter in most cases but can occasionally matter a lot in tight high performance code. It can allow the compiler to emit code that is structured so as to cause the branch predictor to usually be right, which can have a large impact.



Why not just use PGO (Profile Guided Optimizations)?

Sadly, PGO does not work with cross-language LTO because of a conflict in the LLVM module name ('Profile').


Technically, he only got tested the day after Christmas, so I'm assuming they picked the more dramatic date.


True. Which leads us back to that it is a "Misleading sensationalist headline." as somebody else mentioned here.


This is as surprising as the number of comments assuming the nurse is female.

Edit, for those who comment after reading only the headline: the article clearly states the nurse is Matthew W. and uses the pronoun "he" in the following sentences.


In Germany, I'd guess at least three quarters of nurses are women. So if you actually want to take a bet, betting on the nurse being a woman makes you at least three times as likely to be correct as the opposite bet.

Edit: first source I could find says that in 2007, 86% of nurses were women in Germany:

https://de.wikipedia.org/wiki/Krankenpfleger#Besch%C3%A4ftig...


There's no need to take that bet if you can instead just read the article, however. No idea why you bring up a German statistic.


It’s to point out that people are basing their bias in reality.

Articles are mostly a jumping off point for conversation, some people just go by the headline or the comments on the page.


No more surprising than people assuming that a "plumber" is usually male. What are you trying to imply here?


I'm implying they did not read the article.


The HN FAQ writes about better and more respectful ways to behave when you think that somebody hasn't read the article. This is not Reddit, and many of us are here _because_ this is not Reddit.


More importantly, you’re implying that it’s not surprising that people don’t read articles whose titles they’re commenting on. (I don’t disagree.)


At the time of writing, you mean all of two comments got the gender wrong. Not _that_ surprising. Of course, people not reading the articles before making a comment is common anywhere.

As other comments have suggested, in most places you have ~90% female nurses, so it would be a reasonable assumption for somebody that didn't read or skimmed.


It's a reasonable assumption, considering that 90% of the nurses in the US are female.


Nurses make really good money in the United States. And it is always in demand.

With the union rules and all the mandatory overtime, they can earn well over six figures.

The salaries are easily comparable to some senior software engineering salaries. And they can reach that level a lot faster than some engineers can. They can also work at multiple hospitals, to further boost their income.

Given all that, it’s reasonable to assume that some men will enter the profession also.


But why make the assumption at all? 90% or not, in this case it was wrong.


That's true -- maybe they merely skimmed over the article. That's the kind of info I'd probably miss when just skimming over something, too.


If you condition on having read the article, you would know it’s a male nurse:

“ Matthew W., a nurse at two different local hospitals, said in a Facebook post on December 18 that he had received the Pfizer vaccine”


Not sure what you're referring to. The amount of comments using a personal pronoun to refer to the nurse, at the time of your posting, is one.



> I don’t understand what a ‘packer’ is

You've found the right series of articles then!


I read it - I didn't find an explanation of what it was that he was building, just how he was building it.


Think of it as decompression software bundled along with a compressed version of the program you actually want to run. So execution starts in the decompression part, which unpacks all the code into memory and then starts running the program you actually cared about.


See https://en.wikipedia.org/wiki/Executable_compression . They have been around since forever.


Packers were common in the 80's and 90's to save disk space. I mostly remember them from Amiga warez and demos.


Start at part 1.

No, he doesn't give you an answer, but you might just find it interesting enough to no longer have the question.


Author here. Please note that the "cheap" in the title refers to the effort needed as well as how sophisticated the tricks are; benchmarking and optimizing your algorithms is still super important! This was discussed quite a bit on the Rust subreddit [1] when it was published.

[1]: https://www.reddit.com/r/rust/comments/fdbszu/cheap_tricks_f...


Perhaps "low cost" in the title would have made it clearer since "cheap" as a word has a lot of connotations.

Language can be very hard to get right and I have struggled with this a lot personally.


I prefer low effort.


Tar was released in '79 iirc, so it's about the same age.


I imagine most modern users of tar are using GNU Tar or libarchive bsdtar. Are there any current tar implementations that can be directly traced to the original?


The BSD's tar is derived from the old 4BSD tar which was derived from the original.

Pretty sure SCO Unix (yes, you can still buy it) is System V tar. Probably Solaris, too.

I recall AIX tar is...something else, none of the above. I don't recall the details.


According to the man page for bsdtar that ships with Ubuntu:

A tar command appeared in Seventh Edition Unix, which was released in January, 1979. There have been numerous other implementations, many of which extended the file format. John Gilmore's pdtar public-domain implementation (circa November, 1987) was quite influential, and formed the basis of GNU tar. GNU tar was included as the standard system tar in FreeBSD beginning with FreeBSD 1.0. This is a complete re-implementation based on the libarchive(3) library. It was first released with FreeBSD 5.4 in May, 2005.


Ah...my bad. OpenBSD tar has Berkeley copyrights. Looks like Free/NetBSD don't. I made a bad assumption there. Thanks.

