This is an assortment of unfortunate, regrettable decisions made by the Rust standard library. They're all very minor, inconsequential things - gotchas that you notice the first time you hit them and then learn to live with them. So this article isn't meant to be a rant or anything of that sort. It's just a list that's been mulling in my mind for a long time that I decided to put to paper. I've also needed to reference these points when talking on IRC, so it'll be easier to just provide URLs.
I link to the libstd docs when relevant, but I assume basic Rust knowledge from the reader.
This list is neither objective nor complete. It's built from my own experience using Rust since 2016, as well as the discussions I've seen in its IRC channels involving experienced users and new users alike.
All of these things could be resolved in a Rust "2.0", ie a release that is allowed to make backward-incompatible changes from the current "1.x". I personally hope that such a release never happens, despite being the author of this list, because I don't know any backward-incompatible forks of languages that have gone well.
Alternatively, Rust's editions could be used to fix some of these.
Editions currently cannot add or remove trait impls for libstd types, because trait impls are generally program-global, not crate-scoped. However, it is planned to add an IntoIterator impl for arrays but syntactically enable it only when the crate is compiled with edition 2021, so that existing edition 2015 and 2018 code that tries to use arrays as an IntoIterator continues to fall back to the slice IntoIterator impl via unsize coercion. It remains to be seen how much havoc this might cause with macros like the 2015 -> 2018 edition transition did. But if successful, this creates the precedent for a limited form of "backward-incompatible libstd"s available to crates to opt in to based on syntax.
Changing these would be backward-incompatible
#iteratorext
The Iterator trait is the largest trait in the standard library. It's so large because Rust has a lot of combinators defined for iterators, and they're all methods of this trait. At one point, the docs page of this trait would kill browsers because the page would attempt to expand all the impls of Iterator for the ~200 types in libstd that implement it, leading to an extremely long web page.
In general, when a trait contains default methods, it's because it wants to give you the ability to override them. For example, Iterator::try_fold and Iterator::nth have default impls in terms of Iterator::next, but may be overridden if the type can impl them more efficiently.
However, the methods that return other iterators have return types that can only be instantiated by libstd, so it is not possible for a user impl to override them. For example, Iterator::map returns std::iter::Map<Self, F>, and this type is opaque outside libstd. Since it also references both Self and F, it is not even possible to return the result of invoking Iterator::map on any other iterator instead of Self, say if you wanted to delegate to an inner iterator's impl. The only possible way to implement this method is the implementation that is already in libstd.
Outside of libstd, a common convention is to have two separate traits. There is one trait Foo, which contains methods that either must be implemented or could be useful to override. The other is trait FooExt: Foo with a blanket impl for all T: Foo, which contains extra methods that need not / should not / can not be overridden. For example, see futures::stream::Stream and futures::stream::StreamExt (a direct analogue to Iterator), or tokio::io::AsyncRead and tokio::io::AsyncReadExt.
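To make the Foo / FooExt convention concrete, here is a minimal sketch. The names Counter, CounterExt, UpTo and sum_all are made up for illustration and do not come from libstd or any real crate.

```rust
/// Core trait (hypothetical): the one method every implementor must provide.
trait Counter {
    fn next_count(&mut self) -> Option<u32>;
}

/// Extension trait (hypothetical): combinators built on top of `Counter`.
trait CounterExt: Counter {
    /// Sums every remaining value produced by `next_count`.
    fn sum_all(&mut self) -> u32 {
        let mut total = 0;
        while let Some(n) = self.next_count() {
            total += n;
        }
        total
    }
}

// Blanket impl: every `Counter` gets `CounterExt` for free. Because it covers
// all `T: Counter`, an implementor cannot write its own `CounterExt` impl.
impl<T: Counter> CounterExt for T {}

/// Example implementor that yields 1, 2, ..., limit.
struct UpTo {
    current: u32,
    limit: u32,
}

impl Counter for UpTo {
    fn next_count(&mut self) -> Option<u32> {
        if self.current < self.limit {
            self.current += 1;
            Some(self.current)
        } else {
            None
        }
    }
}
```

With this split, UpTo only implements next_count. Calling sum_all on an UpTo with limit 4 goes through the blanket impl and returns 10.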
Unfortunately, splitting the combinators out of Iterator into an IteratorExt would be backward-incompatible, even if IteratorExt was also added to the prelude so that iter.map(...) continues to compile, since it would still break any code using UFCS like Iterator::map(iter, ...).
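For reference, this is the kind of code that would break. It is a small sketch (the function name doubled is made up) that calls the combinators through the trait path instead of with method syntax.

```rust
/// Doubles every element, naming the `Iterator` trait explicitly in each call.
fn doubled(values: Vec<u32>) -> Vec<u32> {
    // This path resolves today because `map` is a method of `Iterator` itself.
    // It would stop resolving if `map` moved to a hypothetical `IteratorExt`.
    let mapped = Iterator::map(values.into_iter(), |x| x * 2);
    Iterator::collect(mapped)
}
```

The method-syntax equivalent, values.into_iter().map(|x| x * 2).collect(), would keep compiling after such a split as long as the new trait was in the prelude.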
#cow
The Cow type, as its documentation says, is a "clone-on-write" smart pointer. This type is an enum of Borrowed(&B) and Owned(T) variants, eg a Cow<str> can be either a Borrowed(&str) or an Owned(String). I believe, based on the code I've written personally as well as my day job's codebase, that most of the uses of Cow are for the ability to hold either a borrow or an owned value. For example, consider code like this:

    fn execute(query: &str) { ... }

    fn get_by_id(id: Option<&str>) {
        let query = match id {
            Some(id) => format!("#{}", id),
            None => "*",
        };
        execute(&query);
    }

This won't compile because one of the match arms returns a String and the other a &'static str. One way to solve this would be to use .to_owned() on the &'static str to make it a String too, but this is a wasteful allocation since execute only needs a &str anyway. Cow is a better approach:

    let query: Cow<'static, str> = match id {
        Some(id) => format!("#{}", id).into(), // Creates a Cow::Owned(String)
        None => "*".into(), // Creates a Cow::Borrowed(&'static str)
    };
    execute(&query); // &Cow<str> implicitly derefs to &str

But what exactly does "clone-on-write" mean anyway, given it was important enough to name the type after? The answer lies in one of the two methods that Cow impls:

    fn to_mut(&mut self) -> &mut B::Owned

For example, if used on a Cow::Borrowed(&str), this method will clone the &str into a String, change self to be a Cow::Owned(String) instead, and then return a &mut String. If it was already a Cow::Owned(String), it just returns a &mut String from the same string. So it is indeed a "clone-on-write" operation.
However, of all the times I've used Cow, I've used this method very rarely. Most of my uses have been to just store either a borrow or an owned value, as mentioned above. Occasionally I've used the other method that Cow impls, fn into_owned(self) -> B::Owned, but this is just "convert", not "clone-on-write", since it consumes the Cow.
In fact, Cow does impl the standard Clone and ToOwned traits (the latter via its blanket impl for all T: Clone). But calling clone or to_owned on a &Cow::<'a, B>::Borrowed(B) gives another Cow::<'a, B>::Borrowed(B), not a Cow::<'static, B>::Owned(B::Owned). (It couldn't do that anyway, because Clone::clone must return Self, so the lifetimes need to match.) So Cow has two methods of cloning itself that are unlike the other two methods of cloning it has, and specifically the method named to_owned doesn't necessarily produce an Owned value.
The end result is that new users trying to figure out how to store either a &str or a String don't realize that the type they're looking for is named Cow. And when they ask why it's named that, they learn that it's because, out of the many other ways it can be used, one specific one that they're unlikely to use is "clone-on-write".
It may have been a better state of affairs if it was called something else, like MaybeOwned.
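The difference between to_mut and clone can be seen in a short sketch. The helper names append_suffix, is_owned and clone_keeps_variant are made up for illustration; only Cow itself and its to_mut and clone methods come from libstd.

```rust
use std::borrow::Cow;

/// Appends `suffix` in place. `to_mut` clones the data into a `String` only
/// if the `Cow` is currently `Borrowed`; an `Owned` one is mutated directly.
fn append_suffix(value: &mut Cow<'_, str>, suffix: &str) {
    value.to_mut().push_str(suffix);
}

/// Reports whether the `Cow` currently holds the `Owned` variant.
fn is_owned(value: &Cow<'_, str>) -> bool {
    matches!(value, Cow::Owned(_))
}

/// Clones through the standard `Clone` impl and checks that the variant did
/// not change: a `Borrowed` stays `Borrowed`, an `Owned` stays `Owned`.
fn clone_keeps_variant(value: &Cow<'_, str>) -> bool {
    is_owned(&value.clone()) == is_owned(value)
}
```

Starting from Cow::Borrowed("abc"), a call to append_suffix flips the value to the Owned variant, while clone_keeps_variant returns true both before and after that call.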
#tryfrom-fromstr
The TryFrom and TryInto traits represent fallible conversions from one type to another. These traits were only added in 1.34.0, though; before that fallible conversions were performed using ad-hoc fn from_foo(foo: Foo) -> Result<Self> methods. However, one special kind of fallible conversion was there since 1.0, represented by the FromStr trait and str::parse method - that of fallible conversion of a &str into a type.
Unfortunately, when the TryFrom trait was stabilized, a blanket impl for T: FromStr was not also added - it would've conflicted with the other blanket impl of TryFrom for all T: From. Therefore FromStr and TryFrom exist independently, and as a result libstd has two kinds of fallible conversions when the source is a str. Furthermore, none of the libstd types that impl FromStr also impl TryFrom<&str>, and in my experience third-party crates also tend to only implement FromStr.
As a result, one cannot write code that is generic on T: TryFrom<&str> and expect it to work automatically with Ts that only impl FromStr. It is also not possible to write a single function that wants to support both T: TryFrom<&str> and T: FromStr due to the orphan rules; specialization may or may not allow this when it's stabilized.
#err-error
Speaking of TryFrom and FromStr, the former's assoc type was named Error even though the latter's was named Err. The initial implementation of TryFrom did use Err to be consistent with FromStr, but this was changed to Error before stabilization so as to not perpetuate the Err name into new code. Nevertheless, it remains an unfortunate inconsistency.
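Both the FromStr / TryFrom split and the Err / Error naming show up as soon as you write generic code over each trait. The sketch below uses made-up names (parse_via_from_str, parse_via_try_from and the Port type); Port implements both traits by hand, which is what a type has to do to be usable with both functions.

```rust
use std::convert::TryFrom;
use std::num::ParseIntError;
use std::str::FromStr;

/// Generic over `FromStr`: the associated error type is named `Err`.
fn parse_via_from_str<T: FromStr>(s: &str) -> Result<T, T::Err> {
    s.parse()
}

/// Generic over `TryFrom<&str>`: the associated error type is named `Error`.
/// A type like `u16`, which only impls `FromStr`, is rejected by this bound.
fn parse_via_try_from<'a, T: TryFrom<&'a str>>(s: &'a str) -> Result<T, T::Error> {
    T::try_from(s)
}

/// Hypothetical user type that opts in to both traits.
#[derive(Debug, PartialEq)]
struct Port(u16);

impl FromStr for Port {
    type Err = ParseIntError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        s.parse::<u16>().map(Port)
    }
}

impl<'a> TryFrom<&'a str> for Port {
    type Error = ParseIntError;

    // Delegates to the `FromStr` impl, since no blanket impl does it for us.
    fn try_from(s: &'a str) -> Result<Self, Self::Error> {
        Port::from_str(s)
    }
}
```

Here parse_via_from_str accepts both u16 and Port, while parse_via_try_from accepts only Port.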
These can't be changed, but they can be deprecated in favor of new alternatives
#result-option-intoiterator
The Result type impls IntoIterator, ie it's convertible to an Iterator that yields zero or one elements (if it was Err or Ok respectively). Functional language users will find this familiar, since Either being convertible to a sequence of zero (Left) or one (Right) elements is common in those languages. The problem with Rust's approach is that the IntoIterator trait is implicitly used by for-loops.
Let's say you want to enumerate the entries of the / directory. You might start with this:

    for entry in std::fs::read_dir("/") {
        println!("found {:?}", entry);
    }

This will compile, but rather than printing the contents of /, it will print just one line that reads found ReadDir("/"). ReadDir here refers to std::fs::ReadDir, which is the iterator of directory entries returned by std::fs::read_dir. But why is the loop variable entry receiving the whole iterator instead of the elements of the iterator? The reason is that read_dir actually returns a Result<std::fs::ReadDir, std::io::Error>, so the loop actually needs to be written like for entry in std::fs::read_dir("/")? { ; notice the ? at the end.
Of course, this only happens to compile because println!("{:?}") is an operation that can be done on both ReadDir (what you got) and Result<DirEntry> (what you expected to get). Other things that could "accidentally" compile are serialization, and converting to std::any::Any trait objects. Otherwise, if you actually tried to use entry like a Result<DirEntry>, you would likely get compiler errors, which would at least prevent bad programs though they might still be confusing.
The Option type also has the same problem since it also impls IntoIterator, ie it's convertible to an Iterator that yields zero or one elements (if it was None or Some respectively). Again, this mimics functional languages where Option / Maybe are convertible to a sequence of zero or one elements. But again, the implicit use of IntoIterator with for-loops in Rust leads to problems with code like this:

    let map: HashMap<Foo, Vec<Bar>> = ...;
    let values = map.get(&some_key);
    for value in values {
        println!("{:?}", value);
    }

The intent of this code is to print every Bar in the map corresponding to the key some_key. Unfortunately map.get returns not a &Vec<Bar> but an Option<&Vec<Bar>>, which means value inside the loop is actually a &Vec<Bar>. As a result, it prints all the Bars in a single line like a slice instead of one Bar per line.
These problems wouldn't have happened if Result and Option had a dedicated function to convert to an Iterator instead of implementing IntoIterator. They could be solved by adding this new function and deprecating the existing IntoIterator impl, though at this point the compiler does not support deprecating trait impls.
At least clippy has a lint for these situations.
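The zero-or-one behaviour is easy to observe by counting loop iterations. This is a small sketch with made-up function names. The third function shows one way to get at the elements with existing libstd methods, under the assumption that a missing key should be treated as an empty list.

```rust
use std::collections::HashMap;

/// Loops over the `Result` itself: the body runs once for `Ok` (receiving the
/// whole `Vec`) and never for `Err`, regardless of how many elements there are.
fn loop_over_result(r: Result<Vec<u32>, String>) -> usize {
    let mut iterations = 0;
    for _whole_vec in r {
        iterations += 1;
    }
    iterations
}

/// Same experiment with the `Option<&Vec<u32>>` returned by `HashMap::get`.
fn loop_over_lookup(map: &HashMap<&str, Vec<u32>>, key: &str) -> usize {
    let mut iterations = 0;
    for _whole_vec in map.get(key) {
        iterations += 1;
    }
    iterations
}

/// Iterates the elements instead, by flattening the `Option` explicitly.
fn loop_over_elements(map: &HashMap<&str, Vec<u32>>, key: &str) -> usize {
    let mut iterations = 0;
    for _element in map.get(key).into_iter().flatten() {
        iterations += 1;
    }
    iterations
}
```

For a map that holds a three-element Vec under some key, loop_over_lookup returns 1 for that key while loop_over_elements returns 3.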
#insert-unit
All the libstd collection methods that insert elements into the collection return () or some other value that is unrelated to the element that was just inserted. This means if you write code that inserts the value and then wants to do something with the inserted value, you have to do a separate lookup to get the value you just inserted. For example:

    // Inserts a new String into the given Vec, and returns a borrow of the newly-inserted value.
    fn foo(v: &mut Vec<String>) -> &str {
        let new_element = bar();
        v.push(new_element);

        // This is accessing the element that was just inserted, so there's no way this could fail.
        //
        // But still, to satisfy the typesystem, one must write .unwrap().
        // The compiler is also not smart enough to detect that `last()` can never return `None`,
        // so it will still emit the panic machinery for this unreachable case.
        let new_element = v.last().unwrap();
        &**new_element
    }

The same issue exists with BTreeMap::insert, BTreeSet::insert, HashMap::insert, HashSet::insert, VecDeque::push_back, and VecDeque::push_front. It's even worse for the maps and sets, since the lookup requires the key / value that was consumed by the insert, so you'd probably have to have cloned it before you inserted it.
There is a workaround for BTreeMap and HashMap, which is to use their entry() APIs which do have a way to get a &mut V of the value that was just inserted. Unfortunately this is much more verbose than a simple call to insert(). And even these APIs don't return a &K borrowed from the map that can be used after the entry has been inserted.
These functions can't be changed without being backward-incompatible. Even changing the functions that currently return () to return non-unit values would not be backward-compatible, since they may be used in contexts where the return type is used to drive further inference. But new functions could be added that do return borrows of the newly inserted values.
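As a rough sketch of both points: insert_and_get below spells out the entry() workaround for HashMap with the same overwrite behaviour as insert(), and PushAndGet shows what an added Vec function could look like. Both names are hypothetical; neither is a libstd API, and the Vec helper still has to do the lookup internally.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Inserts `value` under `key`, overwriting any previous value like
/// `HashMap::insert` does, and returns a borrow of the stored value.
fn insert_and_get(map: &mut HashMap<String, u32>, key: String, value: u32) -> &mut u32 {
    match map.entry(key) {
        Entry::Occupied(mut occupied) => {
            // `insert` on an occupied entry returns the old value; discard it.
            let _old = occupied.insert(value);
            occupied.into_mut()
        }
        Entry::Vacant(vacant) => vacant.insert(value),
    }
}

/// Hypothetical extension trait; `push_and_get` is not a libstd method.
trait PushAndGet<T> {
    fn push_and_get(&mut self, value: T) -> &mut T;
}

impl<T> PushAndGet<T> for Vec<T> {
    fn push_and_get(&mut self, value: T) -> &mut T {
        self.push(value);
        // Outside libstd this still needs the separate lookup; the helper
        // only hides it from the caller.
        self.last_mut().expect("the Vec cannot be empty right after a push")
    }
}
```

A caller can then write v.push_and_get(s).push('!') without a visible unwrap, at the cost of importing the extension trait.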