I have a proof-of-concept patch to MRI that caches #to_s values for immutable values. It is implemented using a few fixed-size hash tables. http://github.com/kstephens/ruby/commits/to_s_maybe_frozen/
It reduces the number of #to_s result objects by 1890 during the MRI test suite for
It requires a minor semantic change to Ruby core. This minor change could cascade into a huge performance improvement for all Ruby implementations, as will be illustrated later:
#to_s may return frozen
This does not appear to be a problem, since callers of #to_s are likely to anticipate that the receiver may already be a String and will not mutate the result; #to_s is a coercion. The current MRI test suite passes when some #to_s results are frozen.
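To illustrate why a frozen result is harmless for typical callers, here is a minimal sketch. CachedLabel is a hypothetical class written for this example, not part of the patch; its #to_s returns a single frozen String. Interpolation and concatenation build new Strings, so only in-place mutation of the result fails.

```ruby
# Hypothetical class (not from the patch) whose #to_s returns a frozen,
# cached String.
class CachedLabel
  TEXT = "label".freeze

  def to_s
    TEXT
  end
end

label = CachedLabel.new

# Typical coercion callers only read the result.
interpolated = "<#{label}>"      # interpolation builds a new String
concatenated = label.to_s + "!"  # String#+ returns a new String

# Only in-place mutation of the result is affected.
mutation_failed =
  begin
    label.to_s << "!"
    false
  rescue RuntimeError # FrozenError on Ruby 2.5+, which is a RuntimeError subclass
    true
  end
```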
For code that may expect #to_s to return a mutable String, an Object#dup_if_frozen method might be helpful. This method would return self.dup if the receiver is #frozen? and is not an immediate or an immutable. (Aside: a fast #dup_unless_frozen method might be helpful for general memoization of computations!)
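A minimal sketch of how such a method might look in plain Ruby. Only the name dup_if_frozen comes from the proposal; the body, and the assumption that the receiver is returned unchanged when no copy is needed, are illustrative guesses rather than the patch's implementation.

```ruby
# Hypothetical sketch of the proposed Object#dup_if_frozen.
class Object
  def dup_if_frozen
    case self
    when NilClass, TrueClass, FalseClass, Numeric, Symbol
      # Immediates and other immutables are always frozen and gain
      # nothing from a copy, so return them unchanged.
      self
    else
      # Copy only when needed; #dup returns an unfrozen copy.
      frozen? ? dup : self
    end
  end
end

# The caller gets a mutable String whether or not #to_s returned a frozen one.
buffer = :name.to_s.dup_if_frozen
buffer << "=value"
```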
This caching technique could be extended to other immutables (e.g., the Numerics), to objects whose #to_s representations never change (e.g., Module?), and to #inspect under similar constraints.
In the patch, Fixnum#to_s is not cached because Fixnums are often incremented during long loops, so any cache for them would churn quickly. However, caching could be enabled if it proves useful in practice.
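The churn concern can be sketched in plain Ruby. BoundedToSCache below is a hypothetical bounded table with oldest-first eviction, written only for illustration; the patch implements its tables in C, and their actual size, layout, and replacement policy are not reproduced here.

```ruby
# Hypothetical bounded cache of frozen #to_s results (illustration only).
class BoundedToSCache
  attr_reader :hits, :misses

  def initialize(max_entries = 8)
    @max_entries = max_entries
    @table = {} # Ruby Hashes keep insertion order
    @hits = 0
    @misses = 0
  end

  def to_s_for(obj)
    if @table.key?(obj)
      @hits += 1
    else
      @misses += 1
      # Table full: evict the oldest entry before storing the new one.
      @table.shift if @table.size >= @max_entries
      # Copy before freezing so a String owned by someone else is never frozen.
      @table[obj] = obj.to_s.dup.freeze
    end
    @table[obj]
  end
end

# A small, stable set of values is served from the cache after the first pass.
stable = BoundedToSCache.new
3.times { [nil, true, false].each { |value| stable.to_s_for(value) } }

# A long counting loop never repeats a value, so every lookup misses
# and, once the table is full, evicts an earlier entry.
counting = BoundedToSCache.new
1000.times { |i| counting.to_s_for(i) }
```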
If this new semantic for #to_s is reasonable, I recommend explicitly storing frozen strings for nil.to_s and storing Symbol#to_s with each Symbol, likewise for
In practice, most Ruby String literals become garbage immediately. If Symbol#to_s were guaranteed to always be cached, this would enable the use of:
puts :"some string"
puts "some string"
as an in-line memoized frozen String that creates no garbage when calling puts, which will call #to_s on its argument but never mutate the result. A parser or compiler could recognize Symbol#to_s as an operation with no side effects and elide it, providing a true String constant. This idiom would eliminate the pointless String garbage created by the evaluation of every
This is far more expressive and concise than:
SOME_STRING = "some string".freeze
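A small sketch of the contrast under current semantics: a Symbol literal always evaluates to the same object, just as a constant does, so neither allocates on each evaluation. Whether Symbol#to_s then returns a cached String is exactly what the proposal would need to guarantee, and the sketch does not assume it. The method names are hypothetical.

```ruby
SOME_STRING = "some string".freeze

# Constant form: needs a separate definition away from the point of use.
def message_from_constant
  SOME_STRING
end

# Symbol form: the text stays in-line at the point of use.
def message_from_symbol
  :"some string"
end

# Both forms hand back the same object on every call.
constant_reused = message_from_constant.equal?(message_from_constant)
symbol_reused = message_from_symbol.equal?(message_from_symbol)

# The Symbol coerces to the same text as the constant.
same_text = message_from_symbol.to_s == message_from_constant
```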
The alternative to :"some string" might be to memoize all String literals as frozen. This is a superior syntax and semantic; old code would need to change on a massive scale, but any issues would be easy to diagnose:
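# Current semantics: every literal evaluation allocates a new String.
str = ''      # Make a mutable empty string.
str << "foo"  # "foo" is garbage
str << "bar"  # "bar" is garbage

# With frozen, memoized literals: copy once to get a mutable String.
str = ''.dup  # Make a mutable empty string.
str << "foo"  # "foo" is not garbage
str << "bar"  # "bar" is not garbage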
The latter is backwards-compatible with the current String literal semantics.
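A short sketch of why the ''.dup form is safe under either semantics: #dup yields a mutable String whether or not the receiver is frozen. Here an explicit .freeze stands in for a literal that the runtime would have frozen and memoized; that stand-in is an assumption made for illustration.

```ruby
# Stand-in for a literal that the runtime has frozen and memoized.
frozen_literal = "".freeze

str = frozen_literal.dup # mutable copy; the frozen original is untouched
str << "foo"
str << "bar"

# The same call on an ordinary mutable String also yields a mutable copy,
# so code written as ''.dup behaves the same under current semantics.
mutable_literal = String.new
other = mutable_literal.dup
other << "baz"
```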