Most Rubyists know about monkeypatching: opening up someone else’s class (often, something like String or Object) to modify some of its methods after the fact. It’s both incredibly powerful when used judiciously, and incredibly dangerous the rest of the time. I’ve spent countless hours trying to debug conflicting definitions of #to_json, or trying to untangle ActiveRecord’s astonishing levels of dynamic method aliasing.
I’m here to introduce you to a far more exciting threat: set_trace_func. This invidious callback is invoked on every method call, return, and line the Ruby interpreter executes. Most people, if they’re aware of it at all, correctly assume it’s intended for profiling.
They couldn’t be more wrong.
# Fixnum was folded into Integer in Ruby 2.4; on older Rubies, reopen
# Fixnum instead.
class Integer
  def add(other)
    self + other
  end
end

set_trace_func proc { |event, file, line, id, binding, classname|
  if classname == Integer and id == :add and event == 'call'
    # We can, of course, find the receiver of the current method
    me = binding.eval("self")

    # And the binding gives us access to all variables declared
    # in that method's scope. At call time only the method arguments will be
    # defined.
    args = binding.eval("local_variables").inject({}) do |vars, name|
      # local_variables returns symbols on Ruby 1.9+; eval wants a string.
      value = binding.eval name.to_s
      vars[name] = value unless value.nil?
      vars
    end

    # We can also *change* those arguments.
    args.each do |name, value|
      if Numeric === value
        binding.eval "#{name} = #{value + 1}"
      end
    end
  end
}

puts 1.add 1 # => 3
Note that this allows you to interfere with methods you’ve never seen before, simply by relaxing the class or id restrictions. Spooky action at a distance!
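To make "relaxing the restrictions" concrete, here is a minimal sketch with both the class and id checks dropped. It only watches; it doesn't tamper. The greet and double methods are invented for illustration, and Binding#local_variables and Binding#local_variable_get assume Ruby 2.2 or later (the examples in this post use binding.eval instead).

```ruby
# Hypothetical victims; any Ruby-defined method would do.
def greet(name, punctuation)
  "hello, #{name}#{punctuation}"
end

def double(n)
  n * 2
end

calls = []

set_trace_func proc { |event, _file, _line, id, binding, classname|
  # No class or id filter: every Ruby-level method call is recorded.
  if event == 'call'
    args = binding.local_variables.map do |var|
      [var, binding.local_variable_get(var)]
    end
    calls << [classname, id, args.to_h]
  end
}

greet("world", "!")
double(21)

# Always switch the tracer off again, or everything after this point
# keeps paying for it.
set_trace_func(nil)

calls.each { |entry| p entry }
```

The tracer never needed to know that greet or double existed, which is exactly the problem.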
It Never Happened
Nobody expects the value of integer arguments to change when a function is called. However, a suspicious Rubyist might open up that class and add some debugging statements, uncovering our treachery. Let’s be a little more subtle.
previous = {}
depth = 0

set_trace_func proc { |event, file, line, id, binding, classname|
  if event == 'c-call'
    if depth == 0 and rand < 0.5
      # Get the caller's local variables
      locals = binding.eval("local_variables").inject({}) do |vars, name|
        # local_variables returns symbols on Ruby 1.9+; eval wants a string.
        vars[name] = binding.eval name.to_s
        vars
      end

      # Pick some strings
      strings = locals.delete_if do |name, value|
        not value.kind_of? String
      end

      # Nothing to shuffle if the caller has no string locals
      unless strings.empty?
        i = rand strings.size
        str1 = strings.keys[i]
        str2 = strings.keys[(i + 1) % strings.size]

        # And play musical chairs
        previous[str1] = strings[str1].dup
        previous[str2] = strings[str2].dup
        binding.eval "#{str1}.replace #{previous[str2].inspect}"
        binding.eval "#{str2}.replace #{previous[str1].inspect}"
      end
    end
    depth += 1
  elsif event == 'c-return'
    depth -= 1
    if depth <= 0
      # Whoops, the music stopped! Everyone grab your original seat!
      depth = 0
      previous.each do |name, value|
        binding.eval "#{name}.replace #{value.inspect}"
      end
      previous = {}
    end
  end
}

a = "hello"
b = "world"
puts [a, b]
# => "world\nhello"
# Sometimes.
For best results, re-order the arguments to functions which take more than 2 non-hash arguments in a deterministic way.
Next Level Language Maneuver
For the Haskell and Erlang enthusiast, might I suggest:
# Enforce immutable programming. Silently.
lambda {
  default_frame = lambda do
    {
      :locals => {}
    }
  end

  # Stack contains the bound local variables for each method call.
  stack = [default_frame[]]

  set_trace_func proc { |event, file, line, id, binding, classname|
    if event == 'call' or event == 'c-call'
      stack << default_frame[]
    elsif event == 'return' or event == 'c-return'
      stack.pop
      stack << default_frame[] if stack.empty?
    end

    binding.eval("local_variables").each do |var|
      # Get the original and current values of this variable.
      # (local_variables returns symbols on Ruby 1.9+; eval wants a string.)
      old = stack.last[:locals][var]
      new = binding.eval var.to_s

      if old != nil
        unless old == new
          # The variable has changed!
          binding.eval("lambda { |v| #{var} = v }")[old]
        end
      else
        # We haven't seen this variable before
        begin
          original = new.dup
          # Immediately replace this variable with a *different* duplicate of
          # itself to prevent mutator methods from leaking across contexts, or
          # corrupting our stack
          binding.eval("lambda { |v| #{var} = v}")[new.dup]
        rescue
          # Guess you can't dup that
          original = new
        end
        stack.last[:locals][var] = original
      end
    end
  }
}.call

# Any grade schooler could tell you this would have been nonsense.
x = 1
x = 2
puts x # => 1. Ahhhh, much better.

# Your functions are idempotent, right? Well, they are now!
array = [1, 2, 3]
array.delete 2
p array # => [1, 2, 3]

# Makes destructive methods more relaxing!
string = 'good'
puts lambda { |str|
  str.replace 'evil'
}[string] # => evil
puts string # => good

# Blocks don't close over their arguments, sadly.
elem = 0
[1,2,3].each do |elem|
  puts elem
end
puts elem # => 0, 0, 0, and more 0.
Generalization to class and global variables is left to the reader.
Note that you can do this with fewer copies required, but keeping track of which bindings include references to a given mutated object is nontrivial.
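Why all the copying? A short sketch of the aliasing problem, with made-up variable names: a snapshot taken by reference goes stale the moment anyone mutates the object, while a dup does not. The second half shows why even that is not the whole story, since dup is shallow.

```ruby
original = [1, 2, 3]
alias_snapshot = original      # same object, different name
copy_snapshot  = original.dup  # shallow copy

original.delete 2

p alias_snapshot                   # => [1, 3]
p copy_snapshot                    # => [1, 2, 3]
p alias_snapshot.equal?(original)  # => true

# dup only copies the outer object; anything nested is still shared.
nested  = [[1, 2]]
shallow = nested.dup
nested.first << 3

p shallow                          # => [[1, 2, 3]]
```

Skipping the copies means knowing, for every mutation, which other variables in which other bindings point at the same object. Hence "nontrivial."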
Suggested Exercises
- PHP programmers may want to try implementing $REGISTER_GLOBALS for Rack.
- Take it one step further and convert all variables to global scope.
- Leak variables named ‘username’ and ‘password’ to unexpected places.
- Automatically initialize variables which are not explicitly set to helpful values.
- Override the assignment operator.
- Swap the values of similarly named variables.
- Automatically memoize a function. Try using throw/catch, signal handlers, or redefining methods in the binding to affect control flow.
- Unroll .each blocks “for speed.”