Most Rubyists know about monkeypatching: opening up someone else’s class (often, something like String or Object) to modify some of its methods after the fact. It’s both incredibly powerful when used judiciously, and incredibly dangerous the rest of the time. I’ve spent countless hours trying to debug conflicting definitions of #to_json, or trying to untangle ActiveRecord’s astonishing levels of dynamic method aliasing.

I’m here to introduce you to a far more exciting threat: set_trace_func. This invidious callback is invoked on every function call and line of the Ruby interpreter. Most people, if they’re aware of it at all, correctly assume it’s intended for profiling.

They couldn’t be more wrong.

# Fixnum was merged into Integer in Ruby 2.4, so we open Integer here.
class Integer
  def add(other)
    self + other
  end
end

set_trace_func proc { |event, file, line, id, binding, classname|
  if classname == Integer and id == :add and event == 'call'
    # We can, of course, find the receiver of the current method
    me = binding.eval("self")

    # And the binding gives us access to all variables declared
    # in that method's scope. At call time only the method arguments will be
    # defined.
    args = binding.eval("local_variables").inject({}) do |vars, name|
      # local_variables returns symbols on 1.9+; eval wants a string
      value = binding.eval name.to_s
      vars[name] = value unless value.nil?
      vars
    end

    # We can also *change* those arguments.
    args.each do |name, value|
      if Numeric === value
        binding.eval "#{name} = #{value + 1}"
      end
    end
  end
}

puts 1.add(1) # => 3

Note that this allows you to interfere with methods you’ve never seen before, simply by relaxing the class or id restrictions. Spooky action at a distance!
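To get a feel for what the callback actually sees, here is a minimal and entirely harmless sketch that only records events. The Greeter class and the events array are hypothetical names made up for this illustration; set_trace_func itself is marked obsolete in current Ruby documentation in favour of TracePoint, so treat this as a sketch that assumes your interpreter still ships it.

```ruby
# Hypothetical class, used only for this illustration.
class Greeter
  def greet(name)
    "hello, #{name}"
  end
end

events = []

set_trace_func proc { |event, file, line, id, binding, classname|
  # Record only Ruby-level calls and returns on Greeter's methods.
  # Drop the classname check and you see every call in the process.
  if classname == Greeter and (event == 'call' or event == 'return')
    events << [event, id]
  end
}

Greeter.new.greet("world")

# Passing nil uninstalls the callback.
set_trace_func nil

p events   # => [["call", :greet], ["return", :greet]]
```

The filter is the only thing standing between this and a log of the whole interpreter, which is exactly the restriction the trick above relaxes.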

It Never Happened

Nobody expects the values of integer arguments to change when a function is called. However, a suspicious Rubyist might open up that class and add some debugging statements, uncovering our treachery. Let’s be a little more subtle.

previous = {}
depth = 0
set_trace_func proc { |event, file, line, id, binding, classname|
  if event == 'c-call'
    if depth == 0 and rand < 0.5
      # Get the caller's local variables
      locals = binding.eval("local_variables").inject({}) do |vars, name|
        vars[name] = binding.eval name.to_s
        vars
      end

      # Pick some strings
      strings = locals.delete_if do |name, value|
        not value.kind_of? String
      end

      # Nothing to shuffle if the caller has no string locals
      unless strings.empty?
        i = rand strings.size
        str1 = strings.keys[i]
        str2 = strings.keys[(i + 1) % strings.size]

        # And play musical chairs
        previous[str1] = strings[str1].dup
        previous[str2] = strings[str2].dup

        binding.eval "#{str1}.replace #{previous[str2].inspect}"
        binding.eval "#{str2}.replace #{previous[str1].inspect}"
      end
    end

    depth += 1
  elsif event == 'c-return'
    depth -= 1

    if depth <= 0
      # Whoops, the music stopped! Everyone grab your original seat!
      depth = 0

      previous.each do |name, value|
        binding.eval "#{name}.replace #{value.inspect}"
      end

      previous = {}
    end
  end
}
a = "hello"
b = "world"

puts [a, b]
# => "world\nhello"

# Sometimes.

For best results, re-order the arguments to functions which take more than 2 non-hash arguments in a deterministic way.

Next Level Language Maneuver

For the Haskell and Erlang enthusiast, might I suggest:

# Enforce immutable programming. Silently.
lambda {
  default_frame = lambda do
    {
      :locals => {}
    }
  end

  # Stack contains the bound local variables for each method call.
  stack = [default_frame[]]

  set_trace_func proc { |event, file, line, id, binding, classname|
    if event == 'call' or event == 'c-call'
      stack << default_frame[]
    elsif event == 'return' or event == 'c-return'
      # Leave the frame we are returning from
      stack.pop
      stack << default_frame[] if stack.empty?
    end

    binding.eval("local_variables").each do |var|
      # Get the original and current values of this variable
      old = stack.last[:locals][var]
      new = binding.eval var.to_s
      if old != nil
        unless old == new
          # The variable has changed!
          binding.eval("lambda { |v| #{var} = v }")[old]
        end
      else
        # We haven't seen this variable before
        begin
          original = new.dup
          # Immediately replace this variable with a *different* duplicate of
          # itself to prevent mutator methods from leaking across contexts, or
          # corrupting our stack
          binding.eval("lambda { |v| #{var} = v }")[new.dup]
        rescue
          # Guess you can't dup that
          original = new
        end
        stack.last[:locals][var] = original
      end
    end
  }
}.call

# Any grade schooler could tell you this would have been nonsense.
x = 1
x = 2
puts x    # => 1. Ahhhh, much better.

# Your functions are idempotent, right? Well, they are now!
array = [1, 2, 3]
array.delete 2
p array   # => [1, 2, 3]

# Makes destructive methods more relaxing!
string = 'good'
puts lambda { |str|
  str.replace 'evil' 
}[string]     # => evil
puts string   # => good

# Blocks don't close over their arguments, sadly.
elem = 0
[1,2,3].each do |elem|
  puts elem
end
puts elem     # => 0, 0, 0, and more 0.

Generalization to class and global variables is left to the reader.

Note that you can do this with fewer copies required, but keeping track of which bindings include references to a given mutated object is nontrivial.
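Why all the copying? A rough sketch of the underlying problem, with none of the tracing machinery involved: a dup is isolated from mutation, while a second name for the same object is not. The corrupt! helper and the variable names are hypothetical, invented for this illustration.

```ruby
# Hypothetical helper: any destructive method would do.
def corrupt!(str)
  str.replace 'evil'
end

original = 'good'

# A dup is a separate object, so mutating it leaves the original alone.
copy = original.dup
corrupt!(copy)
puts original   # => good

# Plain assignment just gives the same object a second name.
also_original = original
corrupt!(also_original)
puts original   # => evil
puts original.equal?(also_original)   # => true
```

Rolling back a mutation without copies means knowing every name, in every frame, that points at the mutated object. Duplicating on first sight sidesteps that bookkeeping at the cost of memory.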

Suggested Exercises

  1. PHP programmers may want to try implementing $REGISTER_GLOBALS for Rack.
  2. Take it one step further and convert all variables to global scope.
  3. Leak variables named ‘username’ and ‘password’ to unexpected places.
  4. Automatically initialize variables which are not explicitly set to helpful values.
  5. Override the assignment operator.
  6. Swap the values of similarly named variables.
  7. Automatically memoize a function. Try using throw/catch, signal handlers, or redefining methods in the binding to affect control flow.
  8. Unroll .each blocks “for speed.”
