Do Not Resuscitate

Learning static analysis of Ruby by building a dead code detector

Jason Gedge

Founding Engineer

Gadget

Abstract Syntax Tree (AST)

It is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language.

Wikipedia 
ruby-parse -e "
  # a simple class
  class MyClass;
    def initialize
      @ivar = 1
    end
  end
"
(class
    (const nil :MyClass)
    nil
    (def :initialize
        (args
        (ivasgn
            :@ivar
            (int 1)))))

node_dump.c 

Where you can find all of the possible node types you can encounter.

static void
dump_node(VALUE buf, VALUE indent, int comment, const NODE * node) {

  // ...

    case NODE_CLASS:

  // ...

}
case NODE_CLASS:
	ANN("class definition");
	ANN("format: class [nd_cpath] < [nd_super]; [nd_body]; end");
	ANN("example: class C2 < C; ..; end");
	F_NODE(nd_cpath, "class path");
	F_NODE(nd_super, "superclass");
	LAST_NODE;
	F_NODE(nd_body, "class definition");
	return;
case NODE_CLASS:
	ANN("class definition");
	ANN("format: class [nd_cpath] < [nd_super]; [nd_body]; end");
	ANN("example: class C2 < C; ..; end");
	F_NODE(nd_cpath, "class path");
	F_NODE(nd_super, "superclass");
	LAST_NODE;
	F_NODE(nd_body, "class definition");

(class
  (const nil :MyClass)
  nil
  (def :initialize
    (args)
    (ivasgn :@ivar
      (int 1))))

Parser options

  • 💎 parser
  • 👮 Rubocop
  • 🌲 RubyVM::AbstractSyntaxTree

🚨 🚨 🚨 🚨 🚨

This class is experimental and its API is not stable, therefore it might change without notice. As examples, the order of children nodes is not guaranteed, the number of children nodes might change, there is no way to access children nodes by name, etc.

If you are looking for a stable API or an API working under multiple Ruby implementations, consider using the parser gem or Ripper. If you would like to make RubyVM::AbstractSyntaxTree stable, please join the discussion at https://bugs.ruby-lang.org/issues/14844 .

🚨 🚨 🚨 🚨 🚨

RubyVM::AbstractSyntaxTree.parse(string)

RubyVM::AbstractSyntaxTree.parse_file(pathname)

RubyVM::AbstractSyntaxTree::Node

ast.c 

shows the ordering of children, which can be different than ruby-parse / node_dump.c.

case NODE_CALL:
  return rb_ary_new_from_args(
    3,
    NEW_CHILD(ast_value, RNODE_CALL(node)->nd_recv),
    ID2SYM(RNODE_CALL(node)->nd_mid),
    NEW_CHILD(ast_value, RNODE_CALL(node)->nd_args)
  );
depth-first search

Designing our dead code detector

For every Ruby file in current path, collect two pieces of information:

  1. Function calls
  2. Method definitions
def traverse(node, &blk)
  return unless node.is_a(RubyVM::AbstractSyntaxTree::Node)

  blk.call(node)

  node.children.each do |child|
    traverse(child, &blk)
  end
end

traverse(RubyVM::AbstractSyntaxTree.parse_file(pathname)) do |node|
  # TODO
end
calls = Set.new
defs = Set.new

traverse(RubyVM::AbstractSyntaxTree.parse_file(pathname)) do |node|
  # TODO
end

puts "Calls: #{calls.to_a}"
puts " Defs: #{defs.to_a}"
calls = Set.new
defs = Set.new

traverse(RubyVM::AbstractSyntaxTree.parse_file(pathname)) do |node|
  case node.type
  when :FCALL
    calls << node.children[0]
  when :CALL
    calls << node.children[1]
  end
end

puts "Calls: #{calls.to_a}"
puts " Defs: #{defs.to_a}"
calls = Set.new
defs = Set.new

traverse(RubyVM::AbstractSyntaxTree.parse_file(pathname)) do |node|
  case node.type
  when :FCALL
    calls << node.children[0]
  when :CALL
    calls << node.children[1]
  when :DEFN
    defs << node.children[0]
  when :DEFS
    defs << node.children[1]
  end
end

puts "Calls: #{calls.to_a}"
puts " Defs: #{defs.to_a}"
calls = Set.new
defs = Set.new

traverse(RubyVM::AbstractSyntaxTree.parse_file(pathname)) do |node|
  case node.type
  when :FCALL
    calls << node.children[0]
  when :CALL
    calls << node.children[1]
  when :DEFN
    defs << node.children[0]
  when :DEFS
    defs << node.children[1]
  end
end

puts "     Calls: #{calls.to_a}"
puts "      Defs: #{defs.to_a}"
puts "Maybe dead: #{(defs - calls).to_a}"
$ ruby find_dead_code.rb unused_example.rb

     Calls: [:include, :attr_reader, :new, :value, :create, :puts, :to_s, :squared]
      Defs: [:create, :initialize, :squared, :incremented, :<=>]
Maybe dead: [:initialize, :incremented, :<=>]