ActiveSupport’s #descendants Method: A Deep Dive

Rails adds many things to Ruby's built-in objects. This is what some call a "dialect" of Ruby and is what allows Rails developers to write lines like 1.day.ago.

Most of these extra methods live in ActiveSupport. Today, we're going to look at perhaps a lesser-known method that ActiveSupport adds directly to Class: descendants. This method returns all the subclasses of the called class. For example, ApplicationRecord.descendants will return the classes in your app that inherit from it (e.g., all the models in your application). In this article, we'll take a look at how it works, why you might want to use it, and how it augments Ruby's built-in inheritance-related methods.

Inheritance in Object-Oriented Languages

First, we'll provide a quick refresher on Ruby's inheritance model. Like other object-oriented (OO) languages, Ruby uses objects that sit within a hierarchy. You can create a class, then a subclass of that class, then a subclass of that subclass, and so on. When walking up this hierarchy, we get a list of ancestors. Ruby also has the nice feature that all entities are objects themselves (including classes, integers, and even nil), whereas some other languages often use "primitives" that are not true objects, usually for the sake of performance (such as integers, doubles, booleans, etc.; I'm looking at you, Java).
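As a quick illustration of the "everything is an object" point, the following plain-Ruby sketch (no Rails required) asks a few literal values for their class. Nothing here is specific to the article's examples.

```ruby
# Every value has a class, and classes are themselves objects.
p 1.class        # Integer
p nil.class      # NilClass
p Integer.class  # Class

# Because they are objects, even integers and nil respond to methods
# and sit inside the same ancestor hierarchy as any other class.
p nil.respond_to?(:to_s)                 # true
p Integer.ancestors.include?(Numeric)    # true
p Integer.ancestors.last                 # BasicObject
```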

Ruby, like every OO language, has to keep track of a class's ancestors so that it knows where to look up methods and which definition takes precedence.

class BaseClass
  def base
    "base"
  end

  def overridden
    "Base"
  end
end

class SubClass < BaseClass
  def overridden
    "Subclass"
  end
end

Here, calling SubClass.new.overridden gives us "Subclass". However, base is not defined in SubClass, so when we call SubClass.new.base, Ruby walks through the ancestors in order until it finds one that implements the method (if any). We can see the list of ancestors by calling SubClass.ancestors. In Rails, the result will be something like this:

[SubClass,
 BaseClass,
 ActiveSupport::Dependencies::ZeitwerkIntegration::RequireDependency,
 ActiveSupport::ToJsonWithActiveSupportEncoder,
 Object,
 PP::ObjectMixin,
 JSON::Ext::Generator::GeneratorMethods::Object,
 ActiveSupport::Tryable,
 ActiveSupport::Dependencies::Loadable,
 Kernel,
 BasicObject]

We won't dissect this whole list here; for our purposes, it's enough to note that SubClass is at the top, with BaseClass below it. Also, note that BasicObject is at the bottom; this is the root class in Ruby, which every other class ultimately inherits from, so it is always the last entry in the list.
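To see the lookup outside Rails, here is a minimal plain-Ruby sketch. The Vehicle and Motorbike classes are hypothetical stand-ins for BaseClass and SubClass; Method#owner reports which entry in the ancestor list actually supplied a method.

```ruby
# Hypothetical stand-ins for BaseClass and SubClass.
class Vehicle
  def wheels
    4
  end

  def describe
    "vehicle"
  end
end

class Motorbike < Vehicle
  def describe
    "motorbike"
  end
end

# Without Rails loaded, the list is much shorter than the one above.
p Motorbike.ancestors.first(2)  # [Motorbike, Vehicle]
p Motorbike.ancestors.last      # BasicObject

# `owner` shows where in the chain Ruby found each method.
p Motorbike.new.method(:describe).owner  # Motorbike
p Motorbike.new.method(:wheels).owner    # Vehicle
```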

Modules (a.k.a. 'Mixins')

Things get a bit more complicated when we add modules into the mix. A module is not a class and cannot be a superclass, yet we can "include" it into our class, at which point it appears in the list of ancestors. Ruby therefore has to know when to check the module for a method, and which module to check first when multiple modules are included.

Some languages do not allow this kind of "multiple inheritance", but Ruby even goes a step further by letting us choose where the module gets inserted into the hierarchy by whether we include or prepend the module.

Prepending Modules

Prepended modules, as their name somewhat suggests, are inserted into the list of ancestors before the class, effectively overriding any of the class' methods. This also means you can call "super" in a prepended module's method to call the original class' method.

module PrependedModule
  def test
    "module"
  end

  def super_test
    super
  end
end

# Re-using `BaseClass` from earlier
class SubClass < BaseClass
  prepend PrependedModule

  def test
    "Subclass"
  end

  def super_test
    "Super calls SubClass"
  end
end

The ancestors for SubClass now look like this:

[PrependedModule,
 SubClass,
 BaseClass,
 ActiveSupport::Dependencies::ZeitwerkIntegration::RequireDependency,
 ...
]

With this new list of ancestors, our PrependedModule is now first-in-line, meaning Ruby will look there first for any methods we call on SubClass. This also means that if we call super within PrependedModule, we will be calling the method on SubClass:

> SubClass.new.test
=> "module"
> SubClass.new.super_test
=> "Super calls SubClass"
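A common reason to prepend is to wrap an existing method rather than replace it outright. The sketch below uses hypothetical names (Loud, Greeter) and runs in plain Ruby; Method#super_method shows where a call to super would land.

```ruby
# Hypothetical module that decorates the class' own method via `super`.
module Loud
  def greet
    super.upcase + "!"
  end
end

class Greeter
  prepend Loud

  def greet
    "hello"
  end
end

# The prepended module sits in front of the class in the ancestor list.
p Greeter.ancestors.first(2)  # [Loud, Greeter]
p Greeter.new.greet           # "HELLO!"

greet = Greeter.new.method(:greet)
p greet.owner                 # Loud
p greet.super_method.owner    # Greeter
```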

Including Modules

Included modules, on the other hand, are inserted into the ancestors after the class. This makes them ideal for intercepting methods that would otherwise be handled by the base class.

class BaseClass
  def super_test
    "Super calls base class"
  end
end

module IncludedModule
  def test
    "module"
  end

  def super_test
    super
  end
end

class SubClass < BaseClass
  include IncludedModule

  def test
    "Subclass"
  end
end

With this arrangement, the ancestors for SubClass now look like this:

[SubClass,
 IncludedModule,
 BaseClass,
 ActiveSupport::Dependencies::ZeitwerkIntegration::RequireDependency,
 ...
]

Now, SubClass is the first port of call, so Ruby will only execute methods in IncludedModule if they are not present in SubClass. As for super, any calls to super in SubClass will go to IncludedModule first, while any calls to super within IncludedModule will go to BaseClass.

Put another way, an included module sits between a subclass and its base class in the ancestor hierarchy. This effectively means they can be used to 'intercept' methods that would otherwise be handled by the base class:

> SubClass.new.test
=> "Subclass"
> SubClass.new.super_test
=> "Super calls base class"
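The same "intercept" idea can be sketched in plain Ruby with a module that adds to the base class' result instead of simply passing it through. Document, Draft, and Report are hypothetical names used only for this illustration.

```ruby
class Document
  def title
    "document"
  end
end

# Included module: sits between Report and Document in the chain.
module Draft
  def title
    "draft " + super
  end
end

class Report < Document
  include Draft
end

p Report.ancestors.first(3)  # [Report, Draft, Document]
# Report does not define `title`, so Ruby finds it in Draft,
# and `super` inside Draft continues on to Document.
p Report.new.title           # "draft document"
```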

Because of this "chain of command", Ruby has to keep track of a class's ancestors. The reverse is not true, though. Given a particular class, Ruby does not need to track its children, or "descendants", because it will never need this information to execute a method.

Ancestor Ordering

Astute readers may have realized that if we were using multiple modules in a class, then the order we include (or prepend) them could produce different results. For example, depending on the methods, this class:

class SubClass < BaseClass
  include IncludedModule
  include IncludedOtherModule
end

and this class:

class SubClass < BaseClass
  include IncludedOtherModule
  include IncludedModule
end

could behave quite differently. If these two modules had methods with the same name, then the order here would determine which one takes precedence and where calls to super are resolved. Personally, I'd avoid having methods that overlap each other like this as much as possible, specifically to avoid having to worry about things like the order in which the modules are included.
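A small plain-Ruby sketch makes the ordering rule concrete: the module included last ends up closest to the class, so it wins when names clash. The module and class names below are hypothetical.

```ruby
module English
  def hello
    "hello"
  end
end

module French
  def hello
    "bonjour"
  end
end

class EnglishThenFrench
  include English
  include French
end

class FrenchThenEnglish
  include French
  include English
end

# The most recently included module is searched first.
p EnglishThenFrench.ancestors.first(3)  # [EnglishThenFrench, French, English]
p EnglishThenFrench.new.hello           # "bonjour"
p FrenchThenEnglish.new.hello           # "hello"
```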

Real-World Usage

While it's good to know the difference between include and prepend for modules, I think a more real-world example helps to show when you might choose one over the other. My main use-case for modules like these is with Rails engines.

Probably one of the most popular Rails engines is Devise. Let's say we wanted to change the password digest algorithm being used, but first, a quick disclaimer:

My day-to-day use of modules has been to customize the behavior of a Rails engine that holds our default business logic. We are overriding the behavior of code we control. You can, of course, apply the same method to any piece of Ruby, but I would not recommend overriding code that you do not control (e.g., from gems maintained by other people), as any change to that external code could be incompatible with your changes.

Devise's password digest happens here in the Devise::Models::DatabaseAuthenticatable module:

  def password_digest(password)
    Devise::Encryptor.digest(self.class, password)
  end

  # and also in the password check:
  def valid_password?(password)
    Devise::Encryptor.compare(self.class, encrypted_password, password)
  end

Devise allows you to customize the algorithm being used here by creating your own Devise::Encryptable::Encryptors, which is the correct way to do it. For demonstration purposes, however, we'll be using a module.

# app/models/password_digest_module.rb
module PasswordDigestModule
  def password_digest(password)
    # Devise's default bcrypt is better for passwords,
    # using sha1 here just for demonstration
    Digest::SHA1.hexdigest(password)
  end

  def valid_password?(password)
    Devise.secure_compare(password_digest(password), self.encrypted_password)
  end
end

begin
  User.include(PasswordDigestModule)
# Pro-tip - because we are calling User here, ActiveRecord will
# try to read from the database when this class is loaded.
# This can cause commands like `rails db:create` to fail.
rescue ActiveRecord::NoDatabaseError, ActiveRecord::StatementInvalid
end

To get this module loaded, you'll need to call Rails.application.eager_load! in development or add a Rails initializer to load the file. By testing it out, we can see it works as expected:

> User.create!(email: "[email protected]", name: "Test", password: "TestPassword")
=> #<User id: 1, name: "Test", created_at: "2021-05-01 02:08:29", updated_at: "2021-05-01 02:08:29", posts_count: nil, email: "[email protected]">
> User.first.valid_password?("TestPassword")
=> true
> User.first.encrypted_password
=> "4203189099774a965101b90b74f1d842fc80bf91"

In our case here, both include and prepend would have the same result, but let's add a complication. What if our User model implements its own password_salt method, but we want the methods in our module to override it?

class User < ApplicationRecord
  # Include default devise modules. Others available are:
  # :confirmable, :lockable, :timeoutable, :trackable and :omniauthable
  devise :database_authenticatable, :registerable,
         :recoverable, :rememberable, :validatable
  has_many :posts

  def password_salt
    # Terrible way to create a password salt,
    # purely for demonstration purposes
    Base64.encode64(email)[0..-4]
  end
end

Then, we update our module to use its own password_salt method when creating the password digest:

  def password_digest(password)
    # Devise's default bcrypt is better for passwords,
    # using sha1 here just for demonstration.
    # The salt is appended after hashing so it is visible in the output.
    Digest::SHA1.hexdigest(password) + "." + password_salt
  end

  def password_salt
    # an even worse way of generating a password salt
    "salt"
  end

Now, include and prepend will behave differently because which one we use will determine which password_salt method Ruby executes. With prepend, the module will take precedence, and we get this:

> User.last.password_digest("test")
=> "a94a8fe5ccb19ba61c4c0873d391e987982fbbd3.salt"

Changing the module to use include will instead mean that the User class implementation takes precedence:

> User.last.password_digest("test")
=> "a94a8fe5ccb19ba61c4c0873d391e987982fbbd3.dHdvQHRlc3QuY2"

Generally, I reach for prepend first because, when writing a module, I find it easier to treat it more like a subclass and assume any method in the module will override the class' version. Obviously, this is not always desired, which is why Ruby also gives us the include option.
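The difference can be reproduced without Rails or Devise. This is only a sketch under stated assumptions: SaltedDigest, PrependedUser, and IncludedUser are hypothetical stand-ins for the module and the User model, and the salt is appended after hashing so that it is visible in the result.

```ruby
require "digest"

# Hypothetical stand-in for the module above.
module SaltedDigest
  def password_digest(password)
    Digest::SHA1.hexdigest(password) + "." + password_salt
  end

  def password_salt
    "salt"
  end
end

# With prepend, the module's password_salt is found first.
class PrependedUser
  prepend SaltedDigest

  def password_salt
    "from-class"
  end
end

# With include, the class' own password_salt is found first.
class IncludedUser
  include SaltedDigest

  def password_salt
    "from-class"
  end
end

p PrependedUser.new.password_digest("test").split(".").last  # "salt"
p IncludedUser.new.password_digest("test").split(".").last   # "from-class"
```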

Descendants

We've seen how Ruby keeps track of class ancestors to know the order-of-precedence when executing methods, as well as how we insert entries into this list via modules. However, as programmers, it can be useful to iterate through all of a class' descendants, too. This is where ActiveSupport's #descendants method comes in. The method is quite short and easily duplicated outside Rails if needed:

class Class
  def descendants
    ObjectSpace.each_object(singleton_class).reject do |k|
      k.singleton_class? || k == self
    end
  end
end

ObjectSpace is a very interesting part of Ruby that lets us enumerate every Ruby object currently in memory. We won't dive into it here, but if you have a class defined in your application (and it's been loaded), it will be reachable through ObjectSpace. ObjectSpace.each_object, when passed a class or module, yields only the objects that are instances of it (or of one of its subclasses). Passing singleton_class works because a class and all of its subclasses are instances of that class's singleton class. The block then rejects singleton classes and the receiver itself (e.g., if we call Numeric.descendants, we don't expect Numeric to be in the results).

Don't worry if you don't quite get what's happening here, as more reading on ObjectSpace is probably required to really get it. For our purposes, it's enough to know that this method lives on Class and returns a list of descendant classes, or you may think of it as the "family tree" of that class' children, grandchildren, etc.
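To try the idea outside Rails without reopening Class, the same ObjectSpace approach can be wrapped in a standalone helper. The descendants_of method and the Shape hierarchy are hypothetical names for this sketch; the results are sorted by name because ObjectSpace makes no promise about ordering.

```ruby
# Standalone helper using the same ObjectSpace approach as above.
def descendants_of(klass)
  ObjectSpace.each_object(klass.singleton_class).reject do |candidate|
    candidate.singleton_class? || candidate.equal?(klass)
  end
end

class Shape; end
class Polygon < Shape; end
class Triangle < Polygon; end
class Circle < Shape; end

# Children and grandchildren are both returned; Shape itself is not.
p descendants_of(Shape).map(&:name).sort  # ["Circle", "Polygon", "Triangle"]
p descendants_of(Polygon)                 # [Triangle]
p descendants_of(Triangle)                # []
```

On Ruby 3.1 and later, the built-in Class#subclasses returns only the direct children, so a grandchild such as Triangle would not appear in Shape.subclasses.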

Real-World Use of #descendants

At RailsConf 2018, Ryan Laughlin gave a talk on 'checkups'. The video is worth a watch, but we'll just extract one idea, which is to periodically run through all rows in your database and check if they pass your models' validity checks. You may be surprised how many rows in your database don't pass the #valid? test.

The question, then, is how do we implement this check without having to manually maintain a list of models? #descendants is the answer:

# Ensure all models are loaded (should not be necessary in production)
Rails.application.eager_load! if Rails.env.development?

ApplicationRecord.descendants.each do |model_class|
  # Abstract classes have no table of their own, so skip them
  next if model_class.abstract_class?

  # in the real world you'd want to send this off to background job(s)
  # find_each loads records in batches instead of all at once
  model_class.find_each do |record|
    unless record.valid?
      Honeybadger.notify("Invalid #{model_class.name} found with ID: #{record.id}")
    end
  end
end

Here, ApplicationRecord.descendants gives us a list of every model in a standard Rails application. In our loop, then, model_class is the class (e.g., User or Product). The implementation here is pretty basic, but the result is that it will iterate through every model (or, more accurately, every subclass of ApplicationRecord) and call .valid? for every row.
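The shape of this checkup can be exercised without a database. The sketch below is an assumption-heavy toy: FakeRecord is a hypothetical in-memory stand-in for ApplicationRecord, its descendants method copies the ObjectSpace approach shown earlier, and invalid rows are collected into a list rather than reported to an error tracker.

```ruby
# Hypothetical in-memory stand-in for ApplicationRecord.
class FakeRecord
  attr_reader :id

  # Each subclass keeps its own list of rows.
  def self.rows
    @rows ||= []
  end

  def self.create(**attrs)
    record = new(rows.size + 1, attrs)
    rows << record
    record
  end

  def self.descendants
    ObjectSpace.each_object(singleton_class).reject do |k|
      k.singleton_class? || k.equal?(self)
    end
  end

  def initialize(id, attrs)
    @id = id
    @attrs = attrs
  end

  def valid?
    true
  end
end

class Author < FakeRecord
  def valid?
    !@attrs[:name].to_s.empty?
  end
end

class Article < FakeRecord
  def valid?
    @attrs[:title].to_s.length >= 3
  end
end

Author.create(name: "Ada")
Author.create(name: "")
Article.create(title: "Hi")

# Walk every model class and collect the rows that fail validation.
invalid = FakeRecord.descendants.flat_map do |model_class|
  model_class.rows.reject(&:valid?).map { |r| "#{model_class.name}##{r.id}" }
end.sort

p invalid  # ["Article#1", "Author#2"]
```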

Conclusion

For most Rails developers, modules are not commonly used. This is for a good reason; if you own the code, there are usually easier ways to customize its behavior, and if you don't own the code, there are risks in changing its behavior with modules. Nevertheless, they have their use-cases, and it is a testament to Ruby's flexibility that not only can we change a class from another file, we also have the option to choose where in the ancestor chain our module appears.

ActiveSupport then comes in to provide the inverse of #ancestors with #descendants. This method is seldom used as far as I've seen, but once you know it's there, you'll probably find more and more uses for it. Personally, I've used it not just for checking model validity, but even with specs to validate that we are correctly adding attribute_alias methods for all our models.

Source: Honeybadger