Mixins: a refactoring anti-pattern
I spend an unusually large amount of time thinking about interactions between what I call "past me" and "future me." It seems that my life changes significantly every few years, and I like to ground myself by imagining how odd it would be if "current me" could tell "past me" things like "Someday, you'll be speaking at OSCON."
It's not always general life stuff, though; it's often a trick or technique with code. How many times have you learned something new, and then had that terrible urge to go back and rewrite all of that terrible, terrible code you've written in the past? This happens to me constantly.
So here, let me show you something I realized that I used to do wrong all the time.
Let's do it via a quiz. Of course, because I'm framing it this way, you'll get the answer, but seriously, just play along.
Two factorizations
Which code is better factored?
```ruby
# app/models/post.rb
class Post
  include Behaviors::PostBehavior
end

# app/models/behaviors/post_behavior.rb
module Behaviors
  module PostBehavior
    attr_accessor :blog, :title, :body

    def initialize(attrs = {})
      attrs.each { |k, v| send("#{k}=", v) }
    end

    def publish
      blog.add_entry(self)
    end

    # ... twenty more methods go here
  end
end
```
or
```ruby
class Post < ActiveRecord::Base
  def publish
    blog.add_entry(self)
  end

  # ... twenty more methods go here
end
```
One line of reasoning asserts that the first example is better. Here's why: all of the behavior is in one place, and the persistence is in another. A clear win.
Another line of reasoning asserts that the second is better. Here's why: it's significantly simpler. A clear win.
So which is right?
They're both wrong
Both of these justifications are wrong. As far as I'm concerned, these two classes are equally complex, but because there is a lot more ceremony in version one, version two is preferable. Assertion one is wrong because both are still in the same place; they're just in two different files. Assertion two is wrong because it's not simpler than version one; it's equivalently complex.
Measures of complexity
Whenever we refactor, we have to consider what we're using to evaluate whether our refactoring has been successful. For me, the default is complexity. That is, any refactoring I'm doing is trying to reduce complexity. So how do I measure complexity?
One good way that I think about complexity at the individual-object level comes from security practices: the "attack surface." We call this "encapsulation" in object-oriented software design. Consider these two objects:
```ruby
class Foo
  def bar
  end

  def baz
  end
end

class Bar
  def quxx
  end
end
```
Bar has one less way into the object than Foo does. Hence the idea of a "surface" that an attacker would have access to. A bigger surface needs more defense, more attention, and is harder to lock down.
Note that I said "one less." If you said "Foo has a surface of two, and Bar has a surface of one," you'd be wrong. Luckily, Ruby provides us with a simple way to calculate attack surface. Just use this monkey patch: (I'm not suggesting that you actually use this monkey patch.)
```ruby
class BasicObject
  def attack_surface
    methods.count
  end
end
```
This gives us:

```
1.9.2p318 > Foo.new.attack_surface
 => 59
1.9.2p318 > Bar.new.attack_surface
 => 58
```
Oooooh yeah. Object.new.methods.count is 57… we forgot about those. This is why mixins do not really reduce the complexity of your objects. Both versions of Post above have the same number of methods. How many, you ask?

```
> Post.new.methods.count
 => 337
> Post.new.methods.count - Object.new.methods.count
 => 236
```
In this case, my Post is an object that inherits from ActiveRecord::Base and defines nothing else. ActiveRecord adds 236 methods to our object. That is a pretty huge increase.
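The same point holds for the mixin version from the quiz: moving methods into a module does not remove them from the object's surface. Here's a minimal, self-contained check in plain Ruby (no Rails; the class and module names are made up for the demonstration):

```ruby
module PostBehavior
  def publish; end
  def summary; end
end

# Version one: behavior mixed in.
class MixinPost
  include PostBehavior
end

# Version two: behavior defined directly on the class.
class FlatPost
  def publish; end
  def summary; end
end

# Both objects expose exactly the same number of methods;
# the mixin only changed where the definitions live.
MixinPost.new.methods.count == FlatPost.new.methods.count # => true
```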
Reducing complexity through encapsulation
Several people have been dancing around this issue for a while by saying something like "Consider ActiveRecord a private interface, and never use it from outside of your class." They feel this exact pain, but enforce the boundary through social means rather than by reducing the complexity of their objects. Now, there are good reasons to do that, so I don't think it's categorically wrong, but it certainly is a compromise.
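One way to make that boundary structural rather than social is to keep the record object private and delegate only the methods you actually mean to support. This is a hypothetical sketch: the `Record` class here is a stand-in for an ActiveRecord model, and the method names on it are invented for illustration.

```ruby
require "forwardable"

# Stand-in for an ActiveRecord model: lots of public methods.
class Record
  def save; true; end
  def destroy; true; end
  def update_column(*); true; end
  # ...and hundreds more in a real ORM
end

class Post
  extend Forwardable
  def_delegators :@record, :save # only :save joins Post's surface

  def initialize(record = Record.new)
    @record = record
  end
end

post = Post.new
post.respond_to?(:save)          # => true
post.respond_to?(:update_column) # => false
```

The record's other methods simply aren't reachable from outside, so the rule doesn't depend on reviewers catching violations.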
An implementation of DataMapper would be able to solve this problem:
```ruby
class Post
  attr_reader :title, :body

  def initialize(title, body)
    @title = title
    @body = body
  end
end

class PostMapper
  def save(post)
    @database.insert(:posts, [post.title, post.body])
  end
end

PostMapper.new.save(Post.new("Hello DataMapper", "DM rocks"))
```
or the repository pattern:
```ruby
class PostRepository
  def first
    title, body = @database.select(:posts, "LIMIT 1")
    Post.new(title, body)
  end
end
```
All incredibly theoretical and pretend. But note that the Post in both instances has no methods defined on it that deal with persistence whatsoever. We've used encapsulation to hide all of the details of the database in the database, the details of the post logic in the Post, and we use the Mapper or Repository to mediate between the two.
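To make the theoretical sketch concrete, here's a runnable toy version of the repository, with an in-memory array standing in for the `@database` object (all names here are hypothetical, not from any real library):

```ruby
class Post
  attr_reader :title, :body

  def initialize(title, body)
    @title = title
    @body = body
  end
end

class PostRepository
  def initialize
    @rows = [] # stand-in for a real database table
  end

  def save(post)
    @rows << [post.title, post.body]
  end

  def first
    title, body = @rows.first
    Post.new(title, body)
  end
end

repo = PostRepository.new
repo.save(Post.new("Hello", "world"))
repo.first.title # => "Hello"
```

Post itself still knows nothing about persistence; swap the repository's internals for a real database and Post doesn't change.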
Using Mixins the right way
Mixins are awesome, though, and can be used to reduce duplication. They're for cross-cutting concerns:
```ruby
class Post
  def summary
    "#{author}: #{body[0,200]}"
  end
end

class Comment
  def summary
    "#{author}: #{body[0,200]}"
  end
end
```
moves to
```ruby
module Summarizer
  def summary
    "#{author}: #{body[0,200]}"
  end
end

class Post
  include Summarizer
end

class Comment
  include Summarizer
end
```
This example isn't fantastic, but whatever. We've eliminated some duplication, even if we didn't reduce our method surface. This is a win along a different axis; we just shouldn't fool ourselves that we've made our classes "simpler."
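To see the mixin version actually run, here it is with stubbed-in `author` and `body` methods (the values are made up, just enough for `summary` to execute):

```ruby
module Summarizer
  def summary
    "#{author}: #{body[0, 200]}"
  end
end

# Minimal stand-ins for the real models.
class Post
  include Summarizer
  def author; "steve"; end
  def body; "mixins are for cross-cutting concerns"; end
end

class Comment
  include Summarizer
  def author; "reader"; end
  def body; "agreed"; end
end

Post.new.summary    # => "steve: mixins are for cross-cutting concerns"
Comment.new.summary # => "reader: agreed"
```

One definition of `summary`, shared by both classes; and if the format changes, it changes in one place.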
This isn't the only metric
Now, I don't mean to imply that method count is the be-all, end-all metric of complexity. Real software development is too complicated to wrap up in one tiny number. However, I do think there's a strong correlation between small, focused classes and easy-to-understand software systems. Aggregates can be used to make it easier to understand the use of a cluster of smaller objects. But that's another post.