rebar3_hank: The Erlang Dead Code Cleaner

From the creators of rebar3_format, here comes… rebar3_hank, a powerful but simple tool to detect dead code around your Erlang codebase (and kill it with fire!).

Developers can use this rebar plugin in addition to a linter (Elvis), Xref, and Dialyzer; they complement each other perfectly.

10 minute read


Hank Scorpio – Kill it with fire!

Introduction

In NextRoll’s RTB Team, we have two passions while we maintain our codebases: killing dead code and automating things! That’s why we thought that a tool for automating this process would be handy for the community and us. So, we started thinking more seriously about Hank, and we decided to spend our Winter HackWeek time to make it possible!

Nobody wants to maintain dead code. In fact, most of us are huge fans of negative PRs. Hank can help you with that by traversing your project, analyzing every .erl and .hrl file in it (optionally skipping some folders/files), applying the rules, and producing a list of all the code that you can effectively delete and/or refactor.

The best thing is that you can be sure that the dead code is, in fact, dead since Hank is built with Dialyzer levels of certainty™️.

Differences between Hank and other existing tools

You might be thinking: Why Hank? That’s a job for my linter!

The answer is: No.

We use Elvis for code linting; it reviews our Erlang code style like function naming, nesting level, line length, variable naming convention, etc.

Hank doesn’t do that.

Xref is a cross-reference tool that can be used for finding dependencies between functions, modules, applications, and releases. It does so by analyzing the defined functions and function calls.
So it will warn us about a defined function that is never used anywhere in our source code.

Hank doesn’t do that either.

And Dialyzer?
Dialyzer is a static analysis tool that identifies software discrepancies, such as success type errors, code that has become dead or unreachable because of a programming error, and unnecessary tests, among other things.
It bases its analysis on the concept of success typings.

Hank neither relies on specs nor evaluates the “semantics” of function params/returns.

So what exactly does Hank do?

Hank will detect and warn you about valid parts of your code that could potentially be deleted or at least refactored based on rules.

It works on entire projects (as opposed to Elvis, which works on individual files), on source code (as opposed to Xref, which works on compiled code), and on individual projects (as opposed to Dialyzer, which analyzes entire systems – including OTP and your dependencies).

At the time of writing, the current version is 0.2.1. It’s a minor version, but we’re already using it in our systems, and it’s practically ready for production usage.


How to use rebar3_hank

Just add this to your rebar.config (either in your project or globally in ~/.config/rebar3/rebar.config):

{plugins, [rebar3_hank]}.

Then run…

rebar3 hank

…and kill it with fire!

Ignoring rules

There are cases where you need to ignore some rules. When developing libraries, for example, you may define hrl files or modules that are meant to be consumed by other projects, so a rule like single_use_hrl_attrs would flag code that is not actually dead. Something similar happens when using Xref.

For this purpose, you can ignore hank at the module level:

% ignoring all the rules for this module
-hank ignore.

% or ignoring specific rules
-hank [single_use_hrl_attrs].

Or add this configuration in your rebar.config:

{hank, [{ignore, [
    {"test/*.erl", unused_ignored_function_params}
]}]}.

The Rules

These are the rules we’ve created so far. You can use all of them with Hank directly.

unused_ignored_function_params

Functions evolve, and some parameters that were used before may no longer be needed. A typical easy solution could be just ignoring them and forgetting about the issue.

Hank detects ignored parameters in the same position for all function clauses and lets you know that you can delete those parameters and refactor the places where the function is invoked, thus making your code cleaner. 😉

For instance, when analyzing this module…

-module(my_module).

-export([external_fun/1]).

external_fun(X) ->
    multi_fun(X, rand:uniform(), undefined).

%% A multi-clause function with unused 3rd param
multi_fun(undefined, _, _) ->
    ok;
multi_fun(Arg1, Arg2, _Arg3) when is_binary(Arg1) ->
    Arg2;
multi_fun(Arg1, _, _) ->
    Arg1.

Hank will output…

$ rebar3 hank
===> Looking for code to kill with fire...
===> The following pieces of code are dead and should be removed:
src/my_module.erl:9: Param #3 is not used at 'multi_fun/3'

To avoid this warning, remove the unused parameter(s).
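
As a minimal sketch of that cleanup (our own rewrite of the module above, not output produced by Hank), the third parameter is dropped from every clause of multi_fun and from its only call site:

```erlang
-module(my_module).

-export([external_fun/1]).

external_fun(X) ->
    %% The call site no longer passes the third argument.
    multi_fun(X, rand:uniform()).

%% Same clauses as before, minus the parameter that every clause ignored.
%% The second parameter stays: one clause actually uses it.
multi_fun(undefined, _) ->
    ok;
multi_fun(Arg1, Arg2) when is_binary(Arg1) ->
    Arg2;
multi_fun(Arg1, _) ->
    Arg1.
```

The exported function keeps its arity, so callers outside the module are unaffected by the refactor.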

single_use_hrls

Sometimes you put some code in a header file that’s supposed to be shared among multiple modules, but you end up writing just one module that uses it. In this case, it would be better to directly put the header file’s contents in the module itself. And Hank has a rule for that!

Assuming header.hrl:

-define(APP_HEADER, "this is a header from an app that will be used in just one module").
-define(SOME_MACRO(A), A).

And app_include_lib.erl:

-module(app_include_lib).

-include("header.hrl").

-export([my_function/0]).

my_function() ->
  % those are only used here!
  ?SOME_MACRO(?APP_HEADER).

It will output:

$ rebar3 hank
===> Looking for code to kill with fire...
===> The following pieces of code are dead and should be removed:
header.hrl:0: This header file is only included at: src/app_include_lib.erl

Move the hrl file’s contents directly to the module that uses them, and you’ll not see this warning again.

See a complete example here.
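
For illustration, one possible end state for this particular warning (a sketch we wrote by hand, assuming header.hrl is deleted once its contents are moved) looks like this:

```erlang
-module(app_include_lib).

-export([my_function/0]).

%% Formerly defined in header.hrl; now private to the only module using them.
-define(APP_HEADER, "this is a header from an app that will be used in just one module").
-define(SOME_MACRO(A), A).

my_function() ->
    ?SOME_MACRO(?APP_HEADER).
```

With no header file left to include, the single_use_hrls rule should have nothing to report for this module.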

single_use_hrl_attrs

Sometimes it’s more subtle, tho. It’s not that the whole file is used in just one module; it is shared among many modules. But some attributes (like macros or records) are not. They are defined in the header file but only used in a single module. Hank has a rule that will suggest placing those attributes inside that module to limit the amount of stuff that’s shared unnecessarily.

Given the previous files and including the hrl in another file:

-module(app_include_lib_2).

-include("header.hrl").

It will output:

$ rebar3 hank
===> Looking for code to kill with fire...
===> The following pieces of code are dead and should be removed:
include/header.hrl:2: ?SOME_MACRO/1 is used only at src/app_include_lib.erl

See a complete example here

unused_hrls

Sometimes the situation is even worse: You might have hrl files that are not included in any module. Hank will detect those and let you know that you can remove them entirely since they’re virtually useless.

If we add a header_2.hrl file that no module includes, the output will be:

$ rebar3 hank
===> Looking for code to kill with fire...
===> The following pieces of code are dead and should be removed:
include/header_2.hrl:0: This file is unused

See an example here

It’s worth mentioning that erlang-ls already provides a similar functionality.

unused_macros

Hank also has a rule that will detect unused macros around the project. Those macros could be defined in any file within the source code but used in none of them. Therefore, they are effectively unnecessary and can be deleted.

See an example here
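
To make the idea concrete, here is a small hypothetical module (the module name and both macros are invented for this sketch, and we have not included Hank's output for it). The compiler accepts it without complaint, yet one of the two macros is never referenced, which is the kind of definition this rule is described as catching:

```erlang
-module(unused_macro_demo).

-export([timeout/0]).

%% Referenced by timeout/0 below.
-define(TIMEOUT_MS, 5000).
%% Defined but never referenced anywhere: a candidate for deletion.
-define(OLD_TIMEOUT_MS, 30000).

timeout() ->
    ?TIMEOUT_MS.
```

Deleting OLD_TIMEOUT_MS changes nothing about the module's behavior.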

unused_record_fields

A fascinating one! With this rule, Hank will spot record declarations with fields that are defined (even giving them default values) but never used. Hank considers that you are using a record field when it is accessed or written.

You can use this warning to reduce your records’ size by removing the unused fields from them.

See an example here
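
As a rough sketch under the description above (the module, record, and field names are hypothetical, and this is not taken from Hank's test suite), the legacy_id field below has a default value but is never accessed or written explicitly, so we would expect it to be the kind of field this rule points at:

```erlang
-module(unused_field_demo).

-export([new/1, name/1]).

-record(user, {
    name :: binary(),
    %% Has a default value, but no code reads or sets it explicitly.
    legacy_id = 0 :: integer()
}).

%% Writes only the name field.
new(Name) ->
    #user{name = Name}.

%% Reads only the name field.
name(#user{name = Name}) ->
    Name.
```

Removing legacy_id from the record definition would leave both functions working exactly as before.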


Extensibility

Following the lead of Elvis and rebar3_format, we built this project with extensibility in mind. Anybody can write their own rules for their projects by just implementing the hank_rule behavior.

But if you feel like sharing your new rules with the world, we are eager to get community contributions in the rebar3_hank GitHub! Check out the open issues, and feel free to open new ones!
You can also use the discussions page to get in touch with us.


Testing Hank’s Power

To see how powerful Hank was, we decided to test it in a very large codebase.

We decided to try it on Erlang/OTP itself. Since it’s mainly composed of libraries, we had to limit the set of rules applied to avoid some bogus results. We used this configuration:

{hank, [
    {ignore, ["**/test/**"]}, %% Just "production" code, no tests
    {rules, [
        unused_ignored_function_params,
        unused_hrls,
        unused_macros,
        unused_record_fields
    ]}
]}.

We expected to find a large number of warnings, but not nearly as many as we did. Hank found more than 4000 pieces of dead code in OTP’s production code (i.e., we didn’t check the tests).

Indeed, not all of them are supposed to be removed, but to give you a taste of the stuff that Hank found, check out the following warnings…

Unused Fields in Records

Hank found 130 unused fields in records, like this one in erl_tidy or remote_logger here.

Unused Macros

Hank found more than 1000 unused macros in OTP, most of them in large modules of the megaco application and others like this one in xmerl_uri.

Unused Parameters

Hank also found more than 2000 functions with unused params. Some of them are not actually errors, like this one that’s masking a NIF function (which will be fixed soon). But others are worth checking, like this non-exported function that never uses its first argument.

Source: AdRoll