Data classes in Kotlin: how do they impact application size
Kotlin has numerous excellent features: null safety, smart casts, string interpolation and more. However, one of the features developers love the most, I have observed, is data classes. They are so well-loved that they are often used where no data class functionality is required.
In this article, with the help of an experiment, I will try to understand the real cost of using a high number of data classes in an application. I am going to delete all the data classes without breaking the compilation. Then I will share the experiment’s results and outcomes. During the experiment, the application will be broken, but this is not an issue for us because we just want to measure impact.
During the development process, we often create classes whose main purpose is to store data. In Kotlin, these can be declared as data classes to obtain additional functionality: the componentN() functions for destructuring, copy(), toString(), equals() and hashCode().
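As a rough illustration of that additional functionality, the sketch below compares an ordinary class with a data class. The names `PlainUser` and `User` are made up for this example and do not come from the project.

```kotlin
// Hypothetical example classes, used only for illustration.
class PlainUser(val name: String, val age: Int)
data class User(val name: String, val age: Int)

fun main() {
    val a = User("Ann", 30)
    val b = User("Ann", 30)

    println(a == b)                        // true: generated equals() compares fields
    println(a.hashCode() == b.hashCode())  // true: generated hashCode() uses fields
    println(a)                             // User(name=Ann, age=30): generated toString()

    val (name, age) = a                    // destructuring calls component1() and component2()
    println("$name is $age")               // Ann is 30
    println(a.copy(age = 31))              // User(name=Ann, age=31): generated copy()

    // An ordinary class falls back to identity comparison from Any.
    println(PlainUser("Ann", 30) == PlainUser("Ann", 30)) // false
}
```

Each of these conveniences is a method the compiler adds to the class, which is where the size cost discussed in this article comes from.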
But we don’t pay for all of this functionality, not by a long stretch. Release builds are processed by optimisers such as R8, ProGuard and DexGuard. These can delete unused methods, which means they can optimise data classes.
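This is what will be deleted: the componentN() functions and copy(), as long as nothing in the application calls them.
This is what will not be deleted: the generated overrides of the methods declared in Any.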
Thus, toString(), equals() and hashCode() always remain in the release builds.
To measure the impact that application-scale data classes have on the app size, I put forward a hypothesis: not all the data classes in a project are necessary, and they can be replaced with ordinary ones. Since release builds go through an optimiser that can delete the componentX() and copy() methods, transforming the data classes into ordinary ones boils down to removing the generated toString(), equals() and hashCode() methods.
However, this behaviour cannot be implemented manually. The only way to delete these functions from the code is to redefine them in the following form for each data class in the project:
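A minimal sketch of what such a redefinition might look like, assuming a made-up class `Message` (not taken from the project). When a data class declares these three methods explicitly, the compiler does not generate its own versions, and delegating to `super` falls back to the implementations in `Any`.

```kotlin
// Hypothetical data class; the overrides replace the generated methods
// with the default implementations inherited from Any.
data class Message(val id: Long, val text: String) {
    override fun toString(): String = super.toString()
    override fun hashCode(): Int = super.hashCode()
    override fun equals(other: Any?): Boolean = super.equals(other)
}

fun main() {
    val a = Message(1L, "hi")
    val b = Message(1L, "hi")

    println(a == b)                     // false: identity comparison from Any
    println(a.copy(text = "bye").text)  // bye: copy() is still generated
}
```

Note that componentN() and copy() are still generated here; removing them is left to the optimiser.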
Manually for 7749 data classes in the project.
The use of a mono repository for apps exacerbates the situation. This means that I don’t know how many of these 7749 classes I need to change in order to measure the impact of data classes on just one app. So, I have to change everything!
Making this volume of changes manually is impossible, so this is the time to turn to compiler plugins, which are wonderful yet undocumented. We have already described our experience of creating a compiler plugin in the article “Fixing serialization of Kotlin objects once and for all”. There, however, we generated new methods, whereas here we need to delete them.
Sekret, a plugin freely available on GitHub, lets you hide annotated data class fields from toString(). This is what I used as the basis for my new plugin.
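From the point of view of creating a structure for the project, practically nothing has changed. This is what we will need: a Gradle plugin for integration, the compiler plugin itself, and a sample project to check the result.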
The most important part of the Gradle plugin is the KotlinGradleSubplugin declaration. This subplugin will be connected via ServiceLocator. Using the basic Gradle plugin we can configure KotlinGradleSubplugin, which will configure the behaviour of the compiler plugin.
A compiler plugin has two important components: ComponentRegistrar and CommandLineProcessor. The former is responsible for integrating our logic into the compilation stages; the latter, for handling the parameters of our plugin. I won’t describe them in detail here, but you can view the implementation in the repository. I would just like to point out that, unlike the method described in the other article, we will be registering ClassBuilderInterceptorExtension, not ExpressionCodegenExtension.
At this point in time, it is essential to prevent the compiler from creating some methods. For this, we’re going to use DelegatingClassBuilder. It will delegate all the calls to the original ClassBuilder while at the same time allowing us to redefine the behaviour of the newMethod. If we try to create the methods toString(), equals(), hashCode(), then we will return an empty MethodVisitor. The compiler will write code for these methods to it, but it will not get into the class being created.
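The idea can be sketched without the compiler on the classpath. The interfaces `MethodSink` and `ClassSink` below are simplified, hypothetical stand-ins for the compiler's MethodVisitor and ClassBuilder, not the real API, and the method descriptors are ordinary JVM descriptors written out by hand. The real implementation is in the repository.

```kotlin
// Hypothetical stand-in for MethodVisitor: receives the body of one method.
interface MethodSink {
    fun emit(instruction: String)
}

// Hypothetical stand-in for ClassBuilder: creates methods in the class being built.
interface ClassSink {
    fun newMethod(name: String, descriptor: String): MethodSink
}

// Plays the role of the original builder and records what really ends up in the class.
class RecordingClassSink : ClassSink {
    val methods = linkedMapOf<String, MutableList<String>>()

    override fun newMethod(name: String, descriptor: String): MethodSink {
        val body = mutableListOf<String>()
        methods[name + descriptor] = body
        return object : MethodSink {
            override fun emit(instruction: String) {
                body += instruction
            }
        }
    }
}

// Accepts instructions and silently discards them.
object EmptyMethodSink : MethodSink {
    override fun emit(instruction: String) = Unit
}

// Delegates to the original sink, except for the three methods we want to drop.
class FilteringClassSink(private val delegate: ClassSink) : ClassSink {
    private val dropped = setOf(
        "toString()Ljava/lang/String;",
        "hashCode()I",
        "equals(Ljava/lang/Object;)Z",
    )

    override fun newMethod(name: String, descriptor: String): MethodSink =
        if (name + descriptor in dropped) EmptyMethodSink
        else delegate.newMethod(name, descriptor)
}

fun main() {
    val target = RecordingClassSink()
    val builder: ClassSink = FilteringClassSink(target)

    // The "compiler" writes both methods, but only copy reaches the class.
    builder.newMethod("toString", "()Ljava/lang/String;").emit("ARETURN")
    builder.newMethod("copy", "(I)LUser;").emit("ARETURN")

    println(target.methods.keys) // [copy(I)LUser;]
}
```

The caller never notices the difference: it still receives an object to write into, so the code generation itself does not need to change.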
Thus, we intervened in the process of creating data classes and completely excluded the above-mentioned methods from them. You can make sure these methods are no longer there by using code accessible in the sample project. You can also check the JAR/DEX byte code to make sure it doesn’t contain any of these methods.
The entire code is available in the repository, where you will also find an example of plugin integration.
For the purposes of comparison, we will use the Bumble and Badoo release builds. The results were obtained using the Diffuse tool, which outputs detailed information on the difference between two APK files: the sizes of the DEX files and resources, and the number of strings, methods and classes in the DEX file.
The number of data classes was determined heuristically through analysis of the strings deleted from the DEX file.
The toString() implementation for data classes always begins with the short name of the class, an open bracket and the first field of the data class. There is no such thing as a data class without fields.
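The heuristic could be sketched roughly as follows. This assumes the list of strings removed from the DEX file is already available as plain text, and that the first string constant of a generated toString() has the shape `ShortClassName(firstField=`. The function name and the sample strings are invented for this example.

```kotlin
// Assumed shape of the first string constant in a generated toString():
// an identifier, an opening bracket, another identifier and an equals sign.
val dataClassPrefix = Regex("""[A-Za-z_][A-Za-z0-9_]*\([A-Za-z_][A-Za-z0-9_]*=""")

// Counts the removed strings that look like the start of a data class toString().
fun countLikelyDataClasses(removedStrings: List<String>): Int =
    removedStrings.count { dataClassPrefix.matches(it) }

fun main() {
    // Hypothetical sample of strings reported as removed from the DEX file.
    val removed = listOf("User(name=", ", age=", "Message(id=", "Loading failed", ")")
    println(countLikelyDataClasses(removed)) // 2
}
```

Being a heuristic, this can miscount: any unrelated string that happens to have the same shape would be counted as well.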
From the results you can conclude that, on average, each data class accounts for 120 bytes compressed and 400 bytes uncompressed. At first sight this doesn’t seem to be much, so I decided to check how much it adds up to across the application as a whole. It turned out that all the data classes in the project account for 4% of the size of the DEX file.
It is also worth clarifying that, because of the MVI architecture, we tend to use more data classes than applications built on other architectures, so the impact on your application may be smaller.
By no means am I urging you to avoid data classes, but when deciding whether to use one, you need to take every aspect into consideration. Here are some questions which are worth asking before declaring a data class:
We cannot completely refrain from using data classes in the project, and the plugin referred to above breaks the application. Methods were deleted for the sake of assessing the impact of a large number of data classes. In our case, this was 4% of the size of the app’s DEX file.
If you want to assess how much space data classes take up in your application, you can do it yourself using my plugin. If you have carried out the same experiment, please feel free to share your results!
Data classes in Kotlin: how do they impact an application size was originally published in Bumble Tech on Medium, where people are continuing the conversation by highlighting and responding to this story.