Many hyperparameters? Use Gin
Motivation
Common approaches to handling user-specific settings include parsing command line arguments and using separate Python files that store definitions. For small projects, these techniques are easy to incorporate and maintain.
However, once you have to handle a plethora of configurable (hyper-)parameters, you’ll quickly lose track of the situation.
As a starting point, consider this straightforward code snippet:
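A minimal sketch of such a snippet, assuming a hypothetical method1 with three illustrative hyperparameters (the parameter names and values are assumptions, not taken from the original):

```python
def method1(epochs, batch_size, learning_rate):
    """Hypothetical stand-in for a training routine; it only reports its settings."""
    return f"epochs={epochs}, batch_size={batch_size}, learning_rate={learning_rate}"


# Every setting is hard-coded at the call site.
summary = method1(epochs=10, batch_size=32, learning_rate=0.001)
print(summary)
```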
If you only work with a couple of parameters, say five to ten, you are fine with manipulating them via the command line, like so:
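One way this could look with Python's built-in argparse module. The flag names and defaults are illustrative assumptions, and the argument list is passed explicitly so the sketch runs without a real command line:

```python
import argparse


def method1(epochs, batch_size, learning_rate):
    """Hypothetical function that receives the parsed settings."""
    return {"epochs": epochs, "batch_size": batch_size, "learning_rate": learning_rate}


parser = argparse.ArgumentParser(description="Hypothetical training script")
parser.add_argument("--epochs", type=int, default=10)
parser.add_argument("--batch_size", type=int, default=32)
parser.add_argument("--learning_rate", type=float, default=0.001)

# In a real script you would call parser.parse_args() without arguments
# so that the values come from the actual command line.
args = parser.parse_args(["--epochs", "20"])

result = method1(args.epochs, args.batch_size, args.learning_rate)
print(result)
```

On the command line, this would correspond to a call such as python train.py --epochs 20, where train.py is a hypothetical script name.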
Alternatively, you could store these values in a separate Python file and give them meaningful names, as shown in the next snippet. The values are stored in definitions.py (not shown, but the content should be clear). These are then imported and used as follows:
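A sketch of this approach. To keep it self-contained, a SimpleNamespace stands in for the definitions.py module, and the constant names are assumptions:

```python
import types

# Stand-in for definitions.py. In a real project this would be a separate file,
# and you would write: from definitions import EPOCHS, BATCH_SIZE, LEARNING_RATE
definitions = types.SimpleNamespace(EPOCHS=10, BATCH_SIZE=32, LEARNING_RATE=0.001)


def method1(epochs, batch_size, learning_rate):
    """Hypothetical function that receives the imported settings."""
    return {"epochs": epochs, "batch_size": batch_size, "learning_rate": learning_rate}


# Each imported value still has to be handed over explicitly.
result = method1(definitions.EPOCHS, definitions.BATCH_SIZE, definitions.LEARNING_RATE)
print(result)
```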
Now, as you keep adding more and more parameters, you’ll either inflate the command line parsers or have to import everything from a separate file. This setup means you have to keep track of what you have already defined or imported and what you have not. As a result, the situation can quickly spiral out of control:
Even when all values are imported, we still have to pass them to the called function, which forces us to write the same code over and over. Argument parsers lead to a similar result, and they additionally produce very long and chaotic command line calls.
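To illustrate, here is the same pattern with ten hypothetical settings (all names and values are assumptions). Every new parameter has to be defined, imported, and passed along by hand:

```python
import types

# Stand-in for an ever-growing definitions.py (illustrative values only).
definitions = types.SimpleNamespace(
    EPOCHS=10, BATCH_SIZE=32, LEARNING_RATE=0.001, MOMENTUM=0.9,
    WEIGHT_DECAY=0.0001, DROPOUT=0.5, NUM_LAYERS=4, HIDDEN_UNITS=256,
    ACTIVATION="relu", SEED=42,
)


def method1(epochs, batch_size, learning_rate, momentum, weight_decay,
            dropout, num_layers, hidden_units, activation, seed):
    """Hypothetical function; returns its arguments as a dict."""
    return dict(locals())


# The call site grows with every parameter that is added.
result = method1(
    definitions.EPOCHS, definitions.BATCH_SIZE, definitions.LEARNING_RATE,
    definitions.MOMENTUM, definitions.WEIGHT_DECAY, definitions.DROPOUT,
    definitions.NUM_LAYERS, definitions.HIDDEN_UNITS, definitions.ACTIVATION,
    definitions.SEED,
)
print(len(result), "settings passed by hand")
```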
Gin
Enter Gin:
Gin helps you maintain an overview of all your hyperparameters. It achieves this by reading them from a configuration file (similar to the hypothetical definitions.py file). But instead of importing them all and handling them manually, you annotate functions with @gin.configurable, as shown:
Note that we don’t have to set any parameters when calling method1, even though they are all declared mandatory (i.e., we must pass them). This is due to Gin, which parses the config file and supplies each parameter with its stored value. Nonetheless, if you want to set a parameter manually when calling the method, you can do so, and your value overrides Gin’s setting for that particular parameter.
Let’s see how the mentioned config file could look:
As you can see, the parameters are written as <name of the annotated method>.<parameter name>. By following this convention, you can easily associate each value with the place where it is used. This also works for classes, in which case the settings apply to the parameters of the __init__ method. Further, you can bundle all settings into one configuration file. Gin finds the appropriate method through the naming convention, so you do not need one file per annotated method or class.
The magic of Gin unfolds as the number of hyperparameters you have to track grows. You can also combine Gin with command line parsers: parse the most important parameters (such as epochs, batch size, and learning rate) on the command line, and load the more detailed settings from the config file.
You can also maintain two configuration files for easier handling: One with all the default parameters for all annotated methods and classes, and one with your custom settings. This way, you’ll never lose track of your initial values and can easily switch between default and experimental settings.
Conclusion
Gin is an easy-to-use framework. It reduces your code’s complexity while increasing flexibility. All you have to do is move the desired parameter settings to a configuration file, annotate methods and classes with @gin.configurable, and load the configuration at runtime.
Use Gin responsibly.