Classical optimization using Optuna

Context

Optimization has become one of the most important tools in today's digital world. Performance improvements matter in many situations, in the private as well as the corporate sector. Manual or analytical optimization in particular becomes harder and harder as models grow more complex and their number of parameters increases. Therefore, today I will give a short introduction to the field of classical optimization in Python.

Basics

What is the goal of classical optimization?

The goal of an optimization is to maximize or minimize the target value of a given model. This is done by varying the parameters that are defined in the model. The target value classically comes from an objective function, which does not necessarily have to be known beforehand. Different methods can then be used to approach the desired optimum step by step. ¹
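To make the idea of "varying parameters to approach the optimum" concrete, here is a minimal sketch of the simplest possible method, a random search. The objective function, the bounds and the helper name random_search are assumptions chosen purely for illustration; real optimizers use far smarter strategies.

```python
import random


def objective(x: float) -> float:
    # Hypothetical objective function with its minimum at x = 2.
    return (x - 2.0) ** 2


def random_search(func, low, high, n_trials, seed=0):
    """Sample the parameter uniformly and remember the best value seen."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    best_x, best_value = None, float("inf")
    for _ in range(n_trials):
        x = rng.uniform(low, high)  # vary the parameter
        current = func(x)           # evaluate the objective function
        if current < best_value:    # keep the best candidate so far
            best_x, best_value = x, current
    return best_x, best_value


best_x, best_value = random_search(objective, -5.0, 5.0, n_trials=1000)
print(f"best x: {best_x}, best value: {best_value}")
```

The search never looks at the formula of the objective, it only evaluates it. This is why the objective function does not have to be known beforehand.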

A list of different numerical methods that can be used for such a task can be found here.

Depending on the given problem, certain algorithms may not be able to find a solution. For some problems it may even be that no solution can be found, regardless of the algorithm used.
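As a small illustration of an algorithm failing on a problem that does have a solution, the following sketch runs plain gradient descent on the polynomial $$4x^4 - 2x^2 + 7$$, which is the function used in the example later in this article. The step size, the number of steps and the start values are assumptions picked for the demonstration.

```python
def derivative(x: float) -> float:
    # Derivative of 4x^4 - 2x^2 + 7.
    return 16 * x**3 - 4 * x


def gradient_descent(x0: float, learning_rate: float = 0.01, n_steps: int = 500) -> float:
    """Follow the negative gradient for a fixed number of steps."""
    x = x0
    for _ in range(n_steps):
        x -= learning_rate * derivative(x)
    return x


# Started exactly at x = 0 the gradient is zero, so the method never moves,
# although x = 0 is a local maximum and not a minimum.
stuck = gradient_descent(0.0)

# Started at x = 1 the same method converges to a minimum.
found = gradient_descent(1.0)

print(f"start 0.0 -> {stuck}")
print(f"start 1.0 -> {found}")
```

Whether a method succeeds can therefore depend on details such as the start value, not only on the problem itself.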

Implementation of an example of classical optimization

In the following I show what an implementation of a simple example could look like. I'm using the Python library Optuna, which simplifies the implementation considerably. The optimization algorithms do not have to be implemented by hand, because the library provides them. An efficient implementation of these algorithms matters, since it is what makes a fast optimization possible, and Optuna takes care of that as well.

Furthermore, Optuna integrates with many popular machine learning libraries. This will be discussed in another article.

Simple example to get you started

To get started, we use a classic function minimization problem, in this case the function

$$f: \mathbb{R} \rightarrow \mathbb{R}: x \mapsto 4x^4 - 2x^2 + 7.$$

The function can be visualized in the interval $$x \in [-4, 4]$$ as follows:

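From an analytical point of view, the minima are easy to calculate:

$$ \frac{df}{dx}(x) = 16x^3 - 4x = 0 \Leftrightarrow 16x^2 = 4 \Leftrightarrow x = \pm \sqrt{\frac{1}{4}} = \pm \frac{1}{2},$$

under the condition that $$x \neq 0.$$ Because the function is symmetric in $$x$$, it has two equivalent minima at $$x_m = \pm \frac{1}{2}$$ with $$f(x_m) = 6.75$$, since there $$\frac{d^2f}{dx^2}(x_m) = 48x_m^2 - 4 = 8 > 0$$ holds. The remaining stationary point $$x = 0$$ is a local maximum, because $$\frac{d^2f}{dx^2}(0) = -4 < 0.$$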
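The derivation can be double-checked with a few lines of plain Python. This is only a sanity check of the calculus, with the first and second derivative written out by hand. The function names are chosen for this sketch.

```python
def f(x: float) -> float:
    return 4 * x**4 - 2 * x**2 + 7


def df(x: float) -> float:
    # First derivative: 16x^3 - 4x
    return 16 * x**3 - 4 * x


def d2f(x: float) -> float:
    # Second derivative: 48x^2 - 4
    return 48 * x**2 - 4


# Inspect all three stationary points of f.
for x in (-0.5, 0.0, 0.5):
    kind = "minimum" if d2f(x) > 0 else "maximum"
    print(f"x = {x:+.1f}: f = {f(x)}, f' = {df(x)}, f'' = {d2f(x)} -> {kind}")
```

Both $$x = -\frac{1}{2}$$ and $$x = \frac{1}{2}$$ come out as minima with the same function value, so an optimizer may legitimately return either of them.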

The implementation with Optuna looks as follows:

import optuna

def value(x):
    return (4*(x**4) - 2*(x**2) + 7)

def objective(trial: optuna.trial.Trial):
    x = trial.suggest_float("x", -4, 4)
    return value(x)

study = optuna.create_study(study_name="Polynomial Example", direction="minimize")
max_trials = 250
study.optimize(objective, n_trials=max_trials)

The central piece for Optuna is the objective function in the code. It returns a scalar value, which Optuna either maximizes or minimizes depending on the direction given to the study. The function receives an optuna.trial.Trial object, through which the parameters of the current optimization step are requested. This is done with functions like trial.suggest_float, which take the name of the parameter as well as its lower and upper bound as arguments. For more functions for optimization, see here.

In the polynomial example, the value of the polynomial at location $$x$$ (which is the optimization parameter) is returned. When the code is executed, Optuna starts its work and minimizes the function value by adjusting our parameter $$x$$. The bounds for the parameter were set as $$x \in [-4, 4]$$. On completion we should have a value of $$x \approx 0.5$$ or $$x \approx -0.5$$, since both minima are equally good.

For larger projects, it is a good idea at this point to read out the best value and the parameter values of the best trial. This can be done in the following way:

trial = study.best_trial
print("Best value: ", trial.value)
for key, param in trial.params.items():
    print(f"{key}: {param}")

For the example above, a sample output looks like this:

Best value:  6.750018072628766
x: 0.4978698668063428

The following graphic shows the corresponding points that were used in the optimization:

TL;DR

Optimization libraries such as Optuna are excellent not only for hyperparameter optimization of neural networks, but also for more classical, everyday optimization tasks. They do not require much prior knowledge, only an objective function that can be evaluated for the optimization problem. Because of its stability and ease of use, I personally think Optuna is especially suited for this. If for some reason Optuna is not optimal for your own application, you can fall back on a variety of other optimization libraries, such as SciPy Optimize.
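For comparison, this is roughly what the same polynomial problem could look like with SciPy Optimize. The sketch uses scipy.optimize.minimize_scalar with the bounded method. The choice of this particular method is an assumption made for the example, not a recommendation.

```python
from scipy.optimize import minimize_scalar


def value(x: float) -> float:
    return 4 * x**4 - 2 * x**2 + 7


# Search for a minimum of the polynomial inside the interval [-4, 4].
result = minimize_scalar(value, bounds=(-4, 4), method="bounded")

print("x:    ", result.x)
print("value:", result.fun)
```

Like Optuna, this local method returns only one of the two equivalent minima, and which one it finds depends on how the method walks through the interval.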

Many libraries can be compared here.

Sources

¹ wikipedia.org