Models and Optimizers

In the previous notebooks we demonstrated two major building blocks of the ESCAPE package: parameters and functors. In this notebook we discuss the objects relevant to optimization.

Kernel

The kernel does a small but very important job: it vectorizes a given functor, i.e. it evaluates the functor over successive elements of the input array. It also supports multithreading, keeping all CPU cores busy.
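The idea can be sketched as follows. This is not the ESCAPE API, just a minimal illustration of a "kernel" that evaluates a scalar functor over an array, splitting the work across CPU threads; the functor and function names are made up for the example.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def make_kernel(functor, nthreads=4):
    """Return a callable that applies `functor` element-wise to an array,
    distributing chunks of the input across worker threads."""
    def kernel(x):
        x = np.asarray(x, dtype=float)
        chunks = np.array_split(x, nthreads)
        with ThreadPoolExecutor(max_workers=nthreads) as pool:
            parts = pool.map(lambda c: np.array([functor(v) for v in c]), chunks)
        return np.concatenate(list(parts))
    return kernel

# Example functor: a Gaussian profile.
gauss = lambda x: np.exp(-0.5 * x**2)
kernel = make_kernel(gauss)
y = kernel(np.linspace(-3.0, 3.0, 1000))
```

A real kernel would of course release the GIL (or use native threads) so the cores are genuinely busy; the sketch only shows the vectorization contract.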

Below we create two functors with shared parameters and the corresponding kernels. We use these kernels to generate the "experimental data" which we will fit later.

Next we create data objects from the generated arrays with added Poisson noise.
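The data-generation step can be sketched like this (the model curve is a hypothetical example, not taken from the notebook):

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 200)

# Hypothetical "true" curve: a Gaussian peak on a flat background.
y_true = 100.0 * np.exp(-0.5 * ((x - 5.0) / 1.2) ** 2) + 10.0

# Counting experiments yield Poisson-distributed intensities,
# so we draw each point from a Poisson distribution with mean y_true.
y_noisy = rng.poisson(y_true).astype(float)
```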

Model

A model is a container for the kernel and the experimental data. It is responsible for calculating the residuals and the cost, which are needed to optimize the parameters provided by the user. The model has two settings, weight_type=("none", "data") and residuals_scale=("none", "log"). The latter controls how residuals are calculated, and the former selects which weights are used to calculate the cost.
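A minimal sketch of how such residuals and cost might be computed under these two settings (illustrative only; this is not the ESCAPE implementation, and the weighting convention is an assumption):

```python
import numpy as np

def residuals(theory, data, weight_type="none", residuals_scale="none"):
    theory = np.asarray(theory, dtype=float)
    data = np.asarray(data, dtype=float)
    if residuals_scale == "log":
        # Compare on a logarithmic scale; clip to avoid log of non-positive values.
        r = np.log(np.clip(theory, 1e-12, None)) - np.log(np.clip(data, 1e-12, None))
    else:
        r = theory - data
    if weight_type == "data":
        # Assumed Poisson-style weighting: divide by sqrt(counts).
        r = r / np.sqrt(np.clip(data, 1.0, None))
    return r

def cost(theory, data, **kw):
    # Cost is the sum of squared (weighted) residuals.
    r = residuals(theory, data, **kw)
    return float(np.sum(r ** 2))
```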

Optimizer

Currently there are two optimization algorithms supported by ESCAPE.

The Levenberg-Marquardt algorithm

It has long been the standard method for non-linear data fitting and is still very useful for many applications. Starting from the initial value of the cost function, this gradient-based trust-region method steps in the direction of the derivative until it reaches a local minimum. The cost function is the sum of squared differences between theory and data. The algorithm uses a numerical approximation of the Jacobian matrix to set the step direction and an adaptive scheme for the size of the trust region.

One should use this method when a reasonable fit exists near the minimum and one would like to obtain the best values of the optimized parameters. Compared to stochastic methods, LM converges much faster.
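As a stand-in for the ESCAPE optimizer, the same kind of fit can be sketched with SciPy's Levenberg-Marquardt implementation; the data and model below are made up for the example.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 200)

# Synthetic data: Gaussian peak (amplitude 100, center 5, sigma 1.2)
# on a background of 10, with Poisson counting noise.
data = rng.poisson(
    100.0 * np.exp(-0.5 * ((x - 5.0) / 1.2) ** 2) + 10.0
).astype(float)

def resid(p):
    a, mu, s = p
    theory = a * np.exp(-0.5 * ((x - mu) / s) ** 2) + 10.0
    return theory - data

# method="lm" selects the Levenberg-Marquardt algorithm.
fit = least_squares(resid, x0=[80.0, 4.5, 1.0], method="lm")
```

Note that LM needs a reasonable starting point (here x0) to land in the right local minimum, which is exactly why the parameter "shaking" below stays within the parameter limits.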

Before optimization we "shake" the values of our parameters: the starting values are chosen randomly, but within the parameter limits. The parameters to be optimized have to be explicitly selected by the user as shown below.
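The shaking step amounts to a uniform draw within each parameter's limits; a sketch, with hypothetical parameter names (not the ESCAPE API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fit parameters with (lower, upper) limits.
limits = {"amplitude": (50.0, 150.0), "center": (3.0, 7.0), "sigma": (0.5, 3.0)}

def shake(limits, rng):
    """Draw a random starting value for each parameter within its limits."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in limits.items()}

start = shake(limits, rng)
```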

Differential evolution

Differential evolution is a stochastic, population-based algorithm which uses differences between solution points as a guide for selecting new candidate points. For each member of the population, a difference vector is computed from a pair of randomly chosen points. Based on the crossover setting, a random subset of the vector's components is added to the current point. If the cost of this new point is lower than the cost of the current point, it replaces the current point. Convergence with differential evolution is slower, but more robust.

ESCAPE users can boost the convergence speed of the DE algorithm using the following polishing settings:

- polish_final_maxiter - maximum number of iterations for the LM algorithm run after DE
- polish_candidate_maxiter - maximum number of iterations for the LM algorithm run on each candidate

Each time a candidate is chosen by the DE algorithm, the LM method is run to slightly improve its value in the direction of the derivative. We recommend using a small number of iterations for candidate polishing.
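The overall DE-then-polish idea can be emulated with SciPy in place of the ESCAPE optimizers (the polish_* names above are ESCAPE settings; here we approximate them with a global DE search followed by a short least-squares run, on a made-up model):

```python
import numpy as np
from scipy.optimize import differential_evolution, least_squares

rng = np.random.default_rng(1)
x = np.linspace(-5.0, 5.0, 100)
# Synthetic data: Gaussian (amplitude 3, center 1, sigma 0.8) + small noise.
data = 3.0 * np.exp(-0.5 * ((x - 1.0) / 0.8) ** 2) + rng.normal(0.0, 0.05, x.size)

def model(p, x):
    a, mu, s = p
    return a * np.exp(-0.5 * ((x - mu) / s) ** 2)

def cost(p):
    return float(np.sum((model(p, x) - data) ** 2))

bounds = [(0.1, 10.0), (-5.0, 5.0), (0.1, 5.0)]

# Global DE search; polish=False disables SciPy's built-in final polish
# so we can run the polishing step explicitly ourselves.
de = differential_evolution(cost, bounds, seed=1, polish=False)

# Final LM polish with a small iteration budget, mimicking a
# polish_final_maxiter-style limit.
lm = least_squares(lambda p: model(p, x) - data, de.x, method="lm", max_nfev=50)
```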