# $$User\ Defined\ Metrics\ Tutorial$$

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/catboost/tutorials/blob/master/custom_loss/custom_loss_and_metric_tutorial.ipynb)

# Contents
* [1. Introduction](#1.\-Introduction)
* [2. Classification](#2.\-Classification)
* [3. Regression](#3.\-Regression)
* [4. Multiclassification](#4.\-Multiclassification)

# 1. Introduction

CatBoost lets you define your own loss functions and metrics and pass them to a model. To do this, implement classes with the special interfaces shown below.

##### Interface for user defined objectives:

In [1]:
class UserDefinedObjective(object):
    def calc_ders_range(self, approxes, targets, weights):
        # approxes, targets, weights are indexed containers of floats
        # (containers which have only __len__ and __getitem__ defined).
        # weights parameter can be None.
        #
        # To understand what these parameters mean, assume that there is
        # a subset of your dataset that is currently being processed.
        # approxes contains current predictions for this subset,
        # targets contains target values you provided with the dataset.
        #
        # This function should return a list of pairs (der1, der2), where
        # der1 is the first derivative of the loss function with respect
        # to the predicted value, and der2 is the second derivative.
        pass
    
class UserDefinedMultiClassObjective(object):
    def calc_ders_multi(self, approxes, target, weight):
        # approxes - indexed container of floats with predictions 
        #            for each dimension of single object
        # target - contains a single expected value
        # weight - contains weight of the object
        #
        # This function should return a tuple (der1, der2), where
        # - der1 is a list-like object of first derivatives of the loss function with respect
        # to the predicted value for each dimension.
        # - der2 is a matrix of second derivatives.
        pass

##### Interface for user defined metrics:

In [2]:
class UserDefinedMetric(object):
    def is_max_optimal(self):
        # Returns whether greater values of the metric are better
        pass

    def evaluate(self, approxes, target, weight):
        # approxes is a list of indexed containers
        # (containers with only __len__ and __getitem__ defined),
        # one container per approx dimension.
        # Each container contains floats.
        # weight is a one dimensional indexed container.
        # target is a one dimensional indexed container.
        
        # weight parameter can be None.
        # Returns pair (error, weights sum)
        pass
    
    def get_final_error(self, error, weight):
        # Returns final value of metric based on error and weight
        pass
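
To illustrate how the three methods fit together, here is a minimal sketch of a weighted mean absolute error metric, called by hand on small lists. The class name `MaeMetricSketch` and the sample values are assumptions made for this example. The sketch also assumes, following the interface comments above, that `evaluate` returns the pair of sums that is later passed to `get_final_error`.

```python
class MaeMetricSketch(object):
    # Hypothetical example metric (not part of CatBoost): weighted mean absolute error.
    def is_max_optimal(self):
        # Smaller error is better.
        return False

    def evaluate(self, approxes, target, weight):
        # One container per approx dimension; regression has a single dimension.
        assert len(approxes) == 1
        approx = approxes[0]
        error_sum = 0.0
        weight_sum = 0.0
        for i in range(len(approx)):
            w = 1.0 if weight is None else weight[i]
            weight_sum += w
            error_sum += w * abs(approx[i] - target[i])
        return error_sum, weight_sum

    def get_final_error(self, error, weight):
        # Turn the accumulated sums into the final metric value.
        return error / (weight + 1e-38)


# Call the metric by hand on made-up data, with no weights.
metric = MaeMetricSketch()
error_sum, weight_sum = metric.evaluate([[1.0, 2.0, 4.0]], [1.0, 3.0, 2.0], None)
final_error = metric.get_final_error(error_sum, weight_sum)
```

Returning sums rather than a finished value lets the caller combine partial results before the final division.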

Below we consider examples of user defined metrics for different types of tasks. We will use the following variables:
<center>$a$ - approx value</center>
<center>$p$ - probability</center>
<center>$t$ - target</center>
<center>$w$ - weight</center>

In [3]:
# import necessary packages
from catboost import CatBoostClassifier, CatBoostRegressor
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.model_selection import train_test_split

# 2. Classification

Note: for binary classification problems, approxes are not probabilities. Probabilities are computed from approxes with the sigmoid function.
<h4><center>$p=\frac{1}{1 + e^{-a}}=\frac{e^a}{1 + e^a}$</center></h4>
As an example, let's take the Logloss metric, which is defined by the following formulas:
<h4><center>$Logloss_i = -{w_i * (t_i * log(p_i) + (1 - t_i) * log(1 - p_i))}$</center></h4>
<h4><center>$Logloss = \frac{\sum_{i=1}^{N}{Logloss_i}}{\sum_{i=1}^{N}{w_i}}$</center></h4>
Logloss is differentiable, so it can also be used as an objective. Note the sign: the objective implemented below returns the derivatives of the negated loss, $-Logloss_i$ (the weighted log-likelihood), with respect to the approx. For a single object they are:
<h4><center>$\frac{\delta(-Logloss_i)}{\delta(a_i)} = w_i * (t_i - p_i)$</center></h4>
<h4><center>$\frac{\delta^2(-Logloss_i)}{\delta(a_i^2)} = -w_i * p_i * (1 - p_i)$</center></h4>
The Logloss objective and metric are implemented below.

In [4]:
class LoglossObjective(object):
    def calc_ders_range(self, approxes, targets, weights):
        assert len(approxes) == len(targets)
        if weights is not None:
            assert len(weights) == len(approxes)
        
        result = []
        for index in range(len(targets)):
            e = np.exp(approxes[index])
            p = e / (1 + e)
            der1 = targets[index] - p
            der2 = -p * (1 - p)

            if weights is not None:
                der1 *= weights[index]
                der2 *= weights[index]

            result.append((der1, der2))
        return result
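
A quick way to gain confidence in hand-derived derivatives is to compare them with finite differences. The sketch below does this for one object, under the assumption (consistent with the signs in `LoglossObjective`) that the differentiated quantity is the weighted log-likelihood, i.e. the negated Logloss. The helper names `neg_logloss` and `analytic_ders` and the sample values are made up for this check.

```python
import numpy as np

def neg_logloss(a, t, w=1.0):
    # Weighted log-likelihood of one object, i.e. the negated per-object Logloss.
    p = 1.0 / (1.0 + np.exp(-a))
    return w * (t * np.log(p) + (1 - t) * np.log(1 - p))

def analytic_ders(a, t, w=1.0):
    # Same expressions as in LoglossObjective: der1 = w * (t - p), der2 = -w * p * (1 - p).
    p = 1.0 / (1.0 + np.exp(-a))
    return w * (t - p), -w * p * (1 - p)

# Hypothetical approx, target, weight and step size.
a, t, w, h = 0.3, 1.0, 2.0, 1e-4

# Central differences for the first and second derivatives.
num_der1 = (neg_logloss(a + h, t, w) - neg_logloss(a - h, t, w)) / (2 * h)
num_der2 = (neg_logloss(a + h, t, w) - 2 * neg_logloss(a, t, w)
            + neg_logloss(a - h, t, w)) / h**2

der1, der2 = analytic_ders(a, t, w)
```

If the numerical and analytic values disagree by more than a small tolerance, the derivation or the sign convention is the first thing to re-check.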

In [5]:
class LoglossMetric(object):
    def get_final_error(self, error, weight):
        return error / (weight + 1e-38)

    def is_max_optimal(self):
        return False

    def evaluate(self, approxes, target, weight):
        assert len(approxes) == 1
        assert len(target) == len(approxes[0])

        approx = approxes[0]

        error_sum = 0.0
        weight_sum = 0.0

        for i in range(len(approx)):
            e = np.exp(approx[i])
            p = e / (1 + e)
            w = 1.0 if weight is None else weight[i]
            weight_sum += w
            error_sum += -w * (target[i] * np.log(p) + (1 - target[i]) * np.log(1 - p))

        return error_sum, weight_sum
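
For very large positive approxes, `np.exp(approx[i])` overflows in float64 and `e / (1 + e)` is no longer a valid probability. This does not matter for the small example here, but a possible alternative is to compute the two logarithms directly with `np.logaddexp`. The function name `stable_logloss_terms` is an assumption for this sketch, and it works on plain sequences rather than on CatBoost's containers.

```python
import numpy as np

def stable_logloss_terms(approx, target):
    # log(p)     = -log(1 + e^{-a})
    # log(1 - p) = -log(1 + e^{a})
    # Both are computed with logaddexp, which avoids overflow in exp.
    approx = np.asarray(approx, dtype=float)
    target = np.asarray(target, dtype=float)
    log_p = -np.logaddexp(0.0, -approx)
    log_1mp = -np.logaddexp(0.0, approx)
    # Unweighted per-object Logloss terms.
    return -(target * log_p + (1 - target) * log_1mp)

small = stable_logloss_terms([0.0], [1.0])      # p = 0.5, so the term is log(2)
large = stable_logloss_terms([1000.0], [0.0])   # stays finite for a huge approx
```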

Below are examples of training with the built-in Logloss function and with our Logloss objective and metric. As the logs show, the learn and test values are identical.

In [6]:
X, y = make_classification(n_classes=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

In [7]:
model1 = CatBoostClassifier(iterations=10, loss_function='Logloss', eval_metric='Logloss',
                            learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,
                            leaf_estimation_iterations=1, leaf_estimation_method='Gradient')
model1.fit(X_train, y_train, eval_set=(X_test, y_test))

0:	learn: 0.6900380	test: 0.6907175	best: 0.6907175 (0)	total: 49.5ms	remaining: 446ms
1:	learn: 0.6866060	test: 0.6873479	best: 0.6873479 (1)	total: 51.8ms	remaining: 207ms
2:	learn: 0.6835392	test: 0.6852325	best: 0.6852325 (2)	total: 54.1ms	remaining: 126ms
3:	learn: 0.6804590	test: 0.6829075	best: 0.6829075 (3)	total: 56.4ms	remaining: 84.6ms
4:	learn: 0.6776740	test: 0.6816999	best: 0.6816999 (4)	total: 58.6ms	remaining: 58.6ms
5:	learn: 0.6749116	test: 0.6794533	best: 0.6794533 (5)	total: 61.8ms	remaining: 41.2ms
6:	learn: 0.6712701	test: 0.6772634	best: 0.6772634 (6)	total: 65ms	remaining: 27.8ms
7:	learn: 0.6681755	test: 0.6747041	best: 0.6747041 (7)	total: 68.2ms	remaining: 17ms
8:	learn: 0.6658881	test: 0.6732683	best: 0.6732683 (8)	total: 71.3ms	remaining: 7.93ms
9:	learn: 0.6633931	test: 0.6720979	best: 0.6720979 (9)	total: 73.7ms	remaining: 0us

bestTest = 0.6720978617
bestIteration = 9



<catboost.core.CatBoostClassifier at 0x7f10798d22e8>

In [8]:
model2 = CatBoostClassifier(iterations=10, loss_function=LoglossObjective(), eval_metric=LoglossMetric(), 
                            learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,
                            leaf_estimation_iterations=1, leaf_estimation_method='Gradient')
model2.fit(X_train, y_train, eval_set=(X_test, y_test))

0:	learn: 0.6900380	test: 0.6907175	best: 0.6907175 (0)	total: 4.36ms	remaining: 39.2ms
1:	learn: 0.6866060	test: 0.6873479	best: 0.6873479 (1)	total: 9.44ms	remaining: 37.8ms
2:	learn: 0.6835392	test: 0.6852325	best: 0.6852325 (2)	total: 15.2ms	remaining: 35.5ms
3:	learn: 0.6804590	test: 0.6829075	best: 0.6829075 (3)	total: 19.8ms	remaining: 29.6ms
4:	learn: 0.6776740	test: 0.6816999	best: 0.6816999 (4)	total: 24.5ms	remaining: 24.5ms
5:	learn: 0.6749116	test: 0.6794533	best: 0.6794533 (5)	total: 29.2ms	remaining: 19.5ms
6:	learn: 0.6712701	test: 0.6772634	best: 0.6772634 (6)	total: 34.8ms	remaining: 14.9ms
7:	learn: 0.6681755	test: 0.6747041	best: 0.6747041 (7)	total: 40ms	remaining: 10ms
8:	learn: 0.6658881	test: 0.6732683	best: 0.6732683 (8)	total: 45.2ms	remaining: 5.03ms
9:	learn: 0.6633931	test: 0.6720979	best: 0.6720979 (9)	total: 50.6ms	remaining: 0us

bestTest = 0.6720978617
bestIteration = 9



<catboost.core.CatBoostClassifier at 0x7f10798d2048>

# 3. Regression

For regression, approxes need no transformation. As an example of a regression loss function and metric, we take the well-known RMSE, which is defined as follows:
<h3><center>$RMSE = \sqrt{\frac{\sum_{i=1}^{N}{w_i * (t_i - a_i)^2}}{\sum_{i=1}^{N}{w_i}}}$</center></h3>
The objective implemented below returns, for each object, the derivatives of the negated weighted squared error $-\frac{1}{2} w_i (t_i - a_i)^2$ with respect to the approx:
<h4><center>$\frac{\delta(-\frac{1}{2} w_i (t_i - a_i)^2)}{\delta(a_i)} = w_i * (t_i - a_i)$</center></h4>
<h4><center>$\frac{\delta^2(-\frac{1}{2} w_i (t_i - a_i)^2)}{\delta(a_i^2)} = -w_i$</center></h4>

In [9]:
class RmseObjective(object):
    def calc_ders_range(self, approxes, targets, weights):
        assert len(approxes) == len(targets)
        if weights is not None:
            assert len(weights) == len(approxes)
        
        result = []
        for index in range(len(targets)):
            der1 = targets[index] - approxes[index]
            der2 = -1

            if weights is not None:
                der1 *= weights[index]
                der2 *= weights[index]

            result.append((der1, der2))
        return result

In [10]:
class RmseMetric(object):
    def get_final_error(self, error, weight):
        return np.sqrt(error / (weight + 1e-38))

    def is_max_optimal(self):
        return False

    def evaluate(self, approxes, target, weight):
        assert len(approxes) == 1
        assert len(target) == len(approxes[0])

        approx = approxes[0]

        error_sum = 0.0
        weight_sum = 0.0

        for i in range(len(approx)):
            w = 1.0 if weight is None else weight[i]
            weight_sum += w
            error_sum += w * ((approx[i] - target[i])**2)

        return error_sum, weight_sum
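
The explicit Python loop keeps the metric readable but can be slow on large datasets. As a sketch only, the same sums can be computed with NumPy array operations. The function name `rmse_sums` is hypothetical, and the code assumes the inputs can be converted with `np.asarray`, which holds for the plain lists used here.

```python
import numpy as np

def rmse_sums(approx, target, weight=None):
    # Vectorized version of the loop in RmseMetric.evaluate:
    # returns (sum of weighted squared errors, sum of weights).
    approx = np.asarray(approx, dtype=float)
    target = np.asarray(target, dtype=float)
    if weight is None:
        weight = np.ones_like(approx)
    else:
        weight = np.asarray(weight, dtype=float)
    error_sum = float(np.sum(weight * (approx - target) ** 2))
    weight_sum = float(np.sum(weight))
    return error_sum, weight_sum

# Made-up data: squared errors are 0, 4 and 16, with weights 1, 1 and 2.
error_sum, weight_sum = rmse_sums([1.0, 2.0, 5.0], [1.0, 4.0, 1.0], [1.0, 1.0, 2.0])
rmse = np.sqrt(error_sum / weight_sum)
```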

Below are examples of training with the built-in RMSE function and with our RMSE objective and metric. As the logs show, the learn and test values are identical.

In [11]:
X, y = make_regression(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

In [12]:
model1 = CatBoostRegressor(iterations=10, loss_function='RMSE', eval_metric='RMSE',
                           learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,
                           leaf_estimation_iterations=1, leaf_estimation_method='Gradient')
model1.fit(X_train, y_train, eval_set=(X_test, y_test))

0:	learn: 128.6631656	test: 140.6536718	best: 140.6536718 (0)	total: 3.86ms	remaining: 34.8ms
1:	learn: 128.0351695	test: 140.7369887	best: 140.6536718 (0)	total: 8.51ms	remaining: 34ms
2:	learn: 126.7781283	test: 141.0444768	best: 140.6536718 (0)	total: 11.6ms	remaining: 27.2ms
3:	learn: 125.7603646	test: 141.1458855	best: 140.6536718 (0)	total: 15.9ms	remaining: 23.8ms
4:	learn: 124.6922146	test: 141.0856002	best: 140.6536718 (0)	total: 18.6ms	remaining: 18.6ms
5:	learn: 123.6667350	test: 141.0495141	best: 140.6536718 (0)	total: 21.1ms	remaining: 14.1ms
6:	learn: 122.7210914	test: 140.8511986	best: 140.6536718 (0)	total: 23.7ms	remaining: 10.2ms
7:	learn: 121.8418528	test: 140.7646996	best: 140.6536718 (0)	total: 26.3ms	remaining: 6.58ms
8:	learn: 121.0103984	test: 140.4834561	best: 140.4834561 (8)	total: 28.9ms	remaining: 3.21ms
9:	learn: 119.9286951	test: 140.2935285	best: 140.2935285 (9)	total: 31.5ms	remaining: 0us

bestTest = 140.2935285
bestIteration = 9



<catboost.core.CatBoostRegressor at 0x7f10a84f9c50>

In [13]:
model2 = CatBoostRegressor(iterations=10, loss_function=RmseObjective(), eval_metric=RmseMetric(),
                           learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,
                           leaf_estimation_iterations=1, leaf_estimation_method='Gradient')
model2.fit(X_train, y_train, eval_set=(X_test, y_test))

0:	learn: 128.6631656	test: 140.6536718	best: 140.6536718 (0)	total: 4.01ms	remaining: 36.1ms
1:	learn: 128.0351695	test: 140.7369887	best: 140.6536718 (0)	total: 6.72ms	remaining: 26.9ms
2:	learn: 126.7781283	test: 141.0444768	best: 140.6536718 (0)	total: 9.52ms	remaining: 22.2ms
3:	learn: 125.7603646	test: 141.1458855	best: 140.6536718 (0)	total: 12.2ms	remaining: 18.3ms
4:	learn: 124.6922146	test: 141.0856002	best: 140.6536718 (0)	total: 17.5ms	remaining: 17.5ms
5:	learn: 123.6667350	test: 141.0495141	best: 140.6536718 (0)	total: 20.6ms	remaining: 13.7ms
6:	learn: 122.7210914	test: 140.8511986	best: 140.6536718 (0)	total: 23.4ms	remaining: 10ms
7:	learn: 121.8418528	test: 140.7646996	best: 140.6536718 (0)	total: 26.4ms	remaining: 6.59ms
8:	learn: 121.0103984	test: 140.4834561	best: 140.4834561 (8)	total: 30.5ms	remaining: 3.39ms
9:	learn: 119.9286951	test: 140.2935285	best: 140.2935285 (9)	total: 35.2ms	remaining: 0us

bestTest = 140.2935285
bestIteration = 9



<catboost.core.CatBoostRegressor at 0x7f1079365080>

# 4. Multiclassification

Note: for multiclassification problems, approxes are not probabilities. Approxes are usually transformed into probabilities with the softmax function.
<h3><center>$p_{i,c} = \frac{e^{a_{i,c}}}{\sum_{j=1}^k{e^{a_{i,j}}}}$</center></h3>
<center>$p_{i,c}$ - the probability that $x_i$ belongs to class $c$</center>
<center>$k$ - number of classes</center>
<center>$a_{i,j}$ - approx for object $x_i$ for class $j$</center>
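
The softmax formula above can be sketched in a few lines of NumPy for a single object. Subtracting the maximum approx before exponentiating does not change the result, because the common factor cancels in the ratio, but it keeps `np.exp` from overflowing. The function name `softmax` and the sample approxes are chosen for this example.

```python
import numpy as np

def softmax(approx):
    # approx holds one raw value per class for a single object.
    approx = np.asarray(approx, dtype=float)
    shifted = approx - approx.max()   # largest entry becomes 0
    exp_shifted = np.exp(shifted)
    return exp_shifted / exp_shifted.sum()

probs = softmax([1.0, 2.0, 3.0])
# Adding the same constant to every approx gives the same probabilities.
probs_shifted = softmax([1001.0, 1002.0, 1003.0])
```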

Let's implement the MultiClass objective, which is defined as follows:
<h3><center>$MultiClass_i = w_i * \log{p_{i,t_i}}$</center></h3>
<h3><center>$MultiClass = \frac{\sum_{i=1}^{N}MultiClass_i}{\sum_{i=1}^{N}w_i}$</center></h3>

Its first and second derivatives with respect to the approxes of a single object are:

<h3><center>$\frac{\delta(MultiClass_i)}{\delta{a_{i,c}}} = \begin{cases} 
w_i-\frac{w_i*e^{a_{i,c}}}{\sum_{j=1}^{k}e^{a_{i,j}}}, & \mbox{if } c = t_i \\ 
-\frac{w_i*e^{a_{i,c}}}{\sum_{j=1}^{k}e^{a_{i,j}}}, & \mbox{if } c \neq t_i 
\end{cases}$</center></h3>

<h3><center>$\frac{\delta^2(MultiClass_i)}{\delta{a_{i,c_1}}\delta{a_{i,c_2}}} = \begin{cases} 
\frac{w_i*e^{2*a_{i,c_1}}}{(\sum_{j=1}^{k}e^{a_{i,j}})^2} - \frac{w_i*e^{a_{i, c_1}}}{\sum_{j=1}^{k}e^{a_{i,j}}}, & \mbox{if } c_1 = c_2 \\ 
\frac{w_i*e^{a_{i,c_1}+a_{i,c_2}}}{(\sum_{j=1}^{k}e^{a_{i,j}})^2}, & \mbox{if } c_1 \neq c_2 
\end{cases}$</center></h3>

In [14]:
class MultiClassObjective(object):
    def calc_ders_multi(self, approx, target, weight):
        approx = np.array(approx) - max(approx)
        exp_approx = np.exp(approx)
        exp_sum = exp_approx.sum()
        grad = []
        hess = []
        for j in range(len(approx)):
            der1 = -exp_approx[j] / exp_sum
            if j == target:
                der1 += 1
            hess_row = []
            for j2 in range(len(approx)):
                der2 = exp_approx[j] * exp_approx[j2] / (exp_sum**2)
                if j2 == j:
                    der2 -= exp_approx[j] / exp_sum
                hess_row.append(der2 * weight)
                
            grad.append(der1 * weight)
            hess.append(hess_row)
            
        return (grad, hess)
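
The gradient and Hessian can be checked numerically in the same spirit as for a scalar loss. The sketch below rewrites the formulas in vectorized form and compares them with central differences for one object. The helper names `weighted_log_prob` and `analytic_ders` and all sample values are assumptions for this check, not part of CatBoost.

```python
import numpy as np

def weighted_log_prob(approx, target, weight):
    # MultiClass_i = w_i * log(p_{i,t_i}), computed with a max shift for stability.
    shifted = approx - approx.max()
    return weight * (shifted[target] - np.log(np.exp(shifted).sum()))

def analytic_ders(approx, target, weight):
    # Gradient and Hessian from the formulas above, in vectorized form.
    p = np.exp(approx - approx.max())
    p = p / p.sum()
    grad = -weight * p
    grad[target] += weight
    hess = weight * (np.outer(p, p) - np.diag(p))
    return grad, hess

# Hypothetical values for one object with three classes.
approx, target, weight, h = np.array([0.2, -0.5, 1.0]), 2, 1.5, 1e-5
grad, hess = analytic_ders(approx, target, weight)

num_grad = np.zeros(3)
num_hess = np.zeros((3, 3))
for c in range(3):
    step = np.zeros(3)
    step[c] = h
    # Central difference of the objective along a_c.
    num_grad[c] = (weighted_log_prob(approx + step, target, weight)
                   - weighted_log_prob(approx - step, target, weight)) / (2 * h)
    # Column c of the Hessian is the derivative of the gradient along a_c.
    num_hess[:, c] = (analytic_ders(approx + step, target, weight)[0]
                      - analytic_ders(approx - step, target, weight)[0]) / (2 * h)

max_grad_gap = np.abs(num_grad - grad).max()
max_hess_gap = np.abs(num_hess - hess).max()
```

Both gaps should be close to zero, and the Hessian should be symmetric, which matches the two-case formula for the second derivatives.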

In [15]:
class AccuracyMetric(object):
    def get_final_error(self, error, weight):
        return error / (weight + 1e-38)

    def is_max_optimal(self):
        return True

    def evaluate(self, approxes, target, weight):
        best_class = np.argmax(approxes, axis=0)
        
        accuracy_sum = 0
        weight_sum = 0 

        for i in range(len(target)):
            w = 1.0 if weight is None else weight[i]
            weight_sum += w
            accuracy_sum += w * (best_class[i] == target[i])

        return accuracy_sum, weight_sum
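
The `axis=0` argument matters because, per the metric interface, `approxes` holds one container per class, so each column corresponds to one object. The toy data below is made up to show the layout that `AccuracyMetric.evaluate` assumes.

```python
import numpy as np

# Hypothetical approxes for 3 classes and 4 objects: one row per class.
approxes = [
    [2.0, 0.1, -1.0, 0.3],   # class 0
    [0.5, 1.5, 0.0, 0.2],    # class 1
    [-1.0, 0.3, 2.0, 0.9],   # class 2
]
target = [0, 1, 2, 0]

# For each object (column), pick the class (row) with the largest approx.
best_class = np.argmax(approxes, axis=0)
accuracy = float(np.mean(best_class == np.asarray(target)))
```

Because softmax preserves the ordering of the approxes, taking the argmax of the raw approxes selects the same class as taking the argmax of the probabilities.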

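Below are examples of training with the built-in MultiClass function and Accuracy metric and with our MultiClass objective and accuracy metric. As the logs show, the learn accuracy is identical at every iteration, and the test accuracy matches everywhere except at iteration 0 (0.2400 for the built-in version versus 0.2520 for ours).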

In [16]:
X, y = make_classification(n_samples=1000, n_features=50, n_informative=40, n_classes=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

In [17]:
model1 = CatBoostClassifier(iterations=10, loss_function='MultiClass', eval_metric='Accuracy',
                           learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,
                           leaf_estimation_iterations=1, leaf_estimation_method='Newton', classes_count=5)
model1.fit(X_train, y_train, eval_set=(X_test, y_test))

0:	learn: 0.3706667	test: 0.2400000	best: 0.2400000 (0)	total: 22.3ms	remaining: 201ms
1:	learn: 0.4813333	test: 0.2760000	best: 0.2760000 (1)	total: 35.2ms	remaining: 141ms
2:	learn: 0.5400000	test: 0.3120000	best: 0.3120000 (2)	total: 46.9ms	remaining: 109ms
3:	learn: 0.6026667	test: 0.3040000	best: 0.3120000 (2)	total: 59.3ms	remaining: 88.9ms
4:	learn: 0.6573333	test: 0.3120000	best: 0.3120000 (2)	total: 71.4ms	remaining: 71.4ms
5:	learn: 0.6933333	test: 0.3360000	best: 0.3360000 (5)	total: 83.3ms	remaining: 55.5ms
6:	learn: 0.7000000	test: 0.3440000	best: 0.3440000 (6)	total: 95.4ms	remaining: 40.9ms
7:	learn: 0.7040000	test: 0.3520000	best: 0.3520000 (7)	total: 107ms	remaining: 26.9ms
8:	learn: 0.7293333	test: 0.3720000	best: 0.3720000 (8)	total: 120ms	remaining: 13.3ms
9:	learn: 0.7600000	test: 0.3960000	best: 0.3960000 (9)	total: 132ms	remaining: 0us

bestTest = 0.396
bestIteration = 9



<catboost.core.CatBoostClassifier at 0x7f10798d2080>

In [18]:
model2 = CatBoostClassifier(iterations=10, loss_function=MultiClassObjective(), eval_metric=AccuracyMetric(),
                           learning_rate=0.03, bootstrap_type='Bayesian', boost_from_average=False,
                           leaf_estimation_iterations=1, leaf_estimation_method='Newton', classes_count=5)
model2.fit(X_train, y_train, eval_set=(X_test, y_test))

0:	learn: 0.3706667	test: 0.2520000	best: 0.2520000 (0)	total: 217ms	remaining: 1.95s
1:	learn: 0.4813333	test: 0.2760000	best: 0.2760000 (1)	total: 432ms	remaining: 1.73s
2:	learn: 0.5400000	test: 0.3120000	best: 0.3120000 (2)	total: 649ms	remaining: 1.51s
3:	learn: 0.6026667	test: 0.3040000	best: 0.3120000 (2)	total: 863ms	remaining: 1.29s
4:	learn: 0.6573333	test: 0.3120000	best: 0.3120000 (2)	total: 1.08s	remaining: 1.08s
5:	learn: 0.6933333	test: 0.3360000	best: 0.3360000 (5)	total: 1.3s	remaining: 869ms
6:	learn: 0.7000000	test: 0.3440000	best: 0.3440000 (6)	total: 1.52s	remaining: 653ms
7:	learn: 0.7040000	test: 0.3520000	best: 0.3520000 (7)	total: 1.75s	remaining: 436ms
8:	learn: 0.7293333	test: 0.3720000	best: 0.3720000 (8)	total: 1.96s	remaining: 218ms
9:	learn: 0.7600000	test: 0.3960000	best: 0.3960000 (9)	total: 2.18s	remaining: 0us

bestTest = 0.396
bestIteration = 9



<catboost.core.CatBoostClassifier at 0x7f10798b0be0>