# CatBoost and CoreML tutorial: Titanic dataset

CatBoost supports exporting models to Apple's [CoreML](https://developer.apple.com/machine-learning/) format, which lets you easily embed ML models into applications on Apple's platforms.

Currently, only models with float and one-hot encoded features can be exported.
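
To make the "one-hot" restriction concrete, here is a minimal sketch of what one-hot encoding does to a single categorical value. CatBoost performs this step internally; the `one_hot` helper and the category list are illustrative assumptions, not part of the CatBoost API.

```python
def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    return [1.0 if value == category else 0.0 for category in categories]

# Illustrative category list: three port codes plus a placeholder for missing values.
categories = ["C", "Q", "S", "NAN"]

encoded = one_hot("S", categories)
print(encoded)
```

Each category becomes its own numeric column, so the exported model only ever sees numbers.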

This tutorial demonstrates how to export a CatBoost model trained on the [Titanic](https://www.kaggle.com/c/titanic/data) dataset to a CoreML model.

Get the Titanic dataset:

In [2]:
import numpy as np

from catboost import Pool, CatBoost
from catboost.datasets import titanic

In [3]:
train_df = titanic()[0]
X, y = train_df.drop('Survived', axis=1), train_df.Survived

In [4]:
X.head()

Unnamed: 0,PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
0,1,3,"Braund, Mr. Owen Harris",male,22.0,1,0,A/5 21171,7.25,,S
1,2,1,"Cumings, Mrs. John Bradley (Florence Briggs Th...",female,38.0,1,0,PC 17599,71.2833,C85,C
2,3,3,"Heikkinen, Miss. Laina",female,26.0,0,0,STON/O2. 3101282,7.925,,S
3,4,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35.0,1,0,113803,53.1,C123,S
4,5,3,"Allen, Mr. William Henry",male,35.0,0,0,373450,8.05,,S


Let us drop the Name and Ticket features. One-hot encoding them makes no sense: almost every value belongs to a single object, so the model would simply memorize individual rows and overfit.
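
A rough way to spot such features is to compare the number of distinct values with the number of rows. The sketch below does this in plain Python for the text columns of the five preview rows shown above (values copied as displayed, including the truncated name). The `distinct_ratio` helper is hypothetical, and ratios on the full dataset will differ.

```python
# Text columns of the five preview rows, copied as displayed in the table above.
rows = [
    {"Name": "Braund, Mr. Owen Harris", "Sex": "male", "Ticket": "A/5 21171", "Embarked": "S"},
    {"Name": "Cumings, Mrs. John Bradley (Florence Briggs Th...", "Sex": "female", "Ticket": "PC 17599", "Embarked": "C"},
    {"Name": "Heikkinen, Miss. Laina", "Sex": "female", "Ticket": "STON/O2. 3101282", "Embarked": "S"},
    {"Name": "Futrelle, Mrs. Jacques Heath (Lily May Peel)", "Sex": "female", "Ticket": "113803", "Embarked": "S"},
    {"Name": "Allen, Mr. William Henry", "Sex": "male", "Ticket": "373450", "Embarked": "S"},
]

def distinct_ratio(rows, column):
    """Share of distinct values in `column`; 1.0 means every row has its own value."""
    values = [row[column] for row in rows]
    return len(set(values)) / len(values)

ratios = {column: distinct_ratio(rows, column) for column in rows[0]}
for column, ratio in ratios.items():
    print(f"{column}: {ratio:.1f}")
```

A ratio close to 1.0 means a one-hot column per value would fire for roughly one row each, which is the situation described above for Name and Ticket.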

In [5]:
X.drop(['Name', 'Ticket'], axis=1, inplace=True)
# np.float was removed from recent NumPy releases; the builtin float is equivalent here.
categorical_features_indices = np.where(X.dtypes != float)[0]

In [6]:
is_cat = (X.dtypes != float)
for feature, feat_is_cat in is_cat.to_dict().items():
    if feat_is_cat:
        # Assign the result back; chained inplace fillna may not modify X in newer pandas.
        X[feature] = X[feature].fillna("NAN")

cat_features_index = np.where(is_cat)[0]

In [7]:
X.head()

Unnamed: 0,PassengerId,Pclass,Sex,Age,SibSp,Parch,Fare,Cabin,Embarked
0,1,3,male,22.0,1,0,7.25,NAN,S
1,2,1,female,38.0,1,0,71.2833,C85,C
2,3,3,female,26.0,0,0,7.925,NAN,S
3,4,1,female,35.0,1,0,53.1,C123,S
4,5,3,male,35.0,0,0,8.05,NAN,S
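
The rule used above treats every non-float column as categorical. The sketch below reproduces that selection in plain Python so the resulting indices are easy to see. The column kinds are assumptions read off the preview (integers, floats, strings), not queried from the DataFrame.

```python
# Column name -> value kind, in dataset order (assumed from the preview above).
column_kinds = {
    "PassengerId": "int",
    "Pclass": "int",
    "Sex": "str",
    "Age": "float",
    "SibSp": "int",
    "Parch": "int",
    "Fare": "float",
    "Cabin": "str",
    "Embarked": "str",
}

# Same rule as `X.dtypes != float`: anything that is not a float is categorical.
cat_indices = [i for i, kind in enumerate(column_kinds.values()) if kind != "float"]
float_indices = [i for i, kind in enumerate(column_kinds.values()) if kind == "float"]

print("categorical:", cat_indices)
print("float:", float_indices)
```

Note that integer columns such as PassengerId and Pclass end up categorical under this rule, leaving only Age and Fare as float features.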


In [8]:
train_pool = Pool(data=X, label=y, cat_features=cat_features_index)

Train the model:

In [10]:
model = CatBoost(params={'loss_function': 'Logloss', 'one_hot_max_size': 255, 'verbose': 100})
model.fit(train_pool)

Learning rate set to 0.016216
0:	learn: 0.6862663	total: 73.2ms	remaining: 1m 13s
100:	learn: 0.4272007	total: 2.6s	remaining: 23.2s
200:	learn: 0.4044455	total: 4.59s	remaining: 18.3s
300:	learn: 0.3928060	total: 6.28s	remaining: 14.6s
400:	learn: 0.3852512	total: 7.88s	remaining: 11.8s
500:	learn: 0.3750366	total: 9.58s	remaining: 9.54s
600:	learn: 0.3624703	total: 12.5s	remaining: 8.28s
700:	learn: 0.3493490	total: 15s	remaining: 6.39s
800:	learn: 0.3390201	total: 17.3s	remaining: 4.3s
900:	learn: 0.3301084	total: 19.5s	remaining: 2.14s
999:	learn: 0.3224542	total: 21.6s	remaining: 0us


<catboost.core.CatBoost at 0x110317050>
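
The `one_hot_max_size` value passed above ties back to the export restriction. As the CatBoost documentation describes it, a categorical feature is one-hot encoded when its number of distinct values is less than or equal to this threshold. A minimal sketch of that rule follows; `uses_one_hot` is a hypothetical helper, and the distinct-value counts are illustrative rather than read from the trained model.

```python
ONE_HOT_MAX_SIZE = 255  # value passed to CatBoost in the training cell

def uses_one_hot(n_distinct, one_hot_max_size=ONE_HOT_MAX_SIZE):
    """True when a categorical feature with n_distinct values is within the threshold."""
    return n_distinct <= one_hot_max_size

# Illustrative distinct-value counts (assumptions, not taken from the trained model).
example_counts = {
    "two_valued_feature": 2,
    "feature_at_threshold": 255,
    "high_cardinality_feature": 1000,
}

decisions = {name: uses_one_hot(count) for name, count in example_counts.items()}
for name, decision in decisions.items():
    print(f"{name}: one-hot = {decision}")
```

Features above the threshold fall outside the one-hot case, so it is worth checking the cardinality of each categorical column before planning an export.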

Predict probabilities:

In [11]:
test_pool = Pool(data=X[0:1], cat_features=cat_features_index)

In [12]:
model.predict(test_pool, prediction_type="Probability")

array([[0.88003974, 0.11996026]])
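
The two columns of this output are the probabilities of class 0 (did not survive) and class 1 (survived). For `Logloss`, the class 1 probability is the sigmoid of the model's raw score. The sketch below recovers that raw score from the printed probability and maps it back; it uses only the numbers shown above and does not call CatBoost.

```python
import math

def sigmoid(raw):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-raw))

def logit(p):
    """Inverse of sigmoid: map a probability back to a raw score."""
    return math.log(p / (1.0 - p))

# Values copied from the prediction output above: [P(class 0), P(class 1)].
probabilities = [0.88003974, 0.11996026]
p_survived = probabilities[1]

raw_score = logit(p_survived)
print("raw score:", raw_score)
print("back to probability:", sigmoid(raw_score))
```

A negative raw score corresponds to a survival probability below 0.5, which matches the low value predicted for the first passenger.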

Save model:

In [13]:
model.save_model(
    "titanic.mlmodel",
    format="coreml",
    export_parameters={
        'prediction_type': 'probability'
    }
)

All features in the exported model are named `feature_i`, where `i` is the zero-based index of the feature in the dataset.
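
Since the original column names are not kept, it helps to write down which column each exported name refers to. A minimal sketch, assuming the column order shown in the preview after Name and Ticket were dropped:

```python
# Columns in dataset order, as shown in the preview after dropping Name and Ticket.
columns = ["PassengerId", "Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Cabin", "Embarked"]

# Exported name -> original column name.
coreml_names = {f"feature_{i}": column for i, column in enumerate(columns)}

for exported, original in coreml_names.items():
    print(f"{exported} -> {original}")
```

Keeping this mapping next to the application code makes it easier to pass the right value to each input of the generated model class.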

Now you can import the saved model into Xcode and use it directly from Swift:

```swift
import CoreML

let model = titanic()

let passengerId = "1"
let pclass = "1"
let sex = "female"
let age = 38.0
let sibsp = "1"
let parch = "0"
let fare = 71.2833
let cabin = "C85"
let embarked = "C"

guard let titanicOutput = try? model.prediction(
    feature_0: passengerId, feature_1: pclass, feature_2: sex,
    feature_3: age, feature_4: sibsp, feature_5: parch,
    feature_6: fare, feature_7: cabin, feature_8: embarked
) else {
    fatalError("Unexpected runtime error.")
}

print(String(
    format: "Probability of survival: %1.5f",
    titanicOutput.prediction[0].doubleValue
))
```

If you want to practice, the Titanic model is easy to integrate into Apple's [MarsHabitatPricer](https://developer.apple.com/documentation/coreml/integrating_a_core_ml_model_into_your_app) example project:

<img src="https://imgur.com/f6G6ZrJ.jpg">