A multi-layer perceptron for classification using a well-known dataset
This post uses TensorFlow with the Keras API to predict diabetes with a feed-forward neural network, also known as a multilayer perceptron (MLP). The data comes from the Pima Indians Diabetes Database on Kaggle. A Google Colab notebook with the code is available on GitHub.
Exploratory data analysis
The dataset has 8 numeric features, none of which has missing values. It contains 768 records, of which 500 correspond to negative outcomes and 268 to positive ones.
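The class counts above already fix a useful reference point: the accuracy of a trivial classifier that always predicts the majority class. The short calculation below uses only the numbers quoted in the text; the variable names are illustrative.

```python
# Class counts taken from the text above.
n_negative = 500
n_positive = 268
n_total = n_negative + n_positive

# Share of positive cases in the dataset.
positive_share = n_positive / n_total

# Accuracy of a trivial classifier that always predicts the majority (negative) class.
majority_baseline = max(n_negative, n_positive) / n_total

print(f"records: {n_total}")
print(f"positive share: {positive_share:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```

Any trained model should be judged against this baseline of roughly 65% rather than against 50%.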
No two features are strongly correlated with each other.
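One way to check a statement like this is to look for the largest absolute off-diagonal entry of the Pearson correlation matrix. The sketch below is not taken from the notebook: the helper name is hypothetical, and the data is a synthetic stand-in with the same shape as the dataset, so the printed value says nothing about the real features.

```python
import numpy as np

def strongest_feature_correlation(X):
    """Return (|r|, i, j) for the most strongly correlated pair of columns in X."""
    corr = np.corrcoef(X, rowvar=False)   # Pearson correlation between columns
    abs_corr = np.abs(corr)
    np.fill_diagonal(abs_corr, 0.0)       # ignore each feature's correlation with itself
    i, j = np.unravel_index(np.argmax(abs_corr), abs_corr.shape)
    return float(abs_corr[i, j]), int(i), int(j)

# Synthetic stand-in with the same shape as the dataset (768 rows, 8 features).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(768, 8))

r, i, j = strongest_feature_correlation(X_demo)
print(f"strongest pair: features {i} and {j}, |r| = {r:.2f}")
```

With the real data, `X_demo` would be replaced by the feature columns loaded from the Kaggle CSV.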
Building a model
We split the dataset into a training set (80% of the data) and a test set (20%). The model is a sequential model consisting of 6 layers. The first is a normalization layer, an experimental Keras preprocessing layer that shifts and scales its inputs so that each feature has a mean of zero and a standard deviation of one.
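The split and the effect of the normalization step can be sketched in plain NumPy. This is an illustration under assumptions, not the notebook's code: the data is synthetic, the random seed is arbitrary, and the statistics are computed from the training set only, which is a common convention that the post does not confirm.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the real data: 768 rows, 8 features, binary labels.
X = rng.normal(loc=50.0, scale=10.0, size=(768, 8))
y = rng.integers(0, 2, size=768)

# Shuffle the row indices, then cut at 80%.
indices = rng.permutation(len(X))
n_train = int(0.8 * len(X))               # 614 training rows, 154 test rows
train_idx, test_idx = indices[:n_train], indices[n_train:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# Learn the per-feature mean and standard deviation from the training set ...
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# ... and apply the same shift and scale to both sets.
X_train_norm = (X_train - mean) / std
X_test_norm = (X_test - mean) / std

print(X_train.shape, X_test.shape)
```

After this transform every training feature has mean 0 and standard deviation 1; the test features are close to that but not exact, because they reuse the training statistics.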
The model contains three fully connected layers: two hidden layers with five units and ReLU activation, and one output layer with a sigmoid activation function. In addition, there are two dropout layers to prevent overfitting. The layers with ReLU activation use He normal weight initialization, and the output layer uses Glorot normal weight initialization.
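A minimal sketch of such a six-layer model in Keras might look as follows. Several details are assumptions, because the post does not state them: the dropout rate of 0.2, the placement of a dropout layer after each hidden layer, and the helper name `build_model`. The demo data is synthetic and only serves to adapt the normalization layer.

```python
import numpy as np
import tensorflow as tf

def build_model(X_train, dropout_rate=0.2):
    """Six-layer MLP as described above; dropout_rate is an assumed value."""
    # In older TensorFlow releases this layer lives under
    # tf.keras.layers.experimental.preprocessing.Normalization.
    normalizer = tf.keras.layers.Normalization(axis=-1)
    normalizer.adapt(X_train)  # learn per-feature mean and variance

    return tf.keras.Sequential([
        tf.keras.Input(shape=(X_train.shape[1],)),
        normalizer,
        tf.keras.layers.Dense(5, activation="relu", kernel_initializer="he_normal"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(5, activation="relu", kernel_initializer="he_normal"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(1, activation="sigmoid", kernel_initializer="glorot_normal"),
    ])

# Synthetic stand-in for the 8 training features.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(100, 8)).astype("float32")

model = build_model(X_demo)
model.summary()
```

With 8 inputs, the three dense layers hold 45 + 30 + 6 = 81 trainable parameters, which is small enough that dropout is the main regularizer rather than model size.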
The model uses the Adam optimizer, the binary cross-entropy loss function, and binary accuracy as a metric.
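To make the loss and the metric concrete, here is a hand-rolled NumPy version of both. It is an illustration of the definitions, not the Keras implementation; the function names, the clipping constant, and the four example predictions are made up for the demonstration.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)]; probabilities are clipped to avoid log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(losses))

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    y_pred = (np.asarray(y_prob) > threshold).astype(int)
    return float(np.mean(y_pred == np.asarray(y_true)))

# Made-up labels and predicted probabilities.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]

print(f"loss: {binary_cross_entropy(y_true, y_prob):.3f}")      # about 0.540
print(f"accuracy: {binary_accuracy(y_true, y_prob):.2f}")       # 0.50
```

The two quantities can move independently: the loss rewards confident correct probabilities, while the accuracy only looks at which side of the threshold each prediction falls on.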
A learning rate of 1e-5 was chosen so that both the training loss and the validation loss decrease.
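The training configuration described above could be wired up roughly like this. The optimizer, loss, metric, and learning rate follow the text; everything else is an assumption made so that the snippet runs on its own: the tiny stand-in model, the synthetic data, the 20% validation split, the batch size, and the three epochs.

```python
import numpy as np
import tensorflow as tf

# Synthetic data and a tiny stand-in model, used only to make the snippet runnable.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 8)).astype("float32")
y_demo = rng.integers(0, 2, size=200).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(5, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.BinaryAccuracy()],
)

history = model.fit(
    X_demo, y_demo,
    validation_split=0.2,  # assumed: hold out part of the training data for validation
    epochs=3,              # assumed: kept tiny for the demonstration
    batch_size=32,         # assumed
    verbose=0,
)

# Per-epoch curves to compare when tuning the learning rate.
print(history.history["loss"])
print(history.history["val_loss"])
```

With a learning rate this small each update barely moves the weights, so a real run needs many more epochs than shown here before the two curves visibly decrease.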
The model achieved an accuracy of over 70%. The confusion matrix is depicted below.
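For reference, a confusion matrix for a binary classifier can be computed from labels and predicted probabilities as in the sketch below. The labels and probabilities are made up for illustration and are not the results of this post; the function name and the 0.5 threshold are likewise assumptions.

```python
import numpy as np

def confusion_matrix(y_true, y_prob, threshold=0.5):
    """Rows are true classes (0, 1); columns are predicted classes (0, 1)."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = (np.asarray(y_prob) > threshold).astype(int)
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Made-up labels and predicted probabilities.
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_prob = [0.1, 0.4, 0.7, 0.8, 0.3, 0.2, 0.9, 0.6]

cm = confusion_matrix(y_true, y_prob)
tn, fp, fn, tp = cm.ravel()
accuracy = (tn + tp) / cm.sum()

print(cm)
print(f"accuracy: {accuracy:.3f}")
```

Reading the off-diagonal cells separately matters for a medical screening task, since a false negative (a missed positive case) and a false positive carry different costs.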