Logistic regression in machine learning is an algorithm for binary classification; that is, the output is one of two choices, such as true/false, 0/1, spam/not spam, or male/female. Despite the name, logistic regression is a classification algorithm, not a regression algorithm.
To derive such an output from the underlying probabilities, it uses the sigmoid function, which squashes any real value into the range 0 to 1. In other words, logistic regression is a linear model with the sigmoid function applied to its output.
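To make that concrete, here is a minimal plain-NumPy sketch of the idea; the weights, bias, and input below are made-up values for illustration, not learned parameters:

import numpy as np

def sigmoid(z):
    # Squashes any real number into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.5, -1.2, 0.8, 0.1])   # made-up weights
b = 0.3                               # made-up bias
x = np.array([1.0, 2.0, -0.5, 0.7])   # one example with 4 input features

p = sigmoid(np.dot(w, x) + b)         # linear combination, then sigmoid -> probability of class 1
label = 1 if p >= 0.5 else 0          # threshold at 0.5 to pick a class
print(p, label)                       # ~0.13, 0

In a real model, the weights and bias are learned from the training data; the thresholding step is what turns the probability into one of the two classes.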
For demonstration purposes, we will use the Banknote Authentication Data Set from the UCI Machine Learning Repository.
The first four columns in the data set are the X input features:
- Variance of Wavelet Transformed image (continuous);
- Skewness of Wavelet Transformed image (continuous);
- Curtosis of Wavelet Transformed image (continuous);
- Entropy of image (continuous).
The last column is Y, which indicates whether a given note is authentic (0) or forged (1).
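If you want to sanity-check the file before training (assuming you have downloaded it and saved it as data_banknote_authentication.csv, as the code below does), a quick look with NumPy might be:

import numpy

dataset = numpy.loadtxt("data_banknote_authentication.csv", delimiter=",")
print(dataset.shape)   # expected (1372, 5): 1372 notes, 4 features plus the label
print(dataset[0])      # first row: 4 feature values followed by the 0/1 label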
To implement this in Keras, we follow the steps below:
- Create X input and Y output;
- Create a Sequential model;
- Add the first hidden layer, specifying the number of neurons, the number of input variables, and the activation function;
- Add the output layer with a sigmoid activation;
- Compile the model; and
- Train the model.
Here is the code. You can play with different hyperparameters to increase accuracy.
# Logistic regression is a binary classification algorithm.
# This code shows a basic version of it in Keras, using the
# Banknote Authentication Data Set.
from keras.models import Sequential
from keras.layers import Dense
import numpy

# Download the data set from the link below and save it as a .csv file:
# https://archive.ics.uci.edu/ml/machine-learning-databases/00267/data_banknote_authentication.txt

# Fix the random seed for reproducibility
numpy.random.seed(7)

# Load the Banknote Authentication Data Set
dataset = numpy.loadtxt("data_banknote_authentication.csv", delimiter=",")

# Split the data into input (X) and output (Y) variables
X = dataset[:, 0:4]
Y = dataset[:, 4]

# Create the model.
# First arg of Dense is the number of neurons;
# input_dim is the number of input variables.
model = Sequential()
model.add(Dense(10, input_dim=4, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# As this is binary classification, binary_crossentropy is the better
# loss function (it works directly with output probabilities);
# you could also use others such as MSE.
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Fit the model, holding out 33% of the data for validation
# (67% for training, 33% for testing)
model.fit(X, Y, validation_split=0.33, epochs=15, batch_size=100)
Here is a sample of the training output.
100/919 [==>...........................] - ETA: 0s - loss: 0.5984 - acc: 0.7700
919/919 [==============================] - 0s 14us/step - loss: 0.5483 - acc: 0.7889 - val_loss: 0.7256 - val_acc: 0.4768
Epoch 11/15
100/919 [==>...........................] - ETA: 0s - loss: 0.5178 - acc: 0.7900
919/919 [==============================] - 0s 14us/step - loss: 0.4934 - acc: 0.8118 - val_loss: 0.7121 - val_acc: 0.4768
Epoch 12/15
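Once training finishes, you can query the model directly; here is a minimal sketch, reusing model, X, and Y from the code above:

# Quick check on the full data set (in practice, evaluate on held-out data)
loss, acc = model.evaluate(X, Y)
print("loss: %.4f, acc: %.4f" % (loss, acc))

# Probability that the first note is forged (class 1), thresholded at 0.5
prob = model.predict(X[0:1])[0][0]
print(prob, 1 if prob >= 0.5 else 0)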