This chapter deals with constructing a logistic regression model in CNTK.
Basics of Logistic Regression model
Logistic Regression is one of the simplest ML techniques and is used mainly for binary classification, i.e. for creating a prediction model in situations where the value of the variable to be predicted can be one of just two categorical values. A simple example of Logistic Regression is predicting whether a person is male or female, based on the person's age, voice, hair and so on.
Example
Let’s understand the concept of Logistic Regression mathematically with the help of another example −
Suppose we want to predict the creditworthiness of a loan application, where 0 means reject and 1 means approve, based on the applicant's debt, income and credit rating. We represent debt with X1, income with X2 and credit rating with X3.
In Logistic Regression, we determine a weight value, represented by w, for every feature and a single bias value, represented by b.
Now suppose,
X1 = 3.0, X2 = -2.0, X3 = 1.0
And suppose we determine weight and bias as follows −
W1 = 0.65, W2 = 1.75, W3 = 2.05 and b = 0.33
Now, for predicting the class, we need to apply the following formula −
Z = (X1*W1) + (X2*W2) + (X3*W3) + b
i.e. Z = (3.0)*(0.65) + (-2.0)*(1.75) + (1.0)*(2.05) + 0.33 = 0.83
Next, we need to compute P = 1.0/(1.0 + exp(-Z)). Here, exp() is the exponential function with base e (Euler's number).
P = 1.0/(1.0 + exp(-0.83)) = 0.6963
The P value can be interpreted as the probability that the class is 1. If P < 0.5, the prediction is class = 0; otherwise (P >= 0.5), the prediction is class = 1.
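As a sanity check, the arithmetic above can be reproduced with a few lines of plain Python (a minimal sketch; the variable names are illustrative only):

import math

# Feature values, weights and bias from the worked example above
x = [3.0, -2.0, 1.0]
w = [0.65, 1.75, 2.05]
b = 0.33

z = sum(xi * wi for xi, wi in zip(x, w)) + b   # Z = 0.83
p = 1.0 / (1.0 + math.exp(-z))                 # P = 0.6963
print(round(z, 2), round(p, 4))                # 0.83 0.6963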
To determine the values of the weights and the bias, we must obtain a set of training data with known input predictor values and known correct class labels. We can then use an algorithm, generally Gradient Descent, to find the weight and bias values.
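For intuition, the following minimal NumPy sketch (not CNTK; the function name and defaults are illustrative assumptions) performs plain gradient-descent updates for logistic regression with the cross-entropy loss:

import numpy as np

def fit_logistic(X, y, lr=0.01, epochs=100):
    # X: (N, d) feature matrix, y: (N,) labels in {0, 1}
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probabilities
        grad_w = X.T @ (p - y) / len(y)         # gradient of cross-entropy w.r.t. w
        grad_b = np.mean(p - y)                 # gradient w.r.t. b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b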
LR model implementation example
For this LR model, we are going to use the following data set −
1.0, 2.0, 0
3.0, 4.0, 0
5.0, 2.0, 0
6.0, 3.0, 0
8.0, 1.0, 0
9.0, 2.0, 0
1.0, 4.0, 1
2.0, 5.0, 1
4.0, 6.0, 1
6.0, 5.0, 1
7.0, 3.0, 1
8.0, 5.0, 1
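Each row holds two feature values followed by the class label (0 or 1). The programs below expect this data in a comma-delimited file named dataLRmodel.txt; a small helper like the following (a convenience sketch, not part of the tutorial's programs) can create it in the current directory:

data = """\
1.0, 2.0, 0
3.0, 4.0, 0
5.0, 2.0, 0
6.0, 3.0, 0
8.0, 1.0, 0
9.0, 2.0, 0
1.0, 4.0, 1
2.0, 5.0, 1
4.0, 6.0, 1
6.0, 5.0, 1
7.0, 3.0, 1
8.0, 5.0, 1
"""
with open("dataLRmodel.txt", "w") as f:
    f.write(data)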
To start this LR model implementation in CNTK, we need to first import the following packages −
import numpy as np
import cntk as C
The program is structured around a main() function as follows −
def main():
    print("Using CNTK version = " + str(C.__version__) + "\n")
Now, we need to load the training data into memory as follows −
data_file = ".\\dataLRmodel.txt"
print("Loading data from " + data_file + "\n")
features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[0,1])
labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[2], ndmin=2)
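It can be worth verifying that the two matrices have the expected shapes before training (an optional sanity check, not part of the tutorial's program):

print(features_mat.shape)  # expected: (12, 2) - two features per item
print(labels_mat.shape)    # expected: (12, 1) - one label column, kept 2-D by ndmin=2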
Next, we create a training program that builds a logistic regression model compatible with the training data −
features_dim = 2
labels_dim = 1
X = C.ops.input_variable(features_dim, np.float32)
y = C.input_variable(labels_dim, np.float32)
W = C.parameter(shape=(features_dim, 1))  # trainable cntk.Parameter
b = C.parameter(shape=(labels_dim))
z = C.times(X, W) + b
p = 1.0 / (1.0 + C.exp(-z))
model = p
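Writing the sigmoid out explicitly makes the math visible; CNTK also ships a built-in sigmoid op, so the last two lines could equivalently be replaced with:

model = C.sigmoid(z)  # same computation as 1.0 / (1.0 + C.exp(-z))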
Now, we need to create a learner and a trainer as follows −
ce_error = C.binary_cross_entropy(model, y)  # cross-entropy is a bit more principled for LR
fixed_lr = 0.010
learner = C.sgd(model.parameters, fixed_lr)
trainer = C.Trainer(model, (ce_error), [learner])
max_iterations = 4000
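As the comment notes, binary cross-entropy is the more principled loss for logistic regression, but squared error would also work here; swapping it in is a one-line change:

ce_error = C.squared_error(model, y)  # alternative loss; cross-entropy is preferred for LR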
LR Model training
Once we have created the LR model, it is time to start the training process −
np.random.seed(4)
N = len(features_mat)
for i in range(0, max_iterations):
    row = np.random.choice(N, 1)  # pick a random row from the training items
    trainer.train_minibatch({X: features_mat[row], y: labels_mat[row]})
    if i % 1000 == 0 and i > 0:
        mcee = trainer.previous_minibatch_loss_average
        print(str(i) + " Cross-entropy error on curr item = %0.4f " % mcee)
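The loop above trains on one randomly chosen row per iteration. If desired, several rows could be fed per call instead; a hedged variant of the loop body (the batch size of 4 is arbitrary):

rows = np.random.choice(N, 4, replace=False)  # pick 4 distinct random rows
trainer.train_minibatch({X: features_mat[rows], y: labels_mat[rows]})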
Now, with the help of the following code, we can print the model weights and bias −
np.set_printoptions(precision=4, suppress=True)
print("Model weights:")
print(W.value)
print("Model bias:")
print(b.value)
print("")

if __name__ == "__main__":
    main()
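As an optional extra step (not part of the tutorial's program), the trained CNTK function could also be serialized to disk instead of copying the weights by hand; a minimal sketch, where lr.model is an arbitrary file name:

model.save("lr.model")                # serialize the trained function to disk
loaded = C.Function.load("lr.model")  # restore it later for evaluation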
Training a Logistic Regression model – Complete example
import numpy as np
import cntk as C

def main():
    print("Using CNTK version = " + str(C.__version__) + "\n")

    data_file = ".\\dataLRmodel.txt"  # name and location of the data file
    print("Loading data from " + data_file + "\n")
    features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[0,1])
    labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[2], ndmin=2)

    features_dim = 2
    labels_dim = 1
    X = C.ops.input_variable(features_dim, np.float32)
    y = C.input_variable(labels_dim, np.float32)
    W = C.parameter(shape=(features_dim, 1))  # trainable cntk.Parameter
    b = C.parameter(shape=(labels_dim))
    z = C.times(X, W) + b
    p = 1.0 / (1.0 + C.exp(-z))
    model = p

    ce_error = C.binary_cross_entropy(model, y)  # cross-entropy is a bit more principled for LR
    fixed_lr = 0.010
    learner = C.sgd(model.parameters, fixed_lr)
    trainer = C.Trainer(model, (ce_error), [learner])
    max_iterations = 4000

    np.random.seed(4)
    N = len(features_mat)
    for i in range(0, max_iterations):
        row = np.random.choice(N, 1)  # pick a random row from the training items
        trainer.train_minibatch({X: features_mat[row], y: labels_mat[row]})
        if i % 1000 == 0 and i > 0:
            mcee = trainer.previous_minibatch_loss_average
            print(str(i) + " Cross-entropy error on curr item = %0.4f " % mcee)

    np.set_printoptions(precision=4, suppress=True)
    print("Model weights:")
    print(W.value)
    print("Model bias:")
    print(b.value)

if __name__ == "__main__":
    main()
Output
Using CNTK version = 2.7

1000 Cross-entropy error on curr item = 0.1941
2000 Cross-entropy error on curr item = 0.1746
3000 Cross-entropy error on curr item = 0.0563
Model weights:
[[-0.2049]
 [ 0.9666]]
Model bias:
[-2.2846]
Prediction using trained LR Model
Once the LR model has been trained, we can use it for prediction as follows −
First of all, our evaluation program imports the numpy package and loads the training data into a feature matrix and a class label matrix in the same way as the training program implemented above −
import numpy as np

def main():
    data_file = ".\\dataLRmodel.txt"  # name and location of the data file
    features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=(0,1))
    labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[2], ndmin=2)
Next, it is time to set the values of the weights and the bias that were determined by our training program −
print("Setting weights and bias values \n") weights = np.array([0.0925, 1.1722], dtype=np.float32) bias = np.array([-4.5400], dtype=np.float32) N = len(features_mat) features_dim = 2
Next, our evaluation program computes the logistic regression probability by walking through each training item as follows −
print("item pred_prob pred_label act_label result") for i in range(0, N): # each item x = features_mat[i] z = 0.0 for j in range(0, features_dim): z += x[j] * weights[j] z += bias[0] pred_prob = 1.0 / (1.0 + np.exp(-z)) pred_label = 0 if pred_prob < 0.5 else 1 act_label = labels_mat[i] pred_str = ‘correct’ if np.absolute(pred_label - act_label) < 1.0e-5 \ else ‘WRONG’ print("%2d %0.4f %0.0f %0.0f %s" % \ (i, pred_prob, pred_label, act_label, pred_str))
Now let us demonstrate how to do prediction −
x = np.array([9.5, 4.5], dtype=np.float32)
print("\nPredicting class for age, education = ")
print(x)
z = 0.0
for j in range(0, features_dim):
    z += x[j] * weights[j]
z += bias[0]
p = 1.0 / (1.0 + np.exp(-z))
print("Predicted p = " + str(p))
if p < 0.5:
    print("Predicted class = 0")
else:
    print("Predicted class = 1")
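As an aside, instead of re-implementing the math with NumPy, the trained CNTK model itself could be evaluated directly on the new item (a sketch assuming model and X from the training program are still in scope):

x_new = np.array([[9.5, 4.5]], dtype=np.float32)  # note: 2-D, one row per item
p = model.eval({X: x_new})
print("Predicted p = " + str(p))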
Complete prediction evaluation program
import numpy as np

def main():
    data_file = ".\\dataLRmodel.txt"  # name and location of the data file
    features_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=(0,1))
    labels_mat = np.loadtxt(data_file, dtype=np.float32, delimiter=",", skiprows=0, usecols=[2], ndmin=2)

    print("Setting weights and bias values \n")
    weights = np.array([-0.2049, 0.9666], dtype=np.float32)
    bias = np.array([-2.2846], dtype=np.float32)
    N = len(features_mat)
    features_dim = 2

    print("item pred_prob pred_label act_label result")
    for i in range(0, N):  # each item
        x = features_mat[i]
        z = 0.0
        for j in range(0, features_dim):
            z += x[j] * weights[j]
        z += bias[0]
        pred_prob = 1.0 / (1.0 + np.exp(-z))
        pred_label = 0 if pred_prob < 0.5 else 1
        act_label = labels_mat[i]
        pred_str = 'correct' if np.absolute(pred_label - act_label) < 1.0e-5 \
            else 'WRONG'
        print("%2d %0.4f %0.0f %0.0f %s" % \
            (i, pred_prob, pred_label, act_label, pred_str))

    x = np.array([9.5, 4.5], dtype=np.float32)
    print("\nPredicting class for age, education = ")
    print(x)
    z = 0.0
    for j in range(0, features_dim):
        z += x[j] * weights[j]
    z += bias[0]
    p = 1.0 / (1.0 + np.exp(-z))
    print("Predicted p = " + str(p))
    if p < 0.5:
        print("Predicted class = 0")
    else:
        print("Predicted class = 1")

if __name__ == "__main__":
    main()
Output
Setting weights and bias values
item pred_prob pred_label act_label result
 0 0.3640 0 0 correct
 1 0.7254 1 0 WRONG
 2 0.2019 0 0 correct
 3 0.3562 0 0 correct
 4 0.0493 0 0 correct
 5 0.1005 0 0 correct
 6 0.7892 1 1 correct
 7 0.8564 1 1 correct
 8 0.9654 1 1 correct
 9 0.7587 1 1 correct
10 0.3040 0 1 WRONG
11 0.7129 1 1 correct

Predicting class for age, education =
[9.5 4.5]
Predicted p = 0.526487952
Predicted class = 1