Artificial Neural Networks
Prof. Dr. Sen Cheng
Oct 25, 2021
Problem Set 4: Logistic Regression / Classification
Tutors:
David Kappel (david.kappel@rub.de), Xiangshuai Zeng (xiangshuai.zeng@ini.ruhr-uni-bochum.de)
Further Reading:
Hands-on Machine Learning with Scikit-Learn and TensorFlow, Ch. 4
1. Derive the gradient of the loss function of logistic regression using paper and pencil.
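As a reference for checking your own derivation, a minimal sketch of the expected result, assuming the cross-entropy loss over $m$ training samples and the identity $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$:

$$
L(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log \sigma(\theta^T x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - \sigma(\theta^T x^{(i)})\right) \right]
$$

$$
\nabla_\theta L(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( \sigma(\theta^T x^{(i)}) - y^{(i)} \right) x^{(i)}
$$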
2. Implement a logistic regression model using only elementary programming operations. For your guidance, see
the steps below:
(a) Load the file ‘04_log_regression_data.npy’ in numpy. It contains a hypothetical dataset, where the first two
columns reflect the features: length of current residency and yearly income, and the last column contains
the labels, i.e., whether a bank loan was granted to an individual or not.
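A minimal sketch of this step, assuming the file lies in the working directory; the variable names X and y are free choices:

    import numpy as np

    # Columns: length of current residency, yearly income, loan granted (label)
    data = np.load('04_log_regression_data.npy')
    X = data[:, :2]   # feature matrix, shape (n_samples, 2)
    y = data[:, 2]    # binary labels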
(b) Because the scale of the two features differs considerably, it is advised to standardize them before fitting.
Import the zscore function from scipy.stats to standardize each of the features.
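Standardization could then look like the following sketch (X from the loading step above; X_std is an assumed name):

    from scipy.stats import zscore

    # Standardize each feature column to zero mean and unit variance
    X_std = zscore(X, axis=0)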
(c) Visualize the data set in a scatter plot of length of current residency vs. yearly income, using different
colors to reflect whether the loan was granted or not. You may call matplotlib.pyplot.scatter twice for plotting the points belonging to each of the two classes.
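A possible sketch, assuming the label 1 encodes a granted loan and using the standardized features X_std:

    import matplotlib.pyplot as plt

    granted = (y == 1)
    # One scatter call per class, with different colors
    plt.scatter(X_std[granted, 0], X_std[granted, 1], c='tab:green', label='granted')
    plt.scatter(X_std[~granted, 0], X_std[~granted, 1], c='tab:red', label='not granted')
    plt.xlabel('length of current residency (standardized)')
    plt.ylabel('yearly income (standardized)')
    plt.legend()
    plt.show()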
(d) Implement the gradient descent algorithm to minimize the loss function. (Don’t forget to set up the bias
terms.)
i. Set the initial parameters $\theta_0 = 0$ or to small random numbers.
ii. Run the fitting process for 15,000 epochs and a learning rate $\eta = 0.002$.
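A minimal sketch of the fitting loop, assuming X_std and y from the previous steps; the names Xb, theta, eta and losses are free choices. The bias term is handled by prepending a column of ones, and the loss history is stored for problem 4:

    # Design matrix with a leading column of ones so theta[0] is the bias term
    m = X_std.shape[0]
    Xb = np.hstack([np.ones((m, 1)), X_std])

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    theta = np.zeros(Xb.shape[1])   # initial parameters (or small random numbers)
    eta = 0.002                     # learning rate
    losses = []

    for epoch in range(15000):
        p = sigmoid(Xb @ theta)            # predicted probabilities
        grad = Xb.T @ (p - y) / m          # gradient of the cross-entropy loss
        theta -= eta * grad                # gradient descent update
        eps = 1e-12                        # avoid log(0) in the loss
        losses.append(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))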
3. Once the gradient descent is completed, plot the decision boundary together with the data. Remember that the decision boundary is given by $p(x) = \sigma(\theta^T x) = p_0$. If $p_0 = \frac{1}{2}$, the previous condition is equivalent to
$$\hat{\theta}^T x = \theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0. \qquad (1)$$
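A sketch of the boundary line on top of the scatter plot from step (c), assuming theta from the fitting sketch above (theta[0] is the bias):

    # For p0 = 1/2: theta0 + theta1*x1 + theta2*x2 = 0  =>  x2 = -(theta0 + theta1*x1) / theta2
    x1_line = np.linspace(X_std[:, 0].min(), X_std[:, 0].max(), 100)
    x2_line = -(theta[0] + theta[1] * x1_line) / theta[2]
    plt.plot(x1_line, x2_line, 'k--', label='decision boundary')
    plt.legend()
    plt.show()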
4. Plot the loss vs the training epochs. Why does the loss saturate? Why is the asymptotic value not zero?
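Plotting the stored loss history from the fitting sketch above could look like this:

    plt.figure()
    plt.plot(losses)
    plt.xlabel('epoch')
    plt.ylabel('cross-entropy loss')
    plt.show()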
5. The decision boundary can be shifted by adjusting $p_0$ to trade off the likelihood of the different errors. Apply different classification thresholds $p_0$ to obtain the precision-recall curve, the $F_1$ score and the ROC curve. Based on these measures, assess the model performance and determine what would be a reasonable classification criterion.
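A sketch of the threshold sweep, computed with elementary operations on the fitted probabilities (Xb, theta and sigmoid from the sketches above); the threshold grid and the assumption that label 1 is the positive class are free choices:

    p_hat = sigmoid(Xb @ theta)                 # fitted probabilities
    thresholds = np.linspace(0.01, 0.99, 99)
    precision, recall, fpr, f1 = [], [], [], []

    for p0 in thresholds:
        pred = (p_hat >= p0)
        tp = np.sum(pred & (y == 1))
        fp = np.sum(pred & (y == 0))
        fn = np.sum(~pred & (y == 1))
        tn = np.sum(~pred & (y == 0))
        prec = tp / (tp + fp) if (tp + fp) > 0 else 1.0
        rec = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        precision.append(prec)
        recall.append(rec)
        fpr.append(fp / (fp + tn) if (fp + tn) > 0 else 0.0)
        f1.append(2 * prec * rec / (prec + rec) if (prec + rec) > 0 else 0.0)

    # Precision-recall curve; recall doubles as the true positive rate for the ROC curve
    plt.plot(recall, precision)
    plt.xlabel('recall'); plt.ylabel('precision'); plt.show()
    plt.plot(fpr, recall)
    plt.xlabel('false positive rate'); plt.ylabel('true positive rate'); plt.show()
    plt.plot(thresholds, f1)
    plt.xlabel('threshold p0'); plt.ylabel('F1 score'); plt.show()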