Download the data and save it to your own directory

Sigmoid function
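As a minimal sketch of the sigmoid function named above, the version below uses only the standard library. The two-branch form is an illustrative choice to avoid overflow for large negative inputs; it is not necessarily how the original notebook defines it.

```python
import math

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) is at most 1, so this form cannot overflow.
    z = math.exp(x)
    return z / (1.0 + z)
```

The function maps any real score to a value in [0, 1], with `sigmoid(0)` equal to 0.5, which is why it can be read as a probability.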

Implementing a simple classifier

Read a small dataset of beer reviews

Predict the user's gender from the length of their review

Fit the model

Calculate the accuracy of the model

Accuracy seems surprisingly high! Check against the number of positive labels...

Accuracy is identical to the proportion of "males" in the data. Confirm that the model never predicts the positive class
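The check described above can be illustrated on made-up data. The labels below are hypothetical (they are not the beer-review labels); the point is only that a classifier which never predicts positive scores an accuracy equal to the share of negative labels.

```python
# Hypothetical toy labels: True marks the positive (minority) class.
y = [False, False, False, True, False, False, True, False, False, False]

# A degenerate classifier that never predicts positive.
predictions = [False] * len(y)

# Fraction of predictions that match the labels.
accuracy = sum(p == l for p, l in zip(predictions, y)) / len(y)

# Fraction of labels that are negative.
negative_rate = sum(not l for l in y) / len(y)
```

Here `accuracy` and `negative_rate` are the same number, so a high accuracy on imbalanced data says little about whether the model has learned anything.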

Implementing a balanced classifier

Use the class_weight='balanced' option to implement the balanced classifier
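To make the effect of the 'balanced' option concrete, the sketch below reproduces the weighting formula that scikit-learn documents for this mode, `n_samples / (n_classes * count_of_class)`, in plain Python. The helper name `balanced_class_weights` is hypothetical and the labels are toy values.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Per-class weights n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}

# Toy example: 8 negatives and 2 positives.
weights = balanced_class_weights([0] * 8 + [1] * 2)
```

With these weights each class contributes the same total weight to the loss (8 × 0.625 = 2 × 2.5 = 5), so errors on the rare class are no longer cheap to ignore.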

Simple classification diagnostics


Compute the accuracy of the balanced model

True positives, false positives, true negatives, false negatives, and balanced error rate (BER)

Accuracy can be rewritten in terms of these counts
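A small sketch of the four counts and of accuracy expressed through them, using hypothetical boolean predictions and labels (the function names are illustrative, not taken from the notebook):

```python
def confusion_counts(predictions, labels):
    """Return (TP, FP, TN, FN) for boolean predictions and labels."""
    pairs = list(zip(predictions, labels))
    TP = sum(p and l for p, l in pairs)
    FP = sum(p and not l for p, l in pairs)
    TN = sum(not p and not l for p, l in pairs)
    FN = sum(not p and l for p, l in pairs)
    return TP, FP, TN, FN

def accuracy_from_counts(TP, FP, TN, FN):
    """Accuracy = correct predictions / all predictions."""
    return (TP + TN) / (TP + FP + TN + FN)

# Hypothetical toy data.
predictions = [True, True, False, False, True, False]
labels = [True, False, False, True, True, False]
counts = confusion_counts(predictions, labels)
```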

True positive and true negative rates

Balanced error rate (BER)
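The balanced error rate can be sketched as one minus the average of the true positive rate and the true negative rate. The counts passed in below are hypothetical, and the sketch assumes both classes are present so that neither denominator is zero.

```python
def balanced_error_rate(TP, FP, TN, FN):
    """BER = 1 - (TPR + TNR) / 2."""
    TPR = TP / (TP + FN)  # fraction of actual positives predicted positive
    TNR = TN / (TN + FP)  # fraction of actual negatives predicted negative
    return 1 - 0.5 * (TPR + TNR)

# A classifier that never predicts positive on 8 negatives and 2 positives.
trivial_ber = balanced_error_rate(TP=0, FP=0, TN=8, FN=2)
```

The trivial classifier has 80% accuracy on this toy split but a BER of 0.5, the same as random guessing, which is what makes the BER a useful diagnostic for imbalanced data.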

Precision, recall, and F1 scores

F1 score
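Precision, recall, and the F1 score can be computed from the same counts. This is a sketch on hypothetical counts; it assumes at least one positive prediction and one positive label, so no division by zero is handled.

```python
def precision_recall_f1(TP, FP, FN):
    """Return (precision, recall, F1) from confusion counts."""
    precision = TP / (TP + FP)  # of the predicted positives, how many are right
    recall = TP / (TP + FN)     # of the actual positives, how many are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Hypothetical counts.
precision, recall, f1 = precision_recall_f1(TP=3, FP=1, FN=2)
```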

Significance testing

Randomly shuffle the data (so that the train and test sets come from the same distribution)
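One way to sketch this step is shown below. The helper name `shuffled_split`, the fixed seed, and the 50/50 split are assumptions made for illustration; the notebook may split differently.

```python
import random

def shuffled_split(data, train_fraction=0.5, seed=0):
    """Shuffle a copy of the data, then cut it into train and test parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(data)      # copy, leaving the caller's data untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train, test = shuffled_split(range(10))
```

Without the shuffle, any ordering in the file (for example by date or by product) would leak into the split and make the two halves systematically different.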

Predict overall rating from ABV

Fit the two models (with and without the feature)

Residual sum of squares for both models

F statistic (results may vary for different random splits)
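For two nested least-squares models, the F statistic can be computed from the two residual sums of squares. The sketch below uses the standard nested-model formula; the argument names and the numbers in the example are hypothetical, with `p1 < p2` being the parameter counts of the smaller and larger model and `n` the number of observations.

```python
def f_statistic(rss1, rss2, p1, p2, n):
    """F = ((RSS1 - RSS2) / (p2 - p1)) / (RSS2 / (n - p2))."""
    numerator = (rss1 - rss2) / (p2 - p1)  # error reduction per added parameter
    denominator = rss2 / (n - p2)          # residual variance of the larger model
    return numerator / denominator

# Hypothetical values: adding one feature lowers the RSS from 120 to 100.
F = f_statistic(rss1=120.0, rss2=100.0, p1=1, p2=2, n=52)
```

A large value suggests the extra feature reduces the error by more than chance alone would explain; turning it into a p-value requires the F distribution with `(p2 - p1, n - p2)` degrees of freedom.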

Regression in TensorFlow

Small dataset of fantasy reviews

Predict rating from review length

First check the coefficients if we fit the model using sklearn

Convert features and labels to TensorFlow structures

Build TensorFlow regression class

Initialize the model (lambda = 0)

Train for 1000 iterations of gradient descent (a more careful stopping criterion could be used instead)
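To show what the training loop is doing, here is a plain-Python stand-in rather than TensorFlow code: gradient descent on mean squared error with an optional L2 penalty `lamb` on the slope. The function name, learning rate, and toy data are assumptions; real review lengths would likely need rescaling before a learning rate like this converges.

```python
def fit_linear_gd(xs, ys, lamb=0.0, lr=0.05, iterations=1000):
    """Minimise mean((theta0 + theta1*x - y)^2) + lamb * theta1^2."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        residuals = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Gradients of the objective with respect to each parameter.
        grad0 = 2 * sum(residuals) / n
        grad1 = 2 * sum(r * x for r, x in zip(residuals, xs)) / n + 2 * lamb * theta1
        theta0 -= lr * grad0
        theta1 -= lr * grad1
    return theta0, theta1

# Toy data generated exactly by y = 1 + 2x.
theta0, theta1 = fit_linear_gd([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```

With `lamb = 0` the result should approach the ordinary least-squares coefficients, which is the comparison against sklearn that the notebook makes.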

Confirm that we get a similar result to what we got using sklearn

Make a few predictions using the model

Classification in TensorFlow

Predict whether rating is above 4 from length

Convert to TensorFlow structures

TensorFlow classification class

Initialize the model (lambda = 0)

Run for 1000 iterations
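The classification loop can be sketched the same way in plain Python (again a stand-in, not the TensorFlow class): gradient descent on the mean logistic loss, with an optional L2 penalty `lamb` on the slope. All names, the learning rate, and the toy data are assumptions.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic_gd(xs, ys, lamb=0.0, lr=0.1, iterations=1000):
    """Minimise the mean logistic loss + lamb * theta1^2."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Predicted probability minus label, per example.
        errors = [sigmoid(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / n
        grad1 = sum(e * x for e, x in zip(errors, xs)) / n + 2 * lamb * theta1
        theta0 -= lr * grad0
        theta1 -= lr * grad1
    return theta0, theta1

# Toy data that is not perfectly separable, so the fit stays finite.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
theta0, theta1 = fit_logistic_gd(xs, ys)
probabilities = [sigmoid(theta0 + theta1 * x) for x in xs]
```

Each entry of `probabilities` is the model's estimate that the corresponding example is positive; thresholding at 0.5 turns these into labels.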

Model predictions (as probabilities via the sigmoid function)

Regularization pipeline

Read only the first 5000 reviews (deliberately small, so that the model will overfit unless carefully regularized)

Fit a simple bag-of-words model (see Chapter 8 for more details)

1000 most popular words

Bag-of-words features for 1000 most popular words
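A minimal bag-of-words sketch, using three made-up reviews and a vocabulary of 3 words instead of 1000. The tokenisation (lowercasing and stripping punctuation), the alphabetical tie-breaking, and the trailing constant feature are illustrative assumptions.

```python
import string
from collections import Counter

def tokenize(text):
    """Lowercase, drop punctuation, split on whitespace."""
    cleaned = ''.join(c for c in text.lower() if c not in string.punctuation)
    return cleaned.split()

def build_vocabulary(texts, size):
    """Map the `size` most frequent words to feature indices."""
    counts = Counter(w for t in texts for w in tokenize(t))
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {w: i for i, (w, _) in enumerate(ranked[:size])}

def featurize(text, word_id):
    """Word counts over the vocabulary, plus a constant offset feature."""
    features = [0] * len(word_id)
    for w in tokenize(text):
        if w in word_id:
            features[word_id[w]] += 1
    return features + [1]

# Hypothetical reviews.
reviews = ["Great beer, great head!", "Bitter beer.", "Great aroma"]
word_id = build_vocabulary(reviews, size=3)
X = [featurize(r, word_id) for r in reviews]
```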

Unregularized model (train on training set, test on test set)

Regularized model ("ridge regression")

Complete regularization pipeline

Track the model which works best on the validation set

Train models for different values of lambda (or C, the inverse regularization strength used by scikit-learn classifiers). Keep track of the best model on the validation set.

Using the best model from the validation set, compute the error on the test set
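The pipeline can be sketched end to end on a toy problem. To stay self-contained, the sketch swaps the bag-of-words model for a one-feature ridge regression without an offset, whose minimiser has a closed form; the data, the candidate values of lambda, and all function names are hypothetical.

```python
def ridge_slope(points, lamb):
    """Closed-form minimiser of sum((theta*x - y)^2) + lamb * theta^2."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / (sxx + lamb)

def mse(points, theta):
    """Mean squared error of the model y = theta * x."""
    return sum((theta * x - y) ** 2 for x, y in points) / len(points)

def select_lambda(train, valid, lambdas):
    """Fit on train for each lambda; keep the one with lowest validation MSE."""
    best = None
    valid_errors = {}
    for lamb in lambdas:
        theta = ridge_slope(train, lamb)
        err = mse(valid, theta)
        valid_errors[lamb] = err
        if best is None or err < best[2]:
            best = (lamb, theta, err)
    return best[0], best[1], valid_errors

# Hypothetical toy data as (x, y) pairs.
train = [(1.0, 2.4), (2.0, 3.7), (3.0, 6.5)]
valid = [(1.5, 2.9), (2.5, 5.1)]
test = [(0.5, 1.1), (3.5, 6.8)]
lambdas = [0.0, 0.1, 1.0, 10.0]

best_lambda, best_theta, valid_errors = select_lambda(train, valid, lambdas)
# The test set is touched only once, after model selection is finished.
test_error = mse(test, best_theta)
```

Keeping the test set out of the selection loop is the design point: the validation error is used for choosing lambda, so only the test error remains an unbiased estimate of generalisation.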

Plot the train/validation/test error associated with this pipeline

Precision, recall, and ROC curves

Same data as the pipeline above, but a slightly larger sample

Simple bag-of-words model (as in pipeline above, and in Chapter 8)

Predict whether the ABV is above 6.7 (roughly, above average) from the review text

Train on first 9000 reviews, test on last 1000


To compute precision and recall, we want the output probabilities (or scores) rather than the predicted labels

Build a simple data structure that contains the score, the predicted label, and the actual label (on the test set)

For example...

Sort this so that the most confident predictions come first

For example...

Receiver operating characteristic (ROC) curve
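A sketch of how ROC points can be traced from sorted scores: walk down the ranking from most to least confident and record the false positive rate and true positive rate after each example. The scores and labels are made up, the function names are illustrative, and the sketch assumes distinct scores and at least one example of each class.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, starting from (0, 0)."""
    ranked = sorted(zip(scores, labels), reverse=True)  # most confident first
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under a list of (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores and true labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [True, False, True, True, False]
points = roc_points(scores, labels)
auc = area_under_curve(points)
```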

Precision-recall curve
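The precision-recall curve can be traced in the same way: after each of the top-k most confident predictions, record the recall and the precision among those k. The data is hypothetical and the sketch assumes distinct scores and at least one positive label.

```python
def precision_recall_points(scores, labels):
    """Return (recall, precision) after each of the top-k predictions."""
    ranked = sorted(zip(scores, labels), reverse=True)  # most confident first
    n_pos = sum(labels)
    tp = 0
    points = []
    for k, (_, label) in enumerate(ranked, start=1):
        tp += label  # True counts as 1, False as 0
        points.append((tp / n_pos, tp / k))
    return points

# Hypothetical classifier scores and true labels.
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [True, False, True, True, False]
pr_points = precision_recall_points(scores, labels)
```

Recall can only grow as k increases, while precision may rise or fall, which gives the curve its characteristic sawtooth shape.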



Count occurrences of each style

Build one-hot encoding using common styles
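A sketch of counting categories and one-hot encoding only the common ones. The style names, the `min_count` threshold, and the choice to encode rare styles as an all-zero vector are assumptions for illustration.

```python
from collections import Counter

def common_categories(values, min_count):
    """Categories that occur at least `min_count` times, in sorted order."""
    counts = Counter(values)
    return sorted(c for c, n in counts.items() if n >= min_count)

def one_hot(value, categories):
    """One indicator per common category; rare values map to all zeros."""
    return [1 if value == c else 0 for c in categories]

# Hypothetical style labels.
styles = ["IPA", "Stout", "IPA", "Lager", "Stout", "IPA", "Saison"]
categories = common_categories(styles, min_count=2)
encoded = [one_hot(s, categories) for s in styles]
```

Restricting the encoding to common styles keeps the feature vector short and avoids fitting a separate parameter to a style seen only once or twice.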

Compute and report metrics


Balance the classifier using the 'balanced' option


Precision/recall curves


Model pipeline


Fit the classification problem using a regular linear regressor