Data is available for download; save a copy to your own directory


Example based on Bay-Area bike-share data. Extract the time information from the events.
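A minimal sketch of the timestamp extraction, assuming each event carries a start-time string like "2015-08-31 08:15"; the field format and the `parse_hour` helper are illustrative assumptions, not the dataset's exact schema.

```python
import time

def parse_hour(timestring):
    # Parse a "YYYY-MM-DD HH:MM" string into a struct_time, then
    # reduce it to an integer hour index (hours since the epoch)
    t = time.strptime(timestring, "%Y-%m-%d %H:%M")
    return int(time.mktime(t)) // 3600

hour = parse_hour("2015-08-31 08:15")
```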

Find the earliest event (so that we can sort events from the first to the last hour)

Count events by hour
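The two steps above can be sketched as follows, using toy hour indices in place of the real parsed timestamps; shifting by the earliest event makes the first observed hour index 0.

```python
from collections import defaultdict

events = [3, 5, 3, 4, 10, 3]  # toy hour indices (hours since epoch)

earliest = min(events)  # earliest event, used as hour zero

countsPerHour = defaultdict(int)
for h in events:
    countsPerHour[h - earliest] += 1  # shift so the first hour is index 0

# counts, ordered from the first to the last observed hour
counts = [countsPerHour[h] for h in range(max(events) - earliest + 1)]
```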

Autoregressive feature representation. Here we don't include a bias term, though we could easily include one.
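A sketch of the feature construction: each row collects the series values at a few chosen lags, and the model is fit by least squares with no bias term. The lag set and the toy series here are illustrative; on the real data the lags would include the previous hour and one week (168 hours) ago.

```python
import numpy as np

def autoregressive_features(series, lags):
    # Each row contains the observations at the given lags
    # (e.g. 1 hour ago, ..., 168 hours ago); the target is the
    # current observation. No bias/intercept term is included.
    maxLag = max(lags)
    X = [[series[t - l] for l in lags] for t in range(maxLag, len(series))]
    y = series[maxLag:]
    return np.array(X), np.array(y)

series = list(range(100))  # toy series; real data: hourly ride counts
X, y = autoregressive_features(series, lags=[1, 2, 24])
theta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)  # least squares, no bias
```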

The observation one week ago is the most predictive, followed by the observation from the previous hour:

Sliding window

Parse ratings and timestamps from (a small fraction of) Goodreads fantasy novel data

Sort observations by time

Keep track of a window (wSize) of ratings and timestamps (the raw time is kept just for plotting)

Use a dynamic-programming approach to build the sliding window

X and Y coordinates for plotting
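The steps above can be sketched as follows, on toy (timestamp, rating) pairs standing in for the Goodreads data. The "dynamic-programming" trick is simply to update a running sum as the window slides, rather than re-summing wSize ratings at every step; `wSize` and the toy data are assumptions.

```python
wSize = 4  # window size (the real data would use a larger window)

# toy (timestamp, rating) pairs, sorted by time
obs = sorted([(t, (t * 7) % 5 + 1) for t in range(20)])

X, Y = [], []       # X: timestamps, Y: sliding-average rating
window_sum = 0
for i, (t, r) in enumerate(obs):
    window_sum += r                      # add the newest rating
    if i >= wSize:
        window_sum -= obs[i - wSize][1]  # drop the rating leaving the window
    if i >= wSize - 1:
        X.append(t)                      # x-coordinate: raw time, for the plot
        Y.append(window_sum / wSize)     # y-coordinate: window average
```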

FPMC in TensorFlow

Extract the interaction data, including the timestamps associated with each interaction

Interaction with timestamp

Sort interactions by time (including interaction sequences for each user). Useful when building data structures that include adjacent pairs of interactions (but consider whether this is desirable when making train/test splits!).

Build a data structure including users, items, and their previous items
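A sketch of this step on toy data: group interactions by user, sort each user's sequence by time, then pair every interaction with the one immediately before it to get (user, item, previous item) triples. The variable names are assumptions.

```python
from collections import defaultdict

# toy (user, item, timestamp) interactions
interactions = [(0, 'a', 1), (0, 'b', 3), (0, 'c', 2),
                (1, 'a', 5), (1, 'c', 6)]

interactionsPerUser = defaultdict(list)
for u, i, t in interactions:
    interactionsPerUser[u].append((t, i))

# (user, item, previous item) triples: sort each user's sequence by
# time, then pair every interaction with the one immediately before it
userItemPrev = []
for u in interactionsPerUser:
    seq = sorted(interactionsPerUser[u])
    for (_, j), (_, i) in zip(seq, seq[1:]):
        userItemPrev.append((u, i, j))
```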

Define the TensorFlow model. Similar to models from Chapter 5, with the addition of the term associated with the previous interaction.

FPMC class. UI and IJ are given as initialization options, allowing us to exclude certain terms (for exercises later).
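A framework-agnostic sketch of the FPMC scoring function (the version described here is in TensorFlow; this NumPy version only shows the model structure). The score is a user/item compatibility term plus an item/previous-item compatibility term, and the UI/IJ flags toggle each term, as in the class described above. Dimensions and initialization scale are assumptions.

```python
import numpy as np

class FPMC:
    # NumPy sketch of FPMC scoring. UI / IJ toggle the
    # user/item and item/item (previous-item) terms.
    def __init__(self, nUsers, nItems, K, UI=1, IJ=1, seed=0):
        rng = np.random.default_rng(seed)
        self.UI, self.IJ = UI, IJ
        self.gammaUI = rng.normal(scale=0.1, size=(nUsers, K))  # user factors
        self.gammaIU = rng.normal(scale=0.1, size=(nItems, K))  # item factors (vs. user)
        self.gammaIJ = rng.normal(scale=0.1, size=(nItems, K))  # item factors (vs. prev item)
        self.gammaJI = rng.normal(scale=0.1, size=(nItems, K))  # prev-item factors

    def score(self, u, i, j):
        # Compatibility of item i for user u, given previous item j
        s = 0.0
        if self.UI:
            s += self.gammaUI[u] @ self.gammaIU[i]
        if self.IJ:
            s += self.gammaIJ[i] @ self.gammaJI[j]
        return s

model = FPMC(nUsers=5, nItems=10, K=4)
s = model.score(0, 1, 2)
```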

Run 100 batches
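The training loop can be sketched as follows, again in NumPy rather than TensorFlow: each batch performs a BPR-style SGD step over the observed (item, previous item) pairs, sampling a random negative item for each. The learning rate, toy data, and single-term model (item/item only) are assumptions made to keep the sketch short.

```python
import numpy as np

rng = np.random.default_rng(0)
nItems, K = 10, 4
gammaIJ = rng.normal(scale=0.1, size=(nItems, K))
gammaJI = rng.normal(scale=0.1, size=(nItems, K))
triples = [(1, 0), (2, 1), (3, 2)]  # toy (item, previous item) pairs

def step(lr=0.05):
    # One BPR-style batch: for each observed (i, j) pair, sample a
    # negative item k and push score(i, j) above score(k, j)
    for i, j in triples:
        k = int(rng.integers(nItems))
        x = gammaIJ[i] @ gammaJI[j] - gammaIJ[k] @ gammaJI[j]
        g = 1.0 / (1.0 + np.exp(x))   # gradient of log sigmoid(x)
        d = gammaIJ[i] - gammaIJ[k]
        gammaIJ[i] += lr * g * gammaJI[j]
        gammaIJ[k] -= lr * g * gammaJI[j]
        gammaJI[j] += lr * g * d

for _ in range(100):  # run 100 batches
    step()
```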



(still using the hourly bikeshare interaction data from examples above)


(using Goodreads data from examples above)

The FPMC implementation we built above allowed us to control which terms (user/item or item/item) were included.


FISM implementation, using a factorization machine

Factorization machine design matrix. Note that we have two sets of features (the user history, and the target item). Both are of dimension nItems.

Fairly slow and memory-hungry (every row contains a copy of a user's history). Could possibly be implemented faster in TensorFlow.
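A sketch of the design-matrix construction on toy data: each row concatenates two blocks of dimension nItems, the user's (normalized) history with the target item excluded, and a one-hot encoding of the target item. A dense matrix is used here for brevity, which is exactly why the real version is memory-hungry; the toy histories and variable names are assumptions.

```python
import numpy as np

nItems = 6
# toy user histories (sets of item ids) and (user, targetItem, rating) rows
history = {0: {0, 1, 2}, 1: {3, 4}}
rows = [(0, 3, 4.0), (0, 4, 5.0), (1, 0, 3.0)]

# Two feature blocks, each of dimension nItems: the (normalized) user
# history, and a one-hot encoding of the target item
X = np.zeros((len(rows), 2 * nItems))
y = np.zeros(len(rows))
for r, (u, i, rating) in enumerate(rows):
    hist = history[u] - {i}          # exclude the target item itself
    for j in hist:
        X[r, j] = 1.0 / len(hist)    # normalized history features
    X[r, nItems + i] = 1.0           # one-hot target item
    y[r] = rating
```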


(still using Goodreads data)

Regression-based approach: collect the past K interactions (ratings) as features.

Factorization machine-based approach. Copy the same features from the model above, but also include a user term. In theory, this should allow us to learn how sensitive a particular user is to herding.
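The regression-based variant can be sketched as follows on a toy, time-sorted sequence of ratings for one item: the features are simply the K ratings immediately preceding each rating, fit by least squares (here with a bias). The factorization-machine variant would reuse these features and add a user block; K and the toy ratings are assumptions.

```python
import numpy as np

K = 3  # number of previous ratings used as features

# toy: time-sorted ratings of a single item
ratings = [4, 5, 3, 4, 4, 5, 2, 4]

X, y = [], []
for t in range(K, len(ratings)):
    X.append(ratings[t - K:t])   # the K ratings immediately before this one
    y.append(ratings[t])
X, y = np.array(X, dtype=float), np.array(y, dtype=float)

# Linear regression: predict a rating from the previous K ratings.
# Large positive weights would be (weak) evidence of herding.
theta, _, _, _ = np.linalg.lstsq(
    np.column_stack([np.ones(len(X)), X]), y, rcond=None)
```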