------------------------- METAREVIEW ------------------------
Summary of strengths and weaknesses of the paper:
The idea of incorporating the time intervals between interactions into sequential recommendation is very interesting. The paper is generally well written, with good motivation. The evaluation is reasonably thorough. Nonetheless, some important related works on sequential recommendation are missing, for example, BERT4Rec [1]. Including these methods in the comparison would further strengthen the paper.
[1] BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
The writing can also be further improved. For example, it is quite difficult to understand Figure 3.
Discussion:
There was no discussion of this paper, as all reviewers are in favor of accepting it.
Main reasons for final recommendation of acceptance or rejection:
Overall, the paper is well written and the idea is interesting. It is recommended for acceptance.

----------------------- REVIEW 1 ---------------------
SUBMISSION: 734
TITLE: Time Interval Aware Self-Attention for Sequential Recommendation
AUTHORS: Jiacheng Li, Yujie Wang and Julian McAuley
----------- Paper Clarity -----------
SCORE: 3 (Above Average)
----------- Interest to Audience -----------
SCORE: 3 (Medium)
----------- Paper Significance -----------
SCORE: 3 (Above Average)
----------- Strengths -----------
1. The idea of incorporating the time intervals between interactions into sequential recommendation is very interesting.
2. The self-attention-based method for time-interval-aware sequential recommendation seems reasonable.
3. This paper is in general well written.
----------- Weaknesses -----------
1. The absolute time intervals between actions (e.g., 5.5 days) are not incorporated. Only relative time intervals are used.
2. According to the experimental results, incorporating time intervals does not always improve the recommendation performance.
3. Some important related works are missing.
----------- Overall Evaluation -----------
SCORE: 1 (Weak accept)
----- TEXT:
This paper presents a sequential recommendation method that incorporates the time intervals between interactions. The proposed method is based on self-attention and combines the previously interacted items, the order of these interactions, and the time intervals between them to make recommendations. The authors conduct experiments on several public datasets to verify the performance of their approach. This paper is in general well written. My comments on this paper are as follows:
1. Only relative time intervals are used in the proposed method. The absolute time intervals between actions (e.g., 5.5 days) are not incorporated. I wonder whether absolute time intervals could perform better than relative ones.
2. According to the experimental results in Table 5, incorporating time intervals does not always improve the recommendation performance.
3. Some important related works on sequential recommendation are missing, for example, BERT4Rec [1]. These methods should be compared in the experiments.
[1] BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
4. The writing can be further improved. For example, it is quite difficult to understand Figure 3.

----------------------- REVIEW 2 ---------------------
SUBMISSION: 734
TITLE: Time Interval Aware Self-Attention for Sequential Recommendation
AUTHORS: Jiacheng Li, Yujie Wang and Julian McAuley
----------- Paper Clarity -----------
SCORE: 4 (Excellent (Easy to follow))
----------- Interest to Audience -----------
SCORE: 4 (High)
----------- Paper Significance -----------
SCORE: 4 (Excellent)
----------- Strengths -----------
1) The writing is clear and well organized, with good motivation, and diagrams that explain the core thesis and concepts introduced in the paper.
(While I appreciated the full-color film covers, the diagram may be more accessible to others if you include text in it, as some of the titles are not obvious/visible from the image.)
2) The proposed algorithm has a clear and singular hypothesis, which is that taking absolute and relative time positions into account can improve on state-of-the-art sequential recommendation models. Several other well-described state-of-the-art models are used as benchmarks and as baselines upon which this algorithm is built.
3) The approach is novel and has a chance to advance the state of the art in sequential recommendation research.
4) Figure 2 shows data analysis of some of the datasets used, which supports the core hypothesis around time intervals in sequential recommendation.
5) The evaluation is thorough, includes a strong (popularity) baseline, and includes analysis of the effect of the different hyperparameters, as well as different variations on the algorithm (e.g. relative vs absolute timestamps, different latent dimensionality, max sequence length).
6) The authors use publicly available datasets, and describe implementation details and hyperparameter settings that will aid in reproducibility. I am curious if the authors can/will also publish source code in the final version?
----------- Weaknesses -----------
1) No statistical tests were done to measure the confidence of each result.
2) There is a brief discussion of the differences between the datasets in 4.9, but this could be expanded. It would be interesting to compare the performance across different datasets, and discuss why the algorithm might perform better/worse on different datasets, or show on which datasets each algorithm performs best/worst. For example, it seems that most (but not all) algorithms perform best on the Steam dataset. Why? (And why not GRU4REC+, Caser, and MARank?)
3) The datasets used are all product/content *review* datasets.
It would have been interesting to see how the algorithm performed on more play/purchase/consumption-based datasets, as the sequential dynamics are probably different from those of reviewing. As one suggestion, https://link.springer.com/article/10.1007/s11257-018-9209-6 lists a number of datasets used for sequential recommendation that include more click/view/play data, such as the 30MUSIC and NOWPLAYING datasets. The authors could also consider the dataset included with the WSDM Cup 2019 Music Skip Prediction task at https://www.aicrowd.com/challenges/spotify-sequential-skip-prediction-challenge
4) While not the focus of this paper, it would be useful in the future to study the interactions and implicit/explicit feedback in addition to just the time and sequencing. For example, could a model take into account whether a review was positive or negative to help predict what the next item would be?
5) I realize this is common in Deep Learning papers, but in 3.3 it would be helpful to describe what type of “embeddings” are used: how are they computed, and on what datasets?
6) Editing suggestions: Section 4.7 ends in the middle of a sentence (“From Figure 2,”). In 4.9 a sentence is broken into two (“As shown in the Figure 7. Time intervals have…”). Also, it should be “in Figure 7” instead of “in the Figure 7”.
7) Minor typos: “mini-bath” instead of “mini-batch” in 3.6, “MoviesLens” instead of “MovieLens” in 4.6.
----------- Overall Evaluation -----------
SCORE: 2 (Accept)
----- TEXT:
The paper begins by reviewing the recent state of the art in sequential recommenders, from Markov Chains to Recurrent Neural Networks to Attention-based models. The authors propose an improvement over one such algorithm (SASRec) which takes into account absolute and relative time positions of items, in addition to the sequential order (they call their algorithm TiSASRec).
Their evaluation supports their hypothesis that time positions are important to sequential recommendation, and shows that their algorithm is able to effectively model time intervals to improve on the performance of SASRec. They perform an evaluation using several publicly available datasets of sequential item data (reviews), compare their performance against several published algorithms, including a popularity baseline as well as some state-of-the-art approaches, and show the best results across all datasets and algorithms. They also analyze the effects of different variations of their algorithm, such as absolute vs relative time positions, maximum sequence length, and so on. The paper also includes implementation details and hyperparameter settings, which will make it easier to reproduce in future research.
This paper is written and organized well, the proposed algorithm is well motivated with a good baseline and a very clear hypothesis, and the evaluation is thorough and supports the hypothesis. In addition, the area of sequential recommendation is very active, and this work clearly builds upon the recent state of the art and, based on the results, could be influential in guiding future research.

----------------------- REVIEW 3 ---------------------
SUBMISSION: 734
TITLE: Time Interval Aware Self-Attention for Sequential Recommendation
AUTHORS: Jiacheng Li, Yujie Wang and Julian McAuley
----------- Paper Clarity -----------
SCORE: 3 (Above Average)
----------- Interest to Audience -----------
SCORE: 3 (Medium)
----------- Paper Significance -----------
SCORE: 3 (Above Average)
----------- Strengths -----------
1. The problem studied in the paper is interesting and important.
2. The proposed solution for involving time intervals in sequential recommendation is reasonable.
3. The experimental results are convincing.
----------- Weaknesses -----------
1) The description of the proposed TiSASRec in Figure 3 is not clear enough.
It would be better if the authors clarified the relation matrices M^u, M^I and the embedding matrices M_K^P, M_V^P, M_K^R, M_V^R.
2) In TiSASRec, the authors make use of the relative scaled time intervals instead of the absolute time intervals. It would be better to explain why the scaled relative time intervals are advantageous in this task.
3) In Section 3.4, it would be better to denote the matrices and the vectors with boldface symbols.
----------- Overall Evaluation -----------
SCORE: 1 (Weak accept)
----- TEXT:
This paper considers the time intervals between interactions in sequential recommendation, and proposes a new recommendation algorithm, TiSASRec, for modeling both position and time-interval information during the recommendation process. The experimental results on three benchmarks show the effectiveness of the proposed method.
Sequential recommendation has been extensively studied with deep learning models in recent years. This paper proposes an extension of the self-attention mechanism that incorporates time intervals. I have the following comments on the paper:
1) The description of the proposed TiSASRec in Figure 3 is not clear enough. It would be better if the authors clarified the relation matrices M^u, M^I and the embedding matrices M_K^P, M_V^P, M_K^R, M_V^R.
2) In TiSASRec, the authors make use of the relative scaled time intervals instead of the absolute time intervals. It would be better to explain why the scaled relative time intervals are advantageous in this task.
3) In Section 3.4, it would be better to denote the matrices and the vectors with boldface symbols.