**************************************************
* WWW 2019 (accept) *
**************************************************

----------------------- REVIEW 1 ---------------------
PAPER: 1589
TITLE: Modeling Heart Rate and Activity Data for Personalized Fitness Recommendation
AUTHORS: Jianmo Ni, Larry Muhlstein and Julian McAuley

Audience: 3 (Yes, to a large group of people)
Overall score: 1 (weak accept)

----------- Strengths -----------
- Methodology to extract attribute-based and workout-based embeddings that are useful for different tasks that require personalisation.
- Evaluation performed on a large-scale dataset collected in the wild.
- Well-written manuscript. Easy to follow.

----------- Weaknesses -----------
- The reviewer feels that the difficulty of the forecasting tasks is not clear. Quantifying the differences between users and workouts regarding the variability of heart rate and speed within the different sport types would help to assess this. It would also put the errors in context and help assess the utility of the model for real-world applications (beyond just comparing it with baselines). For instance, how good or bad is an RMSE=17 in forecasting heart rate? Is it tolerable for the applications the authors propose?
- More robust baselines should be used. Specifically for:
  - Workout profile forecasting: baselines here should include those that do not take into account the embeddings but just the input sequences (contextual sequences X). Specifically, these might include LSTMs (with attention) that do not consider the embeddings, or a concatenation of predictions from feature-based baselines such as SVR and GBR (one prediction per time step). Another useful baseline could be a multitask model that simultaneously learns speed and heart rate. Leaving the embeddings out of the model would make it possible to assess the need for personalised models for these tasks and the gain from using them.
  - Predicting the user's heart rate in the next seconds or minutes during an ongoing workout: feature-based baselines such as SVR or GBR should also be used for this task. Features could be extracted both from previous heart rates / speed and from the contextual sequences X (as before).
- More detail on how to obtain the attribute embeddings should be provided to better assess the novelty of the contribution.
- In the workout profile forecasting task, users need to indicate the desired time to complete the route. The reviewer feels this is difficult for the user to assess if she hasn't done the route before. But if she has, the speed profile is already recorded in her history, which would simplify the task. The authors should comment on this.
- For workout profile forecasting, the authors indicate that they wish to forecast the heart rate profile but also the speed profile. For this, the user needs to indicate the route and the time to complete it. With this information, and the altitude profile of the route, the reviewer feels that estimating the speed should be straightforward (especially given the correlation between altitude and speed --heart rate-- shown in Figure 4 for some activities such as "biking"). For other activities such as running, the speed and heart rate seem to be constant throughout the workout, which simplifies the task. As indicated earlier, an exploratory analysis of the differences between users and workout types would help to assess this difficulty.
- The reviewer believes that the authors' promise to make the dataset available upon acceptance should not be considered a contribution, since this should be the default to advance research in the area. The same applies to the release of the code.

----------- Summary and review comments -----------
The authors propose an LSTM-based model that infers static user embeddings from user attributes (gender, sport, ...)
and temporal user embeddings from historical workout measurement sequences. They then use these embeddings to (i) estimate the speed and heart rate profile given the activity the user wishes to perform, (ii) predict the user's heart rate in the next seconds or minutes during an ongoing workout, (iii) provide personalised recommendations of routes given workout criteria, and (iv) predict whether a user's heart rate will exceed some threshold if they continue at the current pace during their workout. They evaluated their methodology using a large-scale dataset collected in the wild with data from 887 participants and more than 100 thousand workouts, and compared it with different baselines. Overall, the reviewer feels that it is an interesting contribution that would attract the interest of a large audience. However, further details/analysis are needed to better assess the difficulty of the prediction tasks, the utility of the embeddings, as well as the real improvement with respect to baselines.

----------------------- REVIEW 2 ---------------------
PAPER: 1589
TITLE: Modeling Heart Rate and Activity Data for Personalized Fitness Recommendation
AUTHORS: Jianmo Ni, Larry Muhlstein and Julian McAuley

Audience: 3 (Yes, to a large group of people)
Overall score: 1 (weak accept)

----------- Strengths -----------
- The contribution of the real-world dataset of 250K workouts would be of great value to the digital health community. I hope the authors will do proper anonymization.
- The paper is very well written, with clear explanation of the models, parameters used, as well as of data cleaning for the experimentation. The examples of prediction tasks are quite interesting.
- The model is an application of popular techniques (LSTMs, etc.) to a real-world application, although it is out of my scope to verify just how novel the particular model setup is.

----------- Weaknesses -----------
- It is unfortunate that the dataset has such a huge gender disparity; it would be nice if the authors tried to over-sample or otherwise explore the female data in more detail. It would also be good to know whether Endomondo's user base is just male-heavy, or whether this was some artifact of the sampling strategy (though I appreciate that if the authors of this paper are from Endomondo, they may not want to admit that women don't seem to use their app).
- A bit more explanation of the candidate route generation would be helpful, as it is not a trivial task, and it was glossed over in the paper. How does the app know the route is safe, for example?
- The recommender metrics, although standard in the literature, are not very intuitive to interpret. What does 2.78 RMSE in heart rate prediction mean? How far off would a prediction be, and if it were to be used in an actual application, how often would it make nonsensical or dangerous recommendations?

----------- Summary and review comments -----------
The paper proposes a model (FitRec) for personalized workout data that includes heart rate and GPS. The model is used for two tasks: "workout profile forecasting" and "short term prediction" -- recommending running routes and short-term heart rate prediction. One of the biggest contributions of this work is the dataset, which includes 250K workout records from the Endomondo app (though the authors should be careful not to include the actual GPS data, as it would be a serious breach of privacy). The model uses LSTMs to process the contextual sequences of workout data, uses the learned contextual embeddings for profile forecasting, and uses an encoder-decoder with attention for temporal prediction. The experimentation is done on real-world workout data, and to deal with the noise the authors use derived distance and speed.
The fitness data is sampled at equal intervals (which may suffer from poor resolution, though the examples of experiments seem to show a decent resolution). The model is compared both to a simple user mean and to a multilayer perceptron, as well as subsets of user features (it is interesting to see that the gender marker did not help, but that is because of under-representation of female users). The paper is very well written, and the examples of the experiments provide concrete evidence of the usefulness of the system (beyond AUC numbers). Note that in the introduction, the last two sentences of the second paragraph on page 2 (starting with "Fitness and activity data have attracted interest...") seem to be out of place.

----------------------- REVIEW 3 ---------------------
PAPER: 1589
TITLE: Modeling Heart Rate and Activity Data for Personalized Fitness Recommendation
AUTHORS: Jianmo Ni, Larry Muhlstein and Julian McAuley

Audience: 3 (Yes, to a large group of people)
Overall score: 2 (accept)

----------- Strengths -----------
- the predictive framework presented by the authors is well described and robust
- the related work is well explored
- comparison with the performance of baseline models is appreciated

----------- Weaknesses -----------
- data are not generated by "Web users", so the eligibility of the paper for the Web conference is debatable

----------- Summary and review comments -----------
The paper describes a robust predictive framework presented by the authors and developed based on an interesting dataset generated by wearable devices from users performing diverse physical activities. The goal of the approach is to predict short-term heart rate profiles and recommend suitable activities, e.g. a suitable route for a runner. The performance of the model is presented in comparison with baseline models. The presented work is very robust and convincing, as well as well discussed in all its aspects.
The only issue is that the data taken into account are not strictly Web data, even though the fact that they are generated by wearable devices connected to the Internet and can provide insight about the health of the users somehow overcomes this issue.

------------------------- METAREVIEW ------------------------
There is no metareview for this paper

**************************************************
* AAAI 2019 (reject) *
**************************************************

Reviewer #1

Questions

1. [Summary] Please summarize the main claims/contributions of the paper in your own words.
This paper presents an approach to predict heart rate before and during a given workout activity. The authors use a Long Short-Term Memory model to process the input data. The final goal of the authors is to provide a personalized recommendation tool; to this end they propose to include contextual information in the model.
2. [Relevance] Is this paper relevant to an AI audience?
Relevant to researchers in subareas only
3. [Significance] Are the results significant?
Not significant
4. [Novelty] Are the problems or approaches novel?
Somewhat novel or somewhat incremental
5. [Soundness] Is the paper technically sound?
Technically sound
6. [Evaluation] Are claims well-supported by theoretical analysis or experimental results?
Sufficient
7. [Clarity] Is the paper well-organized and clearly written?
Satisfactory
8. [Detailed Comments] Please elaborate on your assessments and provide constructive feedback.
This paper addresses an interesting problem but does this in a way that sounds trivial and at the same time difficult to follow. The motivation behind the work (first page) gives a good explanation of why the work is interesting but fails to describe the approach presented in the paper. Most importantly, it introduces the acronym LSTM without saying what it is! I think the related works section is quite poor.
The section presenting the approach fails to convey the general idea of the approach. It is only in this section that the acronym LSTM is finally expanded, which makes all the previous part of the paper quite mysterious. Moreover, apart from a reference, the authors do not explain what an LSTM is, why it works, or why it is good for this problem. I think, in this part, the authors should make the effort to explain the same thing but in a way more closely linked to the goal of the approach. A step in this direction would be to move the examples at the end of the two sections Pre-workout prediction and In-workout prediction to the beginning of these sections, to put the reader in the context of the problem. In the Model Structure section, I would like Fig 1 to be better explained; this can give the reader the general idea of the approach and keep them from getting lost in formulas. When (finally) introducing the acronym LSTM, it would be better to give an idea of what an LSTM does, not in detail but the idea. A paper can be based on previous works but it must be self-contained, at least in its general purpose: a reader reading the paper carefully should be able to end up understanding what the paper presents, the approach it is proposing, ... I do not find it possible with this paper. What is MLP? The strong point of the paper is the experiments part. This is well written and complete in its presentation. To be exhaustive, this part lacks a "critique" of the approach presented: where does the approach fail? What are the drawbacks of the presented approach? I think these points should be addressed either in the experiments part (showing examples of wrong predictions) or, more generally, in the conclusions part.
9. [QUESTIONS FOR THE AUTHORS] Please provide questions for authors to address during the author feedback period.
Reading the paper, the thing that made me feel really uncomfortable is the unexplained acronym LSTM; I think this is a quick fix.
The part where the approach is explained, I think, needs to be improved by making the presentation closer to the problem at hand. You need to explain what an LSTM does. Fig 1 can be used to illustrate the approach in the text to make it clearer. When presenting the Endomondo Dataset there is a typo: statistics (I think). I would like to see where the approach fails or what its drawbacks are.
10. [OVERALL SCORE]
5 - Marginally below threshold
11. [CONFIDENCE]
Reviewer is knowledgeable but out of the area
15. Please acknowledge that you have read the author rebuttal. If your opinion has changed, please summarize the main reasons below.
Thanks for your rebuttal. I am sorry to say that you did not convince me that your paper will take our comments into account.

Reviewer #2

Questions

1. [Summary] Please summarize the main claims/contributions of the paper in your own words.
The authors proposed an LSTM-based model to estimate a user's heart rate profile for a candidate activity so as to predict and recommend suitable activities for each individual. They evaluated their model on a novel dataset (collected from endomondo.com, including attributes like heart rate, GPS coordinates as well as other metadata) by demonstrating that the model is capable of learning attribute embeddings, contextual embeddings, and dependencies among contextual information, and by comparing their model against several baselines using criteria like RMSE, MAE, Precision and Recall. Finally, they showed that it had better results.
2. [Relevance] Is this paper relevant to an AI audience?
Relevant to researchers in subareas only
3. [Significance] Are the results significant?
Moderately significant
4. [Novelty] Are the problems or approaches novel?
Novel
5. [Soundness] Is the paper technically sound?
Technically sound
6. [Evaluation] Are claims well-supported by theoretical analysis or experimental results?
Sufficient
7. [Clarity] Is the paper well-organized and clearly written?
Good
8. [Detailed Comments] Please elaborate on your assessments and provide constructive feedback.
1. Route recommendation is a complicated task due to many trivial but necessary problems to consider, such as terrain, temporal traffic condition, weather, and so on. The 3 values (AUC, HR@10, NDCG) they used to verify the route recommendation performance are OK but may not be as practical as they should be.
2. The precision, recall and F1 score of the model don't seem to be better than the other two (MLP & DA-RNN) when the heart rate is under 185 bpm. In reality, there are not so many people who can easily achieve or stand this heart rate (max. 220 - age) for a long time. So maybe there's some bias in it.
9. [QUESTIONS FOR THE AUTHORS] Please provide questions for authors to address during the author feedback period.
1. Did the authors use the wrong RMSE formula? (Where is the root of the RMSE formula in your Evaluation Metrics part?)
2. Due to 1., I question the correctness of the values in Tables 2 & 3.
10. [OVERALL SCORE]
6 - Marginally above threshold
11. [CONFIDENCE]
Reviewer is knowledgeable but out of the area

Reviewer #3

Questions

1. [Summary] Please summarize the main claims/contributions of the paper in your own words.
The paper presents an LSTM-based model that predicts vital signal sequences (heart rate, speed fluctuations, etc.) in the future based on the fitness app user's past. Additionally, it also recommends routes to users that would help them reach their fitness goals. The experimental results demonstrate the applicability of the approach and its success in solving the proposed problems.
2. [Relevance] Is this paper relevant to an AI audience?
Relevant to researchers in subareas only
3. [Significance] Are the results significant?
Significant
4. [Novelty] Are the problems or approaches novel?
Novel
5. [Soundness] Is the paper technically sound?
Has minor errors
6. [Evaluation] Are claims well-supported by theoretical analysis or experimental results?
Sufficient
7. [Clarity] Is the paper well-organized and clearly written?
Good
8. [Detailed Comments] Please elaborate on your assessments and provide constructive feedback.
The paper is nicely written and a pleasure to read. The addressed problem is novel and the obtained results are promising. I would just outline several minor issues that need to be elaborated in the manuscript:
1) The utilized data source, data set, and the problem approached are not novel in the field. In particular, there have been several studies that have utilized Endomondo data for different prediction purposes. Large-scale datasets that include not just Endomondo sequential data but also other social networks for the same users have also been released to the public, including in works published at AAAI in recent years. Ideally, I would love to see those works as baselines in the evaluation section. Since time is constrained, they need to be at least properly cited [1,2,3] and the differences must be clearly explained.
2) The Endomondo data is very noisy. The sampling rate fluctuations are just one of the problems. Another problem is the lack of knowledge about which sensors were used, and the numerous wrong "jumps" in the measurements due to sensor inaccuracies. The above obviously introduces noise into the ground truth and the training set. I did not find information on how you handle this problem.

[1] Farseev, Aleksandr, and Tat-Seng Chua. "TweetFit: Fusing Multiple Social Media and Sensor Data for Wellness Profile Learning." AAAI. 2017.
[2] Farseev, Aleksandr, and Tat-Seng Chua. "Tweet can be fit: Integrating data from wearable sensors and multiple social networks for wellness profile learning." ACM Transactions on Information Systems (TOIS) 35.4 (2017): 42.
[3] Chowdhury, Alok Kumar, et al. "Automatic classification of physical exercises from wearable sensors using small dataset from non-laboratory settings." Life Sciences Conference (LSC), 2017 IEEE. IEEE, 2017.

9. [QUESTIONS FOR THE AUTHORS] Please provide questions for authors to address during the author feedback period.
1) Did you consider the changes in altitude in your data modeling and evaluation? I did not find an explanation of it, while it is very important for a fair comparison of user performances.
2) Any insights on solving the cold-start issue for cases when historical workouts are not available for the users?
3) For the route recommendation, the 100 workout candidates are supposed to all be in a similar geographical location, because it will not be useful to recommend a route which is far from the user's current location. Does this condition hold?
4) How was the noise in sensor measurements handled? Did it affect the results?
10. [OVERALL SCORE]
6 - Marginally above threshold
11. [CONFIDENCE]
Reviewer is knowledgeable in the area
15. Please acknowledge that you have read the author rebuttal. If your opinion has changed, please summarize the main reasons below.
Thanks for answering the questions. Based on authors' clarifications, I'm ready to increase the overall score to M. Accept.

**************************************************
* ICDM 2018 (reject) *
**************************************************

================================================================
--======== Review Reports ========--

The review report from reviewer #1:

*1: Is the paper relevant to ICDM?
[_] No
[X] Yes

*2: How innovative is the paper?
[_] 6 (Very innovative)
[_] 3 (Innovative)
[X] -2 (Marginally)
[_] -4 (Not very much)
[_] -6 (Not at all)

*3: How would you rate the technical quality of the paper?
[_] 6 (Very high)
[X] 3 (High)
[_] -2 (Marginal)
[_] -4 (Low)
[_] -6 (Very low)

*4: How is the presentation?
[_] 6 (Excellent)
[_] 3 (Good)
[X] -2 (Marginal)
[_] -4 (Below average)
[_] -6 (Poor)

*5: Is the paper of interest to ICDM users and practitioners?
[_] 3 (Yes)
[X] 2 (May be)
[_] 1 (No)
[_] 0 (Not applicable)

*6: What is your confidence in your review of this paper?
[X] 2 (High)
[_] 1 (Medium)
[_] 0 (Low)

*7: Overall recommendation
[_] 6: must accept (in top 25% of ICDM accepted papers)
[_] 3: should accept (in top 80% of ICDM accepted papers)
[_] -2: marginal (in bottom 20% of ICDM accepted papers)
[X] -4: should reject (below acceptance bar)
[_] -6: must reject (unacceptable: too weak, incomplete, or wrong)

*8: Summary of the paper's main contribution and impact
• Release of a (large-scale) fitness dataset for running and biking exercises with meta-information, such as gender.
• LSTM-based model to capture inter- and intra-context features to suggest personalised exercise routes with a given duration and intensity.
• For that, two subtasks are solved independently: “Pre-Workout Prediction” estimates the heart rate and speed given a workout time and route, whereas “In-Workout Prediction” predicts the heart rate in the next time-step, given its history and the current route.

*9: Justification of your recommendation
The usage of the norm-length of 450 time-units is too coarse given the maximal interval of 10 minutes between measurements, and the interpolation is invalid. Missing significance tests and diligence as well as unsuitable discussions require a heavy rewrite. The current state does not provide enough material of adequate quality to be accepted.

*10: Three strong points of this paper (please number each point)
1. Release of public fitness dataset
2. Qualitative and quantitative analysis of recommendation
3. Fills a gap of missing academic fitness recommender systems

*11: Three weak points of this paper (please number each point)
1. Missing diligence, apparent in strange typos: heart rage → heart rate, etc., as well as swapped units in tables: speed (BPM) → speed (KMPH)
2. All results are presented without significance tests
3. Dataset analysis is missing key aspects (i.e. gender distribution, country origin of runs, average altitude profile of routes, etc.)
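
Editorial illustration for weak point 2: a minimal sketch of one kind of significance test that could accompany such results, assuming per-workout errors are available for two models evaluated on the same test workouts. The function name, the error lists and the number of resamples are hypothetical and are not taken from the paper or the review; a paired sign-flip permutation test is only one of several reasonable choices.

```python
import random


def paired_permutation_test(errors_a, errors_b, n_resamples=10_000, seed=0):
    """Two-sided paired sign-flip permutation test on the mean error difference.

    errors_a and errors_b are hypothetical per-workout errors of two models
    on the same workouts, in the same order.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("error lists must be paired (same length)")
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each paired difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / n) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Made-up absolute heart-rate errors (bpm) for two models on eight workouts.
    model_a = [12.0, 15.5, 9.8, 20.1, 14.3, 11.0, 17.6, 13.2]
    model_b = [10.5, 14.0, 9.9, 18.0, 12.1, 10.2, 16.0, 12.5]
    print(paired_permutation_test(model_a, model_b))
```

A small returned value suggests that the difference in mean error between the two models is unlikely to be due to chance pairing alone; identical error lists return 1.0.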

*12: Is this submission among the best 10% of submissions that you reviewed for ICDM'18?
[X] No
[_] Yes

*13: Would you be able to replicate the results based on the information given in the paper?
[X] No
[_] Yes

*14: Are the data and implementations publicly available for possible replication?
[X] No
[_] Yes

*15: If the paper is accepted, which format would you suggest?
[_] Regular Paper
[X] Short Paper

*16: Detailed comments for the authors

Summary
This paper develops a fitness track recommendation system using recurrent neural networks. Routes are recommended given the desired training time and heart rate as input. These recommendations are made via a pre-trained LSTM that predicts the heart rate and speed given a distance and altitude sequence. Additionally, a short-term recommendation task is solved, which predicts whether continuing at the current workout intensity will result in passing a certain heart rate threshold. For that, another LSTM is pre-trained to predict the heart rate at the next step given the history of the heart rate, speed and altitude sequence. All predictions are made on a novel dataset consisting of running and cycling exercises.

Originality of work
Only a few publicly available datasets containing workout routines exist, especially for multiple types of movement and at larger scales. Furthermore, the application of a personalised recommender system for running paths is not well studied, possibly due to the lack of public data.

Potential impact of results
Personalised route recommendation is integrated into some fitness tracking portals. However, none of them are academically evaluated and publicly available to cross-check. In combination with the release of the dataset, this publication can have a larger impact in the field.

Quality of execution
While the approach itself is well documented and valid, other parts of this paper show a lack thereof:
• Depicted dataset statistics are missing key attributes, such as distribution over gender and sport type, as well as the average altitude profile and heart rate histograms. This is essential for assessing the presented results.
• The shown procedure of deriving speed as the covered distance over the timespan only yields the average speed during that period. This is especially problematic since the lowest accepted sampling rate is set to ten minutes! On a course where altitude is changing rapidly, this method removes a lot of information. In particular, the norm-length of 450 time-units makes predictions very hard to compare, even for a single runner.
• Reasons for discarding the originally measured speed data are missing (Why unreliable?)
• Interpolation to 10 second intervals (1/60) is not justifiable, especially since the last 10 steps are presented to the network as an input, so the network tries to learn the interpolation function as its prediction function.
• Results are discussed very briefly and significance tests are missing. Sometimes the paper claims to be significantly better than baselines (the test used is not mentioned); in most cases the achieved results are merely said to outperform the baselines used.
• As noted, bodily responses are very subjective, thus a per-user averaged result would have been useful. Are men easier to predict than women?
• Additionally, the generalising power of the model would be of interest: How well does the recommendation work on unknown users? (→ cold start problem).
• Grid ranges for the parameter search are missing, as well as parameter settings for the baselines used.
• Within the recommendation (B) evaluation, threshold values are set without justification (180 bpm already seems very high, no figure displays any HR near that threshold).
  How well does the majority vote baseline (always predict “not exceeding threshold”, zeroed array) perform? The F1 averaging scheme (macro?) is missing.
• The procedure for recommendation task (A) seems reasonable and covers different metrics, although results are hard to interpret (see above). Furthermore, the presented LSTM without contextual input performs worse than the user mean!

Quality of presentation
The paper is well structured and easy to follow. Set goals are worked out well and results presented clearly. In some sections the writing is missing some diligence (i.e. swapped units in table V), or is at least poorly chosen (i.e. 10^7, where the 7 is a footnote).

Adequacy of citations
The paper covers a wide ground of related work, although the part on personalised recommendation is very short.

Overall
Overall, the paper has some great insights and aims to fill the gap for an academic personalised fitness recommendation system. In particular, the release of a public fitness data set would further boost the development of such open systems. However, the data analysis misses some key aspects. Since it is an unpublished source, these are essential for interpreting/assessing published results. The usage of the norm-length of 450 time-units is too coarse given the maximal interval of 10 minutes between measurements, and the interpolation is invalid. Missing significance tests and diligence as well as unsuitable discussions require a heavy rewrite. The current state does not provide enough material of adequate quality to be accepted.

========================================================

The review report from reviewer #2:

*1: Is the paper relevant to ICDM?
[_] No
[X] Yes

*2: How innovative is the paper?
[_] 6 (Very innovative)
[X] 3 (Innovative)
[_] -2 (Marginally)
[_] -4 (Not very much)
[_] -6 (Not at all)

*3: How would you rate the technical quality of the paper?
[_] 6 (Very high)
[X] 3 (High)
[_] -2 (Marginal)
[_] -4 (Low)
[_] -6 (Very low)

*4: How is the presentation?
[X] 6 (Excellent)
[_] 3 (Good)
[_] -2 (Marginal)
[_] -4 (Below average)
[_] -6 (Poor)

*5: Is the paper of interest to ICDM users and practitioners?
[X] 3 (Yes)
[_] 2 (May be)
[_] 1 (No)
[_] 0 (Not applicable)

*6: What is your confidence in your review of this paper?
[_] 2 (High)
[X] 1 (Medium)
[_] 0 (Low)

*7: Overall recommendation
[_] 6: must accept (in top 25% of ICDM accepted papers)
[X] 3: should accept (in top 80% of ICDM accepted papers)
[_] -2: marginal (in bottom 20% of ICDM accepted papers)
[_] -4: should reject (below acceptance bar)
[_] -6: must reject (unacceptable: too weak, incomplete, or wrong)

*8: Summary of the paper's main contribution and impact
1. An LSTM-based model was proposed to consider both intra-workout and inter-workout contextual information.
2. The predictive models can be used on two real-world recommendation tasks: workout route recommendation and short-term heart rate prediction.
3. The proposed model was evaluated against baselines on several personalized recommendation tasks, showing the promise of using wearable data for activity modeling and recommendation.

*9: Justification of your recommendation
The promise of the proposed model in using wearable data for activity modeling and recommendation.

*10: Three strong points of this paper (please number each point)
1. An LSTM-based model was proposed to consider both intra-workout and inter-workout contextual information.
2. The predictive models can be used on two real-world recommendation tasks: workout route recommendation and short-term heart rate prediction.
3. The proposed model was evaluated against baselines on several personalized recommendation tasks, showing the promise of using wearable data for activity modeling and recommendation.

*11: Three weak points of this paper (please number each point)
1. On page 6, the number “2” in the l2 regularizer should be formatted as a subscript.
2. In eq. (13), the meaning of T test should be given in the paper.
3. In the CONCLUSIONS section, it would be better to present some prospects for future work.

*12: Is this submission among the best 10% of submissions that you reviewed for ICDM'18?
[_] No
[X] Yes

*13: Would you be able to replicate the results based on the information given in the paper?
[_] No
[X] Yes

*14: Are the data and implementations publicly available for possible replication?
[_] No
[X] Yes

*15: If the paper is accepted, which format would you suggest?
[X] Regular Paper
[_] Short Paper

*16: Detailed comments for the authors
This paper presents an LSTM-based model that addresses sequential prediction problems in fitness and exercise data, which is the building block for developing personalized applications. The main contributions of this paper are: (1) an LSTM-based model was proposed to consider both intra-workout and inter-workout contextual information; (2) the predictive models can be used on two real-world recommendation tasks: workout route recommendation and short-term heart rate prediction; (3) the proposed model was evaluated against baselines on several personalized recommendation tasks, showing the promise of using wearable data for activity modeling and recommendation. Some minor errors and improvements also need to be considered:
1. On page 6, the number “2” in the l2 regularizer should be formatted as a subscript.
2. In eq. (13), the meaning of T test should be given in the paper.
3. In the CONCLUSIONS section, it would be better to present some prospects for future work.

========================================================

The review report from reviewer #3:

*1: Is the paper relevant to ICDM?
[_] No
[X] Yes

*2: How innovative is the paper?
[_] 6 (Very innovative)
[_] 3 (Innovative)
[X] -2 (Marginally)
[_] -4 (Not very much)
[_] -6 (Not at all)

*3: How would you rate the technical quality of the paper?
[_] 6 (Very high)
[_] 3 (High)
[X] -2 (Marginal)
[_] -4 (Low)
[_] -6 (Very low)

*4: How is the presentation?
[_] 6 (Excellent)
[_] 3 (Good)
[X] -2 (Marginal)
[_] -4 (Below average)
[_] -6 (Poor)

*5: Is the paper of interest to ICDM users and practitioners?
[X] 3 (Yes)
[_] 2 (May be)
[_] 1 (No)
[_] 0 (Not applicable)

*6: What is your confidence in your review of this paper?
[_] 2 (High)
[X] 1 (Medium)
[_] 0 (Low)

*7: Overall recommendation
[_] 6: must accept (in top 25% of ICDM accepted papers)
[_] 3: should accept (in top 80% of ICDM accepted papers)
[X] -2: marginal (in bottom 20% of ICDM accepted papers)
[_] -4: should reject (below acceptance bar)
[_] -6: must reject (unacceptable: too weak, incomplete, or wrong)

*8: Summary of the paper's main contribution and impact
This paper uses RNN-based models to capture the "context" patterns of a user. Two personalized recommendation models are designed, and the paper shows that wearable data can support inference for fitness recommendation.

*9: Justification of your recommendation
The problem is a practical one, but the paper does not show its research value and challenges, and the design is pretty standard.

*10: Three strong points of this paper (please number each point)
- Two novel RNN-based models for personalized prediction and recommendation.
- Two real-world recommendation tasks are implemented and they show the ability of the prediction models.
- Release of a large-scale workout dataset to the public.

*11: Three weak points of this paper (please number each point)
- It is not clear how the workout prediction task is challenging for previous standard models.
- Recommendation might go beyond classification; the authors are encouraged to include more insights on the unique requirements of "fitness suggestion".
- Organization can be improved, e.g., more content on the challenges and analysis, and less on the LSTM definition.

*12: Is this submission among the best 10% of submissions that you reviewed for ICDM'18?
[X] No
[_] Yes

*13: Would you be able to replicate the results based on the information given in the paper?
[X] No
[_] Yes

*14: Are the data and implementations publicly available for possible replication?
[X] No
[_] Yes

*15: If the paper is accepted, which format would you suggest?
[_] Regular Paper
[X] Short Paper

*16: Detailed comments for the authors
See weaknesses above.

========================================================