Reviewer #1 Questions

1. Overall Rating
Accept

2. Interest to Audience. Will this paper attract the interest of WSDM 2021 attendees? Will it be intriguing and inspiring? Might it be highlighted afterwards in the press as particularly innovative?
Interesting to most attendees. Likely to draw significant attention

3. Significance: Does the paper contribute a major breakthrough or an incremental advance? Are other people (practitioners or researchers) likely to use these ideas or build on them? Is it clear how this work differs from previous contributions, and is related work adequately referenced?
Substantial advance, with new methods or new insights that other researchers are likely to build on

4. Experiments: Do the experiments support the claims made in the paper and justify the proposed techniques?
OK, but certain claims are not covered by the experiments

5. Paper Clarity. Is the paper clearly written? Is it well-organized? Is the result section (and/or are the proofs, if applicable) clear enough so that expert readers will understand them and can reproduce the results?
Good: most of the paper was understandable, with minor typos or details that could be improved

6. Summary of the paper (what is being proposed and in what context) and a brief justification of your overall recommendation. (One solid paragraph)
To improve the performance of recommendation models, the authors study the effect of dataset sampling strategies on ranking performance. They design a data-specific sampling strategy, named SVP-CF, which preserves the relative performance of models after sampling. They also develop a method that can suggest a suitable sampling scheme for a given dataset. Experiments are conducted on five public datasets to verify the effectiveness of the proposed approach.

7. Three (or more) strong points about the paper. Please be precise and explicit; clearly explain the value and nature of the contribution.
1.
The authors characterized the effect of sixteen sampling strategies on recommendation performance.
2. The authors proposed SVP-CF, which can preserve the relative performance of models after sampling.
3. The authors developed Data-Genie, which can analyze the performance of different sampling strategies.

8. Three (or more) weak points about the paper. Please clearly indicate whether the paper has any mistakes, missing related work, or results that cannot be considered a contribution. Please be polite, specific, and constructive.
1. The authors claim that the proposed SVP-CF is better than commonly used sampling strategies, but this claim is not supported by some of the experimental results.
2. If the proposed SVP-CF is the best sampling strategy, what is Data-Genie designed for?

9. Detailed Evaluation (Contribution, Pros/Cons, Errors); please number each point and please provide as constructive feedback as possible.
Overall, this paper is properly organized and the experiments are sufficient. Therefore, I suggest 'accept'.

10. Reviewer's confidence
Knowledgeable in this sub-area

Reviewer #2 Questions

1. Overall Rating
Weak Accept

2. Interest to Audience. Will this paper attract the interest of WSDM 2021 attendees? Will it be intriguing and inspiring? Might it be highlighted afterwards in the press as particularly innovative?
Interesting to many attendees

3. Significance: Does the paper contribute a major breakthrough or an incremental advance? Are other people (practitioners or researchers) likely to use these ideas or build on them? Is it clear how this work differs from previous contributions, and is related work adequately referenced?
Substantial advance, with new methods or new insights that other researchers are likely to build on

4. Experiments: Do the experiments support the claims made in the paper and justify the proposed techniques?
Very nicely support the claims made in the paper

5. Paper Clarity. Is the paper clearly written? Is it well-organized?
Is the result section (and/or are the proofs, if applicable) clear enough so that expert readers will understand them and can reproduce the results?
Good: most of the paper was understandable, with minor typos or details that could be improved

6. Summary of the paper (what is being proposed and in what context) and a brief justification of your overall recommendation. (One solid paragraph)
This paper studies the practical consequences of dataset sampling strategies on the ranking performance of recommendation algorithms. Following this idea, the authors characterize the effect of sampling on algorithm performance in detail and design a data-specific sampling strategy that aims to preserve the relative performance of models after sampling. Extensive experiments support their arguments.

7. Three (or more) strong points about the paper. Please be precise and explicit; clearly explain the value and nature of the contribution.
1. This paper is well-written and well-organized.
2. The motivation of this paper is very interesting, and its conclusions will have a far-reaching impact on the recommendation community.
3. Extensive experiments are conducted to verify the effectiveness of the proposed SVP-CF.

8. Three (or more) weak points about the paper. Please clearly indicate whether the paper has any mistakes, missing related work, or results that cannot be considered a contribution. Please be polite, specific, and constructive.
1. Several related works are missing.
2. Some representative GNN-based methods should be introduced and compared.
3. The English should be lightly revised, although it is good enough to be understood.

9. Detailed Evaluation (Contribution, Pros/Cons, Errors); please number each point and please provide as constructive feedback as possible.
1. The authors should discuss related works, e.g., "A Comparative Study of Collaborative Filtering Algorithms" and "Comparative Recommender System Evaluation: Benchmarking Recommendation Frameworks".
2.
In Section 3.1, the authors should select more SOTA GNN-based methods as baselines, e.g., NGCF, LightGCN, HGCF, etc.
3. The authors should present more details about the implementation of the selected methods.

10. Reviewer's confidence
Expert in this problem

Reviewer #4 Questions

1. Overall Rating
Weak Accept

2. Interest to Audience. Will this paper attract the interest of WSDM 2021 attendees? Will it be intriguing and inspiring? Might it be highlighted afterwards in the press as particularly innovative?
Some interest to a large fraction of the attendees or a lot of interest to some attendees

3. Significance: Does the paper contribute a major breakthrough or an incremental advance? Are other people (practitioners or researchers) likely to use these ideas or build on them? Is it clear how this work differs from previous contributions, and is related work adequately referenced?
Moderate advance in methodology or understanding of phenomena, likely to be useful to others

4. Experiments: Do the experiments support the claims made in the paper and justify the proposed techniques?
OK, but certain claims are not covered by the experiments

5. Paper Clarity. Is the paper clearly written? Is it well-organized? Is the result section (and/or are the proofs, if applicable) clear enough so that expert readers will understand them and can reproduce the results?
Average: the main points of the paper were understandable, but some parts were not clear

6. Summary of the paper (what is being proposed and in what context) and a brief justification of your overall recommendation. (One solid paragraph)
The authors study the problem of sampling collaborative filtering datasets; in particular, they address the problem of evaluating an algorithm's performance on a sub-sample of the dataset instead of the full dataset.
The authors have multiple goals: first, evaluating multiple state-of-the-art algorithms under different scenarios of user-provided feedback, with different sampling strategies and multiple metrics; second, providing a novel sampling strategy, SVP-CF, that employs a proxy-based approach to quantify the importance of the data points to be included in samples, in order to cope with multiple aspects such as heterogeneity and missing data; lastly, providing a novel strategy to pick the best way to sample a dataset so as to maintain the original performance of a model on the sampled dataset. Overall, the paper studies an important problem for the community and the proposed methodology seems reasonable. The experimental evaluation is extensive, and the results may be useful for future work. The paper could still be improved in clarity and in some details of the experimental evaluation. I give the paper a weak accept.

7. Three (or more) strong points about the paper. Please be precise and explicit; clearly explain the value and nature of the contribution.
1. The authors study an important problem for the community.
2. The proposed sampling methodology seems reasonable.
3. The authors perform an extensive experimental evaluation to validate their claims.

8. Three (or more) weak points about the paper. Please clearly indicate whether the paper has any mistakes, missing related work, or results that cannot be considered a contribution. Please be polite, specific, and constructive.
1. The experimental evaluation is not entirely clear to me (see P1 of point 9).
2. The presentation can be improved (see P2 of point 9).
3. The paper lacks rigorous formalization (see P3 of point 9).

9. Detailed Evaluation (Contribution, Pros/Cons, Errors); please number each point and please provide as constructive feedback as possible.
P1.
The experimental evaluation has some points that are not entirely clear to the reader:
- First, how the users' feedback is generated is not clear. The results depend heavily on this procedure, so explaining better how this step is performed, and adopting different procedures for this task, may yield results that are more robust to statistical noise.
- The value of p used to obtain Table 1 is not reported.
- Table 1 lacks the variance of the results obtained with the different sampling strategies on the datasets (as stated in the introduction, the variance is important in many tasks).
- There is no explanation of why the chosen metrics are important for the specific tasks, or why the authors adopt only these metrics.

P2. Some sections could be rewritten to improve clarity, such as:
- "Coreset selection" in Related Work: it is not clear to the reader why such approaches cannot be easily adapted to the setting of CF data; perhaps the authors should argue this point better.
- "Evaluating sample quality" in Related Work: this does not really provide many insights to the reader; perhaps the authors should stress in this section how they address these issues in their work.
- Providing pseudocode for SVP-CF could improve the overall readability, since that section is not entirely easy to follow.
- Some results presented in the "Discussion" section (such as those on CO_2 consumption) would also make sense in the Introduction if properly adapted. Additionally, the authors should discuss the metrics they adopt; even though they are fairly standard, they should present their definitions and discuss them (a table may be sufficient).

P3. The mathematical formulation lacks rigor: the set \mathcal{U} is not defined, r_i^{*u} = 1 is never defined, the natural logarithm is written both as log and as ln, and the domain of p% is never defined.
The authors should ensure that the paper follows a uniform notation and that all quantities are defined properly.

10. Reviewer's confidence
Generally aware of the area