Reviewer 5 (Primary SPC (meta-reviewer))

The Meta-Review

This paper has scored highly on average, but with divided opinion among the reviewers. The merits of the work are agreed upon by all reviewers.

On the writing/presentation:
* Well written, with the problem setting well defined.

On the technical merits:
* Novel, effective, plausible method to generate usage data under a black-box setting.
* A realistic setting for an adversarial attack, where the attacker cannot access the training data of the recommender, or the recommender model.
* A large number of experiments with some interesting analysis.

Some important criticisms are raised by two of the reviewers:
* The model has several implicit assumptions that aren't discussed in the paper and, in particular, that user interaction data is required for the profile pollution attack, somewhat contradicting a main contribution that the attack is 'data-free'.
* The idea is very similar to existing SOTA, particularly one method proposed in the context of NLP models.
* The method should have been evaluated against the LOKI baseline.

Having read the reviews carefully and some of the related work, I am inclined towards acceptance, with strong advice to the authors to address the concerns raised by the reviewers. The authors should clarify that the user data required for the profile pollution attack is separate from the training data, and discuss how many such users would be expected to be available, relative to the size of the training set. I am inclined to accept that there is sufficient novelty in the application of this methodology to sequential recommenders, albeit that similar ideas have been applied to NLP models. Finally, I accept the authors' explanation of the difficulty of obtaining the LOKI baseline, and note that the general consensus among the reviewers is that the evaluation was strong.

Final Recommendation after Meta-Review
Accept if space permits: I would argue for accepting this paper.

Reviewer 1 (Secondary SPC (SPC reviewer))

Expertise
Expert

Your Review
This paper suggests querying a target black-box recommender system in order to generate seemingly genuine sequences. Given this data, model extraction is applied, which enables two types of attack. The problem of black-box attacks is of great importance to the recsys community, and any work that probes it may have a significant contribution.
==================================================
Main positive points about the paper:
* Novel, effective, plausible and simple method to generate usage data under a black-box setting.
* Well written.
* Problem setting is well defined.
* Experimented on several recommendation models.
==================================================
Main negative points (elaborated below):
* The paper isn't focused on its main contribution.
* The model has several implicit assumptions that aren't discussed in the paper.
* Missing both basic and state-of-the-art baselines.
==================================================
Detailed review:
The algorithm consists of two stages:
1. Obtaining mock users by querying the recommender system and training a surrogate system.
2. Attacking a recommender using the surrogate white-box model.
The main drawback of LOKI compared to this work is that the former assumes the attacker has read access to genuine users. This work removes that assumption via the first stage, which queries the target recommender.
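To make the setting concrete, here is a minimal, hypothetical sketch of that first stage as I understand it. The query_blackbox_topk() stub and all the constants are my own assumptions (the stub returns random rankings just so the snippet runs on its own), and the distillation loss used to fit the surrogate is omitted; this is an illustration of the querying idea, not the authors' implementation.

```python
import random

NUM_ITEMS = 1000        # assumed size of the item catalogue I
SEQ_LEN = 20            # assumed length of each generated mock sequence
NUM_MOCK_USERS = 500    # assumed number of synthetic users to generate
TOP_K = 10              # assumed length of the ranked list returned per query

def query_blackbox_topk(sequence, k=TOP_K):
    """Hypothetical stand-in for the target recommender's query API.
    A real attack would query the deployed system; here we return a
    random ranked list so the sketch is self-contained and runnable."""
    candidates = [i for i in range(NUM_ITEMS) if i not in sequence]
    return random.sample(candidates, k)

def generate_mock_user():
    """Autoregressively build one synthetic interaction sequence,
    keeping every (prefix, ranked list) pair returned by the black box
    as a distillation label along the way."""
    seq = [random.randrange(NUM_ITEMS)]      # random seed item
    pairs = []
    while len(seq) < SEQ_LEN:
        topk = query_blackbox_topk(seq)
        pairs.append((list(seq), topk))
        seq.append(random.choice(topk))      # sample the next item from the returned list
    return seq, pairs

# Stage 1: build the synthetic ("data-free") training set and its labels.
mock_data = [generate_mock_user() for _ in range(NUM_MOCK_USERS)]
distillation_pairs = [pair for _, pairs in mock_data for pair in pairs]
print(f"collected {len(distillation_pairs)} (prefix, ranked-list) pairs")

# A white-box surrogate (e.g. a small sequential model) would then be trained
# so that its ranking on each prefix matches the black box's ranked list; that
# distillation step, and the second (attack) stage, are not shown here.
```

The point of the sketch is simply that every label the surrogate is trained on comes from queries to the target system rather than from its training data, which is what makes the first stage the interesting, "data-free" part of the method.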
Once mock users are obtained, the problem setting converges to that of LOKI, i.e., the second stage, which attacks a recommender using a surrogate white-box model. The novel contribution is the first stage, and the second stage should prove its superiority compared to other methods. Given a surrogate model, the optimal *next item* can be found by trying each of the items in I as the appended one. Of course, this is a sub-optimal strategy, as it considers only the immediate next item to append. Perhaps a better way is LOKI, which considers the entire sequence of appended items, and therefore should have been evaluated. Moreover, this paper doesn't apply the aforementioned optimal next-item approach, but rather Algorithm 1, which only approximates it. The reason (I assume, as it's not described in the paper) for applying an approximation is improved running time. However, line 3 in the algorithm means that the attacker can access the interaction data used for training. If there are many such users, this contradicts the main contribution of the paper (that no access to training data is assumed). If there are few such users, then the effect of the attack is very limited, running time isn't an issue, and the exact next-item search could be used instead of Algorithm 1.

Table 4, White-Box-Random: this model is trained on random data, and therefore the similarity between a pair of items doesn't reflect any semantic relation. Algorithm 1 should thus output random items, and the expected performance should be low. However, the results are much better than random, and it should at least be discussed how this can be.

Implicit assumptions:
* The authors explicitly assume the adversary can append items to existing users. However, they also implicitly assume read access to those users' interaction histories.
* High stability. According to "Stability of Recommendation Algorithms", stability measures the extent to which a recommendation algorithm provides predictions that are consistent with each other; the attack implicitly relies on the target recommender being highly stable.
* Eq. 4 is very reasonable, as it captures the differences between ranked items and random ones. \lambda_1 is shared for all i in [1,k), which implicitly assumes a uniform distribution of s_w. It's fine to assume this, and it was previously validated in "A Black-Box Attack Model for Visually-Aware Recommender Systems", but it should be explicitly mentioned.
* Line 5 in Algorithm 1 implicitly assumes that the optimal item is close to t, and therefore computes the gradient in this neighborhood. I'm not sure this always happens. As an example from NLP, feeding the word "the" won't cause an RNN to predict "the" as the next word, but more likely some noun. In any case, this should be stated explicitly for the reader.

Minor comments:
* The paper divides attacks on recommenders into profile pollution attacks and data poisoning attacks. However, there's another type of attack: adversarial attacks on side information, e.g., "A Black-Box Attack Model for Visually-Aware Recommender Systems" and "Adversarial Item Promotion: Vulnerabilities at the Core of Top-N Recommenders that Use Images to Address Cold Start".
* To my understanding, "data poisoning" is a synonym of the widely used term "shilling attack" (e.g., [21] and "Shilling attacks against recommender systems: a comprehensive survey"), so this could be mentioned.
* In both algorithms, z should be strictly less than T.
* t isn't defined in the equation in line 361.
* indicate*s* in line 545.
* According to section 5.4, about 10% of adversarial items are added, but section 5.1.3 claims that only 1% is added.
* Line 717, the clause "it can be hard to detect" isn't clear.
==================================================
Further details:
I think the paper should focus on its first phase and discard the second phase. This is because the second phase has some problematic assumptions, could be improved, and isn't evaluated against a strong baseline (LOKI).

Human Participants Research
Does not report on human-participants research

Ethical considerations
(blank)

Review Rating
Probably reject: I would argue for rejecting this paper.

Reviewer 2 (PC)

Expertise
Knowledgeable

Your Review
==================================================
(1) Summarize in a few sentences the contribution, and its relevance to recommender systems:
This paper shows the vulnerability of black-box sequential recommender models. By model extraction, a white-box model is learned which behaves similarly to the black-box model. The attack on the white-box model can be transferred to the black-box model. The experiments on profile pollution and data poisoning demonstrate the vulnerability of sequential recommender systems.
==================================================
(2) List (2 or 3) of the main positive points about the paper, i.e., arguments for acceptance:
This paper considers a black-box attack which relies on a surrogate model obtained by means of distillation. The experimental results demonstrate that sequential recommender systems can be attacked by attacking the extracted model. This shows an effective two-stage attack against sequential recommender systems: model extraction and attack transfer.
==================================================
(3) List (2 or 3) of the main negative points about this paper, i.e., arguments for rejection:
The paper considers a data-free setting, which means no training data is available for the surrogate model. The extracted model is trained with limited API queries, and the difference between the black-box and white-box recommenders then needs to be minimized. I think some experiments should show how many API queries are needed for the extracted model to achieve satisfactory performance. This would show to what extent the method in the paper is truly data-free.
==================================================
(4) Detailed review:
(blank)
==================================================
(5) Any further details about your specific expertise or perspective on this topic:
(blank)

Human Participants Research
Does not report on human-participants research

Ethical considerations
NA

Review Rating
Probably accept: I would argue for accepting this paper.

Reviewer 3 (PC)

Expertise
Knowledgeable

Your Review
==================================================
The authors consider the problem of black-box attacks in sequential recommenders, and specifically that of "model extraction", i.e., 'stealing the weights' of a sequential recommender. This, to the best of my knowledge, hasn't been studied before in the literature.
The authors consider a rather challenging setting of a black-box attack, where the training data is not accessible to the attacker. Their approach consists of synthetic data generation and knowledge distillation: specifically, they distill the black-box model (based on synthetic data and their labels) into a white-box model; they then attack the black-box model via samples generated by the white-box recommender.
==================================================
Pros:
* They consider a rather realistic setting of an adversarial attack on a recommender, where the attacker cannot access the training data of the recommender, or the recommender model.
* They introduce a novel distillation-based approach to approximate the black-box recommender model with a white-box model, which they can then use to create adversarial examples, eventually attacking the original black-box recommender.
* The model class under attack, sequential recommenders, is a rather popular and effective one in academia and industry.
==================================================
Cons/Things to improve:
* Other sequential recommendation models, besides the autoregressive one, could be considered as attack targets, so as to better understand the effectiveness of the attack.
* Considering target recommenders that have a defense mechanism, and showing how the proposed attack performs against them, could be interesting.
Having said that, I think that the paper, as is, already has sufficiently interesting experiments showcasing the efficacy of this black-box distillation-based approach.
==================================================

Human Participants Research
Human participants research but would not require review

Ethical considerations
(blank)

Review Rating
Definite accept: I would argue strongly for accepting this paper.

Reviewer 4 (PC)

Expertise
Expert

Your Review
The paper presents a framework to extract an unknown sequential recommender's weights without any access to its training data. On several datasets, the proposed method can "copy" the model successfully, and some downstream attacks with the "copied" model are studied. There are some highlights in the paper; however, I'd consider it a borderline paper due to the limited novelty.

* Positives
- The paper does not only consider model extraction, but also shows ways to use the extracted model for several downstream attacks, like profile pollution and data poisoning. The technique proposed in this paper doesn't assume any training data, which is more practical, and thus this kind of attack poses a larger threat.
- The paper is well-written in general and is easy to follow (except for notation).
- A large number of experiments are done, with some interesting analysis like cross-model extraction, influence of budget (though I think budget is not a big constraint in recsys), etc.

* Negatives
- Novelty: The main concern I have is the novelty of the proposed framework (sec 3). To me, the idea is very similar (query API -> training data -> distillation) to extracting NLP models [1], which was studied about two years ago, so I'm not so surprised that similar things can be done for sequential recommenders.
- Better distinguish from related works: Regarding the challenges of model extraction on sequential recommenders, I don't think (1) and (2) are unique to seqrec problems; they also exist for model extraction in NLP [1]. I'd suggest considering more challenges unique to sequential recommendation (distilling a ranked list could be one), to distinguish from existing works.
* Typos/formatting issues
- Notation is a bit chaotic; consider adding a table of all notation used.
- Text in Fig. 2 is too small to read.

[1] Thieves on Sesame Street! Model Extraction of BERT-based APIs, https://arxiv.org/abs/1910.12366

Human Participants Research
Does not report on human-participants research

Ethical considerations
(blank)

Review Rating
Borderline: Overall I would not argue for accepting this paper.