Primary SPC (meta-reviewer) review (reviewer 4)
Score: 4/5

The Meta-Review
Dear authors, thank you very much for your submission to RecSys'22. The three reviewers agree that bundle recommendation is an underexplored and relevant topic, and they consider the proposed conversational model a novel contribution. They also highlight the experiments as a strength of the presented work, although some limitations and issues should be clarified. I hope their comments and suggestions help you to continue and improve your research.
Best regards

----------------------------------------------------------------

Secondary SPC (SPC reviewer) review (reviewer 1)
Score: 4/5

Expertise: Passing Knowledge

Summary of Contributions
The authors introduce a novel conversational model for recommending bundles, based on a Markov Decision Process in which different agents decide whether to ask a question or recommend the bundle. The agents are trained on previous conversations through encoding.

Strengths
Bundle recommendation is a "niche" topic with respect to item recommendation. The paper is well written and fairly clear. The authors validate the proposed model by comparing it to state-of-the-art approaches offline.

Weaknesses
The authors base their evaluation on simulated users, with the implicit assumption that these users never change their mind (each simulated user has a bundle in mind, and the whole interaction attempts to match the attributes of the items in this bundle, and then the items themselves). However, it is well known that users build their preferences while they interact with the recommender system, and they change their minds. Thus, a user study involving real users should be conducted for an in-depth evaluation of the approach. The authors do present a study with people, but they showed those people traces of the interaction and asked whether the traces were plausible, which is not the same as interacting with users.
The authors should also report the statistical significance of the differences between the best results and the other ones (a minimal sketch of such a paired test is given after this review).

Detailed Review
This paper is interesting and well written. However, I have concerns about the validation (see above). Another issue to consider is that the authors only reference related work on recommendation based on neural networks. However, conversational recommender systems have existed for a long time, and there is an important thread of work that exploits other inference mechanisms to steer the interaction with the user. These systems are typically also validated with real users. I recommend checking work by the following authors:
- Francesco Ricci and colleagues
- Li Chen and Pearl Pu
- Giuseppe Carenini
There are also works on bundle recommendation based on constraint solvers, and they are relevant as well; see, e.g., work by Markus Zanker, Alexander Felfernig, Paolo Dragone, and Agung Toto Wibowo.
There are some typos. In the validation, clarify what is meant by the Accuracy metric (as it is not F1). The figures on page 13 are unreadable because they are too small (and they appear to be missing a caption).

Human Participants Research: Human participants research but would not require review

Ethical considerations: Some human participants were involved in the test, but they were recruited through AMT, so I suppose they are anonymous. Moreover, they only had to estimate the plausibility of the interaction traces generated by the system. Furthermore, the authors do not report any personal data of these participants, which makes me think that no information about them is known. So, I see no ethical concerns about this test.

Review Rating: Probably accept: I would argue for accepting this paper.
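[Editorial note: to make the statistical-significance request above concrete, the following is a minimal sketch of a paired test over per-user metric scores. It is not taken from the submission or the reviews; the scores are placeholders and all names are illustrative. The only assumption is that per-user scores of the best model and of a baseline are available on the same users.]

# Minimal sketch of a paired significance test over per-user metric scores
# (e.g., per-user recall of the best model vs. a baseline on the same users).
# The numbers below are placeholders, not results from the paper.
import numpy as np
from scipy import stats

best_model_scores = np.array([0.42, 0.35, 0.50, 0.28, 0.61, 0.44])  # hypothetical per-user scores
baseline_scores = np.array([0.38, 0.33, 0.41, 0.30, 0.55, 0.40])    # hypothetical per-user scores

# Paired t-test: assumes the per-user differences are roughly normal.
t_stat, t_p = stats.ttest_rel(best_model_scores, baseline_scores)

# Wilcoxon signed-rank test: non-parametric alternative on the same pairs.
w_stat, w_p = stats.wilcoxon(best_model_scores, baseline_scores)

print(f"paired t-test:        t = {t_stat:.3f}, p = {t_p:.4f}")
print(f"Wilcoxon signed-rank: W = {w_stat:.1f}, p = {w_p:.4f}")

Either test, with a multiple-comparison correction when several baselines are involved, would be one way to address the reviewer's request.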
----------------------------------------------------------------

PC review (reviewer 2)
Score: 3/5

Expertise: Passing Knowledge

Summary of Contributions
Recommending a set of items to users usually faces the challenges of sparsity and a large output space. This paper aims to improve the performance of bundle recommendation through multiple rounds of communication. The authors first define the Bundle MCR recommendation task. They then propose a new framework that formulates Bundle MCR as Markov Decision Processes with multiple agents. They also use a two-stage training strategy to pre-train and fine-tune the model for better performance. Experimental results validate the improvement.

Strengths
• Bundle recommendation with multiple rounds of conversation is a practical application problem.
• The presentation is good. Figures and tables illustrate the problem definition and motivation.
• Extensive experiments and a human evaluation validate the effectiveness of the proposed method.

Weaknesses
• There are presentation flaws; see D1 and D2.
• An analysis of the number of slots should be included in the experiments; see D3.
• The efficiency-effectiveness trade-off might be a problem; see D4.

Detailed Review
D1: It is confusing that the policy \pi_C is not mentioned or listed in Section 4.2 but is shown in Figure 2.
D2: The figure on page 13 has no figure number (Figure 3?).
D3: This paper proposes to improve recommendation performance via multiple rounds of communication, so the number of slots to consult in each round is a key hyper-parameter. It would be better if the authors presented an empirical analysis of the effect of slot size.
D4: BUNT-Learn has only a slight advantage over the one-shot methods in Figure 3a, but it may cost a user more time, requiring even more than 7 rounds of communication, as shown in Figure 3b. Thus, the trade-off between efficiency and effectiveness should be further discussed and improved (a toy version of the multi-round loop, with the slot and round budgets as explicit parameters, is sketched after this review).

Human Participants Research: Does not report on human-participants research

Ethical considerations: (blank)

Review Rating: Borderline: Overall I would not argue for accepting this paper.
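[Editorial note: the toy sketch below illustrates the kind of multi-round loop described in the summaries above: a simulated user holds a fixed target bundle (the assumption reviewer 1 criticizes), and in each round the system either asks about attribute slots or recommends a bundle. It is not the authors' implementation; the classes, the alternating ask/recommend rule, and all names are invented stand-ins, and slots_per_round and max_rounds correspond to the hyper-parameters discussed in D3 and D4.]

# Toy sketch of a Bundle MCR-style conversation loop (illustrative only).
# A simulated user holds a fixed target bundle; the system alternates between
# asking about attribute slots and recommending a bundle.

class SimulatedUser:
    """Holds a fixed target bundle and answers attribute questions about it."""

    def __init__(self, target_items, item_attributes):
        self.target_items = set(target_items)
        self.liked_attributes = set()
        for item in self.target_items:
            self.liked_attributes |= item_attributes[item]

    def answer(self, asked_attributes):
        # Confirms exactly the attributes that appear in the target bundle;
        # preferences never change during the conversation (reviewer 1's concern).
        return {a: (a in self.liked_attributes) for a in asked_attributes}

    def accepts(self, recommended_items):
        return self.target_items.issubset(set(recommended_items))


def run_episode(user, candidate_items, item_attributes, all_attributes,
                max_rounds=7, slots_per_round=2, bundle_size=3):
    """One conversation. The learned policy is replaced by a trivial rule that
    alternates between asking and recommending, purely for illustration."""
    confirmed, rejected = set(), set()
    for turn in range(max_rounds):
        if turn % 2 == 0:  # ask turn: query a few unknown attribute slots
            unknown = [a for a in all_attributes if a not in confirmed | rejected]
            for attr, liked in user.answer(unknown[:slots_per_round]).items():
                (confirmed if liked else rejected).add(attr)
        else:              # recommend turn: rank items by overlap with confirmed attributes
            ranked = sorted(candidate_items,
                            key=lambda i: len(item_attributes[i] & confirmed),
                            reverse=True)
            bundle = ranked[:bundle_size]
            if user.accepts(bundle):
                return turn + 1, bundle  # rounds used, accepted bundle
    return max_rounds, None              # conversation budget exhausted


# Tiny usage example with made-up items and attributes.
item_attributes = {"i1": {"red", "shoe"}, "i2": {"red", "bag"},
                   "i3": {"blue", "hat"}, "i4": {"blue", "scarf"}}
user = SimulatedUser(["i1", "i2"], item_attributes)
rounds, bundle = run_episode(user, list(item_attributes), item_attributes,
                             all_attributes={"red", "blue", "shoe", "bag", "hat", "scarf"})
print(rounds, bundle)

With a loop of this shape, D3 could be studied by sweeping slots_per_round, and D4's trade-off by recording both the success rate and the number of rounds used.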
----------------------------------------------------------------

PC review (reviewer 3)
Score: 4/5

Expertise: Knowledgeable

Summary of Contributions
In this paper, the authors propose a model to address the multi-round conversational recommendation problem in the context of bundle recommendation. The experiments were conducted on 4 offline datasets, together with a human study, to evaluate the effectiveness of the proposed approach.

Strengths
The paper's topic is practical and relevant to the scope of the conference. In general, the paper is well presented, and the research questions are clearly communicated. The state of the art is adequately covered. The research contribution is a novel idea of formulating Bundle MCR as Markov Decision Processes with multiple agents, including a model framework for offline pre-training and online fine-tuning of parameters. The offline experiments were well executed, and some findings are quite interesting.

Weaknesses
This work lacks a detailed description of the human evaluation.

Detailed Review
The paper's topic is practical and relevant to the scope of the conference. In general, the paper is well presented, and the research questions are clearly communicated. The state of the art is adequately covered. The research contribution is a novel idea of formulating Bundle MCR as Markov Decision Processes with multiple agents, including a model framework for offline pre-training and online fine-tuning of parameters. The offline experiments were well executed, and some findings are quite interesting.
One downside of this work is the description and discussion of the human evaluation. The experimental protocol is missing, and it is completely unclear how the study was conducted. For example, how did the participants evaluate the pair of conversation trajectories? How were they instructed to rate the pair? Was it a within-subject study? Moreover, having only 5 MTurk workers participate appears to be an insufficient sample size. These open questions call the validity of the human study into question. The authors did a good job conducting the offline experiments; however, there are quite a few issues with the human evaluation.

Human Participants Research: Human participants research but would not require review

Ethical considerations: (blank)

Review Rating: Probably accept: I would argue for accepting this paper.

----------------------------------------------------------------