------------------------- METAREVIEW ------------------------
This paper presents a new approach to generative multi-behavior sequential recommendation. Overall, the reviewers are positive about this work, and I recommend accepting it. The reviewers also have some minor suggestions for improvement; I suggest the authors incorporate these points into the camera-ready version.

----------------------- REVIEW 1 ---------------------
SUBMISSION: 1211
TITLE: Generative Multi-Behavior Sequential Recommendation
AUTHORS: Zihan Liu, Yupeng Hou and Julian McAuley

----------- Overall recommendation -----------
SCORE: 0 (borderline paper)
----------- Relevance to CIKM -----------
SCORE: 5 (excellent)
----------- Originality of the Work -----------
SCORE: 4 (good)
----------- Technical Soundness -----------
SCORE: 4 (good)
----------- Quality of Presentation -----------
SCORE: 4 (good)
----------- Impact of Ideas or Results -----------
SCORE: 3 (fair)
----------- Reproducibility of Methods -----------
SCORE: 3 (fair)
----------- Detailed Comments to the Author(s) -----------
This paper proposes a novel multi-behavior sequential recommendation model named MBGen. Specifically, it is able to predict the behavior type and the item when a target behavior type is given. To achieve this, a balanced tokenizer, a unified generative recommendation paradigm, and a position-routed sparse architecture are proposed. The experiments are conducted on two publicly available datasets, and the code is provided. The weak and strong points are summarized as follows:

Strong:
1. Treating the behavior type as both a feature and a target is interesting. It is valuable to investigate a novel paradigm for modeling user behavior, and this paper gives the audience some insights, for example feeding the behavior token together with the item token as input, and the balanced tokenizer.
2. The code is available and almost all important baselines are considered. The experiments are detailed.
Weak:
1. For behavior type prediction, the authors need to give more insight into how to use this capability in a real application. What is the difference compared with existing multi-scenario recommendation? For example, when a user adds an item to their cart, the system will predict whether they will add another item or pay for it. A detailed description of a real product is needed. Besides, what is the difference compared with intention recognition?
2. What is the difference between the tokenization and pre-trained embeddings? How are novel IDs (cold start) handled?
3. How can the proposed model be deployed in a real application or production system? More analysis should be given, because the key problem in applying large models to recommender systems is achieving low latency and high throughput.

----------- Summary to support your recommendation -----------
See detailed comments.

----------------------- REVIEW 2 ---------------------
SUBMISSION: 1211
TITLE: Generative Multi-Behavior Sequential Recommendation
AUTHORS: Zihan Liu, Yupeng Hou and Julian McAuley

----------- Overall recommendation -----------
SCORE: 1 (weak accept)
----------- Relevance to CIKM -----------
SCORE: 5 (excellent)
----------- Originality of the Work -----------
SCORE: 4 (good)
----------- Technical Soundness -----------
SCORE: 3 (fair)
----------- Quality of Presentation -----------
SCORE: 3 (fair)
----------- Impact of Ideas or Results -----------
SCORE: 4 (good)
----------- Reproducibility of Methods -----------
SCORE: 4 (good)
----------- Detailed Comments to the Author(s) -----------
This paper proposes a generative multi-behavior sequential recommender called MBGen. MBGen constructs a token sequence by tokenizing the items in a given heterogeneous sequence and predicts future interactions with three different tasks. The authors additionally use user ID hashing, a mixture of experts, and behavior injection to further boost performance.
MBGen was evaluated on two datasets and compared with 13 baselines from three categories.

Strengths:
- As far as I know, this is the first work to develop a generative recommendation framework in the multi-behavior sequential setting.
- MBGen can process both item IDs and item features.
- MBGen can predict future interactions with different tasks, which may help provide recommendations in diverse scenarios.
- MBGen consistently outperformed all baselines with large improvement rates.
- The code is made publicly available.

Weaknesses:
- The authors could use more datasets for their experiments. They used the Retail and IJCAI datasets while excluding the Yelp dataset because it violates their assumption. However, there are more datasets for benchmarking multi-behavior sequential recommenders, such as Retailrocket (e.g., used in https://ieeexplore.ieee.org/abstract/document/10446828 and https://dl.acm.org/doi/10.1145/3534678.3539342) and RecSys Challenge 2015 (e.g., https://dl.acm.org/doi/10.1145/3616855.3635857). With experiments on so few datasets, it is unclear how well MBGen performs on different domains and platforms.
- While the paper proposes prepending a user token to the token sequence, the effect of this design was not investigated in the ablation study.
- The high improvement rates reported in the paper are very impressive. However, it is unclear where such improvements come from, because the simpler versions of MBGen tested in the ablation study still achieved much higher performance than the baselines.
  Please further investigate the causes of this and/or discuss potential reasons.
- The w/o PR & BI version performed better than the full version for the target behavior prediction task on the Retail dataset. Please discuss why.
- Please explain how each model was scaled up in the scalability analysis. It would be great if the authors could also add experiments on efficiency (e.g., training and inference time).

Other comments:
- In the beginning, the paper first argues the importance of the two-step prediction and then proposes MBGen as a solution. To me, however, this prediction doesn't necessarily require the generative part (or item tokenization) of MBGen. Given that this part is a strong, novel contribution of this work, I would change the storyline in the abstract and introduction.
- L674, `not designed for retrieval tasks`: recommendation tasks?
- L980: The paper proposes a behavior-aware sampling method here. The method description should be moved to the Methods section.

----------- Summary to support your recommendation -----------
The proposed generative approach to the multi-behavior sequential recommendation is novel and flexible. Its improvement is very impressive. On the other hand, the experiments part could be further improved.

----------------------- REVIEW 3 ---------------------
SUBMISSION: 1211
TITLE: Generative Multi-Behavior Sequential Recommendation
AUTHORS: Zihan Liu, Yupeng Hou and Julian McAuley

----------- Overall recommendation -----------
SCORE: 1 (weak accept)
----------- Relevance to CIKM -----------
SCORE: 4 (good)
----------- Originality of the Work -----------
SCORE: 4 (good)
----------- Technical Soundness -----------
SCORE: 4 (good)
----------- Quality of Presentation -----------
SCORE: 5 (excellent)
----------- Impact of Ideas or Results -----------
SCORE: 4 (good)
----------- Reproducibility of Methods -----------
SCORE: 5 (excellent)
----------- Detailed Comments to the Author(s) -----------
Strength:
1. This paper is well written, offering a clear exposition of the problem of MBSR and a thoughtful solution.
2. The innovative concept of integrating a generative recommendation paradigm by simultaneously generating the next behavior and item tokens is inspiring. The experiments also illustrate that this framework exhibits a much better scaling curve than baseline models. This finding provides promising support for scaling up the parameters for better results and may inspire more studies in this area.
3. The authors conduct extensive and rigorous experiments that robustly demonstrate the superiority of their proposed methods over existing MBSR models, with improvements of up to 70%.

Weakness:
I don’t see any weakness.

----------- Summary to support your recommendation -----------
The paper presents a novel approach to multi-behavioral sequential recommendation (MBSR) by incorporating the target behavior type into the learning process. The authors introduce MBGen, a framework that models MBSR as a consecutive two-step process: first, predicting the next behavior type to understand user intent, and second, forecasting the subsequent item. The benchmark experiments fully demonstrate the effectiveness of the methods and the scaling feature also provides promising results for larger models which is very useful for future work in this area.

----------------------- REVIEW 4 ---------------------
SUBMISSION: 1211
TITLE: Generative Multi-Behavior Sequential Recommendation
AUTHORS: Zihan Liu, Yupeng Hou and Julian McAuley

----------- Overall recommendation -----------
SCORE: 1 (weak accept)
----------- Relevance to CIKM -----------
SCORE: 3 (fair)
----------- Originality of the Work -----------
SCORE: 3 (fair)
----------- Technical Soundness -----------
SCORE: 3 (fair)
----------- Quality of Presentation -----------
SCORE: 3 (fair)
----------- Impact of Ideas or Results -----------
SCORE: 3 (fair)
----------- Reproducibility of Methods -----------
SCORE: 3 (fair)
----------- Detailed Comments to the Author(s) -----------
Strengths
1. MBGen presents a pioneering approach to MBSR by integrating the prediction of behavior types into its learning objective, offering a fresh perspective in the field of recommendation systems.
2. The paper's data-driven methodology, which tokenizes behaviors and items into a unified sequence, allows the model to capture fine-grained user interaction patterns, potentially enhancing the accuracy and personalization of recommendations.
3. Through the proposed position-routed sparse network structure, MBGen can efficiently scale up model parameters without a proportional increase in computational cost, making it more viable for large-scale datasets.

Weaknesses
1. The sophisticated design of MBGen, with components like the position-routed sparse network and multi-task learning framework, might increase the complexity of model implementation and debugging.
2. The performance of MBGen is heavily reliant on the availability of comprehensive and unbiased user behavior data; any shortcomings in data quality could adversely affect the model's recommendation effectiveness.
3. While the model shows promising results on public datasets, the paper does not extensively discuss its generalizability across different recommendation scenarios or domains, which may limit its applicability in a broader context.

----------- Summary to support your recommendation -----------
MBGen is a promising framework for multi-behavior sequential recommendation that enhances the performance and personalization of recommendation systems through innovative methods. Despite potential challenges such as implementation complexity and data dependency, its innovation and effectiveness offer new avenues for research in the field of recommender systems. Future work could explore more effective behavior-aware sampling methods and improved ways to model user-behavior dependencies within the generative multi-behavior framework.

----------------------- REVIEW 5 ---------------------
SUBMISSION: 1211
TITLE: Generative Multi-Behavior Sequential Recommendation
AUTHORS: Zihan Liu, Yupeng Hou and Julian McAuley

----------- Overall recommendation -----------
SCORE: 1 (weak accept)
----------- Relevance to CIKM -----------
SCORE: 5 (excellent)
----------- Originality of the Work -----------
SCORE: 3 (fair)
----------- Technical Soundness -----------
SCORE: 4 (good)
----------- Quality of Presentation -----------
SCORE: 4 (good)
----------- Impact of Ideas or Results -----------
SCORE: 4 (good)
----------- Reproducibility of Methods -----------
SCORE: 5 (excellent)
----------- Detailed Comments to the Author(s) -----------
SUMMARY
This paper proposes a generative framework for multi-behavior sequential recommendation. It first divides the task into two stages: predicting the behavior type and then the item in the sequence. In the proposed method, behaviors and items are converted into tokens, forming a single token sequence for inference.
The authors evaluate their approach on two datasets and compare it with several baseline methods to demonstrate its effectiveness.

Strong points
1. The paper's approach is effective and shows significant improvement on three different tasks.
2. The balanced behavior-aware tokenizer is innovative. It reduces the model’s decision space and computational cost by using shorter input sequences. The ablation study and analytical experiments are convincing.
3. The code is open source. The experiments are sufficient and the setup is clear and easy to follow.
4. The paper is clearly written and easy to understand.

Weak points
1. The motivation for the two-step modeling is less convincing. There may not be an order between the intention and the item. In other words, different items may result in different behavior from the user. The formula in line 322 resembles the prediction goal of pCTCVR [1]. However, a sequential relationship exists between actions (such as "click" and "purchase") but may not exist between intention and item.
2. The multi-task capability is achieved by feeding the model different prompts, which may limit the novelty of the model.
3. The organization of the method section is somewhat confusing. Specifically, Section 3.3 reads more like an overview and discussion, and Section 3.4 is rather brief for a section on model details.

[1] Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate

----------- Summary to support your recommendation -----------
This paper is an exploration of generative recommendation for multi-behavior sequential scenarios, which has some innovative and practical significance.