Reviews For Paper
Track Knowledge Management
Paper ID 241
Title Leveraging Social Connections to Improve Personalized Ranking for Collaborative Filtering

Masked Reviewer ID: Assigned_Reviewer_1
Overall Rating Accept
Top 3 Strengths 1. The paper presents the idea of SocialBPR that incorporates social behavior
into the user personalized ranking. Essentially the ordering imposed is that
users prefer personal items over the items bought by their friends which is in
turn over other random items. This observation made by the authors is
interesting and is also shown to be validated from the training data for
multiple online social networks.

2. The paper is well written and has a nice readable flow. The introduction,
modeling and experimental sections are especially well written. Further, all
design choices are well motivated.

3. I also liked the fact that the authors tested and invalidated the counter
claim that the so called social items are "more negative" than random items.

4. The empirical analysis is comprehensive and considers almost all baseline
systems for one-class recommendation problems. Although the gains are fairly
weak and may not have much practical impact, the authors have done a good job of
presenting the results.
Top 3 Weaknesses 1. The demonstrated gains of S-BPR are fairly weak. On most of the datasets the
AUC changes only in the second decimal place, from around 0.73 to 0.75. An AUC of
0.75 typically means that the algorithm is placing the chosen item in the top
25% of the overall list. What is the impact of such improvements in a real-world
setting?

2. AUCs close to 0.5 mean that the recommendation quality is mostly random. Why
is that the case for the Epinions data?

3. It is unclear why static sampling outperforms the other methodologies, and the
explanations provided do not resolve this. Since adaptive sampling chooses the
negative example that is closest to the positive, I would have expected it to
perform best. Do you suppose that it would help to start with static
sampling and switch to adaptive sampling towards the end?
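
The reading of AUC in weakness 1 can be checked on a toy example. The sketch below is a minimal, assumed formulation of per-user AUC (the fraction of chosen/non-chosen item pairs that are ordered correctly); the function name `auc_for_user` and the toy scores are hypothetical and are not taken from the paper.

```python
def auc_for_user(scores, positives):
    """Fraction of (chosen, non-chosen) item pairs that are ranked correctly.

    scores:    dict mapping item -> predicted score for one user (assumed input)
    positives: set of held-out items the user actually chose
    Ties count as half a correct pair.
    """
    pos = [s for item, s in scores.items() if item in positives]
    neg = [s for item, s in scores.items() if item not in positives]
    if not pos or not neg:
        raise ValueError("need at least one chosen and one non-chosen item")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy data: the chosen item "b" outranks 3 of the 4 non-chosen items.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.5}
example_auc = auc_for_user(scores, {"b"})
print(example_auc)  # 0.75
```

Under this definition, moving from 0.73 to 0.75 means the chosen item outranks roughly two more non-chosen items per hundred, which is why the question about real-world impact matters.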
Detailed Comments 1. The authors also need to provide details about the running time of the proposed
algorithms. For instance, the adaptive sampling is fairly expensive and it would
be useful to measure the convergence rates not in terms of number of iterations,
but in terms of wall clock times.

2. Why do you think SBPR1 beats all the baselines? Both SBPR1 and SBPR2 (which
are somewhat conflicting) beat all the other baselines, so clearly we should be
able to do better by using both somehow.

3. Equation 6: Should be "- regularization" since you are maximizing.
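
The sign issue in detailed comment 3 can be illustrated with a generic BPR-style criterion. This is only a sketch under assumed notation (training triples $D$, predicted scores $\hat{x}_{ui}$, parameters $\Theta$, regularization weight $\lambda_\Theta$); it is not the paper's Equation 6, which is not reproduced in this review.

```latex
% Maximization form: the regularizer is subtracted.
\max_{\Theta} \; \sum_{(u,i,j) \in D} \ln \sigma\!\left(\hat{x}_{ui} - \hat{x}_{uj}\right)
  \;-\; \lambda_{\Theta} \lVert \Theta \rVert^{2}

% Equivalent minimization form: the regularizer is added.
\min_{\Theta} \; -\sum_{(u,i,j) \in D} \ln \sigma\!\left(\hat{x}_{ui} - \hat{x}_{uj}\right)
  \;+\; \lambda_{\Theta} \lVert \Theta \rVert^{2}
```

Adding the regularizer to an objective that is being maximized would reward large parameter norms, which is the opposite of the intended effect.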
Author feedback needed? Yes
What specific feedback do you like the authors to provide? Please address the weaknesses (W1, W2, W3) and the detailed comments (D1, D2).

Masked Reviewer ID: Assigned_Reviewer_2
Overall Rating Accept
Top 3 Strengths a. The paper is of great interest
b. The paper proposes a ranking algorithm that outperforms the state-of-the-art of the same kind (one class recommendation problems)
c. The paper uses real world experimental data and the datasets are both big and complete enough for safe conclusions
Top 3 Weaknesses a. Related work section should be more specific when describing the state-of-the-art MR-BPR technique and the difference between this and the proposed method
b. Results analysis should include an explanation of the nearly identical results that SBPR1 and SBPR2 achieve
c. Figures that show how the SBPR1 algorithm outperforms the baseline and SBPR2 methods (AUC line) should be included
Detailed Comments The paper is of great interest as it develops a ranking algorithm (Social-BPR) that uses the user's social information for item recommendation. The evaluation experiments are conducted on four big and complete datasets, which allows us to say that the results are trustworthy. The results show that the proposed algorithm outperforms the state-of-the-art methods both in cold-start cases and in cases where users have much observable historical information. On the other hand, the authors don’t provide a convincing explanation of the nearly identical results that SBPR1 and SBPR2 achieve (these algorithms treat social information in opposite ways). Moreover, a more specific description of the state-of-the-art MR-BPR method would help the reader understand what is new about the proposed algorithm. Finally, it would help readers fully understand the paper if more figures showing how the SBPR1 algorithm outperforms the baseline and SBPR2 methods (e.g. the AUC line) were included.
Author feedback needed? No

Masked Reviewer ID: Assigned_Reviewer_3
Overall Rating Accept
Top 3 Strengths - Extensive experiments
- Well written for the most part
- nice approach to an interesting problem
Top 3 Weaknesses - Confusing notation
- Problems in experiments and how results are displayed
- related work is missing
Detailed Comments This paper presents an interesting approach to an important problem: that of providing recommendations to users who have few or no ratings in a system. The authors employ the social structure available in many real-life applications and show, through experiments on 4 different datasets, the effectiveness of their approach.

In general, the paper is well-written and the concepts explained nicely. However, there are a few improvements needed for this paper to be ready for publication in CIKM, and a few more that the authors need to consider for future extensions of this work.

Main issues:
- Naming the absence of feedback as "negative" gives it a bad connotation. The fact that an item is not liked/rated by the user or any of their friends is not something that needs to be regarded as a negative vote to it. In fact, this might be a very new item, or something that is yet unknown to the user and his circle of friends. As a matter of fact, even the authors state the obvious, i.e. that "our analysis of the datasets suggests that negative feedback with high global popularity does not indicate that a user dislikes an item." I believe that the authors should change their terminology and find a more neutral way to describe such items.
- Notation problems that need to be fixed:
a) k is used to signify items, but is also used to signify the number of latent factors. This is rather confusing.
b) There are two P's used and, even though the fonts are slightly different, it is still confusing.
- In Figure 1, the authors show that the probability of a user selecting an item that a "friend" has selected before is higher in explicit relations; however, it is still very small. This defeats the premise of the paper. The authors should comment on that.
- The discussion on the addition of s_{uk} in Eq. 6, in the last paragraph of section 4.2 is confusing and needs to be rephrased and explained in more detail.
- All the alphas (α_u, α_v, α_b) are fixed to some values. Why is that? How were these values selected?
- Experiments: It seems that the authors performed random sampling instead of 10-fold cross validation. Given the size of most datasets, the latter approach would guarantee more objective results.
- Table 3: R@5 for Ciao is significantly worse for the proposed methods. The authors need to comment on that giving explanation of why we observe this kind of discrepancy in results.
- Figure 3: The authors provide comparisons for R@N for different values of N. Given that in real-life recommendation systems a user rarely reviews more than the top-10 recommendations, it would be more interesting to see a close-up of these graphs, focusing on values of N ∈ [0, 20] instead.
- The methodology followed for the cold-start users is unclear. Why don't the authors pick only cold-start users in their training and test data? How do they ensure that the recommendations (and results) they generate are for such users only? Perhaps this section needs to be more carefully rephrased.
- Starting from p.8 and onwards, the placement of figures/tables in the paper is completely out of order as compared to the order they're mentioned in the text. This is very confusing for the reader who has to skip some and move further down the paper, then having to go back, and so on. The authors need to renumber and re-arrange the tables and figures appropriately, showing them in the order they appear in the text.
- Related work: While the authors provide a rather extensive overview of related work in terms of approaches that employ probabilistic and/or matrix factorization techniques, a significant body of work addressing the same problem but using neighborhood-based or network propagation techniques is completely ignored. We recommend that the authors look at the work of Jennifer Golbeck who has written a survey on the topic, as well as an entire book, as well as collective works (e.g. many papers presented in this book: cover this problem).
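
The 10-fold cross-validation point raised in the main issues above can be sketched as follows. This is a minimal illustration that assumes the data is a flat list of (user, item) interactions; the function name `k_fold_splits` and the toy data are hypothetical, and the paper's actual sampling protocol may differ.

```python
import random


def k_fold_splits(interactions, k=10, seed=0):
    """Yield (train, test) lists so that each interaction is tested exactly once."""
    if not 2 <= k <= len(interactions):
        raise ValueError("k must be between 2 and the number of interactions")
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)       # fixed seed for reproducibility
    folds = [shuffled[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# Toy data: 20 (user, item) pairs, so each of the 10 folds holds out 2 pairs.
pairs = [(u, i) for u in range(5) for i in range(4)]
splits = list(k_fold_splits(pairs, k=10))
for train, test in splits:
    assert not set(train) & set(test)           # no leakage between train and test
```

Unlike a single random split, every interaction serves as test data exactly once, so reported metrics can be averaged over the 10 folds with a variance estimate.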

Minor issues:
- Table 2: It would be useful if the authors could provide the percentages over the total number of users here for quick reference.
- The authors mention that they crawled Epinions to gather the data. Since the Epinions dataset is publicly available (e.g. in SNAP, KONECT and ASU), we wonder why the authors had to gather the data again and how theirs differs from the publicly available versions.
- The link to the ASU website provided in footnote 5 is broken.

Future work:
- Perform 10-fold cross validation instead of random sampling.
- Evaluate additional metrics for the social coefficient weighting, e.g. embeddedness, cohesion, etc.

Author feedback needed? No