Editor's Comments to Author:
Associate Editor
Comments to the Author:
The paper received two reviews, one recommending acceptance and one recommending minor revision. Both reviewers are very positive about the paper, acknowledging that it presents a novel algorithm for detecting friend circles in social networks, that it is well written, and that it significantly extends the conference version. Both reviewers also provide constructive suggestions for further improving the paper, such as addressing dynamic updates when new friends are added and describing the data collection method more fully.
We recommend that the paper be accepted to the CASIN special issue of TKDD, after the authors conduct a minor revision on the paper to address the comments from the reviewers.
Reviewer(s)' Comments to Author:
Reviewer: 1
Recommendation: Minor Revision
Comments:
Intro, "... by identifying friends sharing a common feature.": This statement would be clearer with an example.
5: How were the 10 Facebook users recruited? Was this effort crowdsourced, and were there monetary incentives? How did the authors locate G+ users with publicly accessible circles, and who are these users (e.g., normal people, celebrities)? How was the Twitter data collected, and who are these users? When were the Facebook, Twitter, and G+ datasets collected? What fraction of each G+/Twitter user's friends are in labelled circles, and how many friends are unlabelled?
5, "Around a quarter of the identified circles...": The frequency shown in Figure 2 is ~47%. Perhaps the authors meant 'half' instead of 'quarter'. Or perhaps I'm just interpreting the figure incorrectly; shouldn't the columns sum to 100%? That does not appear to be the case.
6: Given that space is not an issue, I'd like to see the full list of features the authors used to cluster Facebook accounts.
8.4: I don't understand the rationale behind the discussion of the low F1 scores. "Many circles have not been maintained since they were initially created." This seems to imply that circle membership is dynamic, but nothing in the data collection suggests that the authors have time-varying data. Furthermore, the authors are not evaluating the dynamic-update version of their algorithm in this section.
8.4: Is the performance of the authors' algorithm primarily due to matching profile features or to leveraging graph structure? It would be interesting to see the performance of the algorithm with only one of these two components enabled, to see which contributes more.
Additional Questions:
Review's recommendation for paper type: Full length technical paper
Does the paper present innovative ideas or material?: Yes
In what ways does this paper advance the field?: This paper presents a novel algorithm to deduce circles of friends on social networks, i.e. take a user's friends and group them into (possibly overlapping) clusters of similar users. Unlike prior work, this algorithm takes both graph structure and profile similarity into account to cluster users. Although this submission is based on an already published manuscript, it does contain additional innovations, such as optimizing the algorithm for users with many friends.
Rate how well the ideas are presented (very difficult to understand=1 very easy to understand =5): 4
Rate the paper on its contribution to the body of knowledge to this field (none=1, very important=5): 4
Rate the information in the paper is it sound, factual, and accurate?(poor=1 excellent=5): 4
Please explain why.: My only serious concern with this work is the experimental, real-world dataset. The authors do not describe in detail how their data was collected, so it is possible that their data sample is biased in unknown ways. This isn't a fatal flaw; it doesn't change the fact that their algorithm performs well. However, in order to understand the limits of the algorithm, it's important to understand the data it was evaluated on, i.e. were these users super-users/celebrities or were they average-Joe type users? Is there a difference between the circles of users who have public circles vs. private circles? Etc.
Rate the overall quality of the writing (very poor=1, excellent=5): 4
Does this paper cite and use appropriate references?: Yes
If not, what important references are missing?:
Is the treatment of the subject complete?: No
If not, What important details / ideas/ analyses are missing?:
Reviewer: 2
Recommendation: Accept
Comments:
In this paper, the authors formulate the important problem of identifying social circles from one's friend network as a multi-membership node clustering problem over the network. The authors develop a probabilistic model to characterize both network structure and user profile information in the clustering. Effective parameter learning and inference algorithms are presented. Experiments on Facebook, Google+ and Twitter demonstrate the advantages of the proposed approach over existing methods.
Compared with the conference version, this journal version includes much new material, such as circle maintenance, semi-supervised circle prediction, and the corresponding experimental results. These discussions and experiments provide new insights into the problem and make the proposed method more practical. The topic is important and interesting, and the presentation of the paper is very clear.
A few points that the authors might consider discussing in the paper are:
1. In the circle maintenance section, the authors discuss the scenario in which a new friend is added to the user's network and the friend's circle is predicted. New friends will commonly be added to the network continuously, so it would be nice to discuss how to effectively and efficiently predict the circles of new friends in a streaming environment. It will be costly to repeat the process of finding model parameters for each new friend, and some way of incrementally updating the model parameters may be better.
2. The authors mention the future work of discovering friends' circles for multiple users together. I feel that this is an important topic. Many users who are close to each other may share very similar circles, so to reduce wasted effort, it is much more efficient to explore the collective discovery of circles over a network for multiple users simultaneously. It might be better to expand the discussion of how the proposed approach can be adapted to collective circle discovery scenarios.
3. It's nice to develop efficient algorithms for large-scale networks and show the time complexity and running time experiments. It may be helpful to give time complexity analysis using big O notation.
Additional Questions:
Review's recommendation for paper type: Full length technical paper
Does the paper present innovative ideas or material?: Yes
In what ways does this paper advance the field?: The authors solve the circle discovery problem by proposing a novel multi-membership node clustering algorithm. Unlike existing approaches, this method can generate overlapping and nested clusters.
Rate how well the ideas are presented (very difficult to understand=1 very easy to understand =5): 4
Rate the paper on its contribution to the body of knowledge to this field (none=1, very important=5): 4
Rate the information in the paper is it sound, factual, and accurate?(poor=1 excellent=5): 4
Please explain why.: The idea is well explained and presented. The proposed method is discussed thoroughly from different perspectives.
Rate the overall quality of the writing (very poor=1, excellent=5): 5
Does this paper cite and use appropriate references?: Yes
If not, what important references are missing?:
Is the treatment of the subject complete?: Yes
If not, What important details / ideas/ analyses are missing?: