----------------------- REVIEW 1 ---------------------
PAPER: 299
TITLE: What's in a name? Understanding the Interplay Between Titles, Content, and Communities in Social Media
AUTHORS: Himabindu Lakkaraju, Julian Mcauley and Jure Leskovec

----------- REVIEW -----------
This paper studies images that have been submitted multiple times on Reddit to determine the influence of community and language factors on the popularity of items. The authors also develop and evaluate models for predicting the popularity of the content.

Strengths:
- Detailed experiments on the effectiveness of each model and test set, which show that both community and language factors play an important role in predicting popularity
- Interesting insights into what makes a title good: for most communities, titles should be novel to the community but not too novel, and different parts of speech are associated with popularity in different communities

Weaknesses:
- The model does not take into account the user who posts the image, even though that is part of the data collected. Although in general it is not very important who posts an item of content on Reddit, there are exceptions where a user is very well known and their posts may receive more attention. I therefore think that the popularity of the poster is an important factor that should be included in the model. This is touched on in the discussion: "Are popular users popular because they submit better content, or is their content more successful merely because they are popular?"
- The evaluation does not cover unseen content. It covers the two cases where the test set consists of either all submissions of an image except the last two, or a randomly selected test set, but not the case where all instances of a particular item have been removed. The discussion claims that the findings could easily be applied to this case, where a submission history is not available, so why not try it?
The findings about which language features work well in which communities could still be used, as could some of the other factors such as time of day.

Presentation:
- Generally good but there are a few typos: estiamte, content tile (should be title), commutniy, communtiy
- Figure 1: The text says "The submission first achieves major success the tenth time it is submitted (‘God bless whoever makes these’)." but from the figure it looks like it should be "'MURICA"
- Figure 2: Average success over time peaks at ~13:00 UTC (US morning) but the text claims it peaks in the US evening. Am I reading this wrong or is the diagram incorrectly labelled?

----------------------- REVIEW 2 ---------------------
PAPER: 299
TITLE: What's in a name? Understanding the Interplay Between Titles, Content, and Communities in Social Media
AUTHORS: Himabindu Lakkaraju, Julian Mcauley and Jure Leskovec

----------- REVIEW -----------
The authors investigate the problem of estimating the popularity of a post based on both its title and the community to which it is submitted. Specifically, they study posts and re-posts of images on Reddit that occur in different communities in order to determine how community-based and title-based factors interact with each other. They develop a linear combination of two models: a language model that predicts a post's success based on the title, and a community model that makes a prediction based on all community factors independent of the submission's title. They find that obtaining the best prediction of a submission's success requires accounting for both the community factors and the language (title) factors.

This paper is very well written (with a few small errors; see below) and presents successful results. Although the developed model is a combination of techniques that have all generally been used before, the specific problem of predicting the popularity of reposts in different communities is novel.
Work like this can be applied to other social networking sites that host a community-based platform for posting information.

One interesting portion of the paper is the evaluation of the language model. It may be beneficial to compare the community+language model to a baseline model that uses only the title to predict popularity. Such models, as stated in the related work section, have been studied and should be available for comparison.

Additionally, the paper may benefit from an explicit discussion of how these results apply to other social networks (Facebook, Twitter, etc.). One can infer these relationships but it would be best to state them explicitly.

Finally, a unique advantage of using Reddit as a dataset is that on Reddit, a user's reputation (amount of karma) generally has no effect on the popularity of that user's submissions. Readers who are not familiar with Reddit may not appreciate this significant advantage. Again, stating this explicitly in the paper may be beneficial (to better appreciate the last paragraph in the "Discussion" section).

Small errors (typos):
- Introduction, 2nd-to-last paragraph, first sentence: "estimate" is spelled "estiamte", and on the line after, "important role" should be "important roles" (plural).
- Contributions and Findings, 1st paragraph, last sentence: "This number is made up of 16,700 are original…". Remove "are".
- Experiments, Baselines, 1st bullet point: "Models the…exponential functin". "function" is misspelled.