------------------------- METAREVIEW ------------------------
There was considerable discussion of this paper between reviewers. Longitudinal reporting of this kind is exceptionally rare in the CER literature and I think this paper adds value because of that. In CER we too-typically read reports of small-scale, short-cycle interventions that are one-offs, never to be seen again. This demonstrates change at scale, and as a bonus, convincingly reports on the introduction and beneficial effect of a suite of practices. True, there is little "theoretical" framing, and little interpretation of why the interventions "worked", although the author(s) do justify why those particular interventions were chosen, and each one has its own literature demonstrating its efficacy. I take this to be a good piece of evaluative research.
----------------------- REVIEW 1 ---------------------
SUBMISSION: 106
TITLE: A Longitudinal Evaluation of a Best Practices CS1
AUTHORS: Adrian Salguero, Julian McAuley, Beth Simon and Leo Porter
----------- Prior Work -----------
SCORE: 2 (strongly agree)
----- TEXT:
They give nice summaries with references regarding the three "best practices" studied in this report.
----------- Theory -----------
SCORE: 1 (agree)
----- TEXT:
Some reference is made to relevant learning theories with regard to some of the best practices studied.
----------- Methods -----------
SCORE: 2 (strongly agree)
----- TEXT:
They give clear, detailed descriptions of what they did.
----------- Soundness -----------
SCORE: 2 (strongly agree)
----- TEXT:
The methods applied are appropriate for the research questions posed.
----------- Advances Knowledge -----------
SCORE: 2 (strongly agree)
----- TEXT:
This is a carefully done longitudinal study that gives compelling evidence to support the use of these best practices, at least when combined together. I imagine many readers may find it useful in helping them get support for similar changes at their institutions.
----------- Interpretation and Implications -----------
SCORE: 1 (agree)
----- TEXT:
They do a careful evaluation and analysis of the data.
----------- Clarity -----------
SCORE: 2 (strongly agree)
----- TEXT:
The paper is very well written with just a few minor typos or other wording problems listed under additional comments.
----------- Recommendation -----------
SCORE: 2 (strongly agree)
----- TEXT:
This is a well written, well thought out study, that will be of interest to many readers, both for its findings as well as how to perform similar longitudinal studies elsewhere.
----------- Suggestions or Additional Comments for Authors -----------
"The contributions of this comprehensive..." The bullet list that follows is for the findings of this study not the contributions.
It is preferable to treat citation marks [123] as footnote marks not as parts of speech. If used as a part of speech it should include authors, as in:
Wrong: "For a more detailed summary of Peer Instruction, please see [39]."
Right: "For a more detailed summary of Peer Instruction, please see Simon et al. [39]."
Right: "...as their community of practice only contains other novices [18]."
"Through carefully crafted multiple-choice questions (targeting students’ zone of proximal development), PI provides..." All other prior uses spell out PI. It is fine to switch but then the abbreviation should be shown in a prior use (PI).
"These studies differed, however, in that pair programming was conducting in a closed-lab setting..." The inclination is to correct conducting to conducted, however, if I correctly understand this to be referring to the study at UCSC, in McDowell et al. [24] it states: "Although each student was assigned to one 90-minute lab time per week, most programming assignments were completed outside of scheduled lab time."
"for each student to complete an brief in-person comprehension" a brief
"along with the all the successful" extra "the"
"retention could be explained away by the different students in each time periods." should be period.
"Number of years from starting starting at our institution" extra "starting"
"Academic years will be referred to by the year in which they begin in, as all..." drop the "in" after begin
"refers to the time period where best practices was implemented" WERE implemented
"uniquely benefited in a statically significant way" statistically
"Y-axis begins at a GPA of 2.5 as GPAs to help show" delete "as GPAs"
"long dependency chain of required course for computing majors" courseS
"more important or if they were need in combination." needED
----------------------- REVIEW 2 ---------------------
SUBMISSION: 106
TITLE: A Longitudinal Evaluation of a Best Practices CS1
AUTHORS: Adrian Salguero, Julian McAuley, Beth Simon and Leo Porter
----------- Prior Work -----------
SCORE: 0 (undecided)
----- TEXT:
A fair amount of references are cited. They are all relevant, for example, papers on peer programming which is one of the three strands of "best practice" incorporated in the redesign of the module. However, although they relate to the topics being discussed there is no real sense of a research exercise which is particularly grounded in those works.
----------- Theory -----------
SCORE: -1 (disagree)
----- TEXT:
There is no theoretical basis provided for the work conducted. The authors might perhaps argue that it is not appropriate in this case and there is no relevant theoretical framework that can be applied. However, the question for me is: in what way can the work presented here be said to be research? To be honest, this looks like the sort of presentation I'd expect at a departmental meeting when considering a course review. It looks at trends in outcomes etc and provides some analysis of results.
But this is not necessarily something which contributes to understanding of the field and advancement of knowledge.
----------- Methods -----------
SCORE: -1 (disagree)
----- TEXT:
The study conducts an analysis of a number of years' results data. But the most recent year seems to be 7 years ago which is puzzling. The key point seems to be a change in the approach to delivery of a module some years ago. The results appear to show better outcomes following the incorporation of 3 approaches (one of which I'd not previously heard of). But we have no explanation of why these three things have produced good results, or whether it was all of them or just one or two.
----------- Soundness -----------
SCORE: 0 (neutral)
----- TEXT:
Research questions are presented and the answers are ok as far as they go. So, for example, the question about how students results is answered by data that shows an improvement in attainment and a fall in dropout. That's all very positive, but there is no illumination of what's actually going on.
----------- Advances Knowledge -----------
SCORE: -1 (disagree)
----- TEXT:
Without any drilling down into the factors at play or any real attempt to provide more meaningful analysis of the reasons for success this does not contribute to an advance in knowledge. Three initiatives were introduced: why these? Why might they do any better or worse than any others? How would we know? And this was a pretty long time ago! Have new innovations not been added since then? Is "best practice" (by the way, "good practice" - yes, "best practice" - how do we know?) no further forward than 12 years ago? The timescale here is very strange. I'm all for allowing time for reflection - but this is in danger of being plain out of date. Technologies and classroom innovations have moved on a great deal in this time.
----------- Interpretation and Implications -----------
SCORE: -1 (disagree)
----- TEXT:
Very little interpretation is provided and nothing of any depth.
If there were to be anything meaningful or enduring from this it would need to be from underlying principles that could be related to improvement. But that's completely missing.
----------- Clarity -----------
SCORE: 2 (strongly agree)
----- TEXT:
The paper is clearly written.
----------- Recommendation -----------
SCORE: -1 (disagree)
----- TEXT:
I'm afraid I couldn't see anything particularly new or of general relevance from this paper.
----------------------- REVIEW 3 ---------------------
SUBMISSION: 106
TITLE: A Longitudinal Evaluation of a Best Practices CS1
AUTHORS: Adrian Salguero, Julian McAuley, Beth Simon and Leo Porter
----------- Prior Work -----------
SCORE: 2 (strongly agree)
----- TEXT:
The paper does a good job of summarizing relevant research on all three of the interventions included in the experimental treatment.
----------- Theory -----------
SCORE: 1 (agree)
----- TEXT:
The paper itself is a longitudinal study on a combination of interventions that has already been reported elsewhere. Although the theory coverage in this paper itself is not in depth, it does describe the theoretical backing for the three interventions combined in the NPE course.
----------- Methods -----------
SCORE: 1 (agree)
----- TEXT:
Generally, the paper does a reasonable job of describing the classroom changes, although details are left to earlier work that is cited in some cases. I think that is appropriate for a paper like this, however.
----------- Soundness -----------
SCORE: 1 (agree)
----- TEXT:
Longitudinal studies like this are hard to do, and the authors should be applauded for their efforts in this regard. It can be challenging to present a study like this while addressing all the necessary issues that arise when doing long-term comparisons. There were some issues that came up in reading the paper that I wish were better addressed, although the paper still provides a useful explanation of its evidence.
First, the primary comparison approach is to compare within groups between the pre-treatment and treatment periods, where there were no changes to the control group (the PE course). This approach presumes that all external factors are either unchanging, or affect the groups equally. However, this is not really discussed in the paper. There was no indication whether there were curricular changes to the major or program (which could affect time to graduation) or to later follow-on courses (which could affect GPA), or whether there were reasons to believe these kinds of changes would affect less-experienced students differently than more experienced students. While it is impossible to control for these kinds of effects in a study like this in most cases, the paper would be strengthened with an honest discussion of any external changes that occurred and the corresponding threats to validity of how changes to the program during the period of the study may be a potential confounding factor.
A bigger issue that is not discussed in depth in the paper is changes in student population over time. From Table 2, it appears that there was an increase of around 73% in unique students per year completing the NPE course between the pre-treatment and treatment periods (from approx 216/yr in the 2001-2007 period to what appears to be 374/yr in the 2008-2012 period). This isn't immediately obvious in the paper, since the two periods are different lengths (7 years vs 5) and only total numbers are shown. However, the PE group only had about a 20% increase in unique students per year. Such a relatively large population jump in the NPE group could be due to a wide variety of external factors. Most importantly, the paper doesn't really do anything to try to assess whether students entering the NPE course in the treatment years are "comparable" to those in the pre-treatment period (comparing pre-course measures of any kind of academic performance, for example).
Clearly, the demographic data on ethnicity and gender indicates a population shift, but the real question is whether or not any other kind of population shift that might plausibly be correlated with course performance/success might be going on. This isn't really discussed at all. Because the change in population size appears to be significantly larger than the effect size, the paper would be stronger if it at least addressed this issue and discussed it in threats to validity. Even better, if any pre-course measures could be used to check for potential performance differences between the two periods, that would be welcome. Again, I know that these factors are essentially outside the control of a study like this, but population shifts of this magnitude really do call into question the validity of assumptions about students entering the course being "the same" between the two periods.
Also, while the paper does differentiate between students who take a course multiple times in order to succeed (in counting failure rates, and also in counting unique individuals, for example), as a reader I did want to know more about any differences in frequency of repeating courses between the two groups--that is, what proportion of students repeat the course, both before/after treatment in the NPE group, and also how this compares with the PE group. This can be difficult to explain, since some students repeat a course more than once (at least at some institutions), and there may be differences between withdrawing and repeating vs. failing and repeating, etc. While the reduced rate of failing described for the NPE group does suggest that less repeating is going on, it would be helpful to know a bit more about the picture regarding repeating courses.
Finally, in the discussion of time to graduation, this is related in complicated ways to changing majors, although that is not discussed in the paper.
In particular, students who start in one major and change to another normally take longer to graduate than students who stick with one major. While the NPE group shows higher times to graduate than the PE group, the NPE group may also contain a larger proportion of students who switched majors. The paper discusses the percentage of students in each group that switch, but does not take this into account as an independent factor (even though the data appears to be available) when comparing time to graduate.
Overall, I believe these issues can be addressed by the authors, and that the basic approach in the paper is sound. These additional details would help strengthen the paper if addressed.
----------- Advances Knowledge -----------
SCORE: 1 (agree)
----- TEXT:
While this paper falls more into the role of corroborating benefits that were originally described elsewhere, doing so in the context of a large-scale longitudinal study is rare, which is what makes this paper valuable.
----------- Interpretation and Implications -----------
SCORE: 2 (strongly agree)
----- TEXT:
The summary in the paper is reasonable in describing the implications of the work. Although addressing some of the issues discussed above would strengthen the discussion, I think the audience at the conference would appreciate the opportunity to discuss/debate the results summarized in this work.
----------- Clarity -----------
SCORE: 2 (strongly agree)
----- TEXT:
The quality of the writing is appropriate for the conference.
----------- Recommendation -----------
SCORE: 2 (strongly agree)
----- TEXT:
While the paper isn't perfect, I believe it describes a useful effort to validate long-term results from a treatment that combines three interventions that are well-known by themselves, and that the results would generate useful discussion in the community.