Browsing by Author "Charles R. Ebersole"
Now showing 1–3 of 3
Item: Crowd-sourcing Hypothesis Tests: Making Transparent How Design Choices Shape Research Results (RELX Group (Netherlands), 2020)
Authors: Justin F. Landy; Miaolei Jia; Isabel L. Ding; Domenico Viganola; Warren Tierney; Anna Dreber; Magnus Johannesson; Thomas Pfeiffer; Charles R. Ebersole; Quentin F. Gronau

Item: Crowdsourcing hypothesis tests: Making transparent how design choices shape research results. (American Psychological Association, 2020)
Authors: Justin F. Landy; Miaolei Jia; Isabel L. Ding; Domenico Viganola; Warren Tierney; Anna Dreber; Magnus Johannesson; Thomas Pfeiffer; Charles R. Ebersole; Quentin F. Gronau
Abstract: To what extent are research results influenced by subjective decisions that scientists make as they design studies? Fifteen research teams independently designed studies to answer five original research questions related to moral judgments, negotiations, and implicit cognition. Participants from 2 separate large samples (total N > 15,000) were then randomly assigned to complete 1 version of each study. Effect sizes varied dramatically across different sets of materials designed to test the same hypothesis: Materials from different teams rendered statistically significant effects in opposite directions for 4 of 5 hypotheses, with the narrowest range in estimates being d = −0.37 to +0.26. Meta-analysis and a Bayesian perspective on the results revealed overall support for 2 hypotheses and a lack of support for 3 hypotheses. Overall, practically none of the variability in effect sizes was attributable to the skill of the research team in designing materials, whereas considerable variability was attributable to the hypothesis being tested. In a forecasting survey, predictions of other scientists were significantly correlated with study results, both across and within hypotheses. Crowdsourced testing of research hypotheses helps reveal the true consistency of empirical support for a scientific claim.

Item: Many Labs 5: Testing Pre-Data-Collection Peer Review as an Intervention to Increase Replicability (SAGE Publishing, 2020)
Authors: Charles R. Ebersole; Maya B. Mathur; Erica Baranski; Diane-Jo Bart-Plange; Nicholas R. Buttrick; Christopher R. Chartier; Katherine S. Corker; Martin Corley; Joshua K. Hartshorne; Hans IJzerman
Abstract: Replication studies in psychological science sometimes fail to reproduce prior findings. If these studies use methods that are unfaithful to the original study or ineffective in eliciting the phenomenon of interest, then a failure to replicate may be a failure of the protocol rather than a challenge to the original finding. Formal pre-data-collection peer review by experts may address shortcomings and increase replicability rates. We selected 10 replication studies from the Reproducibility Project: Psychology (RP:P; Open Science Collaboration, 2015) for which the original authors had expressed concerns about the replication designs before data collection; only one of these studies had yielded a statistically significant effect (p < .05). Commenters suggested that lack of adherence to expert review and low-powered tests were the reasons that most of these RP:P studies failed to replicate the original effects. We revised the replication protocols and received formal peer review prior to conducting new replication studies.
We administered the RP:P and revised protocols in multiple laboratories (median number of laboratories per original study = 6.5, range = 3–9; median total sample = 1,279.5, range = 276–3,512) for high-powered tests of each original finding with both protocols. Overall, following the preregistered analysis plan, we found that the revised protocols produced effect sizes similar to those of the RP:P protocols (Δr = .002 or .014, depending on analytic approach). The median effect size for the revised protocols (r = .05) was similar to that of the RP:P protocols (r = .04) and the original RP:P replications (r = .11), and smaller than that of the original studies (r = .37). Analysis of the cumulative evidence across the original studies and the corresponding three replication attempts provided very precise estimates of the 10 tested effects and indicated that their effect sizes (median r = .07, range = .00–.15) were 78% smaller, on average, than the original effect sizes (median r = .37, range = .19–.50).