{"id":2712,"date":"2018-07-23T10:00:50","date_gmt":"2018-07-23T09:00:50","guid":{"rendered":"http:\/\/blogs.kcl.ac.uk\/editlab\/?p=2712"},"modified":"2018-07-24T09:26:58","modified_gmt":"2018-07-24T08:26:58","slug":"o-for-open-science","status":"publish","type":"post","link":"https:\/\/blogs.kcl.ac.uk\/editlab\/2018\/07\/23\/o-for-open-science\/","title":{"rendered":"O for Open Science"},"content":{"rendered":"<h1>Joni Coleman tackles the complex subject of delivering transparent, robust science to all &#8211; and almost manages not to talk about Twitter.<\/h1>\n<p><!--more--><\/p>\n<p><a href=\"http:\/\/blogs.kcl.ac.uk\/editlab\/files\/2016\/07\/joni-coleman-200x280.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-250 size-thumbnail\" src=\"http:\/\/blogs.kcl.ac.uk\/editlab\/files\/2016\/07\/joni-coleman-200x280-150x150.jpg\" alt=\"\" width=\"150\" height=\"150\" srcset=\"https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2016\/07\/joni-coleman-200x280-150x150.jpg 150w, https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2016\/07\/joni-coleman-200x280-50x50.jpg 50w, https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2016\/07\/joni-coleman-200x280-100x100.jpg 100w\" sizes=\"auto, (max-width: 150px) 100vw, 150px\" \/><\/a><\/p>\n<hr \/>\n<p>If you\u2019ve ever volunteered to take part in research: you should be angry.<br \/>\nIf you pay your taxes: you should be angry.<br \/>\nIf you believe the scientific method is central to rationalism, modernity, and progress: YOU SHOULD BE ANGRY.<\/p>\n<p>Because much of science is failing you. Last year, well over a million papers were <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pubmed?term=(%222017%2F01%2F01%22%5BDate%20-%20Publication%5D%20%3A%20%222017%2F12%2F31%22%5BDate%20-%20Publication%5D)\">published in the biomedical sciences alone<\/a>, many of which will have been funded through public or charitable monies, and will have relied inescapably on the generosity of participants. 
But (unless you are lucky enough to have access to a subscription) many of them will be unavailable for you to read. Science is, to a great extent, a closed shop.<\/p>\n<p>&nbsp;<\/p>\n<div id=\"attachment_2720\" style=\"width: 310px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/steel-2586498_1280.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-2720\" class=\"size-medium wp-image-2720\" src=\"http:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/steel-2586498_1280-300x214.jpg\" alt=\"\" width=\"300\" height=\"214\" srcset=\"https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/steel-2586498_1280-300x214.jpg 300w, https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/steel-2586498_1280-768x548.jpg 768w, https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/steel-2586498_1280-1024x731.jpg 1024w, https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/steel-2586498_1280.jpg 1280w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><p id=\"caption-attachment-2720\" class=\"wp-caption-text\">Pictured: Deep metaphor<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>But, increasingly, things are changing. Open science initiatives are blooming, with the intent of making science more accessible, more transparent, and more rapid. In this blog, I will introduce some of the key enterprises of open science, and assess how well they are going. Check out the links, or squawk at me on Twitter (<a href=\"http:\/\/twitter.com\/Joni_Coleman\">@Joni_Coleman<\/a>) if you want to know more.<\/p>\n<h3><strong><u>Doing science: Data sharing, open source, and collaboration<\/u><\/strong><\/h3>\n<p>It is possible (if crude) to reduce science to three pillars: hypothesis (what you think should happen in a given situation), data (what actually happens) and interpretation (how the data relate to the hypothesis). 
Scientists are generally pretty free with communicating their hypotheses and interpretations (although more on that later\u2026), but the <strong>communication of data<\/strong> is often slower. Sometimes there are good reasons for that \u2013 anyone who has an email address knows all too well now that data protection is a hot topic, and sometimes making data openly available would risk the privacy of participants. In other cases, scientists have an understandable desire to protect the investment they have made in gathering data, an expensive and strenuous task that is often undervalued in the reward systems of science.<\/p>\n<blockquote>\n<p style=\"text-align: center\">&#8220;Data sharing and open data were among the central concepts in the mapping of the human genome&#8221;<\/p>\n<\/blockquote>\n<p>However, there are clear benefits to sharing data, both in terms of the rigour of science (I might trust your conclusions far more if I am able to reach the same answer from the data) and also in terms of the speed of progress. This has been a cornerstone of my own field, genomics. Data sharing and open data were among the central concepts in the mapping of the human genome; the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Bermuda_Principles\">Bermuda Accord (1996)<\/a> established a precedent that raw human sequence data should be made public-access within 24 hours of being generated. In part, this was a necessary step to enable the <strong>international collaboration<\/strong> to function effectively, but it also set the background for an approach to data sharing that has arguably stimulated the proliferation of genomic studies in the last 10 years. For example, the summary results from each Psychiatric Genomics Consortium study are made <a href=\"https:\/\/www.med.unc.edu\/pgc\/results-and-downloads\">publicly available<\/a>, and this enables further discoveries to be made much more speedily. 
Genomics is by no means the only field to adopt such an approach either.<\/p>\n<p>&nbsp;<\/p>\n<div id=\"attachment_2721\" style=\"width: 310px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/application-3426397_1280.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-2721\" class=\"size-medium wp-image-2721\" src=\"http:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/application-3426397_1280-300x175.jpg\" alt=\"\" width=\"300\" height=\"175\" srcset=\"https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/application-3426397_1280-300x175.jpg 300w, https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/application-3426397_1280-768x448.jpg 768w, https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/application-3426397_1280-1024x597.jpg 1024w, https:\/\/blogs.kcl.ac.uk\/editlab\/files\/2018\/07\/application-3426397_1280.jpg 1280w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a><p id=\"caption-attachment-2721\" class=\"wp-caption-text\">Collaboration: this can also be achieved with Windows or Linux laptops<\/p><\/div>\n<p>Another key tenet of the surge in genomic data generation has been the <strong>open code<\/strong> movement, and the idea of shareable, reproducible analysis scripts. Freely available programming languages like R and Python build on community-written \u201cpackages\u201d that enable others to perform specific analyses, while literate programming initiatives encourage clear writing of scripts. Increasingly, journals are asking for code to be available at review. While this is somewhat terrifying for the analyst (who has to let their ugly, esoteric code out in public), it ensures that reviewers are able to understand precisely how results have come about from data. 
Combined with open data, open code allows any reader to understand (and build upon) the original science.<\/p>\n<p>&nbsp;<\/p>\n<h3><strong><u>Publishing science: Open access and pre-printing, transparency in reviews, and registered reports<\/u><\/strong><\/h3>\n<p>So once the study\u2019s done, that\u2019s the hard bit over, right? <em>Au contraire.<\/em> Publishing a scientific paper can often be a long, involved and expensive process, and often ends with a paper locked behind a paywall that costs $10 a pop to read. Let\u2019s dissect the differences the open access approach can make.<\/p>\n<p>From submission to publication, the traditional route can commonly take months \u2013 our <a href=\"https:\/\/www.ncbi.nlm.nih.gov\/pubmed\/26989097\">GWAS of cognitive behavioural therapy<\/a>, for example, was published approximately two years after it was completed. In a field like genomics, the fast pace of innovation means that papers can be outdated by the time they emerge. Here is where open science has scored considerable success in biology in the last few years. <strong>The concept of <a href=\"https:\/\/www.biorxiv.org\/\">pre-printing<\/a><\/strong> (making an early, pre-peer review version of articles publicly available) is far from novel (mathematics has been <a href=\"http:\/\/www.arxiv.com\">arxiv<\/a>-ing data for years), but has been instrumental in disseminating research quickly. There are risks to this, however, particularly in terms of the citation of preprint articles. Peer review can be a valuable system, allowing a critical eye to catch errors of logic or weaknesses in argument that the authors may have missed. Without that eye, preprints become liable to change, rendering citations of them potentially erroneous further down the line. Arguably, this is also the case with published papers, however \u2013 there is generally poor recognition of the number of citations given to flawed studies that have since been retracted. 
Preprints should be read carefully to assess the validity of their argument \u2013 as should every article cited.<\/p>\n<blockquote>\n<p style=\"text-align: center\">&#8220;if most good science doesn\u2019t produce headline-grabbing outcomes, agreeing to publish good science will leave many headlines ungrabbed&#8221;<\/p>\n<\/blockquote>\n<p>Careful and considered reading can enable you to form a well-reasoned opinion on a paper. But you can only judge what is presented to you. How do you as a reader know whether the results in the paper answer the hypothesis the authors originally sought to answer? A certain degree of trust is required, and, unfortunately, <a href=\"https:\/\/retractionwatch.com\/the-retraction-watch-leaderboard\/\">multiple well-publicised cases of scientific fraud<\/a> have shown the incentives of science drive some authors to lie. Open science can combat this through <strong>pre-registration<\/strong>, outlining exactly what you are going to do before you do it. This could also extend to writing a registered report \u2013 agreeing with a journal (and with the peer reviewers appointed by that journal) that the question you are asking and the methods you will use are scientifically interesting and valid, and so your article should be published regardless of results. Both of these seem fine ideas. However, they have been slow to be adopted. In part, that is driven by the commercial concerns of publishers \u2013 if most good science doesn\u2019t produce headline-grabbing outcomes, agreeing to publish good science will leave many headlines ungrabbed. However, there are also concerns from the scientists&#8217; end. Scientists could (and <a href=\"https:\/\/www.theguardian.com\/commentisfree\/2009\/oct\/03\/bad-science-verdict-drug-trials\">do<\/a>) \u201cpre-register\u201d studies wrongly, or could register a study that has already been completed with positive results (making it unclear if there were negative results that were not reported). 
A more honest concern is that many papers are inherently exploratory, and as such not all analyses can be pre-registered \u2013 it is not always clear how such results would be dealt with in a pre-registered study. The concerns around pre-registration and registered reports need to be carefully and clearly resolved and communicated \u2013 making such a change will need further changes to the culture of science as a whole.<\/p>\n<p>Back to that article you wrote. It\u2019s gone in, and has changed in peer review considerably. Ideally that was because the editor picked careful, wise reviewers. Or maybe they picked terrible reviewers, who barely read the paper other than to demand the authors cite a bunch of papers written by the reviewer themselves. How can the future reader tell? In the majority of cases, peer reviews remain confidential to the editor, journal and authors. However, there is an increasing drive to <strong>publish reviews openly<\/strong>, so everyone can judge how reasonable the reviewer was in dissecting the paper. Most reviews, though, are still performed anonymously. Again, there is a movement within open science towards signed reviews \u2013 if you are writing honestly and fairly, why wouldn\u2019t you want your name attached to your review? Given that reviewers largely go unrewarded, and that science relies to a great extent on reputation, building a name for yourself as a good, fair reviewer would be beneficial. But. One of the biggest criticisms of signed reviews is that signing could restrict the reviewer from being honest and robust in their review of a bad paper. Science relies on reputation \u2013 if a named reviewer makes a powerful author look foolish, what\u2019s to prevent that author damaging the reviewer\u2019s career? 
It\u2019s perhaps unclear whether open, signed reviews would alleviate this \u2013 <a href=\"https:\/\/en.wikiquote.org\/wiki\/Louis_Brandeis\">sunlight may be the best disinfectant<\/a>, or it could simply make the honest reviewer more open to such covert attacks.<\/p>\n<p>Finally, the paper is reviewed and published. <strong>How can readers access it?<\/strong> Currently, there are three main scenarios. Firstly, some papers are housed in subscription-only journals with no free options, despite the best efforts of the funding agencies (although you could try contacting the author. You certainly shouldn\u2019t visit websites of dubious morality). Secondly, the paper might remain behind a paywall for 6-18 months before being released onto an open-access repository. Or, the authors might pay a fee to have the paper published openly from the beginning. These article processing charges allow the costs of publication to be met \u2013 however, they have also led to the creation of <a href=\"https:\/\/beallslist.weebly.com\/\">predatory pay-to-publish journals<\/a> with little concern about the quality of science.<\/p>\n<blockquote>\n<p style=\"text-align: center\">&#8220;enterprises like the Open Science Framework will not bring open science to the forefront alone \u2013 that will require a wholesale shift in the culture and approaches of science&#8221;<\/p>\n<\/blockquote>\n<p>There is much to be done across many areas to make science more open; allow me to highlight one particular effort to meet the challenge. The <strong><a href=\"http:\/\/osf.io\">Open Science Framework<\/a> (OSF)<\/strong> is an open community for performing science, stretching across disciplines. 
Amongst the many initiatives the OSF is attempting to implement are recognition for good open science practices in individual papers; integration of robust and reproducible coding and data workflows; forums to encourage and enable collaborations; and the capacity to register, manage and archive a project\u2019s data. Such enterprises will not bring open science to the forefront alone \u2013 that will require a wholesale shift in the culture and approaches of science. However, the OSF, and efforts like it, have an important role to play in making a more robust science that can deliver reliable progress.<\/p>\n<p>Finally, it wouldn\u2019t be a blog by me if it didn\u2019t mention <strong>Twitter<\/strong>\u2026 With the ability to communicate globally, there is a huge opportunity for scientists to engage with each other and with members of the public in discussing their results and explaining the cool things they find. This can happen through many avenues, be it Twitter, Reddit (check out their Ask Me Anything sections) or blogs, covering research, methods and much, much more. Such discussions are invaluable to science &#8211; and were also invaluable to this blog in particular. As such, I must thank those listed below for taking the time to point out several exciting areas of open science that I hadn\u2019t thought of! 
Any errors, inaccuracies or omissions in this blog are mine alone.<\/p>\n<p>Thanks to the following for responding to <a href=\"https:\/\/twitter.com\/Joni_Coleman\/status\/1016388359009325056\">spontaneous evening Twitter<\/a>:<br \/>\nJordan Anaya (<a href=\"http:\/\/twitter.com\/OmnesResNetwork\">@OmnesResNetwork<\/a>), Lisa DeBruine (<a href=\"http:\/\/twitter.com\/lisadebruine\">@lisadebruine<\/a>), Nick Brown (<a href=\"http:\/\/twitter.com\/sTeamTraen\">@sTeamTraen<\/a>), Chris Chambers (<a href=\"http:\/\/twitter.com\/chrisdc77\">@chrisdc77<\/a>), Mark Adams (<a href=\"http:\/\/twitter.com\/mja\">@mja<\/a>), Gustav Nilsonne (<a href=\"http:\/\/twitter.com\/GustavNilsonne\">@GustavNilsonne<\/a>), Mira van der Naald (<a href=\"http:\/\/twitter.com\/MiravdNaald\">@MiravdNaald<\/a>), and Amy Riegelman (<a href=\"http:\/\/twitter.com\/@amylibrarian\">@amylibrarian<\/a>).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Joni Coleman tackles the complex subject of delivering transparent, robust science to all &#8211; and almost manages not to talk about 
Twitter.<\/p>\n","protected":false},"author":163,"featured_media":2721,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[248],"tags":[33,187,279,278,78,67],"class_list":["post-2712","post","type-post","status-publish","format-standard","has-post-thumbnail","category-a-z","tag-academia","tag-data","tag-international","tag-open-science","tag-research","tag-research-methods"],"_links":{"self":[{"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/posts\/2712","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/users\/163"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/comments?post=2712"}],"version-history":[{"count":17,"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/posts\/2712\/revisions"}],"predecessor-version":[{"id":2747,"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/posts\/2712\/revisions\/2747"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/media\/2721"}],"wp:attachment":[{"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/media?parent=2712"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/categories?post=2712"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.kcl.ac.uk\/editlab\/wp-json\/wp\/v2\/tags?post=2712"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}