4 open access practices to try, rated from easy to hard

The following 4 practices span different aspects of the research cycle (conceptualization, design, analysis, reporting, dissemination) and involve a range of difficulty levels for newcomers:

1. Open Science Journal Club (Level: Easy)

What? Organise a journal club with other students and staff to discuss issues surrounding reproducibility and open science. Usually, these take the format where one person leads the discussion each session after everyone has read the selected paper. They can range in how formal they are, from a presentation with slides followed by discussion to a completely open-ended discussion (with or without a moderator). It may even be possible that one already exists in your department that you could join! We rated beginning an open science journal club as easy because it requires minimal prior learning, you only need one other person to start a club, and there are already many reading lists and existing structures available to follow.

Why? Before you can engage in open science practices you need to understand the lay of the land by becoming familiar with major works and issues. Journal clubs are a great way to do this, wherein you can learn about a new topic and critically engage with your colleagues. It can get lonely reading and working alone, so meeting in this way can create an environment in which you can socialize while also learning with others and building a community around open science. In addition to discussing the papers themselves, each article can serve as a conversation starter about open science and reproducibility more generally. Seeing who attends the journal club can help locate others who are interested in or knowledgeable about open science, effectively establishing a network for collaborators or support. They can help you think about how you can approach conversations with your advisor or other members in your department who may not be as interested in engaging with open science practices. More generally, organising a journal club and presenting are both transferable skills for most jobs.

How? Contact colleagues in your department to enquire if anyone would be interested in participating. You only need one other person to get started! Once set up, the journal club can expand beyond your department to the wider university, facilitating interdisciplinary exchange. In addition to club attendees presenting, you can also invite external speakers to present either in person or remotely. It is also possible to organize these clubs completely remotely if you are unable to meet in person. One example of a very successful framework for an open science journal club is the ReproducibiliTea initiative: (https://reproducibilitea.org/), which has spread to over 100 institutions all over the world and provides a great starter pack including a list of papers to potentially discuss (https://osf.io/3qrj6/).

Worries. Students have different relationships with their advisors with regard to how much permission they would need to set up a journal club. However, in most cases journal clubs can absolutely be student-generated and student-run. It may be worth telling your advisor and inviting them to attend, while making it clear that they have no obligation to do so. Also, you may feel that learning about open science practices is taking you away from time that could be spent on your own research. This is a common worry when engaging in any “extracurricular activities.” Although it does take time, it can be a relatively small time investment for how much you can learn. You can choose how often to hold the journal clubs and when to start and stop holding them, so they could be anything from weekly to termly, whatever works well for you and your colleagues.


2. Project Workflow and Documentation (Level: Easy)

What? Project workflow refers to how you organise projects and move through the various stages of the research cycle. This includes your file folder structure, document naming conventions, version control, data storage, and other details. It also includes the choice of who has access to the project (e.g., collaborators, the public) and when in the process they have access (e.g., at all times, upon publication). We rate creating a project workflow and documentation as easy because, even though there are many considerations to think through on the front end, it is primarily about organisation (folder and storage use) and recordkeeping, which are likely processes students are already using to some extent. Moreover, developing a clear project workflow is much easier for students than later career scholars, who have many more projects to organize and may be more entrenched in their methods (or lack thereof).

Why? Having a dedicated project workflow system helps keep your research organised, enhancing reproducibility, minimising mistakes, and facilitating collaborations with others and future-you. Making your project open to your advisors and any other collaborators (even if not open to the public) through working in shared folders can ensure everyone has access to everything in an organised fashion and saves the hassle of emailing infinite versions of documents. If you do choose to make the project public at any point, everything will be almost complete, and you will not have to create a system from scratch. Having an organised workflow is beneficial in most jobs, as well as in your everyday life – no more scribbled shopping lists on scrap pieces of paper!

How? Many research teams will not want to make all of their work public from the get-go, but even just imagining that the project will be public can encourage taking an outside perspective that will lead to improvements in organisation. When joining a new lab as a graduate student or beginning a new collaboration, ask the project leaders about their project workflow. It is entirely possible that they do not have a formally specified workflow, and just asking about it could initiate new ways of doing things. If they are not as receptive as you would like, find a compromise between what you would find most useful and what they are used to, and if you are a new student, then it might be useful to set some review dates for when you will discuss if the current approach is working well. For additional resources visit the UK Data Service’s guide to data documentation https://ukdataservice.ac.uk/help/new-users/data-documentation/.
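As a minimal sketch of what a deliberate folder structure and naming convention could look like, the Python snippet below creates a project skeleton and builds sortable, versioned file names. The folder names, the `create_project` and `versioned_name` helpers, and the date-prefixed naming pattern are illustrative assumptions rather than a prescribed standard; adapt them to whatever your research team agrees on.

```python
from datetime import date
from pathlib import Path

# Hypothetical folder layout; rename to suit your team's conventions.
FOLDERS = ["data/raw", "data/processed", "materials", "analysis", "manuscript", "docs"]


def create_project(root: str) -> Path:
    """Create an empty project skeleton with a README in the top folder."""
    project = Path(root)
    for folder in FOLDERS:
        (project / folder).mkdir(parents=True, exist_ok=True)
    readme = project / "README.txt"
    if not readme.exists():
        readme.write_text("Project overview, contributors, and folder guide.\n")
    return project


def versioned_name(description: str, version: int, ext: str, day: date) -> str:
    """Build a sortable file name: ISO date, short description, version."""
    slug = description.lower().replace(" ", "-")
    return f"{day.isoformat()}_{slug}_v{version:02d}.{ext}"
```

Keeping raw data in its own folder, separate from processed data, makes it harder to overwrite the original files by accident, and date-first names sort chronologically in any file browser.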

Worries. You may feel some apprehension at the idea of having your workflow process or documentation public. You definitely do not have to make your project page public right away. You can wait until the project is complete, if you choose, and can clean up your project page if/when you eventually make it public. However, having a clear and intentional process for file management from the get-go will alleviate these worries as well as the need to clean things up after the fact, which just adds more work. You may also be unclear on what you are allowed to share. We discuss the issue of data sharing in a subsequent section, but whether you are sharing data, materials, analysis code, or anything else related to the project, it is important to consult with your supervisor and collaborators to ensure sharing is permitted and desired. Lastly, you may be concerned that other people using your materials (e.g., survey design, code) is detrimental, as someone else is “profiting” from your hard work, but you can actually gain credit yourself in the form of citations.


3. Preprints and Depositing in the UWL Repository (Level: Easy)

What? The term preprint originally referred to a version of a manuscript that was publicly available prior to being submitted for peer-review. Although that still remains true, preprints now also include manuscripts that are under review, or author-accepted manuscripts (which follow peer review but precede publisher modifications such as copyediting and typesetting). We rate posting preprints and accepted manuscripts as easy because in essence it simply requires uploading a file you already have to a website. In fact, this may be the lowest effort open science behaviour that one could engage in, and yet it is associated with many potential benefits.

Why? Posting a manuscript before submitting to a journal allows for a wider range of feedback than what is afforded through peer review and can help improve a paper prior to submission by identifying any major flaws. Posting an article after submission, but before acceptance, gets a version of the paper out as soon as possible for sharing findings and interest, as well as keeping a record of what the paper looked like before it underwent the review process. Posting a manuscript after it has been accepted to a journal allows for the paper to be shared faster than it may be published and allows for an open access version of the paper to be shared. Preprints are also a great way to share work that does not continue to publication, providing greater access to the full body of available literature. Using preprints in this way can be useful if you choose not to stay in academia and do not get a chance to publish your research, but would still like to have it available as part of the scientific record.

How? There are many different available preprint hosts with varying levels of moderation (e.g., arXiv, bioRxiv, SSRN) and emphasis on specific disciplines (see https://osf.io/preprints/). There are also general purpose repositories such as Figshare (https://figshare.com/) and Zenodo (https://zenodo.org/), which host both literature and data deposits.

If you are a UWL PhD student or member of academic staff you can also take advantage of your institutional access to the UWL Repository (https://repository.uwl.ac.uk/) to deposit your research publications and scholarly material. Your deposits are then reviewed by one of the repository’s administrators to ensure that all material conforms to the correct publisher policies on copyright (see further details on publisher copyright and self-archiving policies at https://v2.sherpa.ac.uk/romeo/).

For guidance on depositing in the UWL Repository, view our step-by-step YouTube tutorial at: https://www.youtube.com/watch?v=1kL0Q-HsmKQ

You can also follow our written guidance on how to make a successful deposit at the following link: https://repository.uwl.ac.uk/guide.pdf

As mentioned, you can post a preprint at different points of the publication timeline. If you are posting an article that has already been accepted for publication at a journal, in most cases you are able to post an author accepted version to a preprint server, but not the final publisher-formatted version (see Worries below for details).

Worries. Most of the worries around posting preprints pertain to what is allowable and how doing so will impact the peer-review process. Many authors worry that posting a preprint will be treated as “published” and thus they will not be able to submit their manuscript for publication in a journal; however, this is not true in most cases. Before posting a preprint, authors should consult Sherpa Romeo (http://sherpa.ac.uk/romeo), which tracks the restrictions and rules for most journals, indicating whether journals allow posting of preprints pre-submission (usually yes), posting of author accepted manuscripts (usually yes), and posting of publisher-formatted accepted articles (usually no). To alleviate any remaining anxieties, if you know the journal that you wish to submit to, you can contact the editor directly in order to get in writing whether preprints of the work are allowed before submission to the journal. There may also be concern that posting a preprint will decrease the number of times an article is cited; however, this is not the case as preprints have actually been found to increase the number of citations. Another concern with posting a preprint prior to submission is that someone else could “scoop” you, stealing your idea and running the study themselves before you are able to publish your work. Although this is possible, it is also very unlikely. Moreover, all articles are posted with a date/time stamp and therefore there is a clear temporal record.

Finally, students may have concerns that their advisors or other collaborators will not be open to posting preprints. Of course, you should always consult with coauthors prior to posting preprints. There should be little concern with posting author accepted manuscripts of outputs, but some may be more skittish about posting prior to submission to a journal. There may be worries about posting a version of a paper that will later be changed; however, you are able to post as many updated revisions to your preprint as needed. Overall, we recommend engaging in a conversation to determine the source of the concern and then providing the preceding explanations for the commonly expressed concerns.


4. Sharing Data (Level: Medium)

What? Sharing data pertains to making the de-identified dataset used for a project available to other researchers. Importantly, this means posting the data on a data repository for researchers to download and use, or establishing a formal system through which others can access the data (useful for sensitive data). There is enough evidence to indicate that stating “data available on request” is not sufficient to constitute engaging in the process of data sharing. We rated sharing data as medium as it takes some forethought on the front end with consent forms and organisation at the backend in terms of how to organise the data and share it with others in a way that aligns with ethical responsibilities.

Why? There are several compelling reasons for sharing your data. First, data sharing allows others to reproduce the analyses reported in a paper, providing checks on quality and accuracy, and to expand on the analyses through fitting alternative models and conducting robustness tests. Second, most datasets have use beyond what is reported in a paper. This includes secondary data analysis that addresses different questions altogether and inclusion in meta-analyses, where researchers having access to the raw data is a major benefit. Third, sharing data may be required by the funding source for the project or the journal the article is published in. Sharing your data upon submission to a journal can be looked upon favorably by reviewers, even if they choose not to do anything with it, as it indicates a commitment to transparency. When applying for non-academic jobs involving data (e.g. data science), it can be useful to have an example of data that you have shared along with a codebook, to show that you are organised with your data.

How? The how of data sharing is why we conceptualise it as medium difficulty. There are a lot of complexities associated with data sharing, and so we direct readers to the UK Data Service’s resources (https://ukdataservice.ac.uk/learning-hub/research-data-management/) as well as the DMPonline website (https://dmponline.dcc.ac.uk/) rather than attempting to cover all of those complexities here. Research students and staff should read these resources and share them with their supervisors and research team, as they address many of the commonly expressed concerns. It is also important for all researchers to become familiar with local and regional laws governing the protection of certain kinds of data (e.g., the General Data Protection Regulation in the European Union). Because the possibility of sharing must be indicated in the consent forms provided for participants, it can sometimes be difficult to publicly share data after the fact. Thus, a good time to initiate discussions about sharing data is during the project design phase. At that point, you can be sure to include appropriate clauses in the consent forms and ethics protocols that describe both your intent and your plan for sharing the raw data in an ethical way. Importantly, including these details does not require you to share your data, but rather allows for the option: if your supervisor is uncertain about sharing the data, you can revisit this with them once you have finished the project. Whether or not you share your data (but especially if you do), it is critical to provide a data codebook that includes information on the structure of the dataset (e.g., what variable names correspond to, measurement levels); data sharing is only useful if it is understandable to an outsider. Finally, there are loads of different platforms or repositories where you can share your data, so it can be a bit overwhelming to choose. For simplicity’s sake, we suggest sharing data on Zenodo or browsing disciplinary data repositories on the registry of research data repositories (https://www.re3data.org/).
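To make the idea of a data codebook more concrete, here is a rough Python sketch that drafts one from a CSV file. The `build_codebook` and `infer_type` helpers, the column names, and the example rows are all made up for illustration, and the type guess is deliberately crude; a real codebook would also need human-written descriptions, value labels, and measurement details.

```python
import csv
import io


def infer_type(values):
    """Guess a coarse type from the non-missing string values of a column."""
    present = [v for v in values if v != ""]
    if not present:
        return "empty"
    try:
        for v in present:
            float(v)
        return "numeric"
    except ValueError:
        return "text"


def build_codebook(csv_text, descriptions):
    """Return one codebook entry (a dict) per column of a CSV dataset."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    codebook = []
    for name in columns:
        values = [row[name] for row in rows]
        codebook.append({
            "variable": name,
            "description": descriptions.get(name, "TODO: describe"),
            "type": infer_type(values),
            "n_missing": sum(1 for v in values if v == ""),
        })
    return codebook


# Made-up example data, for illustration only.
EXAMPLE = "pid,age,condition\n1,23,control\n2,,treatment\n"
for entry in build_codebook(EXAMPLE, {"pid": "Participant ID", "age": "Age in years"}):
    print(entry)
```

Any column left with the "TODO: describe" placeholder is a prompt to write the description yourself before sharing, since only the researcher knows what a variable actually measures.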

Worries. There are two major worries with data sharing. First, as mentioned above, are ethical concerns, for which we again direct readers to the UK Data Service’s website. Sharing the data publicly may not be consistent with what was stated in the consent form completed by the participants, and so it may not be possible to post the raw data. An additional ethical concern is the risk of reidentification, especially for under-represented populations. There are a variety of strategies for handling these ethical concerns, including posting data without demographic information included. Moreover, it is important for researchers to understand that there are a variety of ways of making data open beyond making it freely available (e.g., having a specified process for interested researchers to securely access the data).

Finally, there is sometimes concern about other researchers benefiting from all the hard work you put in to collecting the data, and that you will not get credit for subsequent use. However, by applying at least a CC BY license to the data set, anyone who uses the data is obligated to attribute the data to you (e.g. through citing the associated paper or through using a DOI). In fact, Colavizza et al. (2020) found that articles linking to shared data in a repository had up to 25% higher citation impact on average.

Acknowledgements:

This blog post is heavily adapted from the following paper licensed under Creative Commons Attribution 4.0 which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited:

Easing Into Open Science: A Guide for Graduate Students and Their Advisors

Citation: Ummul-Kiram Kathawalla, Priya Silverstein, Moin Syed; Easing Into Open Science: A Guide for Graduate Students and Their Advisors. Collabra: Psychology 4 January 2021; 7 (1): 18684. doi: https://doi.org/10.1525/collabra.18684

Know our onions!

In March 2018, our good friend and colleague, Kevin Sanders, published a blog post which documented the process of making the UWL Repository accessible from within the Tor network, as an onion service, in order to highlight issues of intellectual surveillance as an impediment to accessing research and scholarly materials.

Thanks to this work, a global audience of users was given the opportunity to access UWL scholarly materials in a way that preserved their intellectual privacy, ensuring neither their ISP nor UWL, as a service provider, would be able to track their personal use of the repository. The following year in November, Kevin took steps to make the University’s flagship open access journal, New Vistas, accessible as an onion service in the same spirit.

Following these activities at UWL, Library Service colleagues expressed their desire to see Kevin’s proof-of-concept maintained into the future.

In practical terms and in light of Tor’s roadmap to prioritise a more secure scheme for onion services, this involved not only recreating the equivalent onion services at more permanent locations but also upgrading them to conform to the preferred version 3 (v3) scheme. These addresses (which are derived from the public keys of each of the onion services) continue to use the top level domain (TLD) .onion but can be distinguished by their much longer 56-character length:

UWL Repository:

http://jibtvgs3aedcoclau7ahhqymrk4ut3xcgpccgx3z5gpeoemmpodnhcqd.onion

New Vistas journal:

http://av5idj7ggxj4gskx7ae7wwcoewulxumwp4m5mlwoybdqhj6s7wycajyd.onion/newvistas
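For the curious, the sketch below shows how a 56-character v3 address can be built from, and checked against, a service's public key. It follows our reading of the Tor v3 onion service specification, in which the address is the base32 encoding of a 32-byte public key, a 2-byte checksum, and a version byte; the function names are our own, and this is an illustration rather than a reference implementation.

```python
import base64
import hashlib

ONION_VERSION = b"\x03"  # version byte for v3 onion services


def _checksum(pubkey: bytes) -> bytes:
    """First two bytes of SHA3-256 over a fixed prefix, the key, and the version."""
    return hashlib.sha3_256(b".onion checksum" + pubkey + ONION_VERSION).digest()[:2]


def encode_v3_address(pubkey: bytes) -> str:
    """Build a v3 .onion hostname from a 32-byte public key."""
    if len(pubkey) != 32:
        raise ValueError("public key must be 32 bytes")
    raw = pubkey + _checksum(pubkey) + ONION_VERSION
    return base64.b32encode(raw).decode("ascii").lower() + ".onion"


def is_valid_v3_address(hostname: str) -> bool:
    """Check the length, version byte, and checksum of a v3 .onion hostname."""
    label = hostname.lower()
    if label.endswith(".onion"):
        label = label[: -len(".onion")]
    if len(label) != 56:
        return False
    try:
        raw = base64.b32decode(label, casefold=True)
    except ValueError:
        return False
    pubkey, checksum, version = raw[:32], raw[32:34], raw[34:]
    return version == ONION_VERSION and checksum == _checksum(pubkey)
```

Because 35 bytes encode to exactly 56 base32 characters, the length alone is a quick way to tell a v3 address from the older, shorter scheme, and the checksum helps catch mistyped addresses before a connection is attempted.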

In addition to this, Library Service colleagues decided to make use of the onion-location feature that allows standard domain addresses accessed within Tor (e.g., https://repository.uwl.ac.uk) to signpost to their .onion counterparts.

Screenshot of the UWL Repository homepage, accessed from within the Tor network and displaying the purple Onion-location feature in the address bar
https://repository.uwl.ac.uk
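The onion-location feature works by having the regular website send an `Onion-Location` HTTP response header that names the matching .onion URL. In practice this is usually a line or two of web server configuration; purely as an illustration, the Python sketch below adds the header with a small WSGI wrapper. The `with_onion_location` helper, the `demo_app` stand-in, and the placeholder address are hypothetical, and Tor's documentation indicates that the browser only honours the header on pages served over HTTPS.

```python
def with_onion_location(app, onion_base_url):
    """Wrap a WSGI app so every response carries an Onion-Location header."""
    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Point at the same path on the onion service.
            target = onion_base_url.rstrip("/") + environ.get("PATH_INFO", "/")
            headers = list(headers) + [("Onion-Location", target)]
            return start_response(status, headers, exc_info)
        return app(environ, start_with_header)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in for a real site; returns a plain-text page."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Placeholder address: substitute your own service's .onion URL.
site = with_onion_location(demo_app, "http://exampleonionaddress.onion")
```

Mirroring the request path means a visitor who lands on a deep link on the regular site is offered the equivalent page on the onion service rather than just its homepage.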

By configuring the onion-location feature and sharing the onion service details as we have in this blog post we aim to contribute towards the ongoing conversation about the role of intellectual privacy in opening up research as well as introduce opportunities for users to access online research content with a greater degree of anonymity. Depending on the end goal, highlighting onion service details in this way can also be useful as onion services are not indexed in search engines in the typical way that clear net websites are.

On the subject of the onion-location feature, the Tor Project have said, ‘for years, some websites have invisibly used onion services with alternative services, and this continues to be an excellent choice. Now, there is also an opt-in mechanism available for websites that want their users to know about their onion service that invites them to upgrade their connection via the .onion address’.

Screenshot of the New Vistas journal homepage, as seen from within the Tor network and displaying the purple Onion-location feature in the address bar
https://uwlpress.uwl.ac.uk/newvistas/

So how are onion services different to a Tor relay in the network?

You may already be familiar with Tor, the browser and volunteer network of Tor relays. Tor relays are part of a decentralised public infrastructure system in which a Tor user’s traffic is wrapped in several layers of encryption and routed from one relay to the next, with each relay removing one layer, until it reaches the exit node, where it leaves the network and contacts the destination server. That final link is no longer protected by Tor’s encryption, so unless the site itself uses HTTPS the traffic travels unencrypted. This means that the final stretch of traffic is more vulnerable to surveillance and could be targeted by an organisation or bad actor monitoring or even running an exit node.

Tor circuit step two
Source: https://2019.www.torproject.org/about/overview.html.en#overview
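To illustrate the layering idea, here is a toy Python sketch in which the client wraps a message in one layer per relay and each relay then strips exactly one layer. It is emphatically not Tor's real cryptography: the XOR "cipher", the relay keys, and the function names are stand-ins we have invented to show the principle only.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(data: bytes, key: bytes) -> bytes:
    """Apply or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def wrap(message: bytes, relay_keys: list) -> bytes:
    """Client side: add the exit relay's layer first and the first relay's last."""
    for key in reversed(relay_keys):
        message = xor_layer(message, key)
    return message


def route(cell: bytes, relay_keys: list) -> bytes:
    """Each relay in turn strips the layer that matches its own key."""
    for key in relay_keys:
        cell = xor_layer(cell, key)
    return cell


# Hypothetical keys for a three-relay circuit.
keys = [b"guard", b"middle", b"exit"]
cell = wrap(b"GET /index.html", keys)
print(route(cell, keys))
```

Once the last layer is removed at the exit relay, the message is back in its original form, which is why the stretch between the exit node and the destination depends on the site's own encryption (such as HTTPS) for protection.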

Onion services don’t operate like a series of conventional Tor relays in this regard. Instead, a user connects to an onion service (formerly known as a hidden service) thanks to a series of “handshakes” that help to establish a “rendezvous point” within Tor that supports authentication between users and services without disclosing their network identities (IP addresses). This means that a user’s interaction with an onion service can never be surveilled by the monitoring of Tor exit nodes, for instance. It also means that someone hosting a website can protect the location of their server, allowing both points of a connection to be anonymised. This has the advantage of creating a metadata-free environment between user and service (with the usual caveat that users and services should continue to observe their threat models and avoid identifying themselves by other means, such as by signing their name on a blog post or logging into a service that uses their real name).

Diagram of step 6 in the process of connecting to a Tor onion service. This shows three introduction points and a rendezvous point through which the client can connect.
Source: https://tor.void.gr/docs/onion-services.html.en

You can read more about the wider benefits of onion services with regards to censorship circumvention and network sustainability here.

To access an onion service, you’ll need to use the Tor Browser. It’s a modified version of Firefox that’s configured to connect to sites through the Tor network.

Download the Tor Browser from the Tor Project’s website. It’s available for Windows, Mac, Linux, and Android.


Acknowledgements:

We would like to give special thanks to the following people for their past and present support and curiosity in helping to configure and maintain the above onion services: Andy Byers, Ed Oakley, Andrew Preater, Murray Royston-Ward, Mauro Sanchez and Kevin Sanders. We would also like to acknowledge the tireless advocacy of Alison Macrina whose campaigns and influence on intellectual privacy issues are unmatched. You can find out more about her important work here: https://libraryfreedom.org/

20 Years of the Budapest Open Access Initiative

Some twenty years ago a small meeting convened in Budapest bringing together delegates from different countries and institutions. While each put forth their unique perspective, they also shared a mutual belief in what they would soon come to coin ‘Open Access’.

The group of delegates met with the intention to accelerate global efforts to make research articles in all academic fields freely available on the internet and the result of that meeting was the formalisation of the Budapest Open Access Initiative, commonly referred to as the BOAI.

The group then released a declaration or statement of strategy and commitment to advocating for and realising Open Access infrastructures across diverse institutions around the world.

How did the BOAI meeting come about?

The Budapest Open Access Initiative (BOAI) was born out of a meeting that the Open Society Foundation (OSF) organised in Budapest, having been based there up until 2018 when it finally moved to Berlin. The foundation supported the Network Library Program, which worked in 24 countries across Central and Eastern Europe as well as the former Soviet Union, and the science journals donations program, through which it shipped hard copies of scientific journals to academies of sciences in these regions.

In 2000 the OSF’s board challenged them to find new ways to get academic literature into the hands of researchers who needed it, besides physically shipping these hard copies of journals. The meeting was also inspired by a petition which PLoS, the Public Library of Science, had released in the summer of 2001. This called on academics to withhold submissions from academic journals which did not make the resulting articles freely available after six months.

Together, these forces encouraged the OSF to seek out alternative solutions, starting with a clarion call to leaders across the gamut of scholarly communications, to see how they could best advance the goals that they were beginning to define as a community. During these formative stages, all participants agreed that any movement towards realising improved access to research materials needed to be global in its implications, in order to have a wide and lasting impact.

Agreeing on a common definition

Prior to the contemporary understanding of the term ‘Open Access’, practitioners such as Peter Suber often referred to the concept of the free sharing of research material online as ‘Free Online Scholarship’. As the meeting convened in Budapest, however, participants considered the antecedents of the new and growing movement in research and found areas of common ground with open source software development, a term, practice and movement which was already recognised in its own right.

The meeting in Budapest aimed to explore a similar kind of openness but with a focus on research texts and eventually research data. Finally, the term Open Access, while not quite as self-explanatory as initially hoped, was at least short. The participants agreed on the term, by analogy to the open source movement, and went on to define it:

 By “open access” to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

The Budapest Open Access Initiative 

The Strategic goals

A manifesto also emerged, as part of a collaborative writing endeavour, to express the delegates’ thoughts coming out of the meeting in Budapest. One of the issues which preoccupied them was agreeing on the means by which to best achieve Open Access. The first strategy involved posting a copy of an article to an institutional or subject-based repository. The second strategy involved publishing in an open access journal. (‘Open Archives’, for instance, had previously been suggested as an alternative term to ‘Open Access’ but would have favoured the repository approach.) Eventually both routes to achieving Open Access were advocated for, and these are widely referred to today as Green and Gold Open Access respectively (and were objects of attention in the later Finch report of 2012).

Moving forward

Throughout the BOAI meeting in December 2001, the overriding questions that emerged were: were the participants’ visions compatible, could they work together, and could they agree on common definitions? The answer was yes. Today, Open Access practices and movements are beset by new challenges which once again highlight the need for global input and action from activists and scholar-led communities. These challenges frame ‘Open Access’ not as an end goal but as a means to achieving equity (insofar as it can) as well as quality in research and scholarly communications.

On 14 February 2022, the Budapest Open Access Initiative will celebrate its 20th anniversary and, in preparation, the BOAI20 steering committee is working on a new set of recommendations, based on BOAI principles, current circumstances, and input from colleagues in all academic fields and regions of the world.