Acquisition, Management, Sharing and Ownership of Data

Background

There are many facets to the responsible conduct of research, and one of the most important is the integrity of data. While fabrication and falsification obviously affect data integrity, many other factors have the potential to compromise research data and results. Responsible data management includes appropriate data collection and storage, issues of access to and sharing of data, and determination of custody and responsibility for the data record and any associated sensitive information. Policies and guidelines regarding data management can vary among institutions and disciplines. In addition to their responsibility to conduct research ethically, researchers and scholars must abide by the procedures required by their funding agency, their institution or the source of the data (e.g. databanks, museum collections, research subjects). With the increasing use of technology in recording, storing, and sharing data, new norms are being developed for the ethical management of data.

What is data?

The National Institutes of Health guidelines on sharing research data refer to "final research data", defined as "recorded factual material commonly accepted in the scientific community as necessary to validate research findings. Final research data do not include laboratory notebooks, partial datasets, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects, such as gels or laboratory specimens." [1] These latter items are considered "research resources", and guidelines for sharing them can be found elsewhere. [2] Data can vary widely in character, especially when research and scholarship in fields other than science are considered. Proper management of data will depend on the nature of the data, and may need to be determined carefully for each project.

Preserving data integrity and data storage

Data must be archived in a controlled, secure environment in a way that safeguards the primary data, observations, or recordings. The archive must be accessible by scholars analyzing the data, and available to collaborators or others who have rights of access. Primary research data should be stored securely for sufficient time following publication, analysis, or termination of the project. The number of years that data should be retained varies from field to field and may depend on the nature of the data and the research.
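One way to check that archived primary data has not been altered is to record a checksum for each file when it is deposited and compare against it later. The following is a minimal sketch under stated assumptions, not a prescribed procedure: the file name `observations.csv`, the function names, and the choice of SHA-256 are illustrative and do not come from any policy cited here.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir):
    """Map each file's path (relative to the archive) to its digest."""
    root = Path(archive_dir)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(archive_dir, manifest):
    """List files that were added, removed, or modified since the manifest."""
    current = build_manifest(archive_dir)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )


# Demonstration on a temporary directory with a hypothetical data file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "observations.csv"
    data_file.write_text("site,count\nA,12\n")
    manifest = build_manifest(tmp)           # recorded at deposit time
    unchanged = changed_files(tmp, manifest)  # nothing has changed yet
    data_file.write_text("site,count\nA,13\n")
    altered = changed_files(tmp, manifest)    # the edit is detected
```

A manifest of this kind only detects change; it does not prevent it, and it would need to be stored separately from the data it describes.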

Sharing

According to the National Institutes of Health [1], "Data sharing achieves many important goals for the scientific community, such as

  • reinforcing open scientific inquiry
  • encouraging diversity of analysis and opinion
  • promoting new research, testing of new or alternative hypotheses and methods of analysis
  • supporting studies on data collection methods and measurement
  • facilitating education of new researchers
  • enabling the exploration of topics not envisioned by the initial investigators
  • permitting the creation of new datasets by combining data from multiple sources."

These benefits and values extend beyond the scientific community to include most forms of research and scholarship. Knowledge builds on prior scholarship, and often that involves further analysis of previously assembled data. An essential principle in most fields of scholarship is that research resources, including primary data, should be made available to others who wish to replicate or advance a line of work. This principle has been articulated by both the National Institutes of Health and the National Science Foundation, which have specific data sharing policies that apply to their funding recipients.

Many fields of research and scholarship are competitive, however, and researchers are often reluctant to share primary data before they have completed their analysis. Is it possible to find a balance that preserves both the right of those who collect the data to analyze it first and the principle of shared scholarship? In response to protests from scientists that NIH's data sharing policy was too generous, the agency revised its definition of "timely release and sharing" to be "no later than the acceptance for publication of the main findings from the final data set." [2] It is understood that researchers have a right to "first and continuing use" of the data in which they have invested their time and effort. [2]

Confidential and sensitive data

Scholars and researchers must be especially scrupulous in ensuring that confidential or sensitive data is stored and released in a way that does not compromise privacy or create risks for research participants. The Code of Ethics of the American Anthropological Association states, “In conducting and publishing their research, or otherwise disseminating their research results, anthropological researchers must ensure that they do not harm the safety, dignity, or privacy of the people with whom they work, conduct research, or perform other professional activities, or who might reasonably be thought to be affected by their research.” [3]

This is not only a moral and professional responsibility, but a legal requirement as well. Federal regulations, such as the Common Rule and FDA regulations, require attention to the privacy of research subjects, including the confidentiality of data about them. The “Privacy Rule” of the federal Health Insurance Portability and Accountability Act (HIPAA) describes requirements for most research data derived from health care records. [4] Achieving appropriate confidentiality requires specification of data handling responsibilities and privileges – that is, who can handle which portion of data, at what point during the project, for what purpose, and so on.
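A specification of data handling privileges can be written down explicitly as a set of who/what/when/why entries. The sketch below is only an illustration of that idea: the roles, data portions, project phases, and purposes are hypothetical, and a real project would take them from its own protocol and approvals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    role: str     # who can handle the data
    portion: str  # which portion of the data
    phase: str    # at what point during the project
    purpose: str  # for what purpose


# Hypothetical entries; anything not listed here is not permitted.
ALLOWED = {
    Permission("interviewer", "identified_transcripts", "collection", "recording"),
    Permission("analyst", "deidentified_transcripts", "analysis", "coding"),
    Permission("principal_investigator", "identified_transcripts", "analysis", "audit"),
}


def may_handle(role, portion, phase, purpose):
    """Return True only if this exact combination has been specified."""
    return Permission(role, portion, phase, purpose) in ALLOWED


analyst_ok = may_handle("analyst", "deidentified_transcripts", "analysis", "coding")
analyst_blocked = may_handle("analyst", "identified_transcripts", "analysis", "coding")
```

Listing permitted combinations and denying everything else keeps the specification short and makes omissions fail safely.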

Data that includes confidential or sensitive information can still be shared, however. There are a number of steps researchers can take to protect subjects' privacy [2]:

  • withholding part of the data
  • statistically altering the data in ways that will not compromise secondary analyses
  • requiring researchers who seek data to commit to protect privacy and confidentiality
  • providing data access in a controlled site, sometimes referred to as a data enclave.
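The first two steps in the list above can be sketched in code. This is a minimal illustration under stated assumptions: the field names (`name`, `email`, `age`, `income`), the ten-year age bands, and the noise scale are hypothetical. Removing direct identifiers does not by itself guarantee anonymity, and whether a given alteration leaves secondary analyses intact has to be checked for the analyses in question.

```python
import random

# Hypothetical direct identifiers; a real project would take these
# from its own data dictionary and approved protocol.
DIRECT_IDENTIFIERS = {"name", "email"}


def withhold_fields(record, withheld=DIRECT_IDENTIFIERS):
    """Return a copy of the record without the withheld fields."""
    return {key: value for key, value in record.items() if key not in withheld}


def coarsen_age(age, width=10):
    """Replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def add_noise(value, scale, rng):
    """Perturb a numeric value with zero-mean Gaussian noise."""
    return value + rng.gauss(0.0, scale)


def prepare_for_sharing(records, seed=0):
    """Build a shareable copy of the records; the originals are not modified."""
    rng = random.Random(seed)
    shared = []
    for record in records:
        out = withhold_fields(record)
        out["age"] = coarsen_age(out["age"])
        out["income"] = round(add_noise(out["income"], scale=500.0, rng=rng), 2)
        shared.append(out)
    return shared


records = [
    {"name": "A. Example", "email": "a@example.org", "age": 34, "income": 52000.0},
    {"name": "B. Example", "email": "b@example.org", "age": 61, "income": 48000.0},
]
shared = prepare_for_sharing(records)
```

The remaining two steps, data use commitments and data enclaves, are agreements and access arrangements rather than transformations of the data.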

Ownership and responsibility

Typically, when research is funded by federal or nonprofit granting agencies, the data is owned by the institution receiving the grant. The primary researcher or scholar receiving the grant has the responsibility for storage and maintenance of the data, including the protection of confidential or sensitive information. Data obtained through research supported by private or corporate funding, however, may have different guidelines for ownership and restrictions on sharing. This issue is further complicated when organizations such as universities patent data sets.

It is important for researchers to understand the relevant ownership rules for any data that they collect or use. From an ethical standpoint, researchers should consider the implications of data ownership agreements before they are made with other researchers, institutions, or funding agencies. Will the data collected and analyzed be freely available for future collaborations or further analysis?

Summary

These points all come together in the National Science Foundation’s data-sharing policy: "NSF expects significant findings from research and activities it supports to be promptly submitted for publication, with authorship that reflects the contributions of those involved. It expects investigators to share with others at no more than incremental cost and within a reasonable time, the data, the samples, physical collections and other supporting materials created or gathered in the course of the work. It also encourages awardees to share software and inventions to make them useful and usable. Exceptions may be allowed to safeguard the rights of individuals and subjects, the validity of results or the integrity of collections." [5]

References

  1. National Institutes of Health, Office of Extramural Research
    http://grants1.nih.gov/grants/policy/data_sharing/data_sharing_faqs.htm
  2. NIH Data Sharing Policy and Implementation Guidance
    http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm#time
  3. The American Anthropological Association, AAA Code of Ethics (2009)
    http://www.aaanet.org/issues/policy-advocacy/Code-of-Ethics.cfm
  4. Health Information Privacy, US Department of Health and Human Services (2003)
    http://www.hhs.gov/ocr/privacy/hipaa/understanding/special/research/index.html
  5. Dissemination and Sharing of Research Results, National Science Foundation (2010)
    http://www.nsf.gov/bfa/dias/policy/dmp.jsp

PSU Policies

  • Guideline RAG16: The Responsible Conduct of Research

    The Pennsylvania State University is committed to fostering integrity in the conduct of research. All members of the research community, including faculty, research staff, students, fellows, adjunct faculty, and visiting researchers, are expected to adhere to the highest ethical and professional standards as they pursue research activities to further scientific understanding.

    The goal of the Guidelines is to offer a set of values, principles, and standards to guide decision-making and conduct throughout the research process. They are not intended to provide a set of rules that prescribe how researchers should act in all situations. Rather, the Guidelines are intended to increase awareness of research integrity and outline the University's expectations for ethical behavior amongst all researchers.

    The Guidelines discussed are not mutually exclusive. There are many circumstances in which several of them apply to a single project or activity. The risks of non-adherence to the Guidelines can be great, both personally and institutionally. Potential consequences of non-adherence are outlined in the University policies that form the foundation for these Guidelines.

  • Policy AD23: Use of Institutional Data

    To establish policy for the use of University institutional data (which will include paper, film, electronic, etc.) and the responsibilities for the protection of such data.

Federal Policies

Resources

PowerPoint Presentations

Case Studies

Online Learning Tools

Articles

Access to select articles is provided free of charge by Penn State Libraries. To access these articles from a non-campus location, you must authenticate using the Penn State Virtual Private Network (VPN).

Contact Us

Office for Research Protections
814-865-1775 • ORProtections@psu.edu
The ORP Education Program manages the SARI program and offers training and workshops on Responsible Conduct of Research (RCR) topics.
Contact the Education Program Staff