Acquisition, Management, Sharing and Ownership of Data
Background
There are many facets to the responsible conduct of research. Paramount among them is the integrity of data. While fabrication and falsification obviously affect data integrity, many other factors have the potential to compromise research data and results. Responsible data management includes appropriate data collection and storage, issues of access to and sharing of data, and determination of custody and responsibility for the data record and any associated sensitive information. Policies and guidelines regarding data management can vary among institutions and disciplines. In addition to their responsibility to conduct research ethically, researchers and scholars must abide by the procedures required by their funding agency, their institution or the source of the data (e.g. databanks, museum collections, research subjects). With the increasing use of technology in recording, storing, and sharing data, new norms are being developed for the ethical management of data.
What is data?
The National Institutes of Health guidelines on sharing research data refer to "final research data", which is defined as "recorded factual material commonly accepted in the scientific community as necessary to validate research findings. Final research data do not include laboratory notebooks, partial datasets, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects, such as gels or laboratory specimens." (1) These latter items are considered "research resources", and guidelines for sharing them can be found elsewhere (2). Data can vary widely in character, especially when research and scholarship in fields other than science (such as the humanities) are considered. Proper management of data will depend on the nature of the data, and may need to be determined for each project in a careful and thoughtful manner.
Preserving data integrity and data storage
Data must be archived in a controlled, secure environment in a way that safeguards the primary data, observations, or recordings. The archive must be accessible to scholars analyzing the data, and available to collaborators or others who have rights of access. Primary research data should be stored securely for a sufficient period following publication, analysis, or termination of the project. The number of years that data should be retained varies from field to field and may depend on the nature of the data and the research.
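One common way to check that archived primary data has not been altered is to record a checksum for each file and compare it later. The following is a minimal sketch of that idea in Python, not a procedure prescribed by any policy cited here. The function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir):
    """Map each file's path (relative to the archive root) to its digest."""
    root = Path(archive_dir)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(archive_dir, manifest):
    """List files that were added, removed, or altered since the manifest was built."""
    current = build_manifest(archive_dir)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

A manifest built when the data is deposited can be stored separately from the archive. Running changed_files at a later date then reports any file whose contents no longer match the recorded digest.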
Sharing
According to the National Institutes of Health (1), "Data sharing achieves many important goals for the scientific community, such as
- reinforcing open scientific inquiry
- encouraging diversity of analysis and opinion
- promoting new research, testing of new or alternative hypotheses and methods of analysis
- supporting studies on data collection methods and measurement
- facilitating education of new researchers
- enabling the exploration of topics not envisioned by the initial investigators
- permitting the creation of new datasets by combining data from multiple sources."
These benefits and values extend beyond the scientific community to include most forms of research and scholarship. Knowledge builds on prior scholarship, and often that involves further analysis of previously assembled data. An essential principle in most fields of scholarship is that research resources, including primary data, should be made available to others who wish to replicate or advance a line of work. This principle has been articulated by both the National Institutes of Health and the National Science Foundation, which have specific data sharing policies that apply to their funding recipients.
Many fields of research and scholarship are competitive, however, and researchers are often reluctant to share primary data before they have completed their analysis. Is it possible to find a balance that preserves both the right of those who collected the data to analyze it first and the principle of shared scholarship? In response to protests from scientists that NIH's data sharing policy was too generous, the agency revised its definition of "timely release and sharing" to be "no later than the acceptance for publication of the main findings from the final data set." (2) It is understood that researchers have a right to "first and continuing use" of the data in which they have invested their time and effort. (2)
Sharing of confidential and sensitive data
Data that is personal, confidential or sensitive in nature (particularly concerning human research subjects) can, and should, still be shared. Researchers must be especially scrupulous, however, in ensuring that confidential or sensitive data is stored and released in a way that does not compromise privacy or create risks for research participants.
In cases where the data includes confidential or sensitive information, both the original, complete data set and a redacted (or anonymized) version, intended for broader use, should be maintained. Data released for general use should be free of variables that might identify research participants or lead to their identification by deductive means. There are a number of steps researchers can take to protect subjects' privacy (2):
- withholding part of the data
- statistically altering the data in ways that will not compromise secondary analyses
- requiring researchers who seek data to commit to protect privacy and confidentiality
- providing data access in a controlled site, sometimes referred to as a data enclave.
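The first two steps above (withholding part of the data and statistically altering it) can be sketched in Python as follows. This is an illustration under stated assumptions and not a vetted de-identification method. The field names (name, email, age, score), the ten-year age bands, and the Gaussian noise are all hypothetical choices. Whether a given alteration leaves secondary analyses intact has to be assessed for each data set.

```python
import random

# Hypothetical field names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def redact_record(record, rng, age_band=10, noise=0.0):
    """Return a copy of one record prepared for general release."""
    # Withhold part of the data: drop fields that identify a participant directly.
    released = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Coarsen exact age into a band so identification by deduction is harder.
    if "age" in released:
        low = (released["age"] // age_band) * age_band
        released["age"] = f"{low}-{low + age_band - 1}"
    # Statistically alter a numeric measure with small zero-mean noise.
    if "score" in released and noise > 0:
        released["score"] = round(released["score"] + rng.gauss(0, noise), 2)
    return released


def redact_dataset(records, seed=0, **kwargs):
    """Apply redact_record to every record; the originals are left unchanged."""
    rng = random.Random(seed)
    return [redact_record(r, rng, **kwargs) for r in records]


original = [
    {"name": "A. Example", "email": "a@example.org", "age": 34, "score": 71.5},
    {"name": "B. Example", "email": "b@example.org", "age": 58, "score": 64.0},
]
released = redact_dataset(original, noise=0.0)
```

The original list is kept intact alongside the released copy, which matches the practice of maintaining both the complete data set and a redacted version for broader use.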
Ownership and responsibility
Typically, when research is funded by federal or nonprofit granting agencies, the data is owned by the institution receiving the grant. The primary researcher or scholar receiving the grant has the responsibility for storage and maintenance of the data, including the protection of confidential or sensitive information. Data obtained through research supported by private or corporate funding, however, may have different guidelines for ownership and restrictions on sharing. This issue is further complicated when organizations such as universities patent data sets.
It is important for researchers to understand the relevant ownership rules for any data that they collect or use. From an ethical standpoint, researchers should consider the implications of data ownership agreements before they are made with other researchers, institutions, or funding agencies. Will the data collected and analyzed be freely available for future collaborations or further analysis?
References:
- (1) National Institutes of Health, Office of Extramural Research: http://grants1.nih.gov/grants/policy/data_sharing/data_sharing_faqs.htm
- (2) NIH Data Sharing Policy and Implementation Guidance: http://grants.nih.gov/grants/policy/data_sharing/data_sharing_guidance.htm#time
PSU Policies
- Guideline RAG16: The Responsible Conduct of Research
The Pennsylvania State University is committed to fostering integrity in the conduct of research. All members of the research community, including faculty, research staff, students, fellows, adjunct faculty, and visiting researchers, are expected to adhere to the highest ethical and professional standards as they pursue research activities to further scientific understanding.
The goal of the Guidelines is to offer a set of values, principles, and standards to guide decision-making and conduct throughout the research process. They are not intended to provide a set of rules that prescribe how researchers should act in all situations. Rather, the Guidelines are intended to increase awareness of research integrity and outline the University's expectations for ethical behavior amongst all researchers.
The Guidelines discussed are not mutually exclusive. In many circumstances, several of them apply to a single project or activity. The risks of non-adherence to the Guidelines can be great, both personally and institutionally. Potential consequences of non-adherence are outlined in the University policies that form the foundation for these Guidelines.
- Policy AD23: Use of Institutional Data
To establish policy for the use of University institutional data (which will include paper, film, electronic, etc.) and the responsibilities for the protection of such data.
Federal Policies
Resources