DMP guidance - Kunnskapsbasen
DMP guidance
Guidance for the DMPonline tool
This guidance document will help you write a data management plan (DMP) for your project, based on the Science Europe template in the DMPonline tool provided by the Digital Curation Centre (UK).
The guide provides NTNU-specific guidance. If you choose other tools or templates, this guide might still be useful, even if the topics and questions are not identical.
For example plans created using DMPonline, see attachment (pdf).
Note: To ensure compliance with GDPR, all projects with personal data are required to send a notification form describing all relevant elements of the planned data processing to Sikt for an assessment. The only exception to this rule is health research projects at the Faculty of Medicine and Health Sciences. If your project includes personal data (any information relating to an identified or identifiable person), consider using the Sikt DMP Tool.
Table of contents
- Guidance for the DMPonline tool
- Creating an account
- Create plan
- Section: Data description and collection or re-use of existing data
- Section: Documentation and data quality
- Section: Storage and backup during the research process
- Section: Legal and ethical requirements, codes of conduct.
- Question: If personal data are processed, how will compliance with legislation on personal data and on data security be ensured?
- Question: How will other legal issues, such as intellectual property rights and ownership, be managed? What legislation is applicable?
- Question: How will possible ethical issues be taken into account, and codes of conduct followed?
- Section: Data sharing and long-term preservation
- Questions: How and when will data be shared? Are there possible restrictions to data sharing or embargo reasons?
- Question: How will data for preservation be selected, and where will data be preserved long-term (for example a data repository or archive)?
- Question: What methods or software tools will be needed to access and use the data?
- Question: How will the application of a unique and persistent identifier (such as a Digital Object Identifier (DOI)) to each data set be ensured?
- Section: Data management responsibilities and resources
- Question: Who (for example role, position, and institution) will be responsible for data management (i.e. the data steward)?
- Question: What resources (for example financial and time) will be dedicated to data management and ensuring that data will be FAIR (Findable, Accessible, Interoperable, Re-usable)?
- Reference document
Creating an account
- Go to DMPonline
- Select Create an account (do not choose “Sign in with your institutional credentials”).
- Enter your name and e-mail
- Select other for the required field “organisation” (write “other” in the search box in order to get the option).
Create plan
After you have signed in, inside "My Dashboard", press Create plan. To choose the Science Europe Template, enter "Science Europe" in the field "Select the primary funding organisation".
As you progress in writing your plan, specific guidance is provided on the right side of the screen. You can choose guidance from different institutions under the "Project details" tab. Guidance from the Digital Curation Centre is provided by default in the tool.
Usually, you write several versions of your DMP. The first version, written as you are starting your project, will usually not be complete and detailed. Do not despair: you can update and expand your DMP as the project progresses.
Section: Data description and collection or re-use of existing data
Question: How will new data be collected or produced and/or how will existing data be re-used?
According to the Science Europe Practical Guide, a sufficiently addressed DMP
- Gives clear details of where the existing data come from and how new data will be collected or produced. It clearly explains methods and software used.
- Explains, if existing data are re-used, how these data will be accessed and any constraints on their re-use.
- Explains clearly, if applicable, why new data must be collected, instead of re-using existing data.
Are there any existing datasets you could reuse? Visit the research data topic page at NTNU Innsida to see a list of places to search for data.
Question: What data (for example the kinds, formats, and volumes) will be collected or produced?
A sufficiently addressed DMP
- Clearly describes or lists what data types will be generated (for example numeric, textual, audio, or video) and their associated data formats, including, if needed, data conversion strategies.
- Explains why certain formats have been chosen and indicates if they are in open and standard format. If a proprietary format is used, it explains why.
- Provides information about the estimated data volume.
- Clearly states, if applicable, that no new data will be produced or generated by the project.
- NB: Information derived from previously existing data sources (namely output, processed, and analysed data) is to be considered new data under this question (i.e., you should state the types, formats, and volume for these data as well).
Keep in mind whether the scale of the data will pose challenges when sharing or transferring data between sites; do you need to include additional costs? How will you address these challenges? If you have large volumes of data that need storage, please contact the IT department at NTNU.
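If you already have pilot data or a comparable dataset, a rough volume estimate can be scripted. The following is a minimal sketch, not an NTNU tool: the function names are invented for illustration, and it simply sums file sizes per file extension under a folder you choose.

```python
from collections import defaultdict
from pathlib import Path


def volume_by_extension(root):
    """Return the total size in bytes per file extension under `root`."""
    totals = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Extensions are compared case-insensitively; files without
            # an extension are grouped under "(none)".
            ext = path.suffix.lower() or "(none)"
            totals[ext] += path.stat().st_size
    return dict(totals)


def human_readable(num_bytes):
    """Format a byte count using binary units (KiB, MiB, ...)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

Calling `volume_by_extension` on a pilot folder and scaling the result by the planned number of participants, samples, or recording hours gives a figure you can state in the DMP and refine in later versions.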
Section: Documentation and data quality
Question: What metadata and documentation (for example the methodology of data collection and way of organising data) will accompany data?
A sufficiently addressed DMP
- Clearly outlines the metadata that will accompany the data, with reference to good practice in the scientific community (for example uses metadata standards where they exist).
- Clearly outlines the documentation needed to enable data re-use, stating where the information will be recorded (for example a database with links to each item, a ‘readme’ text file, file headers, code books, or lab notebooks).
- Indicates how the data will be organised during the project (for example naming conventions, version control strategy and folder structures).
What information is necessary for future users (including your future self) to find and understand the data? Tip: Read Making a Research Project Understandable: Guide for data documentation (Fuchs & Kuusniemi 2018).
Documentation might for instance include details on the methodology used, lab protocols, codebook, analytical and procedural information, definitions of variables, vocabularies, units of measurement, any assumptions made, and the format and file type of the data. Consider how you will capture this information and where it will be recorded (for instance in a ReadMe-file when you archive your data). Wherever possible you should identify and use existing community standards.
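One way to make sure this information is captured from the start is to generate a ReadMe skeleton when the project folder is created. The sketch below is only an illustration: the section headings and the function name are assumptions, not an NTNU or repository requirement, and should be adapted to your field.

```python
from datetime import date

# Illustrative section headings (an assumption); adapt to your discipline.
README_SECTIONS = [
    "Methodology",
    "File overview",
    "Variables and units",
    "Assumptions and known limitations",
    "Licence and contact",
]


def build_readme(title, creators, sections=README_SECTIONS):
    """Return the text of a plain-text ReadMe skeleton for a dataset."""
    lines = [
        f"Dataset: {title}",
        f"Creators: {', '.join(creators)}",
        f"ReadMe written: {date.today().isoformat()}",
        "",
    ]
    for heading in sections:
        # Each section starts as a placeholder to be filled in during the project.
        lines += [heading.upper(), "[to be completed]", ""]
    return "\n".join(lines)
```

Writing the returned text to a file next to the data keeps the documentation and the data together when they are later archived.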
Metadata is often defined as “data about data”, and metadata standards are standardized ways of describing data. For more information about metadata and metadata standards, visit the website How to FAIR > Metadata. You could also see the RDA Metadata standards catalog to look for metadata standards for your field or data type.
A common metadata standard is Dublin Core, which is a list of 15 standardized elements describing a digital resource. Often, research data repositories (archives) will use some version of Dublin Core when describing datasets. An example is the NTNU institutional archive (NTNU Open Research Data), part of DataverseNO.
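To make the 15 elements concrete, the sketch below lists them and checks a draft record against the list. The record and the helper function are hypothetical examples; a repository such as DataverseNO defines its own required fields, which this sketch does not reproduce.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def unknown_and_missing(record):
    """Return (keys that are not Dublin Core elements, elements left empty)."""
    unknown = sorted(set(record) - set(DUBLIN_CORE_ELEMENTS))
    missing = [e for e in DUBLIN_CORE_ELEMENTS if not record.get(e)]
    return unknown, missing


# Hypothetical draft record for a dataset.
record = {
    "title": "Example interview transcripts",
    "creator": "Nordmann, Kari",
    "date": "2024-01-15",
    "type": "Dataset",
    "format": "text/csv",
    "language": "en",
    "rights": "CC0 1.0",
}

unknown, missing = unknown_and_missing(record)
print("Not Dublin Core:", unknown)
print("Still empty:", missing)
```

Not every element applies to every dataset, so the list of empty elements is a prompt for review rather than a list of errors.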
Also indicate how the data will be organised during the project, mentioning for example naming conventions, version control, and folder structures. Consistent, well-ordered research data will be easier to find, understand, and re-use.
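A naming convention is easier to follow when file names are generated rather than typed. The following sketch assumes one possible pattern, `<project>_<description>_<YYYYMMDD>_v<NN>.<ext>`; the pattern and the function names are illustrative choices, not an NTNU standard.

```python
import re
from datetime import date


def slug(text):
    """Lowercase `text` and replace runs of other characters with a hyphen.

    Only a-z and 0-9 are kept, so letters such as æ, ø, and å are also
    replaced; transliterate them first if that matters for your names.
    """
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def data_filename(project, description, collected, version, extension):
    """Build '<project>_<description>_<YYYYMMDD>_v<NN>.<ext>'."""
    return "{}_{}_{}_v{:02d}.{}".format(
        slug(project),
        slug(description),
        collected.strftime("%Y%m%d"),  # ISO-style date sorts chronologically
        version,
        extension.lstrip(".").lower(),
    )


print(data_filename("Fjord Study", "CTD cast, station 3", date(2024, 3, 7), 2, ".CSV"))
# prints fjord-study_ctd-cast-station-3_20240307_v02.csv
```

Putting the date in year-month-day order and zero-padding the version number means that an alphabetical file listing is also a chronological one.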
For more information, see the wiki Metadata and dataset documentation.
Question: What data quality control measures will be used?
A sufficiently addressed DMP
- Clearly describes the approach taken to ensure and document quality control in the collection of data during the lifetime of the project.
How will the consistency and quality of data collection be controlled and documented? This may include processes such as calibration, repeated samples or measurements, standardised data capture, data entry validation, peer review of data, or representation with controlled vocabularies.
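Data entry validation and controlled vocabularies can be as simple as a check that runs before a row is accepted. The sketch below is a hypothetical example: the column names, the vocabulary, and the plausible range are assumptions chosen for illustration.

```python
# Hypothetical controlled vocabulary and plausible measurement range.
ALLOWED_SPECIES = {"salmo salar", "gadus morhua"}
LENGTH_RANGE_CM = (1.0, 200.0)


def validate_row(row):
    """Return a list of problems found in one data-entry row (empty if none)."""
    problems = []

    # Controlled vocabulary: compare case-insensitively, ignoring outer spaces.
    species = str(row.get("species") or "").strip().lower()
    if species not in ALLOWED_SPECIES:
        problems.append(f"species not in controlled vocabulary: {row.get('species')!r}")

    # Range check: the value must be numeric and within the plausible range.
    try:
        length = float(row.get("length_cm"))
    except (TypeError, ValueError):
        problems.append(f"length_cm is not a number: {row.get('length_cm')!r}")
    else:
        low, high = LENGTH_RANGE_CM
        if not low <= length <= high:
            problems.append(f"length_cm outside {low}-{high}: {length}")

    return problems
```

Recording which checks were applied, and keeping the list of rejected rows, is one way to document quality control in the DMP.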
Section: Storage and backup during the research process
Question: How will data and metadata be stored and backed up during the research process?
A sufficiently addressed DMP
- Clearly (even if briefly) describes:
  - The location where the data and backups will be stored during the research activities.
  - How often backups will be performed.
  - The use of robust, managed storage with automatic backup (for example storage provided by the home institution).
or
- Explains why institutional storage will not be used (and for what part of the data) and describes the (additional) locations, storage media, and procedures that will be used for storing and backing up data during the project.
Describe where your data will be stored during the project period. We recommend using NTNU’s standard storage solutions (see NTNU storage guide). For specific information about backup procedures for the solution you choose for your project, contact IT support at NTNU.
Storing data on laptops, external hard drives, or external storage devices such as USB sticks is not recommended. Be sure to consider information security as well as data integrity and accessibility.
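Data integrity can be checked by recording a checksum for every file and comparing the checksums again later, for example after a transfer or a restore from backup. The following is a minimal sketch using SHA-256 from the Python standard library; the function names are invented for illustration, and it complements rather than replaces managed storage with backup.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root, manifest):
    """Return files that were added, removed, or modified since `manifest`."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )
```

Storing the manifest alongside the data (for example as a JSON file) lets anyone with a copy verify that nothing has changed unintentionally.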
Question: How will data security and protection of sensitive data be taken care of during the research?
A sufficiently addressed DMP
- Clearly explains:
  - How the data will be recovered in the event of an incident.
  - Which institutional and/or national data protection policies are in place and provides a link to where they can be accessed.
  - Who will have access to the data during the research.
- Clearly describes the additional security measures (in terms of physical security, network security, and security of computer systems and files) that will be taken to ensure that stored and transferred data are safe, when sensitive data are involved (for example personal data, politically sensitive information, or trade secrets).
Note that all data should be classified in order to choose the correct level of security and confidentiality. See NTNU's Data Storage Guide for information on how to classify research data. The page sikresiden.no also provides guidance on information security.
Section: Legal and ethical requirements, codes of conduct.
Question: If personal data are processed, how will compliance with legislation on personal data and on data security be ensured?
A sufficiently addressed DMP
- Clearly indicates if personal data will be collected/used as part of the project, and, if applicable, how compliance with applicable legislation will be ensured (for example by gaining informed consent, considering encryption, anonymisation, or pseudonymisation).
- Describes the procedure to manage access to only authorised users.
To ensure compliance with GDPR, all projects with personal data are required to send a notification form describing all relevant elements of the planned data processing to Sikt for an assessment. (The only exception: health research projects at the Faculty of Medicine and Health Sciences.)
All projects with personal data must perform a risk assessment before data collection begins.
If your project includes personal data (any information relating to an identified or identifiable person), consider using the Sikt DMP Tool.
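As an illustration of what pseudonymisation can look like in practice, the sketch below replaces a direct identifier with a keyed hash (HMAC-SHA-256). It is one possible technique under stated assumptions, not a method prescribed by NTNU or Sikt, and the function names are invented. Pseudonymised data are still personal data under GDPR, the key must be stored separately from the data with restricted access, and the notification and risk assessment described above are still required.

```python
import hashlib
import hmac
import secrets


def new_key():
    """Generate a random 32-byte secret key; store it apart from the data."""
    return secrets.token_bytes(32)


def pseudonym(identifier, key, length=16):
    """Return a stable pseudonym for `identifier` under the secret `key`.

    The identifier is normalised (outer spaces removed, lowercased) so that
    trivially different spellings map to the same pseudonym.
    """
    normalised = identifier.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalised, hashlib.sha256)
    return digest.hexdigest()[:length]
```

Using a keyed hash rather than a plain hash means that someone without the key cannot recompute pseudonyms from a list of known names or e-mail addresses. Indirect identifiers in the remaining columns still need to be assessed separately.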
Question: How will other legal issues, such as intellectual property rights and ownership, be managed? What legislation is applicable?
A sufficiently addressed DMP
- Clearly explains, if applicable:
  - Who will have the rights to control access to which part of the data.
  - What access conditions and re-use licenses will apply to the data.
- Clearly explains, if applicable, how intellectual property rights will be managed.
- Explains for multi-partner projects and multiple data owners how these matters are addressed in the consortium agreement.
- Alternatively, there is a clear statement that there are no such restrictions on the data.
- Indicates, if applicable, whether there are any restrictions on the re-use of third-party data.
Consider who will have ownership of and/or rights to the data (including copyright), meaning who will have the right or responsibility to control access and, later, to decide on publishing. In general, if the research project is conducted by NTNU employees, NTNU will have ownership of results and IPR (see the IPR policy, part 4.3). The Policy for Open Science at NTNU states that results from research at NTNU should be made publicly available if possible (for licensing principles, see part 3.1 of the Guidelines for Policy for Open Science). Therefore, consider which data (and other results, such as code, models, and simulations) will be openly accessible after the project is finalized, and whether there will be access restrictions. In the latter case, what restrictions will apply, and why?
If there are external partners, how will this affect ownership and sharing of data and other intellectual property rights (IPR)? For multi-partner projects and projects with multiple data owners, make sure the consortium agreement covers who has the right to control access to the data. The wiki on Contract templates provides more information on formal agreements in collaborative research projects (in Norwegian only).
Note that in some cases, export control regulations will apply to the project results. See Control of knowledge transfer for more information.
Question: How will possible ethical issues be taken into account, and codes of conduct followed?
A sufficiently addressed DMP
- Provides details of what ethical issues have been considered that may affect data storage, transfer, use, sharing and/or preservation, and demonstrates that adequate measures are in place to manage ethical requirements.
- Mentions, if applicable, whether ethical review is being pursued. If ethical approval has been obtained, refers to the relevant committee and documents.
- Refers to relevant ethical guidelines and/or codes of conduct or alternatively provides a clear statement that explains why ethical issues have not been considered.
Is an ethical review (for example by an ethics committee/REK or approval of use of experimental animals) required for data collection in the research project?
Section: Data sharing and long-term preservation
Questions: How and when will data be shared? Are there possible restrictions to data sharing or embargo reasons?
A sufficiently addressed DMP
- Clearly describes how the data and/or metadata will be made discoverable and shared.
- Specifies when data will be shared and under which license.
- Includes the name of the repository, data catalogue, or registry where data will or could be shared.
- Includes information on how long the data will be retained and gives precision on its timely release.
- Clearly explains, if applicable, why data sharing is limited or not possible, and who can access the data under which conditions (for example, only members of certain communities or via a sharing agreement).
- Explains, where possible, what actions will be taken to overcome or to minimise data sharing restrictions.
NTNU encourages Open Science and open data, but this does not mean that all data should be shared openly. Datasets containing personal information are examples of data that should not be shared without careful consideration. Still, in many cases, it is possible to publish selections of data or to anonymise the data. Another option is to obtain consent for archiving personal data with access restrictions. Note that Sikt provides options for archiving data with access restrictions. They do not, however, accept anonymised qualitative data, due to the effort required to verify anonymity. Qualitative data with indirect personal identifiers can only be archived with consent from informants.
For more NTNU-relevant information regarding data sharing, see next question.
Question: How will data for preservation be selected, and where will data be preserved long-term (for example a data repository or archive)?
A sufficiently addressed DMP
- Provides details of what data collected or created in the project will be preserved in the long term and clearly indicates for how long. This should be in alignment with funder, institutional, or national policies and/or legislation, or community standards.
- Provides details of which (versions of) data and accompanying documentation will be retained or destroyed, and explains the rationale (for example contractual, legal requirements, or regulatory purposes).
- Provides details of how the selection is made, and what possible interest there would be for re-use (or not).
- Provides details on how the data, accompanying documentation, and any other required technology such as copies of software in specific versions will be archived in the long term.
- Explains how data will be managed in a sustainable way beyond the lifetime of the grant.
- Provides the name of the archive or trustworthy repository – or the way to curate and preserve data – that will be used to make data available for re-use.
Are there any relevant community-specific repositories/archives for your type of data? You can search for data repositories/archives at www.re3data.org. Data can also be published through generic repositories like Zenodo, or through NTNU's institutional archive. This archive is part of DataverseNO, a CoreTrustSeal-certified data archive which provides access to the data for at least 10 years after deposition. Archiving data through NTNU Open Research Data/DataverseNO facilitates FAIR data by providing a DOI, following the Dublin Core metadata standard, and providing a license for reuse of data (CC0 is the standard). See also the wiki Research Data Repository.
For principles when choosing licenses for data and other research results, see part 3.1 of the Guidelines for Open Science.
Question: What methods or software tools will be needed to access and use the data?
A sufficiently addressed DMP
- Clearly indicates which specific tools or software (for example specific scripts, codes, or algorithms developed during the project, version of the software) potential users may need to access, interpret, and (re-)use the data.
- Provides information, if relevant, on any protocol to access the data (for example if authentication is needed or if there is a data access request procedure).
NTNU recommends the use of open source and open formats where possible. If the data is only available in proprietary formats requiring specific software or tools, consider the possibility of providing an additional copy in an open format even if that entails data loss. If specific code, scripts, or algorithms are developed during the project, they can be published using the GitHub/Zenodo integration. See also the wiki Open source. For principles when choosing licenses, see part 3.1 of the Guidelines for Open Science.
Question: How will the application of a unique and persistent identifier (such as a Digital Object Identifier (DOI)) to each data set be ensured?
A sufficiently addressed DMP
- Specifies how the data can be re-used in other contexts. Clearly indicates if and which PIDs are provided for all datasets, individual datasets, data collections, or subsets. If PIDs will not be used, it explains why.
- Clearly presents the approach, and the choice of identifiers is justified and refers to international standards.
Persistent identifiers like DOIs are usually issued by data repositories (e.g., Zenodo, NTNU Open Research Data etc.). In most cases, your choice of repository will therefore determine the use of persistent identifiers.
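When you list DOIs in a DMP or in dataset documentation, it can help to store them in one consistent form and to cite them as resolvable links. The sketch below uses the DOI of the reference document cited in this guide as its example. The pattern is a pragmatic heuristic rather than the full DOI syntax, and the function names are invented for illustration.

```python
import re
from urllib.parse import quote

# Heuristic: "10.", a 4-9 digit registrant code, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def normalise_doi(text):
    """Strip a leading resolver URL or 'doi:' prefix and check the basic shape."""
    doi = text.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"does not look like a DOI: {text!r}")
    return doi


def doi_url(text):
    """Return the https://doi.org/ link for a DOI."""
    return "https://doi.org/" + quote(normalise_doi(text), safe="/")


print(doi_url("doi:10.5281/zenodo.4915861"))
# prints https://doi.org/10.5281/zenodo.4915861
```

The check only confirms that a string has the shape of a DOI; whether the DOI actually resolves to your dataset is determined by the repository that issued it.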
Section: Data management responsibilities and resources
Question: Who (for example role, position, and institution) will be responsible for data management (i.e. the data steward)?
A sufficiently addressed DMP
- Clearly outlines the roles and responsibilities for data management/stewardship (for example data capture, metadata production, data quality, storage and backup, data archiving, and data sharing), naming responsible individual(s) where possible.
- Clearly indicates who is responsible for day-to-day implementation and adjustments to the DMP.
- Explains, for collaborative projects, the co-ordination of data management responsibilities across partners.
Some funders consider costs related to research data management as a legitimate addition to the project budget. Consider whether you should include costs such as these when applying for research grants and contact the financial officer at your faculty/institute.
Question: What resources (for example financial and time) will be dedicated to data management and ensuring that data will be FAIR (Findable, Accessible, Interoperable, Re-usable)?
A sufficiently addressed DMP
- Provides clear estimates of the resources and costs (for example storage costs, hardware, staff time, costs of preparing data for deposit, and repository charges) that will be dedicated to data management and ensuring that data will be FAIR and describes how these costs will be covered. Alternatively, there is a statement that no additional resources are needed.
The FAIR acronym points to overarching principles for data management that will enhance the potential for reusability. For more information, see the wiki FAIR research data or visit the Go FAIR website.
In short, you increase the FAIRness of data by
- Depositing your data/metadata in a searchable resource.
- Providing all information required for users (computer or human) to read and interpret the data.
- Using available community standards for data and metadata.
- Using open formats and assigning persistent identifiers.
- Providing your data with an appropriate license.
In some cases, there will be additional costs involved in managing data in a way that promotes reuse. Examples could be costs for storage and processing of large amounts of data, or costs related to making particular data types available through repositories. For some projects, there might also be a need for a dedicated data manager or data steward. Although it might be difficult to pinpoint the exact costs, OpenAIRE has developed a data costing tool that lists elements that could be useful to consider when attempting to estimate them.
Reference document
This guide contains excerpts from the DMP Evaluation Rubric in the Science Europe Practical Guide to the International Alignment of Research Data Management - Extended Edition (DOI 10.5281/zenodo.4915861).