Ítalo Oliveira
8/7/2025
5 min read

FAIR data is data that satisfies the FAIR principles:

  • Findability: Metadata and data should be easy to find for both humans and computers.
  • Accessibility: Data and metadata should be accessible, even if authentication and authorization procedures are necessary.
  • Interoperability: The data should be integrated with other data and interoperate with applications or workflows for analysis, storage, and processing.
  • Reusability: Ultimately, provided that data and metadata are well-described, the data should be reusable in different settings.
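In practice, these principles largely come down to publishing rich, machine-readable metadata alongside the data itself. Below is a minimal sketch of such metadata in Python, using rdflib and the DCAT vocabulary; the dataset IRI, title, and licence are placeholder values chosen for illustration, not an actual published dataset.

```python
# Minimal dataset description supporting Findability and Reusability,
# expressed with rdflib and the DCAT vocabulary. All values are placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)

# A persistent, resolvable identifier makes the dataset findable.
dataset = URIRef("https://example.org/dataset/blood-tests-2025")
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Anonymized blood test results, 2025")))
g.add((dataset, DCTERMS.description, Literal("Laboratory measurements linked to diagnostic codes.")))
g.add((dataset, DCTERMS.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))
g.add((dataset, DCAT.keyword, Literal("haematology")))

# Serializing to Turtle yields metadata that both humans and machines can read.
print(g.serialize(format="turtle"))
```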

Although the FAIR principles were originally formulated in 2016 for scientific data, their value for industry in both the public and private sectors is clear: it all comes down to data quality and data management. Given the increasing volume and diversity of data production, complying with the FAIR principles has become key to running organizations and applications efficiently.

This happens because unFAIR data causes numerous problems across the organization. For example, many important questions involve data from multiple sources: What are the most likely and most impactful cyberattacks enabled by certain vulnerabilities? It turns out that, in cybersecurity, different publicly available knowledge bases address each of these elements: CAPEC catalogs attack patterns and ATT&CK describes adversary tactics and techniques; CVE enumerates specific vulnerabilities and CWE classifies the underlying weaknesses; CVSS provides severity scores. In this case, although the data is findable and accessible, its integration and interoperability are limited. Consequently, its reusability is compromised to some extent, requiring tweaks and ad hoc, inefficient solutions.
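To make the gap concrete, the sketch below shows what such a question could look like once the catalogues are expressed as linked data. The IRIs, predicates, and identifiers are illustrative placeholders, not the official schemas of CVE, CWE, CAPEC, or ATT&CK; the point is that a single query can traverse links that today are scattered across separate sources.

```python
# Illustrative links from a vulnerability (CVE) to a weakness class (CWE),
# an attack pattern (CAPEC), and an ATT&CK technique, in a toy namespace.
from rdflib import Graph, Namespace

SEC = Namespace("https://example.org/sec#")
g = Graph()
g.bind("sec", SEC)

g.add((SEC["CVE-2025-0001"], SEC.hasWeakness, SEC["CWE-502"]))
g.add((SEC["CWE-502"], SEC.relatedAttackPattern, SEC["CAPEC-586"]))
g.add((SEC["CAPEC-586"], SEC.mapsToTechnique, SEC["T1190"]))

# One query answers a question that otherwise spans three separate sources:
# which ATT&CK techniques are reachable from this CVE?
q = """
PREFIX sec: <https://example.org/sec#>
SELECT ?technique WHERE {
  <https://example.org/sec#CVE-2025-0001> sec:hasWeakness ?cwe .
  ?cwe sec:relatedAttackPattern ?capec .
  ?capec sec:mapsToTechnique ?technique .
}
"""
for row in g.query(q):
    print(row.technique)
```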

This example concerns public data. Making good use of internal, closed data silos can be at least as challenging, since we ideally want both internal data integration and connections to external public data. For instance, consider an AI model that predicts diseases based on medical reports and blood test data. To train such a model well, regardless of the technique employed, we need to properly organize internal private data and link it to publicly available data.
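A minimal sketch of that linking step, assuming a toy internal record annotated with public LOINC and ICD-10 codes (the field names and the specific codes are illustrative):

```python
# Internal records become joinable with public knowledge (reference ranges,
# disease descriptions) through shared public codes. Codes are illustrative.
from dataclasses import dataclass

@dataclass
class BloodTestResult:
    patient_id: str     # internal identifier, never exposed externally
    loinc_code: str     # link to the public LOINC vocabulary for the analyte
    value: float
    unit: str

@dataclass
class Diagnosis:
    patient_id: str
    icd10_code: str     # link to the public ICD-10 classification

record = BloodTestResult(patient_id="P-001", loinc_code="718-7", value=9.8, unit="g/dL")
label = Diagnosis(patient_id="P-001", icd10_code="D64.9")
```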

The data FAIRification process is not trivial: it involves a pipeline that handles data and metadata from various sources.
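As a rough illustration, assuming a toy tabular source, the pipeline can be thought of as a few small steps, each serving one of the principles; the step names, column mapping, and identifier scheme below are assumptions made for the sketch, not a fixed recipe.

```python
# A toy FAIRification pipeline: assign persistent identifiers, map local
# column names to shared vocabulary terms, and describe the dataset itself.
from typing import Dict, List
import uuid

BASE_IRI = "https://example.org/id/"                 # placeholder identifier scheme
COLUMN_TO_TERM = {"hb": "https://loinc.org/718-7"}   # illustrative mapping to a shared term

def assign_identifier(record: Dict) -> Dict:
    """Findability: give each record a persistent, resolvable identifier."""
    return {**record, "@id": BASE_IRI + str(uuid.uuid4())}

def map_to_vocabulary(record: Dict) -> Dict:
    """Interoperability: replace local column names with shared vocabulary terms."""
    return {COLUMN_TO_TERM.get(key, key): value for key, value in record.items()}

def dataset_metadata(records: List[Dict]) -> Dict:
    """Accessibility and Reusability: describe the dataset (licence, size, provenance)."""
    return {
        "licence": "https://creativecommons.org/licenses/by/4.0/",
        "recordCount": len(records),
        "source": "internal-lab-system",
    }

raw = [{"hb": 9.8}, {"hb": 13.2}]
fair = [map_to_vocabulary(assign_identifier(r)) for r in raw]
print(dataset_metadata(fair))
```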


The outcome more than pays off, since data integration and reusability enable advanced services and AI applications by leveraging data links that were only informally understood before the FAIRification process. It is important to highlight that privacy and security concerns should be an essential part of this process, particularly in sensitive domains such as healthcare.

Go FAIR!

Our team at Y.digital has expertise in this FAIRification process, especially in semantic information modeling and linked data. If you want to know more, feel free to contact us.

