Overview
This research introduces a design method for identifying reusable metadata in property-graph schemas. Descriptive properties frequently recur across diverse nodes and edges, and schema designers must decide whether each one should remain embedded or be restructured as reusable metadata. The method provides a systematic framework for that decision.
Research Context
In property-graph schemas, descriptive properties often recur, yet schema designers lack a defined methodology for deciding whether to externalize or embed them. The proposed method addresses this design-stage problem from a 5GNF-oriented modeling perspective.
Approach
The method identifies metadata candidates using five criteria:
- Cross-element occurrence
- Conceptual independence
- Lossless externalization
- Reuse potential
- Governance relevance
A rule-based decision workflow evaluates each property against these criteria and assigns it to one of three groups: trait candidates, embedded properties, or borderline cases. The approach is demonstrated on an example from a library domain.
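To make the shape of such a workflow concrete, the following is a minimal Python sketch. The summary does not state the actual decision rules, so the two rules in `classify` are assumptions chosen only for illustration, as are the names `Assessment`, `Category`, and `classify` and the three sample properties (`last_modified`, `title`, `language`), which are invented and not taken from the paper's library example. Each criterion is recorded as a yes/no judgment supplied by the designer, since the criteria call for semantic interpretation rather than automatic detection.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    TRAIT_CANDIDATE = "trait candidate"
    EMBEDDED = "embedded property"
    BORDERLINE = "borderline case"


@dataclass(frozen=True)
class Assessment:
    """A designer's yes/no judgment of one property against the five criteria."""
    name: str
    cross_element_occurrence: bool
    conceptual_independence: bool
    lossless_externalization: bool
    reuse_potential: bool
    governance_relevance: bool


def classify(a: Assessment) -> Category:
    # Assumed rule 1 (not from the source): a property that is not
    # conceptually independent, or cannot be externalized without loss,
    # stays embedded no matter how often it recurs.
    if not (a.conceptual_independence and a.lossless_externalization):
        return Category.EMBEDDED
    # Assumed rule 2 (not from the source): a property that also satisfies
    # the remaining three criteria is a trait candidate.
    if a.cross_element_occurrence and a.reuse_potential and a.governance_relevance:
        return Category.TRAIT_CANDIDATE
    # Everything else is left for the designer to review.
    return Category.BORDERLINE


# Hypothetical library-domain properties with invented judgments.
examples = [
    Assessment("last_modified", True, True, True, True, True),
    Assessment("title", True, False, True, False, False),
    Assessment("language", True, True, True, True, False),
]

for a in examples:
    print(f"{a.name}: {classify(a).value}")
```

Under these assumed rules, `title` stays embedded even though it recurs across elements, because it is judged not to be conceptually independent of the element it describes. `last_modified` satisfies all five criteria and becomes a trait candidate, while `language` is left as a borderline case.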
An illustrative validation, based on participant classification tasks in two distinct schema contexts, examined the practical application and implications of the proposed method.
Findings
The illustrative validation yielded two findings regarding metadata identification:
- Recurrence alone was an insufficient basis for externalizing properties.
- Identifying metadata candidates requires semantic interpretation, not just a count of how often a property occurs.
The core contribution of this paper is methodological: it provides a more explicit and systematic foundation for deciding when descriptive properties should be modeled as reusable metadata in property-graph schemas.