Data modelling

by Niklas K.

What is the goal of graph data modeling with respect to queryability?

The model should make the questions users want to ask easy to express as queries.

Key steps:
1. Define user goals.
2. Turn goals into questions.
3. Identify relevant entities and relationships.
4. Model them as graph patterns or path expressions.
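
The four steps above can be sketched on a toy example. The goal ("suggest colleagues"), the question ("who works at the same company as Alice?"), the node names, and the WORKS_AT relationship are all assumptions made up for illustration. The graph is a plain in-memory edge list, not a real graph database.

```python
from collections import defaultdict

# Hypothetical example data: (start node, relationship type, end node).
EDGES = [
    ("alice", "WORKS_AT", "acme"),
    ("bob", "WORKS_AT", "acme"),
    ("carol", "WORKS_AT", "globex"),
]


def build_index(edges):
    """Index edges by (node, relationship type) in both directions."""
    out_idx, in_idx = defaultdict(list), defaultdict(list)
    for start, rel_type, end in edges:
        out_idx[(start, rel_type)].append(end)
        in_idx[(end, rel_type)].append(start)
    return out_idx, in_idx


def colleagues(person, out_idx, in_idx):
    """Match the pattern (person)-[:WORKS_AT]->(company)<-[:WORKS_AT]-(other)."""
    found = set()
    for company in out_idx[(person, "WORKS_AT")]:
        for other in in_idx[(company, "WORKS_AT")]:
            if other != person:
                found.add(other)
    return sorted(found)


OUT_IDX, IN_IDX = build_index(EDGES)
print(colleagues("alice", OUT_IDX, IN_IDX))
```

The question maps directly onto one path pattern, which is the property the modeling steps aim for.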

What should be avoided when modeling graphs, and why?

Avoid encoding entities as relationships — use relationships to express how entities are related, not what they are.

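Be cautious: nouns aren’t always entities, and verbs aren’t always relationships (e.g., in “Alice sends a mail to Bob”, the mail itself is usually best modeled as a node rather than folded into a single relationship between Alice and Bob).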
Don’t merge data prematurely to optimize for queries; instead, let the graph grow naturally with nodes and relationships that match typical queries.
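
A minimal sketch of the mail example, assuming the mail is modeled as its own node. The labels (Person, Email), the relationship types (SENT, TO, CC), and the extra recipient Carol are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    props: dict = field(default_factory=dict)


@dataclass
class Rel:
    start: Node
    rel_type: str
    end: Node


alice = Node("Person", {"name": "Alice"})
bob = Node("Person", {"name": "Bob"})
carol = Node("Person", {"name": "Carol"})

# The mail is an entity of its own, so it becomes a node, not a relationship.
mail = Node("Email", {"subject": "Status update"})

RELS = [
    Rel(alice, "SENT", mail),
    Rel(mail, "TO", bob),
    Rel(mail, "CC", carol),  # a third participant attaches without remodeling
]


def recipients(email, rels):
    """Return the names of everyone the given mail node points to via TO or CC."""
    return sorted(
        r.end.props["name"]
        for r in rels
        if r.start is email and r.rel_type in {"TO", "CC"}
    )


print(recipients(mail, RELS))
```

A single relationship from Alice to Bob connects exactly two nodes, so it would have no place to attach Carol or further facts about the mail.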

What are the general guidelines for using nodes and relationships in graph modeling?

Nodes represent entities (things of interest), with labels and properties for attributes and metadata.

Relationships define connections and semantics between entities.
Use direction to clarify meaning; for symmetric cases, ignore direction in queries rather than duplicating.
Relationship properties can store strength, weight, or metadata (e.g., timestamps).
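
The guideline on symmetric relationships can be sketched as follows: each friendship is stored once with an arbitrary direction, and the query decides whether direction matters. The FRIEND_OF type, the names, and the "since" property are illustrative assumptions.

```python
# Each edge: (start, relationship type, end, relationship properties).
EDGES = [
    ("alice", "FRIEND_OF", "bob", {"since": 2019}),
    ("carol", "FRIEND_OF", "alice", {"since": 2021}),
]


def neighbors(node, rel_type, edges, directed=True):
    """Follow outgoing edges only, or both directions when directed=False."""
    result = []
    for start, edge_type, end, _props in edges:
        if edge_type != rel_type:
            continue
        if start == node:
            result.append(end)
        elif not directed and end == node:
            result.append(start)
    return sorted(result)


print(neighbors("alice", "FRIEND_OF", EDGES))                  # outgoing only
print(neighbors("alice", "FRIEND_OF", EDGES, directed=False))  # direction ignored
```

Storing a second, reversed FRIEND_OF edge for every pair would double the data and would have to be kept in sync by hand.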

What is the trade-off between fine-grained and generic relationships in graph modeling?

Fine-grained relationships (e.g., DELIVERY_ADDRESS, HOME_ADDRESS) offer faster traversal by distinguishing semantics via relationship names.

Generic relationships (e.g., ADDRESS {type: 'delivery'}) reduce schema complexity but incur extra I/O to check properties during traversal.
Choosing fine-grained names can optimize performance and clarity, especially for frequent queries.
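
The trade-off can be illustrated with a toy cost model. It assumes relationships are indexed by (node, relationship type), and it counts how many relationship property maps have to be inspected. This is a simplification for illustration and does not describe how any particular database stores relationships.

```python
from collections import defaultdict

# Fine-grained model: the relationship name carries the semantics.
fine = defaultdict(list)
fine[("alice", "HOME_ADDRESS")].append("addr_1")
fine[("alice", "DELIVERY_ADDRESS")].append("addr_2")

# Generic model: one relationship type, semantics stored in a property.
generic = defaultdict(list)
generic[("alice", "ADDRESS")].append(("addr_1", {"type": "home"}))
generic[("alice", "ADDRESS")].append(("addr_2", {"type": "delivery"}))


def delivery_fine(person):
    """One index lookup by relationship type; no properties are read."""
    return list(fine[(person, "DELIVERY_ADDRESS")])


def delivery_generic(person):
    """Every ADDRESS relationship must be opened to check its 'type' property."""
    hits, checked = [], 0
    for target, props in generic[(person, "ADDRESS")]:
        checked += 1
        if props["type"] == "delivery":
            hits.append(target)
    return hits, checked


print(delivery_fine("alice"))
print(delivery_generic("alice"))
```

Both functions return the same address, but the generic variant touches every ADDRESS relationship of the person, and that cost grows with the number of addresses.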

When should you model something as a separate node in a graph?

Facts: When multiple entities interact over time (e.g., events), model the interaction as a node with connections and timestamps.

Complex Value Types: For multi-field, identity-less values (e.g., Address), model them as nodes to encapsulate their structure.

How can time be modeled in graph databases?

Two common techniques:
- Timeline trees: Break down time hierarchically (e.g., Year → Month → Day → Event).
- Linked lists: Represent time-ordered sequences with NEXT and PREVIOUS relationships.
These methods can be combined for efficient querying of both structure and chronology.
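
A minimal sketch of combining both techniques, using nested dictionaries for the timeline tree and two dictionaries for the NEXT and PREVIOUS links. The event names and dates are made up, and the sketch assumes event names are unique.

```python
from datetime import date


def build_timeline(events):
    """events: list of (date, name) pairs with unique names.

    Returns (tree, next_of, prev_of):
      tree[year][month][day] -> list of event names (timeline tree)
      next_of / prev_of      -> chronological linked list in both directions
    """
    tree = {}
    for day, name in events:
        tree.setdefault(day.year, {}).setdefault(day.month, {}).setdefault(
            day.day, []
        ).append(name)

    # Sort by date (ties broken by name) to derive the NEXT/PREVIOUS chain.
    ordered = [name for _, name in sorted(events)]
    next_of = dict(zip(ordered, ordered[1:]))
    prev_of = dict(zip(ordered[1:], ordered))
    return tree, next_of, prev_of


EVENTS = [
    (date(2024, 3, 2), "deploy"),
    (date(2024, 1, 15), "kickoff"),
    (date(2024, 3, 2), "rollback"),
]

TREE, NEXT_OF, PREV_OF = build_timeline(EVENTS)
print(TREE[2024][3][2])      # all events on one day, found via the tree
print(NEXT_OF["kickoff"])    # the following event, found via the linked list
```

The tree answers "what happened on this day or in this month?", while the links answer "what came right before or after this event?" without scanning the tree.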

Last modified
a month ago