Introduction
A theory has only the alternative of being right or wrong. A model has a third possibility: it may be right but irrelevant.
The Physicists Conception of Nature (ed. Jagdish Mehra) Reidel
There is only one reality, however, there are multiple ways to model that reality. Information modelling by nature is a reduction exercise — given that information models are used within particular contexts, only those aspects useful to those contexts should remain in the resulting information models.
Omission of essential aspects of the reality in an information model can be detrimental, and to a lesser extent, representation of non-essential aspects are superfluous.
The art of creating information models interoperable between groups and domains is the more challenging as multiple domains require different aspects of the same information model.
The practice of information modelling is one that rewards care, accuracy and expertise, and this course explains the tried and true best practices for the art of modelling with EXPRESS.
Modellers and the modelling team
In an information modelling exercise with a slightly larger scope requires collaboration between individuals, which we call the "modelling team".
The modeling team consists of three groups:
-
Technical experts who know the model domain.
-
Modeling experts who help the technical experts develop the model.
-
Reviewers, who are domain experts, to ‘keep the developers honest’
Process of modelling
Generally, modelling consists of the following steps:
-
Define the scope
-
‘Find the objects’
-
‘Sketch’ the structure
-
Generalisation and specialisation
-
Refine the structure
-
Eliminate synonyms and homonyms
-
Cardinality constraints
-
Local constraints
-
Global constraints
-
Partition
Principles of modelling
These principles form the core knowledge of modelling best practices. Each aspect below are elaborated separately in the sections that follow.
-
Define Scoping
-
Maintain Readability
-
Rely on Simple Types as possible
-
Independence between information models
-
Good Structural separation
-
Proper Constraints modelling
Best practices
Scoping
A model must have a well defined scope (otherwise how do you know when you are finished?).
Before modeling:
-
Define the intended model scope.
-
Document the assumptions.
Readability
An information model, although if defined via EXPRESS is computer interpretable, should primarily be human understandable.
Modeling constructs should be chosen to aid the reader rather than obfuscate understanding by using complex, intertwined or opaque definitional relationships.
Simple Types
The EXPRESS simple types (i.e Integer, String, etc) carry virtually no semantics.
As far as possible, avoid the use of simple types as entity attributes. Instead, use Types to provide the semantics.
ENTITY antique_object;
produced : date;
maker : name;
DERIVE
years_old : age := elapsed(produced);
END_ENTITY;
TYPE name = STRING; END_TYPE;
TYPE age = INTEGER; END_TYPE;
Independence
Implementation independence
An information model defines the information necessary for some purpose; it does not define the data structuring.
-
Model objects, not syntax.
-
Model the ‘real world’ not a technology-based representation.
-
Do not define ‘precision’ of data types.
-
Do not define ‘default’ data.
Schema independence
In EXPRESS, each Schema defines a scope; definition names need only be unique within a Schema.
Attempt to maintain name uniqueness across all schemas in a model (see the Nym Principle). This will assist when restructuring a model, if necessary, by modifying the schema boundaries.
Context independence
Each entity exists in a context in which it may be used. This may vary from extremely broad to highly specific. An entity definition should be as context independent as possible, yet as context specific as required.
-
Only apply the minimum necessary number of constraints.
-
Use Subtyping to get more specificity.
Structural
Partitioning
-
A
SCHEMA
defines a scope and a context. -
A model is often partitioned into a set of schemas (e.g to increase readability, to partition the development work, etc.)
-
Minimise the interaction between schemas.
-
Within a schema, minimise the constraints on the objects in question (to promote re-usability).
Nym Principle
In general, ‘one name, one meaning, one definition’. Synonyms and homonyms in a model are a fruitful and never-ending source of confusion.
-
‘If things are the same, then they should have the same name.’
-
‘If things are not the same, then they are different.’
-
‘Different things should have different names.’
Invariance
The meaning of an entity should not be dependent on the values of its attributes. Do not use ‘flags’ to change meanings.
ENTITY poor_person_model;
sex : enumeration_of_male_female;
... -- gender related attributes
... -- non-gender attributes
END_ENTITY;
ENTITY good_person_model
SUPERTYPE OF (ONEOF(male, female));
... -- non-gender attributes
END_ENTITY;
-- gender related attributes put into
-- the relevant subtypes
Redundancy
A model should not contain redundant information; redundancy leads to the possibility of data inconsistencies.
ENTITY circle;
center : point;
radius : REAL;
DERIVE
perimeter : REAL := 2.0*PI*radius;
diameter : REAL := 2.0*radius;
END_ENTITY;
Inheritance
A Subtype inherits all the properties of its Supertype.
For readability it may appear desirable to migrate the common properties down to the leaves of the supertype tree. This, however, implies that the common properties are semantically different.
All common properties should be moved as close to the root of the Supertype tree as possible. This demonstrates that they ARE common.
SCHEMA first;
ENTITY aaa;
-- attributes
END_ENTITY;
ENTITY original;
attr : NUMBER;
END_ENTITY;
END_SCHEMA; -- first
SCHEMA second;
USE FROM first (aaa AS bbb);
REFERENCE FROM first (original);
ENTITY constrained
SUBTYPE OF (original);
attr : INTEGER(7);
WHERE
positive : attr > 0;
END_ENTITY;
END_SCHEMA; -- second
Constraints
General
An EXPRESS information model is permissive (i.e what is not explicitly prohibited is permissable).
Add all necessary constraints — a model is as much about the limitations of objects as it is about the objects themselves.
TYPE age = INTEGER;
WHERE
non_negative : SELF >= 0;
END_TYPE;
Ordering of constraints
Specify constraints by the following ordered preferences:
-
Model structure
-
Local constraints
-
Global rules
Global rules
ENTITY male SUBTYPE OF (person);
wife : OPTIONAL female;
-- other attributes
END_ENTITY;
ENTITY female SUBTYPE OF (person);
husband : OPTIONAL male;
-- other attributes
END_ENTITY;
RULE married FOR (male, female);
-- check declared husbands
-- and wives match each other
END_RULE;
Local rules
ENTITY male SUBTYPE OF (person);
wife : OPTIONAL female;
-- other attributes
WHERE
-- check wife says she is
-- married to me
END_ENTITY;
ENTITY female SUBTYPE OF (person);
husband : OPTIONAL male;
-- other attributes
WHERE
-- check husband says he is
-- married to me
END_ENTITY;
Structural
ENTITY male SUBTYPE OF (person);
-- other attributes
END_ENTITY;
ENTITY female SUBTYPE OF (person);
-- other attributes
END_ENTITY;
ENTITY married;
husband : male;
wife : female;
UNIQUE
no_bigamy : husband;
no_polyandry : wife;
END_ENTITY;