www.entitymodelling.org - entity modelling introduced from first principles - relational database design theory and practice - dependent type theory
In Part One we showed how entity models describe both the types of things (entity types) and the relationships between different types of things within a chosen perspective, a perspective which is then considered the whole, or the absolute, from a logical point of view. In this chapter we focus on how entities may be communicated. In doing so we explain how entity models are used for data specification, i.e. how they may prescribe or, a posteriori, document the structure of data.
To use entity modelling in this way in the description and construction of information systems we require, alongside the meta-concepts of entity type and relationship, a third and vital meta-concept: that of attribute. In the literature an attribute is variously defined as
Whatever you make of these definitions, it is clear that examples are needed; our first examples are given below and others follow in subsequent sections. Shlaer and Mellor [4] illustrate these same concepts with examples in which entities correspond to the rows of example tables and attributes correspond to the columns; other authors do likewise, and at least one treats "column of a table" and "attribute" as synonymous terms [5]. We will also rely on examples in tabular form but with one important difference: we shall have examples in which some rows have other rows nested within them. In our examples the nested rows represent dependent entities; they are a visible representation of compositional structure within an entity model, i.e. of composition relationships between entities.
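As a rough illustration of rows nested within rows, and not one of this book's own examples, the following Python sketch holds a small table in which each row of a hypothetical entity type `order` carries nested rows of a hypothetical dependent entity type `order_line`. The entity types, attribute names and the helper `dependent_rows` are all assumptions made for the sake of the sketch.

```python
# Hypothetical example data: the entity types "order" and "order_line"
# and every attribute name here are invented for illustration.
orders = [
    {
        "order_no": 1001,
        "customer": "Acme Ltd",
        # Nested rows: each order line exists only as a part of its order.
        "lines": [
            {"line_no": 1, "product": "widget", "quantity": 3},
            {"line_no": 2, "product": "gasket", "quantity": 10},
        ],
    },
    {
        "order_no": 1002,
        "customer": "Bolt & Co",
        "lines": [
            {"line_no": 1, "product": "washer", "quantity": 250},
        ],
    },
]


def dependent_rows(rows, part_name, parent_key):
    """Yield (parent key value, nested row) for every nested row."""
    for row in rows:
        for nested in row.get(part_name, []):
            yield row[parent_key], nested


# Pair each nested row with the key of the row that contains it.
flat = list(dependent_rows(orders, "lines", "order_no"))
for order_no, line in flat:
    print(order_no, line["line_no"], line["product"], line["quantity"])
```

Note that a nested row does not repeat the `order_no` of its parent; its position inside the parent row is what ties it to that order.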
Now, when we call data to mind, we think of names, quantities, monetary values, addresses, dates, temperatures, geographical coordinates, and so on. Such items of data convey information only within specific contexts and when attributed to the subjects at hand. A temperature, a colour, a price, a height, a distance: all these tell us nothing unless they be the temperature, the colour, the price, the height or the distance of some thing. We can paraphrase in the language of entity modelling and say that they tell us nothing unless they be attributed to an entity. This, then, is our starting point.
I ask you to start with the view that data only conveys information when it is embedded in messages built systematically following some sort of rules, just as vocabulary is only meaningful within the context of text that is grammatical and free from category mistakes and other howlers. Primarily we focus on data as the constituent parts of messages rather than thinking of it as content within a database. If we do this then data specification is a more general term than data modelling, and a methodology for data specification, i.e. for the specification of message structures, is of more general utility than one for the specification of database structure, i.e. for data modelling, for the former subsumes the latter.
Data specification is the act of specifying rules by which data will be combined and communicated to convey information about subject entities. The same entities, essentially the same data, may take many different forms: they may be stored in a database, communicated over a network, enriched by a program according to a set of rules, or displayed to a user, say on a web page. Each form that they take, when analysed, will likely have a different message structure, but each will map one to the other and therefore one to all.
Just as we conceive of an abstract linguistic structure common to speech and writing, for our purposes here we require an abstract concept of a message system; what we lose in ease of explanation we gain in generality. To achieve this level of abstraction, we take it as incidental, i.e. as a given, how we communicate universals; among these are the terminal instances, including numbers, dates and strings, and the identities of entity types and attributes. We also take it as given that one set of messages may be embedded or otherwise presented within the context of another.
We require that a message communicates the identifying features of the subject entity and the value of each of its attributes (how such values are communicated we have said is taken as given), and that it communicates all the relationships of the subject entity with other entities. Optionally, it may communicate one or more of the subject entity's parts (i.e. entities reached through composition relationships), recursively. Finally, we require that all data in a message communicate something, i.e. that there is no redundant data in the overall message set that describes a state of affairs.
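The requirements just listed can be pictured, under stated assumptions, as a recursive data structure. The Python sketch below is only one possible reading and not a definition given in this text: the class `Message`, its field names, the function `entities_described` and the order example are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Message:
    """One possible shape for a message about a single subject entity."""
    entity_type: str                      # identity of the entity type
    identity: Dict[str, Any]              # identifying features of the subject
    attributes: Dict[str, Any] = field(default_factory=dict)
    # Relationship name -> identities of the related (non-part) entities.
    relationships: Dict[str, List[Dict[str, Any]]] = field(default_factory=dict)
    # Composition relationship name -> messages about the parts (recursive).
    parts: Dict[str, List["Message"]] = field(default_factory=dict)


def entities_described(msg: Message) -> int:
    """Count the subject entity plus all parts communicated, recursively."""
    return 1 + sum(
        entities_described(part)
        for part_list in msg.parts.values()
        for part in part_list
    )


# Hypothetical message: an order, related to a customer, with two lines.
order = Message(
    entity_type="order",
    identity={"order_no": 1001},
    attributes={"order_date": "2024-03-01"},
    relationships={"placed_by": [{"customer_no": 42}]},
    parts={
        "lines": [
            Message("order_line", {"line_no": 1}, {"quantity": 3}),
            Message("order_line", {"line_no": 2}, {"quantity": 10}),
        ]
    },
)

print(entities_described(order))
```

In this sketch a nested `order_line` message carries only its own `line_no` and does not repeat the `order_no` of its parent, which is one way of reading the requirement that no data in the message set be redundant.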
Historically, as used for data specification, an ER model will often be said to be logical or physical or to constitute a data model; I will use the terms somewhat differently as explained in the final section which finishes with an example from Chen.