The taxonomic dictionary
1. Some definitions
What is a taxonomy?
What is a taxonomic dictionary?
What is an ontology?
2. The taxonomic dictionary content
3. Kinds of relations
4. Overview of the dictionary content
5. What makes the Gellish Dictionary a Smart Dictionary
6. Relations that enable Knowledge-aided design
A taxonomy is a hierarchical network of concepts, in which each concept is related to one or more other concepts that are its supertype concepts. The top of the total hierarchy is the concept ‘anything’. A domain taxonomy is a taxonomy for a particular domain that has one or more top level concepts that are direct or indirect subtype-concepts of anything.
A subtype-supertype relation is also called a specialization-generalization relation and can be denoted by the phrase <is a kind of> or by one of its synonymous phrases.
If a subtype concept is qualitative or is a value, then the kind of relation between the qualitative value and its conceptual supertype concept can be specified by a relation that is a specialization of such a subtype-supertype relation and is called a qualitative subtype relation. It is denoted by the phrase ‘is a qualitative subtype of’.
If a subtype concept is a manufacturer's model, or a model and size, then the kind of relation may be specified by a further subtype of the qualitative subtype relation, which is denoted by the phrase ‘is a manufacturers model of’.
Examples of subtype-supertype relations are:
- centrifugal pump <is a kind of> pump
- man <is a kind of> person
- red <is a qualitative subtype of> color
- Ford Mustang <is a manufacturers model of> car
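The examples above can be stored as simple triples and traversed to find all direct and indirect supertypes of a concept. The following is a minimal sketch, not the Gellish Expression Format: the triple layout, the function names and the shortened links up to ‘anything’ are assumptions made for illustration.

```python
from collections import deque

# Phrases from the text that all denote (specialized) subtype-supertype relations.
SPECIALIZATION_PHRASES = {
    "is a kind of",
    "is a qualitative subtype of",
    "is a manufacturers model of",
}

# Facts as (left object, relation phrase, right object).
# Assumption: the links up to 'anything' are shortened for this sketch;
# a real taxonomy has intermediate levels.
FACTS = [
    ("centrifugal pump", "is a kind of", "pump"),
    ("man", "is a kind of", "person"),
    ("red", "is a qualitative subtype of", "color"),
    ("Ford Mustang", "is a manufacturers model of", "car"),
    ("pump", "is a kind of", "anything"),
    ("person", "is a kind of", "anything"),
    ("color", "is a kind of", "anything"),
    ("car", "is a kind of", "anything"),
]


def direct_supertypes(concept, facts=FACTS):
    """Return the supertypes that are explicitly related to the concept."""
    return [
        right
        for left, phrase, right in facts
        if left == concept and phrase in SPECIALIZATION_PHRASES
    ]


def all_supertypes(concept, facts=FACTS):
    """Return direct and indirect supertypes, nearest first."""
    found = []
    queue = deque(direct_supertypes(concept, facts))
    while queue:
        current = queue.popleft()
        if current not in found:
            found.append(current)
            queue.extend(direct_supertypes(current, facts))
    return found


def is_subtype_of(concept, candidate, facts=FACTS):
    """True if candidate is a direct or indirect supertype of concept."""
    return candidate in all_supertypes(concept, facts)


print(all_supertypes("centrifugal pump"))
```

Because the qualitative subtype and manufacturers model relations are themselves specializations of the subtype-supertype relation, the sketch treats all three phrases as hierarchy links.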
In Gellish, concepts are represented by language-independent unique identifiers, so that each concept can be denoted by one or more names, synonyms, codes, abbreviations and translations.
A taxonomic dictionary is a dictionary of which each lemma represents a name of a concept, whereas the concepts are explicitly defined and related to each other as in a taxonomy.
An ontology is a taxonomic dictionary that is extended with relations of other kinds between concepts, whereas those relations represent knowledge about the related concepts.
A Gellish taxonomic dictionary, such as the dictionary of Formalized English, is an electronic ‘smart’ dictionary that is also a taxonomy and an ontology. This means that it contains definitions of concepts, each of which is identified by a unique identifier (UID) and can be referenced by one or more ‘names’ (terms, including synonyms, abbreviations and codes), and that it includes relations between concepts, among others subtype-supertype relations. The vocabulary of the formal language, being the names of concepts, consists mainly of ordinary natural language terms, many of which can also be found in ordinary English dictionaries. All definitions satisfy the rules for proper definitions of concepts in Gellish. This means, among other things, that every concept has a textual definition that refers to its supertype concept(s) and that it has at least one explicit relation with a supertype concept. Together the concepts thus form a consistent subtype-supertype hierarchy of concepts, called a taxonomy. The Gellish Modeling Methodology provides guidelines for the extension of the taxonomic dictionary and for the creation of domain taxonomic dictionaries or just vocabularies in other languages. Such extensions or modifications may lead to separate domain dictionaries for specialized application areas. As long as the common unique identifiers are used, the availability of vocabularies in various languages enables automated translation of Gellish expressions and their presentation in a generalized user interface.
The Gellish basic language definition in the taxonomic dictionary of Formalized English is available free of charge under Open Source conditions (through an Open Source License) via this website.
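The role of common UIDs in automated translation can be illustrated with a small sketch. The UIDs below are invented, the Dutch strings are illustrative translations rather than entries copied from the Gellish dictionary, and the `render` function is a hypothetical helper.

```python
# Assumption: UIDs and Dutch terms are made up for this illustration.
NAMES = {
    900001: {"en": "pump", "nl": "pomp"},
    900002: {"en": "centrifugal pump", "nl": "centrifugaalpomp"},
    900003: {"en": "is a kind of", "nl": "is een soort"},
}

# An expression stored as UIDs only:
# (left object, kind of relation, right object).
EXPRESSION = (900002, 900003, 900001)


def render(expression, language, names=NAMES):
    """Present a UID-based expression using the vocabulary of one language."""
    return " ".join(names[uid][language] for uid in expression)


print(render(EXPRESSION, "en"))
print(render(EXPRESSION, "nl"))
```

The stored expression contains no words of any language; only the vocabulary lookup differs per language.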
The most important concepts in the Gellish taxonomic dictionary are kinds of relations. They determine the semantic expression power of the language. Each kind of relation is identified by a UID and is denoted by a name, which may be a short expression, such as ‘being manager of an organization’. Such a name is accompanied by a base phrase, such as ‘is a manager of’, and an inverse phrase, such as ‘has as manager’. A separate relation denotes which kind of thing should play the first role in the relation (in this example the role ‘manager’ and the role player ‘person’) and which kind of thing should play the second role (the role ‘managed’ and the role player ‘organization’). A base phrase requires that the player of the first role is located at the left hand side of the phrase, according to normal English grammar, and that the player of the second role is at its right hand side. For an inverse phrase this is the other way around.
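The base phrase and inverse phrase mechanism can be sketched as follows. The class layout, the UID value and the individuals ‘John’ and ‘Acme’ are assumptions for illustration only; the phrases and roles are taken from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationKind:
    uid: int                  # invented UID, not a real Gellish UID
    name: str
    base_phrase: str
    inverse_phrase: str
    first_role: str
    first_role_player: str    # kind of thing that plays the first role
    second_role: str
    second_role_player: str   # kind of thing that plays the second role


MANAGER_OF = RelationKind(
    uid=900010,
    name="being manager of an organization",
    base_phrase="is a manager of",
    inverse_phrase="has as manager",
    first_role="manager",
    first_role_player="person",
    second_role="managed",
    second_role_player="organization",
)


def normalize(left, phrase, right, kind):
    """Return (first role player, kind UID, second role player).

    A base phrase has the first role player on its left hand side;
    an inverse phrase has it on its right hand side.
    """
    if phrase == kind.base_phrase:
        return (left, kind.uid, right)
    if phrase == kind.inverse_phrase:
        return (right, kind.uid, left)
    raise ValueError(f"phrase {phrase!r} does not denote {kind.name!r}")


# Both expressions state the same fact.
print(normalize("John", "is a manager of", "Acme", MANAGER_OF))
print(normalize("Acme", "has as manager", "John", MANAGER_OF))
```

Normalizing to the order of the roles means that a fact is stored once, regardless of which of the two phrases was used to express it.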
When searching for kinds of relations for expressing information, users (and supporting systems) should be aware that kinds of relations are typically denoted in expressions by ‘phrases’; thus kinds of relations have phrases as their ‘names’. The phrases that denote the kinds of relations follow logical patterns. The following patterns can be recognized and should be helpful in searching for the proper kinds of relations:
- – is a / has a – Information about individual things is typically expressed as relations between two individual things. Such relations are typically denoted by phrases that conform to a particular pattern. Typically the phrases start with ‘is a’, or they can be denoted by their inverse phrases, which typically start with ‘has a’ or ‘has as’. Such words are typically followed by the kind of role that is played by one of the objects in the relation. Typically such phrases do not terminate with ‘a’. For example, the phrases ‘is a friend of’ and ‘has as friend’ both conform to that pattern. Software that supports searching for such phrases should thus allow searching for a combination of such words and should also display the subtypes of the kinds of relations found.
- – can be / can have – Knowledge about kinds of things usually expresses possibilities; it expresses what can be the case. This is typically expressed as relations between two kinds of things. Such relations are typically denoted by phrases that start with ‘can be’ or ‘can have’, followed by a kind of role and terminating with ‘a’. Natural language expressions of knowledge typically also start with ‘a’. For example: ‘a pump can have as part a bearing’. In the formalized language we ignore the first ‘a’, whereas we include the second ‘a’ as part of the phrase that denotes the standard kind of relation. Thus the formalized expression and its inverse become:
Name of left hand object | Name of kind of relation | Name of right hand object
pump | can have as part a | bearing
bearing | can be a part of a | pump
- – shall be / shall have – Requirements about what shall be the case (in a particular applicability context) for things of specific kinds are subtypes of possibilities. Such requirements are typically expressed as phrases that start with ‘shall be’ or ‘shall have’ and are otherwise similar to the possibilities.
- – is by definition / has by definition – Kinds of relations about what is by definition the case for things of particular kinds are subtypes of requirements. What they express is not only possible and required, but also necessarily the case. Phrases that denote such relations typically start with ‘is by definition’ or ‘has by definition’ and are otherwise similar to the possibilities.
- – is … a / has … a – Classification relations and other relations between individual things and kinds of things are typically denoted by phrases that start with ‘is’ or ‘has’, followed by a kind of role and typically terminating with ‘a’. For example, Rotterdam ‘is classified as a’ city and My car ‘has as part a’ turbo (in which ‘My car’ denotes an individual thing and ‘turbo’ denotes a kind of thing).
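The phrase patterns listed above lend themselves to a simple search aid. The sketch below is a heuristic only, since the text describes the patterns as typical rather than strict; the function name and the category labels are assumptions.

```python
def classify_phrase(phrase):
    """Guess the category of a relation phrase from its first and last words."""
    words = phrase.lower().split()
    if not words:
        return "unknown"
    first_two = tuple(words[:2])
    first_three = tuple(words[:3])
    if first_two in {("can", "be"), ("can", "have")}:
        return "possibility"        # relates two kinds of things
    if first_two in {("shall", "be"), ("shall", "have")}:
        return "requirement"        # subtype of possibility
    if first_three in {("is", "by", "definition"), ("has", "by", "definition")}:
        return "definition"         # subtype of requirement
    if words[0] in {"is", "has"}:
        # A trailing 'a' signals that the right hand object is a kind of thing.
        if len(words) > 2 and words[-1] == "a":
            return "individual-kind"
        return "individual-individual"
    return "unknown"


for example in [
    "is a friend of",
    "has as friend",
    "can have as part a",
    "shall have as part a",
    "is classified as a",
    "has as part a",
]:
    print(example, "->", classify_phrase(example))
```

The more specific prefixes are tested before the generic ‘is’ and ‘has’, so that ‘is by definition’ phrases are not mistaken for relations between individual things.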
All those kinds of relations are defined in the taxonomic dictionary. The phrases can be used in Gellish for making formalized expressions that are close to natural language. Note that the Gellish dictionary does not provide definitions of the separate words, such as is, a, part and of, but gives definitions of the whole phrases, because the whole phrases represent kinds of relations: ways of being related. The definitions of the standard kinds of relations of the Gellish languages are given in the base ontology section of the Gellish taxonomic dictionary. The kinds of relations currently have ‘names’ and ‘Gellish phrases’ in English and in Dutch (Nederlands), and some also in German and French. That base ontology section also contains definitions (and names) of the kinds of roles that are played by objects in relations of those kinds, and it contains definitions of the kinds of things that can play such roles. The definitions of kinds of relations also satisfy the rules for proper definitions of concepts in Gellish and thus form a consistent subtype-supertype hierarchy of kinds of relations. As all concepts in the dictionary, including the kinds of relations, roles and other kinds of things, are arranged in a subtype-supertype hierarchy of concepts, the Gellish dictionary is also a taxonomy. The collection of expressions that form the base ontology in the dictionary composes the top of that subtype-supertype hierarchy of concepts. All other concepts in the dictionary are subtypes of those generic concepts. The data in the Gellish dictionary are stored in a Gellish database or in a collection of files in Gellish Expression Format. For further information about that format, see the Gellish Expression Format and its definition in the document ‘The Gellish syntax and contextual facts’.
The Gellish taxonomic dictionary defines concepts by means of computer-interpretable expressions of ideas about those concepts. The prime expressions have the form of specialization relations or qualification relations. The expressions are grouped in collections of expressions that define domain related subsets of the dictionary. In the current Gellish dictionary, collections of expressions can be distinguished about the following domains:
- Formal language definition base – Generic concepts as required to define the Gellish grammar (in particular the kinds of relations, roles and related concepts (role players)).
- Documents, information and identification.
- Activities – Occurrences, events, activities and processes, including physical, chemical, and business processes.
- Civil, building, furniture and structural technology
- Static and heat transfer equipment technology
- Rotation equipment technology
- Transport technology, including rail, road, air and water transport
- Piping technology
- Connection and protection materials
- Electrical technology
- Control technology, including instrumentation
- Facilities and systems
- Geographic and marine objects
- Biology, Organisms, persons and organizations
- Business, procurement and financial objects
- Aspects, characteristics, properties, qualities, and laws.
- Qualitative and quantitative aspects (standard values)
- Units of measure, scales and currencies
- Symbols and annotation
- Materials (substances), solids and fluids, chemicals and radiation
- Mathematics and geometry (shapes)
- Roles of aspects (intrinsic aspects)
- Roles (usages, applications) of physical objects
Every concept is a subtype of a more generic concept, up to the top concept, called anything.
Lower in the hierarchy you will find the more specialized concepts as defined in engineering standards and in proprietary standards. Further specialized concepts, such as catalog items and manufacturer’s models are again subtypes of more generalized concepts.
A smart dictionary has a number of characteristics in addition to ordinary dictionaries.
The Gellish taxonomic dictionary is an electronic smart dictionary because it satisfies the following rules:
- It contains one definition per concept, whereas ordinary dictionaries usually provide various definitions of a term, where it is unclear whether those definitions are alternative definitions of the same concept or definitions of different concepts. Thus a smart dictionary explicitly distinguishes homonyms (the same term for different concepts) and explicitly specifies which terms are used in the dictionary as true synonyms.
- It is completely arranged as a taxonomy, which is a subtype-supertype hierarchy of concepts. This means that each concept is defined as an explicit subtype of one or more supertype concepts by specialization relations (A is a kind of B and B is a kind of C, etc.).
- It also includes specialized concepts that are denoted by names that consist of multiple terms. For example: line shaft centrifugal pump.
- It defines kinds of relations (being a special kind of concepts) that are denoted by phrases. These kinds of relations make it possible to create computer and human interpretable expressions that express ideas, including the expression of knowledge, requirements, definitions and other information.
- It contains explicit relations of specified kinds between concepts. For example, kinds of roles that are specific for things of particular kinds (and their subtypes) are not only related to their supertype concept, but also to the kind of thing of which members are by definition playing such a role. These additional relations make the dictionary not only a taxonomy but also an ontology.
- It can be integrated with expressions of knowledge and with documents (multimedia files), thus making it also a knowledge base or encyclopedia.
- Concepts are related to other concepts in various ways by explicit relations of standardized relation types. Those additional relations express additional knowledge about the concepts. This knowledge about the concepts can be used by computers, because such knowledge about a concept is inherited by all the subtypes of that concept in the subtype hierarchy (taxonomy).
- It uses one language independent unique identifier (a natural number) to represent each concept. This allows facts that are expressed in one language to be presented automatically by a computer in any other language for which a dictionary is available.
- It can be extended by private and proprietary concepts and terms. For example, company specific codes and proprietary knowledge. Some instructions are given in ‘Proper definition of a concept’. Further instructions are given in the Gellish Dictionary Extension Manual.
- It is computer interpretable and system independent.
The Gellish Dictionary itself defines the Gellish languages: it is a language defining ontology.
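The inheritance of knowledge by subtypes, mentioned in the list above, can be sketched as follows. The dictionary layout and function names are assumptions, and the placement of ‘line shaft centrifugal pump’ directly under ‘centrifugal pump’ is inferred from its name rather than taken from the dictionary.

```python
# Direct supertypes per concept; a concept may have more than one.
# Assumption: hierarchy shortened and partly inferred for this sketch.
SUPERTYPES = {
    "line shaft centrifugal pump": ["centrifugal pump"],
    "centrifugal pump": ["pump"],
    "pump": ["anything"],
}

# Knowledge expressed once, at the most generic concept for which it holds.
KNOWLEDGE = [
    ("pump", "can have as part a", "bearing"),
]


def lineage(concept, supertypes=SUPERTYPES):
    """Return the concept followed by all its direct and indirect supertypes."""
    ordered = []
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in ordered:
            continue  # guards against repeated visits via multiple supertypes
        ordered.append(current)
        stack.extend(supertypes.get(current, []))
    return ordered


def inherited_knowledge(concept, knowledge=KNOWLEDGE, supertypes=SUPERTYPES):
    """Return the knowledge facts that apply to a concept through inheritance."""
    ancestors = lineage(concept, supertypes)
    return [
        (concept, phrase, right)
        for left, phrase, right in knowledge
        if left in ancestors
    ]


print(inherited_knowledge("line shaft centrifugal pump"))
```

The fact about bearings is stated only for ‘pump’, yet it is found for every subtype without being repeated in the knowledge collection.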
The Gellish dictionary contains a special collection of facts that specify how expressions of knowledge can be used to create expressions of real facts about individual things.
These are <can be realized by a> relations. Each relation of that kind relates two kinds of relations:
- a relation type that is used to express knowledge facts, being a relation type that relates two classes
- a relation type that is used to express real facts, being a relation type that relates two individual things
An example of such a relation is the expression of the fact that:
- a <can have as part a> relation <can be realized by a> <has as part> relation.
The collection of such relations in the Gellish smart dictionary specifies which kind of relation should be used when knowledge or requirements are used to create facts about individual things in an imaginary or real world. This is typically used when knowledge or requirements are turned into designs.
The following example illustrates the basic principles of knowledge-based design using Gellish. Assume that a requirement expresses that
- a pump <shall have as part a> bearing.
The above-mentioned Gellish language relations define how to apply that knowledge to create bearings for individual pumps. For example, assume that P-101 is classified as a pump, then software that is powered with Gellish can conclude that
- P-101 shall have a part X,
- X is classified as a bearing.
Such software can also derive from the Gellish dictionary what kinds of bearings there are and from a knowledge model in Gellish it can derive which characteristics such components normally have.
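The derivation in this example can be sketched in a few lines. This is a simplified illustration, not the behavior of any particular Gellish-powered software: the function name and the placeholder naming scheme are assumptions, and the mapping for ‘shall have as part a’ is assumed to follow the one given in the text for ‘can have as part a’, on the grounds that requirements are subtypes of possibilities.

```python
# <can be realized by a> relations between kinds of relations.
REALIZED_BY = {
    "can have as part a": "has as part",    # stated in the text
    "shall have as part a": "has as part",  # assumption: inherited from the above
}

REQUIREMENTS = [
    ("pump", "shall have as part a", "bearing"),
]


def derive_design_facts(individual, kind, requirements, realized_by=REALIZED_BY):
    """Create facts about an individual thing from requirements on its kind.

    For every applicable requirement a placeholder individual (X1, X2, ...)
    is introduced, related to the individual and classified by the
    required kind.
    """
    derived = []
    counter = 0
    for left, phrase, right in requirements:
        if left != kind or phrase not in realized_by:
            continue
        counter += 1
        placeholder = f"X{counter}"  # hypothetical naming scheme
        derived.append((individual, realized_by[phrase], placeholder))
        derived.append((placeholder, "is classified as a", right))
    return derived


for fact in derive_design_facts("P-101", "pump", REQUIREMENTS):
    print(fact)
```

A fuller implementation would also look up requirements that are defined for the supertypes of the classifying kind, so that an individual classified as a centrifugal pump inherits the requirements for pumps.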