Gellish Syntax – Expression format


The Gellish Expression Format is a tabular format. It can be expressed as a Gellish file in various basic formats, such as CSV and JSON, using the UTF-8 encoding standard. It can also be created in a spreadsheet format, such as XLSX, and then exported to CSV.
A Gellish file shall start with a file header line. The first field in that line shall contain the word ‘Gellish’. The following fields can contain optional ‘parameter=value’ combinations, for example Language=English, which is the default value. Another possible combination is, e.g., Taal=Nederlands. The optional parameters are given in the following table.

Parameter | Example value | Description
Language | English | The language in which the file is written
Version | 9.0 | The version number of the language definition that is used
Date | 13 Sep 2019 | The date of the latest update of the file content
Category | Product model | A categorization of the content of the file
Title | Demo example | A description of the content of the file
Prefix | Demo | A prefix for the allocation of new unique identifiers (UIDs)
Obj_uid | 1:10000 | A numeric range for the allocation of unique identifiers (UIDs) for new objects
Idea_uid | 10000:20000 | A numeric range for the allocation of unique identifiers (UIDs) for new ideas (expressions)

Note: New UIDs shall be a concatenation of a prefix, a colon (:) and a sequence number within the specified range. Standard Gellish UIDs are an exception to this rule: they are character strings that denote integer numbers. Two prefixes are reserved: the hash (#), which is reserved for numbers, and the prefix ‘dd’, which is reserved for dates or date-times.
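The UID rules in this note can be sketched as a small classifier. This is only an illustration: the function name `classify_uid` and the category labels are hypothetical, and the sketch assumes that the reserved prefixes ‘#’ and ‘dd’ are followed by a colon in the same way as user prefixes, which the text above does not state explicitly.

```python
def classify_uid(uid: str) -> str:
    """Classify a UID string as 'standard', 'number', 'date' or 'user'.

    Assumption (not confirmed by the text): reserved prefixes are written
    like any other prefix, i.e. '<prefix>:<code>'.
    """
    prefix, colon, code = uid.partition(":")
    if not colon:
        # Standard Gellish UIDs are character strings that denote integers.
        if uid.isdigit():
            return "standard"
        raise ValueError(f"not a recognised UID: {uid!r}")
    if not prefix or not code:
        raise ValueError(f"prefix and code must both be non-empty: {uid!r}")
    if prefix == "#":
        return "number"  # reserved prefix for numbers
    if prefix == "dd":
        return "date"    # reserved prefix for dates and date-times
    return "user"        # any other prefix is a user-allocated UID


print(classify_uid("970025"))  # standard Gellish UID
print(classify_uid("pr:101"))  # user-allocated UID with prefix 'pr'
```

A check like this could be used to reject user prefixes that collide with the reserved ones before new UIDs are allocated.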

The second and third lines in a Gellish file shall define the table header. The second line specifies the table column IDs, which determine the meaning of the fields in the columns; the third line contains free text that supports human understanding of the meaning of the columns. The following table header specifies a table with the columns that form the core of the expressions. Together with the first line, using English terminology and with one example of a Gellish expression, the table becomes:

Gellish | Language=English
43 | 101 | 3 | 201 | 7 | 4
Name of an intention | Name of a left hand object | Name of a kind of relation | Name of a right hand object | Symbol of unit of measure | Textual definition
assertion | The Euromast | is located in | Rotterdam | |
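Reading such a file can be sketched in a few lines of Python. The sketch assumes a semicolon-delimited CSV export (the text does not fix the delimiter), and the function name `read_gellish` is hypothetical. It checks the first field of the file header, collects the optional ‘parameter=value’ fields, and keys every expression by column ID so that the column order does not matter.

```python
import csv
import io


def read_gellish(text, delimiter=";"):
    """Parse a Gellish CSV text into (parameters, list of expressions).

    Assumption: fields are separated by ';'. Pass another delimiter if the
    file was exported differently.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if len(rows) < 3:
        raise ValueError("expected a file header line and two table header lines")
    file_header = rows[0]
    column_ids = [cid.strip() for cid in rows[1]]  # second line: column IDs
    # rows[2] holds the free-text column names; it is not needed for parsing.
    if not file_header or file_header[0].strip() != "Gellish":
        raise ValueError("the first field of the file header must be 'Gellish'")
    params = {"Language": "English"}  # default stated in the text
    for field in file_header[1:]:
        name, sep, value = field.partition("=")
        if sep:
            params[name.strip()] = value.strip()
    expressions = []
    for row in rows[3:]:
        if not any(value.strip() for value in row):
            continue  # skip empty lines
        expressions.append(dict(zip(column_ids, (v.strip() for v in row))))
    return params, expressions


SAMPLE = """\
Gellish;Language=English
43;101;3;201;7;4
Name of an intention;Name of a left hand object;Name of a kind of relation;Name of a right hand object;Symbol of unit of measure;Textual definition
assertion;The Euromast;is located in;Rotterdam;;
"""

params, expressions = read_gellish(SAMPLE)
print(params)
print(expressions[0]["101"], expressions[0]["3"], expressions[0]["201"])
```

Keying each row by column ID rather than by position reflects the point that the columns may be arranged in any sequence.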

The above Gellish expression table contains columns that are identified by language-independent column IDs (the numbers 43, 101, 3, etc.). The names of the columns, as given on the third row, are free text. The use of numeric column IDs makes the table structure language independent and allows the columns to be arranged in any sequence that is convenient for the user. The content of the above table, however, is human readable and therefore still language dependent. The content becomes language independent when Gellish-enabled software inserts additional columns with unique, language-independent identifiers (UIDs) for all objects, whereas those objects may be denoted by various language-dependent names, translations, synonyms, codes, etc. Thus, by extending the table with columns for such UIDs and interpreting the expressions as relations between UIDs, the expressions become language independent. The additional columns with object UIDs are illustrated in the following extension of the above table, which uses the prefix ‘pr’:

Gellish | Language=English | Prefix=pr
1 | 5 | 2 | 60 | 15 | 66
UID of an idea | UID of an intention | UID of a left hand object | UID of a kind of relation | UID of a right hand object | UID of a unit of measure
pr:1 | 970025 | pr:101 | 5138 | pr:102 |

Note that

  • The UIDs of new user objects shall be distinguished from the numeric UIDs that are standard in Gellish. Users can make their own UIDs by using a prefix, followed by a colon (:), followed by a free code. In this example, the third column (column ID 2), which contains the UIDs of the left hand objects, therefore contains the prefix ‘pr’ and the code ‘101’ for The Euromast, resulting in pr:101 as the unique identifier of The Euromast.
  • The UID of an idea is intended for making statements about the expression as a whole. A name for the idea is usually not applicable. Therefore the column for those names is not included here.
  • An ‘assertion’ is a standard kind of intention with the Gellish UID 970025.
  • The Euromast and Rotterdam are assumed not to appear in the Gellish Dictionary, thus the user can allocate their own UIDs for those concepts.
  • The phrase ‘is located in’ is a standard phrase for the standard kind of relation with UID 5138.
  • The unit of measure is not applicable for this expression, thus the columns for its UID and its symbol are left empty. If those columns are not required anywhere in a table, they can be deleted from the table.
  • The Textual definition column is intended for a human readable definition of a concept. It is intended to be used only on a row where the concept is introduced by a classification relation (‘is classified as a’) or by a specialization relation (‘is a kind of’). As this is not applicable in the example, it is left empty. Note that the definition text does not denote a separate object and has no UID.
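The extension of a name-based expression with UID columns, as described above, can be sketched as follows. The lookup tables and the function name `add_uids` are hypothetical stand-ins for what Gellish-enabled software would do with a real dictionary; only the UIDs 970025, 5138, pr:1, pr:101 and pr:102 and the column IDs come from the example tables.

```python
# Hypothetical stand-in for a lookup in the Gellish Dictionary.
STANDARD_UIDS = {
    "assertion": "970025",     # standard kind of intention
    "is located in": "5138",   # standard kind of relation
}


def add_uids(expression, user_uids, idea_uid):
    """Return a copy of a name-based expression extended with UID columns.

    `expression` is keyed by the name column IDs 43, 101, 3 and 201;
    the result also carries the UID column IDs 1, 5, 2, 60 and 15.
    """
    def lookup(name):
        # Standard concepts first, then the user's own allocations.
        if name in STANDARD_UIDS:
            return STANDARD_UIDS[name]
        return user_uids[name]

    extended = dict(expression)
    extended.update({
        "1": idea_uid,                      # UID of the idea
        "5": lookup(expression["43"]),      # UID of the intention
        "2": lookup(expression["101"]),     # UID of the left hand object
        "60": lookup(expression["3"]),      # UID of the kind of relation
        "15": lookup(expression["201"]),    # UID of the right hand object
    })
    return extended


row = {"43": "assertion", "101": "The Euromast",
       "3": "is located in", "201": "Rotterdam"}
user_uids = {"The Euromast": "pr:101", "Rotterdam": "pr:102"}
print(add_uids(row, user_uids, idea_uid="pr:1"))
```

Once the UID columns are present, the expression can be interpreted as a relation between UIDs, independent of the names used in any particular language.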

Multi language support

A separate pair of columns is available on each row for specifying the UID and the name of the language of the name of the left hand object. This enables the use of various languages in the expressions in one table, including the specification that a term is a translation of another term for the same concept. Furthermore, one or more separate columns can be inserted, each of which specifies alternative names for the left hand object in a specific language. The column IDs for those columns should be the Gellish UIDs of the particular languages. For example, a table in English may include an expression in Dutch and an additional column with the names of the left hand objects in German (where applicable). This is illustrated in the following table:

54 | 31 | 101 | 910038 | 3 | 201
Name of a language | Name of an intention | Name of a left hand object | Name in German | Name of a kind of relation | Name of a right hand object
English | assertion | The Euromast | | is located in | Rotterdam
Dutch | assertion | De Euromast | Die Euromast | is a translation of | The Euromast

Synonyms

Alias names for objects, such as synonyms, abbreviations, codes, etc., can be specified by explicit statements that use phrases for the kinds of relations that express the appropriate subtype of the alias relation. No extra column is required for such expressions. Aliases are specified in the same way as the translation in the above example. For example:

101 | 3 | 201
Name of left hand object | Name of kind of relation | Name of right hand object
PC | is an abbreviation of | personal computer

Homonyms

The use of the same name for different concepts (homonyms) is enabled by distinguishing the objects by their different UIDs and by specifying the different language communities in which the homonyms find their home. This is supported by a special pair of columns in the expression table for the UID and the name of the applicable language community. For example, the term ‘bank’ in the language community ‘business’ denotes an object with UID 990152, whereas the term ‘bank’ in the language community ‘civil technology’ denotes an object with UID 700140. Their defining expressions are illustrated in the following core table:

16 | 101 | 3 | 201 | 4
Language community | Name of left hand object | Name of kind of relation | Name of right hand object | Textual definition
business | bank | is a kind of | organization | that is intended to provide financial services.
civil technology | bank | is a kind of | land | that is located alongside the border of a water.
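The way a language community disambiguates a homonym can be sketched with a lookup keyed by the pair of language community and name. The dictionary and function names below are hypothetical; the two UIDs are the ones given in the text.

```python
# (language community, name) -> UID, using the two UIDs given in the text.
NAME_TO_UID = {
    ("business", "bank"): "990152",
    ("civil technology", "bank"): "700140",
}


def resolve_name(name, language_community):
    """Return the UID that a name denotes within one language community."""
    return NAME_TO_UID[(language_community, name)]


def communities_for(name):
    """List the language communities in which a name is in use."""
    return sorted(community for (community, n) in NAME_TO_UID if n == name)


print(resolve_name("bank", "business"))
print(resolve_name("bank", "civil technology"))
print(communities_for("bank"))
```

A name that occurs in more than one language community is a homonym, so software would need the community (or the UID itself) to interpret it unambiguously.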

Contextual facts

Other columns are available for the expression of contextual facts, such as approval status (column ID 8), date-time of creation, creator, etc. Those columns can be added depending on the requirements of the user by selecting them from a list of available columns. Multiple tables can be combined, and different tables may consist of different collections of columns. This is described in detail in the document ‘The Gellish Syntax and contextual facts’.

Header rows

The table has three header rows. The first header row contains the following fields:

Gellish | English | Version | version code | date of release | name or category of expressions | file name

followed on the same row by an optional sequence of named parameters, each given as a parameter name and a value:

Lower_obj_uid=n | Upper_obj_uid=n | Lower_rel_uid=n | Upper_rel_uid=n | Prefix=prefix | ref_iris=IRIs

Notes:

  • The string ‘Gellish’ is an obligatory standard term in the first field.
  • The name ‘English’ in the second field may be replaced by another language name, such as ‘Nederlands’. It specifies the language that is used in text fields, although on each row it is possible to override this by specifying the language that is used for the name of the left hand object. Furthermore, there can be one or more additional columns that specify names of the left hand object in a particular language (as described above).
  • The five fields after the second field are optional free text fields.

The named parameters are intended for use with automated UID generation only.

  • The optional Lower_obj_uid and the next three parameters may specify ranges for new numeric UIDs.
  • The optional Prefix parameter specifies a prefix that is followed by a colon (:) and a newly generated numeric code in order to form new UIDs for new concepts. For example, the prefix ‘pre’ can be used by software for generating the following sequence of UIDs within the object UID range 20-30: pre:20, pre:21, pre:22, etc. The software should first verify which value is currently the highest within the range for the prefix.
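The UID generation described in these notes can be sketched as follows. The function name `next_uid` is hypothetical, and the sketch makes two assumptions that the text does not state: the upper bound of the range is inclusive, and the new UID is the highest existing value for the prefix within the range plus one.

```python
def next_uid(prefix, lower, upper, existing_uids):
    """Return the next free UID '<prefix>:<n>' with lower <= n <= upper.

    Assumptions: the range is inclusive on both ends, and allocation
    continues after the highest value already in use for this prefix.
    """
    highest = lower - 1
    for uid in existing_uids:
        uid_prefix, colon, code = uid.partition(":")
        if not colon or uid_prefix != prefix or not code.isdigit():
            continue  # standard UID, other prefix, or non-numeric code
        number = int(code)
        if lower <= number <= upper:
            highest = max(highest, number)
    candidate = highest + 1
    if candidate > upper:
        raise ValueError(f"UID range {lower}-{upper} is exhausted for prefix {prefix!r}")
    return f"{prefix}:{candidate}"


# Example with the prefix 'pre' and the object UID range 20-30 from the text.
print(next_uid("pre", 20, 30, []))
print(next_uid("pre", 20, 30, ["pre:20", "pre:21", "pr:25", "970025"]))
```

Continuing after the highest value, rather than filling gaps, keeps the sketch simple and avoids reusing the UID of a concept that may have been deleted.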

The second header row of a table contains language independent column IDs as are shown in the above tables. The third header row contains free text names of the columns, corresponding with the column IDs, also as shown above.

Various examples of tables in Gellish Expression Format are given in the download area.

A detailed specification of the tabular format is given in the document ‘The Gellish Syntax and contextual facts’, which is available in the download section of the Gellish website.