Gene 4.3 Technical Notes

This HTML file documents the Gene shareware genealogy program for the Apple Macintosh, including information about its database storage format and about the resource data used to define new card types. You do not need to read this file to learn how to use Gene; instead refer to the Gene User Guide.

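This documentation provides information on the following topics: the Gene file format, the resources used to define card types, the other resources that can appear in a database file, and a brief history of Gene.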

 

1. Gene File Format

Gene's files are stored in a flat-file text format, for ease of network transmission and use by other applications. You can use Gene to read any text file on your computer, whether or not that file was created by Gene; however, Gene will complain if the file it reads does not conform to the format Gene expects. This section documents that format.

As with any Macintosh file, Gene's database files have two parts: the resource fork and the data fork. The resource fork contains pictures as well as descriptions of any extra nonstandard card types used by the database. The resource fork format is described later, when we discuss card type definitions. The data fork is text, organized as a sequence of cards separated by blank lines. Some of the information about pictures is also stored here as a special kind of card. Each card is stored as a sequence of text lines, one for each field and a variable number for the card's unstructured text notes. The first line of a card gives the type and name of the card, and succeeding lines fill in field values and supply the card's text. An empty line in the database file signals the end of one card and the start of a new card.

As written by Gene, cards of the same type will be stored consecutively in the file, in alphabetical order by the cards' names. However, this order is not important to Gene when it reads the file. We describe below the details of the card format, and provide a short example of database file text.
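The writing order described above can be sketched in Python as follows. This is only an illustration, not Gene's own code: the Card class and write_database function are hypothetical names, the order in which different card types follow each other is an assumption (we simply sort the type names), and Python's string comparison stands in for whatever alphabetical order Gene uses.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    """Hypothetical in-memory card; not Gene's own data structure."""
    card_type: str
    name: str = ""                                # empty for unnamed cards
    fields: dict = field(default_factory=dict)    # field name -> value
    text: list = field(default_factory=list)      # unstructured text lines


def write_database(cards):
    """Return data-fork text for the given cards."""
    blocks = []
    # Cards of one type are kept together, ordered by name within the type.
    for card in sorted(cards, key=lambda c: (c.card_type, c.name)):
        # An unnamed card's first line ends right after the colon.
        lines = [f"{card.card_type}: {card.name}".rstrip()]
        # One line per nonempty field, then the colon-prefixed text lines.
        lines += [f"{k}: {v}" for k, v in card.fields.items() if v]
        lines += [":" + t for t in card.text]
        blocks.append("\n".join(lines))
    # A blank line separates consecutive cards.
    return "\n\n".join(blocks) + "\n"
```

Since the order is not important when the file is read, a program producing Gene files could skip the sort entirely.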

 

1.1. Card Data Format

As discussed above, each card is stored as a sequence of lines in the database file. Blank lines are used to separate cards from each other. Gene's format requires all the different lines describing a card to include a colon. In the first line, the colon separates the card's type from its name. In the field lines, it separates the field name from its value. And in the text lines, the colon appears at the start of the line.

 

1.2. Format for Card Type and Name

The first line of a card supplies the card's type and name. The card type comes first, followed by a colon. If the card has a name, the card name follows the colon. No two cards of the same type can have the same name. If the card type does not include a name field, or if the card can have a name but is unnamed, the first line of the card ends immediately after the colon.

An unnamed card should always include a link to some other card; otherwise there would be no way to open the card. However, Gene does not check that the cards in its files have links or names.

 

1.3. Format for Card Fields

The card format includes one line for each nonempty field of the card. Each such line consists of the field's name and a colon, followed by the field's value. Field values are written the same way they would be displayed when the card is viewed, and their format is described in the Gene User Guide.

A link field may refer to a card appearing later in the database. If a link refers to a card that does not appear in the database, a new card will be created with the given name.
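The handling of missing link targets might look like the following Python sketch. The function name resolve_links and both dictionary layouts are assumptions made for this illustration; in particular, the table saying which fields are links would in practice come from the card type definitions described later.

```python
def resolve_links(cards, link_fields):
    """Add an empty card for every link target that is missing.

    cards       -- dict mapping (card type, card name) -> dict of field values
    link_fields -- dict mapping (card type, field name) -> linked card type
    """
    # Snapshot the items so that cards can be added while scanning.
    for (card_type, _name), fields in list(cards.items()):
        for field_name, value in fields.items():
            target_type = link_fields.get((card_type, field_name))
            if target_type and value:
                # Forward references need no special care here, because the
                # whole file has been loaded before links are resolved.
                cards.setdefault((target_type, value), {})
    return cards
```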

 

Fields in Picture Cards

Gene represents pictures as a special type of card. Pictures are distinguished from other cards in Gene by having fields of type "PictID" and "PictLink".

 

1.4. Format for Card Text

The text area in a Gene card is by definition unformatted. However, some minimal amount of formatting is needed within the database file, so that Gene can distinguish between the text of one card and data from subsequent cards. In particular, we allow the text area to contain blank lines, and these must not be confused in the database file with the blank lines used to separate cards.

For this reason, every line of text in the database file begins with a colon. Text is thus distinguished from the fields in that the lines for fields begin with the field name (field names are not allowed to begin with a colon).

When Gene writes a database file, it places the text after the fields. However, it can read files in which text and fields are mixed. The lines of text will occur in the card in the same order in which they appear in the database file.
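The rules of sections 1.1 through 1.4 can be combined into a small reader. The following Python sketch is not Gene's parser; parse_database and the dictionary layout it returns are hypothetical, and treating a whitespace-only line as blank is an assumption. It accepts text lines mixed with field lines, as described above.

```python
def parse_database(text):
    """Parse data-fork text into a list of card dicts."""
    cards, card = [], None
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            card = None                  # a blank line ends the current card
            continue
        if ":" not in line:
            # Every line describing a card must include a colon.
            raise ValueError(f"line {number}: missing colon")
        head, _, rest = line.partition(":")
        if card is None:
            # First line of a card: type, colon, optional name.
            card = {"type": head.strip(), "name": rest.strip(),
                    "fields": {}, "text": []}
            cards.append(card)
        elif head == "":
            card["text"].append(rest)    # text line; may itself be empty
        else:
            card["fields"][head.strip()] = rest.strip()
    return cards
```

Python's str.splitlines also splits on the carriage returns used by classic Macintosh text files, so the same function should work on a data fork read directly from disk.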

 

1.5. Picture Format

Gene represents pictures as a special type of card, together with some extra information stored in the resource fork of the database file. Pictures are distinguished from other cards in Gene by having two special fields, of type PictID and PictLink. The PictID gives the resource IDs of the PICT or alis resources defining the picture itself, and the PictLink stores the buttons linking the picture to other cards. Because the fields of a picture can only be manipulated indirectly with the Edit Picture dialog, a picture card should have at most one more field, its name.

 

1.6. Example of Database Format

The following lines are taken from a database file and show three cards. The first two (a document and a person) have names, while the third (a marriage) is of an unnamed type. The first card has five lines of text, while the others have empty text panes. In addition, the person and marriage cards contain links to two other person cards, for Hannah Brooks and Thomas Fox (1).

Document: Descendants of Thomas Fox
Author: Henry Fox
:Full title:
:Descendants of Thomas Fox, circa 1620, of Concord, Massachusetts
:
:Lots of information on the Fox family.
:The NYPL copy is in pretty bad condition, and some pages are missing.

Person: Thomas Fox (2)
Birthday: 26 February 1650
Mother: Hannah Brooks
Father: Thomas Fox (1)
Sex: male

Marriage:
Wife: Hannah Brooks
Husband: Thomas Fox (1)
Date: 13 December 1647

 

2. Resources for Defining Card Types

All the cards used by Gene were defined by including certain resources (a form of structured information used in all Macintosh files) as part of the Gene application. By including similar resources in a Gene database file, you can define additional card types that can be used in that file. For instance, in a database of royalty and nobility, it might make sense to have cards describing the titles and heraldry of each person. Here we define the format of those resources; Gene also allows some other resources, described later, to appear in database files.

 

2.1. Creating and Saving Files with Resource Definitions

Gene does not provide an interface for editing file resources; instead you must use a general purpose resource editor such as ResEdit (available from Apple). Always make a backup copy of your file before using ResEdit. As a convenience for ResEdit users, Gene includes a TMPL resource in its database files, which ResEdit uses to describe the format of Gene's card definition resources.

If Gene reads a database file that contains definitions of new card types, it will remember those definitions and write them out again when it saves the file. If a Gene database includes resources that are not meaningful to Gene, those resources may be lost when the file is saved.

When you use the Merge command to combine two Gene databases, and the second file contains definitions of new card types, the result is unpredictable and is likely to be a bogus file. The merge should probably be ok if the first file is the only one with new card type definitions.

 

2.2. Specific Resource Types

The resources used to define card types are the following; the most important of these for new card definitions are "Card", "Link", and "Lnk#".

 

"Card" Resources

The "Card" resource contains the actual definition of each new card type. All other resources are less important (although some such as Link resources are necessary for certain types of cards). For each new card type, there should be a "Card" resource, having the card type name as the resource name. The resource ID is unimportant.

Card resources have four values that are part of the overall definition, and a sequence of values defining each field of the card. The four values are:

After the four general values in a card definition, the Card resource includes a sequence of two values for each field: the field name (any string not including a colon) and a string describing the field type. The possible field types are:

We now go through an example of a card definition for the Person card in Gene. The four general values of the "Person" card resource are:
        Named Field: 0
        Sort Field: 1
        Link String: 128
        Name Sort: Person
The six fields of a Person card have the following names and field descriptors:
        Name            Name Person
        Birthday        Date
        Birthplace      Link Place
        Mother          Link Person
        Father          Link Person
        Sex             Enum Sex
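
For readers who prefer code to ResEdit listings, the Person definition above can be mirrored in a Python structure. This is a sketch under stated assumptions: CardDefinition and field_number are hypothetical names, and the four general values are stored exactly as listed without interpreting them. Fields are numbered from zero, which matches the field numbers used in the Tplt example later in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardDefinition:
    """Hypothetical mirror of a "Card" resource; not Gene's own structure."""
    type_name: str      # the resource name
    named_field: int    # the four general values, stored uninterpreted
    sort_field: int
    link_string: int
    name_sort: str
    fields: tuple       # (field name, field type descriptor) pairs

    def __post_init__(self):
        # Field names may be any string that does not include a colon.
        for name, _descriptor in self.fields:
            if ":" in name:
                raise ValueError(f"field name {name!r} contains a colon")

    def field_number(self, name):
        """Return the zero-based position of the named field."""
        return [n for n, _ in self.fields].index(name)


PERSON = CardDefinition(
    "Person", 0, 1, 128, "Person",
    (("Name", "Name Person"),
     ("Birthday", "Date"),
     ("Birthplace", "Link Place"),
     ("Mother", "Link Person"),
     ("Father", "Link Person"),
     ("Sex", "Enum Sex")),
)
```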

 

"Enum" Resources

Each enumerated type (such as the sex of a person) is specified using an "Enum" resource. The ID is unimportant, but the name of the resource is used for the name of the corresponding field type (e.g. the Enum resource named "Sex" corresponds to fields with type "Enum Sex").

The values of an Enum resource are simply a sequence of strings specifying the values the corresponding fields can have. Any field of the given type must either have one of the given strings as its value, or may be blank. For the "Sex" Enum resource in Gene, these strings are "male" and "female".
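The rule for enumerated fields is simple enough to state as a one-line check. In this Python sketch, check_enum_value and SEX are hypothetical names, and the exact, case-sensitive comparison is an assumption; Gene may be more lenient.

```python
def check_enum_value(enum_values, value):
    """Return True if value is blank or one of the Enum resource's strings."""
    return value == "" or value in enum_values


# The strings of Gene's "Sex" Enum resource, as given above.
SEX = ("male", "female")
```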

 

"Link" Resources

The "Link" resource is used by Gene to turn information from cards into text strings, in several situations: when generating the information about the card in the link pane of another card, when drawing verbose trees, and when exporting Gene data to a GEDCOM file. A given Link resource will handle one of these situations, for one specific type of card; Lnk# resources are used to tell which Link resource to use.

Each Link resource consists of a sequence of operations, and should be thought of as a small computer program consisting of a sequence of simple operations that translate the information on a card's fields into a text string. When a Link resource is used, the operations are performed one at a time, starting from the first one. The order in which operations are defined in the Link resource is largely irrelevant; after each operation is performed, information in that operation tells Gene the next operation to perform. Six fields appear in an individual operation, although most operations will leave one or more of these fields blank.

The operations in a Link resource can be broadly classed into two types. Some operations output a string from information on the given card, and then always perform the next operation according to their "Next" field, which should give the number of another operation in the Link resource. The Next field of the last operation in the sequence should be set to zero.

Other link operations are "conditional" operations that do not output anything, but instead control the order in which other operations are performed. Each such operation tests a condition (such as whether some field of the card is empty), and depending on whether the answer is yes or no, performs next the operation with the number in the "Yes" or "No" field of the conditional. If this number is zero, no operation is performed. A conditional operation may still have a nonzero "Next" field, just like an output operation; in this case the operation at that number is performed after the operations from the "Yes" or "No" field.

The following Link operation types are currently available:

We give as an example a Link resource used to create the link pane text for "Death" cards. We show side by side the actual sequence of operations, and a version of the same program expressed in slightly more intelligible pseudocode.
1. C Fld:0 Next:5 Yes:2 No:3    If link pane is for person who died
2. S Str:"Died "                        output "Died "
3. F Fld:0 Next:4               else    output name of person who died
4. S Str:" died "                       output " died "
5. F Fld:1 Next:6               Output date of death
6. C Fld:2 No:7                 If link pane isn't for place of death
7. E Fld:2 No:8                         and place of death isn't empty
8. S Next:9 Str:" at "                  output " at "
9. F Fld:2                              output place of death
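
The control flow described above (output operations follow Next; conditionals first run their Yes or No branch and then continue at Next) can be imitated with a short interpreter. The Python below is a sketch, not Gene's implementation. It covers only the four operation letters that appear in the Death example, and their meanings are inferred from the pseudocode column: S outputs Str, F outputs field Fld, C asks whether field Fld refers to the card whose link pane is being drawn, and E asks whether field Fld is empty. The names Op, run_link, and pane_owner are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Op:
    kind: str          # "S", "F", "C" or "E", as in the listing above
    fld: int = 0
    next: int = 0
    yes: int = 0
    no: int = 0
    text: str = ""     # the Str value


def run_link(ops, fields, pane_owner, start=1):
    """Return the string built by a Link program.

    ops        -- dict mapping operation number -> Op
    fields     -- list of the card's field values, "" when empty
    pane_owner -- name of the card whose link pane is being drawn
    """
    out = []

    def perform(number):
        while number != 0:               # zero means "no operation"
            op = ops[number]
            if op.kind == "S":
                out.append(op.text)
            elif op.kind == "F":
                out.append(fields[op.fld])
            elif op.kind in ("C", "E"):
                if op.kind == "C":
                    test = fields[op.fld] == pane_owner
                else:
                    test = fields[op.fld] == ""
                # Run the chosen branch to completion, then go on to Next.
                perform(op.yes if test else op.no)
            else:
                raise ValueError(f"unknown operation {op.kind!r}")
            number = op.next

    perform(start)
    return "".join(out)


# The Death link pane program from the listing above.
DEATH = {
    1: Op("C", fld=0, next=5, yes=2, no=3),
    2: Op("S", text="Died "),
    3: Op("F", fld=0, next=4),
    4: Op("S", text=" died "),
    5: Op("F", fld=1, next=6),
    6: Op("C", fld=2, no=7),
    7: Op("E", fld=2, no=8),
    8: Op("S", next=9, text=" at "),
    9: Op("F", fld=2),
}
```

With made-up field values ["Thomas Fox (1)", "14 April 1693", "Concord"] for a Death card, this sketch produces "Died 14 April 1693 at Concord" in Thomas Fox's link pane and "Thomas Fox (1) died 14 April 1693" in the link pane of the Concord place card.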

 

"Lnk#" Resources

The "Lnk#" resource provides a mechanism for applying different Link resources to produce strings for a given card type in different contexts. Contexts such as creating link panes, tree drawings, and GEDCOM output are represented in Gene simply as small numbers, starting from zero. Each Lnk# resource is simply an array of Link resource IDs. The first ID in the list is used to create strings for context zero, the second for context one, and so on. If there is no need to create a string in a certain context, the ID in that position of the array should be zero. The Lnk# resource also supplies a "default" Link resource ID to be used when no context-specific resource is available.

The following contexts are currently in use by Gene.

0: Link pane text.
1-2: Verbose tree person and links.
3-4: GEDCOM output (controlled by gOut resources).
5: Window name (for window menu, also used by some tree output formats).
6-8: Terse tree birth date, death date, close paren.
9-10: Name&Date tree birth date, death date.
11: Additional events for All Events tree format.

For backward compatibility with older versions of Gene, if no Lnk# resource has the resource number specified in the card definition, Gene will instead look for a Link resource with that number, and use it in all contexts.
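
The lookup from a context number to a Link resource might be sketched as below. The function select_link_id and its argument layout are hypothetical. Two points are assumptions rather than documented behavior: a zero entry is taken to mean that no string is produced, and the default ID is used only for contexts beyond the end of the array.

```python
def select_link_id(lnk_tables, link_ids, number, context):
    """Return the Link resource ID to use, or None if no string is wanted.

    lnk_tables -- dict: Lnk# resource number -> (default ID, [ID per context])
    link_ids   -- set of available Link resource IDs
    number     -- the resource number given in the card definition
    context    -- small integer context, e.g. 0 for link pane text
    """
    if number not in lnk_tables:
        # Backward compatibility: a bare Link resource serves every context.
        return number if number in link_ids else None
    default_id, per_context = lnk_tables[number]
    if context < len(per_context):
        return per_context[context] or None   # zero: no string here (assumed)
    return default_id or None
```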

 

"Redr" Resources

The "Redr" resource is used to "redirect" one card to point to another, so that a reference to the first type of card becomes an indirect reference to the second type of card. The resource has two fields: the name of a card type, and the number of the field on that card that should be used as the indirect reference. There can be at most one Redr resource per card type.

For tree drawings, a Redr resource is used on adoption cards, pointing to the person card of the adoptee. The tree drawing commands use this information to include adopted children in trees.

It is intended that Redr resources can be used on named cards, causing any links to those cards to automatically become links to the indirect reference. This could only happen if both cards had names in the same card name list, which is not yet possible.

 

3. Other Resources

Along with the resources used to define card types, a Gene database file can contain the following other types of resource.

 

"alis" Resources

The "alis" resource is a Macintosh-standard way of storing aliases to files. (An alias contains not only the file name but also some other information that helps the Macintosh system to make sure that it is pointing to the correct file and to find the file if it has moved.) Gene uses aliases for storing pointers to PICT and JPEG files containing pictures that are stored externally to the Gene database itself. Each alis resource should correspond to a number in the PictID field of a picture card. Since pictures can also be stored as PICT resources within the database, the alis and PICT resource IDs should be chosen to be distinct from each other. This numbering is done automatically by Gene when a user creates a picture.

 

"Dflt" Resources

The "Dflt" resource lets Gene automatically fill in certain enum fields when links are made. For instance, when the "Wife" field of a marriage card is set, the sex of the wife is set to female if it was blank. The name and ID of Dflt resources are unimportant. Each Dflt resource has four values:

 

"gOut" Resources

The "gOut" resource controls the translation of the Gene database to a GEDCOM file in Gene's Export command.

A GEDCOM file consists of a sequence of lines. Each line consists of a "level number", an optional name, a four-letter "tag", and then some data which may be the name of another line or unstructured text. In Gene, names of GEDCOM lines have the form "@Xid@" or "@Xid1:id2@" where X is a letter and id, id1, and id2 are numbers formed using the 'I' operation in a Link resource.
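As a concrete illustration of this line layout, the following Python sketch assembles a GEDCOM line from its parts. The helper names gedcom_name and gedcom_line are hypothetical, as is the choice of the letters I and F in the usage note; INDI and NAME are ordinary GEDCOM tags, used here only as sample data.

```python
def gedcom_name(letter, id1, id2=None):
    """Build a line name of the form @Xid@ or @Xid1:id2@."""
    body = f"{id1}" if id2 is None else f"{id1}:{id2}"
    return f"@{letter}{body}@"


def gedcom_line(level, tag, data="", name=None):
    """Join level number, optional name, tag and data with single spaces."""
    parts = [str(level)]
    if name:
        parts.append(name)
    parts.append(tag)
    if data:
        parts.append(data)
    return " ".join(parts)
```

For example, gedcom_line(0, "INDI", name=gedcom_name("I", 12)) gives "0 @I12@ INDI", and gedcom_name("F", 3, 7) gives "@F3:7@".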

The level numbers represent a tree: the root of the tree is not specified and should be thought of as being at level -1; lines with a level number of zero are children of the root; lines with level number one are children of the previous level-zero line; and so on. Gene's output consists of some preamble lines, the data itself, and some postamble lines. The data lines are structured as a sequence of level-0 subtrees, each of which corresponds to either a single Gene card or a pair of cards. Each subtree can contain lines from the corresponding card(s), and also from the cards with links to those cards.

One particular type of GEDCOM line that needs to be dealt with specially in Gene is the "NOTE" (the word "NOTE" appears as the tag of such lines). NOTEs have a function analogous to the text fields of Gene's cards. Text in note lines is limited to 80 characters, but notes can be extended by two kinds of lines, each of which must have a level number one greater than the "NOTE" line: "CONC" adds more text to the previous NOTE line itself, and "CONT" continues the note on a new line. As we discuss below, gOut resources provide a mechanism for translating text into NOTE lines and their continuations. Gene automatically breaks the note into shorter chunks using CONC and CONT lines.
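The breaking of a note into NOTE, CONC, and CONT lines can be sketched in Python as follows. This is an illustration of the rules just described, not Gene's export code: note_lines is a hypothetical name, the 80-character limit is assumed to apply to the text portion of each line, and the text is cut at fixed positions without regard to word boundaries.

```python
def note_lines(level, text, limit=80):
    """Return GEDCOM lines for a note, split with CONC and CONT."""
    lines = []
    for i, paragraph in enumerate(text.split("\n")):
        # Cut each line of the note into pieces of at most `limit` characters;
        # an empty line still yields one (empty) piece.
        chunks = [paragraph[j:j + limit]
                  for j in range(0, len(paragraph), limit)] or [""]
        for k, chunk in enumerate(chunks):
            if i == 0 and k == 0:
                tag, lvl = "NOTE", level         # the note itself
            elif k == 0:
                tag, lvl = "CONT", level + 1     # new line of the note
            else:
                tag, lvl = "CONC", level + 1     # more text on the same line
            lines.append(f"{lvl} {tag} {chunk}" if chunk else f"{lvl} {tag}")
    return lines
```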

Each gOut resource controls the lines output by some card type in one of these subtrees. A gOut has five components: the card type, the link type, two ID values (ID1 and ID2), and the note number.

For example, Gene has two gOut resources with card type "Person". The first is used to create a subtree containing most of the information about the person, which in Gene is stored in several different cards; this first gOut has link type 3, ID1=0 and ID2=-1 (specifying that the information is to be put in a subtree corresponding to the given person), and note number 1. The second gOut is used to put a pointer to the person in a subtree corresponding to a "family". Gene itself has no concept of families, so it creates families in the GEDCOM file for any two people involved in a birth or marriage. The second gOut has link type 4, ID1=2 and ID2=1, telling Gene to put the GEDCOM text in a subtree indexed by the father and mother; it has note number -1 since the person's text pane is handled by the other gOut.

 

"PICT" Resources

The "PICT" resource is a Macintosh-standard way of storing bitmap images. Gene uses PICTs for storing pictures internally to the Gene database. Each PICT resource should correspond to a number in the PictID field of a picture card. Since pictures can also be stored as alis resources within the database, the alis and PICT resource IDs should be chosen to be distinct from each other.

 

"TMPL" Resources

The "TMPL" resource is used to describe the format of the other resources, so that they can be edited easily within ResEdit. When you save a database file, Gene will automatically create a set of TMPL resources in that file, describing the other possible types of resource. Even if a file uses only the standard types of cards, the TMPL resources will still be created, so that you can use ResEdit to define new card types. These TMPL resources are not used by Gene, but they should not be changed.

 

"Tplt" Resources

The Tplt resource is used to define ways of creating new cards using information from existing cards. Tplt resources consist of some general values specifying what kind of card the template creates and when it applies, followed by a list of fields to copy from the old card to the new one.

Each Tplt resource corresponds to an entry in the Templates menu. The name of the menu entry is taken from the name of the resource. Each menu entry may come from several Tplt resources, for instance if the template depends on the gender of a person. If no Tplt resource with a given name applies, the menu entry will be inactive.

Each Tplt resource contains the following values:

After the general values specified above, each Tplt resource contains a list of the fields to copy from the previously existing card to the new card created by the template. Each entry in the list consists of two numbers, the first one specifying a number of a field in the existing card, and the second one specifying the number of the field in the new card to which the information should be copied. When Gene performs a template command, it goes through this list and copies the field values as strings. It is not necessary for the fields on the existing card and the newly created card to have the same types; Gene will create a string for the copied value and attempt to set the new card's field value to that string, as if you had typed that string directly into the new card.

The following example gives the values describing the "Child" template, active when the currently open card refers to a male person.

Command Key     B       Cmd-B is a shortcut for this template.
From Card       Person  It copies information from a person card
To Card         Person  to a new person card.
Enum Value      male    It is only active when the person is male
Enum Field      5       (there is a similar Tplt for female people).

From Field      0       The template copies the person's name
To Field        4       to the Father field of the new card

A template that copies more than one field (such as the Divorce template for Marriage cards, which copies both the Husband and Wife fields) would have more than one From Field and To Field pair, one for each piece of information that is copied.
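
The way a template command copies fields can be sketched in Python as below. The Template class and apply_template function are hypothetical names, field values are held as plain strings in a list indexed by field number, and the sketch assumes every template carries an enum condition, as the Child example does. How Gene expresses a template with no such condition is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical mirror of the Tplt values shown above."""
    command_key: str
    from_card: str
    to_card: str
    enum_value: str
    enum_field: int
    copies: list        # (From Field, To Field) pairs


def apply_template(tplt, card_type, fields, new_field_count):
    """Return the new card's field list, or None if the template is inactive."""
    if card_type != tplt.from_card:
        return None
    if fields[tplt.enum_field] != tplt.enum_value:
        return None
    new_fields = [""] * new_field_count
    for src, dst in tplt.copies:
        new_fields[dst] = fields[src]    # values are copied as strings
    return new_fields


# The "Child" template for a male person, using the values listed above.
CHILD_OF_FATHER = Template("B", "Person", "Person", "male", 5, [(0, 4)])
```

Applied to a Person card whose fields are ["Thomas Fox (1)", "", "", "", "", "male"], this sketch yields a new six-field card with "Thomas Fox (1)" in field 4, the Father field.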

 

"Tree" Resources

Tree resources describe commands in the Tree menu. The name of each command is taken from the corresponding resource name. Each Tree resource consists of the following values.

 

4. A Brief History of Gene

We divide the history of Gene's development into three parts: previous versions of Gene, recent changes to Gene, and Gene futures.

 

4.1. Previous Versions of Gene

Gene 4.x is the fourth incarnation of a program that has been rewritten in several languages, on several platforms, with several database formats.

David Eppstein wrote Gene 1.0 in Pascal on a DECsystem-20 at Stanford University. It used a standard DEC-20 command-line interface and its data format resembled the fields of the present person cards. He later rewrote it as Gene 2.0 using Prof. Don Knuth's WEB system of "literate programming".

David then moved to Columbia University, and rewrote the program as Gene 3.0 in C++ for Unix, keeping a similar command-line interface. To make up for the lack of a text pane or of cards other than people, he added complex user-defined fields to the file format (making it more similar to GEDCOM than Gene's present format).

After David and Diana married, moved to Irvine, and started using the Macintosh, Diana rewrote the program as Gene 4.0, with design input from David. The tree drawing code was adapted from the previous version of Gene, and we re-used some basic data structures such as the splay trees used to look up people's names, but the data storage and user interface code is completely new. Gene is now written in Symantec C++ for Macintosh, and uses a simple but flexible card-based format (documented in this file) and a point-and-click user interface (documented in the Gene User Guide).

 

4.2. Recent Changes to Gene

After the original release of Gene 4.0 in June 1994, three minor releases 4.0.1, 4.0.2, and 4.0.3 were made. These mostly contained bugfixes, but Gene 4.0.3 also included some changes of functionality: a better user interface for setting tree drawing widths, templates for children from marriage cards, and backward compatibility with 68000 machines and system 6. Along with more bugfixes, our next major release, 4.1, added the following features to Gene 4.0:

Another minor release, Gene 4.1.1, did not add any new features, but removed a number of bugs in Gene 4.1.

Major release 4.2 added the following new features:

It also made some minor user interface improvements, such as pop-up menus to set sex fields and allowing a small margin of overlap when printing multi-page tree drawings. Another minor release, Gene 4.2.1, did not add any new features, but removed a number of bugs in Gene 4.2.

This document describes Gene 4.3. It adds the following new features:

 

4.3. Gene Futures

We are continuing to maintain and improve Gene. Our top priorities for new features are

Other possible future features include

Copyright 1995-2000 David and Diana Eppstein.