One of the most important decisions is how the PIM data is encoded within a data file or database. The format must be extensible and support a rich set of features, yet remain easy to handle.
We take a suggestion from the SyncML specification: any SyncML server that supports a contact database must support the vCard 2.1 (see [vCard21]) and vCard 3.0 (see [RFC 2425] and [RFC 2426]) formats.
The “v” in vCard stands for the versit consortium. This consortium has also published other standards, such as vCalendar and vTodo. Although the versit consortium itself no longer exists, those standards are still the most used and most widely accepted. Many open source applications use vCard internally as their data format, and many e-mail programs can attach business cards in vCard format.
The vCard format uses the standard 7-bit ASCII character set for its contents. A vCard is a line-oriented text file. Each line consists of a property, a colon, and a value. Multiple vCards may exist in one file: the two special lines BEGIN:VCARD and END:VCARD mark the beginning and end of a vCard entry. The vCard format itself is easily human-readable, so let us take a look at Example 4.1, “Minimal version of my personal vCard”.
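Independently of that example, the line structure described above (property, colon, value, framed by BEGIN:VCARD and END:VCARD) can be sketched in a few lines of Python. This is only a minimal illustration: the function name `parse_vcards` and the sample cards are hypothetical, and the sketch assumes that no line is folded across several physical lines.

```python
def parse_vcards(text):
    """Split a text stream into vCards, each a list of (property, value) pairs."""
    cards = []
    current = None  # None while we are outside a BEGIN/END pair
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or ":" not in line:
            continue  # this sketch silently skips blank or malformed lines
        prop, value = line.split(":", 1)  # only the first colon separates
        key = prop.upper()
        if key == "BEGIN" and value.upper() == "VCARD":
            current = []
        elif key == "END" and value.upper() == "VCARD":
            if current is not None:
                cards.append(current)
            current = None
        elif current is not None:
            current.append((prop, value))
    return cards


# Hypothetical sample data for illustration only.
SAMPLE = """BEGIN:VCARD
VERSION:2.1
FN:Jane Doe
N:Doe;Jane;;;
END:VCARD
BEGIN:VCARD
VERSION:2.1
FN:John Roe
N:Roe;John;;;
END:VCARD
"""

cards = parse_vcards(SAMPLE)
print(len(cards))            # number of cards found in the stream
print(dict(cards[0])["FN"])  # formatted name of the first card
```

Keeping each card as a list of pairs rather than a dictionary preserves properties that occur more than once, such as several TEL lines.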
When it comes to 8-bit encoding, the Versit format shows its origin in the US: 8-bit encoding itself is simple, character set selection is not. For encoding 8-bit data, the vCard standard defines encoding parameters for “quoted-printable” and “base64”. This allows vCards to contain other data such as photographs (PHOTO), company logos (LOGO), public cryptographic keys (KEY;PGP), and others.
The character set selection has to happen at a higher level, outside the actual vCard data stream. This makes 8-bit characters such as German umlauts dependent on the processing system. vCard 2.1 defined a way to specify the character set of single entries, but this was dropped in the newer 3.0 version.
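The split between encoding and character set selection can be illustrated with a short Python sketch. It assumes the parameters of a property have already been separated into a list of strings; the function name `decode_value` and the default character set are hypothetical choices, not part of any vCard specification. The character set is passed in by the caller, because the data stream itself does not carry it.

```python
import base64
import quopri


def decode_value(params, raw, charset="iso-8859-1"):
    """Decode one property value according to its ENCODING parameter.

    params  -- list of parameter strings, e.g. ["ENCODING=QUOTED-PRINTABLE"]
    raw     -- the text after the colon
    charset -- chosen by the caller (assumption: ISO 8859-1 as default)
    """
    encoding = ""
    for param in params:
        name, _, value = param.partition("=")
        if name.upper() == "ENCODING":
            encoding = value.upper()
    if encoding == "QUOTED-PRINTABLE":
        # Quoted-printable yields bytes; the charset turns them into text.
        return quopri.decodestring(raw.encode("ascii")).decode(charset)
    if encoding in ("BASE64", "B"):  # "B" is the spelling used by vCard 3.0
        return base64.b64decode(raw)  # binary payload such as PHOTO or KEY
    return raw  # no ENCODING parameter: plain 7-bit text


# The same name, once as ISO 8859-1 and once as UTF-8 quoted-printable.
latin = decode_value(["ENCODING=QUOTED-PRINTABLE"], "M=FCller")
utf8 = decode_value(["ENCODING=QUOTED-PRINTABLE"], "M=C3=BCller", charset="utf-8")
print(latin == utf8)
```

Decoding the same bytes with the wrong character set does not fail loudly; it silently produces different characters, which is exactly the system dependence described above.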
The vCard specification defines many properties. Most of them are self-explanatory and not really relevant for the sync process. Some fields have special functions and need to be explained:
- VERSION
Defines the vCard version. Can be either 2.1 or 3.0. Processing differs slightly depending on the version, as explained in the changes section.
- FN
This field contains the formatted name for a person. Although it is not handled specially in any way, this is the field we want to use when referring to a card in display outputs, like a debug log.
- N
The N property contains the name parts for a person, separated by semicolons. They are: family name, given name, additional names, name prefix and name suffix. For comparison, these five fields are treated like five separate properties.
- UID
The unique identifier for this card. Although it is supposed to be unique, it might differ from client to client, so we have to keep track of it and change it for every client.
- REV
The REV property contains the revision of an element or, put more simply, the date of its last change. This is used for dynamically building transaction logs.
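Two of the fields above lend themselves to a small Python sketch: comparing N component by component, and reading REV as a timestamp. All names here (`split_name`, `changed_name_parts`, `parse_rev`) are hypothetical. The sketch assumes that name parts contain no escaped semicolons and that REV uses one of the four ISO 8601 layouts listed in the code; anything else is reported as unknown.

```python
from datetime import datetime

NAME_PARTS = ("family", "given", "additional", "prefix", "suffix")


def split_name(n_value):
    """Split an N value into its five components, padding missing ones."""
    parts = n_value.split(";")
    parts += [""] * (len(NAME_PARTS) - len(parts))
    return dict(zip(NAME_PARTS, parts))


def changed_name_parts(old_n, new_n):
    """Compare two N values as if they were five separate properties."""
    old, new = split_name(old_n), split_name(new_n)
    return [part for part in NAME_PARTS if old[part] != new[part]]


def parse_rev(rev_value):
    """Return REV as a datetime, or None if the layout is not recognized."""
    layouts = ("%Y%m%dT%H%M%SZ", "%Y-%m-%dT%H:%M:%SZ", "%Y%m%d", "%Y-%m-%d")
    for layout in layouts:
        try:
            return datetime.strptime(rev_value, layout)
        except ValueError:
            continue
    return None


print(changed_name_parts("Doe;Jane;;;", "Doe;Janet;;Dr.;"))
print(parse_rev("19951031T222710Z"))
```

Treating the name parts separately means that a changed prefix on one side and a changed given name on the other can be detected as two independent changes rather than one conflicting change to N.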
Although it lacks some very nice additions, the vCard 2.1 format is still the most widely used standard. The newer 3.0 version, however, has some new features:
In vCard 2.1, parameters are simply appended to the property with a semicolon. In vCard 3.0, those parameters are given with the TYPE keyword: TEL;WORK in 2.1 becomes TEL;TYPE=WORK in 3.0.
The vCard 3.0 format offers some new fields and new types. Those have to be removed when syncing with 2.1 clients.
And last, but not least, the vCard 3.0 standard defines quoting of 8-bit characters a little differently than 2.1 did. It no longer supports the “quoted-printable” format.
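The first of these differences, the notation of parameters, can be translated mechanically. The following Python sketch shows one possible way to do it in both directions; the function names are hypothetical, and the sketch assumes that the part before the first colon contains no quoted parameter values with colons or semicolons in them. It does not remove 3.0-only fields or re-encode values.

```python
def params_21_to_30(line):
    """Rewrite bare 2.1 parameters (TEL;WORK;VOICE) as a 3.0 TYPE list."""
    head, sep, value = line.partition(":")
    name, *params = head.split(";")
    types = [p for p in params if "=" not in p]   # bare 2.1 parameters
    others = [p for p in params if "=" in p]      # e.g. ENCODING=..., kept
    if types:
        others.insert(0, "TYPE=" + ",".join(types))
    return ";".join([name] + others) + sep + value


def params_30_to_21(line):
    """Rewrite 3.0 TYPE=... parameters as bare 2.1 parameters."""
    head, sep, value = line.partition(":")
    name, *params = head.split(";")
    out = []
    for param in params:
        key, eq, val = param.partition("=")
        if eq and key.upper() == "TYPE":
            out.extend(val.split(","))  # TYPE=WORK,VOICE -> WORK, VOICE
        else:
            out.append(param)
    return ";".join([name] + out) + sep + value


print(params_21_to_30("TEL;WORK;VOICE:+1 555 0100"))
print(params_30_to_21("TEL;TYPE=WORK,VOICE:+1 555 0100"))
```

Lines without parameters pass through unchanged, so the same function can be applied to every line of a card.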
Here is my vCard from Example 4.1, “Minimal version of my personal vCard” again, this time in 3.0 format: