ISO 639/JAC N3R
ISO 639 Joint Advisory Committee
Working principles for ISO 639 maintenance
(8 March 2000)
This document sets out working principles for the maintenance of language
codes by the ISO 639 Joint Advisory Committee, both in ISO 639-1 (Alpha-2
code) and ISO 639-2 (Alpha-3 code). It repeats some information that is
in ISO 639-2:1998 in section 4 (Language codes) and the normative Annex
A. In addition, it gives further details on how submitted language code
changes are considered and how the two parts of ISO 639 are related.
1. Definition of new language codes
1.1. Procedures
- A Registration form is available on the Web for requesting new language
codes. The completed form is submitted to the appropriate ISO 639
Registration Authority for consideration.
- The Registration Authority will review applications, obtain additional
information and/or justification from the submitter, and suggest the
assignment of a code when the relevant criteria are met.
1.2. Criteria for ISO 639-2
- Number of documents. The request for a new language code shall
include evidence that one agency holds 50 different documents in the
language or that five agencies hold a total of 50 different documents
among them in the language. Documents include all forms of material
and are not limited to text.
- Collective codes. If the criteria above are not met, the language
may be assigned a new or existing collective language code. The words
"languages" or "other" as part of a language name indicate
that a language code is a collective one.
- Scripts. A single language code is normally provided for a
language even though the language is written in more than one script.
ISO DIS 15924 Codes for the representation of names of scripts
is under development by ISO/TC46/SC2.
- Dialects. A dialect of a language is usually represented by
the same language code as that used for the language. If the language
is assigned to a collective language code, the dialect is assigned to
the same collective language code. The difference between dialects and
languages will be decided on a case-by-case basis.
- Orthography. A language using more than one orthography is
not given multiple language codes.
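The "Number of documents" criterion above can be sketched as a small check. This is only an illustration, not part of the Registration Authority's procedure: the function name and the agency names are hypothetical, the threshold of 50 is treated as a minimum, "five agencies" is read as "up to five agencies", and the counts are assumed not to overlap between agencies.

```python
from typing import Mapping

MIN_DOCUMENTS = 50  # threshold from the "Number of documents" criterion
MAX_AGENCIES = 5    # assumption: "five agencies" is read as "up to five"


def meets_document_criterion(holdings: Mapping[str, int]) -> bool:
    """Return True if the holdings satisfy the document-count criterion.

    `holdings` maps a (hypothetical) agency name to the number of different
    documents it holds in the language. Counts are assumed not to overlap
    between agencies.
    """
    counts = sorted(holdings.values(), reverse=True)
    # Case 1: a single agency holds at least 50 different documents.
    if counts and counts[0] >= MIN_DOCUMENTS:
        return True
    # Case 2: the five largest holdings together reach 50 documents.
    return sum(counts[:MAX_AGENCIES]) >= MIN_DOCUMENTS
```

A request that fails this check is not necessarily rejected; as stated under "Collective codes", the language may instead be assigned to a collective language code.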
1.3. Criteria for ISO 639-1
- Relation to ISO 639-2. Since ISO 639-1 is to remain a subset
of ISO 639-2, a language proposed for ISO 639-1 must first satisfy the
requirements for ISO 639-2 and must also satisfy the following.
- Documentation.
- a significant body of existing documents (specialized texts, such
as college or university textbooks, technical documentation manuals,
specialized journals, subject-field related books, etc.) written
in specialized languages
- a number of existing terminologies in various subject fields (e.g.
technical dictionaries, specialized glossaries, vocabularies, etc.
in printed or electronic form)
- Recommendation. A recommendation and support of a specialized
authority (such as a standards organization, governmental body, linguistic
institution, or cultural organization)
- Other considerations
- the number of speakers of the language community
- the recognized status of the language in one or more countries
- the support of the request by one or more official bodies
- Collective codes. ISO 639-1 does not use collective codes.
If a collective code is necessary, the alpha-3 code shall be used.
2. Choice of new language codes
- Language codes consist of the following 26 letters of the Latin alphabet
in lower case with no diacritical marks or modified characters: a, b,
c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y,
z.
- ISO 639-2 uses three alphabetic characters, and ISO 639-1 uses two
alphabetic characters.
- Codes need not be abbreviations for the language as they are intended
to serve as an arbitrary device to identify a given language or group
of languages. Mnemonicity of codes is striven for, but this may not
always be possible or appropriate.
- An effort is made to derive a language code from a language's name
for itself, when possible. For historical reasons, some codes may be
based on the name of a language in English.
- There are 23 language names in ISO 639-2 that have variant codes:
one for bibliographic applications (ISO 639-2/B) and one for terminological
applications (ISO 639-2/T). These variants exist because of established
usage in national and international bibliographic databases, which employed
codes based on English-language forms of names.
- New language codes shall be based on the vernacular form of name unless
- another language code is requested by the country or countries
using the language or the sponsor submitting the request;
- the vernacular cannot be determined; or
- a suitable code is not available.
In the latter two cases, an English form of name may be used to
derive the language code.
- A language code already in ISO 639-2/T which is based on the English
form of the name shall not be changed even if the vernacular form is
determined and/or added to ISO 639-1. This is to ensure continuity and
stability and to prevent the proliferation of multiple or alternative
codes.
- A prefix is not regarded as part of the language name for purposes
of assigning a code (e.g. Swahili is the language name, although "KiSwahili"
is often used).
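The formal shape of a code described at the start of this section (lower-case Latin letters a to z only, two letters for ISO 639-1, three for ISO 639-2) can be checked mechanically. The following is a minimal sketch; the function name and the `part` parameter are illustrative assumptions. It tests only the shape of a code, not whether the code has actually been assigned.

```python
import re

# Only the 26 unmodified lower-case Latin letters are allowed.
ALPHA2 = re.compile(r"[a-z]{2}")  # ISO 639-1 shape
ALPHA3 = re.compile(r"[a-z]{3}")  # ISO 639-2 shape


def is_well_formed(code: str, part: int) -> bool:
    """Check the shape of `code` for ISO 639 part 1 or part 2."""
    if part == 1:
        pattern = ALPHA2
    elif part == 2:
        pattern = ALPHA3
    else:
        raise ValueError("part must be 1 (alpha-2) or 2 (alpha-3)")
    # fullmatch rejects extra characters before or after the letters.
    return pattern.fullmatch(code) is not None
```

For example, `is_well_formed("en", part=1)` and `is_well_formed("eng", part=2)` hold, while upper-case input such as `"EN"` or a letter with a diacritical mark is rejected.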
3. Changes of existing language codes
- To ensure continuity and stability in support of online retrieval
from large databases built over many years, codes shall not be changed.
- Where codes have been changed or discontinued in the past, the old
codes shall not be reassigned.
- Language codes shall not be changed if the conventional name of a
language is changed. However, language names associated with codes may
be changed.
- In the future, variant forms of a language name may be included in the
entry, separated by semicolons. No effort will be made by the Registration
Authorities to collect those variants that were previously not included.
- The MARC Code List for Languages maintains variant names of languages
and may be used as a reference source.
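The stability rules in this section (a code is never reassigned once changed or discontinued, while the name attached to a code may change) can be illustrated with a toy registry. This is a sketch under stated assumptions: the class and method names are hypothetical and do not describe how the Registration Authorities actually store their data.

```python
class CodeRegistry:
    """Toy registry enforcing the "no reassignment" rule (hypothetical)."""

    def __init__(self) -> None:
        self._active = {}      # code -> current language name
        self._retired = set()  # codes discontinued in the past

    def assign(self, code: str, name: str) -> None:
        # A code that is in use, or was ever retired, cannot be assigned.
        if code in self._active or code in self._retired:
            raise ValueError(f"code {code!r} is in use or was retired")
        self._active[code] = name

    def rename(self, code: str, new_name: str) -> None:
        # The name may change; the code itself stays the same.
        if code not in self._active:
            raise KeyError(code)
        self._active[code] = new_name

    def retire(self, code: str) -> None:
        # Discontinue a code and remember it so it is never reused.
        del self._active[code]  # raises KeyError for unknown codes
        self._retired.add(code)

    def name_of(self, code: str) -> str:
        return self._active[code]
```

With this design, calling `assign` on a retired code raises `ValueError`, which mirrors the rule that old codes shall not be reassigned.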
4. Relationship between ISO 639-1 and ISO 639-2
- In the development of ISO 639-2, the principle was that a code in the
alpha-3 list would include the two characters of the alpha-2 code where
possible. An exception was the alternative codes, where longstanding and
widespread existing usage of bibliographic codes did not permit this.
- New codes introduced in ISO 639-1 that are already included in ISO
639-2 should follow this principle. If the vernacular form had not been
used in ISO 639-2/T, the ISO 639 JAC will attempt to establish an alpha-2
code with two letters in common with the alpha-3 code when possible.
- ISO 639-1 shall be a subset of ISO 639-2.
- New codes will no longer be added to ISO 639-1 after the publication
of a revised standard unless they are also added to ISO 639-2.
- A language code already in ISO 639-2 at the point of freezing ISO
639-1 shall not later be added to ISO 639-1. This is to ensure consistency
in usage over time, since users are directed in Internet applications
to employ the alpha-3 code when an alpha-2 code for that language is
not available.
- New language codes may be considered for inclusion in both parts or
in ISO 639-2 only. If the request is to add a code to ISO 639-1, the code
must also be added to ISO 639-2 and satisfy the stated criteria.
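The two relationships described in this section can be sketched in code: the letter-sharing principle between alpha-2 and alpha-3 codes, and the requirement that ISO 639-1 be a subset of ISO 639-2. The function names are hypothetical, and "include the two characters" is read here as "both letters appear in the alpha-3 code in the same order", which is one possible interpretation rather than a rule stated in this document.

```python
def shares_letters(alpha2: str, alpha3: str) -> bool:
    """True if both alpha-2 letters occur, in order, in the alpha-3 code.

    Assumption: the principle is read as an ordered-subsequence test.
    """
    remaining = iter(alpha3)
    # `ch in remaining` advances the iterator, so order is respected.
    return all(ch in remaining for ch in alpha2)


def missing_from_part2(part1_languages: set, part2_languages: set) -> set:
    """Languages coded in ISO 639-1 but absent from ISO 639-2.

    An empty result means the subset requirement holds.
    """
    return part1_languages - part2_languages
```

For German, the alpha-2 code "de" satisfies the check against the terminological code "deu" but not against the bibliographic code "ger", which illustrates the exception for the alternative codes noted above.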
See also Rules of procedure for conducting business (ISO 639/JAC N2R).
Comments on this document: iso639-2@loc.gov