<!--This file created 09/09/96 10:59 by Claris Home Page version 1.0-->
<HTML>
<HEAD>
   <TITLE>Collation Guide</TITLE>
   <META NAME=GENERATOR CONTENT="Claris Home Page 1.0">
</HEAD>
<BODY BGCOLOR="#FFFFFF">

<H2 ALIGN=CENTER><I>Collation Demo Guide</I></H2>

<P><TABLE BORDER=0 CELLPADDING=10 WIDTH="100%" align=center>
   <TR>
      <TD align=right width=50%>
         <P ALIGN=CENTER><APPLET codebase="../code" CODE="CollateDemo.class" WIDTH=55
         HEIGHT=30 ALIGN=middle>(Java not enabled in this browser)
                                    </APPLET>
      </TD></TR>
</TABLE></P>

<P>Text collation supports language-sensitive comparison of strings,
allowing for text searching and alphabetical sorting. The
collation classes provide a choice of ordering strength (for example,
to ignore or not ignore case differences) and handle ignored,
expanding, and contracting characters.</P>

<P>Developers don't need to know anything about the collation rules
for various languages. Any features requiring collation can use the
collation object associated with the current default locale, or with
a specific locale (like France or Japan) if appropriate.</P>
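<P>As a minimal sketch of that idea, assuming the standard java.text.Collator
class rather than the demo's own code, the following sorts a word list with
the collator for a chosen locale. The class and method names are
hypothetical.</P>

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class LocaleSortSketch {
    /** Returns a sorted copy of the words, ordered by the collator for the given locale. */
    static List<String> sortFor(Locale locale, List<String> words) {
        // Collator implements Comparator, so it can be passed straight to sort().
        Collator collator = Collator.getInstance(locale);
        List<String> copy = new ArrayList<>(words);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pat", "Zebra", "apple");
        // Collation for the current default locale.
        System.out.println(sortFor(Locale.getDefault(), words));
        // Collation for a specific locale.
        System.out.println(sortFor(Locale.FRANCE, words));
    }
}
```

<P>The caller only chooses a locale; the collation rules themselves stay
inside the collator.</P>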

<P ALIGN=CENTER><A HREF="#basics">Collation Basics</A> &nbsp;
<A HREF="#localizable">Localizable Collation</A> &nbsp;
<A HREF="#anchor3910987">Customization</A> &nbsp;
<A HREF="CollationDetails.html">Details</A></P>

<P><HR></P>

<H3><A NAME="basics"></A>Collation Basics</H3>

<P>Correctly sorting strings is tricky, even in English. The results
of a sort must be consistent: any differences in strings must
always be sorted the same way. The sort assigns relative priorities
to different features of the text, based on the characters themselves
and on the current ordering strength of the collation object.</P>

<P><TABLE BORDER=2 CELLPADDING=1 WIDTH="100%">
   <TR>
      <TH VALIGN=bottom align=left>
         <P ALIGN=LEFT>To See This...
      </TH><TH VALIGN=bottom align=left>
         <P ALIGN=LEFT>Do This...
      </TH></TR>
   <TR>
      <TD VALIGN=top width=50%>
         <P><I>Consistent sorting: </I>In English, uppercase letters
         always sort after lowercase letters whenever there are no
         other differences in compared strings.
      </TD><TD VALIGN=top>
         <P>&#183; Click on the <I>Sort Ascending </I>button.<BR>
         
         &#183; Click on the <I>Sort Descending </I>button.<BR>
         
         &#183; The relative order of "pat", "Pat", and "PAT"
         reverses.
      </TD></TR>
   <TR>
      <TD VALIGN=top width=50%>
         <P><I>Differences in ordering strength: </I>Secondary
         ordering strength means case differences are disregarded
         (enabling case-insensitive searching). With primary ordering
         strength, accents are also ignored; only base letters
         are compared.
      </TD><TD VALIGN=top>
         <P>&#183; Select <I>Primary </I>from the Strength menu.<BR>
         
         &#183; Click alternately on <I>Sort Ascending </I>and
         <I>Sort Descending.</I><BR>
         
         &#183; The relative order of "pat", "Pat", and "PAT" stays
         the same.
      </TD></TR>
</TABLE></P>
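<P>The effect of ordering strength can be sketched in code. This example
assumes the standard java.text.Collator class and its PRIMARY, SECONDARY,
and TERTIARY constants, which may not match the demo's internals exactly;
the class and method names are hypothetical.</P>

```java
import java.text.Collator;
import java.util.Locale;

final class StrengthSketch {
    /** Compares two strings with a US English collator set to the given strength. */
    static int compareAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        String accented = "p\u00eache"; // "peche" with a circumflex on the first e

        // Tertiary: case counts, and lowercase sorts before uppercase.
        System.out.println(compareAt(Collator.TERTIARY, "pat", "Pat"));
        // Secondary: case is disregarded, accents still count.
        System.out.println(compareAt(Collator.SECONDARY, "pat", "Pat"));
        System.out.println(compareAt(Collator.SECONDARY, "peche", accented));
        // Primary: only base letters are compared.
        System.out.println(compareAt(Collator.PRIMARY, "peche", accented));
    }
}
```

<P>A result of zero means the two strings are treated as equal at that
strength, which is what makes case-insensitive or accent-insensitive
searching possible.</P>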

<P>Other special characters, including accented or grouped
characters, add further complications. For example, the "-" hyphen
character in the word "black-bird" is only significant if the other
letters in the compared strings are identical.</P>

<P ALIGN=CENTER><HR WIDTH="50%" align="LEFT"></P>

<H3><A NAME="localizable"></A>Localizable Collation</H3>

<P>Different collation objects associated with various locales handle
the differences required when sorting text strings for different
languages.</P>

<P><TABLE BORDER=2 CELLPADDING=5 WIDTH="100%">
   <TR>
      <TH VALIGN=bottom align=left>
         <P ALIGN=LEFT>To See This...
      </TH><TH VALIGN=bottom align=left>
         <P ALIGN=LEFT>Do This...
      </TH></TR>
   <TR>
      <TD VALIGN=top width=50%>
         <P>In French, accent differences are sorted from the
         <I>end</I> of the word, so the ordering of "p&ecirc;che" and
         "p&eacute;ch&eacute;" changes from the English ordering.
      </TD><TD VALIGN=top>
         <P>&#183; Select <I>Tertiary </I>from the Strength menu.<BR>
         
         &#183; Select <I>French (France)</I> from the Locale menu.
      </TD></TR>
   <TR>
      <TD VALIGN=top width=50%>
         <P>In German the ordering of "T&ouml;ne" changes, because
         German treats <I>o + umlaut</I> (&ouml;) as if it were
         <I>oe</I>.
      </TD><TD VALIGN=top>
         <P>&#183; Select <I>German (Germany)</I> from the Locale
         menu.
      </TD></TR>
</TABLE></P>
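<P>The French behavior can be approximated in code. This sketch assumes
java.text.RuleBasedCollator, whose rule syntax documents "@" as the
modifier that turns on backward ordering of accents; it appends that
modifier to the default US English rules instead of relying on any
particular locale's data. The class and method names are hypothetical, and
the cast assumes the default collator is rule-based, as in standard
JDKs.</P>

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class BackwardAccentSketch {
    /** Default US rules with '@' appended to request backward accent ordering. */
    static RuleBasedCollator backwardAccents() {
        // Assumption: Collator.getInstance returns a RuleBasedCollator.
        RuleBasedCollator base = (RuleBasedCollator) Collator.getInstance(Locale.US);
        try {
            return new RuleBasedCollator(base.getRules() + "@");
        } catch (ParseException e) {
            throw new IllegalStateException("Could not parse tailored rules", e);
        }
    }

    /** Returns the words sorted by the given collator. */
    static List<String> sorted(Collator collator, String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, collator);
        return Arrays.asList(copy);
    }

    public static void main(String[] args) {
        String circumflex = "p\u00eache";   // circumflex on the first e
        String acute = "p\u00e9ch\u00e9";   // acute on both e's
        // Print both orderings so they can be compared side by side.
        System.out.println(sorted(Collator.getInstance(Locale.US), circumflex, acute));
        System.out.println(sorted(backwardAccents(), circumflex, acute));
    }
}
```

<P>Printing both lists side by side gives a way to check whether the two
words change places, as they do in the demo.</P>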

<P ALIGN=CENTER><HR WIDTH="50%" align="LEFT"></P>

<H3><A NAME="anchor3910987"></A>Customization</H3>

<P>You can produce a new collation by adding to or changing an
existing one. In the demo, you do this through the Collation Rules
field, which shows the rules that make up the collation sequence for
the selected language. (At the start of the list are a number of
odd-looking items such as "\u0308". These use Java notation for
Unicode characters, used here because most browsers are currently
unable to display the full range of Unicode characters.)
</P>

<BLOCKQUOTE><P>In all of the following examples, you can cut and
paste sample rules or test cases instead of typing them in manually.
Paste them at the end of the respective fields.</P></BLOCKQUOTE>

<P><TABLE BORDER=2 CELLPADDING=1 WIDTH="100%">
   <TR>
      <TD>
         <H4>To See This...</H4>
      </TD><TD>
         <H4>Do This...</H4>
      </TD></TR>
   <TR>
      <TD>
         <P>You can modify an existing collation. Adding items at the
         end of a collation overrides earlier information.</P>
         
         <P>For example, you can make the letter <I>P</I> sort at the
         end of the alphabet.
      </TD><TD VALIGN=top>
         <P>&#183; Enter the sample rules at the end of the Collation
         Rules field.<BR>
         
         &#183; Hit the Set Rules button.<BR>
         
         &#183; Select Sort Ascending to see the resulting sort
         order.
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P><B>Sample Rules:</B>
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P>&lt; p , P
      </TD></TR>
</TABLE></P>

<P>Making P sort at the end may not seem terribly useful, but the same
technique is used to modify the sorting sequence for different
languages.</P>
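<P>The same override can be tried outside the demo. This sketch assumes
java.text.RuleBasedCollator and appends the sample rule to the default US
English rules; the class and method names are hypothetical, and the cast
assumes the default collator is rule-based.</P>

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;
import java.util.Locale;

final class PLastSketch {
    /** Default US rules followed by "< p , P", which re-declares p and P at the end. */
    static RuleBasedCollator pLast() {
        // Assumption: Collator.getInstance returns a RuleBasedCollator.
        RuleBasedCollator base = (RuleBasedCollator) Collator.getInstance(Locale.US);
        try {
            // The later rule overrides the earlier position of p and P.
            return new RuleBasedCollator(base.getRules() + " < p , P");
        } catch (ParseException e) {
            throw new IllegalStateException("Could not parse tailored rules", e);
        }
    }

    public static void main(String[] args) {
        String[] words = {"pat", "zebra", "quit", "apple"};
        Arrays.sort(words, pLast());
        System.out.println(Arrays.toString(words));
    }
}
```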

<P><TABLE BORDER=2 CELLPADDING=1 WIDTH="100%">
   <TR>
      <TD>
         <H4>To See This...</H4>
      </TD><TD>
         <H4>Do This...</H4>
      </TD></TR>
   <TR>
      <TD VALIGN=top>
         <P>You can add new rules to an existing collation. For
         example, you can add CH as a single letter after C, as in
         traditional Spanish sorting.
      </TD><TD VALIGN=top>
         <P>&#183; Enter the sample rules at the end of the Collation
         Rules field.<BR>
         
         &#183; Enter the test cases at the end of the test
         field.<BR>
         
         &#183; Hit the Set Rules button.<BR>
         
         &#183; Select Sort Ascending to see the resulting sort
         order.
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P><B>Sample Rules:</B>
      </TD></TR>
   <TR>
      <TD VALIGN=top COLSPAN=2>
         <P>&amp; c &lt; ch , cH, Ch, CH
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P><B>Sample Test Cases:</B>
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P>cat<BR>
         
         czar<BR>
         
         churo<BR>
         
         darn
      </TD></TR>
</TABLE></P>
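<P>A rough code equivalent of the CH example, assuming
java.text.RuleBasedCollator and the default US English rules as the
starting point. The rule string and test words are the ones from the table;
the class and method names are hypothetical.</P>

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;
import java.util.Locale;

final class ChContractionSketch {
    /** Default US rules plus a rule that treats "ch" as one letter sorting after c. */
    static RuleBasedCollator withCh() {
        // Assumption: Collator.getInstance returns a RuleBasedCollator.
        RuleBasedCollator base = (RuleBasedCollator) Collator.getInstance(Locale.US);
        try {
            return new RuleBasedCollator(base.getRules() + " & c < ch , cH, Ch, CH");
        } catch (ParseException e) {
            throw new IllegalStateException("Could not parse tailored rules", e);
        }
    }

    public static void main(String[] args) {
        String[] words = {"cat", "czar", "churo", "darn"};
        // With the contraction in place, "churo" should move after "czar".
        Arrays.sort(words, withCh());
        System.out.println(Arrays.toString(words));
    }
}
```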

<P>As well as adding sequences of characters that act as a single
character (this is known as <I>contraction</I>), you can also add
characters that act like a sequence of characters (this is known as
<I>expansion</I>).</P>

<P><TABLE BORDER=2 CELLPADDING=1 WIDTH="100%">
   <TR>
      <TH>
         <H4 ALIGN=LEFT>To See This...</H4>
      </TH><TH>
         <H4 ALIGN=LEFT>Do This...</H4>
      </TH></TR>
   <TR>
      <TD VALIGN=top>
         <P>You can also add other sequences to the collation rules,
         such as sorting symbols with their alphabetic equivalents.
      </TD><TD VALIGN=top>
         <P>&#183; Enter the sample rules at the end of the Collation
         Rules field.<BR>
         
         &#183; Enter the test cases at the end of the test
         field.<BR>
         
         &#183; Hit the Set Rules button.<BR>
         
         &#183; Select Sort Ascending to see the resulting sort
         order.
      </TD></TR>
   <TR>
      <TD VALIGN=top COLSPAN=2>
         <P><B>Sample Rules:</B>
      </TD></TR>
   <TR>
      <TD VALIGN=top COLSPAN=2>
         <P>&amp; Question-mark ; ?<BR>
         
         &amp; Hash-mark ; #<BR>
         
         <A NAME="Ampersand"></A>&amp; Ampersand ; '&amp;'
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P><B>Sample Test Cases:</B>
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P>?<BR>
         
         #<BR>
         
         &amp;
      </TD></TR>
</TABLE></P>
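<P>A sketch of the same expansion rules, assuming
java.text.RuleBasedCollator. In that class's rule syntax, punctuation must
be enclosed in single quotes, so the hyphen, "?", "#", and "&amp;" are
quoted here; this differs slightly from the sample rules as typed into the
demo. The class and method names are hypothetical.</P>

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;
import java.util.Locale;

final class SymbolExpansionSketch {
    /** Default US rules plus expansions that sort symbols with their spelled-out names. */
    static RuleBasedCollator withSymbolNames() {
        // Assumption: Collator.getInstance returns a RuleBasedCollator.
        RuleBasedCollator base = (RuleBasedCollator) Collator.getInstance(Locale.US);
        String extra = " & Question'-'mark ; '?'"
                     + " & Hash'-'mark ; '#'"
                     + " & Ampersand ; '&'";
        try {
            return new RuleBasedCollator(base.getRules() + extra);
        } catch (ParseException e) {
            throw new IllegalStateException("Could not parse tailored rules", e);
        }
    }

    public static void main(String[] args) {
        String[] words = {"?", "#", "&", "Pat", "Rat", "Gas", "Ink", "Bat"};
        // Each symbol should now land among the words starting with its name's first letter.
        Arrays.sort(words, withSymbolNames());
        System.out.println(Arrays.toString(words));
    }
}
```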

<P>Expansion and contraction can be combined.</P>

<P><TABLE BORDER=2 CELLPADDING=1 WIDTH="100%">
   <TR>
      <TD>
         <H4>To See This...</H4>
      </TD><TD>
         <H4>Do This...</H4>
      </TD></TR>
   <TR>
      <TD VALIGN=top>
         <P>In Japanese there is a length character that acts as
         though it doubles a character in sorting. Using analogous
         English letters, it would be as though "a-" sorts as "aa",
         "e-" sorts as "ee", etc.
      </TD><TD VALIGN=top>
         <P>&#183; Enter the sample rules at the end of the Collation
         Rules field.<BR>
         
         &#183; Enter the test cases at the end of the test
         field.<BR>
         
         &#183; Hit the Set Rules button.<BR>
         
         &#183; Select Sort Ascending to see the resulting sort
         order.
      </TD></TR>
   <TR>
      <TD VALIGN=top COLSPAN=2>
         <P><B>Sample Rules:</B>
      </TD></TR>
   <TR>
      <TD VALIGN=top COLSPAN=2>
         <P>&amp; aa ; a-<BR>
         
         &amp; ee ; e-<BR>
         
         &amp; ii ; i-<BR>
         
         &amp; oo ; o-<BR>
         
         &amp; uu ; u-
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P><B>Sample Test Cases:</B>
      </TD></TR>
   <TR>
      <TD COLSPAN=2>
         <P>aardvark<BR>
         
         a-rdvark<BR>
         
         abbot<BR>
         
         coop<BR>
         
         co-p<BR>
         
         cop
      </TD></TR>
</TABLE></P>
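<P>The combined case can be sketched the same way, again assuming
java.text.RuleBasedCollator, with the hyphen quoted as that class's rule
syntax requires. Each rule makes a two-character sequence such as "a-" (a
contraction) sort like a doubled vowel (an expansion). The class and method
names are hypothetical.</P>

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;
import java.util.Locale;

final class LengthMarkSketch {
    /** Default US rules plus rules that make "a-", "e-", ... sort like "aa", "ee", ... */
    static RuleBasedCollator withLengthMark() {
        // Assumption: Collator.getInstance returns a RuleBasedCollator.
        RuleBasedCollator base = (RuleBasedCollator) Collator.getInstance(Locale.US);
        String extra = " & aa ; a'-'"
                     + " & ee ; e'-'"
                     + " & ii ; i'-'"
                     + " & oo ; o'-'"
                     + " & uu ; u'-'";
        try {
            return new RuleBasedCollator(base.getRules() + extra);
        } catch (ParseException e) {
            throw new IllegalStateException("Could not parse tailored rules", e);
        }
    }

    public static void main(String[] args) {
        String[] words = {"cop", "abbot", "co-p", "a-rdvark", "coop", "aardvark"};
        // "a-rdvark" should sort next to "aardvark", and "co-p" next to "coop".
        Arrays.sort(words, withLengthMark());
        System.out.println(Arrays.toString(words));
    }
}
```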

<P ALIGN=CENTER><HR WIDTH="50%" align="LEFT"></P>

<P>For more information on how collation rules are constructed, see
<A HREF="CollationDetails.html">Details</A>. You can also type in
additional words to see different collation behaviors. Try it out!
</P>

<HR> 
<BR>
<a href="../code/CollateDemo.java">The source.</a>
<p>
<HR>
<BR><a href="http://www.taligent.com/Copyright.html">&copy; Copyright 
1997</a>. All rights reserved. Taligent, Inc., IBM Corp.
</BODY>
</HTML>

