Version 3 (modified by vadim.godunko, 10 years ago)


League --- Localization, Internationalization and Globalization

League provides an unbounded string type covering the full range of Unicode characters, together with Unicode algorithms such as normalization, case conversion, case folding and collation. Elements of a Universal_String can be accessed directly by index or through cursors. Two kinds of cursor are currently available: the character cursor, which iterates over the string character by character, and the grapheme cluster cursor, which iterates over sequences of characters that together form one user-visible character. Several techniques are used to improve performance and minimize memory footprint; see League/Performance for more information.

Strings and Cursors

The type Universal_String is declared in the package League.Strings. It represents a string of Unicode characters. Two operations, To_Universal_String and To_Wide_Wide_String, convert between Universal_String and Wide_Wide_String. A set of overloaded "&" operators concatenates a Universal_String with a Universal_String, Universal_Character, Wide_Wide_String or Wide_Wide_Character and returns a Universal_String. See the following example:

with Ada.Wide_Wide_Text_IO; use Ada.Wide_Wide_Text_IO;

with League.Strings;        use League.Strings;

procedure Example is
   S : Universal_String;

begin
   S := To_Universal_String ("Hello");
   S := S & ',';                        --  append a Wide_Wide_Character
   S := S & " world";                   --  append a Wide_Wide_String

   Put_Line (S.To_Wide_Wide_String);
end Example;


An ordinary loop can be used to iterate over a string by index:

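with League.Characters; use League.Characters;
with League.Strings;    use League.Strings;

procedure Iterator (S : Universal_String) is
   C : Universal_Character;

begin
   for J in 1 .. S.Length loop
      C := S.Element (J);  --  character at index J
   end loop;
end Iterator;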

In addition, two kinds of cursor are available:

  • Character_Cursor --- iterates over each character in the string;
  • Grapheme_Cluster_Cursor --- iterates over the grapheme clusters in the string (a grapheme cluster is a sequence of characters that the user perceives as a single character).

Iteration with a cursor:

procedure Iterator (S : Universal_String) is
   C : Universal_String;
   J : Grapheme_Cluster_Cursor := First (S);

begin
   while J.Has_Element loop
      C := J.Element;  --  current grapheme cluster
      J.Next;          --  advance the cursor, otherwise the loop never ends
   end loop;
end Iterator;

Equivalence and comparison

Several kinds of string equivalence are important in practice. The "default" equivalence and comparison operations are based on the binary order of code points.


Collation is also a form of comparison, but its result matches the order users expect. Collation can additionally be tailored by the current locale. Comparison using collation is done in two steps: sort keys are constructed for the strings, and then the sort keys are compared. The following example shows the difference between binary comparison and collation:

with Ada.Wide_Wide_Text_IO; use Ada.Wide_Wide_Text_IO;

with League.Strings;        use League.Strings;

procedure Collation is
   S1 : Universal_String := To_Universal_String ("ёж");
   S2 : Universal_String := To_Universal_String ("ель");
   K1 : Sort_Key := S1.Collation;
   K2 : Sort_Key := S2.Collation;

begin
   --  Binary comparison: ё (U+0451) follows е (U+0435) in code point order.

   if S1 < S2 then
      Put_Line ("Binary comparison: less");

   elsif S1 = S2 then
      Put_Line ("Binary comparison: equal");

   else
      Put_Line ("Binary comparison: greater");
   end if;

   --  Collation: compare the sort keys instead of the strings.

   if K1 < K2 then
      Put_Line ("Collation: less");

   elsif K1 = K2 then
      Put_Line ("Collation: equal");

   else
      Put_Line ("Collation: greater");
   end if;
end Collation;

Case conversions and case folding

The To_Lowercase and To_Uppercase functions convert the case of all characters in a string, and the To_Casefold function performs case folding.
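
A minimal sketch of how these three functions might be called is given here. It assumes that each of them is a primitive operation of Universal_String that takes no further arguments and returns a new Universal_String, in the same style as To_Wide_Wide_String in the examples above; that calling convention is an assumption and is not confirmed on this page. The procedure name Case_Example is chosen only for illustration.

```ada
with Ada.Wide_Wide_Text_IO; use Ada.Wide_Wide_Text_IO;

with League.Strings;        use League.Strings;

procedure Case_Example is
   S : constant Universal_String := To_Universal_String ("Hello, World");

begin
   --  Assumption: each function returns a new Universal_String and
   --  leaves S unchanged.

   Put_Line (S.To_Uppercase.To_Wide_Wide_String);
   Put_Line (S.To_Lowercase.To_Wide_Wide_String);

   --  Case folding is intended for caseless matching rather than for
   --  display, so folded strings are normally compared, not printed.

   if S.To_Casefold = To_Universal_String ("HELLO, WORLD").To_Casefold then
      Put_Line ("Caseless match");
   end if;
end Case_Example;
```

Comparing the case-folded forms rather than the lowercased forms is the usual choice for caseless matching, because case folding is defined for that purpose and is independent of locale.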
