STRINGS EDIT
version 1.7
by Dmitry A. Kazakov
(mailbox@dmitry-kazakov.de)
This library is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
As a special exception, if other files instantiate generics from this unit, or you link this unit with other files to produce an executable, this unit does not by itself cause the resulting executable to be covered by the GNU General Public License. This exception does not however invalidate any other reasons why the executable file might be covered by the GNU Public License.
Download strings_1_7.tgz (tar + gzip; Windows users may use WinZip)
The package Strings_Edit provides I/O facilities. The following I/O items are supported by the package:
The major differences to the standard Image/Value attributes and Text_IO procedures are:
The current version was tested with the GNAT compiler. See also the changes log.
Get procedures are used to scan strings. The first two parameters are always Source and Pointer. Source is the string to be scanned. Pointer indicates the current position. After successful completion it is advanced to the first string position following the recognized item. The value of Pointer shall be in the range Source'First..Source'Last+1. The Layout_Error exception is propagated when this check fails. The third parameter usually accepts the value. The following example shows how to use get procedures:
   package Edit_Float is new Float_Edit (Float);
   use Edit_Float;
   . . .
   Line        : String (1..512);   -- A line to parse
   Pointer     : Integer;
   Value       : Float;
   TabAndSpace : Ada.Strings.Maps.Character_Set :=
                    To_Set (" " & Ada.Characters.Latin_1.HT);
begin
   . . .
   Pointer := Line'First;
   Get (Line, Pointer, TabAndSpace);   -- Skip tabs and spaces
   Get (Line, Pointer, Value);         -- Get number
   Get (Line, Pointer, TabAndSpace);   -- Skip tabs and spaces
   . . .
The numeric get procedures have additional parameters controlling the range of the input value. The parameters First and Last define the range of the expected value. The exception Constraint_Error is propagated when the value is not in the range. The exception can be suppressed using the parameters ToFirst and ToLast, which cause the input value to be substituted by the corresponding margin when the parameter is True.
The numeric get procedures may have the parameter Base of the subtype NumberBase. The parameter defines the base of the expected number (2..16). Note that the base specification may not appear in the input.
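As a rough illustration of the range parameters, the sketch below reads a number that exceeds the expected range and relies on ToLast to substitute the upper bound. It assumes the library sources are visible to the compiler and uses the instance Strings_Edit.Integers; the procedure name Get_Range_Demo and the sample input are made up for the example.

```ada
with Ada.Text_IO;
with Strings_Edit.Integers;

procedure Get_Range_Demo is
   use Strings_Edit.Integers;
   Line    : constant String := "250 rest";   -- hypothetical sample input
   Pointer : Integer := Line'First;
   Item    : Integer;
begin
   --  Expect a value in 0..100.  ToLast => True asks Get to substitute
   --  Last for a larger input instead of propagating Constraint_Error.
   Get
   (  Source  => Line,
      Pointer => Pointer,
      Value   => Item,
      First   => 0,
      Last    => 100,
      ToLast  => True
   );
   Ada.Text_IO.Put_Line ("Value  :" & Integer'Image (Item));
   Ada.Text_IO.Put_Line ("Pointer:" & Integer'Image (Pointer));
end Get_Range_Demo;
```

With ToLast left at its default False, the same call would be expected to propagate Constraint_Error, as described above.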
Each get procedure returning some value has a corresponding function Value. The function Value has the same parameter profile, except that the parameter Pointer is absent and the value is returned as the function result. Unlike Get, the function Value tolerates spaces and tabs around the converted value. The whole string should be matched; otherwise, the exception Data_Error is propagated.
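The following sketch shows how Value might be used, again through the instance Strings_Edit.Integers. The procedure name and the inputs are illustrative only. The handler prints whatever exception is propagated rather than naming it, so the example makes no assumption about where the exceptions are declared.

```ada
with Ada.Exceptions;
with Ada.Text_IO;
with Strings_Edit.Integers;

procedure Value_Demo is
   use Strings_Edit.Integers;
begin
   --  Surrounding blanks are tolerated by Value
   Ada.Text_IO.Put_Line (Integer'Image (Value ("  42  ")));
   --  The base is passed as a parameter, it does not appear in the input
   Ada.Text_IO.Put_Line (Integer'Image (Value ("101", Base => 2)));
   --  Trailing text is not matched, so an exception is expected here
   begin
      Ada.Text_IO.Put_Line (Integer'Image (Value ("42 apples")));
   exception
      when Error : others =>
         Ada.Text_IO.Put_Line
         (  "Rejected: " & Ada.Exceptions.Exception_Name (Error)
         );
   end;
end Value_Demo;
```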
Put procedures place something into the output string Destination. The string is written starting from Destination (Pointer). The parameter Field defines the output size. When it has the value zero, then the output size is defined by the output item. Otherwise the output is justified within the field and the parameter Justify specifies output alignment and the parameter Fill gives the pad character. When Field is greater than Destination'Last - Pointer + 1, the latter is used instead. After successful completion Pointer is advanced to the first character following the output or to Destination'Last + 1.
The numeric put procedures may have the parameter Base of the subtype NumberBase. The parameter defines the base of the output (2..16). Note that the base specification will not appear in the output.
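A minimal sketch of composing an output line from several Put calls is given below. It assumes the string Put of Strings_Edit and the integer Put of Strings_Edit.Integers as described in this document; the procedure name, the buffer size and the values are arbitrary.

```ada
with Ada.Text_IO;
with Strings_Edit;           use Strings_Edit;
with Strings_Edit.Integers;  use Strings_Edit.Integers;

procedure Put_Demo is
   Line    : String (1..40);           -- output buffer
   Pointer : Integer := Line'First;    -- next free position
begin
   --  Field is zero by default: the output takes its natural size
   Put (Line, Pointer, "Total:");
   --  The number is centered in a field of 6 characters padded with '.'
   Put (Line, Pointer, 42, Field => 6, Justify => Center, Fill => '.');
   --  Only the slice before Pointer has been written
   Ada.Text_IO.Put_Line (Line (Line'First..Pointer - 1));
end Put_Demo;
```

Because each call advances Pointer, no intermediate strings are concatenated and the buffer can be reused by resetting Pointer to Line'First.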
Image functions convert a value into a string. Unlike the standard attribute S'Image, they do not add an extra space character.
The package Strings_Edit provides basic tools for string I/O.
procedure Get
( Source : in String;
Pointer : in out Integer;
Blank : Character := ' '
);
This procedure skips the character Blank starting from Source (Pointer). Pointer is advanced to the first non-Blank character or to Source'Last + 1. The exception Layout_Error is propagated if the value of Pointer is not in the range Source'First..Source'Last + 1.
procedure Get
( Source : in String;
Pointer : in out Integer;
Blanks : Ada.Strings.Maps.Character_Set
);
This procedure skips all the characters of the set Blanks starting from Source (Pointer). Pointer is advanced to the first non-blank character or to Source'Last + 1. The exception Layout_Error is propagated if the value of Pointer is not in the range Source'First..Source'Last + 1.
procedure Put
( Destination : in out String;
Pointer : in out Integer;
Value : Character;
Field : Natural := 0;
Justify : Alignment := Left;
Fill : Character := ' '
);
This procedure places the character specified by the parameter Value into the output string Destination. The string is written starting from Destination (Pointer). The exception Layout_Error is propagated if the value of Pointer is not in Destination'Range or there is no room for the output.
procedure Put
( Destination : in out String;
Pointer : in out Integer;
Value : String;
Field : Natural := 0;
Justify : Alignment := Left;
Fill : Character := ' '
);
This procedure places the string specified by the parameter Value into the output string Destination. The string is written starting from Destination (Pointer). The exception Layout_Error is propagated if the value of Pointer is not in Destination'Range or there is no room for the output.
The child package Strings_Edit.Quoted provides functions for handling quoted strings. A quoted string is put in quotation marks, while each quotation mark within the string is doubled. This allows the original string to be restored unambiguously from its quoted form.
function Get_Quoted
( Source : String;
Pointer : access Integer;
Mark : Character := '"'
) return String;
This function gets a quoted string. Source (Pointer.all) is the first character of the string. Pointer is advanced to the first character following the input. Note that it is an access to Integer rather than a plain Integer, because Ada 95 functions cannot have in out parameters. The parameter Mark specifies the quotation mark to use. Within the body of a quoted text this character is doubled. The result is the original quoted text with the quotation marks around it removed. The quotation marks within the text are halved. The exception Data_Error is propagated when the string at Pointer.all does not contain a Mark character or else when no closing Mark character appears before the string end. The exception Layout_Error is propagated if the value of Pointer.all is not in the range Source'First..Source'Last + 1.
procedure Put_Quoted
( Destination : in out String;
Pointer : in out Integer;
Text : String;
Mark : Character := '"';
Field : Natural := 0;
Justify : Alignment := Left;
Fill : Character := ' '
);
This procedure puts Text in Mark quotes and places the result into Destination starting from the position indicated by Pointer. Pointer is advanced to the first character following the output. Mark characters are doubled within the string body. The exception Layout_Error is propagated if there is no room for output or Pointer is not in Destination'First..Destination'Last + 1.
function Quote
( Text : String;
Mark : Character := '"'
) return String;
This function returns Text quoted using the Mark character.
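The round trip between Quote and Get_Quoted can be sketched as follows. The example only uses the profiles given above; the procedure and object names are hypothetical. Note the aliased pointer variable, which is needed because Get_Quoted takes an access parameter.

```ada
with Ada.Text_IO;
with Strings_Edit.Quoted;

procedure Quote_Demo is
   use Strings_Edit.Quoted;
   --  The text say "hi", written as an Ada string literal
   Original : constant String := "say ""hi""";
   --  Put in quotation marks, with the inner marks doubled
   Wrapped  : constant String := Quote (Original);
   Pointer  : aliased Integer := Wrapped'First;
   --  Marks removed and inner marks halved again
   Restored : constant String := Get_Quoted (Wrapped, Pointer'Access);
begin
   Ada.Text_IO.Put_Line (Wrapped);
   Ada.Text_IO.Put_Line (Restored);
   --  Check the round trip
   Ada.Text_IO.Put_Line (Boolean'Image (Restored = Original));
end Quote_Demo;
```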
The child package Roman_Edit provides I/O routines for roman numbers. The type Roman is defined there as follows:
type Roman is range 1..3999;
The following subroutines are declared for the type:
procedure Get
( Source : in String;
Pointer : in out Integer;
Value : out Roman;
First : Roman := Roman'First;
Last : Roman := Roman'Last;
ToFirst : Boolean := False;
ToLast : Boolean := False
);
This procedure gets a roman number from the string Source. The process starts from Source (Pointer). The exception Constraint_Error is propagated if the number is not in the range First..Last. Data_Error indicates a syntax error in the number. End_Error is raised when no number was detected. Layout_Error is propagated when Pointer is not in the range Source'First .. Source'Last + 1. See also description of get procedures.
function Value
( Source : String;
First : Roman := Roman'First;
Last : Roman := Roman'Last;
ToFirst : Boolean := False;
ToLast : Boolean := False
) return Roman;
This function gets the roman number from the string Source. The number can be surrounded by spaces and tabs. The whole string Source should be matched. Otherwise the exception Data_Error is propagated. Also Data_Error indicates a syntax error in the number. The exception Constraint_Error is propagated if the number is not in the range First..Last. End_Error is raised when no number was detected.
procedure Put
( Destination : in out String;
Pointer : in out Integer;
Value : Roman;
LowerCase : Boolean := False;
Field : Natural := 0;
Justify : Alignment := Left;
Fill : Character := ' '
);
This procedure places the number specified by the parameter Value into the output string Destination. The string is written starting from Destination (Pointer). The parameter LowerCase determines whether upper or lower case letters should be used. The exception Layout_Error is propagated when Pointer is not in Destination'Range or there is no room for the output.
function Image
( Value : Roman;
LowerCase : Boolean := False
) return String;
This function converts Value to string. The parameter LowerCase indicates whether upper or lower case letters shall be used.
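A short usage sketch for roman numbers follows. It assumes the package is the child Strings_Edit.Roman_Edit, as listed in the packages table; the procedure name and the sample value are illustrative.

```ada
with Ada.Text_IO;
with Strings_Edit.Roman_Edit;

procedure Roman_Demo is
   use Strings_Edit.Roman_Edit;
   Year : constant Roman := 1994;   -- arbitrary sample value
begin
   Ada.Text_IO.Put_Line (Image (Year));
   Ada.Text_IO.Put_Line (Image (Year, LowerCase => True));
   --  Value tolerates surrounding blanks, so this round trip
   --  is expected to print TRUE
   Ada.Text_IO.Put_Line
   (  Boolean'Image (Value (" " & Image (Year) & " ") = Year)
   );
end Roman_Demo;
```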
The package Strings_Edit has a generic child package Integer_Edit:
generic
type Number is range <>;
package Strings_Edit.Integer_Edit is ...
It is parameterized by an integer type. There is also the package Strings_Edit.Integers, which is an instantiation of Integer_Edit with the type Integer as the parameter. The generic package has the following subprograms:
procedure Get
( Source : in String;
Pointer : in out Integer;
Value : out Number'Base;
Base : NumberBase := 10;
First : Number'Base := Number'First;
Last : Number'Base := Number'Last;
ToFirst : Boolean := False;
ToLast : Boolean := False
);
This procedure gets an integer number from the string Source. The process starts from Source (Pointer). The parameter Base indicates the base of the expected number. The exception Constraint_Error is propagated if the number is not in the range First..Last. Data_Error indicates a syntax error in the number. End_Error is raised when no number was detected. Layout_Error is propagated when Pointer is not in the range Source'First .. Source'Last + 1. See also description of get procedures.
function Value
( Source : String;
Base : NumberBase := 10;
First : Number'Base := Number'First;
Last : Number'Base := Number'Last;
ToFirst : Boolean := False;
ToLast : Boolean := False
) return Number'Base;
This function gets an integer number from the string Source. The number can be surrounded by spaces and tabs. The whole string Source should be matched. Otherwise the exception Data_Error is propagated. Also Data_Error indicates a syntax error in the number. The exception Constraint_Error is propagated if the number is not in the range First..Last. End_Error is raised when no number was detected.
procedure Put
( Destination : in out String;
Pointer : in out Integer;
Value : Number'Base;
Base : NumberBase := 10;
PutPlus : Boolean := False;
Field : Natural := 0;
Justify : Alignment := Left;
Fill : Character := ' '
);
This procedure places the number specified by the parameter Value into the output string Destination. The string is written starting from Destination (Pointer). The parameter Base indicates the number base used for the output. The base itself does not appear in the output. The parameter PutPlus indicates whether the plus sign should be placed if the number is positive. The exception Layout_Error is propagated when Pointer is not in Destination'Range or there is no room for the output. For example the code:
   Text    : String (1..20) := (others => '#');
   Pointer : Positive := Text'First;
   . . .
   Put (Text, Pointer, 5, 2, True, 10, Center, '@');
will set Pointer to 11 and overwrite the first 10 characters of the string Text:
   @@@+101@@@##########
function Image
( Value : Number'Base;
Base : NumberBase := 10;
PutPlus : Boolean := False
) return String;
This function converts Value to string. The parameter Base indicates the number base used for the output. The base itself does not appear in the output. The parameter PutPlus indicates whether the plus sign should be placed if the number is positive.
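The difference to the standard attribute can be seen in a small sketch such as the one below, which assumes the instance Strings_Edit.Integers; the procedure name is made up.

```ada
with Ada.Text_IO;
with Strings_Edit.Integers;

procedure Image_Demo is
   use Strings_Edit.Integers;
begin
   --  The standard attribute reserves a position for the sign: "[ 42]"
   Ada.Text_IO.Put_Line ("[" & Integer'Image (42) & "]");
   --  The library function adds no extra space
   Ada.Text_IO.Put_Line ("[" & Image (42) & "]");
   --  Binary output with an explicit plus sign, the same value as in
   --  the Put example above
   Ada.Text_IO.Put_Line (Image (5, Base => 2, PutPlus => True));
end Image_Demo;
```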
The package Strings_Edit.Integers is an instance of Strings_Edit.Integer_Edit with the type Integer as the parameter.
The package Strings_Edit has a generic child package Float_Edit:
generic
type Number is digits <>;
package Strings_Edit.Float_Edit is ...
The package is parameterized by a floating-point type. There is also the package Strings_Edit.Floats, which is an instantiation of Float_Edit with the type Float as the parameter. The package defines the following subprograms:
procedure Get
( Source : in String;
Pointer : in out Integer;
Value : out Number'Base;
Base : NumberBase := 10;
First : Number'Base := Number'First;
Last : Number'Base := Number'Last;
ToFirst : Boolean := False;
ToLast : Boolean := False
);
This procedure gets a number from the string Source. The process starts from Source (Pointer). The number in the string may be in either floating-point or fixed-point format. The point may be absent. The mantissa can have base 2..16 (defined by the parameter Base). The exponent part, if present, is introduced by 'e' or 'E'. The exponent is always written in decimal and denotes a power of Base. Space characters are allowed between the mantissa and the exponent part as well as in the exponent part around the exponent sign. If Base has the value 15 or 16, the exponent part shall be separated from the mantissa by at least one space character, because E is then a valid digit. The exception Constraint_Error is propagated if the number is not in the range First..Last. Data_Error indicates a syntax error in the number. End_Error is raised when no number was detected. Layout_Error is propagated when Pointer is not in the range Source'First .. Source'Last + 1. See also description of get procedures.
function Value
( Source : String;
Base : NumberBase := 10;
First : Number'Base := Number'First;
Last : Number'Base := Number'Last;
ToFirst : Boolean := False;
ToLast : Boolean := False
) return Number'Base;
This function gets a floating-point number from the string Source. The number can be surrounded by spaces and tabs. The whole string Source should be matched. Otherwise the exception Data_Error is propagated. Also Data_Error indicates a syntax error in the number. The exception Constraint_Error is propagated if the number is not in the range First..Last. End_Error is raised when no number was detected.
procedure Put
( Destination : in out String;
Pointer : in out Integer;
Value : Number'Base;
Base : NumberBase := 10;
PutPlus : Boolean := False;
RelSmall : Positive := MaxSmall;
AbsSmall : Integer := -MaxSmall;
Field : Natural := 0;
Justify : Alignment := Left;
Fill : Character := ' '
);
This procedure places the number specified by the parameter Value into the output string Destination. The string is written starting from Destination (Pointer). The parameter Base indicates the number base used for the output. Base itself does not appear in the output. The exponent part (if used) is always decimal. PutPlus indicates whether the plus sign should be placed if the number is positive. There are two ways to specify the output precision:
Of the two parameters RelSmall and AbsSmall, the procedure chooses the one that specifies the minimal number of mantissa digits, but no more than the machine representation of the number allows. If the point would appear in the rightmost position, it is omitted. Pure zero is always represented as 0. If the desired number of digits can be provided in the fixed-point format, then the exponent part is not used. For example, 1.234567e-04 gives 0.0001234567 because the fixed- and floating-point formats have the same length, but 1.234567e-05 will be shown in the floating-point format. For bases 15 and 16 the exponent part is separated from the mantissa by a space to avoid ambiguity: F.Ee+2 could otherwise be read as either F.EE + 2 or F.E * 16**2. The exception Layout_Error is propagated when Pointer is not in Destination'Range or there is no room for the output.
function Image
   (  Value    : Number'Base;
      Base     : NumberBase := 10;
      PutPlus  : Boolean    := False;
      RelSmall : Positive   := MaxSmall;
      AbsSmall : Integer    := -MaxSmall
   )  return String;
This function converts the parameter Value to String. The parameter Base indicates the number base used for the output. Base itself does not appear in the output. The exponent part (if used) is always decimal. PutPlus indicates whether the plus sign should be placed if the number is positive. For precision parameters see Put.
The package Strings_Edit.Floats is an instance of Strings_Edit.Float_Edit with Float as the parameter.
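A hedged usage sketch for Strings_Edit.Floats is shown below. It only exercises the parameters documented above and deliberately makes no claim about the exact digits printed, since these depend on the precision parameters and on the machine representation of Float. The procedure name is made up.

```ada
with Ada.Text_IO;
with Strings_Edit.Floats;

procedure Float_Demo is
   use Strings_Edit.Floats;
   --  Mantissa 1.5 with the decimal exponent 2
   X : constant Float := Value ("1.5e2");
begin
   Ada.Text_IO.Put_Line (Image (X));                    -- default precision
   Ada.Text_IO.Put_Line (Image (X, PutPlus => True));   -- with a plus sign
   Ada.Text_IO.Put_Line (Image (X, Base => 2));         -- binary mantissa
end Float_Demo;
```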
The package Strings_Edit.UTF8 is the parent package for dealing with strings encoded in the Unicode Transformation Format UTF-8. Ada 95 supports Latin-1 (type Character) and UCS-2 (Wide_Character) of ISO 10646 BMP. At the same time many applications and libraries use UTF-8 rather than UCS-2. Because UTF-8 was designed for backward compatibility with 7-bit ASCII applications and is a multi-byte encoding format, I chose not to introduce a separate string type for UTF-8. Conventional Ada strings are used instead. It is important to note:
The package defines the type UTF8_Code_Point that represents the Unicode codespace:
type Code_Point is mod 2**32;
subtype UTF8_Code_Point is Code_Point range 0..16#10FFFF#;
The following subroutines are provided by the package:
procedure Get
( Source : String;
Pointer : in out Integer;
Value : out UTF8_Code_Point
);
This procedure decodes one UTF-8 code point from the string Source. It starts at Source (Pointer). After successful completion Pointer is advanced to the first character following the input. The result is returned through the parameter Value.
Exception | Cause |
Data_Error | Illegal UTF-8 string Source |
End_Error | Nothing found. Pointer = Source'Last + 1 |
Layout_Error | Pointer is not in Source'First..Source'Last + 1 |
function Image (Value : UTF8_Code_Point) return String;
This function is a simplified version of the procedure Put. It returns UTF-8 encoded Value.
function Length (Source : String) return Natural;
This function returns the length of a UTF-8 encoded string in code points. Data_Error is propagated when Source is not a valid UTF-8 string.
procedure Put
( Destination : in out String;
Pointer : in out Integer;
Value : UTF8_Code_Point
);
This procedure puts one UTF-8 code point into the string Destination starting from the position Destination (Pointer). Pointer is then advanced to the first character following the output. Layout_Error is propagated when Pointer is not in Destination'Range or there is no room for output. Note that the parameters Field, Justify and Fill, usual for other Put procedures, would have no meaning here.
procedure Skip
( Source : String;
Pointer : in out Integer;
Count : Natural := 1
);
This procedure skips Count UTF-8 encoded code points in the string Source starting from Source (Pointer). After successful completion Pointer indicates the first character following the skipped UTF-8 encoded sequence.
Exception | Cause |
Data_Error | Illegal UTF-8 string Source |
End_Error | Fewer than Count code points detected before the string end |
Layout_Error | Pointer is not in Source'First..Source'Last + 1 |
function Value (Source : String) return UTF8_Code_Point;
This function decodes one UTF-8 code point stored in Source. The whole string Source should be matched. Otherwise the exception Data_Error is propagated. It is also propagated when Source is not a legal UTF-8 string.
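Walking over a UTF-8 encoded string code point by code point could look like the sketch below. It relies only on Get and Length as declared above; the procedure name and the sample text are hypothetical. The sample is built from explicit octets so that the source file stays plain ASCII.

```ada
with Ada.Text_IO;
with Strings_Edit.UTF8;

procedure UTF8_Demo is
   use Strings_Edit.UTF8;
   --  U+00E9 (e with acute) as the octets C3 A9, followed by ASCII 'A'
   Text    : constant String :=
                Character'Val (16#C3#) & Character'Val (16#A9#) & 'A';
   Pointer : Integer := Text'First;
   Code    : UTF8_Code_Point;
begin
   --  Three octets, but two code points
   Ada.Text_IO.Put_Line (Natural'Image (Length (Text)));
   while Pointer <= Text'Last loop
      Get (Text, Pointer, Code);   -- advances Pointer past one code point
      Ada.Text_IO.Put_Line (UTF8_Code_Point'Image (Code));
   end loop;
end UTF8_Demo;
```

The loop condition tests Pointer against Text'Last, so End_Error is never provoked; Data_Error would still be propagated for an ill-formed sequence.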
The package Strings_Edit.UTF8.Handling provides the following conversion functions between UTF-8 encoded strings and Ada strings:
function To_String (Value : String) return String;
function To_String
( Value : String;
Substitute : Character
) return String;
These functions convert a UTF-8 encoded string to a Latin-1 character string (standard Ada String). The parameter Substitute specifies the character that substitutes non-Latin-1 code points in Value. If it is omitted, Constraint_Error is propagated when a non-Latin-1 code point appears in Value.
Exception | Cause |
Constraint_Error | Non-Latin-1 code point detected |
Data_Error | Illegal UTF-8 string Value |
function To_UTF8 (Value : Character ) return String;
function To_UTF8 (Value : String ) return String;
function To_UTF8 (Value : Wide_Character) return String;
function To_UTF8 (Value : Wide_String ) return String;
These functions convert the parameter Value to a UTF-8 encoded string. The parameter can be Character, String, Wide_Character or Wide_String. The result of a character conversion can be from 1 to 3 bytes long. Note that Ada's Character has Latin-1 encoding which differs from UTF-8 in the code positions greater than 127.
function To_Wide_String (Value : String) return Wide_String;
function To_Wide_String
( Value : String;
Substitute : Wide_Character
) return Wide_String;
These functions convert a UTF-8 encoded string to a UCS-2 character string (Ada's Wide_String). The parameter Substitute specifies the character that substitutes non-UCS-2 code points in Value. If it is omitted, Constraint_Error is propagated when a non-UCS-2 code point appears in Value.
Exception | Cause |
Constraint_Error | Non-UCS-2 code point detected |
Data_Error | Illegal UTF-8 string Value |
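The conversions between Latin-1 and UTF-8 can be tried out with a sketch like the following. Only the functions To_UTF8 and To_String described in this section are used; the procedure name and the sample word are illustrative.

```ada
with Ada.Text_IO;
with Strings_Edit.UTF8.Handling;

procedure Handling_Demo is
   use Strings_Edit.UTF8.Handling;
   --  The word "naive" with i-diaeresis in Latin-1: that letter is the
   --  single octet 16#EF#
   Latin_1 : constant String := "na" & Character'Val (16#EF#) & "ve";
   Encoded : constant String := To_UTF8 (Latin_1);
begin
   --  Code positions above 127 take two octets in UTF-8, so the
   --  encoded string is one octet longer
   Ada.Text_IO.Put_Line (Integer'Image (Latin_1'Length));
   Ada.Text_IO.Put_Line (Integer'Image (Encoded'Length));
   --  Converting back is expected to restore the Latin-1 string
   Ada.Text_IO.Put_Line (Boolean'Image (To_String (Encoded) = Latin_1));
   --  The variant with Substitute never raises Constraint_Error
   Ada.Text_IO.Put_Line (To_String (Encoded, Substitute => '?'));
end Handling_Demo;
```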
The package Strings_Edit.UTF8.Integer_Edit provides integer I/O for special encodings of digits, such as subscript and superscript.
generic
type Number is range <>;
with procedure Get_Digit
( Source : String;
Pointer : in out Integer;
Digit : out Natural
) is <>;
with procedure Get_Sign
( Source : String;
Pointer : in out Integer;
Sign_Of : out Sign
) is <>;
with procedure Put_Digit
( Destination : in out String;
Pointer : in out Integer;
Digit : Script_Digit
) is <>;
with procedure Put_Sign
( Destination : in out String;
Pointer : in out Integer;
Sign_Of : Sign
) is <>;
package Strings_Edit.UTF8.Integer_Edit is
...
The generic parameters of the package are:
The package provides the following procedures and functions:
procedure Get
   (  Source  : in String;
      Pointer : in out Integer;
      Value   : out Number'Base;
      Base    : Script_Base := 10;
      First   : Number'Base := Number'First;
      Last    : Number'Base := Number'Last;
      ToFirst : Boolean     := False;
      ToLast  : Boolean     := False
   );
function Value
   (  Source  : String;
      Base    : Script_Base := 10;
      First   : Number'Base := Number'First;
      Last    : Number'Base := Number'Last;
      ToFirst : Boolean     := False;
      ToLast  : Boolean     := False
   )  return Number'Base;
procedure Put
   (  Destination : in out String;
      Pointer     : in out Integer;
      Value       : Number'Base;
      Base        : Script_Base := 10;
      PutPlus     : Boolean     := False
   );
function Image
   (  Value   : Number'Base;
      Base    : Script_Base := 10;
      PutPlus : Boolean     := False
   )  return String;
These subroutines work exactly as those of Strings_Edit.Integer_Edit, with the difference that the number base is specified by a parameter of the type Script_Base, defined in Strings_Edit.UTF8 as an integer type with the range 2..10.
The generic package Strings_Edit.UTF8.Subscript.Integer_Edit is a specialization of Strings_Edit.UTF8.Integer_Edit for integer I/O of subscript numbers.
generic
type Number is range <>;
package Strings_Edit.UTF8.Subscript.Integer_Edit is
...
The package provides the subroutines described in Strings_Edit.UTF8.Integer_Edit.
This package has a non-generic instantiation with the type Integer: Strings_Edit.Integers.Subscript.
The generic package Strings_Edit.UTF8.Superscript.Integer_Edit is a specialization of Strings_Edit.UTF8.Integer_Edit for integer I/O of superscript numbers.
generic
type Number is range <>;
package Strings_Edit.UTF8.Superscript.Integer_Edit is
...
The package provides the subroutines described in Strings_Edit.UTF8.Integer_Edit.
This package has a non-generic instantiation with the type Integer: Strings_Edit.Integers.Superscript.
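The non-generic instantiations might be used as in the sketch below. It assumes the package names Strings_Edit.Integers.Subscript and Strings_Edit.Integers.Superscript given above and the Image profile of Strings_Edit.UTF8.Integer_Edit; the procedure name is made up. The output is UTF-8 encoded, so a UTF-8 capable terminal is needed to see the digits rendered properly.

```ada
with Ada.Text_IO;
with Strings_Edit.Integers.Subscript;
with Strings_Edit.Integers.Superscript;

procedure Script_Demo is
   package Sub renames Strings_Edit.Integers.Subscript;
   package Sup renames Strings_Edit.Integers.Superscript;
begin
   --  Intended to print x with the subscript 1 and the superscript 2.
   --  The result is a plain String holding UTF-8 octets.
   Ada.Text_IO.Put_Line ("x" & Sub.Image (1) & Sup.Image (2));
   --  Value accepts the same encoding, so the round trip should hold
   Ada.Text_IO.Put_Line
   (  Boolean'Image (Sup.Value (Sup.Image (123)) = 123)
   );
end Script_Demo;
```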
The package Strings_Edit.Fields can be used to write new Put-procedures, when the output size cannot be easily estimated. It contains two subprograms Get_Output_Field and Adjust_Output_Field. Get_Output_Field is used to calculate the available space in the output string. It raises Layout_Error exception as necessary. The program can then output into that space and call Adjust_Output_Field to move the output within the output field, fill and advance the string pointer. The following code fragment shows how it could be made:
procedure Put
   (  Destination : in out String;
      Pointer     : in out Integer;
      Value       : Something;
      Field       : Natural   := 0;
      Justify     : Alignment := Left;
      Fill        : Character := ' '
   )  is
   Out_Field : constant Natural :=
      Get_Output_Field (Destination, Pointer, Field);
   subtype Output is String (Pointer..Pointer + Out_Field - 1);
   Text  : Output renames
      Destination (Pointer..Pointer + Out_Field - 1);
   Index : Integer := Pointer;
begin
   --
   -- The output for Value is done in Text using Index as the pointer
   --
   Adjust_Output_Field
   (  Destination,
      Pointer,
      Index,
      Out_Field,
      Field,
      Justify,
      Fill
   );
end Put;
Package | Provides
Strings_Edit | The basic string I/O
Strings_Edit.Fields | Tools for writing new Put-procedures
Strings_Edit.Float_Edit | Generic I/O of floating-point numbers
Strings_Edit.Floats | I/O of standard Float (instantiation of Float_Edit)
Strings_Edit.Integer_Edit | Generic I/O of integer numbers
Strings_Edit.Integers | I/O of standard Integer (instantiation of Integer_Edit)
Strings_Edit.Integers.Subscript | I/O of standard Integer using UTF-8 subscript characters
Strings_Edit.Integers.Superscript | I/O of standard Integer using UTF-8 superscript characters
Strings_Edit.Quoted | I/O of strings put in Ada-style quotes
Strings_Edit.Roman_Edit | I/O of roman numbers
Strings_Edit.UTF8 | The base UTF-8 package. UTF-8 string length, skipping UTF-8 encoded characters
Strings_Edit.UTF8.Handling | Conversions of UTF-8 encoded strings to and from standard Ada strings
Strings_Edit.UTF8.Integer_Edit | Generic I/O of integer numbers using UTF-8 characters different from standard ASCII digits
Strings_Edit.UTF8.Subscript | Dealing with UTF-8 subscript characters
Strings_Edit.UTF8.Subscript.Integer_Edit | Generic I/O of integer numbers using UTF-8 subscript characters
Strings_Edit.UTF8.Superscript | Dealing with UTF-8 superscript characters
Strings_Edit.UTF8.Superscript.Integer_Edit | Generic I/O of integer numbers using UTF-8 superscript characters
Changes to the version 1.6:
Changes to the version 1.5:
Changes to the version 1.4:
Changes to the version 1.3:
Changes to the version 1.2:
Changes to the version 1.1:
Changes to the version 1.0:
1. Input from String
1.1. Get procedures
1.2. Value functions
2. Output into String
2.1. Put procedures
2.2. Image functions
3. String I/O
3.1. Quoted strings
4. Roman I/O
5. Integer I/O
6. Floating-point I/O
7. UTF-8
7.1. Handling UTF-8 strings
7.2. Generic integer I/O of UTF-8 strings
7.3. Subscript UTF-8 integer I/O
7.4. Superscript UTF-8 integer I/O
8. Fields
9. Packages
10. Changes log
11. Table of contents