Maps

Rationale for Ada 2005: 6a Containers

ENG

3. Maps

@ We will now turn to the maps and sets packages. We will start by considering maps which are more exciting than sets and begin with ordered maps which are a little simpler and then consider hashed maps.

@ Remember that a map is just a means of getting from a value of one type (the key) to another type (the element). This is not a one-one relationship. Given a key there is a unique element (if any), but several keys may correspond to the same element. A simple example is an array. This is a map from the index type to the component type. Thus if we have

  1        S : String := "animal";

@ then this provides a map from integers in the range 1 to 6 to some values of the type Character.

@ Given an integer such as 3 there is a unique character 'i' but given a character such as 'a' there might be several corresponding integers (in this case both 1 and 5).

@ More interesting examples are where the set of used key values is quite sparse. For example we might have a store where various spare parts are held. The parts have a five-digit part number and there are perhaps twenty racks where they are held identified by a letter. However, only a handful of the five digit numbers are in use so it would be very wasteful to use an array with the part number as index. What we want instead is a container which holds just the pairs that matter such as (34618, 'F'), (27134, 'C') and so on. We can do this using a map. We usually refer to the pairs of values as nodes of the map.

@ There are two maps packages with much in common. One keeps the keys in order and the other uses a hash function. Here is the specification of the ordered maps package generally showing just those facilities common to both.

  1        generic
  2                type Key_Type is private;
  3                type Element_Type is private;
  4                with function "<" (Left, Right : Key_Type) return Boolean is <>;
  5                with function "=" (Left, Right : Element_Type) return Boolean is <>;
  6        package Ada.Containers.Ordered_Maps is
  7                pragma Preelaborate (Ordered_Maps);
  8                function Equivalent_Keys (Left : Right : Key_Type) return Boolean;

@ The generic parameters include the ordering relationship "<" on the keys and equality for the elements.

@ It is assumed that the ordering relationship is well behaved in the sense that if x < y is true then y < x is false. We say that two keys x and y are equivalent if both x < y and y < x are false. In other words this defines an equivalence class on keys. The relationship must also be transitive, that is, if x < y and y < z are both true then x < z must also be true.

@ This concept of an equivalence relationship occurs throughout the various maps and sets.

@ Sometimes, as here, it is defined in terms of an order but in other cases, as we shall see, it is defined by an equivalence function.

@ It is absolutely vital that the equivalence relations are defined properly and meet the above requirements. It is not possible for the container packages to check this and if the operations are wrong then peculiar behaviour is almost inevitable.

@ For the convenience of the user the function Equivalent_Keys is declared explicitly. It is equivalent to

  1        function Equivalent_Keys (Left, Right : Key_Type) return Boolean is
  2        begin
  3                return not (Left < Right) and not (Right < Left);
  4        end Equivalent_Keys;

@ The equality operation on elements is not so demanding. It must be symmetric so that x = y and y = x are the same but transitivity is not required (although cases where it would not automatically be transitive are likely to be rare). The operation is only used for the function "=" on the containers as a whole.

@ Note that Find and similar operations for maps and sets work in terms of the equivalence relationship rather than equality as was the case with lists and vectors.

  1        type Map is tagged private;
  2        pragma Preelaborable_Initialization (Map);
  3        type Cursor is private;
  4        pragma Preelaborable_Initialization (Cursor);
  5        Empty_Map : constant Map;
  6        No_Element : constant Cursor;

@ The types Map and Cursor and constants Empty_Map and No_Element are similar to the corresponding entities in the lists and vectors containers.

  1        function "=" (Left, Right : Map) return Boolean;
  2        function Length (Container : Map) return Count_Type;
  3        function Is_Empty (Container : Map) return Boolean;
  4        procedure Clear (Container : in out Map);

@ These are again similar to the corresponding entities for lists. Note that two maps are said to be equal if they have the same number of nodes with equivalent keys (as defined by "<") whose corresponding elements are equal (as defined by "=").

  1        function Key (Position : Cursor) return Key_Type;
  2        function Element (Position : Cursor) return Element_Type;
  3        procedure Replace_Element (Container : in out Map;
  4                Position : in Cursor;
  5                New_Item : in Element_Type);
  6        procedure Query_Element (Position : in Cursor;
  7                Process : not null access procedure (Key : in Key_Type; Element : in Element_Type));
  8        procedure Update_Element (Container : in out Map; Position : in Cursor;
  9                Process : not null access procedure (Key : in Key_Type; Element : in out Element_Type));

@ In this case there is a function Key as well as a function Element. But there is no procedure Replace_Key since it would not make sense to change a key without changing the element as well and this really comes down to deleting the whole node and then inserting a new one.

@ The procedures Query_Element and Update_Element are slightly different in that the procedure Process also takes the key as parameter as well as the element to be read or updated. Note again that the key cannot be changed. Nevertheless the value of the key is given since it might be useful in deciding how the update should be performed. Remember that we cannot get uniquely from an element to a key but only from a key to an element.

  1        procedure Move (Target, Source : in out Map);

@ This moves the map from the source to the target after first clearing the target. It does not make copies of the nodes so that after the operation the source is empty and Length (Source) is zero.

  1        procedure Insert (Container : in out Map;
  2                Key : in Key_Type;
  3                New_Item : in Element_Type;
  4                Position : out Cursor;
  5                Inserted : out Boolean);
  6        procedure Insert (Container : in out Map;
  7                Key : in Key_Type;
  8                Position : out Cursor;
  9                Inserted : out Boolean);
 10        procedure Insert (Container : in out Map;
 11                Key : in Key_Type;
 12                New_Item : in Element_Type);

@ These insert a new node into the map unless a node with an equivalent key already exists. If it does exist then the first two return with Inserted set to False and Position indicating the node whereas the third raises Constraint_Error (the element value is not changed). If a node with equivalent key is not found then a new node is created with the given key, the element value is set to New_Item when that is given and otherwise it takes its default value (if any), and Position is set when given.

@ Unlike vectors and lists, we do not have to say where the new node is to be inserted because of course this is an ordered map and it just goes in the correct place according to the order given by the generic parameter "<".

  1        procedure Include (Container : in out Map;
  2                Key : in Key_Type;
  3                New_Item : in Element_Type);

@ This is somewhat like the last Insert except that if an existing node with an equivalent key is found then it is replaced (rather than raising Constraint_Error). Note that both the key and the element are updated. This is because equivalent keys might not be totally equal.

@ For example the key part might be a record with part number and year of introduction, thus

  1        type Part_Key is
  2                record
  3                        Part_Number : Integer;
  4                        Year : Integer;
  5                end record;

@ and we might define the ordering relationship to be used as the generic parameter simply in terms of the part number

  1        function "<" (Left, Right : Part_Key) return Boolean is
  2        begin
  3                return Left.Part_Number < Right.Part_Number;
  4        end "<";

@ In this situation, the keys could match without the year component being the same and so it would need to be updated. In other words with this definition of the ordering relation, two keys are equivalent provided just the part numbers are the same.

  1        procedure Replace (Container : in out Map;
  2                Key : in Key_Type;
  3                New_Item : in Element_Type);

@ In this case, Constraint_Error is raised if the node does not already exist. On replacement both the key and the element are updated as for Include.

@ Perhaps a better example of equivalent keys not being totally equal is if the key were a string. We might decide that the case of letter did not need to match in the test for equivalence but nevertheless we would probably want to update with the string as used in the parameter of Replace.

  1        procedure Exclude (Container : in out Map; Key : in Key_Type);

@ If there is a node with an equivalent key then it is deleted. If there is not then nothing happens.

  1        procedure Delete (Container : in out Map; Key : in Key_Type);
  2        procedure Delete (Container : in out Map; Position : in out Cursor);

@ These delete a node. In the first case if there is no such equivalent key then Constraint_Error is raised (by contrast to Exclude which remains silent in this case). In the second case if the cursor is No_Element then again Constraint_Error is raised – there is also a check to ensure that the cursor otherwise does designate a node in the correct map (remember that cursors designate both an entity and the container); if this check fails then Program_Error is raised.

@ Perhaps it is worth observing that Insert, Include, Replace, Exclude and Delete form a sort of progression from an operation that will insert something, through operations that might insert, will neither insert nor delete, might delete, to the final operation that will delete something. Note also that Include, Replace and Exclude do not apply to lists and vectors.

  1        function First (Container : Map) return Cursor;
  2        function Last (Container : Map) return Cursor;
  3        function Next (Position : Cursor) return Cursor;
  4        procedure Next (Position : in out Cursor);
  5        function Find (Container : Map; Key : Key_Type) return Cursor;
  6        function Element (Container : Map; Key : Key_Type) return Element;
  7        function Contains (Container : Map; Key : Key_Type) return Boolean;

@ These should be self-evident. Unlike the operations on vectors and lists, Find logically searches the whole map and not just starting at some point (and since it searches the whole map there is no point in having Reverse_Find). (In implementation terms it won't actually search the whole map because it will be structured in a way that makes this unnecessary – as a balanced tree perhaps.) Moreover, Find uses the equivalence relation based on the "<" parameter so in the example it only has to match the part number and not the year. The function call Element (My_Map, My_Key) is equivalent to Element (Find (My_Map, My_Key)).

  1        function Has_Element (Position : Cursor) return Boolean;
  2        procedure Iterate (Container : in Map;
  3        Process : not null access procedure (Position : in Cursor));

@ These are also as for other containers.

@ And at last we have

  1        private
  2                ... -- not specified by the language
  3        end Ada.Containers.Ordered_Maps;

@ We have omitted to mention quite a few operations that have no equivalent in hashed maps – we will come back to these in a moment.

@ As an example we can make a container to hold the information concerning spare parts. We can use the type Part_Key and the function "<" as above. We can suppose that the element type is

  1        type Stock_Info is
  2                record
  3                        Shelf : Character range 'A' .. 'T';
  4                        Stock : Integer;
  5                end record;

@ This gives both the shelf letter and the number in stock.

@ We can then declare the container thus

  1        package Store_Maps is
  2                new Ordered_Maps (Key_Type => Part_Key,
  3                        Element_Type => Stock_Info, "<" => "<");
  4        The_Store : Store_Maps.Map;

@ The last parameter could be omitted because the formal has a <> default.

@ We can now add items to our store by calling

  1        The_Store.Insert ((34618, 1998), ('F', 25));
  2        The_Store.Insert ((27134, 2004), ('C', 45));
  3        ...

@ We might now have a procedure which, given a part number, checks to see if it exists and that the stock is not zero, and if so returns the shelf letter and year number and decrements the stock count.

  1        procedure Request (Part : in Integer; OK : out Boolean;
  2                Year : out Integer; Shelf : out Character) is
  3                C : Cursor;
  4                K : Part_Key;
  5                E : Stock_Info;
  6        begin
  7                C := The_Store.Find ((Part, 0));
  8                if C = No_Element then
  9                        OK := False; return; -- no such key
 10                end if;
 11                E := Element (C); K := Key (C);
 12                Year := K.Year; Shelf := E.Shelf;
 13                if E.Stock = 0 then
 14                        OK := False; return; -- out of stock
 15                end if;
 16                Replace_Element (C, (Shelf, E.Stock–1));
 17                OK := True;
 18        end Request;

@ Note that we had to put a dummy year number in the call of Find. We could of course use the new <> notation for this

  1        C := The_Store.Find ((Part, others => <>));

@ The reader can improve this example at leisure – by using Update_Element for example.

@ As another example suppose we wish to check all through the stock looking for parts whose stock is low, perhaps less than some given parameter. We can use Iterate for this as follows

  1        procedure Check_Stock (Low : in Integer) is
  2                procedure Check_It (C : in Cursor) is
  3                begin
  4                        if Element (C).Stock < Low then
  5                                -- print a message perhaps
  6                                Put ("Low stock of part ");
  7                                Put_Line (Key (C).Part_Number);
  8                        end if;
  9                end Check_It;
 10        begin
 11                The_Store.Iterate (Check_It'Access);
 12        end Check_Stock;

@ Note that this uses a so-called downward closure. The procedure Check_It has to be declared locally to Check_Stock in order to access the parameter Low. (Well you could declare it outside and copy the parameter Low to a global variable but that is just the sort of wicked thing one has to do in lesser languages (such as even Ada 95). It is not task safe for one thing.) Another approach is to use First and Next and so on thus

  1        procedure Check_Stock (Low : in Integer) is
  2                C : Cursor := The_Store.First;
  3        begin
  4                loop
  5                        exit when C = No_Element;
  6                        if Element (C).Stock < Low then
  7                                -- print a message perhaps
  8                                Put ("Low stock of part ");
  9                                Put_Line (Key (C).Part_Number);
 10                        end if;
 11                        C := The_Store.Next (C);
 12                end loop;
 13        end Check_Stock;

@ We will now consider hashed maps. The trouble with ordered maps in general is that searching can be slow when the map has many entries. Techniques such as a binary tree can be used but even so the search time will increase at least as the logarithm of the number of entries. A better approach is to use a hash function. This will be familiar to many readers (especially those who have written compilers). The general idea is as follows.

@ We define a function which takes a key and returns some value in a given range. In the case of the Ada containers it has to return a value of the modular type Hash_Type which is declared in the root package Ada.Containers. We could then convert this value onto a range representing an index into an array whose size corresponds to the capacity of the map. This index value is the preferred place to store the entry. If there already is an entry at this place (because some other key has hashed to the same value) then a number of approaches are possible. One way is to create a list of entries with the same index value (often called buckets); another way is simply to put it in the next available slot.

@ The details don't matter. But the overall effect is that provided the map is not too full and the hash function is good then we can find an entry almost immediately more or less irrespective of the size of the map.

@ So as users all we have to do is to define a suitable hash function. It should give a good spread of values across the range of Hash_Type for the population of keys, it should avoid clustering and above all for a given key it must always return the same hash value. A good discussion on hash functions by Knuth will be found in [1].

@ Defining good hash functions needs care. In the case of the part numbers we might multiply the part number by some obscure prime number and then truncate the result down to the modular type Hash_Type. The author hesitates to give an example but perhaps

  1        function Part_Hash (P : Part_Key) return Hash_Type is
  2                M31 : constant := 2**31–1; -- a nice Mersenne prime
  3        begin
  4                return Hash_Type (P.Part_Number) * M31;
  5        end Part_Hash;

@ On reflection that's probably a very bad prime to use because it is so close to half of 2**32 a typical value of Hash_Type'Last+1. Of course it doesn't have to be prime but simply relatively prime to it such as 5**13. Knuth suggests dividing the range by the golden number t = (v5+1)/2 = 1.618... and then taking the nearest number relatively prime which is in fact simply the nearest odd number (in this case it is 2654435769).

@ Here is a historic interlude. Marin Mersenne (1588-1648) was a Franciscan monk who lived in Paris.

@ He studied numbers of the form Mp = 2p – 1 where p is prime. A lot of these are themselves prime.

@ Mersenne gave a list of those upto 257 which he said were prime (namely 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257). It was not until 1947 that it was finally settled that he got some of them wrong (61, 89, and 107 are also prime but 67 and 257 are not). At the time of writing there are 42 known Mersenne primes and the largest which is also the largest known prime number is M25964951 – see www.mersenne.org.

@ The specification of the hashed maps package is very similar to that for ordered maps. It starts

  1        generic
  2                type Key_Type is private;
  3                type Element_Type is private;
  4                with function Hash (Key : Key_Type) return Hash_Type;
  5                with function Equivalent_Keys (Left, Right : Key_Type) return Boolean;
  6                with function "=" (Left, Right : Element_Type) return Boolean is <>;
  7        package Ada.Containers.Hashed_Maps is
  8                pragma Preelaborate (Hashed_Maps);

@ The differences from the ordered maps package are that there is an extra generic parameter Hash giving the hash function and the ordering parameter "<" has been replaced by the function Equivalent_Keys. It is this function that defines the equivalence relationship for hashed maps; it is important that Equivalent_Keys (X, Y) is always the same as Equivalent_Keys (Y, X). Moreover if X and Y are equivalent and Y and Z are equivalent then X and Z must also be equivalent.

@ Note that the function Equivalent_Keys in the ordered maps package discussed above corresponds to the formal generic parameter of the same name in this hashed maps package. This should make it easier to convert between the two forms of packages.

@ Returning to our example, if we now write

  1        function Equivalent_Parts (Left, Right : Part_Key) return Boolean is
  2        begin
  3                return Left.Part_Number = Right.Part_Number;
  4        end Equivalent_Parts;

@ then we can instantiate the hashed maps package as follows

  1        package Store_Maps is
  2                new Hashed_Maps (Key_Type => Part_Key,
  3                        Element_Type => Stock_Info,
  4                        Hash => Part_Hash,
  5                        Equivalent_Keys => Equivalent_Parts);
  6        The_Store : Store_Maps.Map;

@ and then the rest of our example will be exactly as before. It is thus easy to convert from an ordered map to a hashed map and vice versa provided of course that we only use the facilities common to both.

@ We will finish this discussion of maps by briefly considering the additional facilities in the two packages.

@ The ordered maps package has the following additional subprograms

  1        procedure Delete_First (Container : in out Map);
  2        procedure Delete_Last (Container : in out Map);
  3        function First_Element (Container : Map) return Element_Type;
  4        function First_Key (Container : Map) return Key_Type;
  5        function Last_Element (Container : Map) return Element_Type;
  6        function Last_Key (Container : Map) return Key_Type;
  7        function Previous (Position : Cursor) return Cursor;
  8        procedure Previous (Position : in out Cursor);
  9        function Floor (Container : Map; Key : Key_Type) return Cursor;
 10        function Ceiling (Container : Map; Key : Key_Type) return Cursor;
 11        function "<" (Left, Right : Cursor) return Boolean;
 12        function ">" (Left, Right : Cursor) return Boolean;
 13        function "<" (Left : Cursor; Right : Key_Type) return Boolean;
 14        function ">" (Left : Cursor; Right : Key_Type) return Boolean;
 15        function "<" (Left : Key_Type; Right : Cursor) return Boolean;
 16        function ">" (Left : Key_Type; Right : Cursor) return Boolean;
 17        procedure Reverse_Iterate (Container : in Map;
 18        Process : not null access procedure (Position : in Cursor));

@ These are again largely self-evident. The functions Floor and Ceiling are interesting. Floor searches for the last node whose key is not greater than Key and similarly Ceiling searches for the first node whose key is not less than Key – they return No_Element if there is no such element. The subprograms Previous are of course the opposite of Next and Reverse_Iterate is like Iterate only backwards.

@ The functions "<" and ">" are mostly for convenience. Thus the first is equivalent to

  1        function "<" (Left, Right : Cursor) return Boolean is
  2        begin
  3                return Key (Left) < Key (Right);
  4        end "<";

@ Clearly these additional operations must be avoided if we wish to retain the option of converting to a hashed map later.

@ Hashed maps have a very important facility not in ordered maps which is the ability to specify a capacity as for the vectors package. (Underneath their skin the hashed maps are a bit like vectors whereas the ordered maps are a bit like lists.) Thus we have

  1        procedure Reserve_Capacity (Container : in out Map; Capacity : in Count_Type);
  2        function Capacity (Container : Map) return Count_Type;

@ The behaviour is much as for vectors. We don't have to set the capacity ourselves since it will be automatically extended as necessary but it might significantly improve performance to do so. In the case of maps, increasing the capacity requires the hashing to be redone which could be quite time consuming, so if we know that our map is going to be a big one, it is a good idea to set an appropriate capacity right from the beginning. Note again that Length (M) cannot exceed Capacity (M) but might be much less.

@ The other additional subprograms for hashed maps are

  1        function Equivalent_Keys (Left, Right : Cursor) return Boolean;
  2        function Equivalent_Keys (Left : Cursor; Right : Key_Type) return Boolean;
  3        function Equivalent_Keys (Left : Key_Type; Right : Cursor) return Boolean;

@ These (like the additional "<" and ">" for ordered maps) are again mostly for convenience. The first is equivalent to

  1        function Equivalent_Keys (Left, Right : Cursor) return Boolean is
  2        begin
  3                return Equivalent_Keys (Key (Left), Key (Right));
  4        end Equivalent_Keys;

@ Before moving on to sets it should be noticed that there are also some useful functions in the string packages. The main one is

  1        with Ada.Containers;
  2        function Ada.Strings.Hash (Key : String) return Containers.Hash_Type;
  3        pragma Pure (Ada.Strings.Hash);

@ There is a similar function Ada.Strings.Unbounded.Hash where the parameter Key has type Unbounded_String. It simply converts the parameter to the type String and then calls Ada.Strings.Hash. There is also a generic function for bounded strings which again calls the basic function Ada.Strings.Hash. For completeness the function Ada.Strings.Fixed.Hash is a renaming of Ada.Strings.Hash.

@ These are provided because it is often the case that the key is a string and they save the user from devising good hash functions for strings which might cause a nasty headache.

@ We could for example save ourselves the worry of defining a good hash function in the above example by making the part number into a 5-character string. So we might write

  1        function Part_Hash (P : Part_Key) return Hash_Type is
  2        begin
  3                return Ada.Strings.Hash (P.Part_Number);
  4        end Part_Hash;

@ and if this doesn't work well then we can blame the vendor.

Rationale for Ada 2005: 6a Containers

ENGRUSTOPBACKNEXT

3. Карты

@ Теперь обратимся к пакетам карт и пакетам наборов. Мы начнём с рассмотрения карт, которые являются более объемлющими чем наборы и начнём с упорядоченных карт, которые несколько проще и затем рассмотрим хэш карты.

@ Запомните, что карта - только средство получения из значения одного типа (ключа) значения другого типа (элемента). Это не взаимно-однозначное отношение. На каждый ключ имеется свой уникальный элемент (если таковой имеется), но несколько ключей могут соответствовать одному и тому же элементу. Простой пример - массив. Это - карта от индексного типа до составляющего типа. Таким образом, если мы имеем:

  1        S : String := "animal";

@ то это обеспечивает нам карту от целых чисел в диапазоне 1 .. 6 к небольшому количеству значений типа Character.

@ Здесь целому числу 3 соответствует уникальный символ 'i', но некоторому символу 'a' может соответствовать несколько целых чисел (в данном случае 1 и 5).

@ Более интересным примером является случай когда набор используемых ключевых значений весьма разрежен. Пусть, у нас имеется склад запчастей. Каждая запчасть идентифицируется пятизначный кододом. Запчасти хранятся в стойках обозначаемых буквенным литералом. Если запчастей не очень много, то только часть пятизначных чисел реально используются, и поэтому было бы слишком расточительно использовать для этого массив размером 10^5. Вместо этого мы можем использовать контейнер, который содержал бы только пары следующего вида (34618, 'F'), (27134, 'C') и так далее. Мы можем сделать это используя карту. Мы обычно именуем пару значений узлами карты.

@ Есть два пакета карт которые почти не отличаются друг от друга. Первый использует упорядоченные ключи, а другой использует хеш-функцию. Вот спецификация пакета упорядоченных карт где показаны только средства свойственные им обоим.

  1        generic
  2                type Key_Type is private;
  3                type Element_Type is private;
  4                with function "<" (Left, Right : Key_Type) return Boolean is <>;
  5                with function "=" (Left, Right : Element_Type) return Boolean is <>;
  6        package Ada.Containers.Ordered_Maps is
  7                pragma Preelaborate (Ordered_Maps);
  8                function Equivalent_Keys (Left : Right : Key_Type) return Boolean;

@ Настраиваемые параметры включают отношение упорядочения "<" ключей и отношение эквивалентности "=" для элементов.

@ Предполагается, что отношение упорядочения правильное если X < Y является истиной а Y < X - ложь. Мы говорим, что два ключа X и Y эквивалентны если и X < Y и Y < X являются ложью. Другими словами, это определяет эквивалентный класс ключей. Отношения должны также быть транзитивны, то есть, если X < Y и Y < Z истинны, тогда X < Z также должно быть истинно.

@ Это понятие эквивалентных отношений принято для любых карт и наборов.

@ Иногда (как здесь) оно определено в терминах упорядочения, в других случаях, как мы увидим, оно определено функцией эквивалентности.

@ Жизненно важно чтобы отношения эквивалентности были определены должным образом и отвечали вышеупомянутым требованиям. Для контейнерных пакетов это не возможно проверить и если операции являются неправильными тогда, специфическое поведение почти неизбежно.

@ Для удобства пользователя функция Equivalent_Keys объявлена явно. Она выполняет примерно следующее:

  1        function Equivalent_Keys (Left, Right : Key_Type) return Boolean is
  2        begin
  3                return not (Left < Right) and not (Right < Left);
  4        end Equivalent_Keys;

@ Операция равенства для элементов не настолько требовательна. Она должна быть симметрична так, чтобы X = Y и Y = X имели один и тот же результат, но транзитивность здесь не требуется (за редким исключением). Операция только используется для функции "=" на контейнерах в целом.

@ Отметим, что Find и подобные операции для карт и наборов работают в терминах эквивалентных отношений, а не равенства как это имело место со списками и векторами.

  1        type Map is tagged private;
  2        pragma Preelaborable_Initialization (Map);
  3        type Cursor is private;
  4        pragma Preelaborable_Initialization (Cursor);
  5        Empty_Map : constant Map;
  6        No_Element : constant Cursor;

@ Типы Map и Cursor и константы Empty_Map и No_Element подобны соответствующим объектам в списковых и векторных контейнерах.

  1        function "=" (Left, Right : Map) return Boolean;
  2        function Length (Container : Map) return Count_Type;
  3        function Is_Empty (Container : Map) return Boolean;
  4        procedure Clear (Container : in out Map);

@ Они также подобны соответствующим объектам для списков. Отметим, что две карты, как говорится, равны, если у них есть то же самое число узлов с эквивалентными ключами (как определено "<"), чьи соответствующие элементы равны (как определено "=").

  1        function Key (Position : Cursor) return Key_Type;
  2        function Element (Position : Cursor) return Element_Type;
  3        procedure Replace_Element
  4        (       Container : in out Map;
  5                Position  : in Cursor;
  6                New_Item  : in Element_Type);
  7        procedure Query_Element
  8        (       Position : in Cursor;
  9                Process  : not null access procedure
 10                (       Key     : in Key_Type;
 11                        Element : in Element_Type));
 12        procedure Update_Element
 13        (       Container : in out Map;
 14                Position  : in Cursor;
 15                Process   : not null access procedure
 16                (       Key     : in Key_Type;
 17                        Element : in out Element_Type));

@ В этом случае есть функция Key такая же как и функция Element. Но нет никакой процедуры Replace_Key, так как не имеет смысла изменять ключ, не изменяя также и элемент, и это в действительности сводится к удалению старого узла и затем вставке нового.

@ Процедуры Query_Element и Update_Element немного различаются в том что процедура Process также берет ключ и елемент в качестве параметра, который будет читаться или обновляться. Отметим также, что ключ не может быть изменён. Однако значение ключа задаётся, так как это может быть полезно в случае если обновление должно быть выполнено. Напомним, что мы не можем добраться непосредственно от элемента к ключу, это возможно только от ключа к элементу.

  1        procedure Move (Target, Source : in out Map);

@ Эта процедура перемещает карту из Target в Source после первоначальной очистки Target. При этом после выполнения операции Source очищается, и его Length (Source) становился равным нулю.

  1        procedure Insert
  2        (       Container : in out Map;
  3                Key       : in Key_Type;
  4                New_Item  : in Element_Type;
  5                Position  : out Cursor;
  6                Inserted  : out Boolean);
  7        procedure Insert
  8        (       Container : in out Map;
  9                Key       : in Key_Type;
 10                Position  : out Cursor;
 11                Inserted  : out Boolean);
 12        procedure Insert
 13        (       Container : in out Map;
 14                Key       : in Key_Type;
 15                New_Item  : in Element_Type);

@ Эти процедуры вставляют новый узел в карту, но только если узел с таким ключом отсутствует в карте. Если узел с таким ключом уже существует, тогда первые две процедуры возвращают его позицию в поле Position, a поле Inserted устанавливается в False, тогда как третя процедура поднимает исключение Constraint_Error (при этом значение элемента не изменяется). Если узел с эквивалентным ключом не найден тогда создаётся новый узел с данным ключом, значение элемента устанавливается в New_Item, когда он задан, в противном случае он берет свое значение по умолчанию (если оно имеется), и поле Position установливается когда оно дано.

@ В отличие от векторов и списков, мы не должны указывать, где новый узел должен быть вставлен потому что это упорядоченная карта, и поэтому узел помещается в правильное место согласно порядка установленного настраиваемым параметром "<".

  1        procedure Include
  2        (       Container : in out Map;
  3                Key       : in Key_Type;
  4                New_Item  : in Element_Type);

@ Это несколько походит на последнюю процедуру Insert за исключением того, что если уже имеется узел с эквивалентным ключом тогда он заменяется (а не возбуждается исключение Constraint_Error). Заметим, что и ключ и элемент обновляются т.к. наличие эквивалентных ключей в пределах одной карты не допускается.

@ Например, ключевая часть может быть записью с номером запчасти и годом производства

  1        type Part_Key is
  2                record
  3                        Part_Number : Integer;
  4                        Year        : Integer;
  5                end record;

@ и мы могли бы определить отношение упорядочивания просто в терминах номера запчасти которое будет использоваться как настраиваемый параметр

  1        function "<" (Left, Right : Part_Key) return Boolean is
  2        begin
  3                return Left.Part_Number < Right.Part_Number;
  4        end "<";

@ В этой ситуации ключи будут соответствовать без компоненты Year, и таким образом, это должно быть обновлено. Другими словами, с этим определением отношения упорядочения два ключа эквивалентны только если номера запчастей одинаковы.

  1        procedure Replace
  2        (       Container : in out Map;
  3                Key       : in Key_Type;
  4                New_Item  : in Element_Type);

@ В этом случае будет возбуждено исключение Constraint_Error если такого узла не существует. При замене и ключ и элемент обновляются так же как и для Include.

@ Возможно лучший пример эквивалентных ключей, не являющихся полностью равными, это если бы ключ был бы строкой. Мы могли бы решить, что не все символы должны соответствовать в тесте на эквивалентность, но мы могли бы захотеть обновить строку как используется в параметре Replace.

  1        procedure Exclude (Container : in out Map; Key : in Key_Type);

@ Если имеется узел с эквивалентным ключом, то он удаляется. Если нет, то ничто не происходит.

  1        procedure Delete (Container : in out Map; Key : in Key_Type);
  2        procedure Delete (Container : in out Map; Position : in out Cursor);

@ Эти процедуры удаляют узел. В первом случае, если нет эквивалентного ключа, тогда возбуждается исключение Constraint_Error (в отличие от Exclude). Во втором случае, если курсор - No_Element возбуждается исключение Constraint_Error. Выполняется также проверка, для гарантии того, что курсор определяет узел в правильной карте (помните, что курсоры определяют и объект и контейнер); если эта проверка терпит неудачу тогда возбуждается исключение Program_Error.

@ Обратите внимание, что последовательность Insert, Include, Replace, Exclude и Delete это своего рода прогрессия: вставка со строгими условиями; вставка с менее строгими условиями; строгая замена; менее строгая замена и строгое удаление. Отметим также, что Include, Replace и Exclude не относятся к спискам и векторам.

  1        function First (Container : Map) return Cursor;
  2        function Last (Container : Map) return Cursor;
  3        function Next (Position : Cursor) return Cursor;
  4        procedure Next (Position : in out Cursor);
  5        function Find (Container : Map; Key : Key_Type) return Cursor;
  6        function Element (Container : Map; Key : Key_Type) return Element;
  7        function Contains (Container : Map; Key : Key_Type) return Boolean;

@ Назначение этих процедур самоочевидно. В отличие от операций в векторах и списках, Find осуществляет логический поиск целой карты, а не только начинающийся в некоторой точке (а так как она ищет целую карту то нет никакого смысла в наличии Reverse_Find). (В терминах реализации она не будет фактически искать целую карту, потому что это будет структурировано в пути, который делает это ненужным - как сбалансированное дерево возможно). Кроме того, Find использует отношение эквивалентности основанное на параметре "<" как в предыдущем примере, это только должно соответствовать номеру части а не году. Вызов функции Element (My_Map, My_Key) эквивалентен вызову Element (Find (My_Map, My_Key)).

  1        function Has_Element (Position : Cursor) return Boolean;
  2        procedure Iterate
  3        (       Container : in Map;
  4                Process   : not null access procedure (Position : in Cursor));

@ Они также касаются и других контейнеров.

@ И в конце мы имеем

  1        private
  2                ... -- not specified by the language
  3        end Ada.Containers.Ordered_Maps;

@ Мы опустили упоминание о довольно большом количестве операций у которых нет никакого эквивалента в хаш картах, мы возвратимся к ним через мгновение.

@ Как пример, мы можем сделать контейнер, содержащий информацию о запасных частях. Мы можем использовать тип Part_Key и функцию "<" рассмотренные выше. Мы можем предположить, что тип элемента будет такой:

  1        type Stock_Info is
  2                record
  3                        Shelf : Character range 'A' .. 'T';
  4                        Stock : Integer;
  5                end record;

@ Это дает и символьный код стойки и числовой код в запчасти.

@ Мы можем тогда объявить контейнер таким образом:

  1        package Store_Maps is new Ordered_Maps
  2        (       Key_Type     => Part_Key,
  3                Element_Type => Stock_Info,
  4                "<"          => "<");
  5        The_Store : Store_Maps.Map;

@ Последний параметр может быть опущен, потому что формально он имеет значение по умолчанию <> .

@ Мы можем теперь добавить элементы к нашей памяти следующим образом:

  1        The_Store.Insert ((34618, 1998), ('F', 25));
  2        The_Store.Insert ((27134, 2004), ('C', 45));
  3        ...

@ Теперь у нас может быть процедура, которая по числовому коду запчасти выясняла бы имеется ли такая запчасть в наличии, и если да, то возвращаля бы литерный код полки и год производства, уменьшая при этом итоготовый счетчик.

  1        procedure Request
  2        (       Part  : in Integer;
  3                Ok    : out Boolean;
  4                Year  : out Integer;
  5                Shelf : out Character)
  6        is
  7                C : Cursor;
  8                K : Part_Key;
  9                E : Stock_Info;
 10        begin
 11                C := The_Store.Find ((Part, 0));
 12                if C = No_Element then
 13                        Ok := False; return; -- no such key
 14                end if;
 15                E := Element (C); K := Key (C);
 16                Year := K.Year; Shelf := E.Shelf;
 17                if E.Stock = 0 then
 18                        Ok := False; return; -- out of stock
 19                end if;
 20                Replace_Element (C, (Shelf, E.Stock–1));
 21                Ok := True;
 22        end Request;

@ Отметим, что мы должны были поместить фиктивное значение года в запрос Find. Мы могли, конечно, использовать новую нотацию <> для этого:

  1        C := The_Store.Find ((Part, others => <>));

@ Читатель может улучшить этот пример на досуге используя Update_Element, например.

@ Как другой пример предположим, что мы желаем проверить все полки, ища запчасти, полка которых низка, возможно меньше чем некоторый данный параметр. Мы можем использовать Iterate для этого:

  1        procedure Check_Stock (Low : in Integer) is
  2                procedure Check_It (C : in Cursor) is
  3                begin
  4                        if Element (C).Stock < Low then
  5                                -- print a message perhaps
  6                                Put ("Low stock of part ");
  7                                Put_Line (Key (C).Part_Number);
  8                        end if;
  9                end Check_It;
 10        begin
 11                The_Store.Iterate (Check_It'Access);
 12        end Check_Stock;

@ Заметим, что сдесь используется так называемое нисходящее замкнутое выражение. Процедура Check_It должна быть объявлена локальной внутри Check_Stock чтобы обратиться к параметру Low. (Конечно, мы могли объявить её и снаружи, а затем скопировать параметр Low для глобальной переменной, но это - дурная практика, которую нужно делать на отстойных языках (в том числе и на Ада 95). ??? Это не сейф задачи с одной стороны ???). Другой подход должен использовать функции First и Next (и так далее...) следующим образом:

  1        procedure Check_Stock (Low : in Integer) is
  2                C : Cursor := The_Store.First;
  3        begin
  4                loop
  5                        exit when C = No_Element;
  6                        if Element (C).Stock < Low then
  7                                -- print a message perhaps
  8                                Put ("Low stock of part ");
  9                                Put_Line (Key (C).Part_Number);
 10                        end if;
 11                        C := The_Store.Next (C);
 12                end loop;
 13        end Check_Stock;

@ Рассмотрим теперь хэш-карты. Неприятность с упорядоченными картами в основном состоит в том, что когда у карты много элементов то поиск в ней может быть очень медленным. Могут использоваться различные методики, например бинарное дерево, но даже в этом случае время поиска увеличится по крайней мере как логарифм от числа элементов. Лучший подход состоит в использовании хеш-функции. Это знакомо многим читателям (особенно тем, кто писал компиляторы). Общее представление следующие.

@ Мы определяем функцию, которая берет ключ и возвращает некоторое значение в заданном диапазоне. В случае Ада контейнеров она должна возвратить значение модульного типа Hash_Type, который объявлен в корневом пакете Ada.Containers. Тогда мы можем преобразовать это значение в диапазон, представляющий индекс массива, размер которого соответствует ёмкости карты. Это индексное значение - привилегированное место, чтобы сохранить вход. Если уже есть вход в этом месте (например, некоторый другой ключ уже хэширован к этому же самому значению), тогда возможны разные варианты. Один из них состоит в том, чтобы создать список входов с одним и тем же индексным значением (часто называемый кластером памяти); иначе должен просто поместить это в следующий доступный слот.

@ Подробности не имеют значения. Но конечный эффект состоит в том, что если карта не слишком полна, и выбранная хеш-функция хороша тогда, мы можем найти элемент почти немедленно более или менее независимо от размера карты.

@ Всё что должен сделать пользователь - это определить подходящую хеш-функцию. Она должна давать хорошее распределение значений на диапазон Hash_Type для используемой совокупности ключей, она должна избегать кластеризации, и прежде всего, для данного ключа она должна всегда возвращать одно и то же значение хеш-функции. Хороший материал по хеш-функциям можно найти в Knuth [1].

@ Определение хороших хеш-функций важная часть работы. В случае числовых кодов запчастей мы могли бы умножить код запчасти на некоторое простое число и затем усечь результат по модулю Hash_Type. Автор смущается давать пример, но возможно:

  1        function Part_Hash (P : Part_Key) return Hash_Type is
  2                M31 : constant := 2**31–1; -- a nice Mersenne prime
  3        begin
  4                return Hash_Type (P.Part_Number) * M31;
  5        end Part_Hash;

@ Это - вероятно очень плохой пример для применения, потому что это близко к половине величины 2 ** 32 для типичного значения Hash_Type'Last+1. Конечно оно не должно быть простым, но относительно простым к этому, такие как 5 ** 13. Knuth предлагает делить диапазон на золотое число t = (v5+1) / 2 = 1.618... и затем брать самое близкое относительно простое число, которое является фактически самым близким нечетным числом (в данном случае, это 2654435769).

@ Вот историческая справка. Marin Mersenne (1588-1648) был монахом, который жил в Париже.

@ Он изучил числа вида Mp = 2p - 1, где p является простым числом. Многие из них сами являются простыми.

@ Mersenne дал список таких чисел вподь до 257 которые были простыми (а именно, 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257). Только в 1947 было обнаружено, что некоторые из них неправильны (61, 89, и 107 являются также простыми, но 67 и 257 - нет). К настоящему времени известно 42 числа Mersenne, наибольшее из которых является также наибольшим известным простым числом - M25964951 - см. www.mersenne.org.

@ Спецификация пакета хэш - карт очень похожа на спецификацию для упорядоченных карт. Она начинается:

  1        generic
  2                type Key_Type is private;
  3                type Element_Type is private;
  4                with function Hash (Key : Key_Type) return Hash_Type;
  5                with function Equivalent_Keys (Left, Right : Key_Type) return Boolean;
  6                with function "=" (Left, Right : Element_Type) return Boolean is <>;
  7        package Ada.Containers.Hashed_Maps is
  8                pragma Preelaborate (Hashed_Maps);

@ В отличие от пакета упорядоченных карт здесь есть дополнительный настраиваемый параметр Hash, задающий хеш-функцию, а также параметр упорядочения "<" здесь заменен функцией Equivalent_Keys. Именно эта функция определяет эквивалентные отношения для хэш - карт; при этом важно, что Equivalent_Keys (X, Y) и Equivalent_Keys (Y, X) равнозначны. Кроме того, если X и Y эквивалентны и Y и Z эквивалентны, тогда X и Z также должны быть эквивалентными.

@ Отметим, что функция Equivalent_Keys в пакете упорядоченных карт, обсужденном выше, соответствует формальному настраиваемому параметру того же самого названия в этом пакете хэш - карт. Это должно облегчить преобразование между двумя формами пакетов.

@ Возвращаясь к нашему примеру, если мы теперь пишем:

  1        function Equivalent_Parts (Left, Right : Part_Key) return Boolean is
  2        begin
  3                return Left.Part_Number = Right.Part_Number;
  4        end Equivalent_Parts;

@ тогда мы можем иллюстрировать пакет хэш - карт следующим образом:

  1        package Store_Maps is new Hashed_Maps
  2        (       Key_Type        => Part_Key,
  3                Element_Type    => Stock_Info,
  4                Hash            => Part_Hash,
  5                Equivalent_Keys => Equivalent_Parts);
  6        The_Store : Store_Maps.Map;

@ и затем остальная часть нашего примера будет точно такая же. Таким образом, мы можем легко преобразовать от упорядоченной карты к хэш - карте и наоборот, что обеспечивается тем что мы используем средства обычные для обоих.

@ Заканчивая обсуждение карт, кратко рассмотрим дополнительные средства этих пакетов.

@ У пакета упорядоченных карт есть следующие дополнительные подпрограммы:

  1        procedure Delete_First (Container : in out Map);
  2        procedure Delete_Last (Container : in out Map);
  3        function First_Element (Container : Map) return Element_Type;
  4        function First_Key (Container : Map) return Key_Type;
  5        function Last_Element (Container : Map) return Element_Type;
  6        function Last_Key (Container : Map) return Key_Type;
  7        function Previous (Position : Cursor) return Cursor;
  8        procedure Previous (Position : in out Cursor);
  9        function Floor (Container : Map; Key : Key_Type) return Cursor;
 10        function Ceiling (Container : Map; Key : Key_Type) return Cursor;
 11        function "<" (Left, Right : Cursor) return Boolean;
 12        function ">" (Left, Right : Cursor) return Boolean;
 13        function "<" (Left : Cursor; Right : Key_Type) return Boolean;
 14        function ">" (Left : Cursor; Right : Key_Type) return Boolean;
 15        function "<" (Left : Key_Type; Right : Cursor) return Boolean;
 16        function ">" (Left : Key_Type; Right : Cursor) return Boolean;
 17        procedure Reverse_Iterate (Container : in Map;
 18        Process : not null access procedure (Position : in Cursor));

@ Их назначение самоочевидно из названия. Здесь функции Floor и Ceiling представляют особый интерес. Функция Floor ищет последний узел, ключ которого не больше чем Key, функция Ceiling ищет первый узел, ключ которого не меньше чем Key, если такой элемент не найден - обе функции возвращают No_Element. Функция Previous является противоположностью Next, а Reverse_Iterate подобна Iterate но выполняется в обратном направлении.

@ Функции "<" и ">" введены, главным образом, для удобства. Таким образом, первая эквивалентна следующему:

  1        function "<" (Left, Right : Cursor) return Boolean is
  2        begin
  3                return Key (Left) < Key (Right);
  4        end "<";

@ Ясно этих дополнительных операций можно было бы избежать, если мы пожелаем сохранить опцию преобразования в хэш - карту и далее.

@ У хэш - карт есть очень важное средство отсутствующее в упорядоченных картах, это возможность определить ёмкость векторного пакета. (В этом смысле хэш - карты немного походят на векторы, тогда как упорядоченные карты больше похожи на списки). Таким образом, мы имеем:

  1        procedure Reserve_Capacity (Container : in out Map; Capacity : in Count_Type);
  2        function Capacity (Container : Map) return Count_Type;

@ ??? Поведение очень что касается векторов ???. Мы вовсе не обязаны устанавливать ёмкость самостоятельно, так как она расширяется автоматически по мере необходимости, но мы можем значительно улучшить производительность, если заранее укажем предполагаемую ёмкость. В случае карт увеличение ёмкости требует чтобы хеширование было восстановлено, которое может быть весьма трудоёмким, поэтому, если мы знаем, что наша карта будет больших размеров, лучше заранее установить соответствующую ёмкость. Отметим, что Length (M) не может превысить Capacity (M), но может быть намного меньше.

@ Другие дополнительные подпрограммы для хэш - карт:

  1        function Equivalent_Keys (Left, Right : Cursor) return Boolean;
  2        function Equivalent_Keys (Left : Cursor; Right : Key_Type) return Boolean;
  3        function Equivalent_Keys (Left : Key_Type; Right : Cursor) return Boolean;

@ Они (как и дополнительное "<" и ">" для упорядоченных карт) опять главным образом для удобства. Первое эквивалентно:

  1        function Equivalent_Keys (Left, Right : Cursor) return Boolean is
  2        begin
  3                return Equivalent_Keys (Key (Left), Key (Right));
  4        end Equivalent_Keys;

@ Перед переходом к наборам нужно заметить, что есть также некоторые полезные функции в строковых пакетах. Главная из которых:

  1        with Ada.Containers;
  2        function Ada.Strings.Hash (Key : String) return Containers.Hash_Type;
  3        pragma Pure (Ada.Strings.Hash);

@ Есть подобная функция Ada.Strings.Unbounded.Hash где у параметра Key есть тип Unbounded_String. Она просто преобразовывает параметр в тип String и затем вызывает Ada.Strings.Hash. Есть также настраиваемая функция для ограниченных строк, которая снова вызывает основную функцию Ada.Strings.Hash. Для законченности функция Ada.Strings.Fixed.Hash - переименована из Ada.Strings.Hash.

@ Они предоставлены, потому что часто имеет место, что ключ - строка, и они оберегают пользователя от изобретения собственных хеш-функций для строк, которые могли бы вызвать противную головную боль.

@ Мы могли например сохранить самостоятельно беспокойство определения хорошей хеш-функции в вышеупомянутом примере, превращая числовой код зачасти в 5-символьную строку. Таким образом, мы могли бы написать

  1        function Part_Hash (P : Part_Key) return Hash_Type is
  2        begin
  3                return Ada.Strings.Hash (P.Part_Number);
  4        end Part_Hash;

@ и если это не будет работать хорошо тогда, мы можем обвинить в этом разработчика пакета.

ENG RUS

TOP BACK NEXT

2010-10-24 00:26:57

. .