Ada Slicing with Strings

I'm a long-time C++ programmer learning Ada for fun. If any of the following is bad form, please feel free to point it out. I'm trying to learn the Ada way to do things, but old habits are hard to break (and I miss Boost!)

I'm trying to load a file that contains an integer, a space, and then a string of characters. There may be a better way to do this, but I thought that I ought to load the line into a string buffer that I know won't be more than 80 characters. I declare a buffer variable like the following in the appropriate place:

 Line_Buffer : String(1..80);

After opening the file, I loop through each line and split the buffer at the space character:

 while not Ada.Text_IO.End_Of_File(File_Handle) loop
   Ada.Text_IO.Get_Line(File_Handle, Item=>Line_Buffer, Last=>Last);
   -- Break line at space to get match id and entry
   for String_Index in Line_Buffer'Range loop
     if Line_Buffer(String_Index) = ' ' then
       Add_Entry(Root_Link=>Root_Node,
        ID_String=> Line_Buffer(1..String_Index-1),
        Entry_String=> Line_Buffer(String_Index+1..Last-1)
        );
     end if;
   end loop;
 end loop;

What happens in Add_Entry is not that important, but its specification looks like this:

 procedure Add_Entry(
   Root_Link : in out Link;
   ID_String : in String;
   Entry_String : in String);

I wanted to use unbounded strings rather than bounded strings because I don't want to worry about having to specify size here and there. This compiles and works fine, but inside Add_Entry, when I try to loop over each character in Entry_String, rather than having indexes starting from 1, they start from the offset in the original string. For example, if Line_Buffer was "14 silicon", if I loop as follows, the index goes from 4 to 10.

for Index in Entry_String'Range loop
  Ada.Text_IO.Put("Index: " & Integer'Image(Index));
  Ada.Text_IO.New_Line;  
end loop;

Is there a better way to do this parsing so that the strings I pass to Add_Entry have boundaries that begin with 1? Also, when I pass a sliced string as an "in" parameter to a procedure, is a copy created on the stack, or is a reference to the original string used?


First off, my sympathies. Ada strings are probably the single most different thing between C++ and Ada. What makes it worse is that the differences are under the surface, so naive C/C++ coders start their Ada careers thinking they maybe aren't there, and they can treat Ada strings like they do C strings. Now for your specific question:

Ada arrays (including strings) all have implicit boundaries passed around with them. This means there is generally never a need for a special sentinel value (like nul), and rarely a need for separate length variables. It also means there is nothing special about 1 or 0 or any other index.

So the proper way to deal with arrays in Ada is not to assume, inside a subroutine, what your starting and ending bounds are. You figure them out. The language provides 'First, 'Last, and 'Range specifically for that purpose. From your example, if you wanted to print the offset from the start of the given string (for some weird reason), it would be:

for Index in Entry_String'Range loop
  Ada.Text_IO.Put("Index offset: " & Integer'Image(Index - Entry_String'First));
  Ada.Text_IO.New_Line;  
end loop;
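
If you really do want a string whose indexes start at 1, one option is to copy the parameter into a constrained object, which "slides" the bounds. The following is only a sketch: the procedure names Slice_Bounds_Demo and Show are made up for illustration, and note that this approach does create a copy of the string.

```ada
with Ada.Text_IO;

procedure Slice_Bounds_Demo is
   Line_Buffer : constant String := "14 silicon";  --  bounds are 1 .. 10

   --  Hypothetical stand-in for Add_Entry, used only to show the bounds.
   procedure Show (Entry_String : in String) is
      --  Initializing a constrained String from the parameter slides
      --  its bounds to 1 .. Entry_String'Length (this makes a copy).
      Rebased : constant String (1 .. Entry_String'Length) := Entry_String;
   begin
      Ada.Text_IO.Put_Line
        ("Original bounds:" & Integer'Image (Entry_String'First)
         & " .." & Integer'Image (Entry_String'Last));
      Ada.Text_IO.Put_Line
        ("Rebased bounds:" & Integer'Image (Rebased'First)
         & " .." & Integer'Image (Rebased'Last));
   end Show;
begin
   --  The slice keeps the indexes it had in Line_Buffer (4 .. 10).
   Show (Line_Buffer (4 .. Line_Buffer'Last));
end Slice_Bounds_Demo;
```

In most code it is simpler to skip the copy and just iterate with 'Range, as above.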

OK. Now for difference two between Ada and C. Your in parameter is not copied. This one is very important, so I will shout a bit: Ada parameters are not passed like C parameters! The exact rules are a bit complicated, but for your purposes the principle is that Ada will do the sensible thing. If the parameter can fit in a register, it will be passed by copy (or perhaps in a register). If the parameter is too big for that, it will be passed by reference. You don't get to decide this. It is an issue of optimization, and it will be handled by the compiler. But you can count on your compiler not creating copies of large arrays just to pass them into a routine that isn't allowed to modify them anyway. That would be stoopid. Only a total idiot (or a C++ compiler) would do such a thing. If you ever find an Ada compiler doing it, report it as a bug. It would be.

Lastly, in most cases creative use of Ada's scoping rules will allow you to use perfectly-sized constant "fixed" strings. You should almost never need to use dynamic strings or separate length variables. Sadly, Ada.Text_IO.Get_Line is one of the exceptions. If you don't care too much about performance (which you shouldn't if you are reading this string from the user), you can use Carlisle's routine to read in a perfectly-sized fixed string from Text_IO.
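
As a sketch of the "perfectly-sized constant string" idea: assuming an Ada 2005 or later compiler, Ada.Text_IO also has a function form of Get_Line that returns a String, which can initialize a constant in a declare block. The file name entries_demo.txt and the procedure name are made up for this example; the file is written first only so the sketch is self-contained.

```ada
with Ada.Text_IO;

procedure Read_Entries_Demo is
   use Ada.Text_IO;
   File_Name   : constant String := "entries_demo.txt";  --  hypothetical scratch file
   File_Handle : File_Type;
begin
   --  Write a small sample file so there is something to read back.
   Create (File_Handle, Out_File, File_Name);
   Put_Line (File_Handle, "14 silicon");
   Put_Line (File_Handle, "6 carbon");
   Close (File_Handle);

   Open (File_Handle, In_File, File_Name);
   while not End_Of_File (File_Handle) loop
      declare
         --  The function form returns a string exactly as long as the
         --  line, so no buffer size or separate Last variable is needed.
         Line : constant String := Get_Line (File_Handle);
      begin
         Put_Line ("Read" & Integer'Image (Line'Length)
                   & " characters: " & Line);
      end;
   end loop;
   Delete (File_Handle);  --  closes the file and removes it
end Read_Entries_Demo;
```

Because Line is declared inside the loop, each iteration gets a fresh constant sized to that particular line.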


If you're okay with using GNAT implementation-defined packages, the package Ada.Strings.Unbounded.Text_IO is available.

Also, the Ada.Strings child packages (specific to Fixed, Bounded, or Unbounded strings) provide some helpful subprograms for string processing, like Index() for finding specific strings within other strings--useful for locating embedded blanks :-)
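
For instance, here is a rough sketch of splitting the line at the first blank with Ada.Strings.Fixed.Index instead of a hand-written loop. The procedure name and the sample line are assumptions for illustration; Index returns 0 when the pattern is not found.

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;

procedure Index_Split_Demo is
   Line  : constant String := "14 silicon";
   --  Position of the first blank in Line, or 0 if there is none.
   Space : constant Natural :=
     Ada.Strings.Fixed.Index (Source => Line, Pattern => " ");
begin
   if Space = 0 then
      Ada.Text_IO.Put_Line ("No space found in: " & Line);
   else
      declare
         --  These constants keep the bounds of the slices they are
         --  initialized from, so Entry_String'First is Space + 1, not 1.
         ID_String    : constant String := Line (Line'First .. Space - 1);
         Entry_String : constant String := Line (Space + 1 .. Line'Last);
      begin
         Ada.Text_IO.Put_Line ("ID:    " & ID_String);
         Ada.Text_IO.Put_Line ("Entry: " & Entry_String);
      end;
   end if;
end Index_Split_Demo;
```

Only the first blank is used as the split point, so an entry that itself contains blanks stays in one piece.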

There's another GNAT package, GNAT.Array_Split (which comes preinstantiated with strings as GNAT.String_Split) that supplies more subprograms oriented towards breaking apart arrays (and strings).
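
A minimal sketch of GNAT.String_Split, assuming a GNAT toolchain (the package is GNAT-specific, and the procedure name here is made up):

```ada
with Ada.Text_IO;
with GNAT.String_Split;

procedure String_Split_Demo is
   use GNAT.String_Split;
   Fields : Slice_Set;
begin
   --  Mode => Multiple treats a run of consecutive separators as one.
   Create (S          => Fields,
           From       => "14 silicon",
           Separators => " ",
           Mode       => Multiple);

   --  Slices are numbered from 1 up to Slice_Count.
   for I in 1 .. Slice_Count (Fields) loop
      Ada.Text_IO.Put_Line
        ("Field" & Slice_Number'Image (I) & ": " & Slice (Fields, I));
   end loop;
end String_Split_Demo;
```

Note that this splits at every blank, so it fits best when the entry text itself never contains spaces.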
