- placing a label next to each item of data improves the readability of
our raw file even more:
name = Joe
class = Soph
gpa = 4
branch = Valley Lane
name = Sally
class = Fresh
gpa = 11
branch = Mountain View
- now it is obvious what each piece of data represents within the block.
(not to say that sequential data can't be labeled, it is just more
common to find block data labeled.)
- but doesn't this complicate the reading process somewhat? well, yes.
but it is well worth it!
- a first approach is to do something simple like this:
file >> setw(MAX_LABEL) >> label >> sep
>> setw(MAX_NAME) >> name;
but this doesn't account for several things. first off, the user might
re-order information within the block:
name = Joe
gpa = 4
class = Soph
branch = Valley Lane
class = Fresh
gpa = 11
name = Sally
branch = Mountain View
note that the same data is present, the blocks have just been
internally shuffled. the assumption here is that your program should
be able to place the correct data in the correct variables by the
context of the labels. that is a major assumption, are we ready for
it? SURE!
- after reading the label, we simply look for that label string in a list of
known label strings:
const char known_labels[MAX_KNOWN_LABELS][MAX_LABEL] = { "name",
"gpa",
"class",
"branch" };
file >> setw(MAX_LABEL) >> label >> sep;
L = 0;
while (L < MAX_KNOWN_LABELS && strcmp(label,known_labels[L]) != 0)
{
L++;
}
// L is either the index of the correct label or MAX_KNOWN_LABELS
it's just a simple linear search through an array of strings! now we can
switch on the label index to an appropriate action:
switch (L)
{
case 0: file >> setw(MAX_NAME) >> name; break;
case 1: file >> gpa; break;
case 2: file >> setw(MAX_CLASS) >> year; break;
case 3: file >> ws; // skip the space after the separator
file.getline(branch, MAX_BRANCH); break;
default: file.ignore(MAX_LINE, '\n'); // unknown label -- skip its value, too
break;
}
easy as pie! (although it is rather a pain to make a pie, this is an
old expression that means it really is easy...*shrug*)
speaking of hidden pain, don't forget to watch out for the unknown label
condition — when the loop above ends at MAX_KNOWN_LABELS!
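to see the pieces working together, here is a minimal sketch of the search
and the switch wrapped into functions. the Student struct, the function
names find_label and read_field, and all of the MAX_ sizes are assumptions
made up for this illustration. it still expects tidy "label = value"
spacing and one-word names, just like the code above.

```cpp
#include <cstring>
#include <iomanip>
#include <istream>

// all sizes below are assumed values for illustration only
const int MAX_LABEL = 16, MAX_NAME = 32, MAX_CLASS = 16, MAX_BRANCH = 32;
const int MAX_LINE = 128;
const int MAX_KNOWN_LABELS = 4;
const char known_labels[MAX_KNOWN_LABELS][MAX_LABEL] = { "name", "gpa",
                                                         "class", "branch" };

struct Student   // hypothetical record type
{
    char name[MAX_NAME];
    char year[MAX_CLASS];
    char branch[MAX_BRANCH];
    double gpa;
};

// linear search: index of the label, or MAX_KNOWN_LABELS when it is unknown
int find_label(const char label[])
{
    int L = 0;
    while (L < MAX_KNOWN_LABELS && std::strcmp(label, known_labels[L]) != 0)
    {
        L++;
    }
    return L;
}

// reads one "label = value" item; returns false when no label could be read
bool read_field(std::istream & file, Student & s)
{
    char label[MAX_LABEL], sep;
    if (!(file >> std::setw(MAX_LABEL) >> label >> sep))
    {
        return false;
    }
    switch (find_label(label))
    {
        case 0: file >> std::setw(MAX_NAME) >> s.name; break;
        case 1: file >> s.gpa; break;
        case 2: file >> std::setw(MAX_CLASS) >> s.year; break;
        case 3: file >> std::ws;   // skip the space after the separator
                file.getline(s.branch, MAX_BRANCH); break;
        default: file.ignore(MAX_LINE, '\n'); break;   // unknown: drop its value
    }
    return true;
}
```

calling read_field in a loop fills in the Student no matter what order the
labels arrive in, since each value is routed by its label rather than by
its position.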
- but, if we are labeling data so that it may be easily read, it might also
be edited/changed outside our program. users don't type as carefully as
our program reads. the user might end up with something like:
name=Joe
class=Soph
gpa=4
branch= Valley Lane
name =Sally
class = Fresh
GPA = 11
branch =Mountain View
which our code can no longer read correctly.
the capitalization problems can be side-stepped by using a
case-insensitive string comparison. perhaps you could do something about
that...
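one possible shape for such a comparison, as a sketch only: the name
compare_no_case is made up for this illustration, and it is meant to act
like strcmp (negative, zero, or positive result) after folding both
strings to lower case.

```cpp
#include <cctype>

// hypothetical helper: compares like strcmp, but ignores letter case
int compare_no_case(const char a[], const char b[])
{
    int i = 0;
    // walk forward while the strings still agree (ignoring case)
    while (a[i] != '\0' &&
           std::tolower(static_cast<unsigned char>(a[i])) ==
           std::tolower(static_cast<unsigned char>(b[i])))
    {
        i++;
    }
    // difference at the first mismatch (or 0 if both strings ended together)
    return std::tolower(static_cast<unsigned char>(a[i])) -
           std::tolower(static_cast<unsigned char>(b[i]));
}
```

swapping this in for strcmp in the label search would let "GPA" match
"gpa". the casts to unsigned char are there because tolower is not defined
for negative char values.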
the spacing problems are much harder to deal with. since we don't know
whether there will be space before the separator character
(an '=' above), after the separator, or even before the label itself,
things are a bit more messy.
however, since all labeled data has but one item per file line, we can
read in the whole line and then do string processing (which we are
familiar with from previous studies) to break up the pieces within our
program:
file.getline(labeled_line, MAX_LINE);
sep_at = search(labeled_line, '='); // index of the separator
lcap = min(sep_at, MAX_LABEL-1); // the label is the sep_at chars before it
strncpy(label, labeled_line, lcap);
label[lcap] = '\0';
// copy rest into value string:
value_index = 0;
do
{
sep_at++;
value[value_index++] = labeled_line[sep_at];
} while (value_index != MAX_VALUE_LEN &&
labeled_line[sep_at] != '\0');
value[MAX_VALUE_LEN-1] = '\0'; // in case the value was too long to fit
- did I mention that using pointers would have made this a little more
interesting?
char * sep_at; // make sep_at a pointer rather than an index
file.getline(labeled_line, MAX_LINE);
sep_at = strchr(labeled_line, '='); // use cstring library function to search
if (sep_at != nullptr) // strchr gives back a null pointer when there is no '='
{
*sep_at = '\0'; // split string logically in two
strncpy(label, labeled_line, MAX_LABEL-1);
label[MAX_LABEL-1] = '\0';
strncpy(value, sep_at+1, MAX_VALUE_LEN-1); // use pointer to second half
value[MAX_VALUE_LEN-1] = '\0';
}
- then just strip off any leading or trailing spaces from each string (we
leave internal ones just in case the value — or even the label itself
— has spaces inside it):
// count number of leading spaces
lead_space = 0;
while (isspace(str[lead_space]))
{
lead_space++;
}
// shift data over
moving = 0;
while (str[moving+lead_space] != '\0')
{
str[moving] = str[moving+lead_space];
moving++;
}
str[moving] = '\0'; // needed whenever lead_space > 0: old chars are still past here
// remove trailing spaces
while (moving != 0 && isspace(str[moving-1]))
{
moving--;
str[moving] = '\0';
}
the shifting loop could be done with a call to memmove, of course (not
strcpy: the source and destination overlap here, which strcpy does not
allow), if you are willing to suffer the pointers; I was just being
obsessive...
(of course repeat this process for both the label and value strings
— perhaps calling a function...?)
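as one way to package the stripping steps, here is a sketch of such a
function. the name trim is an assumption for this illustration, and it
swaps the hand-written shifting loop for memmove, which is safe when the
source and destination overlap.

```cpp
#include <cctype>
#include <cstring>

// hypothetical helper: strips leading and trailing whitespace in place,
// leaving any internal spaces alone
void trim(char str[])
{
    // count number of leading spaces
    std::size_t lead_space = 0;
    while (std::isspace(static_cast<unsigned char>(str[lead_space])))
    {
        lead_space++;
    }
    // shift data over (the +1 brings the '\0' along)
    std::size_t len = std::strlen(str + lead_space);
    std::memmove(str, str + lead_space, len + 1);
    // remove trailing spaces
    while (len != 0 && std::isspace(static_cast<unsigned char>(str[len-1])))
    {
        len--;
        str[len] = '\0';
    }
}
```

with this in hand, the splitting code only needs trim(label); and
trim(value); after the line has been cut in two.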
- with all this in place, your class' reading function probably looks
something like this:
void Class::read(istream & strm)
{
// known label array
strm.peek();
while (!strm.eof() && !end_of_block)
{
// read line
// split line at separator
// search for label -- case-insensitively
// switch to translate and store value based on label
strm.peek();
}
return;
}
- but what about detecting the end of a re-arranged data block?
use an array of bool values, one for each valid label you have.
initialize this array to all false and then, as you process each label,
set the corresponding value in the array to true.
furthermore, if you find a label you've already seen (it has a true in its
array spot), you know you've reached the end of your logical block! simply
seek back to the beginning of that line with seekg (you did a tellg before
you input the line, didn't you?) and 'return' all your collected data to
the caller.
in addition, after each labeled line you process, you can keep track of
whether you've seen any labels at all (to help alleviate false eof
reports) or whether you've seen all the labels you needed to see, which
would indicate you can stop processing and 'return' the data now.
- the only problem left is how to translate that value string into a
numeric value when the variable is of type double or long or such.
(perhaps a lab you once did..?)
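if you'd rather not dig up that lab, the cstdlib functions strtod and
strtol do this translation for you. the sketch below is one way to wrap
them; the names to_double and to_long are made up for this illustration,
and rejecting trailing junk is a design choice of the sketch, not a
requirement.

```cpp
#include <cstdlib>

// hypothetical helper: true when text holds a number and nothing else
// (trailing blanks are allowed); result is only changed on success
bool to_double(const char text[], double & result)
{
    char * end;
    double d = std::strtod(text, &end);   // end lands just past the number
    if (end == text)
    {
        return false;                     // no number found at all
    }
    while (*end == ' ' || *end == '\t')
    {
        end++;
    }
    if (*end != '\0')
    {
        return false;                     // junk after the number
    }
    result = d;
    return true;
}

// same idea for whole numbers, using base 10
bool to_long(const char text[], long & result)
{
    char * end;
    long n = std::strtol(text, &end, 10);
    if (end == text)
    {
        return false;
    }
    while (*end == ' ' || *end == '\t')
    {
        end++;
    }
    if (*end != '\0')
    {
        return false;
    }
    result = n;
    return true;
}
```

both library functions skip leading whitespace on their own, so a value
string like " 4" converts fine even before it has been stripped.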