Part F - Refinements

More Input and Output

Implement algorithms using standard library procedures to incorporate existing technology
Stream data using standard library functions to interact with users and access persistent text

"The library implements a simple model of text input and output"
(Kernighan and Ritchie, 1988)




The standard input and output library (stdio) that ships with C compilers provides comprehensive support for communicating with the user and with secondary storage.  This support includes numerical as well as character string processing under format control and optionally line by line processing without format control.  For platforms that don't support line by line input processing, we write our own custom procedures. 

This chapter reviews the conversion specifiers for formatted input and output along with the library functions for line by line input and output.  Specifiers not covered in previous chapters are included here.  This chapter concludes with two custom functions for input that safeguard against line mismatching and memory overflow.


Input

The stdio library functions for processing input are:

  • scanf() - input from standard input under format control
  • fscanf() - input from file under format control
  • getchar() - character by character input from standard input (see Input Functions)
  • fgetc() - character by character input from file (see Input Functions)
  • gets_s() - line by line input from standard input (not universally implemented)
  • fgets() - line by line input from file

Typically, standard input refers to the keyboard.

Formatted Input

The scanf(...) and fscanf(...) functions accept data from the standard input device or secondary storage respectively and store that data in memory at the address specified in their argument list.  Their prototypes, simplified here to show a single address argument (the library declares the trailing arguments as ...), are

 int scanf(const char *format, address);
 int fscanf(FILE *, const char *format, address);

format

format receives a string literal that describes how to convert input text into data stored in memory.  Calls to these functions can take multiple arguments.  format contains the conversion specifier(s) for translating the input characters.  Conversion specifiers begin with a % symbol and identify the type of the destination variable.  The possible specifiers are listed below.  Other specifiers may be found on the web.

  Specifier      Input Text is a         Destination Type
  %c             character               char, char []
  %d             decimal                 int, short, long, long long
  %i             integer                 int, short, long, long long
  %o             unsigned octal          unsigned int, short, long, long long
  %x             unsigned hexadecimal    unsigned int, short, long, long long
  %u             unsigned decimal        unsigned int, short, long, long long
  %n             --                      int, short, long, long long
  %f %e %g %a    floating-point          float, double, long double
  %s             character string        char []
  %[ ] %[^ ]     character string        char []
  %p             address                 void *

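%n does not read any characters but instead stores the number of characters read so far in the variable at the corresponding address.  %d accepts decimal input only, while %i also accepts octal input (leading 0) and hexadecimal input (leading 0x).  %f, %e, %g and %a treat floating-point input identically.  Size specifiers select the short, long and long long destinations for %d, %i, %o, %x, %u and %n; they are listed under Conversion Control.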

address

address receives the address of the destination variable.  We specify a separate address argument for each conversion specifier in the format string. 

Conversion Control

We may insert control characters between the % and the conversion character.  The general form of a conversion specification is

 % * width size conversion_character

The three control characters are

  • * - suppresses storage of the converted data (discards it without storing it)
  • width - specifies the maximum number of characters to be interpreted
  • size - specifies the size of the storage type

A conversion specifier that includes an * does not have a corresponding address in the argument list.  This is an exception to the matching conversion-specifier/argument rule.

The size specifiers covered in this course are listed below.  Others may be found on the web.

  Specifier with Size    Input Text is a               Destination Type
  %hhd %hhi              very short decimal            char
  %hd %hi                short decimal                 short
  %ld %li                long decimal                  long
  %lld %lli              very long decimal             long long
  %lf %le %lg %la        floating-point                double
  %Lf %Le %Lg %La        floating-point                long double
  %hhu %hho %hhx         unsigned very short integer   unsigned char
  %hu %ho %hx            unsigned short integer        unsigned short
  %lu %lo %lx            unsigned long integer         unsigned long
  %llu %llo %llx         unsigned very long integer    unsigned long long
  %hhn                   --                            char
  %hn                    --                            short
  %ln                    --                            long
  %lln                   --                            long long

Problems with %c

Because scanf() and fscanf() only extract the characters that they need from the buffer, problems arise with %c conversions.  Consider the following program.  On reading an integer value, scanf() leaves the newline character in the input buffer.  Since the next call to scanf() starts with a %c specifier, scanf() treats the unprocessed '\n' as the input character.  This program produces the output shown below the listing and never collects the tax status input from the input buffer.

 /* scanf with %c Specification
  * scanf_c.c
  */

 #include <stdio.h>

 int main(void)
 {
         int items;
         char status; // tax status g or p
         printf("Number of items : ");
         scanf("%d", &items);
         printf("Status : ");
         scanf("%c", &status);   // ERROR reads \n 
         printf("%d items (%c)\n", items, status); 
         return 0;
 }

  Number of items : 25
  Status : 25 items (
  )

Note how the newline character (accepted as the tax status) places the closing parenthesis on a new line.

There are different ways to handle unprocessed '\n' characters.  Some are listed in the code snippets below.

          scanf("%d", &items);
          scanf("%c%c", &junk, &status); /* store one character in junk first */

          scanf("%d", &items);
          scanf("%*c%c", &status);       /* swallow one character first */

          scanf("%d", &items);
          scanf(" %c", &status);         /* skip all whitespace first */

          scanf("%d%*c", &items);     /* swallow newline */
          scanf("%c", &status);

          scanf("%d", &items);
          clear();                   /* clear the buffer (clear() is not part of stdio; the program must define it) */
          scanf("%c", &status);

"%*c%c" swallows one character and accepts the next.  A space before a conversion specifier forces the skipping of all leading whitespace before the next conversion: " %c" directs scanf() to skip whitespace and then read the next non-whitespace character.  One corrected version of the above program is

 /* scanf with %c Specification
  * scanf_cc.c
  */

 #include <stdio.h>

 int main(void)
 {
         int items;
         char status; // tax status g or p
         printf("Number of items : ");
         scanf("%d", &items);
         printf("Status : ");
         scanf(" %c", &status);   // note the space 
         printf("%d items (%c)\n", items, status); 
         return 0;
 }

  Number of items : 25
  Status : g
  25 items (g)
   
   
   

Unformatted Input

The library functions for processing unformatted input are:

  • getchar() - character by character input from standard input (see Input Functions)
  • fgetc() - character by character input from file (see Input Functions)
  • gets_s() - line by line input from standard input (not universally implemented)
  • fgets() - line by line input from file

gets_s

The gets_s() function

  • accepts an empty string
  • assumes no more than the specified number of characters
  • reads the '\n' as the delimiter
  • replaces the delimiter with the null terminator

gets_s() takes two arguments.  Its prototype is

 char *gets_s(char *address, rsize_t n);

The first parameter receives the address of the string to be filled.  The second parameter receives the maximum number of characters that can be stored including the null terminator.  On success this function returns the address of the filled string. 

For example,

 // Read and Display Lines
 // gets_s.c

 // request the optional bounds-checked functions
 // (needed on platforms that follow C11 Annex K)
 #define __STDC_WANT_LIB_EXT1__ 1
 #include <stdio.h>

 int main(void)
 {
         char first_name[21]; 
         char last_name[21]; 

         printf("First Name : ");
         gets_s(first_name, 21);
         printf("Last Name  : "); 
         gets_s(last_name, 21);
         puts(first_name);
         puts(last_name);

         return 0;
 }

 First Name : Arnold
 Last Name  : Schwartzenegger
 Arnold
 Schwartzenegger

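If the user inputs a line longer than the allocated string, gets_s() does not overflow the string but treats the overlong line as a runtime-constraint violation; what happens next depends on the platform's constraint handler.  On a Windows platform, the default handler terminates the program.  The standard recommends use of fgets() instead of gets_s().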

fgets

The fgets() function

  • reads a stream of bytes from the specified file
  • accepts an empty string
  • stores no more than one less than the specified number of characters
  • reads up to and including the '\n' delimiter
  • keeps the '\n' delimiter in the character string rather than discarding it
  • adds the null terminator to the character string

The prototype for this function is

 char* fgets(char str[], int max, FILE *fp);

str receives the address of the string to be filled.  max receives the maximum number of bytes in str including space for the null byte.  fp receives the address of the FILE object.  fgets() appends the null byte to the stored string.  fgets() returns the address of str if successful; NULL in the event of an end of file or read error.


Output

The stdio library functions for processing output are:

  • printf() - output to standard output under format control
  • fprintf() - output to a file under format control
  • putchar() - character by character output to standard output (see Output Functions)
  • fputc() - character by character output to a file (see Output Functions)
  • puts() - character string output to standard output (see Output Functions)
  • fputs() - character string output to a file (see Output Functions)

Formatted Output

The printf(...) and fprintf(...) functions report the value of the variable(s) or expression(s) in the argument list to the standard output device or the specified file respectively.  Their prototypes take the form

 int printf(const char *format, ...);
 int fprintf(FILE *, const char *format, ...);

format

format is a string literal containing conversion specifiers and any characters to be output directly.  Each conversion specifier begins with a % symbol and identifies the type of the source variable.  The order of the specifiers matches the order of the values received. 

Conversion Specifiers

The conversion specifiers include: 

  Specifier   Output Text is a                         Use with Type
  %c          character                                char
  %d          signed decimal                           int, short, long, long long
  %i          signed integer                           int, short, long, long long
  %u          unsigned decimal                         unsigned int, short, long, long long
  %o          unsigned octal                           unsigned int, short, long, long long
  %x          unsigned hexadecimal                     unsigned int, short, long, long long
  %X          unsigned hexadecimal (uppercase)         unsigned int, short, long, long long
  %n          --                                       int *
  %f          floating-point                           float, double, long double
  %F          floating-point (uppercase)               float, double, long double
  %e          scientific floating-point                float, double, long double
  %E          scientific floating-point (uppercase)    float, double, long double
  %g          shortest floating-point                  float, double, long double
  %G          shortest floating-point (uppercase)      float, double, long double
  %a          hexadecimal floating-point               float, double, long double
  %A          hexadecimal floating-point (uppercase)   float, double, long double
  %s          string of characters                     char *
  %%          the character %                          -- (no argument)
  %p          address                                  void *

%n does not output any characters but instead stores the number of characters output so far in the int that its argument points to. 

Scientific (%e %E) refers to output in mantissa/exponent form d.dddddde+dd (for example, 1.230000e+02, which stands for 1.23 x 10^2 or 123.0). 

General (%g %G) refers to output in the shortest form possible; decimal or mantissa/exponent (for example, 1.23e-06 rather than 0.00000123 and 3.1 rather than 3.100000e+00).

Conversion Control

We may insert control characters between the % and the conversion character.  The general form of a conversion specification is

 % flags width . precision size conversion_character

The five control characters are

  • flags
    • - prescribes left justification of the converted value in its field
    • 0 pads the field width with leading zeros
  • width  sets the minimum field width within which to format the value (overriding with a wider field only if necessary).  Pads the converted value on the left (or right, for left alignment).  The padding character is space or 0 if the padding flag is on
  • .  separates the field's width from the field's precision
  • precision  sets the number of digits to be printed after the decimal point for f conversions and the minimum number of digits to be printed for an integer (adding leading zeros if necessary).  A value of 0 suppresses the printing of the decimal point in an f conversion.  An * instead of a number applies the value from the next argument in the argument list
  • size  identifies the size of the type being output 

The size specifiers covered in this course are listed below.  Others may be found on the web.

  Specifier with Size               Output Text is               Use with Type
  %hhd %hhi                         very short decimal           char
  %hd %hi                           short decimal                short
  %ld %li                           long decimal                 long
  %lld %lli                         very long decimal            long long
  %lf %lF %le %lE %lg %lG %la %lA   floating-point               double
  %Lf %LF %Le %LE %Lg %LG %La %LA   floating-point               long double
  %hhu %hho %hhx %hhX               unsigned very short integer  unsigned char
  %hu %ho %hx %hX                   unsigned short integer       unsigned short
  %lu %lo %lx %lX                   unsigned long integer        unsigned long
  %llu %llo %llx %llX               unsigned very long integer   unsigned long long
  %hhn                              --                           char *
  %hn                               --                           short *
  %ln                               --                           long *
  %lln                              --                           long long *

Custom Input

Mismatching Line Input

Managing line-oriented input helps in debugging.  Consider a set of input lines some of which contain incorrect input.  Ideally, a one-to-one correspondence should exist between the lines of input data and the lines read by the program.  Even if the user inputs a line incorrectly, subsequent correct input may still be acceptable.  In other words, incorrect input on one line should not cause incorrect reading of subsequent lines. 

Ideally, line by line input should

  • store characters only to a specified maximum
  • accept an empty string
  • read the '\n' as the line delimiter
  • discard the delimiting character along with any characters that overflow memory
  • append the null terminator to the set of characters stored

The following code meets all of these conditions

 // Custom Line-Oriented Input
 // getline.c

 #include <stdio.h>

 // getline accepts a newline terminated
 // string s of up to n - 1 characters,
 // adds the null terminator and discards
 // the remaining characters in the input
 // buffer including the terminating character.
 // Note: POSIX platforms declare a different
 // getline() in <stdio.h>; if the compiler
 // reports conflicting types, rename this one.
 //
 char *getline(char *s, int n)
 {
         int i, c = 0;
         for (i = 0; i < n - 1 && (c = getchar()) != EOF && c != '\n'; i++)
                 s[i] = c;
         s[i] = '\0';
         while (n > 1 && c != EOF && c != '\n')
                 c = getchar();
         return c != EOF ? s : NULL;
 }

 int main(void)
 {
         char first_name[11]; 
         char last_name[11]; 

         printf("First Name : ");
         getline(first_name, 11);
         printf("Last Name  : ");
         getline(last_name, 11);
         puts(first_name);
         puts(last_name);

         return 0;
 }

 First Name : Arnold
 Last Name  : Schwartzenegger
 Arnold
 Schwartzen

This function, unlike gets_s(), behaves the same way on every platform if the number of characters entered exceeds the amount of memory available to store the string: it stores what fits and discards the rest of the line.

Insufficient Memory

Consider the file named spring.txt, the contents of which are listed below.  Each record in this file contains three fields: the first field holds the quantity, the second field holds a string describing the item and the third field holds the unit price of the item.  The field delimiter is the semi-colon character: 

 2;Light Jacket;95.89
 3;Long Pants;67.89
 2;Large Duster;45.98

The following program reads each record from the file and displays the fields in a tabular format

 // Tabular Data
 // table.c

 #include <stdio.h>

 int main(void)
 {
         FILE *fp = NULL;
         char label [14];
         int n;
         double price;

         fp = fopen("spring.txt","r");
         if (fp != NULL) {
                 printf("    Spring Items\n"
                  "    ============\n\n"
                  "No Description  Price\n"
                  "---------------------\n"); 
                 while (fscanf(fp, "%d;%13[^;];%lf%*c", &n, label, &price) == 3)
                         printf("%2d %-13s%5.2lf\n", n, label, price);
                 fclose(fp);
         }
         return 0;
 }

      Spring Items
      ============

 No Description  Price
 ---------------------
  2 Light Jacket 95.89
  3 Long Pants   67.89
  2 Large Duster 45.98

Note how the field delimiters have been embedded within fscanf()'s format string.

Safe Coding

The above program executes successfully only if the descriptive strings in the file do not contain more than 13 characters.  The data in a different file that contains longer labels will not fit into the space allocated by the program. 

To process any file and safeguard against memory overflow, we upgrade the program to skip that part of a description that exceeds the memory allocation.  We do so by reading each record in two separate statements:

 // Insufficient Memory
 // table_plus.c

 #include <stdio.h>

 int main(void)
 {
         FILE *fp = NULL;
         char label [14];
         int n;
         double price;
         char c;

         fp = fopen("spring.txt","r");
         if (fp != NULL) {
                 printf("    Spring Items\n"
                  "    ============\n\n"
                  "No Description  Price\n"
                  "---------------------\n");
                 while (fscanf(fp, "%d;%13[^;]%c", &n, label, &c) == 3) {
                         if (c == ';')
                                 fscanf(fp, "%lf\n", &price);
                         else {
                                 // skip the rest of the field, if any;
                                 // scanned separately because %*[^;]
                                 // fails when the delimiter comes next
                                 fscanf(fp, "%*[^;]");
                                 fscanf(fp, ";%lf%*c", &price);
                         }
                         printf("%2d %-13s%5.2lf\n", n, label, price);
                 }
                 fclose(fp);
         }
         return 0;
 }

      Spring Items
      ============

 No Description  Price
 ---------------------
  2 Light Jacket 95.89
  3 Long Pants   67.89
  2 Large Duster 45.98

The first statement reads the first two fields stopping at the second delimiter or once memory is full, whichever comes first.  If the statement has encountered the second delimiter, the second statement reads the price; if not, the alternate version of the second statement skips the remaining characters in the field and the second delimiter and only then reads the price. 

The program stops reading altogether as soon as it encounters a record with other than 3 input values - the quantity, the descriptive string and the second delimiter.


Exercises






  Designed by Chris Szalwinski

Creative Commons License