January 27, 2013

Article at IBM Systems


Handling Input Handlers With RPG Open Access

Recently, we’ve been revisiting RPG Open Access (OA). Last month, we explored an OA handler that used a Web service to convert currency values. In an earlier series on OA, we explored a handler that writes to the IFS, so that programmers could write to IFS files the same way they write to DDS-defined database files.

This latest article in our RPG Open Access series represents our first real venture into the world of input handlers. As you’ll see, while the fundamentals remain the same as for an output handler, many more architectural decisions can be involved in the writing of a truly generic input handler.

For our example, we wanted to write an IFS handler that would allow an RPG programmer to read from a CSV file just as easily as from a database, thereby avoiding the need to use CPYFRMIMPF or similar utilities. Just declare the file as normal and away we go. Well, almost. The programmer still needs to identify the handler and the name of the IFS file to process, but that’s all the additional work that should be required.

Here’s the source for a very simple test program that uses our new handler. As with the previous examples, the only significant difference is the appearance of the HANDLER keyword on the F-spec for the file. Apart from that, it’s a straightforward program that simply loops through the file printing each record it encounters.

                                                                  

 FIFSDATA1  IF   E             Disk    Handler('IFSINPHND1' : ifs_info1)
 F                                     UsrOpn
 FQSYSPRT   O    F  132        PRINTER

  // Copy in the template for the additional IFS data
  //   Contains a subfield named "path" for IFS path name
  /copy ifs_cpy

  // Define IFS file name etc. based on the template
 D ifs_info1       ds                  likeds(ifs_hdlr_info_t)

  /free
      // Set up IFS path name and then open file
      ifs_info1.path = '/Partner400/IFS_INP1.csv';
      Open IFSDATA1;

      Read IFSDATA1;
      // Print each record and read the next record until EOF
      DoW not %eof(IFSDATA1);

        Except showData;

        Read IFSDATA1;
      EndDo;

      *inlr = *On;

  /End-Free

 OQSYSPRT   E            ShowData    1
 O                       zoned5_0            10
 O                       zoned5_2            20
 O                       packed7_2           30
 O                       dateUSA             45
 O                       Char80             132

Writing the Handler—Decisions, Decisions, Decisions

The first decision we had to make was how to map the fields in the CSV to the fields in the file. Since we wanted to keep this initial example simple, we decided to lay out the fields in the physical file (IFSDATA1, in this example) in the same sequence in which they appear in the CSV file. In a future article, we’ll discuss a more flexible approach that uses the first record in the CSV file to supply the field names. But for now we’ve taken the simple approach and mapped the fields in sequence.

This is the file definition we created:

     A          R IFSDATAR1
     A            ZONED5_0       5S 0
     A            ZONED5_2       5S 2
     A            PACKED7_2      7P 2
     A            CHAR80        80A
     A            DATEUSA         L         DATFMT(*USA)

And this is a sample record from the CSV file it was based on:

123,250.00,12345.6,"First Record includes a comma , in the data",05/11/2011

The second decision was how to parse the CSV string. We need to break up the fields based on the separators (commas), and then deal with the double quote marks around the character strings. If that were all there were to it, we might have been tempted to write our own parser. But in the future, we might need to process CSV files that use alternative field delimiters such as the pipe character (|) or a different text delimiter such as the single quote.

For this reason and because we hate reinventing the wheel, we searched for an existing service program to perform the parsing. As so often in the past, Scott Klement came up with the goods in the form of the CSVR4 Service Program, so we used that. This also had the added advantage that error handling was already built in to deal with situations such as a failure to open the CSV file.
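To see what such a parser has to cope with, here is a minimal, language-neutral sketch. It uses Python's standard csv module purely as a stand-in; it is not CSVR4 and not part of the handler, and the function name parse_line is our own invention for illustration. It shows the sample record being split correctly despite the comma inside the quoted text, and the same logic working with a pipe separator and single-quote text delimiter.

```python
import csv
import io

def parse_line(line, delimiter=",", quotechar='"'):
    """Split one CSV line into fields, honoring quoted text.

    Hypothetical helper for illustration only; returns [] for an empty line.
    """
    reader = csv.reader(io.StringIO(line), delimiter=delimiter, quotechar=quotechar)
    return next(reader, [])

# The sample record shown earlier in the article.
SAMPLE = '123,250.00,12345.6,"First Record includes a comma , in the data",05/11/2011'

fields = parse_line(SAMPLE)

# The same data using alternative delimiters (pipe and single quote).
alt_fields = parse_line("123|'text with a | inside'|05/11/2011",
                        delimiter="|", quotechar="'")
```

The quoted field comes back as a single value with its embedded separator intact and its surrounding quotes removed, which is exactly the work we did not want to hand-code.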

We’ll be using the following procedures from CSVR4:

CSV_open to open the CSV file in the IFS

CSV_close to close the IFS file

CSV_loadRec to “read” each CSV record

CSV_getFld to retrieve the values between the commas (the “fields”) from the CSV record


As we’ll discuss later, these are not the only decisions to be made, but it’ll be easier to discuss others as we take a look at the relevant parts of the handler itself. As usual, you can find the full code for the handler on our website. We’ve included more comments in the source than usual so you can see what is going on.

The Mechanics of the Handler

We’re not going to go through the basics of how a handler works generically. We’ve covered that in the earlier articles. Instead, we’ll discuss just the basic logic flow and cover some differences involved in writing an input handler.

This handler will only deal with Open, Close and Read operations. All other operations attempted will result in general RPG error status 1299. The Open and Close operations simply use the CSV_open and CSV_close procedures from CSVR4. But the read operation is a bit more complex.

First, the CSV_loadRec() routine is called to load the record into memory. Then CSV_getFld() is called repeatedly to retrieve each field in turn. The data retrieved in this way is then placed in the RPG field buffer and the field length is set. The EOF status is set when CSV_loadRec() indicates there are no more records to process.
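The flow just described can be sketched in a few lines. This is a Python illustration of the logic only, not the RPG handler: the class CsvInputHandler and its method names are hypothetical, Python's csv module stands in for CSVR4, and a dictionary of (value, length) pairs stands in for the RPG field buffer.

```python
import csv
import io

class CsvInputHandler:
    """Illustrative stand-in for the handler's Open, Read and Close logic."""

    def __init__(self, field_names):
        self.field_names = field_names  # in record-format order
        self.eof = False
        self._reader = None

    def open(self, stream):
        # Stand-in for CSV_open: prepare to read records from the stream.
        self._reader = csv.reader(stream)
        self.eof = False

    def read(self):
        # Stand-in for CSV_loadRec: fetch the next record, or flag EOF.
        row = next(self._reader, None)
        if row is None:
            self.eof = True
            return None
        # Stand-in for repeated CSV_getFld calls: take each field in turn
        # and store its value together with its length.
        return {name: (value, len(value))
                for name, value in zip(self.field_names, row)}

    def close(self):
        # Stand-in for CSV_close.
        self._reader = None

# Usage: read every record until EOF, as the test program does.
handler = CsvInputHandler(["ZONED5_0", "CHAR80"])
handler.open(io.StringIO('1,"a , b"\n2,x\n'))
records = []
record = handler.read()
while not handler.eof:
    records.append(record)
    record = handler.read()
handler.close()
```

Note that EOF is only known after a read attempt fails, which is why the loop reads once before testing, just as the RPG test program does.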

With our previous output handlers, RPG formatted the individual field data for us and told us how long it was, etc. With an input handler, things are a little different. In this case, RPG informs us of the data type, decimal places, etc., but it’s up to us to format the data in the buffer to meet the needs of that type of field. This requires us to make design choices, and these choices will be different depending on how generic we want to make the handler and (perhaps) how much control we have over the format of the source data.

So what other decisions do we need to make?

Suppose we discover that the data supplied for a 20-character field is actually 25 characters in length. Do we simply truncate it? Or report it as an error?
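One way to keep that choice open is to make it a setting rather than hard-coding it. The sketch below is a Python illustration under our own assumptions: the function fit_char and its on_overflow parameter are hypothetical and do not come from the handler.

```python
def fit_char(value, length, on_overflow="truncate"):
    """Fit a character value into a field of the given length.

    on_overflow is a hypothetical policy switch:
      "truncate" silently drops the excess characters,
      "error"    raises ValueError so the caller can report the problem.
    """
    if len(value) > length:
        if on_overflow == "error":
            raise ValueError(
                f"value is {len(value)} characters, field holds {length}")
        return value[:length]
    return value

# A 25-character value arriving for a 20-character field.
truncated = fit_char("A" * 25, 20)
```

Silent truncation is the more forgiving default, but it can hide bad source data; an error policy is safer when the file's contents are not under your control.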

Formatting numeric fields to keep RPG happy also presents its own challenges and might require that we change the format. For example, RPG is quite happy with a value such as -123.5, but will be distinctly unhappy with 123.5- or 123.5CR, both of which are common representations of the same amount. If we leave it to RPG, then the programmers using this handler will get decimal data or data conversion errors, just as they would if this were a “real” database file with bad numeric data.
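If we chose to reformat such values ourselves, the conversion is simple enough. The following Python sketch illustrates the idea only; normalize_signed is a hypothetical name, and it covers just the two trailing forms mentioned above.

```python
def normalize_signed(text):
    """Move a trailing minus sign or CR marker to a leading minus sign.

    Hypothetical helper: values already in leading-sign form, or with
    no sign at all, are returned unchanged apart from trimmed blanks.
    """
    s = text.strip()
    if s.upper().endswith("CR"):
        return "-" + s[:-2].strip()
    if s.endswith("-"):
        return "-" + s[:-1].strip()
    return s

trailing_minus = normalize_signed("123.5-")   # "-123.5"
credit_marker = normalize_signed("123.5CR")   # "-123.5"
```

A real handler would also have to decide what to do with other conventions, such as parentheses for negatives or thousands separators, which this sketch deliberately ignores.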

A similar situation arises with date fields. Should we attempt to validate dates to ensure that they meet the required format or simply load them into the buffer and let RPG worry about it? If we decide to validate the date, then can we assume the format in the CSV will be the same format as in our database file definition? If not, then we’ll need to pass additional information to the handler to tell it which dates are in what format.
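To make the date question concrete, here is a Python sketch of validating a date in one format and re-expressing it in another. It is an illustration only: the convert_date function and the DATE_FORMATS table are hypothetical, and the table maps just three RPG date format names (*USA as mm/dd/yyyy, *ISO as yyyy-mm-dd, *EUR as dd.mm.yyyy) onto Python format strings.

```python
from datetime import datetime

# Hypothetical mapping from RPG date format names to Python patterns.
DATE_FORMATS = {
    "*USA": "%m/%d/%Y",
    "*ISO": "%Y-%m-%d",
    "*EUR": "%d.%m.%Y",
}

def convert_date(text, source_fmt, target_fmt):
    """Validate text against source_fmt and return it in target_fmt.

    Raises ValueError if the text is not a valid date in source_fmt.
    """
    parsed = datetime.strptime(text.strip(), DATE_FORMATS[source_fmt])
    return parsed.strftime(DATE_FORMATS[target_fmt])

def is_valid_date(text, fmt):
    """Return True if text is a valid date in the named format."""
    try:
        datetime.strptime(text.strip(), DATE_FORMATS[fmt])
        return True
    except ValueError:
        return False

# The date from the sample record, validated as *USA and converted.
iso_date = convert_date("05/11/2011", "*USA", "*ISO")
```

The source format has to come from somewhere, which is the point made above: if the CSV dates are not in the file's own format, the handler needs extra information from the caller.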

Another issue is demonstrated by the CSV record shown here:

Notice that the record includes two empty fields (i.e., consecutive commas with no data between them). If you look at our test file layout, you’ll see that the first one maps to a numeric field and the second to a character field. So what happens if, when we encounter an empty field, we simply inform RPG that the field value is zero bytes in length? It turns out that while that works for character fields, RPG blows up with a conversion error when you use that approach with numerics. Luckily, since we know what kind of field we’re dealing with, we can easily avoid such errors by testing for a zero-length numeric value, placing a zero in the buffer and setting the length to 1. But what if there were an empty date field? Should we set it to the low value for the date type? Or perhaps to today’s date?

The only situation the current version of the handler deals with is the zero-length numeric. The others are ignored and we just map the raw data—and maybe for your situation that is sufficient.
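The zero-length numeric rule is small enough to sketch in full. Again, this is Python used to illustrate the logic, not the handler's RPG code; prepare_value and its is_numeric flag are hypothetical names standing in for the field type information that RPG supplies.

```python
def prepare_value(raw, is_numeric):
    """Return the (value, length) pair to place in the field buffer.

    An empty numeric field becomes a single zero with length 1.
    Everything else, including empty character fields, is passed
    through as raw data with its actual length.
    """
    if is_numeric and raw == "":
        return "0", 1
    return raw, len(raw)

empty_numeric = prepare_value("", is_numeric=True)     # ("0", 1)
empty_character = prepare_value("", is_numeric=False)  # ("", 0)
```

Empty dates would need a third branch here, and choosing its default value is exactly the kind of decision that depends on your data.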

More to Come

As you can see, writing an input handler is no more complex than writing an output handler, but there tend to be a lot more design decisions involved. When we revisit this handler in a future article, we’ll discuss which features we decided to implement and why. In the meantime, if you have any questions or suggestions for improvements, please let us know.