New RPG PTF solves level problems and repeating-element challenges
IBM recently used the RPG Café to announce two significant enhancements to RPG’s support for XML processing. Using the Café for such an announcement is interesting in itself and perhaps a sign that IBM is trying new vehicles to reach its audience. However, the announcement’s content is even more interesting. Although it doesn’t solve all of the issues that we and others have encountered with XML-INTO, it goes a long way toward dealing with two of the most common problems.
Here’s an extract from an XML document we’ll use to demonstrate the issues and how the new support addresses them:
<?xml version="1.0" encoding="UTF-8"?>
<CustomerData>
  <Customer>
(A) <Name category="Retail" ID="J020">Jones and Co.</Name>
(B) <Address type="Mailing">
      <Street>2550 Main Street</Street>
      <City>Knoxville</City>
      <State>TN</State>
    </Address>
(B) <Address type="Shipping">
      <Street>45 Opryland Drive</Street>
      <City>Nashville</City>
      <State>TN</State>
    </Address>
  </Customer>
  <Customer>
    <Name category="Wholesale" ID="S197">Smith Brothers</Name>
    <Address type="All">
.....
Find the full XML sample in Code Sample 1.
Problem 1: Data and Attributes at the Same Level
The Name element at (A) in the sample highlights the first problem. The attributes “category” and “ID” must be defined as subfields of the parent element “Name.” The only way we can handle this in RPG is to specify “Name” as a DS like so:
     D Name            DS                  Template
     D   category                    10a
     D   ID                           4a
But when we do this, there’s nowhere to put the actual data (“Jones and Co.”). The parser can’t store it in Name since that would overwrite the category and ID fields. Faced with this problem, XML-INTO has, until now, simply ignored the data. That left us with a choice of either reformatting the XML document so that XML-INTO could handle it or resorting to XML-SAX to retrieve the parts that XML-INTO couldn’t reach. Neither solution was optimal, but with the advent of this new support we have a more direct solution.
All we must do is add the keyword datasubf=data to the %XML options list. When this is specified, if the parser encounters a situation of this type, it looks to see if there’s a subfield with the name “data” in the DS and, if so, stores the element’s data there.
If we add datasubf=data, our example can now be coded as:
     D Name            DS                  Template
     D   category                    10a
     D   ID                           4a
     D   data                        32a
We’re not forced to use the name “data”; we can put any valid subfield name after the equal sign and the parser will look for a subfield with that name. We must confess this approach wasn’t what we’d expected. We had always assumed that when IBM added support, it would simply duplicate the DS name as a subfield of the DS. In other words, the data in this case would have been stored in Name.Name. But it turns out IBM studied the code samples submitted along with the related bug reports and decided to base the support on the most common approaches users had taken. That seems a good enough reason to us.
Note that you may need to define multiple fields with the same base name (i.e., “data”), but this isn’t a problem since all of these fields will be in qualified data structures anyway. So, in practice, they’ll be known as X.data, Y.data, etc. In our demonstration program, the field name is Customer(c).Name.data. Find this at (C) in Code Sample 2.
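To see the option in isolation, here is a minimal sketch that parses a single Name element from an inline string rather than from the full sample document. The xmlDoc variable and the use of DSPLY are our own assumptions for illustration and are not taken from Code Sample 2. We specify case=any because the element and attribute names in the sample are in mixed case.

```rpgle
     D Name            DS                  Qualified
     D   category                    10a
     D   ID                           4a
     D   data                        32a

     D xmlDoc          S            200a   Varying

      /free
        // Hypothetical inline document (doc=string is the default)
        xmlDoc = '<Name category="Retail" ID="J020">Jones and Co.</Name>';

        // datasubf=data names the subfield that receives the element text
        xml-into Name %XML(xmlDoc: 'case=any datasubf=data');

        dsply Name.data;        // the text between <Name> and </Name>
        dsply Name.category;    // one of the attribute values
        *inlr = *on;
      /end-free
```

If the subfield were called something else, say “value,” the option would simply become datasubf=value.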
Problem 2: How Many Repeating Elements Are There?
XML documents often include multiple repeating elements. In our sample, the Customer element can be repeated, but so can the Address elements (B) within it (e.g., the customer may have multiple delivery addresses). We can identify the number of Customer elements that were loaded by using the XMLelements count, which starts at position 372 in the Program Status Data Structure (PSDS). But until now, RPG gave us no assistance in determining exactly how many Address elements were present. In fact, it did just the opposite. As soon as we defined Address as an array, RPG assumed that exactly that many elements would always be present. So how could we deal with a variable number of occurrences? The only answer was to specify the %XML option “allowmissing=yes.” Without this option, anything fewer than the maximum number of elements was considered an error.
The problem with this approach is that there’s no granularity to the “allowmissing” option. We couldn’t, for example, say “allowmissing=yesButOnlyForAddress.” As a result, anything (indeed, everything) could be missing from the XML document and we’d never know! Determining the number of active elements could also be a problem, particularly when using %HANDLER, since you can’t initialize the DS to assist in determining how many elements were loaded.
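For comparison, the older workaround described above might be sketched as follows. This is our own illustration under stated assumptions: the inline document, the cut-down Address layout, and the names xmlDoc, xmlElements, MAX_ADDR, c, and a are all hypothetical. The Customer count comes from the PSDS, but the Address count has to be inferred by scanning for the first blank entry.

```rpgle
     D                SDS
     D   xmlElements         372    379i 0

     D MAX_ADDR        C                   10

     D Address         DS                  Template
     D   City                        20a

     D Customer        DS                  Qualified Dim(99) Inz
     D   Address                           LikeDS(Address) Dim(MAX_ADDR)

     D xmlDoc          S            500a   Varying
     D c               S             10i 0
     D a               S             10i 0

      /free
        // Hypothetical document: one customer with one address,
        // a second customer with two addresses
        xmlDoc = '<CustomerData>'
               + '<Customer><Address><City>Knoxville</City></Address>'
               + '</Customer>'
               + '<Customer><Address><City>Memphis</City></Address>'
               + '<Address><City>Nashville</City></Address></Customer>'
               + '</CustomerData>';

        // allowmissing=yes is required because no Address array is full
        xml-into Customer
          %XML(xmlDoc: 'case=any allowmissing=yes path=CustomerData/Customer');

        for c = 1 to xmlElements;
          // No count is supplied for Address, so look for a blank entry
          a = 1;
          dow a <= MAX_ADDR
              and Customer(c).Address(a).City <> *blanks;
            a += 1;
          enddo;
          dsply ('Customer ' + %char(c) + ' addresses: ' + %char(a - 1));
        endfor;
        *inlr = *on;
      /end-free
```

The scan only works because the DS starts out blank and because City happens to be present in every address, which is exactly the kind of fragile assumption the text above complains about.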
Luckily, those days are gone. By the simple addition of the %XML option countprefix, the parser can build a count for us of any and all repeating elements. countprefix differs slightly from datasubf in that the name we specify is used to form the first part of the field name for the count field we want to create. So if we wanted to count the number of Address elements, we might specify countprefix=count and then add a field with the name countAddress to the XML-INTO DS at the appropriate level. The resulting DS would look something like this:
     D Customer        DS                  Qualified Dim(99)
     D   Name                              LikeDS(Name)
     D   Address                           LikeDS(Address) Dim(10)
     D   countAddress                 5i 0 Inz
Notice the positioning of the count field—it must occur at the same hierarchical level in the DS as the repeating element that it relates to (in this case, Address). The position of the field within that level doesn’t matter. If it makes more sense to you to place it ahead of the Address entry, you’re free to do so. When the compiler comes across such a field definition, it no longer insists that that specific array be filled and uses the supplied field to build a count of the number of active elements in the array.
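A minimal sketch of the count field in use might look like this. It parses a single Customer element from an inline string, with a cut-down Address layout; the names xmlDoc and i, and the reduced set of subfields, are our assumptions for illustration rather than the code in Code Sample 2.

```rpgle
     D Address         DS                  Template
     D   type                        10a
     D   City                        20a

     D Customer        DS                  Qualified
     D   Address                           LikeDS(Address) Dim(10)
     D   countAddress                 5i 0 Inz

     D xmlDoc          S            500a   Varying
     D i               S              5i 0

      /free
        // Hypothetical document with two of a possible ten addresses
        xmlDoc = '<Customer>'
               + '<Address type="Mailing"><City>Knoxville</City></Address>'
               + '<Address type="Shipping"><City>Nashville</City></Address>'
               + '</Customer>';

        // No allowmissing=yes: countAddress relaxes the check for Address
        xml-into Customer %XML(xmlDoc: 'case=any countprefix=count');

        // Process only the elements the parser reports as loaded
        for i = 1 to Customer.countAddress;
          dsply Customer.Address(i).City;
        endfor;
        *inlr = *on;
      /end-free
```

Because the loop is driven by countAddress, there is no need to initialize the array and test for blank entries.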
This support can be used in other ways, and we’ll return to explore those either in a future EXTRA or our weekly blog. Just to give you a hint: the use of such count fields isn’t limited to repeating elements (i.e., arrays in RPG terms) but can be used for any element. Thus, it’s possible to easily determine if a particular element was present in the XML document simply by testing if its associated count field is non-zero. This lets us avoid using allowmissing=yes to handle expected situations, which in turn means that the default of allowmissing=no will protect us from processing bad XML streams.
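As a taste of that idea, the sketch below tests for an optional element. Phone is a hypothetical element that does not appear in our sample document, and xmlDoc is likewise our own name; treat this as an illustration of the technique rather than tested production code.

```rpgle
     D Customer        DS                  Qualified
     D   City                        20a
     D   Phone                       15a
     D   countPhone                   5i 0

     D xmlDoc          S            200a   Varying

      /free
        // Hypothetical document in which the optional Phone is omitted
        xmlDoc = '<Customer><City>Knoxville</City></Customer>';

        // allowmissing stays at its default of no; only Phone, which
        // has a count field, is allowed to be absent
        xml-into Customer %XML(xmlDoc: 'case=any countprefix=count');

        if Customer.countPhone = 0;
          dsply 'No Phone element supplied';
        else;
          dsply Customer.Phone;
        endif;
        *inlr = *on;
      /end-free
```

A missing City, by contrast, would still be reported as an error because it has no associated count field.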
Problems Solved
Several XML-INTO problems are solved with the addition of this simple PTF. Check out Code Sample 2 to see the result of the changes. Note in particular the use of these new options at (D).
But it’s not all good news. IBM has (at least for the time being) determined that this PTF will be available for release 6.1 only. Those of you running 5.4 will still have to find alternate means for solving these problems.