Jon Paris and Susan Gantner

IBM i Consultants, Developers, Educators and Apostles

Feb 28, 2011
Published on: IBM Systems

Namespace support makes the opcode a viable option

When details of version 5 release 4 were announced, many people were surprised to find that RPG had added native support for XML processing in the form of two new opcodes, XML-INTO and XML-SAX. If you’re not familiar with the topic (and the rest of this article may not make much sense unless you are), we have written a number of articles on the subject, starting with “A Traditional Approach to a Modern Technology,” which was followed by “More on RPG’s XML Support.” We also covered the subsequent enhancements, delivered via PTF for the V6 release, in “XML-INTO Revisited.” Now that we’re all on the same page in the hymnbook, we can take a look at IBM’s latest enhancements.

Despite the ease of use that this XML support appears to offer, uptake has been relatively slow. This is surprising since there has been a significant increase in the number of IBM i shops that are handling XML data as part of their daily workload. The reason for this slow adoption, as many of you may have discovered for yourselves, is that XML-INTO lacks namespace support, and namespaces are commonplace in most standard XML documents.

The simplest way to think about namespaces is that they are a means of qualifying element names, in much the same way that qualifying an RPG data structure allows you to have two different versions of the same field in a program. Why is this necessary? Because anybody and everybody can devise XML schemas; XML is, after all, a language designed explicitly for this purpose. Such flexibility raises the possibility, indeed the probability, that two different organizations will decide to use identical element names for different purposes. For example, consider names such as "value," "quantity," "name" or "address." Not only are these very common identifiers, but some may have multiple meanings depending on context. "name" might refer to the name of a company or of an individual. "address" could refer to a physical location such as "24 Main Street" or to an IP address such as 10.1.1.25. Add to this the fact that XML documents frequently contain other XML documents as the "payload" and you can see the potential for problems.

Namespaces are the mechanism by which these issues are avoided. Typically a namespace has a connection with the company or organization that originated it. For example, our company domain is Partner400.com. If we were to design an XML schema to contain our customer information, we might choose to use the namespace prefix "partner400," tied to our domain by the xmlns declaration on the outermost element. Our elements would be identified by placing the prefix, followed by a colon, immediately before the relevant element names. So, our document might look something like this:

<partner400:customer xmlns:partner400="http://www.partner400.com">
   <partner400:address>
      <partner400:street>61 Kenninghall Cres.</partner400:street>
      <partner400:city>Mississauga</partner400:city>
      <partner400:postcode>L5N 2T8</partner400:postcode>
   </partner400:address>
</partner400:customer>

As you can see, there’s no chance that our street element would be confused with a street element from the ABC company’s schema. So while XML provided a solution to the problem, until now RPG had no way of handling it. Namespaces were simply not supported. As a result, many people who should have been able to use XML-INTO were forced to seek other solutions such as XML-SAX, alternative parsers, or pre-processing the document, for example to fold the namespace into the element name so that RPG could cope with it. None of these was ideal, and in some cases people just gave up, or expended the effort to write their own parsers to deal with the inadequacies of the standard feature.

All of that changes with the new RPG runtime PTF. This is currently available for V6 (SI42426) and should shortly be available for V7. It adds several new options to the XML-INTO support specifically designed to deal with namespaces. Let’s take a look at how this is achieved.


The Namespace Option

The first is the new %XML option ns. This controls the way in which XML-INTO handles names that include a namespace prefix. When it is specified, the element name is modified before any attempt is made to match it to the names in the RPG data structures. The values available are "remove" and "merge." We’ll look at some examples of these in a moment.

  • The "remove" option causes the namespace prefix to be ignored completely, and the matching against the data structure names will take place purely against the element names. As you will see later, we can find out what the actual namespace was if it turns out we need that information.
  • The "merge" option causes RPG to simply replace the colon with an underscore. The resulting name will be used when comparing with the RPG definitions.

To relate this to our company example, ns=remove would cause the matching to be made against "street" and "city". ns=merge would cause the match to search for fields named "partner400_street" and "partner400_city." If neither option is specified, the system will behave as it does today and attempt to match a name containing a colon, which will be a futile task since a colon cannot be used in an RPG name.

Examples:

In this first example, the remove option has been specified, so the matching is against the simple names.

 D Address         DS
 D  Street                       30A   Varying
 D  City                         25A   Varying
 D  Postcode                      7A
 ......
  /Free
   XML-Into Address  %XML(MyXMLDoc1 :
       'path=customer/address doc=file ns=remove');

This time we specified the merge option, so the matching variables in the DS must have the namespace prefix. Note also the impact on the specification of the path value. The namespace (together with the underscore) must be added to ensure a match as it’s also subject to the conversion triggered by the merge option.

 D Partner400_Address...
 D                 DS
 D  Partner400_Street...
 D                               30A   Varying
 D  Partner400_City...
 D                               25A   Varying
 D  Partner400_Postcode...
 D                                7A
 .....
  /Free

   XML-Into Partner400_Address  %XML(MyXMLDoc1 :
       'path=partner400_customer/partner400_address doc=file ns=merge');

Identifying the Namespace

We mentioned earlier that sometimes you might want to use the ns=remove option, but still need to know what the actual namespace value was. This requirement is supported by the nsprefix option. The prefix specified is used in combination with the element name to derive a field name into which the value of the namespace of the element in question can be placed. The effect of this will probably be more obvious if we describe it with a simple example rather than words so let’s look at how it would work with our example document.

If we specify the option nsprefix=ns_, then if the target DS contains a field named ns_xxxx (where xxxx is the name of any element in the document), the namespace of that element is placed in that field. You can define as many such fields as you like.

This is demonstrated in this third example. We’re stripping the namespace with the remove option but also requesting via nsprefix that the actual name of the namespace associated with the element street be placed in the variable ns_Street. In all other aspects, this example is the same as the first.

 D Address         DS
 D  Street                       30A   Varying
 D  ns_Street                    10A
 D  City                         25A   Varying
 D  Postcode                      7A

  /Free

   XML-Into Address %XML(MyXMLDoc1:
                         'path=customer/address doc=file +
                          ns=remove nsprefix=ns_');
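
Once the XML-INTO has completed, ns_Street can be tested like any other subfield. The fragment below is a sketch only. It assumes the field receives the prefix exactly as it appears in the document (partner400 here) rather than the full URI, which is worth confirming against the RPG Café document before you rely on it.

   // Assumption: ns_Street now holds the prefix text, e.g. 'partner400'
   If ns_Street <> 'partner400';
      // The street element was not qualified with our own prefix
      Dsply ('Unexpected namespace: ' + ns_Street);
   EndIf;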

A New case= Option

One other issue that several RPG users have encountered concerns element names containing characters that aren’t valid in RPG field names. For example, it’s not uncommon for element names to include the hyphen, but while this character is legal in COBOL field names, it isn’t valid in RPG. As a result, such documents can’t be correctly handled by XML-INTO and require alternative or additional processing. There’s an even bigger problem in European countries, Mexico and the province of Quebec, to name a few. In these areas, many element names will include accented characters such as à, é and ñ.

This is now handled by an extension to the group of case= options. In addition to the existing values ("upper," "lower" and "any"), there’s now a "convert" option. When this is specified, any accented characters in a namespace, element or attribute name are converted to their uppercase A-Z equivalent. If after that conversion any characters remain that would be invalid in an RPG name, they’re converted to underscores. Should this conversion result in an underscore at the beginning of a name, it’s simply dropped. If there are multiple consecutive underscores, these are merged into a single underscore in the final name.
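
As a rough sketch of the accented-character case, suppose a hypothetical document (called MyXMLDoc3 here, and not part of the examples in this article) uses French element names:

<Étudiant>
   <Prénom>Élise</Prénom>
   <École>Collège Saint-Merri</École>
</Étudiant>

Assuming the conversion behaves as just described, the names should become ETUDIANT, PRENOM and ECOLE before matching, so a DS along these lines ought to receive the data. The rule applies to names only, so the accented characters in the element content are not affected by it.

 D Etudiant        DS
 D  Prenom                       25A   Varying
 D  Ecole                        50A   Varying
 .....
  /Free
   // MyXMLDoc3 is a hypothetical variable naming the document above
   XML-Into Etudiant  %XML(MyXMLDoc3 :
                         'doc=file case=convert');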

In this final example, we’ve modified the XML to include a hyphen in the name of the post code element (i.e., we changed postcode to post-code). With the convert option, this hyphen is translated to an underscore, so the element maps to the field Post_code in the DS below.

 D Address         DS
 D  Street                       30A   Varying
 D  City                         25A   Varying
 D  Post_code                     7A
 .....
  /Free
   XML-Into Address  %XML(MyXMLDoc2 :
                         'path=customer/address doc=file +
                          ns=remove case=convert');

A Viable Option

With the addition of this new support, RPG’s XML-INTO is now a viable option for processing the vast majority of XML documents. Until such time as the details are available in the RPG reference manual, the best source for the full details of this support is the document “RPG XML-INTO Namespace Options” in the RPG Café. If you’re on V7, keep an eye on this Café thread to see when the PTF becomes available.