XML Parsing Issue


JnanaR

New User


Joined: 20 Jul 2009
Posts: 23
Location: Mumbai

Posted: Sat Jul 19, 2014 12:23 pm

Hi Guys

In production we are getting an abend in a COBOL XML parsing module because one XML element contains '&' from time to time. The "&" generates an error because the parser interprets it as the start of a character entity, and we abend the program on any XML parsing error.

I suggested that the incoming XML should use CDATA to avoid the parsing error. But since the incoming XML comes from an external application, we don't have control over its structure and have to handle it in our own COBOL module.

So the requirement now is that we still use the XML parser, but for that erroneous field we have to move the value to the required field without letting the program abend. I would be thankful if anyone could suggest how to implement this.
Ed Goodman

Active Member


Joined: 08 Jun 2011
Posts: 556
Location: USA

Posted: Mon Jul 21, 2014 6:20 pm

How in the world?

The sending document does not sound as if it's a valid XML document. At the very least, the ampersand should be &amp;.

Depending on your compiler settings for XMLPARSE(), you MAY be able to handle the error with an ON EXCEPTION clause in the XML parse routine.

When you say 'it's from an external source,' do you mean another group in the same company, or do you mean a real vendor?
JnanaR


Posted: Tue Jul 22, 2014 12:12 am

Thanks Ed for the reply!!

Let me answer your 2nd question first. It's a vendor, not another group within the company.

There is a field called remittance info where we get this '&' at times. So basically the XML looks like this:
eg:

<remitinfo>transfer to xy & z co</remitinfo>

In this scenario we need to move the remittance info text 'transfer to xy & z co' to our required field. Since there is a parsing error at the '&' character, I don't think XML-TEXT will contain the complete string, so ON EXCEPTION we can't move XML-TEXT to our field as it's incomplete. That's where I am stuck.
Ed Goodman


Posted: Tue Jul 22, 2014 1:53 am

I am telling you that the XML is not properly formed. That is a problem the vendor should fix. I know that's not likely to happen though.

Can you tell me if they are correctly sending any of the OTHER replacements, like &gt; or &lt;? If not, then you may be able to do a little prescan of the data to replace the ampersand with another character.

However...using well-formed XML is kind of the point of using XML in the first place. If you took that XML document you are getting the error from, and ran it through ANY parser, you'd get the same error.

You might want to swim upstream a little bit and make sure no one is doing a character translation between you and the source. The ampersand may not be an ampersand when they sent it.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

Posted: Tue Jul 22, 2014 2:05 am

The correct approach to resolving this issue is to open a problem ticket with the vendor, reporting that you are not getting proper XML from their application.

If you insist on doing something about the improper XML in your code, you will have to scan the input data BEFORE the XML PARSE is done, finding all & and validating them (remembering, as pointed out, that & can be followed by different things in XML) while repairing any invalid ones.
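
A prescan of this kind might look something like the sketch below. It is only an illustration under assumptions: the program name ampscan, the field names, the 80-byte sample document, and the choice to rewrite a bare & as &amp; are all hypothetical and not from this thread. It recognizes only the five predefined entities and accepts any &# prefix as a numeric reference without checking the digits or the closing semicolon.

```cobol
       identification division.
       program-id. ampscan.
      * Hypothetical prescan: escape bare '&' before XML PARSE.
       data division.
       working-storage section.
      * Sample input; in practice, the document as received.
       01  xml-in          pic x(80) value
           '<remitinfo>transfer to xy & z &amp; co</remitinfo>'.
      * Worst case is every byte growing to 5 bytes, so 80 * 5.
       01  xml-out         pic x(400) value spaces.
       01  trail-cnt       pic 9(4) comp value 0.
       01  in-len          pic 9(4) comp.
       01  in-pos          pic 9(4) comp.
       01  out-pos         pic 9(4) comp value 1.
       01  probe-len       pic 9(4) comp.
       01  probe           pic x(6).
       01  ref-sw          pic x.
           88  ref-valid   value 'Y'.
           88  ref-invalid value 'N'.
       procedure division.
       main-para.
      * Length of the data without its trailing spaces.
           inspect function reverse(xml-in)
               tallying trail-cnt for leading spaces
           compute in-len = function length(xml-in) - trail-cnt
           perform varying in-pos from 1 by 1
                   until in-pos > in-len
               if xml-in(in-pos:1) = '&'
                   perform check-reference
               else
                   set ref-valid to true
               end-if
      * Copy the byte as is, or write '&amp;' for a bare '&'.
               if ref-valid
                   move xml-in(in-pos:1) to xml-out(out-pos:1)
                   add 1 to out-pos
               else
                   move '&amp;' to xml-out(out-pos:5)
                   add 5 to out-pos
               end-if
           end-perform
           if out-pos > 1
               display xml-out(1:out-pos - 1)
           end-if
           goback.
       check-reference.
      * Copy up to 6 bytes so the tests never run past the data.
           move spaces to probe
           compute probe-len = function min(6, in-len - in-pos + 1)
           move xml-in(in-pos:probe-len) to probe
           evaluate true
               when probe(1:5) = '&amp;'
               when probe(1:4) = '&lt;'
               when probe(1:4) = '&gt;'
               when probe(1:6) = '&apos;'
               when probe(1:6) = '&quot;'
      * Prefix test only: digits and ';' are not verified.
               when probe(1:2) = '&#'
                   set ref-valid to true
               when other
                   set ref-invalid to true
           end-evaluate.
```

With the sample value, the bare & after xy should come out as &amp; while the existing &amp; passes through unchanged. An entity declared in a DTD would be wrongly escaped by this check, so it only suits input known not to use any.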
Ed Goodman


Posted: Tue Jul 22, 2014 8:29 pm

You're going to love me for this... I love me more for thinking of it, and I was already pretty fond of myself.

I experimented a little bit with the XML parser. There is SOME exception handling, but it's limited. However, it is very good at its job.

I found that if I had an extra ampersand in the input, I could locate it by checking the length of the XML-TEXT register at the time of the error. This is different from the 'character position' or 'offset' given in the error message because the parser has already translated some of the document by then.

This gives me the ability to go back to the input field and replace the ampersand with a plus sign (or anything else). You can repeat as needed.

So...I just parsed the XML twice! Once with a simplistic parsing routine that really did nothing, then a real pass with the proper parsing routine. I use the ON EXCEPTION processing to do the replacing on the first pass.

Code:

01 bad-char-pos pic 9(05).

* ==> First pass: xml-checker does nothing, the parse only finds a stray '&'
XML PARSE xml-document-orig PROCESSING PROCEDURE xml-checker
  ON EXCEPTION
* ==> The length of XML-TEXT at the time of the error locates the bad character
    compute bad-char-pos = length of XML-TEXT
    subtract 1 from bad-char-pos
    if xml-document-orig(bad-char-pos:1) = '&'
        move '+' to xml-document-orig(bad-char-pos:1)
        move 0 to XML-CODE
    end-if
  NOT ON EXCEPTION
    display 'XML document successfully parsed'
END-XML

* ==> Second pass: the real parse of the (possibly repaired) document
XML PARSE xml-document-orig PROCESSING PROCEDURE xml-handler
  ON EXCEPTION
    display 'XML document error ' XML-CODE
    display 'XML Text ?' XML-TEXT '?'
  NOT ON EXCEPTION
    display 'XML document successfully parsed'
END-XML

* ==> Intentionally empty: the simplistic routine used by the first pass
xml-checker section.

xml-handler section.
   evaluate XML-EVENT
* ==> Order XML events most frequent first
     when 'START-OF-ELEMENT'
       display 'Start element tag: <' XML-TEXT '>'
       move XML-TEXT to current-element
     when 'CONTENT-CHARACTERS'
       display 'Content characters: <' XML-TEXT '>'
* ==> Transform XML content to operational COBOL data item...
       evaluate current-element
         when 'listprice'
* ==> Using function NUMVAL-C...
           compute list-price = function numval-c(XML-TEXT)
         when 'discount'
           compute discount = function numval-c(XML-TEXT)
       end-evaluate
       .
       .
       .
   end-evaluate
                                     


You can use this after modifying it a little bit to fit your situation. You could run the fake pass more than once too. Right now, it will only fix the first problem.
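
Running the do-nothing pass more than once could be sketched roughly as follows. Everything beyond the names already used in the code above is an assumption for illustration: the program name ampfix, the sample document, the scan-sw switch, and the max-fixes cap are hypothetical. The position logic is taken as posted and has not been verified independently; its behavior may depend on the XMLPARSE compiler option in effect.

```cobol
       identification division.
       program-id. ampfix.
      * Hypothetical sketch: repeat the do-nothing pass until the
      * document parses or a stray character cannot be repaired.
       data division.
       working-storage section.
      * Sample input, not from the thread.
       01  xml-document-orig pic x(80) value
           '<remitinfo>a & b & c</remitinfo>'.
       01  bad-char-pos      pic s9(05).
       01  fix-count         pic 9(03) value 0.
      * Assumed cap so a bad document cannot loop forever.
       01  max-fixes         pic 9(03) value 50.
       01  scan-sw           pic x value 'C'.
           88  scan-continue value 'C'.
           88  scan-clean    value 'Y'.
           88  scan-failed   value 'N'.
       procedure division.
       main-para.
           perform until not scan-continue
               xml parse xml-document-orig
                   processing procedure xml-checker
                 on exception
                   perform repair-ampersand
                 not on exception
                   set scan-clean to true
               end-xml
           end-perform
           if scan-clean
               display 'Clean after ' fix-count ' repair(s)'
           else
               display 'Could not repair the document'
           end-if
           goback.
       xml-checker.
      * Does nothing; this pass only provokes the error.
           continue.
       repair-ampersand.
      * Position logic as posted above; not independently verified.
           compute bad-char-pos = length of XML-TEXT - 1
      * Assume failure unless a stray '&' is found and replaced.
           set scan-failed to true
           if bad-char-pos > 0 and fix-count < max-fixes
               if xml-document-orig(bad-char-pos:1) = '&'
                   move '+' to xml-document-orig(bad-char-pos:1)
                   add 1 to fix-count
                   set scan-continue to true
               end-if
           end-if.
```

The loop stops as soon as an exception is raised by anything other than a replaceable ampersand, so other kinds of malformed input still fail rather than being silently altered. The real pass with xml-handler would follow once scan-clean is set.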
JnanaR


Posted: Thu Jul 24, 2014 1:02 pm

Thanks Guys for the replies.

Actually, I was misinformed that the XML is coming from a vendor. Some group in our company is actually creating the XML from the files they get from the vendor. So we have asked them to handle the & in proper XML format.
daveporcelan

Active Member


Joined: 01 Dec 2006
Posts: 792
Location: Pennsylvania

Posted: Thu Jul 24, 2014 5:38 pm

Why is your company creating a file in XML format, then taking the file and parsing the XML in the COBOL program?

Seems like a lot of unneeded work.

Can the 'other group' create a sequential file for your COBOL program to process?

Or can you just process the file from the vendor?

Too simple I know.
Ed Goodman


Posted: Thu Jul 24, 2014 6:52 pm

Dave,
They are probably making a Web Service out of the data. All that extra work is to achieve compatibility. They did something wrong somewhere, though.
All times are GMT + 6 Hours