Splitting XML File into different records


Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Mon Jan 09, 2012 2:09 pm

Hi,

I have a requirement to split an XML file into separate records, one per tag pair. For example, the sample record is:

<Name>ABC</Name><Address>Address1</Address><City>US</City>

I want to split it as below:

<Name>ABC</Name>
<Address>Address1</Address>
<City>US</City>

It is a variable length file of maximum length 1014.

Is this possible using DFSORT?

Thanks,
Indrajit
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Mon Jan 09, 2012 3:32 pm

Will the closing tag always be on the same record as the opening tag?
elango_K

New User


Joined: 18 Aug 2011
Posts: 44
Location: India

Posted: Mon Jan 09, 2012 5:44 pm

Yes, it is possible using DFSORT.

Try using the PARSE parameter.
Will it always be Name, Address and City only?
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Mon Jan 09, 2012 6:22 pm

The closing and opening tag will always be on the same record.

It will not be name address and city only. There are 600 of those tag records.

Thanks,
Indrajit
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Mon Jan 09, 2012 6:37 pm

Indrajit_57 wrote:
The closing and opening tag will always be on the same record.

It will not be name address and city only. There are 600 of those tag records.

Thanks,
Indrajit


You mean 600 in total, not on the same record (you said max length 1014)?

DFSORT has up to 100 parsed fields. As long as there are fewer than 100 tag sets, PARSE is a possibility.
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Mon Jan 09, 2012 7:35 pm

Can you please provide me a sample of doing so, considering 3 tag records only?
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Mon Jan 09, 2012 8:04 pm

<Name>ABC</Name><Address>Address1</Address><City>US</City>
In OUTFIL, parse each field so that it ends at a '>'.
That way you would have 3 pairs of parsed data
(each tag set would require two parsed fields).
Then put each pair back together with a literal '>' after each parsed field, using the new-record character / between pairs,
and omit the blank output records (something like OMIT 1,3,EQ,' ').

If your records have more than 3 tag sets,
then your logic would be different.

For example, you might have a potential maximum of 5 tag sets per record, where 10 of your records have 5, a few have only 2 and the rest have 3.

I don't know how PARSE works for data that is not there,
or what the parsed field (%nn) will contain if there is nothing to populate it.
Maybe the manual says something about it.
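
A rough Python illustration of the logic described above (not DFSORT syntax): cut the record at every '>', then glue consecutive pieces back together in pairs, restoring the '>' after each piece. The function name is hypothetical, and the sketch assumes that no value contains a '>' character.

```python
def split_on_closing_bracket(record: str) -> list[str]:
    """Split a record into tag sets by cutting at every '>'.

    Each tag set contributes two pieces: '<Name' and 'ABC</Name'.
    Joining each pair with the '>' characters that the split removed
    rebuilds '<Name>ABC</Name>'.
    """
    # Drop the empty piece that follows the final '>' of the record.
    pieces = [p for p in record.strip().split(">") if p]
    lines = []
    # Walk the pieces two at a time; an unpaired trailing piece is ignored.
    for i in range(0, len(pieces) - 1, 2):
        lines.append(pieces[i] + ">" + pieces[i + 1] + ">")
    return lines


if __name__ == "__main__":
    sample = "<Name>ABC</Name><Address>Address1</Address><City>US</City>"
    for line in split_on_closing_bracket(sample):
        print(line)
```

Because the pieces are collected in a list, records with fewer tag sets simply produce fewer lines, which sidesteps the question of what an unpopulated parsed field would contain.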
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Mon Jan 09, 2012 8:40 pm

Indrajit_57,

As the DFSORT guys will be along later, now is the time to get your requirement set.

Do you want a solution for three tags, with the idea that you can apply it to as many as you want, or do you want a solution for three tags which will usually have different values, or what?

Have a think, and do a good job of describing what you want, and what you will then do with the things, and if there is a possibility, some DFSORT will be woven for you, during working-hours in San Jose, California.
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

Posted: Mon Jan 09, 2012 9:31 pm

from my point of view I would go for <full> parsing

Code:
STARTAT=C'<NAME>',ENDAT=C'</NAME>'

and so on for each tag

a bit more typing but ... bulletproof

then I was able to restart parsing from the beginning of the record
to provide for arbitrary ordering ...

I was able to split everything with one entry for each line ...

but I was not able to get rid of the blank lines ...
I'll keep trying

but in this particular case, given the number of tags involved, a <programmatical> approach might be better

depending on the number of records, a REXX solution seems workable

here is the whole REXX shebang

the REXX script

Code:
/*REXX - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
/* XMLPARS - list each <tag>value</tag> pair found in XMLTEXT        */
/* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
Trace "O"
Parse Source _sys _how _cmd .
say left(_cmd,8)"- Started"
/* read the list of tag names from DD XMLTAGS */
execio =  "EXECIO * DISKR XMLTAGS ( STEM tags. FINIS"
if  0 ¬=  $tsoex(execio) then do
    say left(_cmd,8)"- error reading tags"
    exit 4
end
say "tags read" tags.0
/* read the XML records from DD XMLTEXT */
execio =  "EXECIO * DISKR XMLTEXT ( STEM text. FINIS"
if  0 ¬=  $tsoex(execio) then do
    say left(_cmd,8)"- error reading text"
    exit 4
end
say "text read" text.0
/* for every record, try every tag (a tags record may hold several) */
do itext = 1 to text.0
   text = text.itext
   do  itags = 1 to tags.0
       tags = space(tags.itags)
       do  itag = 1 to words(tags)
           tag = strip(word(tags,itag))
           /* build and run: parse var text . '<tag>' val '</tag>' . */
           par  = "parse var text . '<"tag">' val '</"tag">' ."
           interpret par
           if  strip(val) ¬= "" then ,
               say left(text,40) left(tag,8) "<"tag">"val"</"tag">"
       end
   end
end
say left(_cmd,8)"- Ended"
Exit 0

/* run a TSO command with tracing off and return its return code */
$tsoex:
   tso_0tr = trace("O")
   Address TSO arg(1)
   tso_0rc = rc
   trace value(tso_0tr)
   return tso_0rc

the execution jcl
Code:
//ENRICO1  JOB NOTIFY=&SYSUID,
//             REGION=0M,
//             MSGLEVEL=(1,1),CLASS=A,MSGCLASS=X
//*
//XML     EXEC PGM=IKJEFT01,PARM=XMLPARS
//SYSPROC   DD DISP=SHR,DSN=ENRICO.ISPF.EXEC
//SYSPRINT  DD SYSOUT=*
//SYSTSPRT  DD SYSOUT=*
//SYSTSIN   DD DUMMY
//XMLTAGS   DD DISP=SHR,DSN=ENRICO.XMLTAGS.TXT
//XMLTEXT   DD DISP=SHR,DSN=ENRICO.XMLTEXT.TXT


the input tags
Code:
name
address
city
zip

the input text
Code:
<address>addr1</address>
<address>addr1</address><city>city1</city>
<address>addr2</address>
<address>addr2</address><city>city2</city>
<address>addr3</address>
<address>addr3</address><city>city3</city>
<address>addr4</address>
<address>addr4</address><city>city4</city>
<address>addr5</address>
<address>addr5</address><city>city5</city>
<city>city1</city>
<city>city2</city>
<city>city3</city>
<city>city4</city>
<city>city5</city>
<name>name1</name>
<name>name1</name><address>addr1</address><city>city1</city>
<name>name1</name><city>city1</city>
<name>name2</name>
<name>name2</name><address>addr2</address><city>city2</city>
<name>name2</name><city>city2</city>
<name>name3</name>
<name>name3</name><address>addr3</address><city>city3</city>
<name>name3</name><city>city3</city>
<name>name4</name>
<name>name4</name><address>addr4</address><city>city4</city>
<name>name4</name><city>city4</city>
<name>name5</name>
<name>name5</name><address>addr5</address><city>city5</city>
<name>name5</name><city>city5</city>


the result
Code:
********************************* TOP OF DATA **********************************
XMLPARS - Started                                                               
tags read 4                                                                     
text read 30                                                                   
<address>addr1</address>                 address  <address>addr1</address>     
<address>addr1</address><city>city1</cit address  <address>addr1</address>     
<address>addr1</address><city>city1</cit city     <city>city1</city>           
<address>addr2</address>                 address  <address>addr2</address>     
<address>addr2</address><city>city2</cit address  <address>addr2</address>     
<address>addr2</address><city>city2</cit city     <city>city2</city>           
<address>addr3</address>                 address  <address>addr3</address>     
<address>addr3</address><city>city3</cit address  <address>addr3</address>     
<address>addr3</address><city>city3</cit city     <city>city3</city>           
<address>addr4</address>                 address  <address>addr4</address>     
<address>addr4</address><city>city4</cit address  <address>addr4</address>     
<address>addr4</address><city>city4</cit city     <city>city4</city>           
<address>addr5</address>                 address  <address>addr5</address>     
<address>addr5</address><city>city5</cit address  <address>addr5</address>     
<address>addr5</address><city>city5</cit city     <city>city5</city>           
<city>city1</city>                       city     <city>city1</city>           
<city>city2</city>                       city     <city>city2</city>           
<city>city3</city>                       city     <city>city3</city>           
<city>city4</city>                       city     <city>city4</city>           
<city>city5</city>                       city     <city>city5</city>           
<name>name1</name>                       name     <name>name1</name>           
<name>name1</name><address>addr1</addres name     <name>name1</name>           
<name>name1</name><address>addr1</addres address  <address>addr1</address>     
<name>name1</name><address>addr1</addres city     <city>city1</city>           
<name>name1</name><city>city1</city>     name     <name>name1</name>           
<name>name1</name><city>city1</city>     city     <city>city1</city>           
<name>name2</name>                       name     <name>name2</name>           
<name>name2</name><address>addr2</addres name     <name>name2</name>           
<name>name2</name><address>addr2</addres address  <address>addr2</address>     
<name>name2</name><address>addr2</addres city     <city>city2</city>           
<name>name2</name><city>city2</city>     name     <name>name2</name>           
<name>name2</name><city>city2</city>     city     <city>city2</city>           
<name>name3</name>                       name     <name>name3</name>           
<name>name3</name><address>addr3</addres name     <name>name3</name>           
<name>name3</name><address>addr3</addres address  <address>addr3</address>     
<name>name3</name><address>addr3</addres city     <city>city3</city>           
<name>name3</name><city>city3</city>     name     <name>name3</name>           
<name>name3</name><city>city3</city>     city     <city>city3</city>           
<name>name4</name>                       name     <name>name4</name>           
<name>name4</name><address>addr4</addres name     <name>name4</name>           
<name>name4</name><address>addr4</addres address  <address>addr4</address>     
<name>name4</name><address>addr4</addres city     <city>city4</city>           
<name>name4</name><city>city4</city>     name     <name>name4</name>           
<name>name4</name><city>city4</city>     city     <city>city4</city>           
<name>name5</name>                       name     <name>name5</name>           
<name>name5</name><address>addr5</addres name     <name>name5</name>           
<name>name5</name><address>addr5</addres address  <address>addr5</address>     
<name>name5</name><address>addr5</addres city     <city>city5</city>           
<name>name5</name><city>city5</city>     name     <name>name5</name>           
<name>name5</name><city>city5</city>     city     <city>city5</city>           
XMLPARS - Ended                                                                 
READY                                                                           
END                                                                             
******************************** BOTTOM OF DATA ********************************



it might not be the final solution,
but it is a strong proof of concept
and a demonstration of REXX parsing capabilities through the use of the INTERPRET instruction
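
For readers without REXX at hand, the inner loop of the exec above can be approximated in Python. This is only a sketch: the function name is hypothetical, and unlike REXX PARSE (which would assign the rest of the record to val when the closing tag is missing) it simply skips a tag whose closing tag is not found.

```python
def extract_tags(record: str, tags: list[str]) -> list[str]:
    """Return '<tag>value</tag>' for every listed tag present in the record.

    Results follow the order of the tag list, not the order in the record,
    and tag matching is case sensitive, as in the exec.
    """
    found = []
    for tag in tags:
        open_tag, close_tag = f"<{tag}>", f"</{tag}>"
        start = record.find(open_tag)
        if start < 0:
            continue  # opening tag not in this record
        value_start = start + len(open_tag)
        end = record.find(close_tag, value_start)
        if end < 0:
            continue  # no closing tag after the opening tag
        value = record[value_start:end]
        if value.strip():  # ignore empty or blank values, like the exec
            found.append(f"{open_tag}{value}{close_tag}")
    return found


if __name__ == "__main__":
    tag_list = ["name", "address", "city", "zip"]
    for line in extract_tags("<name>name1</name><city>city1</city>", tag_list):
        print(line)
```

Like the exec, this needs the full list of tag names up front, which is the price of handling tags in arbitrary order.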
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Tue Jan 10, 2012 3:07 am

Indrajit,

How you would (or if you could) do something like this with PARSE would depend on exactly what variations the records could have in terms of the pattern. If each record always starts with <tag1> and ends with </tagn> (that is, you never have <tagx> in one record and </tagx> in another record), you might be able to PARSE for each '><' pair in the record (up to the maximum number of pairs in a record) and break it between '>' and '<'. The last %n present would get the last tag pair (up to the end of the record), and any %n fields not present after that would be blanks (that is, they would not start with <), so they could be eliminated. So you wouldn't really need to code each specific tag name.

Quote:
but in this particular case given the number of tags involved a <programmatical> approach might be better


I tend to agree.

But if you want an actual DFSORT solution, show a much more detailed example of what your input records can look like, give the maximum number of tag pairs per record, etc.
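
The '><' idea described in this post can be illustrated with a few lines of Python. This is a sketch of the logic only, not DFSORT syntax. The function name is hypothetical, and it assumes that every record starts with an opening tag and ends with a closing tag and that no element is empty (an empty element such as <a></a> contains '><' itself and would be cut in two).

```python
def split_at_tag_boundaries(record: str) -> list[str]:
    """Break a record between '>' and '<' wherever '><' occurs."""
    record = record.rstrip()
    if not record:
        return []
    # Insert a line break inside every '><' pair, keeping both characters.
    return record.replace("><", ">\n<").split("\n")


if __name__ == "__main__":
    sample = "<Name>ABC</Name><Address>Address1</Address><City>US</City>"
    for line in split_at_tag_boundaries(sample):
        print(line)
```

No tag names appear anywhere in the function, which is the point made above: the boundary between tag pairs is enough.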
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Tue Jan 10, 2012 5:02 pm

Hi Frank,

<Name>ABC</Name><Address>Address1</Address><City>US</City><DateOfBirth>1983-03-03</DateOfBirth><LocaleCode>VFC</LocaleCode><NameLine1>AYUSH </NameLine1>
<SequenceNum>00000018</SequenceNum><StatusCode>I</StatusCode><Status>Married</Status>
<MailID>abc</MailID><Age>23</Age><Telephone>123456</Telephone><Occupation>Student</Occupation>

I want to split it as below:

<Name>ABC</Name>
<Address>Address1</Address>
<City>US</City>
<DateOfBirth>1983-03-03</DateOfBirth>
<LocaleCode>VFC</LocaleCode>
.
.
.
.
<Occupation>Student</Occupation>

It is a variable length file of maximum length 1014. There can be a maximum of 20 tag pairs per record.

Thanks,
Indrajit
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Tue Jan 10, 2012 5:27 pm

If you are so sure there is a maximum of 20, it sounds like a program is setting that maximum in some way. Any chance of changing that to one, to save all the bother?
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Tue Jan 10, 2012 5:29 pm

No we cannot change that to 1.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Wed Jan 11, 2012 1:03 am

Here's a DFSORT/ICETOOL job that will do what you asked for. I assumed the largest <tag>info</tag> string is 50 bytes.

Code:

//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DSN=...  input file (VB/1015)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD LRECL=1015,DSN=...  output file (VB/1015)
//TOOLIN DD *
COPY FROM(IN) USING(CTL1)
COPY FROM(T1) USING(CTL2)
//CTL1CNTL DD *
  OPTION COPY
  OUTFIL FNAMES=T1,
   PARSE=(%01=(ENDBEFR=C'><',FIXLEN=50),
                %02=(ENDBEFR=C'><',FIXLEN=50),
                %03=(ENDBEFR=C'><',FIXLEN=50),
                %04=(ENDBEFR=C'><',FIXLEN=50),
                %05=(ENDBEFR=C'><',FIXLEN=50),
                %06=(ENDBEFR=C'><',FIXLEN=50),
                %07=(ENDBEFR=C'><',FIXLEN=50),
                %08=(ENDBEFR=C'><',FIXLEN=50),
                %09=(ENDBEFR=C'><',FIXLEN=50),
                %10=(ENDBEFR=C'><',FIXLEN=50),
                %11=(ENDBEFR=C'><',FIXLEN=50),
                %12=(ENDBEFR=C'><',FIXLEN=50),
                %13=(ENDBEFR=C'><',FIXLEN=50),
                %14=(ENDBEFR=C'><',FIXLEN=50),
                %15=(ENDBEFR=C'><',FIXLEN=50),
                %16=(ENDBEFR=C'><',FIXLEN=50),
                %17=(ENDBEFR=C'><',FIXLEN=50),
                %18=(ENDBEFR=C'><',FIXLEN=50),
                %19=(ENDBEFR=C'><',FIXLEN=50),
                %20=(ENDBEFR=C'><',FIXLEN=50)),
   BUILD=(1,4,%01,JFY=(SHIFT=LEFT,TRAIL=C'>'),/,
    1,4,%02,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%03,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%04,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%05,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%06,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%07,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%08,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%09,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%10,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%11,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%12,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%13,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%14,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%15,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%16,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%17,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%18,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%19,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'),/,
    1,4,%20,JFY=(SHIFT=LEFT,LEAD=C'<',TRAIL=C'>'))
//CTL2CNTL DD *
  OMIT COND=(5,2,CH,EQ,C'<>')
  INREC FINDREP=(IN=C'>>',OUT=C'>')
  OUTFIL FNAMES=OUT,VLTRIM=C' '
/*
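
To make the data flow of the job above easier to follow, here is a rough Python emulation of its two passes. It is a sketch under assumptions, not a statement of how DFSORT works internally: the names are hypothetical, the JFY step is approximated by stripping blanks and adding the LEAD and TRAIL strings, and details such as the RDW in positions 1-4 are ignored.

```python
MAX_FIELDS = 20  # %01 through %20 in CTL1
FIXLEN = 50      # FIXLEN=50: longer data is truncated, shorter data padded


def emulate_job(record: str) -> list[str]:
    """Approximate CTL1 (PARSE and BUILD) followed by CTL2 for one record."""
    # PARSE: each field ends before the next '><'; absent fields are blank.
    pieces = record.rstrip().split("><")[:MAX_FIELDS]
    fields = [p[:FIXLEN] for p in pieces]
    fields += [""] * (MAX_FIELDS - len(fields))

    lines = []
    for index, field in enumerate(fields):
        # BUILD with JFY: the first field gets only TRAIL=C'>',
        # the others get LEAD=C'<' and TRAIL=C'>'.
        lead = "" if index == 0 else "<"
        line = lead + field.strip() + ">"
        # CTL2 OMIT: a blank parsed field has become '<>', so drop it.
        if line.startswith("<>"):
            continue
        # CTL2 FINDREP: the last field present already ended with '>',
        # so the added TRAIL produced '>>'; collapse it.
        # VLTRIM=C' ' is approximated by rstrip().
        lines.append(line.replace(">>", ">").rstrip())
    return lines


if __name__ == "__main__":
    sample = "<Name>ABC</Name><Address>Address1</Address><City>US</City>"
    for line in emulate_job(sample):
        print(line)
```

The emulation also makes the two sizing assumptions visible: data beyond FIXLEN bytes in a field and tag pairs beyond the 20th are silently lost, so both limits should cover the real data.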
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Thu Jan 12, 2012 9:32 am

Thanks Frank, it worked perfectly for me.
All times are GMT + 6 Hours