IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

How to Parse the Nth column directly from a csv file


IBM Mainframe Forums -> DFSORT/ICETOOL
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
Chaketh Ram

New User


Joined: 16 Nov 2011
Posts: 7
Location: India

PostPosted: Thu May 03, 2012 7:18 pm
Reply with quote

There is a csv file with more than 80 columns. The concerned Columns are say 10th, 15th, 70th and 78th one.


To parse the 10th col from the file,generally the following sort card is used
Code:

   SORT FIELDS=COPY
   OUTREC PARSE=(%=(ENDBEFR=C','),
   %=(ENDBEFR=C','),
   %=(ENDBEFR=C','),
   %=(ENDBEFR=C','),
   %=(ENDBEFR=C','),
   %=(ENDBEFR=C','),
   %=(ENDBEFR=C','),
   %=(ENDBEFR=C','),
   %=(ENDBEFR=C','),
   %10=(ENDBEFR=C',',FIXLEN=45)),
   BUILD=(%10)



Using this approach, to parse and get the 70th column, the keywords %=(ENDBEFR=C','), has to be repeated so many times.

Other than this approach,how to parse and get directly the 10th, 15th, 70th and 78th columns ?

Thanks - Chaketh
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Thu May 03, 2012 7:24 pm
Reply with quote

That's what R in the prefix area is for.
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Thu May 03, 2012 7:49 pm
Reply with quote

Bill,
are you referring to the line command "R"?
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Thu May 03, 2012 7:57 pm
Reply with quote

Yep. Quickest way I know to get 100 identical statements and then just change four of them, or whatever the acutal numbers need to be.
Back to top
View user's profile Send private message
Chaketh Ram

New User


Joined: 16 Nov 2011
Posts: 7
Location: India

PostPosted: Thu May 03, 2012 8:01 pm
Reply with quote

Bill,
Replicating %=(ENDBEFR=C','), for n times and changing the required 4/5 lines could be done.

Just wanted to know is there any smarter way of doing this other than replicating...
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Thu May 03, 2012 8:16 pm
Reply with quote

PARSE uses a pointer to address the record,
and this pointer needs to be incremented.

i can not think if any language,
which would allow you to address the 80th something of a variable record
without either employing
  • loop-logic for an index (pointer)
  • data definitions for the 1-79th occurance


while DFSORT is pretty flexible,
it does not provide the looping capabilities to bump a pointer x-number of times.
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Thu May 03, 2012 8:22 pm
Reply with quote

I don't know if it is "smart" to try this. To pick out J, O and Z:

Code:
//PARSEIT  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTOUT  DD SYSOUT=*
//SYSIN DD *
  OPTION COPY
  INREC IFTHEN=(WHEN=INIT,
                FINDREP=(IN=C',',OUT=C' ',DO=8)),
        IFTHEN=(WHEN=INIT,
                PARSE=(%01=(STARTAFT=C',',ENDBEFR=C',',FIXLEN=1))),
        IFTHEN=(WHEN=INIT,
                FINDREP=(IN=C',',OUT=C' ',DO=5)),
        IFTHEN=(WHEN=INIT,
                PARSE=(%02=(STARTAFT=C',',ENDBEFR=C',',FIXLEN=1))),
        IFTHEN=(WHEN=INIT,
                FINDREP=(IN=C',',OUT=C' ',DO=11)),
        IFTHEN=(WHEN=INIT,
                PARSE=(%03=(STARTAFT=C',',ENDBEFR=C',',FIXLEN=1))),
        IFTHEN=(WHEN=INIT,
         BUILD=(C'>',%01,C'<>',%02,C'<>',%03,C'<'))
//SORTIN DD *
A,B,C,D,E,F,G,R,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z


Code:
>J<>O<>Z<


Not going to work with commas embedded within quotes (not directly).

For 80 fields, I'm not sure I'd use it. To extend PARSE beyond one hundred delimited fields, maybe worth a go.
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

PostPosted: Thu May 03, 2012 8:47 pm
Reply with quote

Chaketh Ram,
I don't think there is an easy way to achieve this.
Quote:

%
Specifies a parsed field to be ignored. No data is extracted, but the starting point for the next parsed field advances according to the suboperands specified. Use % when you don't need the data from a particular field, but you do need to get to the next field. For example, if we had the four CSV fields shown earlier as input, but we only wanted to extract the first and fourth fields, we could use this OUTFIL statement: OUTFIL PARSE=(%01=(ENDBEFR=C',',FIXLEN=5),
%=(ENDBEFR=C','),
%=(ENDBEFR=C','),
%04=(FIXLEN=5)),
BUILD=(1:%01,11:%04,21:%01,HEX)


Data is extracted for %01 (first field) and %04 (fourth field), but not for % (second and third fields).


I am thinking, your best bet (if you don't want to code),looks like is to build PARSE statement dynamically.

Thanks,
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Thu May 03, 2012 8:50 pm
Reply with quote

MF,
stepped in it that time:
Code:
,DO=8)),
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Fri May 04, 2012 8:48 pm
Reply with quote

I had another little dabble with this. The only benefits so far seems to be few lines of code vs many lines of code, and with the numbers easier to know which field rather than "counting" the %s. I've only tested with one-byte fields and only up to 130. The % easily extends that far anyway, still leaving 100 numbered parsed variables available. The parsing is faster, though not hugely.
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DFSORT/ICETOOL

 


Similar Topics
Topic Forum Replies
No new posts Binary File format getting change whi... All Other Mainframe Topics 7
No new posts Compare 2 files and retrive records f... DFSORT/ICETOOL 3
No new posts FTP VB File from Mainframe retaining ... JCL & VSAM 8
No new posts Replacing 'YYMMDD' with date, varying... SYNCSORT 3
No new posts Extract the file name from another fi... DFSORT/ICETOOL 6
Search our Forums:

Back to Top