Split Records based on a input dataset

 
Nimesh.Srivastava

New User


Joined: 30 Nov 2006
Posts: 78
Location: SINGAPORE

Posted: Fri Jun 13, 2008 4:40 pm    Post subject: Split Records based on a input dataset

Hi,
I have a requirement involving two files, both RECFM=FB, LRECL=80.

File 1
abcd efg ijk
iernf jf urinc
12345 fgrtjk doenf

uri cnf gj d o
abc

File 2
fg
do

Two output files should then be created, both RECFM=FB, LRECL=80.
Out 1
abcd efg ijk
12345 fgrtjk doenf

Out 2
iernf jf urinc
uri cnf gj d o
abc

That is, File 1 is split into two outputs based on File 2, where each line of File 2 is used as a pattern: records of File 1 that contain a pattern go to Out 1, and all other records go to Out 2.
The pattern string(s) in File 2 can occur anywhere within a record of File 1.

Thanks in advance
Nimesh

vvmanyam

New User


Joined: 16 Apr 2008
Posts: 86
Location: Bangalore

Posted: Fri Jun 13, 2008 7:28 pm    Post subject: Reply to: Split Records based on a input dataset

Hi Srivastava,

Does File 2 have only 2 records every time,
or might there be any number of records?

Thanks,
Balu
Nimesh.Srivastava

New User


Joined: 30 Nov 2006
Posts: 78
Location: SINGAPORE

Posted: Sun Jun 15, 2008 3:24 pm    Post subject: Reply to: Split Records based on a input dataset

Hi vvmanyam,
It can have 'n' records, not just two. Also, the position of such strings in File 1 is variable, i.e. they can be anywhere within the line.
Thanks
Nimesh
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Mon Jun 16, 2008 4:26 am

Nimesh,

You need to do a better job of explaining the "rules". You seem to show a pattern file2 that has 2-character strings that can appear anywhere in file1. Does any match of characters from file2 in file1 mean you want that file1 record for output? We have no way of knowing if file2 can only have a single 2-character string in each record, or different length strings in each record, or multiple strings in each record or what. You need to be very specific about what file2 can look like and what you're trying to match in file1 from file2.
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Mon Jun 16, 2008 5:05 am

Hello,

In addition to posting better "rules", it would be a big help if you posted more realistic sample data rather than alphabet soup. If your real data must not be disclosed, fine, but realistic sample data can still be posted.

If you explain the business requirement, I believe it will be much more understandable to all who read this.
Nimesh.Srivastava

New User


Joined: 30 Nov 2006
Posts: 78
Location: SINGAPORE

Posted: Mon Jun 16, 2008 11:57 am

Hi All,
Thanks for the input. Please find the details below.
Rules
1. The pattern file, i.e. File2, can have multiple records, but each line contains exactly one pattern, and each pattern is unique within File2. The length of a pattern can vary [the file is FB, so the remainder of each record is spaces].
Code:
ex file2
9611-1963-7941.........................
9611-1965-7941.........................
961179657951...........................
7611-4876-1342.........................
NV-0150128090..........................


2. The pattern can occur anywhere within a record in File1, and a record may contain more than one pattern.
Code:
File1
ABC...9611-1963-7941...XYZ..............
9611-1965-7941...XYZ...FGH..............
ABC...961179657951...XYZNV-0150128090...


In the above case, record 3 of File1 contains 2 patterns, so an intermediate step in the JCL may return this record twice, but the final file Out1 would be sorted with duplicates removed.
Out2 would be prepared after the final Out1 has been sorted; Out2 is the difference between File1 and Out1.
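
The intermediate duplicate and the final de-duplication described in rule 2 can be illustrated with a small Python sketch. It is only an illustration of the rule, not a DFSORT solution. The function name `hits_per_pattern` is made up for this example, and sorting on the `cp037` encoding is an assumption meant to mimic an EBCDIC collating sequence (letters before digits).

```python
def hits_per_pattern(records, patterns):
    """Pattern-major search: a record is reported once per pattern it contains."""
    hits = []
    for pattern in patterns:
        for record in records:
            if pattern in record:
                hits.append(record)
    return hits


file1 = [
    "ABC...9611-1963-7941...XYZ..............",
    "9611-1965-7941...XYZ...FGH..............",
    "ABC...961179657951...XYZNV-0150128090...",
]
file2 = [
    "9611-1963-7941",
    "9611-1965-7941",
    "961179657951",
    "7611-4876-1342",
    "NV-0150128090",
]

# Record 3 contains two patterns, so it shows up twice in the raw hits.
hits = hits_per_pattern(file1, file2)

# Remove duplicates, then sort. Encoding to cp037 (an assumed EBCDIC code
# page) makes the order resemble a sort done on the mainframe.
out1 = sorted(set(hits), key=lambda rec: rec.encode("cp037"))

for record in out1:
    print(record)
```

Testing each record against all patterns in a single pass over File1 would avoid the duplicates in the first place, at the cost of not mirroring the intermediate step described above.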

3. The search for each pattern has to be an exact search. For example, File1 may have records like
Code:
ABC...9611-1963-7941...XYZ
ABC...96 11-1963-7 941...XYZ


Only the first record should be returned as a match:
Code:
ABC...9611-1963-7941...XYZ
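
Rules 1 and 3 together amount to a plain substring test on the pattern with its FB padding removed. A minimal Python sketch, assuming the padding is trailing spaces up to column 80 and using a hypothetical helper name `contains_pattern`:

```python
def contains_pattern(record, pattern_record):
    """Exact substring search for a pattern read from a space-padded FB record."""
    pattern = pattern_record.rstrip()  # drop the trailing pad spaces
    # An all-blank pattern record is treated as "no pattern" (an assumption).
    return bool(pattern) and pattern in record


padded_pattern = "9611-1963-7941".ljust(80)  # as it would sit in an LRECL=80 record

exact = contains_pattern("ABC...9611-1963-7941...XYZ", padded_pattern)
broken = contains_pattern("ABC...96 11-1963-7 941...XYZ", padded_pattern)
print(exact, broken)
```

Without the `rstrip()` the search string would carry its pad spaces and would match almost nothing, which is the same concern as parsing the strings before a FIND.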


Now, for example,
Code:
File1
ABC...9611-1963-7941...XYZ..............
9611-1965-7941...XYZ...FGH..............
ABC...961179657951...XYZNV-0150128090...
RTY...JKLCDP............................
ASERT...7611-4876 1342VBHGMN............


Code:
and File2
9611-1963-7941
9611-1965-7941
961179657951
7611-4876-1342
NV-0150128090


would result in two files:
Code:
Out1
ABC...9611-1963-7941...XYZ..............
ABC...961179657951...XYZNV-0150128090...
9611-1965-7941...XYZ...FGH..............


Code:
Out2
ASERT...7611-4876 1342VBHGMN............
RTY...JKLCDP............................


Please let me know if any more information is required
Thanks
Nimesh
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Tue Jun 17, 2008 1:56 am    Post subject: Reply to: Split Records based on a input dataset

Nimesh.Srivastava,

You haven't defined the rules of the pattern. How did you determine that only record 3 of file 1 has a pattern? Unless you come up with clear-cut rules for picking the pattern, there is nothing that we can help you with. You are better off writing a program.
Nimesh.Srivastava

New User


Joined: 30 Nov 2006
Posts: 78
Location: SINGAPORE

Posted: Tue Jun 17, 2008 8:02 am

Hi Kolusu,
Thanks for the reply.
Record 3 of File1 has multiple matching patterns in it [as per Rule 2].
There isn't a fixed position where a pattern may or may not occur in File1.
It's like searching on Windows for all the files in a particular folder [here, the records of File1] that contain "A word or phrase in the file" [here, each record of File2].

Yes, programming-wise it's like:

for each record Rec_File2 of File2 (read till end of File2)
    for each record Rec_File1 of File1 (read till end of File1)
        if strstr(Rec_File1, Rec_File2) [searches for an occurrence of string
           Rec_File2 in string Rec_File1] returns FOUND
            then write Rec_File1 to Out1
    end for-loop over File1
end for-loop over File2

Sort Out1 and compare Out1 with File1; report the records that exist in File1 but not in Out1 to Out2.
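
The pseudocode above can be sketched in Python as a single pass over File1, which produces Out1 and Out2 together and needs no separate sort-and-compare step. This is an illustrative sketch rather than a tuned solution: the name `split_records` is hypothetical, and skipping blank pattern records is an assumption.

```python
def split_records(records, pattern_records):
    """Split records into (out1, out2): those containing any pattern, and the rest."""
    # Strip the FB padding once; ignore pattern records that are entirely blank.
    patterns = [p.rstrip() for p in pattern_records if p.strip()]
    out1, out2 = [], []
    for record in records:
        if any(pattern in record for pattern in patterns):
            out1.append(record)  # written once, however many patterns match
        else:
            out2.append(record)
    return out1, out2


file1 = [
    "ABC...9611-1963-7941...XYZ..............",
    "9611-1965-7941...XYZ...FGH..............",
    "ABC...961179657951...XYZNV-0150128090...",
    "RTY...JKLCDP............................",
    "ASERT...7611-4876 1342VBHGMN............",
]
file2 = [
    "9611-1963-7941",
    "9611-1965-7941",
    "961179657951",
    "7611-4876-1342",
    "NV-0150128090",
]

out1, out2 = split_records(file1, file2)
print("Out1:", *out1, sep="\n")
print("Out2:", *out2, sep="\n")
```

The work is still proportional to the number of File1 records times the number of patterns, so the cost concern for a very big File1 remains.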

My concern is that if File1 is very big, doing this programmatically could be time-consuming; hence I am looking for a DFSORT solution.
Please let me know if the rules are OK now.
Thanks
Nimesh
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Tue Jun 17, 2008 10:12 pm

Nimesh.Srivastava,

I don't think it is possible to do what you are asking with the existing features of sort.
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Tue Jun 17, 2008 11:16 pm

Hello Nimesh.Srivastava,

Quote:
but my concern was if File1 is very big then it could be time consuming to do the same programmatically
No matter how you do this, it will consume a lot of CPU time.

The good news is that it would not be difficult to code in COBOL :) It would be better to use multiple steps rather than trying to "do it all" in one bit of code.
Venkat1001

New User


Joined: 25 Feb 2008
Posts: 12
Location: chennai

Posted: Wed Jun 18, 2008 8:05 am    Post subject: Reply to: Split Records based on a input dataset

Hi Nimesh,
I have tried this through REXX. Please have a look at this code.

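Main pgm
/* REXX - edit File1 with the COMPMAC macro, then view the result */
COMP:
ADDRESS TSO
DSN1 = DATASET 1                 /* placeholder for the File1 name */
"ISPEXEC EDIT DATASET ('"DSN1"') MACRO (COMPMAC)"
"ISPEXEC VIEW DATASET ('"DSN1"')"
RETURN

COMPMAC (keep this macro in the same REXX library)
/* REXX - keep only the lines that contain a string from File2 */
DSN2 = DATASET 2                 /* placeholder for the File2 name */
"ALLOC FI(REPT) DA('"DSN2"') SHR REUSE"
"EXECIO * DISKR REPT (STEM INP2. FINIS)"
"FREE FI(REPT)"
ADDRESS TSO ISREDIT MACRO
"ISREDIT X ALL"                  /* exclude every line first */
DO J = 1 TO INP2.0
  INP = STRIP(INP2.J)            /* drop the FB padding from the pattern */
  IF INP <> '' THEN              /* skip blank pattern records */
    "ISREDIT F ALL" '"'INP'"'    /* FIND shows the lines that match */
END
"ISREDIT DELETE ALL X"           /* delete lines that matched nothing */
"ISREDIT SAVE"                   /* note: this overwrites File1 */
"ISREDIT END"
RETURN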

This way you get all the records of the first file that match a string from the second file, saved in the first file itself. Please make sure that you parse the strings from the second file when you do an F ALL.

Regards
venkat
Nimesh.Srivastava

New User


Joined: 30 Nov 2006
Posts: 78
Location: SINGAPORE

Posted: Wed Jun 18, 2008 10:22 am

Hi All,
I tried another way of handling this requirement by combining SUPERC with DFSORT.
I can dynamically create a JCL from my C/C++ program using the patterns in File2 as parameters, which looks like

Code:

//SEARCH  EXEC PGM=ISRSUPC,
//            PARM=(SRCHCMP,
//            'ANYC')
//NEWDD  DD *
ABC...9611-1963-7941...XYZ..............
9611-1965-7941...XYZ...FGH..............
RTY...JKLCDP............................
ABC...961179657951...XYZNV-0150128090...
ASERT...7611-4876 1342VBHGMN............
/*
//OUTDD  DD  DSNAME=DRST.CD.BK.NIM1,
//             SPACE=(CYL,(30,30)),DISP=(NEW,CATLG,DELETE),
//             DCB=(RECFM=FB,LRECL=133)
//SYSIN  DD *
SRCHFOR  '9611-1963-7941'
SRCHFOR  '9611-1965-7941'
SRCHFOR  '961179657951'
SRCHFOR  '7611-4876-1342'
SRCHFOR  'NV-0150128090'
/*
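
The SRCHFOR statements in the SYSIN above could be generated from File2 along these lines. This Python sketch is only an illustration of the idea, not the C/C++ program mentioned in this post. The name `srchfor_cards` is hypothetical, the 72-column limit is taken from the "USING COLUMNS 1:72" line in the listing, and patterns containing an apostrophe are rejected rather than quoted, because the quoting rule is not covered in this thread.

```python
def srchfor_cards(pattern_records, max_columns=72):
    """Build one SRCHFOR statement per non-blank pattern record."""
    cards = []
    for raw in pattern_records:
        pattern = raw.rstrip()  # drop the FB padding
        if not pattern:
            continue  # skip blank records
        if "'" in pattern:
            # Assumption: quote handling is out of scope for this sketch.
            raise ValueError(f"pattern contains an apostrophe: {pattern!r}")
        card = f"SRCHFOR  '{pattern}'"
        if len(card) > max_columns:
            raise ValueError(f"statement does not fit in {max_columns} columns")
        cards.append(card)
    return cards


file2 = [
    p.ljust(80)
    for p in (
        "9611-1963-7941",
        "9611-1965-7941",
        "961179657951",
        "7611-4876-1342",
        "NV-0150128090",
    )
]

cards = srchfor_cards(file2)
print("\n".join(cards))
```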


This gives me, in OUTDD, a listing that contains the matching lines from File1 along with other output. It looks like this:

Code:
 
OUTPUT
 LINE-#  SOURCE SECTION                    SRCH DSN:
       1  ABC...9611-1963-7941...XYZ..............
       2  9611-1965-7941...XYZ...FGH..............
       4  ABC...961179657951...XYZNV-0150128090...

1  ISRSUPC   -   MVS/PDF FILE/LINE/WORD/BYTE/SFOR COMPARE UTILITY- ISPF
      SEARCH-FOR SUMMARY SECTION            SRCH DSN:

 LINES-FOUND  LINES-PROC  DATASET-W/LNS  DATASET-WO/LNS  COMPARE-COLS  L
         3            5            1              0           1:80

 PROCESS OPTIONS USED: ANYC

 THE FOLLOWING PROCESS STATEMENTS (USING COLUMNS 1:72) WERE PROCESSED:
    SRCHFOR  '9611-1963-7941'
    SRCHFOR  '9611-1965-7941'
    SRCHFOR  '961179657951'
    SRCHFOR  '7611-4876-1342'
    SRCHFOR  'NV-0150128090'


The records that are found carry their record numbers in positions 3-8 of the output file, so I tried extracting them using DFSORT:
Code:
//NCCOPY   EXEC PGM=SORT
//SORTLIB   DD  DSN=SYS1.SORTLIB,DISP=SHR
//SORTIN    DD  DSN=DRST.CD.BK.NIM1,DISP=SHR
//SORTOUT  DD  DSNAME=DRST.CD.BK.NIM2,
//             SPACE=(CYL,(30,30)),DISP=(NEW,CATLG,DELETE),
//             DCB=(RECFM=FB,LRECL=133)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(50,25),RLSE),
//             DISP=(,DELETE,DELETE)
//SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(50,25),RLSE),
//             DISP=(,DELETE,DELETE)
//SORTWK03 DD UNIT=SYSDA,SPACE=(CYL,(50,25),RLSE),
//             DISP=(,DELETE,DELETE)
//SYSOUT    DD SYSOUT=*
//SYSIN     DD *
  SORT   FIELDS=COPY
  INCLUDE COND=(3,6,FS,EQ,NUM)
  RECORD TYPE=F,LENGTH=133
  END
/*


This outputs nothing, as there are spaces in positions 3-7 and the record number is in position 8 only. If I replace the filter card with
Code:
INCLUDE COND=(8,1,FS,EQ,NUM)
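
The difference between the two filter cards is consistent with a test that requires every byte of the field to be a digit. The Python sketch below only mimics that behaviour under this assumption and is not DFSORT itself. The listing line is rebuilt from the layout described in this post (record number right-justified so that it ends in position 8, text starting in position 11).

```python
def is_all_digits(field):
    """True only if the field is non-empty and every character is 0-9."""
    return len(field) > 0 and all(ch in "0123456789" for ch in field)


# Record number right-justified to end at position 8, two spaces, then the text.
line = "1".rjust(8) + "  " + "ABC...9611-1963-7941...XYZ.............."

field_3_8 = line[2:8]  # positions 3-8 (1-based): five spaces and "1"
field_8 = line[7:8]    # position 8 only: "1"

print(repr(field_3_8), is_all_digits(field_3_8))
print(repr(field_8), is_all_digits(field_8))
```

Under the same assumption, a record number of 10 or more would still have its last digit in position 8, so the one-byte test would keep selecting those lines.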


I get the output as
Code:
       
       1  ABC...9611-1963-7941...XYZ..............
       2  9611-1965-7941...XYZ...FGH..............
       4  ABC...961179657951...XYZNV-0150128090...

To get my desired output Out1, I need to remove the first 10 characters from the above.

So, I need help extracting the records whose record number in positions 3-8 can range from
"     1"
up to a maximum of
"999999"
and then removing that prefix so that my final Out1 looks like
Code:
       
ABC...9611-1963-7941...XYZ..............
9611-1965-7941...XYZ...FGH..............
ABC...961179657951...XYZNV-0150128090...


and to get the Out2 file, which is the difference between File1 and Out1.
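
One way to get both files from the listing is to collect the record numbers in positions 3-8 and use them to split File1 directly, so that no prefix has to be removed and no separate difference step is needed. A minimal Python sketch of that idea follows. The function names are hypothetical, and the listing layout (two leading blanks, a six-character right-justified number, text from position 11) is assumed from the sample output in this post.

```python
def found_line_numbers(listing_lines):
    """Collect the record numbers found in positions 3-8 of the listing lines."""
    numbers = set()
    for line in listing_lines:
        prefix, field = line[:2], line[2:8]
        if prefix.strip() == "" and len(field) == 6 and field.strip().isdigit():
            numbers.add(int(field))
    return numbers


def split_by_listing(file1_records, listing_lines):
    """Return (out1, out2): File1 records listed as found, and all the others."""
    found = found_line_numbers(listing_lines)
    out1, out2 = [], []
    for number, record in enumerate(file1_records, start=1):
        (out1 if number in found else out2).append(record)
    return out1, out2


file1 = [
    "ABC...9611-1963-7941...XYZ..............",
    "9611-1965-7941...XYZ...FGH..............",
    "RTY...JKLCDP............................",
    "ABC...961179657951...XYZNV-0150128090...",
    "ASERT...7611-4876 1342VBHGMN............",
]
listing = [
    " LINE-#  SOURCE SECTION                    SRCH DSN:",
    f"{1:>8}  {file1[0]}",
    f"{2:>8}  {file1[1]}",
    f"{4:>8}  {file1[3]}",
    "1  ISRSUPC   -   MVS/PDF FILE/LINE/WORD/BYTE/SFOR COMPARE UTILITY- ISPF",
    " PROCESS OPTIONS USED: ANYC",
    "    SRCHFOR  '9611-1963-7941'",
]

out1, out2 = split_by_listing(file1, listing)
print("Out1:", *out1, sep="\n")
print("Out2:", *out2, sep="\n")
```

If only the listing is available, the record text itself is whatever follows the first 10 characters of each selected line (`line[10:]` in this sketch).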

Thanks
Nimesh
All times are GMT + 6 Hours