IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Search file for records that don't follow expected sequence


IBM Mainframe Forums -> SYNCSORT
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
sdrolz

New User


Joined: 18 Apr 2016
Posts: 4
Location: USA

PostPosted: Tue Dec 14, 2021 9:51 pm
Reply with quote

I have a interesting querry I need to perform on a single file where there are two records that should always be in sequence together. The first 6 bytes of the first record are always 400000. The first 6 bytes of the second record should always be 400070. Occasionally there is a 400000 record followed by another 400000 record. I need to find all of these sprinkled in a file of 9 million records. It seems like I should be able to do this with a syncsort, but I'm struggling to find the right syntax.

So the file has records with a key like the following:
400000
400070
400000
400070

I'm looking for the occurance when this happens, I need to pull out that first 400000 record when it is followed by a second 400000 record.

400000
400000
400070
400000
400070[/b]
Back to top
View user's profile Send private message
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

PostPosted: Tue Dec 14, 2021 10:53 pm
Reply with quote

You can use SORT JOINKEYS for that task. I assume you want to print the first record when your condition is met, right?
Back to top
View user's profile Send private message
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

PostPosted: Tue Dec 14, 2021 11:26 pm
Reply with quote

Sample:
Code:
400000 #1
400000 #2
400070   
400000 #3
400070   
400000 #4
400000 #5
400070   

Output:
Code:
400000 #1  has a match in line 00000001
400000 #4  has a match in line 00000006
Back to top
View user's profile Send private message
sdrolz

New User


Joined: 18 Apr 2016
Posts: 4
Location: USA

PostPosted: Wed Dec 15, 2021 1:52 am
Reply with quote

Actually, I want identify the records that do NOT follow that pattern. So I'm expecting 400000 to be followed by 400070 every time. on the occurance where 400000 is followed by another 400000, then I want to pull out that first 400000 record. Not sure if that makes sense or not. It's not easy to explain.
Back to top
View user's profile Send private message
sdrolz

New User


Joined: 18 Apr 2016
Posts: 4
Location: USA

PostPosted: Wed Dec 15, 2021 2:21 am
Reply with quote

Looking at this from different angle and maybe using a little more information to develop matching keys. I split the data out into 2 seperate files. All the 400 records in 1 file. All the 400070 records in another file. Each record has additional pieces of information that can be used as keys to match the 400000 and 400070 together, the write out the 400000 that doesn't have a matching 400070 to its own file, like an XSUM.
Each record has an account number and a transaction number that should match 1:1 for each 400000 and 400070 combo. Could I used a Sort Fields=None,XSUM type of function. Here's the clincher, the account number is in the same place on both files, but the transaction number is in different locations. Like below.
400000____Account#_Transaction#
400070____Account#_______Transaction#

I should be able to match the 2 records together identifying key field locations for F1 and F2, then we I don't get a match, write out the exception record to a seperate file.

I'm so close on this, but I'm just not strong with the syntax.
Back to top
View user's profile Send private message
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

PostPosted: Wed Dec 15, 2021 11:12 am
Reply with quote

See if that code snippet is of help for you.
Code:
//WHATEVER EXEC PGM=SORT                                     
//F1       DD *                                             
400000 #1                                                   
400000 #2                                                   
400070                                                       
400000 #3                                                   
400070                                                       
400000 #4                                                   
400000 #5                                                   
400070                                                       
/*                                                           
//F2       DD .. <* exactly the same as DD:F1                   
//SYSOUT   DD SYSOUT=*                                       
//SORTOUT  DD SYSOUT=*                                       
//SYSIN    DD *                                             
  OPTION COPY                                               
  JOINKEYS F1=F1,FIELDS=(21,8,A)                             
  JOINKEYS F2=F2,FIELDS=(21,8,A)                             
  REFORMAT FIELDS=(F1:1,10,F2:1,10)                         
  OUTREC OVERLAY=(40:SEQNUM,8,ZD)                           
  OUTFIL FNAMES=(SORTOUT),                                   
    REMOVECC,                                               
    INCLUDE=(1,6,BI,EQ,11,6,BI),                             
    BUILD=(11,10,X,C'has a match in line',X,40,8,ZD,M11)     
  END                                                       
/*                                                           
//JNF1CNTL DD *                                             
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(21:SEQNUM,8,ZD,START=0)) 
  END                                                       
/*                                                           
//JNF2CNTL DD *                                             
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(21:SEQNUM,8,ZD,START=1)) 
  END                                                       
/*
Back to top
View user's profile Send private message
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2022
Location: USA

PostPosted: Wed Dec 15, 2021 3:11 pm
Reply with quote

General approach needs to take into account also real Account#, and TransactionID

1) use
Code:

INREC IFTHEN=(WHEN=(1,6,CH,EQ,C’400000’),
              BUILD=(…align TransactionID….))


2) optionally use (to be on safe side)
Code:

SORT FIELDS=(Account#…,TransactionID…),EQUALS


3) identify correct groups
Code:

OUTREC IFTHEN=(WHEN=GROUP,
               BEGIN=(1,6,CH,EQ,C’400000’),
               END=(1,6,CH,EQ,C’400070’), - maybe not needed, to detect orphaned 400000….
               PUSH=(100:ID=8,SEQ=2))


4) split the result into two temporary datasets (OUTFIL FNAMES=…): for 400000, and for 400070

5) using JOINKEYS on ID values, you can easily detect the non-matching 400000 lines, together with their Account#, and TransactionID

6) use SYNCTOOL instead of SYNCSORT to perform all stages as a single JCL step
Back to top
View user's profile Send private message
sdrolz

New User


Joined: 18 Apr 2016
Posts: 4
Location: USA

PostPosted: Wed Dec 15, 2021 10:53 pm
Reply with quote

I think I got what I'm looking for using the following.

SORT FIELDS=COPY
JOINKEYS FILE=F1,FIELDS=(5,3,A,11,15,A,928,8,A),SORTED,NOSEQCK
JOINKEYS FILE=F2,FIELDS=(5,3,A,11,15,A,53,8,A),SORTED,NOSEQCK
JOIN F1,UNPAIRED,ONLY
END

First screw up was not accounting for the 4 bytes at the beginning of the VB file. Once I got thru beating myself up for that mistake, and I fixed all the key positions relative starting positions it worked.

I'm still validating my results, but the resulting Output file includes only records from F1 that did not have matching records in F2. Note: I took the original file that contained both 400000 and 400070 and split them into 2 files, one for each record type. I knew then that these two files are sorted in the same order and followed the same sort sequence. Then I could do a Join key on each file seperately to identify when the first file file found a record that didn't have a match in the second file, and then F1 UNPAIRED ONLY wrote out just those records from F1 that didn't have matches in F2.

You guys have been a great help! Thank you for responding and helping!
Back to top
View user's profile Send private message
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1252
Location: Bamberg, Germany

PostPosted: Thu Dec 16, 2021 5:05 am
Reply with quote

As you are searching for records out of sequence, I would consider not to expect all input as really SORTED.
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> SYNCSORT

 


Similar Topics
Topic Forum Replies
No new posts Compare 2 files(F1 & F2) and writ... JCL & VSAM 8
No new posts FTP VB File from Mainframe retaining ... JCL & VSAM 8
No new posts Extract the file name from another fi... DFSORT/ICETOOL 6
No new posts How to split large record length file... DFSORT/ICETOOL 10
No new posts Extracting Variable decimal numbers f... DFSORT/ICETOOL 17
Search our Forums:

Back to Top