IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Search and find the best possible match from a file


IBM Mainframe Forums -> DFSORT/ICETOOL
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
chagoyal

New User


Joined: 03 May 2007
Posts: 11
Location: India

PostPosted: Thu May 03, 2007 2:59 pm
Reply with quote

Hello, I have a scenario where I have two files, FILE1 and FILE2. FILE1 has the following structure

Code:

key1 key2 key3 key4 key5 col1 col2 col3 col4 col5


The second file has the following structure

Code:

key1 key2 key3 key4 key5 col6


For all the records in file1 the corresponding keys need to be checked in file2 and if found the record needs to be written in the output file.

The o/p file has the following format

Code:

key1 key2 key3 key4 key5 col1 col2 col3 col4 col5 col6


But the problem is file2 may/may not have all the key columns matching with an input records and it can have multiple records matching for an input record like for example

Code:

key1 key2 key3 key4 key5 col6
M    M    M    M    M    Val1
M    M    M    M    -    Val2
M    M    M    -    -    Val3
M    M    -    -    -    Val4
M    -    -    -    -    Val5      M means Match
                                   -   wild carded


The above depicted situation is the order in which the search needs to be done. In such a case the best possible match, i.e the first record should be chosen. Please note that there can be other combinations also in file2 but the only combinations that affect the output are the ones mentioned above and in order of their priority.

The output file should only have matching records and exactly one record(the best match) for an input record.

Could you please advice if this can be done through pure JCL../DFSORT etc
Back to top
View user's profile Send private message
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

PostPosted: Thu May 03, 2007 3:08 pm
Reply with quote

chagoyal,

I dont think you can acheive your requirement through DFSORT or JCL. You need some programming skills for this.
Back to top
View user's profile Send private message
chagoyal

New User


Joined: 03 May 2007
Posts: 11
Location: India

PostPosted: Thu May 03, 2007 5:51 pm
Reply with quote

Yup.. I felt so ..
But still thought there could be some <if then else logic> that could be incorporated in a jcl through dfsort to achieve this ..
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Thu May 03, 2007 11:06 pm
Reply with quote

I'm not convinced yet that this can't be done with DFSORT/ICETOOL, but I need more information to know for sure.

What is the RECFM and LRECL of the each input file?

Please show an example of the records in each input file (relevant values) and what you expect for output. Please show as many variations as you can. You can use something simple for each key value and col value such as a letter or number to keep it simple.
Back to top
View user's profile Send private message
chagoyal

New User


Joined: 03 May 2007
Posts: 11
Location: India

PostPosted: Fri May 04, 2007 11:34 am
Reply with quote

Hello Frank,

Thanks for your response. Actually FILE1 and FILE2 are table unloads and can be made any RECFM and although the record size for both the input tables vary, the files can still be made to have the same lrecls by adding a filler in one or both the input files.

Lets assume all the columns in both the files are char(5).

Now lets assume the following data in FILE1
KKKK1KKKK2KKKK3KKKK4KKKK5CCCC1CCCC2CCCC3CCCC4CCCC5
KKKK6KKKK7KKKK8KKKK8KKK10CCCC6CCCC7CCCC8CCCC9CCC10

Lets assume FILE2 has the following data


KKKK1DDDDDDDDDDDDDDDKKKK4NNNN1
KKKK1DDDDDKKKK3KKKK3KKKK4NNNN2
KKKK1DDDDDKKKK4KKKKKKKKK5NNNN3
KKKK1KKKK2DDDDDDDDDDDDDDDNNNN4
KKKK1KKKK2KKKK3DDDDDDDDDDNNNN5
KKKK1KKKK2KKKK3KKKK4KKKK5NNNN6
KKKK4KKKK7KKKK8KKKK9KKK10NNNN7

Corresponding to the first record of FILE1, 6th record of file2 is the best match, i.e NNNN6. For record 2 of FILE1 there is no match hence it will not go in the o/p file.

Lets assume file2 has the first two records only..i.e.

KKKK1DDDDDDDDDDDDDDDKKKK4NNNN1
KKKK1DDDDDKKKK3KKKK3KKKK4NNNN2

In this case both the records occupy same position priority wise but since first record is found first, the effective value for the output will be NNNN1

Again if we assume the FILE2 just has the following three records.

KKKK1DDDDDDDDDDDDDDDKKKK4NNNN1
KKKK1DDDDDKKKK3KKKK3KKKK4NNNN2
KKKK1DDDDDKKKK4KKKKKKKKK5NNNN3

Even now the value for COL6 will be NNNN1
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Fri May 04, 2007 11:19 pm
Reply with quote

I'm still not sure I understand your priority scheme. I think that a match of key1 and key2 and key3 has a higher priority than a match of key1 and key2. But does a match of key1 and key3 and key5 have a higher or lower priority than a match of key1 and key2?

Are you only interested in these matches:

key1, key2, key3, key4, key5
key1, key2, key3, key4
key1, key2, key3
key1, key2
Key1

or are you also interested in other matches such as:

Key2
Key1, Key3
Key1, Key2, Key5

and if you are interested in those other matches, then what's the priority scheme for all of the other possible matches?
Back to top
View user's profile Send private message
chagoyal

New User


Joined: 03 May 2007
Posts: 11
Location: India

PostPosted: Sat May 05, 2007 11:16 am
Reply with quote

Answers-

1. A match of key1 and key3 and key5 has lower priority than a match of key1 and key2 coz that is equivalent to match on key1 only.
matches only.

2. yes I am interested in the following

key1, key2, key3, key4, key5
key1, key2, key3, key4
key1, key2, key3
key1, key2
Key1

3. Nope- Not interested in those matches..
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Sun May 06, 2007 9:56 pm
Reply with quote

chagoyal,

Ok, I think I understand what you want and the DFSORT/ICETOOL job below should do it. To test it, I used the following for the input files:

Input file1
Code:

AAAA1BBBB1CCCC1DDDD1EEEE1C1001C1002C1003C1004C1005
BBBB1BBBB2CCCC2DDDD2EEEE2C2001C2002C2003C2004C2005
DDDD1BBBB3CCCC3DDDD3EEEE3C3001C3002C3003C3004C3005
FFFF1BBBB4CCCC4DDDD4EEEE4C4001C4002C4003C4004C4005
GGGG1BBBB5CCCC5DDDD5EEEE5C4001C5002C5003C5004C5005
HHHH1BBBB6CCCC6DDDD6EEEE6C6001C6002C6003C6004C6005


Input file2
Code:

GGGG1BBBB5XXXXXXXXXXXXXXXW0001
GGGG1XXXXXXXXXXXXXXXXXXXXL0001
BBBB1BBBB2CCCC2XXXXXXXXXXL0001
BBBB1BBBB2XXXXXXXXXXXXXXXL0002
BBBB1BBBB2CCCC2DDDD2XXXXXW0001
BBBB1XXXXXXXXXXXXXXXXXXXXL0003
AAAA1BBBB1XXXXXXXXXXXXXXXL0001
AAAA1BBBB1CCCC1DDDD1XXXXXL0002
AAAA1BBBB1CCCC1XXXXXXXXXXL0003
AAAA1BBBB1CCCC1DDDD1EEEE1W0001
XXXXXBBBB1CCCC1DDDD1EEEE1L0004
AAAA1XXXXXXXXXXXXXXXXXXXXL0005
AAAA1XXXXXCCCC1DDDD1EEEE1L0006
FFFF1XXXXXXXXXXXXXXXXXXXXL0001
FFFF1BBBB4XXXXXXXXXXXXXXXL0002
FFFF1BBBB4CCCC4XXXXXXXXXXW0001
HHHH1XXXXXXXXXXXXXXXXXXXXW0001
HHHH1XXXXXCCCC6DDDD6EEEE6L0001


and got the following for the output file:

Code:

AAAA1BBBB1CCCC1DDDD1EEEE1C1001C1002C1003C1004C1005W0001
BBBB1BBBB2CCCC2DDDD2EEEE2C2001C2002C2003C2004C2005W0001
FFFF1BBBB4CCCC4DDDD4EEEE4C4001C4002C4003C4004C4005W0001
GGGG1BBBB5CCCC5DDDD5EEEE5C4001C5002C5003C5004C5005W0001
HHHH1BBBB6CCCC6DDDD6EEEE6C6001C6002C6003C6004C6005W0001


Here's the DFSORT/ICETOOL job:

Code:

//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN1 DD DSN=...   input file1 (FB/50)
//IN2 DD DSN=...   input file2 (FB/30)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//T2 DD DSN=&&T2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT DD DSN=...  output file (FB/55)
//TOOLIN   DD    *
COPY FROM(IN1) TO(T1) USING(CTL1)
COPY FROM(IN2) TO(T1) USING(CTL2)
SPLICE FROM(T1) TO(T2) ON(1,5,CH) -
  WITHALL WITH(51,30) USING(CTL3)
SELECT FROM(T2) TO(OUT) ON(1,5,CH) FIRST USING(CTL4)
/*
//CTL1CNTL DD *
  INREC OVERLAY=(80:X)
/*
//CTL2CNTL DD *
  INREC BUILD=(1,5,51:26,5,56:1,25)
/*
//CTL3CNTL DD *
  OUTFIL FNAMES=T2,
    IFTHEN=(WHEN=(1,5,CH,EQ,56,5,CH,AND,
      6,5,CH,EQ,61,5,CH,AND,
     11,5,CH,EQ,66,5,CH,AND,
     16,5,CH,EQ,71,5,CH,AND,
     21,5,CH,EQ,76,5,CH),OVERLAY=(81:C'5')),
    IFTHEN=(WHEN=(1,5,CH,EQ,56,5,CH,AND,
      6,5,CH,EQ,61,5,CH,AND,
     11,5,CH,EQ,66,5,CH,AND,
     16,5,CH,EQ,71,5,CH),OVERLAY=(81:C'4')),
    IFTHEN=(WHEN=(1,5,CH,EQ,56,5,CH,AND,
      6,5,CH,EQ,61,5,CH,AND,
     11,5,CH,EQ,66,5,CH),OVERLAY=(81:C'3')),
    IFTHEN=(WHEN=(1,5,CH,EQ,56,5,CH,AND,
      6,5,CH,EQ,61,5,CH),OVERLAY=(81:C'2')),
    IFTHEN=(WHEN=(1,5,CH,EQ,56,5,CH),OVERLAY=(81:C'1'))
/*
//CTL4CNTL DD *
  OPTION EQUALS
  SORT FIELDS=(1,5,CH,A,81,1,CH,D)
  OUTFIL FNAMES=OUT,BUILD=(1,55)
/*
Back to top
View user's profile Send private message
chagoyal

New User


Joined: 03 May 2007
Posts: 11
Location: India

PostPosted: Mon May 07, 2007 5:44 pm
Reply with quote

Hello Frank,

I tried the solution posted by you .. and it's working !!! Thanks a lot !
YIPEE !!
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DFSORT/ICETOOL

 


Similar Topics
Topic Forum Replies
No new posts FTP VB File from Mainframe retaining ... JCL & VSAM 1
No new posts Extract the file name from another fi... DFSORT/ICETOOL 6
No new posts How to split large record length file... DFSORT/ICETOOL 10
No new posts Extracting Variable decimal numbers f... DFSORT/ICETOOL 17
No new posts SFTP Issue - destination file record ... All Other Mainframe Topics 2
Search our Forums:

Back to Top