View previous topic :: View next topic
|
Author |
Message |
Roshnii
New User
Joined: 30 Sep 2008 Posts: 45 Location: bangalore
|
|
|
|
I have two files like this
FILE A
Code: |
Name Subject Date
------------------------------
Tom Maths 2010-09-01
John Science 2010-07-02
John Physics 2011-01-01
Robin Psychology2011-10-01
Anna Biology 2010-08-01
Alice Geography 2010-05-02
Thomas Geology 2011-03-04
Alice Physics 2011-01-12
Tom Physics 2010-10-10
Alice Biology 2011-12-12
|
Output file
Code: |
Name Subject1 Subject2 subject3
----------------------------------------
Tom Maths Physics
John Science Physics
Alice Geography Physics Biology |
I want all records from the input file where a student has one or more than one subject and in the order when the subject
was taken
Input file is 30 lrec. and each field is 10 bytes long.
Any idea how should I go about?
I have tried taking the duplicate records from the file using sortxsum
and then joining it with the original file to have all records but somehow I am not getting the desired output. Please help |
|
Back to top |
|
|
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
|
|
|
|
Quote: |
I have two files like this |
Roshnii,
But I could see only one input file. Are you missing something here. Which sort product/version you're using?
Please use "Code" tags while posting data/code to make it readable and to preserve the alignment. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hi Arun,
The 2 files are 1 input and 1 output.
I've "Code'd" the data and it looks like all of the entries for each "key" are to be written as one output record.
Unfortunately, the max entries on a line has not been specified. Neither has the recfm. Maybe some clarification will be provided. . . |
|
Back to top |
|
|
Roshnii
New User
Joined: 30 Sep 2008 Posts: 45 Location: bangalore
|
|
|
|
I apolozize for the confusion :
The files that I had given is not correct.
Here is the correct requirement
Input file
Code: |
Tom Maths
John Science
John Physics
Robin Psychology
Anna Biology
Alice Geography
Thomas Geology
Alice Physics
Tom Physics
Alice Biology
Dennis Biology
Deniis Biology |
Output file
Code: |
Tom Maths
Tom Physics
John Science
John Physics
Alice Geography
Alice Physics
Alice Biology
|
The file length of input and output file is 25
name is 15 byte long and subject is 10 bytes long.
My output file should have only those students who have more than one different subjects.
I have tried using sort and was able to get the desired result, however, i ended up having many sort steps in my jcl which is
confusing.
I was curious to find out a better way to get the result and in less number of steps.
This is what I tried:
Step1: Sort the file accoring to the student name. FILE1
Step2: Get all the duplicate rows to one file base on the student name alone. FILE2
Step3: Get all duplicate rows from FILE2 based on both the student name and subject both. output: duplicates into:FILE3 Unique records into FILE3A
Step4: Remove the duplicates from FILE3: output FILE4
Step5: Join the files from step4 and step 3 , ie FILE3 and FILE4: ouput FILE5
Step6: Remove duplicates from FILE5 based on the student name and subject name: FILE 6
Step7: Copy files from step6 and step 3 into one file. i.e FILE3A and FILE6.
The above steps gives me the desired output but it is extremely confusing. I have not give the exact code of the steps that i have followed.
I would like to know a simpler way of doing the same thing. |
|
Back to top |
|
|
gcicchet
Senior Member
Joined: 28 Jul 2006 Posts: 1702 Location: Australia
|
|
|
|
Hi,
try this
Code: |
//S1 EXEC PGM=SYNCTOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD DSN=input-file
//OUT DD SYSOUT=*
//TOOLIN DD *
SELECT FROM(IN) TO(OUT) ON(1,15,CH) ALLDUPS
/*
|
I'm not sure whether the order of the ouput file is what you want.
Gerry |
|
Back to top |
|
|
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
|
|
|
|
Hi Gerry,
Wont it pull the below records as well into the output, which the OP does not want?
Code: |
Dennis Biology
Deniis Biology |
|
|
Back to top |
|
|
Garry Carroll
Senior Member
Joined: 08 May 2006 Posts: 1205 Location: Dublin, Ireland
|
|
|
|
Quote: |
Wont it pull the below records as well into the output, which the OP does not want?
Code: |
Dennis Biology
Deniis Biology |
|
Surely not since they're not duplicates?
Garry. |
|
Back to top |
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
Arun !
did You watch the speeling ? |
|
Back to top |
|
|
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
|
|
|
|
Oh my bad, I did n't.
Nice hint. Thanks enrico. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Whew. . .
For a short while there, i thought new "fuzzy match" functionally was available . . .
Have a great weekend!
d |
|
Back to top |
|
|
Roshnii
New User
Joined: 30 Sep 2008 Posts: 45 Location: bangalore
|
|
|
|
Hi enrico,
Thanks for the solution but it did not give me the desired output.
It gave me duplicates that I dint want. And I wud have to follow the same sort steps to delete duplicates on both the student and the subject.
I have modified the input file.
Code: |
Tom Maths
John Science
John Physics
Robin Psychology
Anna Biology
Alice Geography
Thomas Geology
Alice Physics
Tom Physics
Alice Biology
Dennis Biology
Deniis Biology
Tom Maths
Alice Physics
Robin Psychology
|
Then I get this (not nessecarily in this order)
Code: |
Tom Maths
Tom Maths
Tom Physics
John Science
John Physics
Alice Geography
Alice Physics
Alice Physics
Alice Biology
Robin Psychology
Robin Psychology
|
which is correct according to the input file that I had given.
But with the above input file, it does not work in one step.
I would like the input to be like this:
Code: |
Tom Maths
Tom Physics
John Science
John Physics
Alice Geography
Alice Physics
Alice Biology
|
I do not want these rows
Code: |
Tom Maths
Robin Psychology
Robin Psychology
Alice Physics
|
I do not want Tom with Maths twice and ALice with Physics twice as that is already recorded once.
I do not want Robin as he does not have different subjects
And I carefully checked if I made any typo this time
[/code] |
|
Back to top |
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
You do not have to thank me...
I did nothing, just joking with Arun on speeling mistakes |
|
Back to top |
|
|
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
|
|
|
|
Roshnii wrote: |
Thanks for the solution but it did not give me the desired output |
so the OP comes up with the corrected speeling now.
Roshni,
If you really have duplicate records for both name as well as subject, you could try some alternatives. |
|
Back to top |
|
|
gcicchet
Senior Member
Joined: 28 Jul 2006 Posts: 1702 Location: Australia
|
|
|
|
Hi,
moving targets are hard to hit.
Try this
Code: |
//TOOLIN DD *
SELECT FROM(IN) TO(OUT) ON(1,15,CH) ALLDUPS USING(CTL1)
/*
//CTL1CNTL DD *
SORT FIELDS=(1,25,CH,A)
SUM FIELDS=NONE
/*
|
Gerry |
|
Back to top |
|
|
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
|
|
|
|
gcicchet wrote: |
moving targets are hard to hit |
I knew this was going to be one of them. |
|
Back to top |
|
|
|