Joined: 23 Dec 2005 Posts: 61 Location: Providence , US
Hi all,
I have a file with 80-byte records (of which only the first 10 bytes are of use to me). I have to remove the duplicate records from the file. I don't want to change the sequence of the records; I just want to remove the duplicates.
eg:
file a
---------------
abc
xyz
xyz
abd
abd
mno
o/p file
------------------
abc
xyz
abd
mno
So, how do I go about it? I have written a job, but it is not accepting SUM FIELDS=NONE with SORT FIELDS=COPY. Since I want to remove only the duplicates without sorting the records, what modification should I make to my JCL?
/* REXX - remove duplicate records, keeping the first occurrence  */
/* of each and preserving the input order. Assumes DD INFILE      */
/* (input) and DD NEW (output) are allocated before this runs.    */
say "I am starting now!"
eof1 = "NO"             /* EOF flag for the input file            */
i = 0                   /* count of records read                  */
j = 1                   /* next free slot in the output stem      */
seen. = 0               /* lookup table of records already kept   */
do forever
  call read_infile
  if eof1 = "YES" then do
    "EXECIO" j - 1 "DISKW NEW (STEM out_rec. FINIS"
    say "rec_cnt " i
    leave
  end
end
say "I am finished"
exit
/*--------------------------------------------------------------------*/
/* SUBROUTINE                                                         */
read_infile:
"EXECIO 1 DISKR INFILE"
if RC > 0 then do                   /* non-zero RC means end of file */
  eof1 = "YES"
  "EXECIO 0 DISKR INFILE (FINIS"    /* close the input file          */
  return
end
parse pull inrec
i = i + 1
/* the seen. stem catches all duplicates, not just adjacent ones;   */
/* if only the first 10 bytes form the key, compare on              */
/* substr(inrec,1,10) instead of the whole record                   */
if seen.inrec = 0 then do           /* first time this record is seen */
  seen.inrec = 1
  out_rec.j = inrec
  j = j + 1
end
return
/*--------------------------------------------------------------------*/
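For the same requirement in DFSORT itself (keep the first record per key without disturbing the input order), one common two-pass sketch is to append a sequence number, remove duplicates on the key, then sort back on the sequence number. This is only an illustration: the dataset names, SPACE, and DCB values are placeholders, and it assumes a DFSORT level that supports INREC OVERLAY and SEQNUM. OPTION EQUALS makes SUM FIELDS=NONE keep the first record of each key in input order.

```jcl
//STEP1   EXEC PGM=SORT
//SYSOUT  DD SYSOUT=*
//SORTIN  DD DSN=your.input.file,DISP=SHR
//SORTOUT DD DSN=&&TEMP,DISP=(NEW,PASS),UNIT=SYSDA,
//           SPACE=(CYL,(5,5)),DCB=(RECFM=FB,LRECL=88)
//SYSIN   DD *
  OPTION EQUALS
  INREC OVERLAY=(81:SEQNUM,8,ZD)    ADD A SEQUENCE NUMBER IN 81-88
  SORT FIELDS=(1,10,CH,A)           SORT ON THE 10-BYTE KEY
  SUM FIELDS=NONE                   KEEP FIRST RECORD PER KEY
/*
//STEP2   EXEC PGM=SORT
//SYSOUT  DD SYSOUT=*
//SORTIN  DD DSN=&&TEMP,DISP=(OLD,DELETE)
//SORTOUT DD DSN=your.output.file,DISP=(NEW,CATLG),UNIT=SYSDA,
//           SPACE=(CYL,(5,5)),DCB=(RECFM=FB,LRECL=80)
//SYSIN   DD *
  SORT FIELDS=(81,8,ZD,A)           RESTORE ORIGINAL INPUT ORDER
  OUTREC BUILD=(1,80)               DROP THE SEQUENCE NUMBER
/*
```

The same effect can be had in one ICETOOL step with the SELECT operator; see the DFSORT/ICETOOL tricks referenced later in this thread.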
Hi Frank,
Thanks a lot, ICETOOL is really a powerful tool. I will go through the other DFSORT/ICETOOL tricks you have linked in other posts of this community.
Referring to the job and requirement above, I have the requirement below and a sort card modified for it, but I don't get the required result.
This input data is a job-vs-dataset cross-reference containing:
Dataset name - columns 5-25
Job name - columns 51-53
Proc name - columns 55-57
Proc step name - columns 59-61
Proc DD name - columns 63-65
The input rows 8 thru 13 are datasets belonging to the concatenated dataset of DD DD1.
My requirement is to get the output as shown in the Required Output column:
-------------------------------------------------------------------------------
1. Duplicates should be removed without sorting, i.e., without changing the order of the datasets.
2. For concatenated datasets, any duplicates should be removed while the dataset order of the input file is maintained.
3. For direct datasets, the duplicates should simply be removed.
In other words:
1. The fields in positions 0 thru 25 should be considered for eliminating duplicates.
2. If the values in fields 51-53 / 55-57 / 59-61 / 63-65 repeat in more than one row, those rows are concatenated dataset entries. Any duplicates among them should be eliminated and the dataset order maintained.
3. For the column positions stated in point 2, if the values in one row differ from the next row, those are direct datasets. For all of these, duplicates should be removed based on positions 0-25.
I need the duplicate datasets removed based on positions 0-25 for all rows, but the dataset order for the concatenated entries (that is, where the values in columns 51-65 repeat over more than one row) must not get jumbled.
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
I've tried, but I just can't figure out what you're trying to do. I don't know if you're matching up the dsnames and the other four fields or just the other four fields. I don't know if by "duplicates if any should be eliminated" you mean to eliminate all records with a match or just keep the first record with a match.
It would really help if you would show an example of the input records for each case and the expected output records (or none) for that case. In particular, show the case where you want to remove one or more records.
For example: