grayWolf
New User
Joined: 04 Oct 2010 Posts: 19 Location: Land of broken dreams
Hi All,
We have a requirement to write all the duplicate records from the input file into an output file using Easytrieve.
Input file layout:
101001ABCDXYZQWERTY
102002GHJJRWQWEWTUY
102002ASDSGFHRJKNXZ
102002AGHHASDHHHJDF
103004BNVBNGNDDFGSF
First 3 bytes -> Department number
Next 3 bytes -> Stock number
If there are duplicates with respect to the Dept number/Stock number combination, we are supposed to write those records into an output file.
In the example above, since 102/002 is repeated 3 times, we must write these 3 records into an output file.
The logic that we followed is as follows:
1) Read a record
2) Move the file variables into working-storage variables
3) Read the next record
4) If the current record is equal to the previous one, write it to the output file
With this logic, only one record is written to the output file when there are TWO duplicate records.
Please let me know what logic should be followed so that all the duplicates are written to the output file.
For example, if there are 100 duplicate records, all 100 records should be in the output file.
Thanks!
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
This is not really an Easytrieve question; it is "language independent" logic.
From what you have shown, I don't know why you don't get them all.
Code:
if it is the first time, store the keys for matching and continue as normal
normal: until end of file
    if current keys equal to stored keys, write record, otherwise store keys
    read a record
end of file:
    this time, nothing else to do for this requirement
You could of course include counts of duplicates (in which case you'd have something at end-of-file for the last set of duplicates as well as for outputting the results), and you would of course have the standard input/output counts (in Easytrieve, from Easytrieve!).
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
answer not appropriate, deleted by DBZ
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
maybe this:

dupped    = ""
save_area = ""
save_key  = ""
LOOP:
    read record
    if EOF
        if dupped = "Y"
            write save_area
        END-OF-JOB
    if save_key = record_key
        write save_area
        dupped = "Y"
    if save_key not = record_key
        if dupped = "Y"
            write save_area
            dupped = ""
    move record to save_area
    move record_key to save_key
    GOTO LOOP
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
There's me wondering why dbz has all that stuff in, so I go back and look, and yes you need the record it is duplicating against as well.
So, all "one behind", store the whole record, and remember to check for the last stored one to write at the end of processing, or when encountering the first duplicate, write out the one that it duplicates, then the duplicate, then again no need for extra at end of file. |
grayWolf
New User
Joined: 04 Oct 2010 Posts: 19 Location: Land of broken dreams
Bill Woodger wrote:
This is not really an Easytrieve question; it is "language independent" logic.
I had explicitly mentioned Easytrieve because I didn't want an answer with SORT.
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
Quote:
So: keep everything "one behind" and store the whole record, remembering to check at end of processing for a last stored record that still needs writing; or, when encountering the first duplicate, write out the one that it duplicates, then the duplicate, and then again there is no need for anything extra at end of file.
The problem with that: if you have 3 dups for a key, you will write more than 3 records.
Gray Wolf,
SORT is a utility, not a programming language.
Besides, Bill was saying that it is not a solution isolated to Easytrieve.
Complaining about that, with your only comment being what it was, causes me to refrain from posting anything helpful to your rookie questions in the future.
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
Quote:
I had explicitly mentioned Easytrieve because I didn't want an answer with SORT.
horse manure
a <tool> should be chosen based on the available skills and competence;
if you do not have them, choose a different tool.
anyway ... the logic for handling duplicates is usually learned at a very basic level of training.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
dbzTHEdinosauer wrote:
Quote:
So: keep everything "one behind" and store the whole record, remembering to check at end of processing for a last stored record that still needs writing; or, when encountering the first duplicate, write out the one that it duplicates, then the duplicate, and then again there is no need for anything extra at end of file.
The problem with that: if you have 3 dups for a key, you will write more than 3 records.
[...]
Yes, I'm making an unclear distinction between the "original" record and the "duplicates".
When you have one duplicate (i.e. itself and the original) you know to write out both (the original and the duplicate), in the correct order. On subsequent duplicates for that key, you only need to write out the duplicate.
Mr Wolf, if you include "I don't want to do it with SORT" in your question, like any other information, it makes the flow easier.
Despite my twice making a pig's ear of it, it is really easy :-)
Maybe it's easier to code than describe.
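Putting that "one behind" logic into code: a minimal Easytrieve sketch, assuming 80-byte records, file names INFILE and OUTFILE, and made-up working-storage names (none of these come from the thread).

Code:
FILE INFILE
  DEPT-NO      1  3  N
  STOCK-NO     4  3  N
  DEPT-STOCK   1  6  N
  IN-REC       1 80  A
FILE OUTFILE
  OUT-REC      1 80  A
* Previous key, previous record, and a flag saying whether the
* previous record has already been written as part of a duplicate set
WS-PREV-KEY    W  6  N  VALUE 0
WS-PREV-REC    W 80  A
WS-PREV-OUT    W  1  A  VALUE 'N'
*
JOB INPUT INFILE
  IF DEPT-STOCK EQ WS-PREV-KEY
*   First duplicate of a set: also write the record it duplicates
    IF WS-PREV-OUT EQ 'N'
      OUT-REC = WS-PREV-REC
      PUT OUTFILE
    END-IF
*   Write the duplicate itself
    OUT-REC = IN-REC
    PUT OUTFILE
    WS-PREV-OUT = 'Y'
  ELSE
    WS-PREV-OUT = 'N'
  END-IF
  WS-PREV-KEY = DEPT-STOCK
  WS-PREV-REC = IN-REC

Nothing extra is needed at end of file with this variant: the last member of a duplicate set has already been written at the time it was read.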
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
Chapter 12 of the EZT Reference Guide 6.2 describes Single File Keyed Processing:
Using Synchronized File Processing on a single file enables you to compare the
contents of a key field or fields from one record to the next and use IF tests to
group records according to the key fields. The file name is coded on the JOB
INPUT statement as follows:
JOB INPUT (filename KEY (keyfield...))
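To sketch how that could look for this requirement (a prototype only: the file names INFILE/OUTFILE, the field names, and the 80-byte record length are assumptions, and keyed processing expects the input to already be in key sequence, as the sample data is):

Code:
FILE INFILE
  DEPT-NO    1  3  N
  STOCK-NO   4  3  N
  IN-REC     1 80  A
FILE OUTFILE
  OUT-REC    1 80  A
*
* Single-file keyed processing on the Dept/Stock key
JOB INPUT (INFILE KEY (DEPT-NO STOCK-NO))
* DUPLICATE is true for every record whose key matches the record
* before or after it, so this writes all members of each duplicate set
  IF DUPLICATE INFILE
    PUT OUTFILE FROM INFILE
  END-IF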
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Perfect, Peter.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
What is the maximum number of duplicates possible for "a key"?
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
Quote:
Please let me know what logic should be followed so that all the duplicates are written to the output file.
For example, if there are 100 duplicate records, all 100 records should be in the output file.
pretty simple ...
here is a REXX prototype:
Code:
#!/usr/bin/rexx
Address HOSTEMU "EXECIO * DISKR 'zdata.txt' ( stem data. finis "
prev = data.1
pend = 0
do i = 2 to data.0
   curr = data.i
   if curr = prev then do
      pend = 1
      say prev
      prev = curr
   end
   else do
      if pend = 1 then do
         pend = 0
         say prev
      end
      prev = curr
   end
end
if pend = 1 then ,
   say prev
exit
::requires hostemu LIBRARY
input
Code:
a
b
b
c
c
c
d
e
e
f
f
f
g
h
i
j
j
k
k
k
result
Code:
[enrico@enrico-mbp ztests]$./zdupl.rx
b
b
c
c
c
e
e
f
f
f
j
j
k
k
k
the logic is pretty simple:
the assignment of data.X to prev and curr is the equivalent of a read
the simple script can be modified easily to be tested under TSO