View previous topic :: View next topic
|
Author |
Message |
dearlux
New User
Joined: 10 Oct 2019 Posts: 4 Location: usa
|
|
|
|
I have a file with the following 16 records
Field1 Field 2
A 1
B 1
C 1
D 1
A 2
B 2
C 2
D 2
A 3
B 3
C 3
D 3
A 4
B 4
C 4
D 4
Each record in field 1 should have one-one match in Field B. For eg if A is paired with 1 and it should not be paired with 2,3,4 . Similarly if 1 is paired with A already, it should NOT be paired against B,C or D
In short, I need to have output file with only 4 records -
A 1
B 2
C 3
D 4
Please help me achieve this using SORT |
|
Back to top |
|
|
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1222 Location: Bamberg, Germany
|
|
|
|
Please use Code Tags when posting.
What have you tried so far and what error messages were seen? Remember, others might help but won't do YOUR job. |
|
Back to top |
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2010 Location: USA
|
|
|
|
dearlux wrote: |
I have a file with the following 16 records
Field1 Field 2
A 1
B 1
C 1
D 1
A 2
B 2
C 2
D 2
A 3
B 3
C 3
D 3
A 4
B 4
C 4
D 4
Each record in field 1 should have one-one match in Field B. |
Please, clarify: what did you mean under "one-one match between field-1, and filed-2" - ?????????
There are ONLY LETTERS in field-1, and ONLY NUMBERS in field-2. How come one-one match between letters, and numbers? |
|
Back to top |
|
|
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3048 Location: NYC,USA
|
|
|
|
My understanding is, there are two sub requirements associated with this requirement viz,
a. Get the Distinct on Col1 and pick the first value of col2
for e.g. A-1 (2,3,4 are eliminated)
b. From the Second records onwards compare the col2 value vs previous records col2 values and pick col2 next in seq which don't match any previous col2 values.
for e.g. B-2 (1 is not picked up because it was present for A) and same rule
till Record D. |
|
Back to top |
|
|
dearlux
New User
Joined: 10 Oct 2019 Posts: 4 Location: usa
|
|
|
|
My first field having A,B,C,D are deliver transactions and the second field with 1,2,3,4 signifies Receive transactions. A deliver transaction can only be paired with One and only receive. Similarly a Receive that is paired with a deliver cannot be paired again with another deliver. I don't want to write a batch program for this. Is there a way to accomplish this using sort. |
|
Back to top |
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2010 Location: USA
|
|
|
|
dearlux wrote: |
My first field having A,B,C,D are deliver transactions and the second field with 1,2,3,4 signifies Receive transactions. A deliver transaction can only be paired with One and only receive. Similarly a Receive that is paired with a deliver cannot be paired again with another deliver. I don't want to write a batch program for this. Is there a way to accomplish this using sort. |
Why
but not
What the rule is, to combine deliver-receive into one line, or different lines???
|
|
Back to top |
|
|
Pandora-Box
Global Moderator
Joined: 07 Sep 2006 Posts: 1592 Location: Andromeda Galaxy
|
|
|
|
Dearlux your explanation is not very clear and please provide logical rules |
|
Back to top |
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2010 Location: USA
|
|
|
|
dearlux wrote: |
In short, I need to have output file with only 4 records -
A 1
B 2
C 3
D 4
Please help me achieve this using SORT |
You have been asked some questions that would help others to understand the problem you are not able to clearly explain.
You prefer to ignore those questions; it's up to you, but if so then do not expect any help from the forum.
What is clear from your messy note: you need to extract unique values from the first column, and separately extract unique values from the second column.
- How do you need to join those two sets of unique values in pairs by lines?
- What if the number of unique values in two columns is different? Let's say 1000 of different "deliveries", and only 200 of different "receives"???
- What the hell is the idea to combine two completely different entities into pairs??? |
|
Back to top |
|
|
dearlux
New User
Joined: 10 Oct 2019 Posts: 4 Location: usa
|
|
|
|
In my previous step I m trying to find all possible "receive" matches for a "deliver". My previous sort step has 2 input files, first input file is a file of all delivers and 2nd input file is file of all receives. I use joinkey to match them on specific fields and extract those that match into output file.
So I end up getting an output file where I get multiple receives matched against the same deliver and also instances where the multiple delivers gets matched to the same receive because they all satisfy the criteria.
Now I want the next step where I just want to retain the first deliver-receive combination in the file that matched. For eg if deliver A is matched to receive 1 and 2, I want to retain A-1 that comes before A-2 and drop A-2. Also if B-1 and B-2 follows , I know receive 1 is already matched to deliver A, so drop B-1 and retain B-2. Just sequential processing, retain the first unique combination which is one-one match.
Input File
A-1
A-2
A-3
A-4
B-1
B-2
B-3
B-4
C-1
C-2
C-3
C-4
D-1
D-2
D-3
D-4
Expected Output file
A-1
B-2
C-3
D-4
where A,B,C,D represent delivers and 1,2,3,4 represent receives
I am sorry if I am still not able to clearly explain. Let me know if this is possible in sort or only batch program can be written to accomplish this. |
|
Back to top |
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2010 Location: USA
|
|
|
|
dearlux wrote: |
Input File
A-1
A-2
A-3
A-4
B-1
B-2
B-3
B-4
C-1
C-2
C-3
C-4
D-1
D-2
D-3
D-4
Expected Output file
A-1
B-2
C-3
D-4 |
What to do if input file includes
Code: |
A-1
A-2
A-3
A-4
B-1
B-2
B-3
C-1
C-2
D-1 |
|
|
Back to top |
|
|
Nic Clouston
Global Moderator
Joined: 10 May 2007 Posts: 2455 Location: Hampshire, UK
|
|
|
|
Group on your first field, assign sequence number to each item restarting the sequence when the next group starts. Do that on the input phase. On the output phase only select those records that have a sequence number of 1 (and drop the sequence number). Examples exist in this and the DFSort forums.
But, perhaps you could improve your JOINKEYS to only output the first match? |
|
Back to top |
|
|
dearlux
New User
Joined: 10 Oct 2019 Posts: 4 Location: usa
|
|
|
|
Sergeyken,
If input file looks like this
A-1
A-2
A-3
A-4
B-1
B-2
B-3
C-1
C-2
D-1
Output required is
A-1
B-2 |
|
Back to top |
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2010 Location: USA
|
|
|
|
dearlux wrote: |
Sergeyken,
If input file looks like this
A-1
A-2
A-3
A-4
B-1
B-2
B-3
C-1
C-2
D-1
Output required is
A-1
B-2 |
Why not
????????????????????????
The idea of pairing remains absolutely unclear.
Lack of detailed info:
- are the values ordered somehow?
- are they actually dates/times to choose max/min from them?
- what is the reason for first-come-first-used, or any other clear rule to chose the combinations?
- … etc ... |
|
Back to top |
|
|
Nic Clouston
Global Moderator
Joined: 10 May 2007 Posts: 2455 Location: Hampshire, UK
|
|
|
|
I think we need to see the JOINKEYS step that created this mess and the requirement that is trying to be fulfilled and the original input data. It may be that the JOINKEYS is badly formulated or it may be that the ultimate goal is not possible with sort. |
|
Back to top |
|
|
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3048 Location: NYC,USA
|
|
|
|
Quote: |
The idea of pairing remains absolutely unclear. |
Since, whatever input data is showed to us in the initial post has col2 in ascending order which make sense to pick up only first paired (which can be any) and ignore the rest and that is why A-1 came instead of A-4 and B-2 instead of B-3.
Quote: |
it may be that the ultimate goal is not possible with sort.
|
It is possible but looks like tricky to get it in less passes. |
|
Back to top |
|
|
|