Joined: 01 Sep 2006 Posts: 39 Location: Chennai, India
Hi,
Could you please help me in converting rows into columns for the below mentioned sample.
I/P File:
Code:
A 2008 100
A 2009 150
B 2008 050
C 2008 100
C 2009 075
Expected o/p file:
Code:
A B C
2008 100 050 100
2009 150 000 075
Please note that input file is sorted based on column 1 and we will not be knowing what values will be available in column 1. The maximum distinct characters is limited to 3 but they are not known. i.e., it can A,B and C or D, E and F or A, B and D. It will be maximum three but not the same combination.
Joined: 01 Sep 2006 Posts: 39 Location: Chennai, India
Hi Arun,
We have 3 distinct variables and we are populating the values for each variable in columns. If value is not available for that variable we will have to populate it as zeroes.
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
Quote:
Kindly let me know if you need more information
vinothsubramanian,
Are you expecting a maxium of 3 duplicates for field2? Still I am not sure why the 'second' field has '000' and why not the 'third'.
It would be better if you post a much more realistic example showing all possible combinations of input and the expected output out of it.
As per my understanding, 2009 doesn't have any value for 'B'. Hence second field ("B") was assigned to zeroes and first and third fields were assigned with the values mentioned in the input.
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
vinoth wrote:
But my issue is I will not be knowing whether these three characters will be A, B and C. It can be any three character between A-Z
Vinoth,
I am not sure as to how you would know which value is missing. Lets take this example data
Code:
A 2008 100
A 2009 150
B 2008 050
C 2008 100
C 2009 075
Y 2010 300
Z 2010 250
Now for 2008 you have 3 value which can be grouped as a single record and for 2009 you have a missing 'b' record there is no way to identify if the missing record is an 'a' ,'b' or 'c'
for 2010 you have completely different set of alphabets in pos 1 now how would you produce a single record for them ?
Run your job using these control cards and see what it produces. Also you are doing all your processing with INREC which get executed even before sort. if your data is unsorted on the first alphabet then you may end up with S0C7 during summation.
for ex: use this data to get S0C7
Code:
A 2008 100
A 2009 150
Z 2010 250
B 2008 050
C 2008 100
C 2009 075
C 2010 250
You need to define better rules to find the missing value.
Joined: 01 Sep 2006 Posts: 39 Location: Chennai, India
Hi Kolusu,
Thanks for your reply.
As mentioned earlier by me, we will be having only 3 different alphabets for any case. Hence I think there is no need for us to think of putting fourth alphabet in the column1.
However, we can increase the years say from 2001 to 2010 and try. My code works fine for any number of years and with any 3 characters.
Only issue I have now is I couldn't get the header for each column.