View previous topic :: View next topic
|
Author |
Message |
rikdeb
New User
Joined: 19 Jan 2009 Posts: 63 Location: hyderabad
|
|
|
|
Hi All.
I got stuck in the below scenerio where i need to get the sum of values as explained below:
Consider the input file structure as
Code: |
@01 IDNO $CHAR10.
@12 EMPN $CHAR02.
@65 PAIDAMT 9.2
; |
Code: |
2120000210 1 4000.00
2120011980 25
2120011980 41
2120011980 42
2120011980 43
2120013390 1 2000
2120013390 1 -2000
2120013390 1 2000
2120013399 1 2000
2120013399 1 2000 |
Want the output to be like
Code: |
2120000210 1 4000.00
2120011980 25
2120011980 41
2120011980 42
2120011980 43
2120013390 1 4000
2120013399 1 4000 |
I was trying to do this FIRST nd LAST statement in SAS, but some of the records are getting suppressed.(also it was not taking the negative sign).
Any suggestions to achieve this ? Or any way to thru SORT? |
|
Back to top |
|
|
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
|
|
|
|
SAS will handle this just fine. Since you did not post any code, there's not much we can say about your results -- SAS normally handles minus signs in numeric data without requiring anything special.
Something like this untested code should work fine, AS LONG AS THE INPUT DATA IS SORTED APPROPRIATELY (use PROC SORT if not):
Code: |
DATA X;
INFILE Y ;
INPUT IDNO ....
BY IDNO EMPN ;
IF FIRST.EMPN THEN TOTPAID = 0 ;
TOTPAID + PAIDAMT ;
IF LAST.EMPN THEN ... (output at this point) |
Your provided sample data is not consistent on input and output -- IDNO 2120013390 EMPN 1 has 2 +2000 and 1 -2000, so the total should be 2000 but your sample output shows 4000. Also, you did not specify if you are breaking on EMPN or IDNO and your sample data does NOT clarify this. |
|
Back to top |
|
|
rikdeb
New User
Joined: 19 Jan 2009 Posts: 63 Location: hyderabad
|
|
|
|
Hi Robert.Thanks for your reply.. Actually i missed that point as correctly pointed out by you. I am checking for the combination of IDNO and EMPN.If this combination is repetitive and the PAIDAMT has values , i want to add the PAIDAMT values, so as to have 1 single row.
I took a fresh input
Code: |
2120005310 10 5000.00
2120006130 7 3750.00
2120006130 7 3750.00
2120007220 10
2120007430 8 7500.00
2120007990 25
2120011530 1 1000.00
2120011530 1 1000.00
2120011860 26
2120011980 42
2120011980 43
2120013390 1 2000.00
2120013390 1 -2000.00
2120013390 1 2000.00
2120013490 27
2120013490 27
2120013490 28
2120015040 1 4000.00 |
Here is the code i am trying
DATA CLM;
INFILE INPOL;
INPUT
@01 IDNO $CHAR10.
@12 EMPN $CHAR02.
@16 PAIDAMT 9.2
;
PROC SORT DATA=CLM;
BY IDNO EMPN;
DATA TEST;
SET CLM; BY IDNO EMPN;
IF FIRST.IDNO AND FIRST.EMPN THEN
AMT = 0;
AMT + PAIDAMT;
IF LAST.IDNO AND LAST.EMPN THEN OUTPUT;
Here is the OP i am getting:
Code: |
IDNO EMPN PAIDAMT AMT
2120005310 10 5000 5000
2120006130 7 3750 7500
2120007220 10 0
2120007430 8 7500 7500
2120007990 25 0
2120011530 1 1000 2000
2120011860 26 0
2120011980 43 0
2120013390 1 2000 6000
2120013490 28 0
2120015040 1 4000 4000 |
IF we look into the AMT column, for the combination (2120013390 1) it is taking 2000 and summing up to give 60000 where as it would be 4000(expected)
Also, few combination of records like below are missing .
Code: |
2120013490 27
2120013490 27
2120013490 28 |
|
|
Back to top |
|
|
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
|
|
|
|
It looks to me like your amount field starts in column 15, not 16 -- hence SAS will NOT read the negative signs in the data.
Also, when you have BY IDNO EMPN; you need to be aware that FIRST.IDNO and FIRST.EMPN (and LAST.IDNO and LAST.EMPN) are handled separately. An example will hopefully make this clear:
Code: |
IDNO EMPN FIRST.IDNO FIRST.EMPN LAST.IDNO LAST.EMPN
0001 0001 1 1 0 0
0001 0001 0 0 0 1
0001 0002 0 1 1 1
0002 0001 1 1 0 0
0002 0001 0 0 0 0
0002 0001 0 0 0 1
0002 0002 0 1 1 1
0003 0003 1 1 1 1
0004 ....
|
I think you need to clarify whether you need to reset the total to zero for each EMPN within an IDNO or only when IDNO changes. |
|
Back to top |
|
|
rikdeb
New User
Joined: 19 Jan 2009 Posts: 63 Location: hyderabad
|
|
|
|
Hi Robert..thanks for the information.!.Stil not able to tweek this condition.
From the above exmaple u provided: here is what i am trying to do.
for the record combination with
Code: |
IDNO EMPN AMT
0001 0001 1000
0001 0001 1000 |
I will add the amount fields and output will be one row with
Next, when EMPN changes to 002, (there is only 1 record with EMPN 002 for IDNO 001,so no need to sumup,total amount will be RESET .It will be value as in input. Being a single row there is nothing to be summed up)
Similarly, for 002 001 combination,(there are 3 rows in input),I want one row with the amount field summed up.
FOR 002 002 combination, or 003 003 combination AMT fields will be as is in the input, since they are single rows and nothing to be summed up.
Bascially i am trying to group by EMPN for each IDNO and add the values in AMT field.
Any suggestion on how to procced? |
|
Back to top |
|
|
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
|
|
|
|
Try this:
Code: |
DATA CLM;
INFILE INPOL;
INPUT
@01 IDNO $CHAR10.
@12 EMPN $CHAR02.
@15 PAIDAMT 9.2 <=== change
;
PROC SORT DATA=CLM;
BY IDNO EMPN;
DATA TEST;
SET CLM; BY IDNO EMPN;
IF FIRST.EMPN THEN AMT = 0; <=== change
AMT + PAIDAMT;
IF LAST.EMPN THEN OUTPUT; <=== change |
|
|
Back to top |
|
|
rikdeb
New User
Joined: 19 Jan 2009 Posts: 63 Location: hyderabad
|
|
|
|
Hi Robert Thanks a lot!!!!!
I was also trying to figure out and came with this code, that gave me the same result like yours and that is what the expected OP. Thanks a ton again!
Code: |
DATA TEST;
SET CLM; BY IDNO EMPN;
IF FIRST.IDNO AND FIRST.EMPN THEN
AMT=0;
AMT+PAIDAMT ;
IF LAST.EMPN THEN OUTPUT;
PROC PRINT DATA=TEST; |
I am just curious to know one thing for better understanding. In one of the above posts u mentioned that
Quote: |
FIRST.IDNO and FIRST.EMPN (and LAST.IDNO and LAST.EMPN) are handled separately |
. So if we want to check for multiple conditions(like i did above,nt sure how it worked ) ,how do we do using this FIRST and LAST statements? |
|
Back to top |
|
|
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
|
|
|
|
If you say BY IDNO EMPN; in SAS, every time FIRST.IDNO is true, FIRST.EMPN will also be true -- FIRST. and LAST. force all lower level BY variables to trigger. Hence you may have
Code: |
IDNO EMPN FIRST.IDNO FIRST.EMPN
0001 0001 1 1
0002 0001 1 1 |
even though EMPN repeats, since IDNO changed the two EMPN both have FIRST.EMPN set to true. In your code, AMT will not be reset to zero if EMPN changes values within a single IDNO -- only a change of IDNO will cause AMT to be zero. Hence if you are looking for AMT to reflect the total of all input records for each EMPN within IDNO, you will not get that -- you will get the running total within IDNO. This may work for you depending upon the data you are using. |
|
Back to top |
|
|
expat
Global Moderator
Joined: 14 Mar 2007 Posts: 8797 Location: Welsh Wales
|
|
|
|
Have you considered using PROC SUMMARY ??? |
|
Back to top |
|
|
|