PLUS STUDENT SSN 000-09-6843
11111111 F EFF 032408-000000 AGD 0531
PLUS STUDENT SSN 000-09-6842
22222222 F EFF 032408-000000 AGD 0531
32222222 F EFF 032408-000000 AGD 0531
42222222 F EFF 032408-000000 AGD 0531
PLUS STUDENT SSN 000-09-6842
52222222 F EFF 032408-000000 AGD 0531
Output should be
Code:
PLUS STUDENT SSN 000-09-6843
11111111 F EFF 032408-000000 AGD 0531
PLUS STUDENT SSN 000-09-6842
22222222 F EFF 032408-000000 AGD 0531
PLUS STUDENT SSN 000-09-6842
52222222 F EFF 032408-000000 AGD 0531
i.e., whenever a record has PLUS in the 5th position, that record should be written to the output file, along with the record immediately following it.
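To make the requirement concrete, here is a minimal Python sketch of the selection rule, done in a single pass. It is only an illustration of the logic, not the sort job itself. The function name `keep_marker_and_next` and its parameters are hypothetical, and the sample assumes the PLUS records really start with four blanks (so that PLUS sits in position 5), which the forum rendering appears to have dropped.

```python
from typing import Iterable, Iterator


def keep_marker_and_next(
    records: Iterable[str], marker: str = "PLUS", pos: int = 5
) -> Iterator[str]:
    """Yield every record with `marker` at 1-based column `pos`,
    plus the record that immediately follows it."""
    start = pos - 1
    end = start + len(marker)
    prev_was_marker = False
    for rec in records:
        is_marker = rec[start:end] == marker
        # Keep the marker record itself and the one right after a marker.
        if is_marker or prev_was_marker:
            yield rec
        prev_was_marker = is_marker


# Assumed layout: four leading blanks put PLUS in position 5.
sample = [
    "    PLUS STUDENT SSN 000-09-6843",
    "11111111 F EFF 032408-000000 AGD 0531",
    "    PLUS STUDENT SSN 000-09-6842",
    "22222222 F EFF 032408-000000 AGD 0531",
    "32222222 F EFF 032408-000000 AGD 0531",
    "42222222 F EFF 032408-000000 AGD 0531",
    "    PLUS STUDENT SSN 000-09-6842",
    "52222222 F EFF 032408-000000 AGD 0531",
]

for rec in keep_marker_and_next(sample):
    print(rec)
```

With the sample above, the records starting 32222222 and 42222222 are dropped, because neither one follows a PLUS record.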
While Arun's SYNCTOOL application will work, it passes the data twice. The SORT step that I previously provided will only pass the data once. If your input is small, then the performance difference is negligible. However, if you have a significant amount of data, then this may be an issue.
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
Alissa,
The Syncsort job you provided works perfectly for the sample input given above, but I believe it may not give the desired results when the input looks like the one below. I can't test this as I'm out of the office now. Corrections are welcome.
Code:
----+----+----+----+----+----+----+--
PLUS STUDENT SSN 000-09-6843
11111111 F EFF 032408-000000 AGD 0531
PLUS STUDENT SSN 000-09-6842
22222222 F EFF 032408-000000 AGD 0531
32222222 F EFF 032408-000000 AGD 0531
42222222 F EFF 032408-000000 AGD 0531
PLUS STUDENT SSN 000-09-6842
52222222 F EFF 032408-000000 AGD 0531
52222223 F EFF 032408-000000 AGD 0531
52222224 F EFF 032408-000000 AGD 0531
52222225 F EFF 032408-000000 AGD 0531
That is correct. I only gave a solution for what the OP specifically asked for, based on the sample input records. My solution will not work with your sample data. For your example, you can code the following:
lokeshwar,
The first IFTHEN inserts an 8-digit sequence number (1, 2, 3, ...) at position 81. The second IFTHEN adds 1 to that sequence number on every record that has 'PLUS' at position 5, so each PLUS record ends up with the same number as the record right after it. For your sample input, the temporary dataset T1 will look something like this:
Code:
----+----+----+----+----+----+----+-- pos-81
PLUS STUDENT SSN 000-09-6843 00000002
11111111 F EFF 032408-000000 AGD 0531 00000002
PLUS STUDENT SSN 000-09-6842 00000004
22222222 F EFF 032408-000000 AGD 0531 00000004
32222222 F EFF 032408-000000 AGD 0531 00000005
42222222 F EFF 032408-000000 AGD 0531 00000006
PLUS STUDENT SSN 000-09-6842 00000008
52222222 F EFF 032408-000000 AGD 0531 00000008
From T1, the SELECT operator copies only those records whose value at position 81 occurs more than once, which is exactly what you need. The final BUILD discards the sequence numbers, which we don't need any more.
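For anyone who wants to check the logic away from the mainframe, the sequence-number trick can be simulated in a few lines of Python. This is only a sketch of the idea, not the SYNCTOOL job: `tag_records` and `select_all_dups` are hypothetical names, and the sample assumes the PLUS records carry four leading blanks so that PLUS sits in position 5.

```python
from collections import Counter


def tag_records(records, marker="PLUS", pos=5):
    """Mimic the two IFTHEN steps: number the records 1, 2, 3, ...
    and add 1 to the number on every record with `marker` at `pos`."""
    start = pos - 1
    end = start + len(marker)
    tagged = []
    for seq, rec in enumerate(records, start=1):
        key = seq + 1 if rec[start:end] == marker else seq
        tagged.append((key, rec))
    return tagged


def select_all_dups(tagged):
    """Mimic the SELECT step: keep records whose key occurs more than
    once, then drop the key (the job of the final BUILD)."""
    counts = Counter(key for key, _ in tagged)
    return [rec for key, rec in tagged if counts[key] > 1]


# Assumed layout: four leading blanks put PLUS in position 5.
sample = [
    "    PLUS STUDENT SSN 000-09-6843",
    "11111111 F EFF 032408-000000 AGD 0531",
    "    PLUS STUDENT SSN 000-09-6842",
    "22222222 F EFF 032408-000000 AGD 0531",
    "32222222 F EFF 032408-000000 AGD 0531",
    "42222222 F EFF 032408-000000 AGD 0531",
    "    PLUS STUDENT SSN 000-09-6842",
    "52222222 F EFF 032408-000000 AGD 0531",
    "52222223 F EFF 032408-000000 AGD 0531",
    "52222224 F EFF 032408-000000 AGD 0531",
    "52222225 F EFF 032408-000000 AGD 0531",
]

tagged = tag_records(sample)
for rec in select_all_dups(tagged):
    print(rec)
```

The keys come out as 2, 2, 4, 4, 5, 6, 8, 8, 9, 10, 11, so only the three PLUS records and the record after each one survive the selection.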