I have the following scenario. There are 2 sort steps in a job, which use the same input having approx. 180,000,000 records. The record length of the input and output files is 600. Two outputs are created by the 2 steps.
First step is like below,
45,3 is the record type. '001' is a header record, which can have many associated records with record types '032', '201', etc.
Second step is like below,
In the above steps they used the same SORT FIELDS,
These steps take a lot of time because the input has a huge number of records.
Please let me know if there is another SORT approach that would eliminate the performance issue.
This is quite vague. You may or may not be able to do anything to improve performance, but you haven't given enough information for anyone to determine that.
Your INCLUDE statements are reducing the number of records to be sorted, which is a good thing. But you're using two passes over the data, which is a bad thing.
Ideally, you might want to use an INCLUDE statement that includes the relevant records for both output data sets (to reduce the number of records to be sorted) and two OUTFIL statements, each with an INCLUDE operand to create the output data sets from the sorted records. Like this:
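For anyone reading along, a minimal sketch of that layout in DFSORT control statements might look like the following. It is not the actual job: the sort key (1,10,CH,A), the DD names OUT1 and OUT2, and the split of record types between the two outputs are assumptions made for illustration. Only the record-type field at 45,3 and the values '001', '032', and '201' come from the question.

```
* Sketch only: sort key, DD names, and the split of record types
* between OUT1 and OUT2 are assumed, not taken from the real job.
* Keep only the record types either output needs, before sorting.
  INCLUDE COND=(45,3,CH,EQ,C'001',OR,
                45,3,CH,EQ,C'032',OR,
                45,3,CH,EQ,C'201')
* Substitute the SORT FIELDS that both original steps share.
  SORT FIELDS=(1,10,CH,A)
* Split the sorted records into the two output data sets.
  OUTFIL FNAMES=OUT1,INCLUDE=(45,3,CH,EQ,C'001')
  OUTFIL FNAMES=OUT2,
         INCLUDE=(45,3,CH,EQ,C'032',OR,45,3,CH,EQ,C'201')
```

With this shape the input is read and sorted once, and the JCL needs OUT1 and OUT2 DD statements in place of the two steps' separate SORTOUT data sets. How much it helps depends on how many records the INCLUDE statement actually drops before the sort.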
You haven't really demonstrated that there is an issue, or what the issue is.
And you didn't do what I suggested.
In my job, I used an INCLUDE statement to remove unneeded records so they wouldn't be sorted. You don't have that INCLUDE statement so you're sorting all of the records. The INCLUDE statement may or may not help, but you didn't try it, so you don't know.