IBM Mainframe Forum Index
 
 

external DFSORT in JCL steps or internal COBOL sort ?


Searchman

New User


Joined: 28 Dec 2006
Posts: 80
Location: France

PostPosted: Fri Mar 02, 2007 11:36 pm

Hi,

Every day we have to sort large files (sometimes more than 30 million records, always FB).

Currently, our applications use either an external DFSORT in a JCL step or an internal COBOL sort.

Some people say the internal COBOL sort is faster (fewer I/Os) than an external DFSORT; others think, on the contrary, that the external DFSORT is better.

What exactly is the truth? Does it depend on the size of the file, or on other parameters?

NB: the comparison is between A and B.

A: 2 steps
1- DFSORT
2- COBOL pgm reads the file sorted in step 1

B: only 1 COBOL step, which sorts and reads the input file
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Sat Mar 03, 2007 12:54 am

What you're calling an "internal sort" is just COBOL calling DFSORT. And there are two flavors of that ... one where COBOL does the I/O (NOFASTSRT) and another where DFSORT does the I/O (FASTSRT). You can read more about this at:

publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CT00/6.0?DT=20031124135012

To find the "truth" for your particular case, run a comparison between the two. Let us know what you find out.
Searchman

New User


Joined: 28 Dec 2006
Posts: 80
Location: France

PostPosted: Sat Mar 03, 2007 4:06 am

Very interesting ! I'll try to test it and let you know the result...
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Sat Mar 03, 2007 4:57 am

Hello,

If i may, a couple of thoughts for your testing.

Make sure the COBOL program does not specify both USING and GIVING. A COBOL program with both is the worst possible performance choice.

If the code that processes the sorted data is going to skip some of the records, they can be dropped before the sort (if you use an INPUT PROCEDURE, you can RELEASE only the records you need to the sort).
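A minimal sketch of the INPUT PROCEDURE idea, for illustration only. The program name, the ddnames (INFILE, OUTFILE), the 80-byte record layout and the selection rule (keep records whose status byte is 'A') are all assumptions, not taken from any program discussed in this thread. Only the selected records are released to the sort; GIVING writes the sorted output.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTSEL.
      * Hypothetical example: names, layout and selection rule
      * are assumptions made for illustration.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE   ASSIGN TO INFILE.
           SELECT SORT-FILE ASSIGN TO SORTWK1.
           SELECT OUT-FILE  ASSIGN TO OUTFILE.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE RECORDING MODE IS F.
       01  IN-REC.
           05  IN-KEY            PIC X(10).
           05  IN-STATUS         PIC X.
           05  IN-DATA           PIC X(69).
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-KEY          PIC X(10).
           05  FILLER            PIC X(70).
       FD  OUT-FILE RECORDING MODE IS F.
       01  OUT-REC               PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-IN-FLAG            PIC X VALUE 'N'.
           88  END-OF-INPUT      VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           SORT SORT-FILE
               ON ASCENDING KEY SORT-KEY
               INPUT PROCEDURE IS SELECT-RECORDS
               GIVING OUT-FILE
           IF SORT-RETURN > 0
               DISPLAY 'SORT FAILED, SORT-RETURN = ' SORT-RETURN
           END-IF
           STOP RUN.
      * Input procedure: read everything, release only what is needed.
       SELECT-RECORDS.
           OPEN INPUT IN-FILE
           PERFORM UNTIL END-OF-INPUT
               READ IN-FILE
                   AT END
                       SET END-OF-INPUT TO TRUE
                   NOT AT END
                       IF IN-STATUS = 'A'
                           RELEASE SORT-REC FROM IN-REC
                       END-IF
               END-READ
           END-PERFORM
           CLOSE IN-FILE.
```

Records that fail the test never reach the sort, so the sort has fewer records to handle. Note that with an INPUT PROCEDURE the COBOL program does the input I/O itself, so FASTSRT can only help on the GIVING side.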

When you benchmark the two, you will be more interested in the CPU and I/O used than in the wall time. Wall time (the clock on the wall) may vary with the mix of jobs/transactions running when you run the tests, but the resources used for each case will remain fairly constant.

We look forward to your results.
Searchman

New User


Joined: 28 Dec 2006
Posts: 80
Location: France

PostPosted: Sat Mar 03, 2007 2:15 pm

I have to check, but I think all our programs with an "internal" sort use method 1 (INPUT/OUTPUT PROCEDURE).
As for method 2, shown below (from section 6.4.4 of the DFSORT tuning guide, cf. the link in Yaeger's post), I don't clearly see the point of writing it in a COBOL program rather than in a DFSORT step ("external sort").


=> Method 2 :
      *-----------------------------------------------------------------
      * CALL DFSORT TO SORT THE RECORDS IN DESCENDING ORDER.
      *-----------------------------------------------------------------
           SORT SORT-FILE
               ON DESCENDING KEY SORT-KEY
               USING SORTIN
               GIVING SORTOUT.
           IF SORT-RETURN > 0
               DISPLAY "SORT FAILED".
           STOP RUN.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Sun Mar 04, 2007 5:53 am

Hello,

If I understand correctly, I agree. There is no good reason to code a sort with using/giving.

Most of the times I've seen this used, it was because the coder did not want to understand input/output procedures or was just lazy.

I believe that "Method 2" is another example of something we "can" do, but should almost always not do for anything that is going to be a production program (have to leave a bit of room for the case where it really does make sense).
William Thompson

Global Moderator


Joined: 18 Nov 2006
Posts: 3156
Location: Tucson AZ

PostPosted: Sun Mar 04, 2007 6:04 am

Quote:
i agree. There is not a good reason to code a sort with using/giving.
Sort of like declaring IEFBR14 as E15 and E35(?) exits.....
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Sun Mar 04, 2007 6:11 am

Except there is exponentially worse overhead with using/giving.

As I've posted in a couple of other threads, if a program reads some data (QSAM, database, VSAM, whatever) to create a report file, sorts the report data, and then reads the data back to produce the report, the "sorted data" has to be processed 4 times: create, read into sort, write out from sort, and read again to produce the report. A horrible waste of system resources.
Searchman

New User


Joined: 28 Dec 2006
Posts: 80
Location: France

PostPosted: Tue Mar 06, 2007 4:20 am

I've just read a COBOL program with INPUT/OUTPUT PROCEDUREs (Method 1).

The advantage I can see is that the program reads the input file and then generates 1, 2 or 3 records for each record read before the RELEASE statement.

Note that the program is needed to generate those records (lots of rules).

So, if I replace the program with the "internal" sort by an external sort, I have to create another program which reads the input file and writes the generated records to an output file; then an external DFSORT sorts that output file.
I'm not sure it will be more efficient.

To summarize:

Only 1 COBOL PGM to read 20 M rec., generate 45 M rec., sort them, and write 15 M rec. (sum on key)

is it better than :

1 COBOL PGM which reads 20 M rec. and writes 45 M rec.
1 DFSORT step which sorts/sums the 45 M rec. to produce 15 M rec.
?
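
For readers comparing the two options, here is a rough sketch of what the single-program variant could look like. It is not the actual program: the names, the record layouts, the rule that turns one input record into one or two sort records, and the way the sum is done (a control break on the key in the OUTPUT PROCEDURE) are all assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTSUM.
      * Hypothetical sketch: every name, layout and rule below is
      * an assumption made for illustration.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE   ASSIGN TO INFILE.
           SELECT SORT-FILE ASSIGN TO SORTWK1.
           SELECT OUT-FILE  ASSIGN TO OUTFILE.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE RECORDING MODE IS F.
       01  IN-REC.
           05  IN-KEY-1          PIC X(8).
           05  IN-AMT-1          PIC 9(7).
           05  IN-KEY-2          PIC X(8).
           05  IN-AMT-2          PIC 9(7).
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-KEY          PIC X(8).
           05  SORT-AMT          PIC 9(7).
       FD  OUT-FILE RECORDING MODE IS F.
       01  OUT-REC.
           05  OUT-KEY           PIC X(8).
           05  OUT-TOTAL         PIC 9(11).
       WORKING-STORAGE SECTION.
       01  WS-IN-FLAG            PIC X VALUE 'N'.
           88  IN-EOF            VALUE 'Y'.
       01  WS-SORT-FLAG          PIC X VALUE 'N'.
           88  SORT-EOF          VALUE 'Y'.
       01  WS-NEXT-KEY           PIC X(8).
       01  WS-NEXT-AMT           PIC 9(7).
       01  WS-CUR-KEY            PIC X(8).
       01  WS-TOTAL              PIC 9(11).
       PROCEDURE DIVISION.
       MAIN-PARA.
           SORT SORT-FILE
               ON ASCENDING KEY SORT-KEY
               INPUT PROCEDURE IS BUILD-RECORDS
               OUTPUT PROCEDURE IS SUM-RECORDS
           IF SORT-RETURN > 0
               DISPLAY 'SORT FAILED'
           END-IF
           STOP RUN.
      * Input procedure: each input record yields 1 or 2 sort records.
       BUILD-RECORDS.
           OPEN INPUT IN-FILE
           PERFORM UNTIL IN-EOF
               READ IN-FILE
                   AT END
                       SET IN-EOF TO TRUE
                   NOT AT END
                       PERFORM RELEASE-GENERATED
               END-READ
           END-PERFORM
           CLOSE IN-FILE.
       RELEASE-GENERATED.
           MOVE IN-KEY-1 TO SORT-KEY
           MOVE IN-AMT-1 TO SORT-AMT
           RELEASE SORT-REC
           IF IN-KEY-2 NOT = SPACES
               MOVE IN-KEY-2 TO SORT-KEY
               MOVE IN-AMT-2 TO SORT-AMT
               RELEASE SORT-REC
           END-IF.
      * Output procedure: write one summed record per key.
       SUM-RECORDS.
           OPEN OUTPUT OUT-FILE
           PERFORM RETURN-NEXT
           PERFORM UNTIL SORT-EOF
               MOVE WS-NEXT-KEY TO WS-CUR-KEY
               MOVE ZERO TO WS-TOTAL
               PERFORM UNTIL SORT-EOF
                          OR WS-NEXT-KEY NOT = WS-CUR-KEY
                   ADD WS-NEXT-AMT TO WS-TOTAL
                   PERFORM RETURN-NEXT
               END-PERFORM
               MOVE WS-CUR-KEY TO OUT-KEY
               MOVE WS-TOTAL   TO OUT-TOTAL
               WRITE OUT-REC
           END-PERFORM
           CLOSE OUT-FILE.
       RETURN-NEXT.
           RETURN SORT-FILE
               AT END
                   SET SORT-EOF TO TRUE
               NOT AT END
                   MOVE SORT-KEY TO WS-NEXT-KEY
                   MOVE SORT-AMT TO WS-NEXT-AMT
           END-RETURN.
```

In this shape the generated records go straight to the sort and are never written to an intermediate dataset. Whether that beats a separate DFSORT step that sorts and sums is exactly what a CPU and I/O comparison of the two jobs would have to show.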
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Tue Mar 06, 2007 4:40 am

Hello,

Well, it wouldn't take long to create both processes. If you write a program to read the 20m and write or release the 45m, it would not be much work to make a second version that does it the "other" way.

Timing tests (CPU and I/O) on both would be interesting.

What happens to the 15m that come out of the sort? Is that file used in only one process or multiple processes?
Searchman

New User


Joined: 28 Dec 2006
Posts: 80
Location: France

PostPosted: Tue Mar 06, 2007 1:17 pm

You mean the 30 M (45-15) ?
Note that these are not the exact figures; I gave them as an example.

The point is to understand the problem:
1- the RELEASE increases the number of records compared with those read
2- the SUM decreases the number of records in the sort file

But it's true there are millions of records.

I hope it won't take long before I can test the two solutions.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Tue Mar 06, 2007 9:20 pm

Hello,

Quote:
You mean the 30 M (45-15) ?


I just used:
Quote:

Only 1 COBOL PGM to read 20 M rec., generate 45 M rec., sort them, and write 15 M rec. (sum on key)


If you do both processes in the COBOL code, you will be able to reduce the number of i/o.

There was a time (especially when region sizes were smaller) when the internal sort performed poorly for large sorts. With multi-meg region sizes and the improvements made to the sort, that may no longer be as critical. The results of both tests will be most interesting.
Searchman

New User


Joined: 28 Dec 2006
Posts: 80
Location: France

PostPosted: Wed Mar 07, 2007 3:37 am

Thanks for your answer. I also think the COBOL sort might be better in this case.

Now, it's time (if I can) to run the tests!...
Searchman

New User


Joined: 28 Dec 2006
Posts: 80
Location: France

PostPosted: Tue Mar 13, 2007 4:48 am

The result for 30 000 000 records read is:

PGM + DFSORT is a little bit faster than the "internal" SORT in the COBOL PGM:
74 s CPU vs 76 s CPU
10'31" elapsed time vs 11'36"


Details below.

1st job, with internal sort in COBOL pgm (1 step)

input: 30 000 000 (= 30 M) records FLB, LRECL = 700
before RELEASE to sort: # 45 M records FLB, LRECL = 130
CPU time = 76 s
elapsed time = 11'36"
number of disk blocks = 786329

2nd job, with COBOL pgm and DFSORT (2 steps)
1st step (COBOL PGM)
input: 30 M rec. FLB, LRECL = 700
output: 45 M rec. FLB, LRECL = 130
CPU time = 21.1 s
elapsed time = 7'15"
number of disk blocks = 782197

2nd step (DFSORT with SORT FIELDS and SUM FIELDS)
input: 45 M rec. FLB, LRECL = 130
output: 30 M rec. FLB, LRECL = 130
CPU time = 53.7 s
elapsed time = 3'16"
number of disk blocks = 10415
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Tue Mar 13, 2007 5:43 am

Very interesting. . .

Thank you for running both tests and posting the results.
All times are GMT + 6 Hours