PUMA
New User
Joined: 08 Aug 2006 Posts: 10 Location: FRANCE
Hi,
Is there a facility in DFSORT or ICETOOL to verify that the data in my very large file is already in sorted order, so that I can avoid running the sort at all? In other words, is there a function that reads all my data and verifies that it is sorted on a given criteria?
Thanks
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
If the procedures have been set up in the proper way, and the programs have been properly tested, there is no need for such a check.
Just curious... if the dataset is not sorted, what are you going to do? Another pass to sort it, maybe?
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote:
If the procedures have been set up in the proper way, and the programs have been properly tested, there is no need for such a check.
Agree with Enrico.
FWIW - if this is an external file and you cannot control the content, and it "should" be in sequence (but might not be), go ahead and run the sort. If the file is completely in sequence, only a fraction of the usual resources are used. Sort is smart enough to recognize the data is already in sequence and processes accordingly.
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Puma,
You can do a one-file MERGE like the one below. MERGE will issue a message and terminate if it finds a record out of sorted order. MERGE is more efficient than SORT.
Code:
//S1       EXEC PGM=ICEMAN
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=...  file you want to check
//SORTOUT  DD DUMMY
//SYSIN    DD *
  MERGE FIELDS=(...)
/*
FIELDS for MERGE would be the same as you'd use for SORT.
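Conceptually, the check a one-file MERGE performs amounts to a single pass comparing each record's key to the previous record's key. As a rough illustration of that logic outside of DFSORT (the field positions here are hypothetical), a minimal Python sketch:

```python
def first_out_of_sequence(records, key_start, key_len):
    """Return the 1-based record number of the first record whose key
    is lower than the previous record's key, or 0 if the data is in
    ascending sequence. Key positions are illustrative only."""
    prev_key = None
    for recno, rec in enumerate(records, start=1):
        key = rec[key_start:key_start + key_len]
        if prev_key is not None and key < prev_key:
            return recno
        prev_key = key
    return 0

# A file in sequence on a 3-byte key in position 1:
print(first_out_of_sequence(["001AAA", "002BBB", "003CCC"], 0, 3))  # 0 (in sequence)
# Record 3 is out of order:
print(first_out_of_sequence(["001AAA", "003CCC", "002BBB"], 0, 3))  # 3
```

Like the one-file MERGE, this stops at the first out-of-sequence record rather than continuing through the rest of the file.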
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Quote:
FWIW - if this is an external file and you cannot control the content and it "should" be in sequence (but might not), go ahead and run the sort. If the file is completely in sequence, only a fraction of the usual resources are used. Sort is smart enough to recognize the data is already in sequence and processes accordingly.
Huh? Where did you get that idea? It's not true. For a sort application, DFSORT has no way of knowing in advance if the file is already in sorted order.
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
Frank Yaeger wrote:
You can do a one-file MERGE like the one below. MERGE will issue a message and terminate if it finds a record out of sorted order. MERGE is more efficient than SORT.
I was thinking along the same lines, but I thought that the merge needed two inputs (one dummy?).
Also, I couldn't figure out what return code would be generated when the "out of sequence" error was posted.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hi Frank,
Quote:
For a sort application, DFSORT has no way of knowing in advance if the file is already in sorted order
No, it doesn't know in advance, but it can surely "see" this while the input is being processed.
Long ago, some of the sort products wrote out sequenced strings of input records and then merged the strings to create the final output. The longer the initial strings, the faster the sort ran (fewer strings meant less manipulation). When we ran completely random sets of input, the process took far longer and used more resources than when an in-sequence file was processed.
Maybe with all of the improvements in technology, this went away. Maybe it was not part of DFSORT. Maybe my memory suffers from far too many systems and products. . .
We don't have DFSORT available on my current systems, so I can't run a volume test. I'd be interested in seeing the difference in run stats (if any) in sorting a file with a few million "unsorted" records and then re-sorting the sorted output.
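The string-based behaviour described above can be pictured by counting the maximal ascending runs ("strings") already present in the input: the fewer the runs, the less merge work remains. A toy Python sketch of that idea (not DFSORT's actual algorithm):

```python
def count_ascending_runs(keys):
    """Count maximal ascending runs in the input key sequence.
    1 run  -> input already in sequence (nothing left to merge);
    n runs -> roughly n sorted strings that must still be merged."""
    if not keys:
        return 0
    runs = 1
    for prev, cur in zip(keys, keys[1:]):
        if cur < prev:   # sequence breaks here: a new run starts
            runs += 1
    return runs

print(count_ascending_runs([1, 2, 3, 4, 5]))  # 1 (fully sorted input)
print(count_ascending_runs([5, 4, 3, 2, 1]))  # 5 (every element its own run)
print(count_ascending_runs([2, 1, 4, 3, 6]))  # 3
```

Under the old string-merging scheme Dick describes, the random input (many short runs) would cost more passes than the in-sequence input (one run).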
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Quote:
I thought that the merge needed two inputs (one dummy?).
No, it doesn't. You can do a MERGE with one file.
Quote:
I couldn't figure out what return code would be generated when the "out of sequence" error was posted.
For an out-of-sequence record you get this message and return code:
Code:
ICE068A 0 OUT OF SEQUENCE SORTIN01
RC=16
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Quote:
I'd be interested in seeing the difference in run stats (if any) in sorting a file with a few million "unsorted" records and then re-sorting the sorted output.
I ran a test sorting records that were already in order vs sorting records that were in reverse order, and there was no appreciable difference in the run stats.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hi Frank,
Quote:
sorting records that were already in order vs sorting records that were in reverse order and there was no appreciable difference in the run stats.
Yup, I'd believe that, because both are "in sequence" already.
If the same file had a 10-position sort key and was sorted by (10,1,ch,a,9,1,ch,a,8,8,ch,a. . . etc), I would expect the resource usage to go up (both CPU and EXCP). As I hinted at earlier, the results are more dramatic for large files.
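The reversed-key re-sort Dick describes can be sketched in Python: building the sort key from the same bytes in reverse order turns an in-sequence file into one that is thoroughly out of order (key positions here are illustrative):

```python
def reversed_key(rec, key_start=0, key_len=10):
    """Sort key built from the key bytes taken in reverse order,
    so the last byte becomes the most significant (positions
    are illustrative only)."""
    return rec[key_start:key_start + key_len][::-1]

# A file already in ascending order on its 10-byte key:
recs = ["0000000001", "0000000002", "0000000010", "0000000100"]
# Re-sorting on the reversed key scrambles the original order:
print(sorted(recs, key=reversed_key))
```

Relative to the original key order, the reversed-key output is effectively random, which is what makes it a useful worst-ish case for a resource comparison.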
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Quote:
Yup, I'd believe that, because both are "in sequence" already.
No, only one file is already in sequence. The experiment I did is like this:
Sort 1 - input records look like this (RECFM=FB,LRECL=100):
Code:
00000001
...
00300000
I used SORT FIELDS=(11,8,ZD,A) - so the records are already in order.
Sort 2 - input records look like this (RECFM=FB,LRECL=100):
Code:
00300000
...
00000001
I used SORT FIELDS=(1,3,CH,A) - so the records are not already in order.
I think this experiment tests one variation of your statement that
Quote:
If the file is completely in sequence, only a fraction of the usual resources are used
(although it certainly doesn't test every variation). The first file is completely in sequence. The second file is completely out of sequence. But there was no appreciable difference in the resources used, so I don't think your blanket statement above is accurate, although it certainly may be true in certain cases.
I've asked our Performance Team Lead to comment on this. He knows much more about performance than I do, so I'll defer to him.
Dave Betten
New User
Joined: 24 Jan 2006 Posts: 26
This notion of whether a sort uses fewer resources if the input is already in sequence is complex, and there's no simple answer. First, one has to define what we mean by resources.
CPU - in some cases (but not all) there will be less CPU time required to do the sort if the input is already in sequence. The degree of CPU savings is going to vary depending on the characteristics of the sort and the available resources. During the input phase, we gain some efficiencies in the number of instructions required, since the records we read in are already in sequence. During the output phase, we may or may not be more efficient in how we merge those strings and write the output. Long ago, when we were running with limited amounts of main storage, the process of merging those strings was less efficient, so having the data already in sequence could have a big effect. But in today's environments, where we can optimize main storage for larger sorts, that merge process is more efficient. And in cases where we are able to read the entire file into memory, we're going to be very efficient whether the input was in sequence or not.
Intermediate storage - whether we're talking central storage or DASD, we're still going to have to store the entire file in some sort of intermediate storage. This is because we can't write any of the records out until the last one is read. For all we know, that last record could be the first one to be written out! Yes, we might gain some efficiencies and reduce our intermediate storage requirement slightly, but we're still going to have to store the entire file somewhere. So I'd say you're going to require almost the same amount of intermediate storage whether your input is in sequence or not. And if that intermediate storage is on DASD, you're still going to do quite a bit of I/O to work datasets that you would never need to do if you ran the file through a MERGE, as was suggested earlier. This is the main reason I would disagree with the idea that a sort is going to use "a fraction of the usual resources" if the input file is already in sequence.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hi Dave,
Thanks for your reply.
Quote:
Long ago when we were running with limited amounts of main storage, the process of merging those strings was less efficient so having the data already in sequence could have a big effect.
Long ago and far away. . . And that would have been when we were trying these experiments.
d