thunderstorm
New User
Joined: 23 Mar 2007 Posts: 35 Location: pune
We have a single input file and a single output file. The file is sorted on a key field that is defined as character but can also contain numeric values.
Our mainframe installation has EQUALS as the default.
When I run the step with the EQUALS option and compare its output with the output of the same step without the EQUALS option, I expect the two files to be identical. Instead, records with the same key appear in a different order.
Can you tell me why this is happening?
Thanks,
Thunder
superk
Global Moderator
Joined: 26 Apr 2004 Posts: 4652 Location: Raleigh, NC, USA
Phrzby Phil
Senior Member
Joined: 31 Oct 2006 Posts: 1050 Location: Richmond, Virginia
When you say "character" you need not say "may also contain numeric values." Digits are characters just like every other symbol represented by a hex code, and so are the hex codes without display characters. All 256 values can be there.
Also, numeric "value" may be misleading, because " 10" and "010" may evaluate the same numerically, but not as character strings.
Being picky here, but that's the way we need to be.
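A minimal sketch of that difference, in Python rather than sort control statements. The key values are made up for illustration, and code page 037 is assumed as the EBCDIC code page; the thread does not say which one applies.

```python
# Hypothetical key values; none of these appear in the thread.
keys = ["010", " 10", "9", "10"]


def ebcdic_bytes(text: str) -> bytes:
    """Encode with code page 037, assumed here as a typical EBCDIC code page."""
    return text.encode("cp037")


# Character-style ordering: compare the byte values left to right.
# In EBCDIC a blank is X'40' and the digits are X'F0' to X'F9',
# so a leading blank sorts ahead of a leading zero.
by_character = sorted(keys, key=ebcdic_bytes)

# Numeric ordering: compare the values the strings represent.
by_value = sorted(keys, key=int)

# " 10" and "010" are the same number but different byte strings.
same_value = int(" 10") == int("010")
same_chars = ebcdic_bytes(" 10") == ebcdic_bytes("010")

print(by_character)
print(by_value)
print(same_value, same_chars)
```

The two orderings disagree even though three of the four keys represent the same number, which is why the field format matters more than what the data "looks like".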
Phrzby Phil
Senior Member
Joined: 31 Oct 2006 Posts: 1050 Location: Richmond, Virginia
Superk -
How does that explain it? The poster says their default is EQUALS, so whether he specifies it or not, shouldn't he get the same results?
BTW - My opinion - I think the default should be NOEQUALS. Our SAS installation has default = EQUALS, and I see novices therefore assume that any sort anywhere will preserve order.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
and I see novices therefore assume that any sort anywhere will preserve order.
Once upon a time every student was given at least one exercise that would demonstrate this. The teachers/instructors were much more qualified and actually cared if the students learned the material. These days many of the so-called "technical schools" are only in it for the $.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
To me also EQUALS is a strange default. EQUALS necessarily uses more resources. For every sort with unique keys, you'd have to specify NOEQUALS explicitly, or just go with the waste.
Two sorts, one with EQUALS as the default and the other specifying EQUALS, should give you identical output files.
So, let's see the two steps, and their output messages.
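One way to picture why preserving order cannot be free: keeping equal-keyed records in input order is equivalent to sorting on the key plus a hidden input sequence number. The Python sketch below only illustrates that equivalence with made-up records. It is not a description of how DFSORT or Syncsort implement EQUALS internally.

```python
# Hypothetical records: (sort key, payload). Not taken from the thread.
records = [
    ("B", "first B"),
    ("A", "first A"),
    ("B", "second B"),
    ("A", "second A"),
]

# Tag each record with its position in the input.
tagged = [(key, seq, payload) for seq, (key, payload) in enumerate(records)]

# Sorting on (key, seq) makes every comparison key unique, so any correct
# sort algorithm must return equal-keyed records in their input order.
tagged.sort(key=lambda item: (item[0], item[1]))

# Drop the sequence number again to get the final output.
ordered = [(key, payload) for key, _seq, payload in tagged]

for record in ordered:
    print(record)
```

The extra field has to be carried and compared for every record, which is the kind of overhead that is wasted when the keys are already unique.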
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
Two sorts, one with EQUALS as the default and the other specifying EQUALS, should give you identical output files.
This would be true only part of the time: when there are no duplicate records on the "sort key(s)". If there are duplicates and NOEQUALS is in effect, the order of the duplicates is unpredictable.
Possibly I'm out of sync with the topic . . .
Phrzby Phil
Senior Member
Joined: 31 Oct 2006 Posts: 1050 Location: Richmond, Virginia
Quote:
Possibly I'm out of sync with the topic
Uncharacteristically, I think so.
As Bill says, and I agree, whether it's EQUALS by specific declaration or by default, results should be the same.
I think you and superk missed that detail here.
Luckily, it's not a fatal mistake, because if there's one thing I've learned on this forum, you're allowed only one of those.
Now, if Bill could say something funny - that would be nice.
Thunder - let's see your code and the verbiage showing EQUALS is your default value.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Quote:
Possibly I'm out of sync with the topic . . .
Out of DF maybe? Wrong forum to be out of sync.
I always thought writing Ph... Phil was funny enough.
Phrzby Phil
Senior Member
Joined: 31 Oct 2006 Posts: 1050 Location: Richmond, Virginia
Yeah Bill - that's good and appreciated. Gotta keep what's important up front. Thanks!
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hi Guys,
Quote:
Possibly I'm out of sync with the topic . . .
Quote:
Uncharacteristically, I think so.
As Bill says, and I agree, whether it's EQUALS by specific declaration or by default, results should be the same.
Yup, I agree that if EQUALS is in effect (no matter how) the results should be the same (unless there is some other magic under the covers).
IIRC, there are multiple ways to override the default, but I don't have the Syncsort material available this weekend.
Quote:
So, let's see the two steps, and their output messages.
As Bill requested, it would be most helpful to see the informational output generated by each run.
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
For DFSORT, if EQUALS is in effect for both steps (regardless of how EQUALS is set in effect) with the same input, the output will be identical.
If NOEQUALS is in effect for either or both steps, the output may not be identical.
Thunderstorm would have to show us the input, output and JES log for both runs in order for anyone to say anything conclusive about the results vs expectations. Anything else is just guessing.
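To make the distinction concrete, here is a small Python sketch with made-up records. Python's built-in sorted() is guaranteed to keep equal keys in input order, so it stands in for a run with EQUALS in effect. A plain selection sort stands in for a sort that makes no such promise. Neither is meant to represent what DFSORT actually does internally.

```python
def selection_sort_by_key(records):
    """Sort on the key only. The swaps can reorder records with equal keys."""
    out = list(records)
    for i in range(len(out)):
        smallest = i
        for j in range(i + 1, len(out)):
            if out[j][0] < out[smallest][0]:
                smallest = j
        out[i], out[smallest] = out[smallest], out[i]
    return out


# Hypothetical records: (sort key, input sequence number).
records = [("B", 1), ("B", 2), ("A", 3)]

# Order-preserving sort: ties stay in input order.
stable = sorted(records, key=lambda r: r[0])

# Sort with no ordering promise for ties.
unstable = selection_sort_by_key(records)

print(stable)
print(unstable)
```

Both results are correctly sorted on the key. They differ only in the order of the two "B" records, which is exactly the kind of difference a file compare would flag.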
Phrzby Phil
Senior Member
Joined: 31 Oct 2006 Posts: 1050 Location: Richmond, Virginia
Of course, but as usual we've enjoyed each other's company awaiting Thunder's return.
thunderstorm
New User
Joined: 23 Mar 2007 Posts: 35 Location: pune
Frank,
I have a file that I am sorting on a character field.
The field starts in position 7 and is 37 bytes long, and I am sorting in ascending sequence.
Here is the portion of the data where I am facing the issue:
Input :
----+----1----+----2----+----3----+----4----+----5-
04072 RR 1
06498 RR 1
08648 RR 1
18447 RR 1
17754 RR 1
15801 RR 1
Output 1
When I submit the SORT job with only these 6 records in the input file and without an explicit EQUALS (SORT FIELDS=(7,37,CH,A)), I get this output:
04072 RR 1
06498 RR 1
08648 RR 1
18447 RR 1
17754 RR 1
15801 RR 1
Output 2
When I submit the job with around 2000 records in the input file and with EQUALS explicitly specified in the sort card (SORT FIELDS=(7,37,CH,A),EQUALS), and I look at the order of only the above records in the output, I find them in this order:
04072 RR 1
06498 RR 1
08648 RR 1
15801 RR 1
17754 RR 1
18447 RR 1
The sort order seems to be varying with the number of records in the input file. Can you help me understand why this is happening? If you need any other details, please let me know.
Thanks,
Thunder
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
As stated previously:
Show us the complete JES log for each run.
If you prefer, you can e-mail it to me directly (yaeger@us.ibm.com) and I'll take a look.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
The sort order seems to be varying with the number of records in the input file.
Yup, this can and does happen, but not for the reason you believe. It has nothing to do with any particular record count.
The problem is that your sort key starts with the RR, and the numbers before the RR are not part of it. If you want the records in order of those numbers within the RR and whatever else follows (not just in input sequence), you need to name those positions in your sort control.
As the number of records with the same "key" increases, it becomes almost certain that the original order will not survive unless EQUALS is in effect.
So, either code EQUALS or name the positions in the sort control.
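The two choices can be sketched in Python using the six records as they are displayed in the thread. Two assumptions apply: the display may have collapsed blanks, so the real record layout could differ, and the leading number is taken to occupy positions 1 to 5. The function names are hypothetical.

```python
# The six records as displayed in the thread (layout assumed, see above).
records = [
    "04072 RR 1",
    "06498 RR 1",
    "08648 RR 1",
    "18447 RR 1",
    "17754 RR 1",
    "15801 RR 1",
]


def key_only(rec: str) -> str:
    """Positions 7 to 43, i.e. start 7 for length 37 (1-based)."""
    return rec[6:43]


def key_then_prefix(rec: str) -> tuple:
    """Same key, then positions 1 to 5 as a second sort field."""
    return (rec[6:43], rec[0:5])


# Order-preserving sort on the key alone: all six keys are equal,
# so the records come back in input order.
input_order = sorted(records, key=key_only)

# Naming the leading positions as well: the order no longer depends
# on input sequence at all.
by_prefix = sorted(records, key=key_then_prefix)

print(input_order)
print(by_prefix)
```

With the extra field in the key, the result is the same whether or not equal keys keep their input order, because no two of these records compare equal any more.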
Phrzby Phil
Senior Member
Joined: 31 Oct 2006 Posts: 1050 Location: Richmond, Virginia
You have not said or shown that, in your file with about 2000 records, the six records you are showing are in fact in the same order as in the file with just those six records.
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Again: Thunderstorm would have to show us the input, output and JES log for both runs in order for anyone to say anything conclusive about the results vs expectations. Anything else is just guessing.