IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Binary search issue in table array


IBM Mainframe Forums -> COBOL Programming
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
lavish.garg

New User


Joined: 06 Mar 2012
Posts: 4
Location: india

PostPosted: Tue Mar 06, 2012 6:50 pm
Reply with quote

Hi,

The data in the table array should be in sorted order to use SEARCH ALL. I want to know that the sorting of alphanumeric data should be according to the ascii code or any other code??

The reason I am asking this question is:
I am using an array having ascending on clause. The key has PIC clause X(20). The table array is populated with five entries which are case sensitive. These entries are populated from the table using a cursor having order by clause on the Key_Id of the table.
Now suppose I have following data in the table(all having length 20 and have trailing spaces to complete the length):
MMlavish1garg
MMlavish2garg
PlavPlagarg
Plavishgarg
Ulavishgarg

The data will be populated in the above order in the table array as 'i' has higher ASCII value then 'P'. I have vaerified this using the XPED in CICS that the data is populated in this way only.

Now when I use SEARCH ALL on the bases of Key_Id and try to get match corresponding to 'Plavishgarg', it is not able to find the entry in the table array. I have tried setting index to 1 before using 'search all' still it is not working.

Please provide me a solution to this.

Thanks..
Back to top
View user's profile Send private message
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

PostPosted: Tue Mar 06, 2012 7:03 pm
Reply with quote

First, you keep using the term "ASCII". This is wrong. Mainframes use EBCDIC, not ASCII. If you don't know the difference -- learn it. Fast.

Second, if you look in the COBOL Language Reference manual index, you will NOT find the term "array" anywhere used. COBOL uses TABLES, not ARRAYS. Terminology is critical in IT, where similar terms may mean very different things.

Third, you did not provide enough information to answre your question. Things that we need to know are:
1) How many table elements in your table?
2) How did you ensure that the table elements were loaded in ascending order -- just saying "ascending key" in your definition does NOT impose any kind of requirement for them to be sorted at run-time. And the table elements you posted are NOT sorted in ascending EBCDIC sequence -- hence if your table was loaded as indicated, your binary sort will not work since the elements are not sorted appropriately.
3) How did you initialize the unused elements of your table?
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Tue Mar 06, 2012 7:08 pm
Reply with quote

hate to impose yet another fact on a clueless rookie,

a COBOL SEARCH ALL (binary search) will be faster
(doubly important for a CICS module,
no telling what other BS you have in this module)
if the COBOL INTERNAL table is defined as ODO.
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Tue Mar 06, 2012 7:33 pm
Reply with quote

As dbz has mentioned, if you have, say, OCCURS 128 for your table but only five entries populated, the SEARCH ALL will look four times at data which is not populated before it even starts to hit your data. As Robert has mentioned, if you do not give the "empty" table entries a high key value (HIGH-VALUES is good) then again it plain will not work.

With the ODO, as dbz mentioned, and when used correctly, only the part of the table populated will be searched, and the values in the other table entries are irrelevant.

Anything anyone forgot to mention?
Back to top
View user's profile Send private message
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

PostPosted: Tue Mar 06, 2012 9:28 pm
Reply with quote

yep.... but in ASCII the numbers come first..
as unpleasant as it might be, <ordering> in a multi-platform/distributed environment is a proper concern
the user ( AKA luser ) when asking for something to be ordered on an alphanumeric key has the right to a consistent display of the results
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Tue Mar 06, 2012 9:59 pm
Reply with quote

Enrico,

i agree with you,

but it is the user/programmer/implementation team responsibility
to insure that all data used by the program belong to the same code page.

as long as all data is of the same code page, there will be no problem.

the same cobol program on the mainframe
should have data files in ebcdic
and the appropriate encode parms (bind card) on the db2 select/insert/update
to insure the ebcdic is converted to ascii prior to inclusion in the db
and converted to ebcdic prior to presentation to the cobol program.

same is so for a pc/server/other ascii implementations.

when that is done, there is no problem
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Tue Mar 06, 2012 10:03 pm
Reply with quote

lavish.garg, where is your CICS running? ASCII machine? You say you used XPED to verify the value of the "i" and "P"....

So your problem is most likely that you do not have ODO and you have not initialised your entire table to a high-key value (like HIGH-VALUES).

The SEARCH ALL will start in the middle of the table. If you get a key hit, search ends, index set to the hit. If key in table is high, SEARCH ALL looks next at the middle of the low-half of the table; if key in table is low, SEARCH ALL looks next at the middle of the high-half of the table. Continues until finds the data, or no data can be found using the process.

If your table has undefined values or values from a previous load of data, you will likely not get a hit.

Say you have OCCURS 20 but load initially with only five items. The other 15 values are "undefined", and quite possibly binary zeros. The mid-point of the table may be a low-key, the SEARCH ALL will try the upper half of the table and stay up there until it determines you key does not exist. Even with non-binary zeros, the undefined data will only allow your data to be found by coincidence.

Either set you table entries to HIGH-VALUES before each load of the table (this is the lazy way, but it works), or use OCCURS DEPENDING ON in the table definition, from which SEARCH ALL will know the length of the table and will only search within that.
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Tue Mar 06, 2012 11:03 pm
Reply with quote

so, if the data in the cobol internal table is ascii and sorted,
he is running the module in a pc environment.
Back to top
View user's profile Send private message
lavish.garg

New User


Joined: 06 Mar 2012
Posts: 4
Location: india

PostPosted: Tue Mar 06, 2012 11:46 pm
Reply with quote

Ohhh..hold on guys. Thankss to all for so many replies.

Got my answer in ur conversation.. icon_smile.gif
actually what happened was i populated cobol >TABLE< variables from the database table. The query to retrieve the rows has order by clause which returns data in asc order but according to ascii codes and populates the cobol table with the data which is not in sorted order according to EBCDIC code.

As I used order by clause so thought the table should be populated in sorted order. My mistake, totally forgot about the ASCII and EBCDIC code difference.

So search all was not working.

>>To limit the table I am using depending on clause.

One more ques icon_razz.gif I have read some where that till 50 entries SEARCH works better then SEARCH ALL. By chance I can have max of 50 entries in table and most of the time these will be less then 20. So should i go for linear search or binary search?? As it is a CICS program so performance matters.
Back to top
View user's profile Send private message
lavish.garg

New User


Joined: 06 Mar 2012
Posts: 4
Location: india

PostPosted: Tue Mar 06, 2012 11:56 pm
Reply with quote

And one more thing.. As it is clear that the elements are not in sorted order. I will have to sort the table first in program itself. The no of times i will be hitting the search (in one program) can vary from 10 to 50 times in one transaction.

So Linear or Binary search???
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Wed Mar 07, 2012 12:04 am
Reply with quote

don't sort,
use search not search all. remember to set the index to 1 before the search

and the break point is about 250.
actually perform varying an index is faster than
search or search all.
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Mar 07, 2012 12:07 am
Reply with quote

Depends on your data, not on the method of searching.

If a very high proportion of the matches are on a few items, then SEARCH will likely be better, with the high-hitters at the start of the table. The more even the distribution, the more SEARCH ALL will win out. With the possibility of "misses", SEARCH ALL improves again.

The only way to find out in your case, is to try them both with your data.

Anyone who thinks that just the size of the table can give you the answer, doesn't know what they are talking about.

EDIT: Err... didn't see you there, dbz. That's something I'll have to check now. Haven't looked since the Enterprise took to the seas (metaphorically speaking). SEARCH ALL didn't used to be that slow, although you could well beat it with your own binary chop. Live 'n learn, live 'n learn.
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Wed Mar 07, 2012 12:37 am
Reply with quote

a link that you may want to follow/read especially since the TS is talking about 20 or so items.:
serial vs binary search

and the attachment at the bottom of his post is sorta interesting.

by the way, JSAYLES is instrumental in the RDZ development.
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Mar 07, 2012 1:11 am
Reply with quote

Well, now I'm too confused to know if I'm confused or un-confused.

5815 : 650 : 169 in milliseconds, for perform, SEARCH and SEARCH ALL. Looks to me like SEARCH ALL usesfewer miliseconds, which I'd have thought was good :-)
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Wed Mar 07, 2012 1:34 am
Reply with quote

all of the following is quoted from the link in my previous post:


5815 - for Perform Table Search 244th element
650 - Sequential Search 244th element
169 - Search All - 244th element

29 - for Perform Table Search 1st element
37 - Sequential Search 1st element

In Milliseconds. You can see that, even in a small table - to get to the 244 element,
the Search All was much more efficient than linear search -
and MUCH more efficient than PERFORM VARYING.

But if you hit the top of the table,
PERFORM VARYING actually out-performed (sorry :0 ) linear search
(I should have included a test for finding the first element with SEARCH ALL, but it would have been the worst of the three options).

end of quoted material/start of dbzbs
:

point is that the extra code executed with a SEARCH ALL starts to be meaningless after the item count reaches a certain volume.

but if the item to find is in the top part of the table 1st 10 or so (with 240 some items),
then the perform varying and linear search out perform the search all.

since we know the values for the table will be unsorted,
any savings of a search all would be used up with a sort.

so, don't sort them
use linear or perform varying.

and i am last, again
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Mar 07, 2012 4:33 am
Reply with quote

But I'd be repeating myself if I said that :-)

A common way to speed-up a table look-up is to not do it at all. If the current search value is the same as the previous value, don't do anything. Otherwise, look at the table.

With a small table, the SEARCH ALL won't be necessary. If the data is not in sequence, SEARCH ALL is out anyway. If the data is in sequence, there is the possibility of SEARCH with a termination when key match is no longer possible. Both SEARCH and a PERFORM VARYING will be faster at hitting the top n entries of a table, where n is less than the average number of attempts made by SEARCH ALL before finding or giving up on that one.

It really is down to the data. Collect some "frequency" figures that are likely to be representative, plus some figures for numbers of elements in the table. The bigger the table and the more even the values are distributed, the better a binary search will perform.

However, if the data is suitable for "optimisation", a binary search can be beaten for any size of table.

SEARCH ALL can be beaten by a "hand-coded" binary search, the thing is that the SEARCH ALL works out of the box, every time (assuming one knows how to use it) whereas as a "hand-coded" routine has to be proved.

It might no be so good to do dumb things to slow your CICS program down, but it is also not so good to "optimise" it so that it is no longer as easy to understand.

You know it's gonna be here, don't you? :-)
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> COBOL Programming

 


Similar Topics
Topic Forum Replies
No new posts Load new table with Old unload - DB2 DB2 6
No new posts SFTP Issue - destination file record ... All Other Mainframe Topics 2
No new posts Search two or more word with FILEAID Compuware & Other Tools 15
No new posts Sortjoin and Search for a String and ... DFSORT/ICETOOL 1
No new posts Pulling a fixed number of records fro... DB2 2
Search our Forums:

Back to Top