IBM Mainframe Forum Index
 

How do HISTOGRAM and FIND NUMBER access work?


Balaryan
Posted: Wed Nov 25, 2009 3:55 pm

Hi all,

I know that HISTOGRAM (a processing loop) returns the number of records matching the DE or SP, along with its key values.
FIND NUMBER (not a processing loop) returns only the number of records for the DE or SP.

But both access the inverted list to get that number.
So why is HISTOGRAM said to be faster at retrieval and better for performance than FIND NUMBER?

How do they differ in the way they access the data?

Please share some information, and correct me if I am wrong.

Thanks,
B@L@.

ofer71
Posted: Wed Nov 25, 2009 5:40 pm

IMHO, this is because HISTOGRAM accesses only the inverted list, while FIND accesses both the inverted list & the ADABAS Data component.

O.

Balaryan
Posted: Thu Nov 26, 2009 9:36 am
Hi Ofer,

I am asking about the FIND NUMBER statement specifically, not FIND.

Can you tell me why HISTOGRAM is faster at retrieval and better for performance than FIND NUMBER?

Thanks,
B@L@.

atulbagewadikar
Posted: Thu Nov 26, 2009 2:57 pm

Hi,

I cannot say that HISTOGRAM always performs better than FIND NUMBER.

For scenarios other than a simple existence check (counting, complex criteria, many millions of records, and so on), you would have to define more specifically what you are trying to achieve.

The general answer is "it depends": it depends on your environment, that is, your data, processing, application, and so on.

For a simple existence check, we generally use FIND NUMBER as a first choice:
- it is easier to code
- it allows complex criteria to be added (AND, OR, etc.)

We use a HISTOGRAM in an existence check to:
- check a range of values, such as might be found with a reverse date index (find the most recent record for this key, starting from datevalue)
- count a range of values
- check values involving PEs (that is, descriptors on a PE element, or superdescriptors with a PE element component)

FIND NUMBER does result in a write to WORK (an Adabas component), which can sometimes be an overhead.

Performance for both commands is nearly always "good enough".

Moreover, if you really want to see which commands are issued to ADABAS, write a simple program containing both statements and run it with the TEST DBLOG command.

Also try to DISPLAY the *CPU-TIME used in each case.

Steve Robinson
Posted: Tue Dec 01, 2009 5:12 pm

If you are using HISTOGRAM to ascertain if a value exists, you should use the new option TO, as in:

STARTING FROM #VALUE TO #VALUE

as opposed to:

STARTING FROM #VALUE ENDING AT #VALUE
or
STARTING FROM #VALUE THRU #VALUE

When ENDING AT or THRU are used, Natural does the check to see if the loop is finished. When using TO, Adabas does the check.

steve

Balaryan
Posted: Wed Dec 09, 2009 10:54 am

I got it.

Thanks a ton, Atul and Steve.

Ralph Zbrog
Posted: Thu Dec 10, 2009 12:38 pm

The relative performance of FIND NUMBER versus HISTOGRAM is dependent on the contents of the inverted list. First, how the commands work:

The FIND statement is compiled into a FIND command, or S1 to Adabas. The purpose of an S1 is to create an ISN list for the selection criteria supplied in the FIND statement. Adabas returns the ISN list, and a count of the ISNs, to the Natural module. Normally the records associated with the ISNs are retrieved with subsequent GET (L1) commands (FIND is an implicit loop), but Natural employs an S1 option that also returns the first record. Ofer71 referred to this in his post.

FIND NUMBER is a special version of the S1 that returns only the ISN count; no ISN list, no first data record. An ISN list is generated and then thrown away after the count is computed.

The HISTOGRAM statement is compiled into a HISTOGRAM command, or L9 to Adabas. The L9 returns the descriptor value and a count of the ISNs but, unlike an S1, it does not create an ISN list. The count is computed by summing the count fields in Associator blocks.
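
The difference between the two counting methods can be pictured with a toy model. The Python sketch below is illustrative only and is not Adabas code: the `IndexBlock` class, its fields, and both function names are hypothetical stand-ins for the behaviour described above, assuming each index block stores its ISNs together with a count field.

```python
from dataclasses import dataclass


@dataclass
class IndexBlock:
    """Hypothetical stand-in for one index block of a descriptor value."""
    isn_count: int  # count field stored in the block
    isns: list      # the ISNs held in the block


def find_number_count(blocks):
    """S1-style count: build the full ISN list, count it, then discard it."""
    isn_list = []
    for block in blocks:
        isn_list.extend(block.isns)  # cost grows with the number of ISNs
    return len(isn_list)


def histogram_count(blocks):
    """L9-style count: sum the per-block count fields; no ISN list is built."""
    return sum(block.isn_count for block in blocks)


blocks = [IndexBlock(3, [1, 5, 9]), IndexBlock(2, [12, 40])]
print(find_number_count(blocks), histogram_count(blocks))
```

Both functions return the same count. The model only shows that the S1-style path touches every ISN while the L9-style path touches one field per block.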

As atulbagewadikar stated, sometimes application logic forces you to use a FIND NUMBER, but in our examples we'll presume that we have a choice.

Scenario 1 - unique descriptor values e.g. Tax ID
FIND NUMBER is recommended.

In this scenario, you will find either 0 or 1 ISN for each key value.

Code:
FIND NUMBER file WITH TAX-ID = value

will cost you 1 command, with negligible overhead.
Quote:
S1


Code:
HISTOGRAM file FOR TAX-ID FROM value THRU value

will cost you 2 or 3 commands
Quote:
L9 returns value, if it exists
L9 returns next value or end-of-file indicator, for Natural to test THRU
RC command to close loop


Code:
HISTOGRAM file FOR TAX-ID FROM value TO value

will cost you 1 or 2 commands
Quote:
L9 returns value, if it exists
L9 returns end-of-file indicator


Scenario 2 - large count values e.g. Surname
HISTOGRAM/TO is recommended.

In this scenario, the SURNAME descriptor has 0 ISNs for "Smart" and 10 million ISNs for "Smith".

Code:
FIND NUMBER file WITH SURNAME = 'Smart'

will cost you 1 command, with negligible overhead.
Quote:
S1 with count of 0


Code:
HISTOGRAM file FOR SURNAME FROM 'Smart' TO 'Smart'

will cost you 1 command
Quote:
L9 returns end-of-file indicator


Code:
HISTOGRAM file FOR SURNAME FROM 'Smart' THRU 'Smart'

will cost you 2 commands
Quote:
L9 returns value 'Smith', for Natural to test THRU, after thousands of I/Os to compute count of 10M
RC to close loop


Code:
FIND NUMBER file WITH SURNAME = 'Smith'

will cost you 1 command, with enormous overhead to generate a list of 10M ISNs
Quote:
S1


Code:
HISTOGRAM file FOR SURNAME FROM 'Smith' TO 'Smith'

will cost you 2 commands
Quote:
L9 returns value 'Smith', after thousands of I/Os to compute count of 10M
L9 returns end-of-file indicator


Code:
HISTOGRAM file FOR SURNAME FROM 'Smith' THRU 'Smith'

will cost you 3 commands
Quote:
L9 returns value 'Smith', after many I/Os to compute count of 10M
L9 returns next value or end-of-file indicator, for Natural to test THRU
RC to close loop


Bottom line: If the descriptor values are unique, use FIND NUMBER, otherwise use HISTOGRAM/TO.
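
The command counts in the scenarios above can be reproduced with a small simulation. This Python sketch is a model of the sequences as described in this post, not of Adabas itself: the function names and the sample surname list are hypothetical, and it counts commands only, ignoring the I/O cost of computing large counts.

```python
def find_number_cmds(values, value):
    """FIND NUMBER: always a single S1, whether or not the value exists."""
    return ["S1"]


def histogram_thru_cmds(values, start, end):
    """HISTOGRAM ... THRU: Natural tests the end value after each L9."""
    commands = []
    for v in (x for x in sorted(values) if x >= start):
        commands.append("L9")      # Adabas returns the next value >= start
        if v > end:                # Natural sees the value is past THRU
            commands.append("RC")  # and closes the loop
            return commands
    commands.append("L9")          # final L9 returns end-of-file
    return commands


def histogram_to_cmds(values, start, end):
    """HISTOGRAM ... TO: Adabas tests the end value itself."""
    in_range = [x for x in sorted(values) if start <= x <= end]
    return ["L9"] * len(in_range) + ["L9"]  # last L9 returns end-of-file


surnames = ["Smith", "Smyth"]  # hypothetical descriptor values; no "Smart"
for probe in ("Smart", "Smith"):
    print(probe,
          len(find_number_cmds(surnames, probe)),
          len(histogram_thru_cmds(surnames, probe, probe)),
          len(histogram_to_cmds(surnames, probe, probe)))
```

With this sample list, the counts match Scenario 2: 1, 2 and 1 commands for 'Smart', and 1, 3 and 2 commands for 'Smith'.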
All times are GMT + 6 Hours