I know HISTOGRAM (a processing loop) returns the number of records matching the DE or SP, along with its key values.
FIND NUMBER (not a processing loop) is also used to return the number of records for the DE or SP.
But both will access the inverted list to get the number.
Then how can we say that HISTOGRAM is faster at retrieval and better performance-wise than FIND NUMBER?
How do they differ in accessing the inverted list?
Kindly give some info, and correct me if I am wrong.
I cannot say that HISTOGRAM is always better than FIND NUMBER performance-wise.
For scenarios other than the simple existence check (counting, complex criteria, many millions of records, etc.) you would have to define more specifically what you are trying to achieve.
The general answer would be "it depends": it depends on your environment (data, processing, application, etc.).
For a simple existence check, we generally use the FIND NUMBER as a first choice:
- it is easier to code
- it allows adding complex criteria (AND, OR, etc.)
We will use a HISTOGRAM in an existence check to:
- check a range of values such as might be found with a reverse date index (find most recent record for this key starting from datevalue)
- count a range of values
- check values involving PEs (that is, descriptors on a PE element or superdescriptors with a PE element component)
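As a rough sketch of the "count a range of values" case: the view EMP, the DDM EMPLOYEES, the descriptor NAME and the range limits below are only assumptions for illustration (taken from the Software AG sample file), so substitute your own file and descriptor.

```natural
DEFINE DATA LOCAL
1 EMP VIEW OF EMPLOYEES        /* assumed sample DDM
  2 NAME                       /* assumed descriptor, A20
1 #TOTAL (P10)
END-DEFINE
*
RESET #TOTAL
* One pass through the loop per distinct descriptor value in the range;
* *NUMBER is the ISN count Adabas returns for that value.
HISTOGRAM EMP FOR NAME FROM 'SA' TO 'SM'
  ADD *NUMBER TO #TOTAL
END-HISTOGRAM
WRITE 'Records in range:' #TOTAL
END
```

No records are read here; only the inverted list is touched.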
The FIND NUMBER does result in a write to WORK (an Adabas component), which may sometimes be an overhead.
Performance for both commands is nearly always "good enough".
Moreover, if you really want to see which commands are issued to ADABAS, write a simple program with both these statements and issue the TEST DBLOG command.
Also try to DISPLAY the *CPU-TIME used in both cases.
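A minimal test program along those lines might look like the sketch below. The view EMP, the DDM EMPLOYEES, the descriptor NAME and the value 'SMITH' are assumptions for illustration only. Switch on the database log before running it, so the S1 and L9 commands show up there.

```natural
DEFINE DATA LOCAL
1 EMP VIEW OF EMPLOYEES        /* assumed sample DDM
  2 NAME                       /* assumed descriptor
1 #CPU-START (I4)
1 #CPU-USED  (I4)
1 #COUNT     (P10)
END-DEFINE
*
* Variant 1: FIND NUMBER
#CPU-START := *CPU-TIME
FIND NUMBER EMP WITH NAME = 'SMITH'
#COUNT := *NUMBER
#CPU-USED := *CPU-TIME - #CPU-START
WRITE 'FIND NUMBER count:' #COUNT 'CPU used:' #CPU-USED
*
* Variant 2: HISTOGRAM with TO
RESET #COUNT                   /* loop body is skipped if value is absent
#CPU-START := *CPU-TIME
HISTOGRAM EMP FOR NAME FROM 'SMITH' TO 'SMITH'
  #COUNT := *NUMBER
END-HISTOGRAM
#CPU-USED := *CPU-TIME - #CPU-START
WRITE 'HISTOGRAM count  :' #COUNT 'CPU used:' #CPU-USED
END
```

*CPU-TIME covers only the Natural side, so treat the figures as a rough indication and rely on the command log for what Adabas actually did.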
The relative performance of FIND NUMBER versus HISTOGRAM is dependent on the contents of the inverted list. First, how the commands work:
The FIND statement is compiled into a FIND command, or S1 to Adabas. The purpose of an S1 is to create an ISN list for the selection criteria supplied in the FIND statement. Adabas returns the ISN list, and a count of the ISNs, to the Natural module. Normally the records associated with the ISNs are retrieved with subsequent GET (L1) commands (FIND is an implicit loop), but Natural employs an S1 option that also returns the first record. Ofer71 referred to this in his post.
FIND NUMBER is a special version of the S1 that returns only the ISN count; no ISN list, no first data record. An ISN list is generated and then thrown away after the count is computed.
The HISTOGRAM statement is compiled into a HISTOGRAM command, or L9 to Adabas. The L9 returns the descriptor value and a count of the ISNs but, unlike an S1, it creates no ISN list. The count is computed by summing the count fields in Associator blocks.
As atulbagewadikar stated, sometimes application logic forces you to use a FIND NUMBER, but in our examples we'll presume that we have a choice.
Scenario 1 - unique descriptor values e.g. Tax ID
FIND NUMBER is recommended.
In this scenario, you will find either 0 or 1 ISN for each key value.
Code:
FIND NUMBER file WITH TAX-ID = value
will cost you 1 command, with negligible overhead.
Quote:
S1
Code:
HISTOGRAM file FOR TAX-ID FROM value THRU value
will cost you 2 or 3 commands
Quote:
L9 returns value, if it exists
L9 returns next value or end-of-file indicator, for Natural to test THRU
RC command to close loop
Code:
HISTOGRAM file FOR TAX-ID FROM value TO value
will cost you 1 or 2 commands
Quote:
L9 returns value, if it exists
L9 returns end-of-file indicator
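For Scenario 1, a simple existence check could be coded roughly as below. This is only a sketch: the view EMP, the DDM EMPLOYEES, the descriptor PERSONNEL-ID (standing in for TAX-ID) and the key value are assumptions, not anything from the original poster's application.

```natural
DEFINE DATA LOCAL
1 EMP VIEW OF EMPLOYEES        /* assumed sample DDM
  2 PERSONNEL-ID               /* assumed unique descriptor, A8
1 #KEY (A8) INIT <'11100105'>  /* hypothetical key value
END-DEFINE
*
* A single S1 command; *NUMBER is 0 or 1 for a unique descriptor
FIND NUMBER EMP WITH PERSONNEL-ID = #KEY
IF *NUMBER = 0
  WRITE 'Not on file:' #KEY
ELSE
  WRITE 'On file:' #KEY
END-IF
END
```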
Scenario 2 - large count values e.g. Surname
HISTOGRAM/TO is recommended.
In this scenario, the SURNAME descriptor has 0 ISNs for "Smart" and 10 million ISNs for "Smith".
Code:
FIND NUMBER file WITH SURNAME = 'Smart'
will cost you 1 command, with negligible overhead.
Quote:
S1 with count of 0
Code:
HISTOGRAM file FOR SURNAME FROM 'Smart' TO 'Smart'
will cost you 1 command
Quote:
L9 returns end-of-file indicator
Code:
HISTOGRAM file FOR SURNAME FROM 'Smart' THRU 'Smart'
will cost you 2 commands
Quote:
L9 returns value 'Smith', for Natural to test THRU, after thousands of I/Os to compute count of 10M
RC to close loop
Code:
FIND NUMBER file WITH SURNAME = 'Smith'
will cost you 1 command, with enormous overhead to generate a list of 10M ISNs
Quote:
S1
Code:
HISTOGRAM file FOR SURNAME FROM 'Smith' TO 'Smith'
will cost you 2 commands
Quote:
L9 returns value 'Smith', after thousands of I/Os to compute count of 10M
L9 returns end-of-file indicator
Code:
HISTOGRAM file FOR SURNAME FROM 'Smith' THRU 'Smith'
will cost you 3 commands
Quote:
L9 returns value 'Smith', after many I/Os to compute count of 10M
L9 returns next value or end-of-file indicator, for Natural to test THRU
RC to close loop
Bottom line: If the descriptor values are unique, use FIND NUMBER, otherwise use HISTOGRAM/TO.
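For the non-unique case, the HISTOGRAM/TO form might be wrapped up as in this sketch. The view EMP, the DDM EMPLOYEES, the descriptor NAME (standing in for SURNAME) and the field #SURNAME are assumptions for illustration.

```natural
DEFINE DATA LOCAL
1 EMP VIEW OF EMPLOYEES        /* assumed sample DDM
  2 NAME                       /* assumed non-unique descriptor, A20
1 #SURNAME (A20) INIT <'SMITH'>
1 #COUNT   (P10)
END-DEFINE
*
RESET #COUNT                   /* stays 0 if the value does not exist
* TO lets Adabas apply the end value, so Natural does not need an
* extra L9 just to test the upper limit, as it does with THRU
HISTOGRAM EMP FOR NAME FROM #SURNAME TO #SURNAME
  #COUNT := *NUMBER
END-HISTOGRAM
WRITE 'ISN count for' #SURNAME ':' #COUNT
END
```

Because the start and end values are the same, the loop body runs at most once.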