How do HISTOGRAM and FIND NUMBER access work?

Balaryan · New User Joined: 20 Nov 2009 Posts: 27 Location: chennai

Hi All,

I know that HISTOGRAM (a processing loop) returns the number of records matching a DE or SP value, along with the key values themselves.
FIND NUMBER (not a processing loop) returns only the number of records for the DE or SP.

But both access the inverted list to get the number.
So how can we say that HISTOGRAM is faster at retrieving and performs better than FIND NUMBER?

How does the access differ?

Kindly give some info and correct me if I am wrong.

Thanks,
B@L@.

ofer71 · Posted: Wed Nov 25, 2009 5:40 pm

IMHO, this is because HISTOGRAM accesses only the inverted list, while FIND accesses both the inverted list & the ADABAS Data component.

O.

Balaryan · New User Joined: 20 Nov 2009 Posts: 27 Location: chennai

Hi Ofer,

Here, I am talking about the FIND NUMBER statement...

So can you tell me how HISTOGRAM is faster at retrieving and performs better than FIND NUMBER?

Thanks,
B@L@.

atulbagewadikar · New User Joined: 15 Jun 2006 Posts: 26

Hi,

I cannot say that HISTOGRAM is always better than FIND NUMBER performance-wise.

For scenarios other than the simple existence check (counting, complex criteria, multiple-millions of records, etc, etc) you would have to define what you are trying to achieve more specifically.

The general answer would be "it depends": it depends on your environment (data, processing, application, etc.).

For a simple existence check, we generally use the FIND NUMBER as a first choice:
- it is easier to code
- it allows complex criteria to be added (AND, OR, etc.)
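
As a rough illustration of that kind of existence check, a FIND NUMBER with a compound criterion might look like the sketch below. The EMPLOYEES file, its NAME and CITY descriptors, and the search values are assumptions for the example, not something from this thread; substitute your own view and keys.

```natural
* Sketch only: EMPLOYEES, NAME, CITY and the values are assumed names.
DEFINE DATA LOCAL
1 EMP-VIEW VIEW OF EMPLOYEES
  2 NAME
  2 CITY
END-DEFINE
*
* No processing loop is opened; only the count is returned.
FN. FIND NUMBER EMP-VIEW WITH NAME = 'SMITH' AND CITY = 'BOSTON'
IF *NUMBER (FN.) > 0
  WRITE 'At least one matching record exists'
ELSE
  WRITE 'No matching record'
END-IF
END
```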

We will use a HISTOGRAM in an existence check to
- check a range of values such as might be found with a reverse date index (find most recent record for this key starting from datevalue)
- count a range of values
- check values involving PE's (that is descriptors on a PE element or superdescriptors with a PE element component)
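
Counting a range of values could be sketched roughly as follows. The EMPLOYEES file, the NAME descriptor and the range limits are assumed purely for illustration. The view contains only the descriptor, and the loop adds up the per-value counts that HISTOGRAM returns.

```natural
* Sketch only: EMPLOYEES, NAME and the range limits are assumed names.
DEFINE DATA LOCAL
1 NAME-VIEW VIEW OF EMPLOYEES
  2 NAME
1 #FROM  (A20) INIT <'JONES'>
1 #TO    (A20) INIT <'SMITH'>
1 #TOTAL (I4)
END-DEFINE
*
* One loop pass per distinct descriptor value in the range.
* On releases without the TO option, THRU or ENDING AT would be used.
H1. HISTOGRAM NAME-VIEW FOR NAME STARTING FROM #FROM TO #TO
  ADD *NUMBER (H1.) TO #TOTAL
END-HISTOGRAM
WRITE 'Records in range:' #TOTAL
END
```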

FIND NUMBER does result in a write to WORK (an Adabas component), which can sometimes be an overhead.

Performance for both commands is nearly always "good enough".

Moreover, if you really want to see what commands are issued to ADABAS, write a simple program with both of these statements and issue the TESTDBLOG command.

Also try to DISPLAY the *CPU-TIME used in both cases.
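
A test program along those lines might look roughly like this. It is only a sketch: the EMPLOYEES file, the NAME descriptor, the search value and the repeat count are assumptions, and each statement is repeated many times because a single call is likely too short to register in *CPU-TIME.

```natural
* Sketch only: EMPLOYEES, NAME, #NAME and #LOOPS are assumed names.
DEFINE DATA LOCAL
1 NAME-VIEW VIEW OF EMPLOYEES
  2 NAME
1 #NAME  (A20) INIT <'SMITH'>
1 #LOOPS (I4)  INIT <10000>
1 #I     (I4)
1 #START (I4)
1 #USED  (I4)
END-DEFINE
*
* Time repeated FIND NUMBER calls.
#START := *CPU-TIME
FOR #I = 1 TO #LOOPS
  FIND NUMBER NAME-VIEW WITH NAME = #NAME
END-FOR
#USED := *CPU-TIME - #START
WRITE 'FIND NUMBER CPU:' #USED
*
* Time repeated HISTOGRAM calls on the same single value.
#START := *CPU-TIME
FOR #I = 1 TO #LOOPS
  HISTOGRAM NAME-VIEW FOR NAME STARTING FROM #NAME TO #NAME
    IGNORE
  END-HISTOGRAM
END-FOR
#USED := *CPU-TIME - #START
WRITE 'HISTOGRAM CPU:  ' #USED
END
```

The figures are reported in whatever units *CPU-TIME uses on your platform, so treat them as a relative comparison only.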

Steve Robinson · New User Joined: 14 Nov 2009 Posts: 12 Location: U.S.

If you are using HISTOGRAM to ascertain if a value exists, you should use the new option TO, as in:

STARTING FROM #VALUE TO #VALUE

as opposed to:

STARTING FROM #VALUE ENDING AT #VALUE
or
STARTING FROM #VALUE THRU #VALUE

When ENDING AT or THRU is used, Natural checks whether the loop is finished. When TO is used, Adabas does the check.
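
Put into a complete existence check, the TO form might look like the sketch below. The EMPLOYEES file, the PERSONNEL-ID descriptor, the sample value and the #EXISTS flag are assumptions for illustration only.

```natural
* Sketch only: EMPLOYEES, PERSONNEL-ID and the value are assumed names.
DEFINE DATA LOCAL
1 ID-VIEW VIEW OF EMPLOYEES
  2 PERSONNEL-ID
1 #VALUE  (A8) INIT <'11100102'>
1 #EXISTS (L)
END-DEFINE
*
RESET #EXISTS
* The loop body runs only if the value is present in the inverted list.
HISTOGRAM (1) ID-VIEW FOR PERSONNEL-ID STARTING FROM #VALUE TO #VALUE
  #EXISTS := TRUE
END-HISTOGRAM
*
IF #EXISTS
  WRITE #VALUE 'exists'
ELSE
  WRITE #VALUE 'not found'
END-IF
END
```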

steve

Balaryan · New User Joined: 20 Nov 2009 Posts: 27 Location: chennai

I got it..

Thanks a ton Atul and Steve..

Ralph Zbrog · Posted: Thu Dec 10, 2009 12:38 pm

The relative performance of FIND NUMBER versus HISTOGRAM is dependent on the contents of the inverted list. First, how the commands work:

The FIND statement is compiled into a FIND command, or S1 to Adabas. The purpose of an S1 is to create an ISN list for the selection criteria supplied in the FIND statement. Adabas returns the ISN list, and a count of the ISNs, to the Natural module. Normally the records associated with the ISNs are retrieved with subsequent GET (L1) commands (FIND is an implicit loop), but Natural employs an S1 option that also returns the first record. Ofer71 referred to this in his post.

FIND NUMBER is a special version of the S1 that returns only the ISN count; no ISN list, no first data record. An ISN list is generated and then thrown away after the count is computed.

The HISTOGRAM statement is compiled into a HISTOGRAM command, or L9 to Adabas. The L9 returns the descriptor value and a count of the ISNs but, unlike an S1, no ISN list is created. The count is computed by summing the count fields in Associator blocks.
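
To make the difference visible from the Natural side, the three statements can be run against the same key, as in this sketch. The EMPLOYEES file, the NAME descriptor and the value 'SMITH' are assumed for the example. It shows what each statement hands back to the program, not how Adabas processes it internally.

```natural
* Sketch only: EMPLOYEES, NAME and 'SMITH' are assumed names.
DEFINE DATA LOCAL
1 NAME-VIEW VIEW OF EMPLOYEES
  2 NAME
END-DEFINE
*
* FIND: a loop over the ISN list, one pass per record.
F1. FIND NAME-VIEW WITH NAME = 'SMITH'
  WRITE 'FIND: ISN' *ISN (F1.) 'of' *NUMBER (F1.) 'records'
END-FIND
*
* FIND NUMBER: no loop, only the count is available.
F2. FIND NUMBER NAME-VIEW WITH NAME = 'SMITH'
WRITE 'FIND NUMBER: count' *NUMBER (F2.)
*
* HISTOGRAM: one pass per descriptor value, with its count.
H1. HISTOGRAM NAME-VIEW FOR NAME STARTING FROM 'SMITH' TO 'SMITH'
  WRITE 'HISTOGRAM: value' NAME 'count' *NUMBER (H1.)
END-HISTOGRAM
END
```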

As atulbagewadikar stated, sometimes application logic forces you to use a FIND NUMBER, but in our examples we'll presume that we have a choice.

Scenario 1 - unique descriptor values e.g. Tax ID
FIND NUMBER is recommended.

In this scenario, you will find either 0 or 1 ISN for each key value.