View previous topic :: View next topic
|
Author |
Message |
Casper656
New User
Joined: 10 Mar 2014 Posts: 8 Location: INDIA
|
|
|
|
Hi,
I am trying to create Hash Table in Pl/I. I want generate a hash value which will be served as index to an array. The input to the hash generating function is a unique alpha numeric string and the output hash I require is an integer.
ex. My input is FXXB320 and output should be a number , any number that uniquely represent FXXB320, collision is acceptable.
What are your suggestion on how to achieve this ? |
|
Back to top |
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
|
|
|
|
where are You facing problems ...
the algorithm or the coding ??? |
|
Back to top |
|
|
Akatsukami
Global Moderator
Joined: 03 Oct 2009 Posts: 1788 Location: Bloomington, IL
|
|
|
|
If the input is no more than eight characters, have you considered interpreting UNSPEC(input) as UNSIGNED FIXED BIN? |
|
Back to top |
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
|
|
|
|
and ...
possibly modulus against the table size...
but how are You going to handle the collisions
( sequential search for a free slot after the hashed one )
how many entries in the table ? |
|
Back to top |
|
|
steve-myers
Active Member
Joined: 30 Nov 2013 Posts: 917 Location: The Universe
|
|
|
|
Mr. Sorrichetti's advice is good. I do this in Assembler In two steps:
Code: |
PACK AWORD(5),KEY(9)
L 1,AWORD
SR 0,0
D 0,HTABSIZE
LR 1,0
SLL 1,2
LA 1,HASHTAB(1)
AWORD DC F'0',AL1(0)
KEY DC CL8'FXXB320',C' ' |
I don't know how to compact the key to 4 bytes in PL/I, though I'm sure some PL/I expert can chime in. I handle collisions by making each hash table entry the address of a chained data area; a collision just adds a new entry to the chain. The number of entries in the table (in HTABSIZE in my example) is important, but to some extent it depends on the number of elements the hash table is pointing to. I'm not saying the hash table should be large enough so that there is one entry for each possible key as Mr. Sorrichetti implies, but you don't want the alias chains to get very long. |
|
Back to top |
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
|
|
|
|
since hashing/randomizing is NOT a beginner's thing
there is no reason for the TS not to write a small simulation with a parametrized table size to find out the best table dimension
and the longest sibling chain for the input key distribution |
|
Back to top |
|
|
steve-myers
Active Member
Joined: 30 Nov 2013 Posts: 917 Location: The Universe
|
|
|
|
Quote: |
... hashing/randomizing is NOT a beginner's thing |
Hard to know. Done properly it is much better than a search of a complete chain, and it doesn't have the size constraints implied by a binary search. Even poorly done it is very helpful.
Quote: |
there is no reason for the TS not to write a small simulation with a parametrized [sic] table size to find out the best table dimension and the longest sibling chain for the input key distribution |
I've done studies like this after the fact, mainly to see how well the hash arithmetic worked, but even when I did it I was never very happy with the results. I've come to the conclusion my hash tables tend to be too big.
Over the past couple of years once the table has filled I tend to build a monolithic chain from the hash table and sort the chain by the key for reporting purposes. The way I do it clears the hash table, so I can't easily do post-mortem studies. |
|
Back to top |
|
|
Casper656
New User
Joined: 10 Mar 2014 Posts: 8 Location: INDIA
|
|
|
|
Akatsukami wrote: |
If the input is no more than eight characters, have you considered interpreting UNSPEC(input) as UNSIGNED FIXED BIN? |
How do you suggest to do this. I am sorry, I am not a PL/I programmer. I am new to it. It would be helpful if you provide me an example.
Let me explain, I have a flat file. I need to read the input from the file and build Hash Table. So how should I read the character string in to FIXED BIN ? |
|
Back to top |
|
|
Casper656
New User
Joined: 10 Mar 2014 Posts: 8 Location: INDIA
|
|
|
|
enrico-sorichetti wrote: |
where are You facing problems ...
the algorithm or the coding ??? |
Problem is with coding. I am not flexible with PL/I. |
|
Back to top |
|
|
Nic Clouston
Global Moderator
Joined: 10 May 2007 Posts: 2455 Location: Hampshire, UK
|
|
|
|
With the wonders of Google this may be of help. |
|
Back to top |
|
|
prino
Senior Member
Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
|
|
|
|
Nic Clouston wrote: |
With the wonders of Google this may be of help. |
I believe SuperC uses a similar approach, simplified it takes three characters at a time, adds a zero byte in front of them and treats them as a PL/I fixed bin (31) that is summed. Something like:
Code: |
dcl 1 * union,
2 f fixed bin (31) unal,
2 *,
3 * char (1) init (low(1)),
3 c char (3);
hash = 0;
do i = 1 to length(c) by 3;
c = substr(whatever, i, 3);
hash = hash + f;
end; |
|
|
Back to top |
|
|
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
|
|
|
|
There is kind of an example in :
Enterprise PL/I for z/OS
Programming Guide
SC27-1457-09 |
|
Back to top |
|
|
|