chandracdac
New User
Joined: 15 Jun 2007 Posts: 92 Location: bangalore
I have a small confusion. I have 50 records in my file, and it is an indexed file. We know random access is faster than sequential access, but here is my doubt: if I want to read the 50th record, sequential access takes 50 I/Os (input/output operations), and random access also seems to take 50 I/Os (we have to read 50 keys). So my question is, how can we say random access is faster than sequential access when both take the same number of I/Os?
If anyone knows the answer, please clarify. Thanks in advance.
Bitneuker
CICS Moderator
Joined: 07 Nov 2005 Posts: 1104 Location: The Netherlands at Hole 19
Ages ago I was taught to use sequential access when the number of records accessed exceeds 12% of the number of records in the file. BTW, reading 50 records will, depending on the record length and blocking, lead to one single I/O if they all fit in the buffers.
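Just to put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. The 4K control interval and 80-byte record length are purely assumed values for illustration, not figures from this thread.
Code:
# Back-of-the-envelope check: if all 50 records fit in one control interval
# (block), reading them costs a single physical I/O.
# CI size and record length are assumed values, ignoring CIDF/RDF overhead.
CI_SIZE = 4096        # assumed control interval size in bytes
RECORD_LENGTH = 80    # assumed fixed record length in bytes
RECORDS_WANTED = 50

records_per_ci = CI_SIZE // RECORD_LENGTH           # records packed per CI
ios_needed = -(-RECORDS_WANTED // records_per_ci)   # ceiling division

print(f"{records_per_ci} records per CI -> {ios_needed} physical I/O(s) "
      f"for {RECORDS_WANTED} records")
# With these assumptions: 51 records per CI -> 1 physical I/O for 50 records.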
abin
Active User
Joined: 14 Aug 2006 Posts: 198
Hi Bitneuker,
Could you please explain why the 12% limit for sequential reads?
Thanks,
Abin
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
abin wrote:
Could you please explain why the 12% limit for sequential reads?
It's called a ROT.... a Rule Of Thumb.....
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
chandracdac wrote:
in the same way, in case of random access, it also takes 50 I/Os (we have to read 50 keys)
The amount of invalid/false information in this thread is amazing!
Where in the world did you get the false impression that random access requires all the keys to be read until a hit? A little reading on your part about leaf and b-tree indexing would do you some good. Google it if you don't want to wade through the IBM manuals.
Bitneuker
CICS Moderator
Joined: 07 Nov 2005 Posts: 1104 Location: The Netherlands at Hole 19
Quote:
The amount of invalid/false information in this thread is amazing!
Well Dick.........explain why my 12% rule sucks.
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 631 Location: Wisconsin
Say you are reading 13% of the records, and there are 200 million records in the file. And say you want the first 12% and the last 1%. If you read that file sequentially, you are doing well over 100 million unneeded reads.
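To put rough numbers on that, here is a quick sketch using only the figures from the post above:
Code:
# Rough arithmetic behind the 200-million-record example: to reach the last
# 1% sequentially you must pass the whole file, but you only need 13% of it.
TOTAL_RECORDS = 200_000_000          # records in the file (from the post)
needed = TOTAL_RECORDS * 13 // 100   # first 12% plus last 1%
sequential_reads = TOTAL_RECORDS     # a sequential pass touches every record
wasted = sequential_reads - needed

print(f"needed {needed:,}, read {sequential_reads:,}, wasted {wasted:,}")
# -> needed 26,000,000, read 200,000,000, wasted 174,000,000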
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
Bitneuker,
Sorry that I was not more accurate with my finger pointing. Actually, what you said was about the only thing I agreed with (and you too, CICS Guy!).
Steve,
Though what you say is true, in a practical sense you should not be processing a 200 million record file with the intent of only processing 10,000 of them.
Most really large files get activity from the on-line side (a new account, a new invoice, etc.) on a one-here-one-there basis. The change to the master is made, the activity is recorded, and a 'report or disposition' record (which contains everything that is necessary) should be written and sent to the back-end (batch) process. The only time you pass the 'master file' is to pass it completely, and there sequential will outrun random any day, because, as Bitneuker indicated, you benefit from the buffering and the look-ahead reads that are performed when sequentially reading a properly tuned file.
Any batch process (or an online full-file update) is designed to pass a file. If you process a trigger file and use random reads to access data from the master, you should redesign your process: extract all the data necessary at the time of the change to the master, to avoid re-acquiring data you had in the first place, because random reads on a master are slow.
Every record that is not already in memory (the random read only buffers what is necessary, so the I/O is as small as possible) has to be acquired from disk.
You should not process a large master with random reads. You make changes to individual accounts on a task basis (another update or change is a new task), so you should be affecting a very small number of records when your process is random. When the number of random reads climbs above 1 or 2% of the file during a process, you are starting to get into pass-the-master type processing. At 10% or more, you either redesign the process or start reading sequentially.
Extract what you need during a 'random change to the master' so you don't have to read the data again.
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
I realize that, back in the last century, the extract (or trigger) record generated by an on-line update/change was small and required re-reading the master during batch processing. But back then code was cheap; memory and disk space were not.
5 and 6K extract records are the way to go now. They can be sorted easily, and in fact the batch process can complete without having access to the master at all.
If your shop is doing it the old way, your shop is doing it the old way!
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 631 Location: Wisconsin
Agreed. It was just a counter-example. I wouldn't process a VSAM file like that anyway. I have some files (AFP with TLEs in them) that are huge like that and have had to do some one-off reporting from them. I sort out the random lines with an INCLUDE statement and then post-process that.
ashokm
New User
Joined: 28 Feb 2006 Posts: 11 Location: Chennai,India
Hi chandracdac,
It is a good question. First, understand the internal storage of keys: they are stored in either ascending or descending order.
For example, your file looks like this:
Emp No   Emp Name   Emp Date
00010    XXXXXX     NNNN
00005    AAAAA      NNNN
00002    BBBBB      NNNN
...
Emp No is the key.
The internal storage is:
00002, which points to BBBBB NNNN
00005, which points to AAAAA NNNN
00010, which points to XXXXXX NNNN
...
There are many levels of key storage.
Suppose your file holds 1,000,000 records.
At the first level it is divided into groups of 100,000.
At the second level each group is divided into groups of 10,000.
At the third level into groups of 1,000.
At the fourth level into groups of 100.
At the fifth level into groups of 10.
At the sixth level into single records.
If you search for a record in that file, it first checks the first level (at most 10 checks); the key must fall within one of those first-level groups. Then it checks the second level (at most 10 checks), and so on. So within about 60 checks we can retrieve any record from a file of 1,000,000 records.
I just gave one example with my own assumptions. There are lots of algorithms for searching keys; I don't know exactly which algorithm they use, but this is the basic logic.
Thanks & Regards,
Ashok M
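Ashok's arithmetic can be sketched in a few lines of Python. This is only an illustration of the multi-level idea with an assumed fan-out of 10 per level; it is not VSAM's actual index algorithm.
Code:
# Illustration of the multi-level index arithmetic above (not VSAM's actual
# index format): with a fan-out of 10 per level, 1,000,000 records need
# 6 index levels, and a keyed lookup scans at most 10 entries per level,
# so roughly 60 checks versus up to 1,000,000 sequential reads.
RECORDS = 1_000_000   # file size from the example
FANOUT = 10           # assumed entries scanned per index level

levels = 0
span = RECORDS
while span > 1:                       # each level divides the span by the fan-out
    span = -(-span // FANOUT)         # ceiling division
    levels += 1

worst_keyed_checks = levels * FANOUT  # scan one level's entries at a time
worst_sequential_reads = RECORDS      # read record by record until a hit

print(f"{levels} index levels, at most {worst_keyed_checks} key checks "
      f"vs. up to {worst_sequential_reads:,} sequential reads")
# -> 6 index levels, at most 60 key checks vs. up to 1,000,000 sequential reads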
expat
Global Moderator
Joined: 14 Mar 2007 Posts: 8796 Location: Welsh Wales
A couple of things that I have learnt about random access.
Always sort the trigger file by the KSDS key prior to executing your program, as this may help reduce the number of index I/Os required to read your file.
The default number of index buffers for a KSDS is 2. Take a look at the output of an IDCAMS LISTCAT, find the number of index levels that your KSDS has, and for a batch job allocate three or four times that number for BUFNI. This keeps more index records in the buffers and may again help reduce the number of index I/Os required to access your data.
chandracdac
New User
Joined: 15 Jun 2007 Posts: 92 Location: bangalore
This question was asked of my friend in an interview. I think there is nothing wrong with the question, dbz.
Bitneuker
CICS Moderator
Joined: 07 Nov 2005 Posts: 1104 Location: The Netherlands at Hole 19
Quote:
so my question is, how can we say random access is faster than sequential access when both take the same number of I/Os?
I think we should keep it simple and stick to the original question. The number of read instructions is the same; for 50 records the number of I/Os doesn't differ.
vasanthkumarhb
Active User
Joined: 06 Sep 2007 Posts: 275 Location: Bang,iflex
To explain it in the simplest way:
1. To read the 50th record from a sequential file, it has to read from the first record onward, so the number of passes made is 50.
2. Whereas in the case of a KSDS file we have the concept of an INDEX. When we try to read the 50th record from a KSDS file, it locates the record through the INDEX SET and SEQUENCE SET (in the INDEX COMPONENT, which acts like a pointer in VSAM) and then fetches the actual record from the DATA COMPONENT of the KSDS. So normally the number of passes is less than with a sequential file, and that is why random access is faster than sequential access.
Let me know your feedback.
Thanks & Regards,
raghav
gupta vishal
New User
Joined: 25 Sep 2007 Posts: 15 Location: Gurgaon
To be more mathematical:
The concept of hashing is applied in index searching. Hashing has a time complexity of O(1), i.e. constant, because to reach a hash key only a calculation has to be made based on the index, and thus the key is obtained.