chandracdac
New User
Joined: 15 Jun 2007 Posts: 92 Location: bangalore
I have a small confusion. I have 50 records in my file, and it is an indexed file. We know random access is faster than sequential access, but here is my doubt: if I want to read the 50th record, sequential access takes 50 I/Os (input/output operations), and random access also seems to take 50 I/Os (we have to read 50 keys). So my question is, how can we say random access is faster than sequential access when both take the same number of I/Os?
If anyone knows the answer, please clarify. Thanks in advance.
Bitneuker
CICS Moderator
Joined: 07 Nov 2005 Posts: 1104 Location: The Netherlands at Hole 19
Ages ago I was taught to use sequential access when the number of records accessed exceeds 12% of the number of records in the file. BTW, reading 50 records will, depending on the record length and blocking, lead to one single I/O if they all fit in the buffers.
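Just to put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. The 4K control interval and 80-byte record length are purely assumed values for illustration, not figures from this thread.
Code:
# Back-of-the-envelope check: if all 50 records fit in one control interval
# (block), reading them costs a single physical I/O.
# CI size and record length are assumed values, ignoring CIDF/RDF overhead.
CI_SIZE = 4096        # assumed control interval size in bytes
RECORD_LENGTH = 80    # assumed fixed record length in bytes
RECORDS_WANTED = 50

records_per_ci = CI_SIZE // RECORD_LENGTH           # records packed per CI
ios_needed = -(-RECORDS_WANTED // records_per_ci)   # ceiling division

print(f"{records_per_ci} records per CI -> {ios_needed} physical I/O(s) "
      f"for {RECORDS_WANTED} records")
# With these assumptions: 51 records per CI -> 1 physical I/O for 50 records.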
abin
Active User
Joined: 14 Aug 2006 Posts: 198
Hi Bitneuker,
Could you please explain why the 12% limit for sequential reads?
Thanks,
Abin
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
abin wrote:
Could you please explain why the 12% limit for sequential reads?
It's called a ROT.... a Rule Of Thumb.....
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
chandracdac wrote:
in the same way, in case of random access, it also takes 50 I/Os (we have to read 50 keys)
The amount of invalid/false information in this thread is amazing!
Where in the world did you get the false impression that random access requires all the keys to be read until a hit? A little reading on your part about leaf and b-tree indexing would do you some good. Google it if you don't want to wade through the IBM manuals.
Bitneuker
CICS Moderator
Joined: 07 Nov 2005 Posts: 1104 Location: The Netherlands at Hole 19
Quote:
The amount of invalid/false information in this thread is amazing!
Well Dick.........explain why my 12% rule sucks.
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 631 Location: Wisconsin
Say you are reading 13% of the records, and there are 200 million records in the file. And say you want the first 12% and the last 1%. If you read that file sequentially, you are doing well over 100 million unneeded reads.
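To put rough numbers on that, here is a quick sketch using only the figures from the post above:
Code:
# Rough arithmetic behind the 200-million-record example: to reach the last
# 1% sequentially you must pass the whole file, but you only need 13% of it.
TOTAL_RECORDS = 200_000_000          # records in the file (from the post)
needed = TOTAL_RECORDS * 13 // 100   # first 12% plus last 1%
sequential_reads = TOTAL_RECORDS     # a sequential pass touches every record
wasted = sequential_reads - needed

print(f"needed {needed:,}, read {sequential_reads:,}, wasted {wasted:,}")
# -> needed 26,000,000, read 200,000,000, wasted 174,000,000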
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
Bitneuker,
Sorry that I was not more accurate with my finger pointing. Actually, what you said was about the only thing I agreed with (and you too, CICS Guy!).
Steve,
Though what you say is true, in a practical sense you should not be processing a 200 million record file with the intent of only processing 10,000 of them.
Most really large files get activity from the on-line side (a new account, a new invoice, etc.) on a one-here-one-there basis. The change to the master is made, the activity is recorded, and a 'report or disposition' record (which contains everything that is necessary) should be written and sent to the back-end (batch) process. The only time you pass the 'master file' is to pass it completely, and there sequential will outrun random any day, because, as Bitneuker indicated, you benefit from the buffering and the look-ahead reads that are performed when sequentially reading a properly tuned file.
Any batch process (or an online full-file update) is designed to pass a file. If you process a trigger file and use random reads to access data from the master, you should redesign your process: extract all the data necessary at the time of the change to the master, to avoid re-acquiring data you had in the first place, because random reads on a master are slow.
Every record that is not already in memory (the random read only buffers what is necessary, so the I/O is as small as possible) has to be acquired from disk.
You should not process a large master with random reads. You make changes to individual accounts on a task basis (another update or change is a new task), so you should be affecting a very small number of records when your process is random. When the number of random reads climbs above 1 or 2% of the file during a process, you are starting to get into pass-the-master type processing. At 10% or more, you either redesign the process or start reading sequentially.
Extract what you need during a 'random change to the master' so you don't have to read the data again.
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
I realize that, back in the last century, the extract (or trigger) record generated by an on-line update/change was small and required re-reading the master during batch processing. But back then code was cheap; memory and disk space were not.
5 and 6K extract records are the way to go now. They can be sorted easily, and in fact the batch process can complete without having access to the master at all.
If your shop is doing it the old way, your shop is doing it the old way!
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 631 Location: Wisconsin
Agreed. It was just a counter-example. I wouldn't process a VSAM file like that anyway. I have some files (AFP with TLEs in them) that are huge like that and have had to do some one-off reporting from them. I sort out the random lines with an INCLUDE statement and then post-process that.
ashokm
New User
Joined: 28 Feb 2006 Posts: 11 Location: Chennai,India
Hi chandracdac,
It is a good question. First, understand the internal storage of keys: they are stored in either ascending or descending order.
For example, your file looks like this:
Emp No   Emp Name   Emp Date
00010    XXXXXX     NNNN
00005    AAAAA      NNNN
00002    BBBBB      NNNN
...
Emp No is the key.
The internal storage is:
00002, which points to BBBBB NNNN
00005, which points to AAAAA NNNN
00010, which points to XXXXXX NNNN
...
There are many levels of key storage.
Suppose your file holds 1,000,000 records.
At the first level it is divided into groups of 100,000.
At the second level each group is divided into groups of 10,000.
At the third level into groups of 1,000.
At the fourth level into groups of 100.
At the fifth level into groups of 10.
At the sixth level into single records.
If you search for a record in that file, it first checks the first level (at most 10 checks); the key must fall within one of those first-level groups. Then it checks the second level (at most 10 checks), and so on. So within about 60 checks we can retrieve any record from a file of 1,000,000 records.
I just gave one example with my own assumptions. There are lots of algorithms for searching keys; I don't know exactly which algorithm they use, but this is the basic logic.
Thanks & Regards,
Ashok M
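Ashok's arithmetic can be sketched in a few lines of Python. This is only an illustration of the multi-level idea with an assumed fan-out of 10 per level; it is not VSAM's actual index algorithm.
Code:
# Illustration of the multi-level index arithmetic above (not VSAM's actual
# index format): with a fan-out of 10 per level, 1,000,000 records need
# 6 index levels, and a keyed lookup scans at most 10 entries per level,
# so roughly 60 checks versus up to 1,000,000 sequential reads.
RECORDS = 1_000_000   # file size from the example
FANOUT = 10           # assumed entries scanned per index level

levels = 0
span = RECORDS
while span > 1:                       # each level divides the span by the fan-out
    span = -(-span // FANOUT)         # ceiling division
    levels += 1

worst_keyed_checks = levels * FANOUT  # scan one level's entries at a time
worst_sequential_reads = RECORDS      # read record by record until a hit

print(f"{levels} index levels, at most {worst_keyed_checks} key checks "
      f"vs. up to {worst_sequential_reads:,} sequential reads")
# -> 6 index levels, at most 60 key checks vs. up to 1,000,000 sequential reads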
expat
Global Moderator
Joined: 14 Mar 2007 Posts: 8796 Location: Welsh Wales
A couple of things that I have learnt about random access.
Always sort the trigger file by the KSDS key prior to executing your program, as this may help reduce the number of index I/Os required to read your file.
The default number of index buffers for a KSDS is 2. Take a look at the output of an IDCAMS LISTCAT, find the number of index levels that your KSDS has, and for a batch job allocate three or four times that number for BUFNI. This keeps more index records in the buffers and may again help reduce the number of index I/Os required to access your data.
chandracdac
New User
Joined: 15 Jun 2007 Posts: 92 Location: bangalore
This question was asked of my friend in an interview. I think there is nothing wrong with the question, dbz.
Bitneuker
CICS Moderator
Joined: 07 Nov 2005 Posts: 1104 Location: The Netherlands at Hole 19
Quote:
so my question is, how can we say random access is faster than sequential access when both take the same number of I/Os?
I think we should keep it simple and stick to the original question. The number of read instructions is the same; for 50 records the number of I/Os doesn't differ.
vasanthkumarhb
Active User
Joined: 06 Sep 2007 Posts: 275 Location: Bang,iflex
To explain it in the simplest way:
1. To read the 50th record from a sequential file, it has to read from the first record onward, so the number of passes made is 50.
2. Whereas in the case of a KSDS file we have the concept of an INDEX. When we try to read the 50th record from a KSDS file, it locates the record through the INDEX SET and SEQUENCE SET (in the INDEX COMPONENT, which acts like a pointer in VSAM) and then fetches the actual record from the DATA COMPONENT of the KSDS. So normally the number of passes is less than with a sequential file, and that is why random access is faster than sequential access.
Let me know your feedback.
Thanks & Regards,
raghav
gupta vishal
New User
Joined: 25 Sep 2007 Posts: 15 Location: Gurgaon
To be more mathematical:
The concept of hashing is applied in index searching. Hashing has a time complexity of O(1), i.e. constant, because to reach a hash key only a calculation has to be made based on the index, and thus the key is obtained.