Random access is faster than sequential access, is it true?


IBM Mainframe Forums -> COBOL Programming
chandracdac

New User


Joined: 15 Jun 2007
Posts: 92
Location: bangalore

PostPosted: Mon Sep 03, 2007 3:23 pm

I have a small confusion. I have 50 records in my file, and it is actually an indexed file. We know random access is faster than sequential access, but my doubt is this: if I want to read the 50th record, then with sequential access it takes 50 I/Os (input/output operations), and in the same way, with random access it also takes 50 I/Os (we have to read 50 keys). So my question is: how can we say random access is faster than sequential access when both take the same number of I/Os?


If anyone knows the answer, please clarify. Thanks in advance.
Bitneuker

CICS Moderator


Joined: 07 Nov 2005
Posts: 1104
Location: The Netherlands at Hole 19

PostPosted: Mon Sep 03, 2007 3:28 pm

Ages ago I was taught to use sequential access when the number of records accessed exceeds 12% of the total number of records. BTW, reading 50 records will, depending on the record length and blocking, lead to one single I/O if they all fit in the buffers.
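
To see the blocking arithmetic, here is a minimal sketch in Python (the record length and block size are illustrative numbers, not from this thread):

Code:
import math

# Illustrative numbers: 80-byte records blocked at 27,920 bytes
# (349 records per block, a common choice on 3390 DASD).
RECORD_LENGTH = 80
BLOCK_SIZE = 27920
RECORDS_PER_BLOCK = BLOCK_SIZE // RECORD_LENGTH  # 349

def sequential_ios(records_to_read):
    """Physical I/Os to read the first N records sequentially:
    one I/O per block, however many records the block holds."""
    return math.ceil(records_to_read / RECORDS_PER_BLOCK)

print(sequential_ios(50))    # -> 1: all 50 records arrive in one block read
print(sequential_ios(1000))  # -> 3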
abin

Active User


Joined: 14 Aug 2006
Posts: 198

PostPosted: Mon Sep 03, 2007 3:51 pm

Hi Bitneuker,

Could you please explain the 12% limit for sequential reads?

Thanks,
Abin
CICS Guy

Senior Member


Joined: 18 Jul 2007
Posts: 2146
Location: At my coffee table

PostPosted: Mon Sep 03, 2007 4:02 pm

abin wrote:
Could you please explain the 12% limit for sequential reads?
It's called a ROT....A Rule Of Thumb.....
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Mon Sep 03, 2007 11:33 pm

chandracdac wrote:

in the same way, with random access it also takes 50 I/Os (we have to read 50 keys)


The amount of invalid/false information in this thread is amazing!

Where in the world did you get the false impression that random access requires all the keys to be read until a hit? A little reading on your part about leaf and b-tree indexing would do you some good. Google it if you don't want to wade through the IBM manuals.
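
For intuition about why keyed access does not read every key, here is a rough Python sketch; binary search stands in for descending a b-tree index, which is only an approximation of the real multi-level VSAM index:

Code:
keys = list(range(1, 1_000_001))  # 1,000,000 keys in ascending order

def linear_probes(target):
    """Scan the keys front to back, counting comparisons until a hit."""
    for probes, k in enumerate(keys, start=1):
        if k == target:
            return probes
    raise KeyError(target)

def indexed_probes(target):
    """Binary search: each probe halves the remaining key range,
    so a hit costs about log2(n) comparisons, not n."""
    lo, hi, probes = 0, len(keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if keys[mid] == target:
            return probes
        if keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    raise KeyError(target)

print(linear_probes(999_999))   # 999,999 comparisons
print(indexed_probes(999_999))  # about 20 comparisons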
Bitneuker

CICS Moderator


Joined: 07 Nov 2005
Posts: 1104
Location: The Netherlands at Hole 19

PostPosted: Tue Sep 04, 2007 1:17 am

Quote:
The amount of invalid/false information in this thread is amazing!


Well Dick......... explain why my 12% rule sucks ;-)
stodolas

Active Member


Joined: 13 Jun 2007
Posts: 631
Location: Wisconsin

PostPosted: Tue Sep 04, 2007 2:42 am

Say you are reading 13% of the records, and there are 200 million records in the file. And say you want to read the first 12% and the last 1%. If you read that file sequentially, you are doing over 100 million unneeded reads.
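
The back-of-the-envelope arithmetic, using the figures from that example:

Code:
total = 200_000_000                              # records in the file
wanted = int(total * 0.12) + int(total * 0.01)   # first 12% plus last 1%

# Reading sequentially, reaching the last 1% means passing every record,
# so the middle 87% of the file is read for nothing.
unneeded = total - wanted
print(f"{unneeded:,} unneeded reads")   # 174,000,000 -> well over 100 million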
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Tue Sep 04, 2007 3:27 am

Bitneuker,

sorry that I was not more accurate with my finger pointing. Actually, what you said was about the only thing that I agreed with (and you too, CICS Guy!).

Steve,

Though what you say is true, in a practical sense you should not be processing a 200-million-record file with the intent of only processing 10,000.

Most really large files have activity from the on-line side (new account or new invoice, etc.) on a one-here-one-there basis. The change to the master is made, the activity should be recorded, and a 'report or disposition' record (which contains everything that is necessary) should be written and sent to the back-end (batch) process. The only time you pass the 'master file' is to pass it - completely - and there, sequential will outrun random any day, because, as Bitneuker indicated, you benefit from the buffering and the look-ahead reads that are performed when sequentially reading a properly tuned file.

Any batch process (or an online full-file update) is designed to pass-a-file. If you process a trigger file and use random reads to access data from the master, you should re-design your process (extract all the data necessary at the time of the change to the master, to remove the need to re-acquire data that you had in the first place), because random reads on a master are slow. Every record that is not already in memory needs to be acquired from disc, since a random read only buffers what is necessary - the I/O is as small as possible.

You should not process a large master with random reads. You make changes to individual accounts on a task basis (another update or change; a new task). You should be affecting a very small number of records when your process is random. When the number of random reads starts getting higher than 1 or 2% during a process, you are getting into pass-the-master type processing. At 10% or more, you either redesign the process or start reading sequentially.

Extract what you need during a 'random change to the master' so you don't have to read the data again.
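
A toy sketch of that design in Python (the record layout and names are invented for illustration): the online change writes a self-contained 'disposition' record, so the batch pass never has to re-read the master.

Code:
master = {"00050": {"name": "SMITH", "branch": "001", "balance": 100.00}}

extract_file = []   # stands in for the extract dataset sent to batch

def apply_online_change(acct, amount):
    """Update the master, then write a disposition record carrying
    everything batch will need - no re-acquisition of data later."""
    rec = master[acct]
    rec["balance"] += amount
    extract_file.append({
        "acct": acct,
        "name": rec["name"],          # copied now, while we have it
        "branch": rec["branch"],
        "amount": amount,
        "new_balance": rec["balance"],
    })

apply_online_change("00050", 25.00)

# Batch pass: sort and process the extracts; the master is never opened.
for rec in sorted(extract_file, key=lambda r: r["acct"]):
    print(rec)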
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Tue Sep 04, 2007 3:33 am

I realize that back in the last century, the extract (or trigger) record generated by an on-line update/change was small, and required re-reading the master during batch processing. But back then code was cheap; memory and disc space were not.

5 and 6K extract records are the way to go now. They can be sorted easily; in fact, the batch process can complete without ever having access to the master.

If your shop is doing it the old way, your shop is doing it the old way!
stodolas

Active Member


Joined: 13 Jun 2007
Posts: 631
Location: Wisconsin

PostPosted: Tue Sep 04, 2007 3:35 am

Agreed. It was just a counter-example. I wouldn't process like that against VSAM anyway. I have some files (AFP with TLEs in them) that are huge like that and have had to do some one-off reporting from them. I sort out the lines I need with an INCLUDE statement, then post-process that.
ashokm

New User


Joined: 28 Feb 2006
Posts: 11
Location: Chennai,India

PostPosted: Tue Sep 04, 2007 5:37 pm

Hi chandracdac,
It is a good question. First, understand the internal storage of keys: they are kept in either ascending or descending order.

For example, your file looks like this:

Emp No  Emp Name  Emp Date
00010   XXXXXX    NNNN
00005   AAAAA     NNNN
00002   BBBBB     NNNN
...

Emp No is the key, so the internal storage is:

00002, which points to BBBBB NNNN
00005, which points to AAAAA NNNN
00010, which points to XXXXXX NNNN
...

There are many levels of key storage. Suppose your file holds 1,000,000 records:

At the first level it is divided into groups of 100,000.
At the second level each group is divided into 10,000.
At the third level, into 1,000.
At the fourth level, into 100.
At the fifth level, into 10.
At the sixth level, into single records.

If you search for a record in that file, it first checks the first level (max 10 checks); the key must fall in one of the first-level groups. Then it checks the second level (max 10 checks), and so on. So within 60 checks we can retrieve any record from a file of 1,000,000 records.

I just gave one example based on my own assumptions. There are lots of algorithms for searching keys; I don't know which algorithm they are using, but this is the basic logic.
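
That scheme can be simulated directly (a Python sketch of the assumed 10-way division; a real VSAM index fan-out depends on CI size and key compression, so the numbers are illustrative only):

Code:
FANOUT = 10
TOTAL = 1_000_000

def checks_to_find(position):
    """Walk a 10-way, 6-level index: at each level, scan at most
    FANOUT entries to pick the group containing the target."""
    checks, group_size = 0, TOTAL
    while group_size > 1:
        group_size //= FANOUT
        checks += (position // group_size) % FANOUT + 1   # up to 10 compares
    return checks

# Worst case over a few sample positions: never more than 60 checks
print(max(checks_to_find(p) for p in (0, 123_456, 999_999)))   # -> 60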

Thanks & Regards
Ashok M
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8796
Location: Welsh Wales

PostPosted: Wed Sep 05, 2007 10:13 pm

A couple of things that I have learnt about random access.

Always sort the trigger file by the KSDS key prior to executing your program, as this may help reduce the number of index I/Os required to read your file.

The default number of index buffers for a KSDS is 2. Take a little look at the output of an IDCAMS LISTCAT, find the number of index levels that your KSDS has, and for a batch job allocate three or four times that number for BUFNI. This keeps more index records in buffers and may again help reduce the number of index I/Os required to access your data.
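
Both effects show up in a toy simulation (Python; the index shape and the LRU buffer policy are simplifications invented for illustration, not actual VSAM buffer management): sorted input revisits the same index records back to back, and extra buffers keep the upper index levels in storage.

Code:
import random
from collections import OrderedDict

FANOUT, LEVELS = 100, 3                 # a 3-level index, 100-way fan-out
random.seed(1)
keys = random.sample(range(FANOUT ** LEVELS), 10_000)

def index_ios(key_list, bufni):
    """Count index-record reads for keyed lookups, keeping the 'bufni'
    most recently used index records buffered (LRU)."""
    buffers, ios = OrderedDict(), 0
    for key in key_list:
        # index records touched on the way down: root, then one per level
        for lvl in range(LEVELS):
            rec = (lvl, key // FANOUT ** (LEVELS - lvl))
            if rec in buffers:
                buffers.move_to_end(rec)         # buffer hit, no I/O
            else:
                ios += 1                         # index record read from disk
                buffers[rec] = True
                if len(buffers) > bufni:
                    buffers.popitem(last=False)  # evict the oldest
    return ios

for bufni in (2, 4 * LEVELS):                    # few buffers vs 4 x levels
    print("unsorted, BUFNI=%2d:" % bufni, index_ios(keys, bufni))
    print("sorted,   BUFNI=%2d:" % bufni, index_ios(sorted(keys), bufni))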
chandracdac

New User


Joined: 15 Jun 2007
Posts: 92
Location: bangalore

PostPosted: Thu Sep 06, 2007 5:52 pm

This question was asked to my friend in an interview. I think there is nothing wrong with that question, dbz.
Bitneuker

CICS Moderator


Joined: 07 Nov 2005
Posts: 1104
Location: The Netherlands at Hole 19

PostPosted: Thu Sep 06, 2007 11:50 pm

Quote:
so my question is: how can we say random access is faster than sequential access when both take the same number of I/Os?


I think we should keep it simple and stay with the original question. The number of read instructions is the same, and with only 50 records the number of I/Os doesn't really differ.
vasanthkumarhb

Active User


Joined: 06 Sep 2007
Posts: 275
Location: Bang,iflex

PostPosted: Thu Sep 27, 2007 1:11 pm

To explain it in the simplest way:

1. To read the 50th record from a sequential file, reading starts from the first record, so the number of passes made is 50.

2. Whereas in the case of a KSDS file we have the concept of an INDEX. When we try to read the 50th record from a KSDS file, it locates the record through the INDEX SET and SEQUENCE SET (the INDEX COMPONENT acts like a pointer in VSAM) and then fetches the actual record from the DATA COMPONENT of the KSDS. So normally the number of passes is much smaller than for a sequential file, and that is why random access is faster than sequential access.
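
In rough numbers (assuming an unblocked sequential file and a KSDS with one index-set level above the sequence set, purely for illustration):

Code:
# Sequential file, unblocked: one read per record until the 50th.
sequential_passes = 50

# KSDS keyed read: index-set record + sequence-set record + data CI.
ksds_passes = 1 + 1 + 1

print(sequential_passes, "vs", ksds_passes)   # 50 vs 3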

Let me know your feedback.

Thanks & Regards,
raghav
gupta vishal

New User


Joined: 25 Sep 2007
Posts: 15
Location: Gurgaon

PostPosted: Thu Sep 27, 2007 2:59 pm

To be more mathematical:

The concept of hashing can be applied to index search. Hashing has a time complexity of O(1), i.e. constant: to locate a hash key, only a calculation needs to be made, and thus the key (and its record) is obtained.
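
A minimal sketch of that hash idea (Python; note that earlier posts describe the KSDS index as a b-tree-style, multi-level structure, which is O(log n) rather than O(1), so this illustrates the claim rather than VSAM itself):

Code:
NBUCKETS = 1024
buckets = [[] for _ in range(NBUCKETS)]

def store(key, record):
    """One arithmetic step locates the bucket - no search over the keys."""
    buckets[hash(key) % NBUCKETS].append((key, record))

def fetch(key):
    """O(1) expected: compute the bucket, then check its few entries."""
    for k, rec in buckets[hash(key) % NBUCKETS]:
        if k == key:
            return rec
    raise KeyError(key)

store("00050", "SMITH")
print(fetch("00050"))   # -> SMITH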