Author |
Message |
sudarshan.srivathsav
New User
Joined: 10 Jul 2012 Posts: 24 Location: USA
Hi,
I am trying to understand how BSAM does compression. Does IBM share internal documentation on how BSAM is used to compress data sets, and on how much sampling is done before the actual compression starts?
I tried to PRINT the compressed striped data set using the ADRDSSU utility, and from the output I understood that compression starts only after a block of data has been sampled.
I am interested in learning how IBM does this compression internally, which control blocks are involved, and so on. I looked through a lot of documentation online but did not find anything useful.
Any inputs will be really helpful.
Thanks,
Sudarshan
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3075 Location: NYC,USA
sudarshan.srivathsav
New User
Joined: 10 Jul 2012 Posts: 24 Location: USA
Thanks Rohit.
I have another question. BSAM initially samples the data to find the compression token, but it samples again some time later. What triggers BSAM to do another round of sampling?
Any thoughts?
Pete Wilson
Active Member
Joined: 31 Dec 2009 Posts: 587 Location: London
Between 8K and 64K of data is sampled for generic compression, and much more for tailored compression, in deciding whether a file is eligible for compression.
Also, I believe the file's initial primary allocation has to be at least 8MB (~10cyls) for compression services to be invoked.
Definitely read the link Rohit posted... I just did and learnt a bit more myself.
I'm not sure what you meant about the DFDSS PRINT. Any process that opens a compressed file automatically decompresses the data being read. The format of the output depends on the parameters you feed into the utility, or on the type of utility reading the file. Some of the data may be 'unprintable'.
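As a rough check of the "8MB (~10cyls)" figure, the arithmetic can be sketched in Python. This assumes 3390 device geometry (15 tracks per cylinder, 56,664 bytes maximum per track), which the thread does not state; the usable space in practice is lower and depends on the block size.

```python
# Assumption: 3390 geometry. The thread does not name the device type.
BYTES_PER_TRACK = 56_664       # maximum capacity of one 3390 track
TRACKS_PER_CYLINDER = 15


def cylinders_to_bytes(cylinders: int) -> int:
    """Return the maximum raw capacity of the given number of cylinders."""
    return cylinders * TRACKS_PER_CYLINDER * BYTES_PER_TRACK


ten_cylinders = cylinders_to_bytes(10)
print(f"10 cylinders = {ten_cylinders} bytes")
print(f"             = {ten_cylinders / 1_000_000:.2f} MB (decimal)")
print(f"             = {ten_cylinders / (1024 * 1024):.2f} MiB (binary)")
```

Under that assumption, 10 cylinders come to a little over 8MB whichever definition of a megabyte is used, which is consistent with the approximation quoted above.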
sudarshan.srivathsav
New User
Joined: 10 Jul 2012 Posts: 24 Location: USA
Thanks Pete, I read the document and learned a lot from it.
The utility I used to print the data set does not decompress it. Here is the JCL snippet I used:
Code:
//DUMPME   EXEC PGM=ADRDSSU,REGION=0M
//SYSPRINT DD SYSOUT=*
//DATA     DD DISP=SHR,VOL=SER=SMW083,UNIT=SYSDA
//SYSIN    DD *
  PRINT DATASET(WWCSRS.SAMPLE.OUTPUT.V14.TEST) INDD(DATA)
/*
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
Quote:
I was interested to learn how IBM internally does this compression, what control blocks are involved in it etc etc.. I tried to see many documentations online but wasn't really successful in finding something useful.
That was the question. There is a manual about the inner workings, but it is "company confidential", so it has to stay behind closed doors.
sudarshan.srivathsav
New User
Joined: 10 Jul 2012 Posts: 24 Location: USA
Peter, I agree with you, but someone who knows about it could share a general idea of how it works, so that I could better understand why more TCB/SRB time is spent on compressing files, why the compression ratio is bad, and so on.
I did not mean to steal anything from IBM!!
Pete Wilson
Active Member
Joined: 31 Dec 2009 Posts: 587 Location: London
Looking at your JCL, Sudarshan, I see you have a volser coded, which implies it is not an SMS-managed data set, so it is probably not compressed. I think it's still the case that data sets have to be SMS managed to be compressed.
If a file IS compressed, the system automatically decompresses it when you open it, and you have no control over that. Can you show the full LISTCAT ENT output for the file, so we can see whether it has a compression token?
sudarshan.srivathsav
New User
Joined: 10 Jul 2012 Posts: 24 Location: USA
Peter,
Please see below:
Code:
NONVSAM ------- WWCSRS.SAMPLE.OUTPUT.V14.TEST
IN-CAT --- USERCAT.TSOUSERS
HISTORY
DATASET-OWNER-----(NULL) CREATION--------2014.227
RELEASE----------------2 EXPIRATION------0000.000
***
ACCOUNT-INFO-----------------------------------(NULL)
SMSDATA
STORAGECLASS -----STRIPE MANAGEMENTCLASS---(NULL)
DATACLASS ------STRIPEC1 LBACKUP ---0000.000.0000
VOLUMES
VOLSER------------SMW137 DEVTYPE------X'3010200F' FSEQN------------------0
ASSOCIATIONS--------(NULL)
ATTRIBUTES
VERSION-NUMBER---------1
STRIPE-COUNT-----------1
ACT-DIC-TOKEN----X'4000000B01F00240070208050D0108FE0DFE05FE0EFE0AFE000000000000000000000000'
COMP-FORMT EXTENDED
STATISTICS
USER-DATA-SIZE------------------------------325659960 COMP-USER-DATA-SIZE-------------------------235216437
SIZES-VALID--------(YES)
***
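The two size fields in the STATISTICS section of this LISTCAT output are enough to put a number on the compression achieved. A minimal Python sketch, assuming the field names and values exactly as printed in the listing; the helper `read_field` is hypothetical and written only for this illustration:

```python
import re

# The STATISTICS line copied from the LISTCAT output in this thread.
LISTCAT_STATS = (
    "USER-DATA-SIZE------------------------------325659960 "
    "COMP-USER-DATA-SIZE-------------------------235216437"
)


def read_field(text: str, name: str) -> int:
    """Return the numeric value that follows `name` and its run of dashes."""
    # The lookbehind stops "USER-DATA-SIZE" from matching inside
    # "COMP-USER-DATA-SIZE".
    match = re.search(rf"(?<![A-Z-]){re.escape(name)}-+(\d+)", text)
    if match is None:
        raise ValueError(f"field {name!r} not found")
    return int(match.group(1))


original = read_field(LISTCAT_STATS, "USER-DATA-SIZE")
compressed = read_field(LISTCAT_STATS, "COMP-USER-DATA-SIZE")

ratio = original / compressed          # how many times smaller the data became
saved = 1 - compressed / original      # fraction of space saved

print(f"original bytes:   {original}")
print(f"compressed bytes: {compressed}")
print(f"compression ratio {ratio:.2f}:1, space saved {saved:.1%}")
```

For this data set the compressed size is roughly 72% of the original, a saving of about 28%.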
Pete Wilson
Active Member
Joined: 31 Dec 2009 Posts: 587 Location: London
OK, so it is definitely SMS managed and compressed.
I'm not sure how you know for certain that it is not being decompressed when you print it with DFDSS. Have you verified that by trying to print the file with IDCAMS or SAS or something? If you browse the file through something like Fileaid, does the data appear to be printable characters?
I doubt you will easily be able to get any details on the internal workings of the compression process. Even if they are available, this sort of thing is usually what's termed 'Licenced Material' and would be restricted to the teams who need to know about it.
To be honest, there are probably more productive ways to spend your time. Normally I'd encourage research, but in this case I wouldn't.