IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

IMS BMP program causes 878 system abend


IBM Mainframe Forums -> IMS DB/DC
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
Artemk

New User


Joined: 22 Nov 2016
Posts: 4
Location: Russia

PostPosted: Tue Nov 22, 2016 8:26 pm
Reply with quote

Hi, actually I'm a beginner in IMS tuning, and reading docs for me is better, but I already digged through tonns of it, learned numeros new things, but still have no idea what how to locate where does my problem come from.
It would be great if someone hints me where to dig. Some times I even think that it could be an internal IMS defect caused by memory leak...

The problrem is that some times I get 878 abend while batch program processing. The only message displayed is "DFS629I IMS XFP TCB ABEND - SYS 878 IVP1"

Configuration is like this: IMS version 12, z/os 1.13, batch is executed for initial load of DEDB, Fast path 64bit buffer manager is enabled, it's buffer size is maximal 2047M. The number of segments inserted is about 11million, total size of loaded data is 1,9GB.
BMP program is written to make checkpoints either after getting FW or every N inserted segments (N is about 1500).
The only dependency I figured out for now is that if NBA is bigger than 2000 in one BMP, or if there are several BMPs that use total NBA > 2000, and the scope of data inserted is long enough, I'm getting abend after about 200-300 checkpoints.
If NBA is 2000 or less, I do not get abend - at least while loading my 2GB of data, may be if I had 20, I got, who knows...


Thanks,
Artem
Back to top
View user's profile Send private message
Artemk

New User


Joined: 22 Nov 2016
Posts: 4
Location: Russia

PostPosted: Tue Nov 22, 2016 8:59 pm
Reply with quote

Forgot to tell about another observation.
(Some clarification about what does the program do:
BMP takes the data from prepared GSAM input file. one record consists of 3 fields: DSN=44char, Type=3-char, data=1917char and each record is inserted in IMS database like 3 segment - root, child, child of child.
I described it so detailed to show that each input record has the same structure, and causes the insertion of the same segmnets of same size.)

And back to BMP. I start the BMP program with NBA=2000. The first FW I get after 2000 input records - as expected. Make checkpoint. Next FW appears after 19xx , the next after even smaller number, and so on.
For example, 100th checkpoint happens 1800 records read after 99th. But on my input data, BMP program ends successfully.
If I use NBA=8000, the behaviour is the same. Sooner or later the interval between checkpoints decreases until about 4000, and 878 abend happens.
This abend says that GetMain failed, so I decided to check what happens with CSA and ECSA. Both of them have a plenty of place - IMS have 1GB, uses 100-160MB, some times up to 600MB. ECSA has about 4500MB, and 2GB used. So, seems that no real lack of memory.
Back to top
View user's profile Send private message
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

PostPosted: Tue Nov 22, 2016 10:18 pm
Reply with quote

S878 generally has a reason code, which tells much about the reason for the shortage of storage. Look in the job output or the console log to find the reason code.
Back to top
View user's profile Send private message
Artemk

New User


Joined: 22 Nov 2016
Posts: 4
Location: Russia

PostPosted: Tue Nov 22, 2016 11:47 pm
Reply with quote

True, usually s878 comes with 2-digit reason code, but not this time.
May be I lost some thing, see only one line that could have RC -
DFS629I IMS XFP TCB ABEND - SYS 878 IVP1

here is a system log -
Code:

20:32:07.43 JOB00265 00000090  +DFS0542I  *EXTENDED CHECKPOINT* DAY=327,TIME=173207146201+0300 IVP1         
20:33:12.28 STC00249 00000090  DFS3257I ONLINE LOG NOW SWITCHED - FROM DFSOLP02 TO DFSOLP03 IVP1           
20:33:12.29 STC00249 00000090  DFS3257I ONLINE LOG NOW SWITCHED - FROM DFSOLS02 TO DFSOLS03 IVP1           
20:33:12.35 JOB00267 00000281  $HASP100 IVPGNJCL ON INTRDR      IBMUSER               FROM STC00251         
                               IMS12RC1                                                                     
20:33:12.38 JOB00267 00000290  IRR010I  USERID START2   IS ASSIGNED TO THIS JOB.                           
20:33:12.40 STC00249 00000090  DFS2484I JOBNAME=IVPGNJCL GENERATED BY LOG AUTOMATIC ARCHIVING IVP1         
20:33:12.59 JOB00267 00000281  ICH70001I START2   LAST ACCESS AT 20:23:00 ON TUESDAY, NOVEMBER 22, 2016     
20:33:12.61 JOB00267 00000090  $HASP373 IVPGNJCL STARTED - INIT 4    - CLASS E - SYS S0W1                   
20:33:12.62 JOB00267 00000281  IEF403I IVPGNJCL - STARTED - TIME=20.33.12                                   
20:33:12.72 JOB00265 00000090  +DFS0540I  *EXTENDED CHECKPOINT* PROGRAM=DOCCARD ,CKPTID=  000025,           
                               REGID=0007,DAY=327,TIME=1733121  IVP1                                       
20:33:12.73 JOB00265 00000090  +DFS0542I  *EXTENDED CHECKPOINT* DAY=327,TIME=173312174829+0300 IVP1         
20:33:38.13 JOB00267 00000290  -                                         --TIMINGS (MINS.)--               
                                ----PAGING COUNTS---                                                       
20:33:38.14 JOB00267 00000290  -JOBNAME  STEPNAME PROCSTEP    RC   EXCP    TCB    SRB  CLOCK   SERV  PG     
                                  PAGE   SWAP    VIO SWAPS                                                 
20:33:38.14 JOB00267 00000290  -IVPGNJCL                      00  45220 ******    .08     .4  23435   0     
                                     0      0      0     0                                                 
20:33:38.15 JOB00267 00000090  IEF404I IVPGNJCL - ENDED - TIME=20.33.38                                     
20:33:38.15 JOB00267 00000290  -IVPGNJCL ENDED.  NAME-IBMUSER              TOTAL TCB CPU TIME=   .17       
                               TOTAL ELAPSED TIME=    .4                                                   
20:33:38.17 JOB00267 00000281  $HASP395 IVPGNJCL ENDED                                                     
20:33:38.20          00000281  $HASP309 INIT 4    INACTIVE ******** C=ABCDE                                 
20:50:43.34 STC00249 00000090  DFS2361I XCF GROUP CLOSED SUCCESSFULLY.  IVP1                               
20:50:43.34 STC00172 00000090  HWSD0282I COMMUNICATION WITH DS=IMSDSNM  CLOSED; M=DREC                     
20:50:43.35 STC00249 00000090  DFS3896I ATTEMPTING TO PURGE OLDS BUFFERS  IVP1                             
20:50:47.36 STC00249 00000090  DFS616I OLDS BUFFERS SUCCESSFULLY PURGED  IVP1                               
20:50:47.37 STC00249 00000281  DFS629I IMS XFP TCB ABEND - SYS 878           IVP1                           
20:50:47.38 STC00249 00000281  DFS629I PSW AT ERROR = *** NO SDWA ***    IVP1                               
20:50:47.45          00000090  IEA045I AN SVC DUMP HAS STARTED AT TIME=20.50.47 DATE=11/22/2016 754         
                 754 00000090  FOR ASIDS(0070,007F,0071,0041)                                               
                 754 00000090  ERROR ID = SEQ00282 CPU8B ASID0070 TIME20.50.41.6                           
                 754 00000090  QUIESCE = YES                                                               
20:51:04.45 STC00032 00000090  DSNL511I  -DBAG DSNLIENO TCP/IP CONVERSATION FAILED 755                     
                 755 00000090             TO LOCATION ::10.0.40.132                                         
                 755 00000090             IPADDR=::10.0.40.132 PORT=49209                                   
                 755 00000090             SOCKET=RECV RETURN CODE=1121 REASON CODE=76650446                 
20:51:04.56 STC00032 00000090  DSNL511I  -DBAG DSNLIENO TCP/IP CONVERSATION FAILED 756                     
                 756 00000090             TO LOCATION ::10.0.40.22                                         
                 756 00000090             IPADDR=::10.0.40.22 PORT=53719                                   
                 756 00000090             SOCKET=RECV RETURN CODE=1121 REASON CODE=76650446                 
20:51:04.64 STC00032 00000090  DSNL511I  -DBAG DSNLIENO TCP/IP CONVERSATION FAILED 757                     
                 757 00000090             TO LOCATION ::10.0.40.22                                         
                 757 00000090             IPADDR=::10.0.40.22 PORT=53893                                   
                 757 00000090             SOCKET=SENDMSG RETURN CODE=140 REASON CODE=76697242               
20:51:36.13 STC00035 00000090  IEA043I SVC DUMP REACHED MAXSPACE LIMIT - MAXSPACE=00000500 MEG             

Actually, what happens in log - IMS makes checkpoint, switches online logs, again checkpoint, IVPGNJCL starts that writes online logs to offline logs - and here is a beginning of a problem - look at the time line at left column.
15 minutes later, abend message and IMS starts to fall down.

Job log has no other information.
Back to top
View user's profile Send private message
Artemk

New User


Joined: 22 Nov 2016
Posts: 4
Location: Russia

PostPosted: Thu Nov 24, 2016 12:12 am
Reply with quote

Today I got a dump again, and now found the reason code. Thanks for hint to look in the job log, more It was in job log of control region, not bmp. Unfortunately, I'm unable to read dumps. The only thing I see that PSW at entry for ABEND is from 31bit address space.
The RC=10, so I went to an old topic ibmmainframes.com/about38154.html unfortunately they do not tell what CSA worked for that case.

I understand that there is something wrong with virtual storage - need to tune CSA/SQA size, or IMS buffer pools. I believe that only buffers that provide space for NBA/OBA should be considered, because changes of NBA change behaviour of the system.
But, in my case we use Fast path 64bit buffer pool manager, and do not need to tune pools manually like needed in older IMS (surely, if I undestand correct how does this new manager works).
The only parameter could be changes in FP64 is pool size(if I haven't missed smth else), we tried different sizes - from 100MB till maximal 2047, no changes.
CSA/SQA were different too - now they are
CSA = 1MB(tried 4) ECSA = 1GB SQA = 1MB(tried 4) ESQA = 50MB
We decreased sizes from 4 to 1 because thought that areas take 8mb below the line giving only other 8mb to other programs, and usually 640KB are enough.
Actually, we could give more space to IMS in CSA or other areas - from 16MB CSA system takes may be 1 MB, user programs hopefully fit in 2, so 10-12MB is real.
But I'd better become sure that exactly this is a cause for abend.

So, folks, are there any minds what I could try locate the problem?
Big thanks from me for any hints.

Artem
Back to top
View user's profile Send private message
RahulG31

Active User


Joined: 20 Dec 2014
Posts: 446
Location: USA

PostPosted: Thu Nov 24, 2016 1:10 am
Reply with quote

I am not sure if you have looked at some of these links as well (may or may not help):

http://www-01.ibm.com/support/docview.wss?uid=swg21561508

http://ibmmainframes.com/about28380.html

http://ibmmainframes.com/about15045.html

.
Back to top
View user's profile Send private message
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

PostPosted: Thu Nov 24, 2016 1:27 am
Reply with quote

From the MVS System Codes manual:
Quote:
10
There is not enough virtual private area storage available to satisfy the request. This is due to one of the
following:
v
A program is requesting virtual storage, but not subsequently freeing the storage.
v
The region size specified is too small.
v
An installation exit is requesting virtual storage from a V=R region before initialization of the region has
completed.
The third one is not likely to be the problem since initialization appears to be complete. Therefore, either you're not giving it enough region or a program is not releasing storage. Note that the private area extends from the top of the prefixed storage area (PSA) to the beginning (or bottom end) of the common service area (CSA) -- and hence making changes to your CSA size may be making things worse instead of better.

If the problem is in 24-bit storage, you may have a real problem because there's only so much room below the line and if you're using it all, there's not a lot of places to get more from.

If the problem is in 31-bit storage, then you probably can adjust things (as long as you have enough storage assigned to the LPAR; if not, you may have to change the LPAR storage allocation - stealing from other LPARs if necessary).

The problem will NOT be in 64-bit storage; 64-bit storage allocation works in a completely different way (see memory objects on the IBM web site if you want to find out more) such that many concepts that apply to 24-bit and 31-bit memory have no meaning for 64-bit memory.
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Thu Nov 24, 2016 9:27 pm
Reply with quote

if it is an old application - written with old cobol - you may have to compile a bunch to 32bit to get them to load above the line.
if your application is old enough, you are looking at a retirement project.
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> IMS DB/DC

 


Similar Topics
Topic Forum Replies
No new posts Sysplex System won't IPL at DR site I... All Other Mainframe Topics 2
No new posts Using API Gateway from CICS program CICS 0
No new posts ISAM and abend S03B JCL & VSAM 9
No new posts DB2 Event passed to the Application P... DB2 1
No new posts How to pass the PARM value to my targ... COBOL Programming 8
Search our Forums:

Back to Top