1. The commented code in cobol is an actual comment that programmer has written, something like
Code:
000006 MOVE VAR-A TO VAR-B.
000007*
000008* THIS IS A COMMENTED LINE.
000009*
000010 MOVE VAR-B TO VAR-C.
2. Or if the comment is a line of cobol that has been commented.
Code:
000110*
000111* MOVE VAR-A TO VAR-B <-- this is a commented code
000112* MOVE SPACE TO VAR-B IS THE SOLUTION TO CORRECT ABEND <--this line is comment though the initial four letters of the comment is a valid cobol code
000113 MOVE SPACES TO VAR-D.
Since cobol as a language is so similar to english, i am wondering how to acheive this logic. It would be really great if someone can give me a logic that I can build up on.
I am trying to create a report on what all changes are getting into our programs with every release (blame it on requirements, I did not write them ). For doing this the easiest way is to run a SUPERC compare and get the line of code that have been inserted or deleted and create a report from there.
But, how to handle the comments is something I am trying to dig since yesterday. I do not want the comments to be part of the report. But again, a comment can be a someone accidently commenting a line of code which should get reported.
And since you are helping me out from Mars, this should not be impossible .
Dont you have versioning control tools in your shop?
If no:
It is possible solution can be got only if you follow a "Coding practice"(Tagging for changes with relase number or may be a date from 73 - 80th
column and not tagging for comments and also need to ensure no one provides unnum,num,renum so that the column 73 - 80 is not disturbed)
Then based on release number you could just execute a sort step to find the changes done and report them
Quote:
And since you are located on Mars and helping me out, it should not be impossible .
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Code:
Original
MOVE A TO B
MOVE C TO D
MOVE F TO G
Change 1
MOVE A TO B
* DELETED THE LINE
MOVE F TO G
Change 2
MOVE A TO B
* DELETED THE LINE
* MOVE C TO D
MOVE F TO G
In Change 1, the will be an "addition" only of a comment.
In Change 2, the will be an "addition" only of a comment, the "addition" of a commented line of code, and the "deletion" of a line of code.
If you are just looking at insertions/deletions without any context, the results will be meaningless whether the comments are there are not.
If you are looking at some context, you'll delude yourselves in the case where the full context would show different meaning, whether the comments are there or not.
If you have full context, I don't see the problem in including comments. As your example shows, comments are often just plain nonsense, but sometimes might hint at something which needs to be looked at.
Yes, we do follow tagging from column 73-80 and what you suggested is what I have already accomplished. But my report right now is reporting all the comments that programmers have put in the code along with commented code. It is only the commented out code that should be getting reported and not comments(again, blame requirements).
In short what I am expecting is:
1. Report should not report any comments that have been introduced in a program.
2. Report should report any code line that has been commented out. This commented line can be a existing piece of code that has been commented out or a new code that is commented.
Also as Bill mentioned, I am looking (just trying at the moment) at the context on line to determine if the line is a comment or a cobol code that is commented.
The only place I am stuck here is how do I know if the commented out line is code or a programmer expressing himself.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
What I was trying to indicate, despite thinking that it is silly, is that if you have an "insert" with a comment marker which is identical to a "delete" without the comment marker, then it was a "change" of code to comment. If no match, it was a plain comment. Presumably you have a "source line number" on your "changes listing"?
Joined: 06 Jun 2008 Posts: 8700 Location: Dubuque, Iowa, USA
What you are asking is, in a different form, how to perform syntax analysis for a COBOL program -- just for comments instead of code. While this can be done, it is an EXTREMELY difficult task. The rule of thumb I learned in college is that writing a compiler is three times harder than writing an application.
If you insist in pursuing this course (and I like others already posted strongly recommend that you do NOT attempt to do this), you will need to find a language specification document for COBOL (there are a number of different expressions of language definition that could be used such as Backus-Naur Form) and work through it. Expect such a task, not including the rest of the project, to run several hundred to maybe 2000 hours depending upon how experienced you are at writing compilers already.
Joined: 21 May 2013 Posts: 2 Location: United States
I am suggesting this to reduce your effor. This is not 100% effective.
Just get only the commented lines (* at 7th byte) in your program.
Replace all * by space at 7th byte.
The lines that start with Red color first word are commented logic lines.
The lines that start with Green color first word are the comments.
But note that this is not 100% effective. You have some exceptions in this case like working storage varialbe alone in a line.
000030 01 INDX PIC S9(4) COMP VALUE ZERO.
000031 01 SUB PIC S9(4) COMP VALUE ZERO.
000032 01 SUB2 PIC S9(4) COMP VALUE ZERO.
Changed Code
Code:
000030 01 INDX PIC S9(4) COMP VALUE ZERO.
000031*01 SUB PIC S9(4) COMP VALUE ZERO.
000032 01 SUB2 PIC S9(4) COMP VALUE ZERO.
Production Code:
Code:
000333 MOVE 'M' TO WHERE-SW
000334 GO TO WRITE-MASTR-50-PC.
000335
000336 GO TO NO-SELECT.
Changed Code:
Code:
000333 MOVE 'M' TO WHERE-SW
000334 GO TO WRITE-MASTR-50-PC.
000335
* BISWAJIT CODE START
000867 PERFORM CALL-DATEPROG THRU CALL-DATEPROG-EXIT.
* BISWAJIT CODE ENDS
000336 GO TO NO-SELECT.
Production code:
Code:
001427*** CODE UPDATING, SINCE IT WILL HAPPEN AUTOMATICALLY AT EOM.
001428***
001429*UPDATE-CLS-CD.
Changed code:
Code:
001428***
001429*UPDATE-CLS-CD.
*CALL DATEPROG TO OBTAIN THE CURRENT DATE AND TIME.
CALL-DATEPROG.
MOVE 'CURRDATE' TO DATEPROG-FUNCTION
CALL 'DATEPROG' USING DATEPROG-CONTROL-BLOCK
IF DATEPROG-SUCCESSFUL
MOVE DATEPROG-CYYMMDD TO WS-SAVE-DATE
ELSE
DISPLAY 'ERROR CALLING DATEPROG FROM TESTPROG'
DATEPROG-RETURN-CD
CALL 'CANCEL'
END-IF.
CALL-DATEPROG-EXIT.
EXIT.
The outputs:
USERID.TEMP1 - This one has the SUPERC of the programs. I pulled the complete listing (List Type - Long) but showing only the insert/deletes below:
Code:
000030 01 INDX PIC S9(4) COMP VALUE ZERO.
I - 000031*01 SUB PIC S9(4) COMP VALUE ZERO.
D - 000031 01 SUB PIC S9(4) COMP VALUE ZERO.
000032 01 SUB2 PIC S9(4) COMP VALUE ZERO.
.
.
.
000109 01 MOS REDEFINES FREQ-TBL.
000110 03 WS-MONTH OCCURS 4 PIC 99.
I - COPY COPY0012.
000111 LINKAGE SECTION.
000112 COPY COPYWRK2.
.
.
.
000334 GO TO WRITE-MASTR-50-PC.
000335
I - * BISWAJIT CODE START
I - PERFORM CALL-PI00012 THRU CALL-PI00012-EXIT.
I - * BISWAJIT CODE ENDS
000336 GO TO NO-SELECT.
000337
.
.
.
001427*** CODE UPDATING, SINCE IT WILL HAPPEN AUTOMATICALLY AT EOM
001428***
001429*UPDATE-CLS-CD.
I - 001541*CALL DATEPROG TO OBTAIN THE CURRENT DATE AND TIME.
I - 001542 CALL-DATEPROG.
I - 001543 MOVE 'CURRDATE' TO DATEPROG-FUNCTION
I - 001544 CALL 'DATEPROG' USING DATEPROG-CONTROL-BLOCK
I - 001545 IF DATEPROG-SUCCESSFUL
I - 001546 MOVE DATEPROG-CYYMMDD TO WS-SAVE-DATE
I - 001547 ELSE
I - 001548 DISPLAY 'ERROR CALLING DATEPROG FROM TESTPROG'
I - 001549 DATEPROG-RETURN-CD
I - 001550 CALL 'CANCEL'
I - 001551 END-IF.
I - 001552 CALL-DATEPROG-EXIT.
I - 001553 EXIT.
I - 001554
USERID.TEMP2
Code:
I - 000031*01SUBPICS9(4)COMPVALUEZERO.
D - 000031 01SUBPICS9(4)COMPVALUEZERO.
I - *BISWAJITCODESTART
I - *BISWAJITCODEENDS
I - 001541*CALLDATEPROGTOOBTAINTHECURRENTDATEANDTIME.
What I did here was
1. Select only those lines that are insert with comment, comment and delete lines.
2. Then I remove all the space from these lines.
3. What I am working on a REXX to somehow process this file to just tell me that "000031*01SUBPICS9(4)COMPVALUEZERO. " is a cobol code that has been commented out. The code would be ignoring all the other 3 lines.
I scribbled something yesterday and felt like sharing it.
Inputs needed:
1. Take any existing code and make changes to that code by commenting any existing lines, adding comments that look similar to cobol statements and adding nonsense comments too . (USERID.PROGRAM.PDS(PROGRAM))
2. Compile the edited program and catalog the SYSPRINT of IGYCRCTL. It would/should have the program listing. (USERID.COMPILE.LIST)
3. An input file containing the cobol reserved words (Found the list @ publib.boulder.ibm.com/infocenter/iadthelp/v7r0/index.jsp?topic=/com.ibm.etools.iseries.langref.doc/c0925395695.htm) (PS, 80 Bytes, COBOL.RESERVED.WORDS)
/*REXX*/
/* OPEN THE INPUT FILES*/
/* COMPFIL1 HAS THE COMPLETE SUPERC LISTING. LIST OPTION 4 */
"EXECIO * DISKR COMPFIL1 ( STEM COMP1. FINIS"
/* COMPFIL2 HAS INSERTS WITH COMMENTS, COMMENTS AND DELETE LINES. ALL
SPACES REMOVED FROM THESE LINES*/
"EXECIO * DISKR COMPFIL2 ( STEM COMP2. FINIS"
/* FULLCOMP HAS THE COMPLETE PROGRAM WITH EXPANDED COPYBOOKS */
"EXECIO * DISKR FULLCOMP ( STEM FULLC. FINIS"
/*THIS ONE CONTAINS THE LIST OF COBOL RESERVED WORDS*/
"EXECIO * DISKR RESWORD ( STEM RESW. FINIS"
/* */
PROC='N'
COMMENT='N'
CNT=0
CNT1=0
/*GET ALL MEMBERS OF PRODUCTION LIBRARY STEM*/
PDSNAME = 'PRODUCTION LIBRARY PDS'
X = OUTTRAP(STEM.)
"LISTDS '"PDSNAME"' MEMBERS"
X = OUTTRAP(OFF)
DO I = 7 TO STEM.0
J=I-6
STEM.J=STRIP(STEM.I)
END
J=J+1
STEM.J='CANCEL'
STEM.0=J
DO I=1 TO FULLC.0
IF (POS('*',FULLC.I)=25 |,
POS('/',FULLC.I)=25) THEN DO
COMMENT='Y'
END
ELSE DO
COMMENT='N'
END
IF (COMMENT='N' &,
POS('PROCEDURE',FULLC.I)> 25 &,
POS('DIVISION',FULLC.I)>POS('PROCEDURE',FULLC.I) &,
PROC='N') THEN DO
PROC='Y'
END
IF COMMENT='N' & PROC='N' THEN DO
N=WORDS(SUBSTR(FULLC.I,26,90))
IF N > 1 THEN DO
/*STACK UP ALL THE VARIABLES IN A VAR. */
IF DATATYPE(WORD(SUBSTR(FULLC.I,26,90),1),W) = 1 THEN DO
CNT=CNT+1
VAR.CNT=WORD(SUBSTR(FULLC.I,26,90),2)
VAR.0=CNT
END
IF (WORD(SUBSTR(FULLC.I,26,90),1)) = 'COPY' THEN DO
/*STACK UP ALL THE COPYBOOKS IN A VAR. */
CNT=CNT+1
VAR.CNT=WORD(SUBSTR(FULLC.I,26,90),2)
VAR.0=CNT
END
END
END
IF COMMENT='N' & PROC='Y' THEN DO
N=WORDS(SUBSTR(FULLC.I,26,90))
IF N > 0 THEN DO
STR=SUBSTR(FULLC.I,26,90)
IF POS(WORD(STR,1),STR) > 0 &,
POS(WORD(STR,1),STR) < 5 &,
WORD(STR,1)<>'COPY' THEN DO
/*STACK UP ALL THE PARAGRAPH NAMES IN A PARA. */
CNT1=CNT1+1
PARA.CNT1=WORD(SUBSTR(FULLC.I,26,90),1)
PARA.0=CNT1
END
END
END
END
/* SORT VAR. AND PARA. */
SYS.1=" SORT FIELDS=(1,18,CH,A)"
SYS.2=" SUM FIELDS=NONE"
X = MSG('OFF')
"FREE FI(SYSOUT)"
"ALLOC FI(SYSOUT) SYSOUT(X)"
IF VAR.0 > 1 THEN DO
"FREE FI(SYSIN,SORTIN,SORTOUT)"
"ALLOC FI(SYSIN) NEW TRACKS SPACE(3 3) RECFM(F B) LRECL(80)"
"ALLOC FI(SORTIN) NEW TRACKS SPACE(10 10) RECFM(V B) LRECL(80)"
"ALLOC FI(SORTOUT) NEW TRACKS SPACE(10 10) RECFM(V B) LRECL(80)"
PUSH " OPTION VLSHRT VLSCMP"
"EXECIO 1 DISKW SYSIN"
PUSH " SORT FIELDS=(1,18,CH,A)"
"EXECIO 1 DISKW SYSIN"
PUSH " SUM FIELDS=NONE"
"EXECIO 1 DISKW SYSIN ( FINIS"
"EXECIO * DISKW SORTIN ( STEM VAR. FINIS"
DROP VAR.
VAR.0 = 0
"CALL *(SORT)"
"EXECIO * DISKR SORTOUT ( STEM VAR. FINIS"
"FREE FI(SYSIN,SORTIN,SORTOUT)"
END
"FREE FI(SYSOUT)"
"ALLOC FI(SYSOUT) SYSOUT(X)"
IF VAR.0 > 1 THEN DO
"FREE FI(SYSIN,SORTIN,SORTOUT)"
"ALLOC FI(SYSIN) NEW TRACKS SPACE(3 3) RECFM(F B) LRECL(80)"
"ALLOC FI(SORTIN) NEW TRACKS SPACE(10 10) RECFM(V B) LRECL(80)"
"ALLOC FI(SORTOUT) NEW TRACKS SPACE(10 10) RECFM(V B) LRECL(80)"
PUSH " OPTION VLSHRT VLSCMP"
"EXECIO 1 DISKW SYSIN"
PUSH " SORT FIELDS=(1,18,CH,A)"
"EXECIO 1 DISKW SYSIN"
PUSH " SUM FIELDS=NONE"
"EXECIO 1 DISKW SYSIN ( FINIS"
"EXECIO * DISKW SORTIN ( STEM PARA. FINIS"
DROP PARA.
PARA.0 = 0
"CALL *(SORT)"
"EXECIO * DISKR SORTOUT ( STEM PARA. FINIS"
"FREE FI(SYSIN,SORTIN,SORTOUT)"
END
/* INITIAL SETUP ENDS HERE. PROCESSING BEGINS HERE*/
R=' '
C=' '
O=' '
PCCNT=0
PVCNT=0
COMCNT=0
DO I=1 TO COMP2.0
/*P -> Pass, F -> Fail, Y -> Yes, N -> No*/
IF (POS('*',COMP2.I)=12 |,
POS('/',COMP2.I)=12) THEN DO
COMMENT='Y'
CHECK1='F'
IF POS(' I - ',COMP2.I) = 1 THEN DO
CHECK1='P'
END
IF CHECK1='P' THEN DO
J=I+1
CHECK2='F'
IF POS(' D - ',COMP2.J) = 1 THEN DO
CHECK2='P'
END
ELSE DO
CHECK2='X'
END
END
IF CHECK2='P' THEN DO
W1=WORDS(SUBSTR(COMP2.I,13,72))
W2=WORDS(SUBSTR(COMP2.J,13,72))
P='Y'
IF W1 = W2 THEN DO
DO K=1 TO W1
STR1=SUBSTR(COMP2.I,13,72)
STR2=SUBSTR(COMP2.J,13,72)
IF WORD(STR1,1) <> WORD(STR2,1) |,
POS(WORD(STR1,1),WORD(STR2,1)) > 0
THEN DO
P='N'
END
END
END
IF W1 <> W2 THEN DO
P='N'
END
IF P='Y' THEN DO
PCCNT=PCCNT+1
PC.PCCNT=SUBSTR(COMP2.I,13,72)
END
IF P='N' THEN DO
PVCNT=PVCNT+1
PV.PVCNT=SUBSTR(COMP2.I,13,72)
END
CHECK1='F'
CHECK2='F'
END
IF CHECK2='X' THEN DO
W1=WORDS(SUBSTR(COMP2.I,13,72))
STR=SUBSTR(COMP2.I,13,72)
IF W1 > 0 THEN DO
D1='N'
E1='N'
F1='N'
G1='N'
O1='N'
DO J = 1 TO W1
D='N'
E='N'
F='N'
G='N'
O='N'
IF LENGTH(WORD(STR,J)) > 1 THEN DO
DO K=1 TO RESW.0
IF WORD(STR,J) = RESW.K THEN DO
D='Y'
END
END
DO K=1 TO VAR.0
IF WORD(STR,J) = VAR.K THEN DO
E='Y'
END
END
DO K=1 TO PARA.0
IF WORD(STR,J) = PARA.K THEN DO
F='Y'
END
END
DO K=1 TO STEM.0
IF WORD(STR,J) = STEM.K THEN DO
G='Y'
END
END
IF D='Y' THEN D1='Y'
IF E='Y' THEN E1='Y'
IF F='Y' THEN F1='Y'
IF G='Y' THEN G1='Y'
IF D='N' & E='N' & F='N' & G='N' THEN DO
O='Y'
O1='Y'
J=W1+1
END
END
END
IF D1='Y' | E1='Y' | F1='Y' | G1='Y' THEN DO
PVCNT=PVCNT+1
PV.PVCNT=SUBSTR(COMP2.I,13,72)
END
ELSE IF (D1='N' & E1='N' & F1='N' & G1='N') |,
O='Y' | O1='Y'
THEN DO
O='Y'
COMCNT=COMCNT+1
COM.COMCNT=SUBSTR(COMP2.I,13,72)
END
END
CHECK1='F'
CHECK2='F'
END
END
END
IF PCCNT > 0 THEN DO
SAY ' '
SAY '--------------------------------------------------------'
SAY '!!!! PRODUCTION CODE COMMENTED !!!!'
DO I = 1 TO PCCNT
SAY '-->' PC.I
END
END
IF PVCNT > 0 THEN DO
SAY ' '
SAY '--------------------------------------------------------'
SAY 'KINDLY REPHRASE THESE COMMENTS.'
DO I = 1 TO PVCNT
SAY '-->' PV.I
END
SAY '-------------------------------------------'
END
IF COMCNT > 0 THEN DO
SAY ' '
SAY '--------------------------------------------------------'
SAY 'NEW COMMENTS'
DO I = 1 TO COMCNT
SAY '-->' COM.I
END
END
SAY ' '
IF PCCNT > 0 THEN EXIT 8
IF PVCNT > 0 THEN EXIT 6
EXIT 0
000000 IDENTIFICATION DIVISION.
000000 PROGRAM-ID. MULTIPLIER.
000000 AUTHOR. MICHAEL COUGHLAN.
000000* EXAMPLE PROGRAM USING ACCEPT, DISPLAY AND MULTIPLY TO
000000* GET TWO SINGLE DIGIT NUMBERS FROM THE USER AND MULTIPLY THEM TO
000000
000000 DATA DIVISION.
000000
000000 WORKING-STORAGE SECTION.
000000 01 NUM1 PIC 9 VALUE ZEROS.
000000 01 NUM2 PIC 9 VALUE ZEROS.
000000 01 RESULT PIC 99 VALUE ZEROS.
000000
000000 PROCEDURE DIVISION.
000000 DISPLAY "ENTER FIRST NUMBER (1 DIGIT) : " WITH NO ADVANCING.
000000 ACCEPT NUM1.
000000 DISPLAY "ENTER SECOND NUMBER (1 DIGIT) : " WITH NO ADVANCING.
000000 ACCEPT NUM2.
000000 MULTIPLY Num1 BY Num2 GIVING Result.
000000 DISPLAY "Result is = ", Result.
000000 STOP RUN.
After
Code:
000000 IDENTIFICATION DIVISION.
000000 PROGRAM-ID. MULTIPLIER.
000000 AUTHOR. MICHAEL COUGHLAN.
000000* EXAMPLE PROGRAM USING ACCEPT, DISPLAY AND MULTIPLY TO
000000* GET TWO SINGLE DIGIT NUMBERS FROM THE USER AND MULTIPLY THEM TO
000000
000000 DATA DIVISION.
000000
000000 WORKING-STORAGE SECTION.
000000*01 NUM1 PIC 9 VALUE ZEROS.
000000 01 NUM1 PIC 99 VALUE ZEROS.
000000*01 NUM2 PIC 9 VALUE ZEROS COMMENTING THIS LINE FOR TESTING
000000*01 NUM3 PIC 9 VALUE ZEROS.
000000 01 NUM4 PIC 9 VALUE ZEROS.
000000 01 OUTPUT-RES PIC 99 VALUE ZEROS.
000000
000000 PROCEDURE DIVISION.
000000 DISPLAY "ENTER FIRST NUMBER (1 DIGIT) : " WITH NO ADVANCING.
000000 ACCEPT NUM1.
000000 DISPLAY "ENTER SECOND NUMBER (1 DIGIT) : ".
000000 ACCEPT NUM4.
000000* MULTIPLY NUM1 BY NUM4 GIVING OUTPUT-RES.
000000 DIVIDE NUM1 BY NUM4 GIVING OUTPUT-RES.
000000 DISPLAY "OUTPUT-RES OF DIFFERENCE IS = ", OUTPUT-RES.
000000 STOP RUN.
Output
Code:
--------------------------------------------------------
KINDLY REPHRASE THESE COMMENTS.
--> *01 NUM1 PIC 9 VALUE ZEROS.
--> *01 NUM2 PIC 9 VALUE ZEROS COMMENTING THIS LINE FOR TESTING
--> *01 NUM3 PIC 9 VALUE ZEROS.
--> * MULTIPLY NUM1 BY NUM4 GIVING OUTPUT-RES.
--------------------------------------------------------
KINDLY REPHRASE THESE COMMENTS --> indicates that there is a cobol string in the comment which might be a commented cobol code.
Althought this is day 1 for this report (means lot can be improved), I am requesting you to please suggest any other approach to this. I was going through Backus-Naur Form for cobol and was wondering if implementing that can be acheived thru cobol/rexx/any other language?