Joined: 06 Jun 2008 Posts: 8700 Location: Dubuque, Iowa, USA
UNSTRING won't help you here. Either get the source of the data to remove the commas inside the quotes, or use reference modification (or arrays) to examine each byte, one at a time, to determine how to process it.
Joined: 26 Nov 2012 Posts: 17 Location: Switzerland
If you can't have the delimiters changed at their origin, before they come into COBOL, you could try this:
Before unstringing the data stream, replace the commas outside the " " pairs with some other character that isn't used anywhere in the data stream, then UNSTRING on that character.
See the COBOL manuals for INSPECT (TALLYING) REPLACING...
You will probably have to build a loop that goes through the complete data stream, replacing the commas outside the " " pairs and simply skipping ahead to the closing " whenever it meets a quoted part.
Could that be a solution for you?
It would be the easiest, though, to ask for different delimiters to be delivered.
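A minimal sketch of that replace-then-UNSTRING idea, in case it helps. The program name, the data names, the 20-byte column width and the choice of '@' as the stand-in delimiter are all assumptions made for illustration; '@' must not occur anywhere in the real data.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVSKTCH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LINE             PIC X(60).
       01  WS-POS              PIC 9(4) COMP.
       01  WS-QUOTE-SW         PIC X VALUE 'N'.
           88  WS-IN-QUOTES          VALUE 'Y'.
           88  WS-NOT-IN-QUOTES      VALUE 'N'.
       01  WS-COLUMNS.
           05  WS-COL          PIC X(20) OCCURS 6 TIMES.
       PROCEDURE DIVISION.
           MOVE '11/21/2012,,3.125%,"$172,000.00",$0.00,"$172,000.00"'
             TO WS-LINE
      * Pass 1: turn each comma outside a quoted field into '@'
      * ('@' is assumed never to occur in the data).
           SET WS-NOT-IN-QUOTES TO TRUE
           PERFORM VARYING WS-POS FROM 1 BY 1
                   UNTIL WS-POS > FUNCTION LENGTH (WS-LINE)
               EVALUATE TRUE
                   WHEN WS-LINE (WS-POS:1) = '"'
                       IF WS-IN-QUOTES
                           SET WS-NOT-IN-QUOTES TO TRUE
                       ELSE
                           SET WS-IN-QUOTES TO TRUE
                       END-IF
                   WHEN WS-LINE (WS-POS:1) = ',' AND WS-NOT-IN-QUOTES
                       MOVE '@' TO WS-LINE (WS-POS:1)
               END-EVALUATE
           END-PERFORM
      * Pass 2: the commas that are left are data, so split on '@'.
           MOVE SPACES TO WS-COLUMNS
           UNSTRING WS-LINE DELIMITED BY '@'
               INTO WS-COL (1) WS-COL (2) WS-COL (3)
                    WS-COL (4) WS-COL (5) WS-COL (6)
           END-UNSTRING
           PERFORM VARYING WS-POS FROM 1 BY 1 UNTIL WS-POS > 6
               DISPLAY 'COL>' WS-COL (WS-POS) '<'
           END-PERFORM
           GOBACK.
```

The quote switch simply flips at every ", so the sketch assumes the quotes are balanced and never doubled inside a field. The quotes themselves stay in the extracted columns.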
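Code:
/* Split src on commas, ignoring commas inside double quotes */
src = "11/21/2012,,3.125%,""$172,000.00"",$0.00,""$172,000.00"""
len = length(src)
str = 0                      /* 1 while inside a quoted string */
j = 1                        /* number of the current field    */
fld. = ""
do i = 1 to len
   c = substr(src,i,1)
   if c = "," then do
      if str then ,
         fld.j = fld.j || c  /* comma is data: keep it         */
      else ,
         j = j + 1           /* comma is a delimiter           */
      iterate
   end
   fld.j = fld.j || c
   if c = """" then ,
      str = \str             /* entering or leaving quotes     */
end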
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Here's another one, a bit rough'n'ready:
Code:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STUBA.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  W-WHEN-COMPILED PIC X(8)BX(8).
       01  W-NUMBER-OF-COLUMNS COMP PIC 9(4) VALUE 6.
       01  W-DATA-TO-UNSTRING PIC X(60).
       01  W-LENGTH-OF-LINE-DISP COMP PIC 9(4).
       01  W-LENGTH-OF-THIS-COLUMN COMP PIC 9(4).
       01  W-LENGTH-OF-STRING COMP PIC 9(4).
       01  W-MAX-THIS-COLUMN COMP PIC 9(4).
       01  W-COLUMN-DATA-TABLE.
           05  FILLER OCCURS 6 TIMES INDEXED BY WI-CDT-UNSTRING-INDEX.
               10  W-COLUMN-DATA PIC X(60).
       LINKAGE SECTION.
       01  L-WORK-LINE.
           05  L-WL-LINE-TEXT-DATA.
               10  L-WL-LINE-DISPLACEMENT.
                   15  FILLER OCCURS 0 TO 60 TIMES
                           DEPENDING ON
                               W-LENGTH-OF-LINE-DISP.
                       20  FILLER PIC X.
               10  L-WL-COLUMN.
                   15  FILLER OCCURS 0 TO 60 TIMES
                           DEPENDING ON
                               W-LENGTH-OF-THIS-COLUMN.
                       20  FILLER PIC X.
               10  L-NEXT-BYTE-TO-LOOK-AT PIC X.
                   88  L-NEXT-BYTE-IS-QUOTE VALUE '"'.
                   88  L-NEXT-BYTE-IS-COMMA VALUE ','.
                   88  L-NEXT-BYTE-IS-SPACE VALUE SPACE.
       PROCEDURE DIVISION.
           MOVE WHEN-COMPILED TO W-WHEN-COMPILED
           DISPLAY "STUBA COMPILED ON " W-WHEN-COMPILED
           MOVE '11/21/2012,,3.125%,"$172,000.00",$0.00,"$172,000.00"'
             TO W-DATA-TO-UNSTRING
           PERFORM 05-DO-THE-WORK
           MOVE SPACE TO W-DATA-TO-UNSTRING
           PERFORM 05-DO-THE-WORK
           MOVE '11/21/2012,,3.125%,"$172,000.00",$0.00,$172000.00XXXXXX
      -    'XXXXX'
             TO W-DATA-TO-UNSTRING
           PERFORM 05-DO-THE-WORK
           MOVE '",,,,,,,,",,3.125%,"$172,000.00",$0.00,"$172,000.00X"'
             TO W-DATA-TO-UNSTRING
           PERFORM 05-DO-THE-WORK
           MOVE ',",,,,,,,",3.125%,"$172,000.00",$0.00,,'
             TO W-DATA-TO-UNSTRING
           PERFORM 05-DO-THE-WORK
           GOBACK
           .
       05-DO-THE-WORK.
           DISPLAY W-DATA-TO-UNSTRING
           SET ADDRESS OF L-WORK-LINE TO ADDRESS OF
               W-DATA-TO-UNSTRING
           PERFORM 10-FIND-LENGTH-OF-STRING
           SET WI-CDT-UNSTRING-INDEX TO 1
           MOVE ZERO TO W-LENGTH-OF-LINE-DISP
                        W-LENGTH-OF-THIS-COLUMN
           PERFORM 20-EXTRACT-COLUMNS
               W-NUMBER-OF-COLUMNS TIMES
           DISPLAY "COL 1>" W-COLUMN-DATA ( 1 ) "<"
           DISPLAY "COL 2>" W-COLUMN-DATA ( 2 ) "<"
           DISPLAY "COL 3>" W-COLUMN-DATA ( 3 ) "<"
           DISPLAY "COL 4>" W-COLUMN-DATA ( 4 ) "<"
           DISPLAY "COL 5>" W-COLUMN-DATA ( 5 ) "<"
           DISPLAY "COL 6>" W-COLUMN-DATA ( 6 ) "<"
           .
       10-FIND-LENGTH-OF-STRING.
           MOVE LENGTH OF W-DATA-TO-UNSTRING
             TO W-LENGTH-OF-STRING
                W-LENGTH-OF-LINE-DISP
           IF W-DATA-TO-UNSTRING EQUAL TO SPACE
               MOVE ZERO TO W-LENGTH-OF-STRING
           ELSE
               SUBTRACT 1 FROM W-LENGTH-OF-LINE-DISP
               PERFORM 10A-WIND-BACK
                   UNTIL NOT L-NEXT-BYTE-IS-SPACE
               COMPUTE W-LENGTH-OF-STRING =
                       W-LENGTH-OF-LINE-DISP
                     + 1
           END-IF
           DISPLAY "LENGTH >" W-LENGTH-OF-STRING "<"
           .
       10A-WIND-BACK.
           SUBTRACT 1 FROM W-LENGTH-OF-LINE-DISP
           .
       20-EXTRACT-COLUMNS.
           IF L-NEXT-BYTE-IS-QUOTE
               PERFORM 20A-FIND-NEXT-QUOTE
               PERFORM 20D-FIND-NEXT-COMMA
           ELSE
               PERFORM 20D-FIND-NEXT-COMMA
           END-IF
           MOVE L-WL-COLUMN TO W-COLUMN-DATA
                               ( WI-CDT-UNSTRING-INDEX )
           COMPUTE W-LENGTH-OF-LINE-DISP
                 = W-LENGTH-OF-LINE-DISP
                 + W-LENGTH-OF-THIS-COLUMN
                 + 1
           MOVE ZERO TO W-LENGTH-OF-THIS-COLUMN
           SET WI-CDT-UNSTRING-INDEX UP BY 1
           .
       20A-FIND-NEXT-QUOTE.
           ADD 1 TO W-LENGTH-OF-THIS-COLUMN
           PERFORM 99A-WIND-ALONG
               UNTIL L-NEXT-BYTE-IS-QUOTE
           .
       20D-FIND-NEXT-COMMA.
           COMPUTE W-MAX-THIS-COLUMN = W-LENGTH-OF-STRING
                                     - W-LENGTH-OF-LINE-DISP
           PERFORM 99A-WIND-ALONG
               UNTIL ( W-LENGTH-OF-THIS-COLUMN
                       EQUAL TO W-MAX-THIS-COLUMN )
                  OR L-NEXT-BYTE-IS-COMMA
           .
       99A-WIND-ALONG.
           ADD 1 TO W-LENGTH-OF-THIS-COLUMN
           .
Output is:
Code:
STUBA COMPILED ON 11/29/12 01.56.03
11/21/2012,,3.125%,"$172,000.00",$0.00,"$172,000.00"
LENGTH >0052<
COL 1>11/21/2012 <
COL 2> <
COL 3>3.125% <
COL 4>"$172,000.00" <
COL 5>$0.00 <
COL 6>"$172,000.00" <
LENGTH >0000<
COL 1> <
COL 2> <
COL 3> <
COL 4> <
COL 5> <
COL 6> <
11/21/2012,,3.125%,"$172,000.00",$0.00,$172000.00XXXXXXXXXXX
LENGTH >0060<
COL 1>11/21/2012 <
COL 2> <
COL 3>3.125% <
COL 4>"$172,000.00" <
COL 5>$0.00 <
COL 6>$172000.00XXXXXXXXXXX <
",,,,,,,,",,3.125%,"$172,000.00",$0.00,"$172,000.00X"
LENGTH >0053<
COL 1>",,,,,,,," <
COL 2> <
COL 3>3.125% <
COL 4>"$172,000.00" <
COL 5>$0.00 <
COL 6>"$172,000.00X" <
,",,,,,,,",3.125%,"$172,000.00",$0.00,,
LENGTH >0039<
COL 1> <
COL 2>",,,,,,," <
COL 3>3.125% <
COL 4>"$172,000.00" <
COL 5>$0.00 <
COL 6> <
Yes. Take the manual and beat the producer of the data until heesh agrees to change the delimiter. Otherwise, you have four solutions (my pseudo-code, Dr. Sorichetti's Rexx script, and Sr. Woodger's and Sri Sai's COBOL programs); it is useless to try to put off having to actually do some work.
Can't you change the delimiters to something else, unless that would have an impact elsewhere?
No, actually. We get the feed from another application, which dumps it on our server as a CSV file in the format I mentioned. I am now writing a program to validate the data by doing an UNSTRING with "," as the delimiter. Hence the problem.
The upstream process is not owned by us, so their layouts are fixed as per their requirements. As the receiver, we have to cope with those layouts.
I have used the logic below to convert every ',' outside the '"' pairs to ';'.
Code:
       0000-MAINLINE.
           MOVE '11/11/2012,$0.00,"$1,111.14"' TO WS-TEST-DATA
           MOVE SPACES TO WS-TEMP-DATA
      * Find the data length by counting the trailing spaces.
           MOVE ZERO TO WS-SPACE-COUNT
           INSPECT FUNCTION REVERSE(WS-TEST-DATA)
               TALLYING WS-SPACE-COUNT
               FOR LEADING SPACES
           COMPUTE WS-DATA-LEN =
               LENGTH OF WS-TEST-DATA - WS-SPACE-COUNT
           MOVE 1 TO WS-START-POSN
           MOVE 1 TO WS-POSN
           MOVE ZERO TO WS-PROCESS-CNT
           PERFORM UNTIL WS-START-POSN > WS-DATA-LEN
      * On an opening quote, copy everything up to (not including)
      * the closing quote, so the commas in between stay as they are.
               IF WS-TEST-DATA(WS-START-POSN:1) = '"'
                   MOVE WS-TEST-DATA(WS-START-POSN:1) TO
                        WS-TEMP-DATA(WS-POSN:1)
                   ADD 1 TO WS-POSN
                   ADD 1 TO WS-START-POSN
                   SET START-QUOTE TO TRUE
      * The length test stops the scan if a quote is never closed.
                   PERFORM UNTIL END-QUOTE
                              OR WS-START-POSN > WS-DATA-LEN
                       IF WS-TEST-DATA(WS-START-POSN:1) = '"'
                           SET END-QUOTE TO TRUE
                       ELSE
                           MOVE WS-TEST-DATA(WS-START-POSN:1) TO
                                WS-TEMP-DATA(WS-POSN:1)
                           ADD 1 TO WS-POSN
                           ADD 1 TO WS-START-POSN
                       END-IF
                   END-PERFORM
               END-IF
      * Outside quotes: a comma becomes ';', anything else is copied.
               IF WS-START-POSN NOT > WS-DATA-LEN
                   IF WS-TEST-DATA(WS-START-POSN:1) NOT = ','
                       MOVE WS-TEST-DATA(WS-START-POSN:1) TO
                            WS-TEMP-DATA(WS-POSN:1)
                   ELSE
                       MOVE ';' TO WS-TEMP-DATA(WS-POSN:1)
                   END-IF
                   ADD 1 TO WS-POSN
                   ADD 1 TO WS-START-POSN
               END-IF
           END-PERFORM
           DISPLAY 'WS-TEMP-DATA:' WS-TEMP-DATA
           DISPLAY 'WS-TEST-DATA:' WS-TEST-DATA
           STOP RUN.
       0000-EXIT.
           EXIT.
I tested it and it is working fine; I would also ask you to verify it.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
It was suggested to me that I should have done a solution for the specific, rather than the general. He's got a big smile on now, I bet :-)
Once the first " is encountered, there are no "lonely" commas left to convert. So: count the characters up to the first ", and change the "lonely" commas (those that can be treated as single delimiters) up to that point to @. From that point on, change ", and ," to "@ and @" respectively. Then UNSTRING on @.
STUBB COMPILED ON 12/01/12 11.04.08
11/21/2012,,3.125%,"$172,000.00",$0.00,"$172,000.00"
COL 1>11/21/2012 <
COL 2> <
COL 3>3.125% <
COL 4>"$172,000.00" <
COL 5>$0.00 <
COL 6>"$172,000.00" <
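The source of STUBB was not posted. The following is only a guess at how the "specific" approach described above might look, using INSPECT with BEFORE/AFTER INITIAL instead of counting characters by hand. The program name, the data names and the 20-byte column width are assumptions, and '@' is assumed never to occur in the data.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STUBBSK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-LINE             PIC X(60).
       01  WS-SUB              PIC 9(4) COMP.
       01  WS-COLUMNS.
           05  WS-COL          PIC X(20) OCCURS 6 TIMES.
       PROCEDURE DIVISION.
           MOVE '11/21/2012,,3.125%,"$172,000.00",$0.00,"$172,000.00"'
             TO WS-LINE
      * Up to the first quote every comma is a delimiter.
           INSPECT WS-LINE REPLACING ALL ',' BY '@'
               BEFORE INITIAL '"'
      * After the first quote only a comma touching a quote is
      * treated as a delimiter (true for this layout, not in general).
           INSPECT WS-LINE REPLACING
               ALL '",' BY '"@' AFTER INITIAL '"'
               ALL ',"' BY '@"' AFTER INITIAL '"'
           MOVE SPACES TO WS-COLUMNS
           UNSTRING WS-LINE DELIMITED BY '@'
               INTO WS-COL (1) WS-COL (2) WS-COL (3)
                    WS-COL (4) WS-COL (5) WS-COL (6)
           END-UNSTRING
           PERFORM VARYING WS-SUB FROM 1 BY 1 UNTIL WS-SUB > 6
               DISPLAY 'COL>' WS-COL (WS-SUB) '<'
           END-PERFORM
           GOBACK.
```

This relies on the layout in question: after the first quoted column, every delimiter sits next to a quote. Two unquoted columns side by side after that point (such as $0.00,$172000.00), or a quoted value that itself ends in a comma, would be split wrongly, which is why the general solutions earlier in the thread scan byte by byte.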