You can't process these files "as you would any other"; you may have to process them as variable blocks. For example, if a field is "empty" it's sent as ",,", so one record might be 123.45,,68.98 (13 chars) and the next 1367.45,12.34,12.78 (19 chars).
Using UNSTRING can also be problematic:
If a field contains commas, the sending app encloses it in quotes, e.g.
"100,234.45",123.45
If a field contains quotes, I think they double up on the quote, e.g. """Wild"" Bill Donovan", "12,976.00"
Of course, if your needs don't require these kinds of complexities you can simplify the code, but you may be at risk.
My input data is from an oracle database that I manually pull using PS/SQL and then export it to a CSV file:
Code:
Command ===> Scroll ===> CSR
********************************* Top of Data **********************************
"ID","NAME","ADDR_1","ADDR_2","CITY","STATE","ZIP_CD","MAIL_ADDR_1","MAIL_ADDR_2
"0123456","SMITH","123 WASHINGTON AVE"," ","SEATTLE ","WA","012340000"," "," ","
"2345678","JOHNSON","700 APPLE RD"," ","PITTSBURGH","PA","152432040"," "," "," "
"9012345","LIVINGSTON","1 PACIFIC DR"," ","TYRONE","PA","166861898"," "," "," ",
"8765432","APPLEGATE","130 N APPLE ST"," ","UNION CITY","PA","164381094"," "," "
******************************** Bottom of Data ********************************
When I use UNSTRING delimited by ',', I get this:
Code:
COMMAND ===> SCROLL ===> CSR
********************************* Top of Data **********************************
"ID" "NAME" "ADDR_1" "AD
"012345"SMITH" "123 WASHINGTON AVE" " "
"234567"JOHNSON" "700 APPLE RD" " "
"901234"LIVINGSTON" "1 PACIFIC DR" " "
"876543"APPLEGATE" "130 N APPLE ST" " "
******************************** Bottom of Data ********************************
Is there any way to not get the "? Can DFSORT delete the " before I use it in my program? Thanks.
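Since every field in your sample record is quoted, one workaround is to UNSTRING on the three-character delimiter '","' and then blank the stray quote left on the first and last fields. This is only a minimal sketch, not your actual program; the field names, lengths, and sample record are made up:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVSPLIT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-CSV-REC  PIC X(60)
           VALUE '"0123456","SMITH","123 WASHINGTON AVE"'.
       01  WS-ID       PIC X(10).
       01  WS-NAME     PIC X(15).
       01  WS-ADDR     PIC X(25).
       PROCEDURE DIVISION.
      *    Split on the 3-character delimiter between quoted fields.
      *    Works only when every field is quoted and no field
      *    contains an embedded '","' sequence.
           UNSTRING WS-CSV-REC DELIMITED BY '","'
               INTO WS-ID WS-NAME WS-ADDR
           END-UNSTRING
      *    The first and last fields keep one stray quote; blank it.
           INSPECT WS-ID   REPLACING ALL '"' BY SPACE
           INSPECT WS-ADDR REPLACING ALL '"' BY SPACE
           DISPLAY 'ID=' WS-ID ' NAME=' WS-NAME ' ADDR=' WS-ADDR
           STOP RUN.
```

Note the first field comes out with a leading space where its opening quote was, so you may still want to left-justify the result.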
All the literals came with double quotes ("). I have tried UNSTRING delimited by ',' or '"', but that didn't work because I would get extra spaces.
One way a co-worker suggested seemed to work: read the input CSV record and PERFORM VARYING from 1 by 1 until the end of the record length, rewriting each character except the ". This shifts the data left.
Then UNSTRING the new record delimited by ','. For now this works.
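The character-by-character approach described above might look something like this; a rough sketch assuming a 60-byte record, with invented field names and lengths:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. STRIPQTE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-IN-REC   PIC X(60)
           VALUE '"0123456","SMITH","123 WASHINGTON AVE"'.
       01  WS-OUT-REC  PIC X(60) VALUE SPACES.
       01  WS-I        PIC 9(4) COMP.
       01  WS-J        PIC 9(4) COMP.
       01  WS-ID       PIC X(10).
       01  WS-NAME     PIC X(15).
       01  WS-ADDR     PIC X(25).
       PROCEDURE DIVISION.
      *    Copy the record one character at a time, skipping each
      *    double quote, so the remaining text shifts left.
           MOVE 1 TO WS-J
           PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 60
               IF WS-IN-REC(WS-I:1) NOT = '"'
                   MOVE WS-IN-REC(WS-I:1) TO WS-OUT-REC(WS-J:1)
                   ADD 1 TO WS-J
               END-IF
           END-PERFORM
      *    With the quotes gone, a plain comma split works.
           UNSTRING WS-OUT-REC DELIMITED BY ','
               INTO WS-ID WS-NAME WS-ADDR
           END-UNSTRING
           DISPLAY WS-ID '/' WS-NAME '/' WS-ADDR
           STOP RUN.
```

Like any quote-stripping scheme, this breaks if a field legitimately contains a comma or a quote.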
Newbie,
Jack is correct, there are many circumstances that need to be considered when using UNSTRING and it will not always give you the desired results. I should have spent more time going over these. Thank you Jack for picking this up.
One of the things that I do is to FTP to a fixed-length file with LRECL = max length. This way I don't have to play with variable-length files. The downside of this is that you don't know 100% how long the actual data was.
The method you chose seems to be fine, as long as there will NEVER be an embedded '"' or ',' in the fields. One way to get by part of the problem is to FTP delimited by 'tab'.
Good luck, And thanks again Jack for picking this up.
All good points to think about. Looping through each character is a tried and true method but can be tedious and tricky to read. I use INSPECT to remove the double quotes.
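A minimal sketch of the INSPECT-then-UNSTRING idea, assuming no field ever contains an embedded quote or comma (the names, lengths, and sample record are invented):

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CSVINSP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-REC      PIC X(60)
           VALUE '"2345678","JOHNSON","700 APPLE RD"'.
       01  WS-ID       PIC X(10).
       01  WS-NAME     PIC X(15).
       01  WS-ADDR     PIC X(25).
       PROCEDURE DIVISION.
      *    Blank every double quote in place (INSPECT REPLACING
      *    keeps the length the same), then split on the commas.
           INSPECT WS-REC REPLACING ALL '"' BY SPACE
           UNSTRING WS-REC DELIMITED BY ','
               INTO WS-ID WS-NAME WS-ADDR
           END-UNSTRING
           DISPLAY WS-ID '/' WS-NAME '/' WS-ADDR
           STOP RUN.
```

Each field keeps a blank where its quotes used to be, so a left-justify or trim step may still be needed afterward.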
I created a generic COBOL program specifically for the purpose of reading in .csv files. Basically it takes a parm that tells how long you want each of the output fields to be and then does things like get rid of the double quotes (you can usually do much of this in the program that creates the CSV, too). Then it UNSTRINGs and puts the results into fields of the length you specified on the parm. It's set up so you can use either a parm or a control card, in case there are too many fields to fit on the parm.
I think I have a version that uses a small assembler program so you can specify the LRECL of the input and output files for flexibility, and a version that uses fixed-width LRECLs.
I like using this method because the only thing I have to change to use this on a different file is the parm indicating the lengths of the fields.