IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

How to delimit the CSV file that may contain double quotes


IBM Mainframe Forums -> JCL & VSAM
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
Aham

New User


Joined: 24 Oct 2007
Posts: 42
Location: chennai

PostPosted: Thu Nov 29, 2007 5:29 pm
Reply with quote

I have a dataset that may contain double quotes as well as comma anywhere along the length of record. when I delimit the fields(fields embedded in double quote ) with comma i have a problem when there is double quote as part of the field.

e.g Input fields of first record: murali calls "aham" .
my first field is 'murali' and second field is 'calls "aham"'. So when my sort card is something like this...
OUTREC=(1:C'"',
2:1,6
8:C'"',
9:C',',
10:C'"',
11:8,12
23:c'"')
When the Output is sent to lotus notes as .CSV It identies aham a seperate field. Please advice
Back to top
View user's profile Send private message
Craq Giegerich

Senior Member


Joined: 19 May 2007
Posts: 1512
Location: Virginia, USA

PostPosted: Thu Nov 29, 2007 6:13 pm
Reply with quote

Use the TAB character x'05' as the delimiter instead of the comma.
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Thu Nov 29, 2007 9:53 pm
Reply with quote

It's not clear what your input records look like, how the fields are delimited, or what you want the output records to look like.

Quote:
e.g Input fields of first record: murali calls "aham" .
my first field is 'murali' and second field is 'calls "aham"'


I don't see a comma delimiter here. I see three "words" here, each delimited by a space:
murali
calls
"aham"

but you say the first field is murali and the second field is calls "aham". What is the rule that tells you the first word is one field, and the second and third words are one field?

Please show a better example of what your input records can look like (more records, all significant variations) and what you'd expect for output. And explain the rules for parsing out the fields from the input record.

Also, what is the RECFM and LRECL of the input file and output file?
Back to top
View user's profile Send private message
Aham

New User


Joined: 24 Oct 2007
Posts: 42
Location: chennai

PostPosted: Fri Nov 30, 2007 7:05 pm
Reply with quote

Sorry for the inconvenience.. I have shown sample input records. The fields are of fixed width. Also the sort card currently used is shown in the attached document. My problem is when I send the CSV file using IEBGENER to Lotus Notes the double quotes present in the record creates confusion.
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Fri Nov 30, 2007 10:40 pm
Reply with quote

Well, you didn't explain this very well. It appears you have fixed fields that you're trying to format as CSV fields for Lotus Notes. Your job would create CSV fields like this:

Code:

"MURALI","CALLS "AHAM"","NO:7,WEST STREET   ","ADELAIDE            "
"RAM   ","CALLS "RAM" ","NO:1,FIRST LANE    ","RIO D' JANERIO      "
"LAKAN ","ALIAS NONE  ","NO ADDRESS         ","CITY "NONE"         "


But Lotus Notes doesn't handle the CSV file in that form the way you want it to because of the quotes within quotes. So the real question is how the field SHOULD be formatted so that Lotus Notes will treat each comma separated value as a separate field, e.g. CALLS "AHAM" must be treated as one field.

I don't know what Lotus Notes can handle. Would using apostrophes around the fields rather than quotes work? (You could try that - see below.)

Anyone know?

BTW, I'd suggest simplifying your control statements as follows:

Code:

...
//SYMNAMES DD * 
QT,C'"'         
BQT,C'","'       
/*
...
//SYSIN DD *
  SORT FIELDS=COPY                                       
  OUTREC BUILD=(QT,1,6,BQT,7,12,BQT,19,19,BQT,38,20,QT)   


If you want to try using apostrophes around the fields instead of quotes, just change the SYMNAMES statements to:

Code:

QT,C''''         
BQT,C''','''     


Then the output will be:

Code:

'MURALI','CALLS "AHAM"','NO:7,WEST STREET   ','ADELAIDE            '
'RAM   ','CALLS "RAM" ','NO:1,FIRST LANE    ','RIO D' JANERIO      '
'LAKAN ','ALIAS NONE  ','NO ADDRESS         ','CITY "NONE"         '
Back to top
View user's profile Send private message
Aham

New User


Joined: 24 Oct 2007
Posts: 42
Location: chennai

PostPosted: Fri Nov 30, 2007 11:43 pm
Reply with quote

Thanks you for the reply frank!!! The simplified code that you have showed here looks crystal clear and easy to understand. BTW the problem in using the apostrophie is that the field values may have them too.. As shown in the example 2 of my input record - Rio d' janerio my input record will contain apostrophes also icon_sad.gif Not only that they contain semicolon (;) colon(icon_smile.gif comma(,) slash(/) hash(#) and all the possible symbols. I learnt that when the quotes is doubled it is accepted in output in CSV file. For example
Code:

"MURALI","CALLS ""AHAM""","NO:7,WEST STREET   ","ADELAIDE


Now CALLS "AHAM" will be treated as a single field in the CSV file. Now my question is do we have utility that will insert character ? That is If I use IFTHEN=WHEN... OVERLAY it will simply overlay the character with the given character. Can we add extra character using IFTHEN=WHEN?
Back to top
View user's profile Send private message
Craq Giegerich

Senior Member


Joined: 19 May 2007
Posts: 1512
Location: Virginia, USA

PostPosted: Fri Nov 30, 2007 11:51 pm
Reply with quote

Just to repeat myself, try using TAB x'05' instead of the comma and don't use any delimiter.
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Sat Dec 01, 2007 12:00 am
Reply with quote

It's not as simple as "adding an extra character". What you really need to do is find each quote anywhere in a field and replace it with two quotes. . DFSORT does not have that kind of find and replace function.

We might be able to do something with DFSORT's PARSE function. But it depends on what can be in each field. If the fields of interest can only have string and "string" in various combinations, e.g.

string string string
string "string" string
"string" string "string"
...

then I think we could work with that. But you'd have to tell me which of the four input fields can have "string" in them and the maximum number of words each such field could have.
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

PostPosted: Sat Dec 01, 2007 12:09 am
Reply with quote

Quote:
Just to repeat myself, try using TAB x'05' instead of the comma and don't use any delimiter.


So that would be:

Code:

...
//SYMNAMES DD *   
TAB,X'05'         
/*
...
//SYSIN DD *
  SORT FIELDS=COPY                                       
  OUTREC BUILD=(1,6,TAB,7,12,TAB,19,19,TAB,38,20)   
/*


and the output would be (@ = tab):

Code:

MURALI@CALLS "AHAM"@NO:7,WEST STREET   @ADELAIDE         
RAM   @CALLS "RAM" @NO:1,FIRST LANE    @RIO D' JANERIO   
LAKAN @ALIAS NONE  @NO ADDRESS         @CITY "NONE"     


Is that what you mean?

That would certainly be an easy way to do it if it works.

Aham - have you tried this?
Back to top
View user's profile Send private message
Aham

New User


Joined: 24 Oct 2007
Posts: 42
Location: chennai

PostPosted: Fri Dec 07, 2007 4:01 pm
Reply with quote

Quote:
Just to repeat myself, try using TAB x'05' instead of the comma and don't use any delimiter.


When I used tab there is no improvement when exported to spreadsheet. Also the tab is not recognised as a delimiter for CSV files in spreadsheet.
You have any idea what are the delimiters recognised for CSV files in a spreadsheet?
Back to top
View user's profile Send private message
agkshirsagar

Active Member


Joined: 27 Feb 2007
Posts: 691
Location: Earth

PostPosted: Fri Dec 07, 2007 4:17 pm
Reply with quote

Tab delimited files are stored as .txt, not .csv. You can open them in excel/wordpad/notepad/editplus/****editor..

Quote:
You have any idea what are the delimiters recognised for CSV files in a spreadsheet


CSV stands for comma separated values. Self explanatory isn't it.
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> JCL & VSAM

 


Similar Topics
Topic Forum Replies
No new posts Compare 2 files and retrive records f... DFSORT/ICETOOL 0
No new posts FTP VB File from Mainframe retaining ... JCL & VSAM 8
No new posts Extract the file name from another fi... DFSORT/ICETOOL 6
No new posts How to split large record length file... DFSORT/ICETOOL 10
No new posts Extracting Variable decimal numbers f... DFSORT/ICETOOL 17
Search our Forums:

Back to Top