Query on Splitting up a file

abdulrafi · Posted: Wed Apr 20, 2016 3:26 pm

Hi,

I have a CSV file as shown below.

I have to import it to the mainframe and place every value in its respective field.

For example, here I have "AL,AR,...,TX". You can see 10 occurrences inside the quotes. After those quotes there is another set, like "ED,EE". At times I may receive only 5 occurrences, but the maximum is 16. If I get only 5 occurrences, it is difficult for me to tell whether the next comma belongs to the same variable or to a different one. I used DELIMITED BY comma, but the values went to different fields.

Could you please help me with the logic to get the values populated into their respective fields? If there are 5 occurrences, the remaining 11 should be spaces, since the maximum is 16.

The attachments were deleted;
repost the data using the code tags.

Bill Woodger · Posted: Wed Apr 20, 2016 3:35 pm

Post the code you are using, including the definitions of all relevant fields.

Post sample data and expected output and the output you get.

All of these inside the Code tags.

Rohit Umarjikar · Posted: Wed Apr 20, 2016 5:34 pm

Please follow Bill's advice to get more options.
My understanding is as follows. I try to avoid reference modification, so see if this is what you want.
Step 1: use INSPECT and replace ',"' by '|'
Step 2: use another INSPECT and replace '"' by ''
Step 3: UNSTRING delimited by '|' into your 10 fields, and use a pointer advanced by 1 to get data into the next occurrence.
Step 4: move each occurrence into a separate variable as you want.

abdulrafi · Posted: Wed Apr 20, 2016 6:06 pm

As it is an image given in a requirement document, I am unable to post it using the code tags. I have just started to code, so I was checking whether DELIMITED BY or the INSPECT verb could segregate the values.

Please help.

Rohit Umarjikar · Posted: Wed Apr 20, 2016 6:11 pm

My internet was down for some time, so I could not edit the earlier post.
Step 3: you get the values in 1 variable instead of 10.
Step 5: After that, if you still want AL, AR and so on separately, perform another UNSTRING delimited by ',' into 10 different variables.
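
The steps from the two posts above can be sketched as follows. This is only an illustration of the logic in Python, not COBOL syntax. The function name, the sample record and the two-space filler are assumptions made for the example, and it assumes the quoted groups sit at the end of the record, as in the original question.

```python
MAX_OCCURS = 16  # maximum number of occurrences stated in the question


def split_record(record):
    """Illustrative helper (hypothetical name) following steps 1 to 5."""
    # Steps 1 and 2: mark the start of each quoted group with '|',
    # then drop the remaining quote characters.
    marked = record.replace(',"', '|').replace('"', '')
    # Step 3: split on '|'; each quoted group arrives as ONE value.
    head, *groups = marked.split('|')
    plain_fields = head.split(',')
    padded_groups = []
    for group in groups:
        # Step 5: split the group itself on ','.
        values = group.split(',')
        # Pad with space-filled entries up to the maximum of 16.
        values += ['  '] * (MAX_OCCURS - len(values))
        padded_groups.append(values)
    return plain_fields, padded_groups


# Sample record invented for this sketch.
plain, groups = split_record('N,88320,"AL,AR,TX","ED,EE"')
print(plain)          # the unquoted fields
print(groups[0][:4])  # first entries of the first group, then filler
```

Every group always comes back with 16 entries, so the target table can be filled by position without checking how many values were actually received.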

Rohit Umarjikar · Posted: Wed Apr 20, 2016 6:22 pm

What more help do you need? Did you try what I suggested?
You can show us the sample input and the output you want, using the code tags. You don't have to post the original requirement to us, or tell us what you mean by splitting the file in the above context, as you may get a DFSORT solution as well.

enrico-sorichetti · Posted: Wed Apr 20, 2016 6:43 pm

abdulrafi · Posted: Thu Apr 21, 2016 11:10 am

Thanks Rohit. I shall try it and see.

Sorry Enrico if I wasted much of your time.

abdulrafi · Posted: Fri Apr 22, 2016 11:35 am

Bill Woodger · Posted: Fri Apr 22, 2016 11:56 am

Your output does not match what you say you have done. INSPECT ... REPLACING ... can change character sequences (not just single bytes); they just have to be the same length. You have "lost" some commas, so it looks like you've discovered that already.

Abid Hasan · New User Joined: 25 Mar 2013 Posts: 88 Location: India

Hello,

abdulrafi · Posted: Fri Apr 22, 2016 12:13 pm

I again changed it as below in the code and hence I got that output,

abdulrafi · Posted: Fri Apr 22, 2016 2:19 pm

Hi,

I need help with one more thing. I have written and executed the code, and I get the expected output except for one field.

In my file I have values such as "ED, EE". When I delimit and pass the values to PIC X(02) variables, one holds ED and the other holds E, because of the space after the comma.

Is there any way I can remove the spaces and make the value fit in PIC X(02)?

abdulrafi · Posted: Fri Apr 22, 2016 2:35 pm

Output :
ws-var =ED
ws-var1=E

Expected output:
ws-var=ED
ws-var1=EE

Bill Woodger · Posted: Fri Apr 22, 2016 3:02 pm

"ED, EE" is data, not two fields. It is data containing a comma, which is why whatever created the CSV in the first place put quotes around it.

You can treat it as two fields. However, you need to be very sure of the source of the data. If it is user input, then it could just as easily come as "ED,EE", even if that is by typo.

If it is fixed-format and always will have a leading space you could define a three-byte field but only use the last two to pick up the data, or you could use a second delimiter of ", ". Which may be better depends on your code.
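
Both options can be illustrated with a short Python sketch. It shows the logic only, not COBOL syntax, and the function names are made up for the example. Trying ", " before the bare "," plays the role of the second delimiter, which also covers data that arrives as "ED,EE".

```python
import re


def split_pair(data):
    """Split on ', ' when present, otherwise on a bare ','."""
    # The two-character delimiter is listed first, so an optional space
    # after the comma never ends up at the start of the next field.
    return re.split(r", |,", data)


def last_two(field3):
    """Three-byte field option: use only the last two bytes."""
    return field3[-2:]


print(split_pair("ED, EE"))
print(split_pair("ED,EE"))
print(last_two(" EE"))
```

The three-byte option only works if the leading space is always present; the two-delimiter option tolerates both forms.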

abdulrafi · Posted: Fri Apr 22, 2016 3:29 pm

Yes Bill. I tried moving the 2-byte field to a 3-byte field.
Then I check whether the first byte is a space. If it is, I move only the 2nd and 3rd bytes. If the first byte is not a space, I move only the 1st and 2nd bytes.

It is certain that the value will have only 2 bytes and not more than that.

I felt my logic was quite ugly, so I wondered whether I could use the INSPECT verb instead to strip the spaces and move only the bytes that have values.

Bill Woodger · Posted: Fri Apr 22, 2016 3:51 pm

Because the original and new values for the INSPECT must be the same size, you can't make your data "shorter" or "longer" than it is with INSPECT.

Rohit Umarjikar · Posted: Fri Apr 22, 2016 4:51 pm

You can still do it with INSPECT. Try doing a reverse of the field and a tally for leading spaces, and then use that value + 1 as the starting offset for your reference modification.
Do you have DB2? If yes, then instead of all this you can do everything in one shot using the REPLACE function.
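
The tally-and-offset idea can be sketched like this. It is a Python illustration of the logic only; the function name and the default width of 2 are assumptions for the example. The sketch counts the leading spaces directly and does not reverse the field.

```python
def take_after_leading_spaces(field, width=2):
    """Count leading spaces, then take `width` bytes from that offset."""
    # Tally of leading spaces (what INSPECT ... TALLYING ... FOR LEADING
    # SPACE would count).
    leading = len(field) - len(field.lstrip(' '))
    # Python offsets start at 0; COBOL reference modification starts
    # at 1, which is where the "+ 1" in the post comes from.
    start = leading
    # Pad with spaces so the result always fills the target field.
    return field[start:start + width].ljust(width)


print(repr(take_after_leading_spaces(' EE')))
print(repr(take_after_leading_spaces('ED ')))
```

A field that contains only spaces comes back as two spaces, so the target never receives a short value.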

abdulrafi · Posted: Tue May 03, 2016 4:39 pm

I need help with one more thing. I have the input records as below.

Current input file,

Rohit Umarjikar · Posted: Tue May 03, 2016 10:02 pm

1. Scan the input field byte by byte; as soon as you hit a '"', replace each ',' by ';' until you hit the closing '"', and continue this until the end of the string.
2. Do an INSPECT and replace ',' by '|':
N|88320|00254XMT03GUIDE|6U|6156|1/1/2001|GLOVEBOX XM C
ANNEL GUIDE|2016||||O|"ED; EE"'.
3. UNSTRING delimited by '|'.
4. Now you will have the "ED; EE"' value in one of the variables.
5. Work on this now: use UNSTRING delimited by ';' into 10 variables.
6. You will get "ED in one of the variables from step 5; now remove the '"' by any method you want (e.g. a byte-by-byte check to get rid of '"').
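
The six steps above can be sketched as follows, in Python as an illustration of the logic only, not COBOL syntax. The function names and the shortened sample record are assumptions for the example. Stripping the space after ';' is an extra not listed in the steps, added because of the "ED, EE" issue raised earlier in the thread.

```python
def protect_quoted_commas(record):
    """Step 1: turn ',' into ';' only while inside double quotes."""
    out = []
    in_quotes = False
    for ch in record:
        if ch == '"':
            in_quotes = not in_quotes  # toggle at each quote character
        elif ch == ',' and in_quotes:
            ch = ';'
        out.append(ch)
    return ''.join(out)


def parse_record(record):
    protected = protect_quoted_commas(record)   # step 1
    piped = protected.replace(',', '|')         # step 2
    fields = piped.split('|')                   # step 3
    result = []
    for field in fields:
        if field.startswith('"'):               # step 4: the quoted group
            inner = field.strip('"')            # step 6: drop the quotes
            # Step 5: split on ';' (plus the extra space stripping).
            result.append([v.strip() for v in inner.split(';')])
        else:
            result.append(field)
    return result


# Shortened record invented for this sketch.
print(parse_record('N,88320,2016,,O,"ED, EE"'))
```

Empty fields between consecutive commas are kept as empty strings, so the field positions stay aligned with the record layout.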