vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
Hi,
I have a dataset as below, assigned to DDNAME DDIN.
It has 3 records in it, but when I read it with Python it reads the dataset as one single line. How can I make the program treat NL as a new line?
Code: |
Command ===>
****** ********************************* Top of D
000001 THIS IS A FIRST LINE IN THE INPUT FILE
000002 THIS IS THE SECOND LINE IN THE INPUT FILE
000003 THIS IS THE THIRD LINE IN THE INPUT FILE |
The program:
Code: |
count = 0
reader = open("//DD:DDIN","r",encoding='cp037')
for line in reader:
    count += 1
reader.close()
print(f'Number of lines in the file is {count}') |
Current Output:
Code: |
Number of lines in the file is 1 |
Expected Output:
Code: |
Number of lines in the file is 3 |
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1319 Location: Bamberg, Germany
|
|
|
|
Shouldn't you add a readline or similar? |
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2589 Location: Silicon Valley
|
|
|
|
To debug, I think you should print each line. That is, I do not think it is reading every line. |
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
The program is reading the input PS file as one single line,
Quote: |
Shouldn't you add a readline or similar? |
I think readline is not required as this will do the reading
Code: |
for line in reader: |
For code:
Code: |
reader = open("//DD:DDIN","r",encoding='cp037')
#######
count = 0
for line in reader:
    count += 1
    print(line)
reader.close()
print(f'Number of lines in the file is {count}') |
The output is
Code: |
THIS IS A FIRST LINE IN THE INPUT FILE THIS IS THE SECOND LINE IN THE INPUT FILE THIS IS THE THIRD LINE IN THE INPUT FILE
Number of lines in the file is 1 |
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2589 Location: Silicon Valley
|
|
|
|
re: treat NL as new line?
z/OS data sets do not normally have NL characters at the end of the line. For FB datasets, the system knows how long each line is. For VB datasets, the length is in the first half word (?) of the line.
Turn HEX ON to verify that there is indeed a NL character at the end of each line. |
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
How do I tell Python that NL is the newline character? The Python documentation doesn't list NL; it only has CR, CRLF, and LF. |
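For reference, here is a minimal sketch of what the `newline` argument accepts. It runs against an empty in-memory stream, not a real DD, and the helper name `try_newline` is made up for illustration. Python's text layer only lets you choose among None, '', LF, CR and CRLF, so NL (which the cp037 codec decodes to U+0085) cannot be named as the line terminator this way.

```python
import io

def try_newline(value):
    """Return True if the text layer accepts `value` as its newline setting."""
    try:
        wrapper = io.TextIOWrapper(io.BytesIO(b""), encoding="cp037", newline=value)
        wrapper.close()
        return True
    except ValueError:
        # Raised for any string other than '', '\n', '\r' or '\r\n'.
        return False

# "\x85" is the Unicode NEL character, the decoded form of EBCDIC NL.
accepted = {repr(v): try_newline(v) for v in (None, "", "\n", "\r", "\r\n", "\x85")}
print(accepted)
```

The built-in `open()` applies the same check to its own `newline` parameter.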
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
Thank you Pedro and Joerg for looking at this,
Quote: |
Turn HEX ON to verify that there is indeed a NL character at the end of each line. |
There is no NL at the end of each line.
Quote: |
z/OS data sets do not normally have NL characters at the end of the line. For FB datasets, the system knows how long each line is. For VB datasets, the length is in the first half word (?) of the line. |
I don't know how to tell the program when a record ends |
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2589 Location: Silicon Valley
|
|
|
|
I do not have experience with python on z/OS, but likely the documentation is incomplete.
I think you need to experiment. Add x'00' (nul) to the end of the line and maybe x'25' (LF) to another. And also whatever hex chars your documentation says CR and CRLF are. |
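A quick way to see what those hex values turn into on the Python side, as a sketch using the standard `cp037` codec that the program already opens the file with. It only inspects the codec; it says nothing about which bytes the z/OS runtime actually delivers between records.

```python
# Decode individual EBCDIC control bytes with the cp037 codec and show
# the Unicode code point each one becomes.
candidates = {
    "NUL": b"\x00",
    "CR": b"\x0d",
    "NL": b"\x15",
    "LF": b"\x25",
}

decoded = {name: raw.decode("cp037") for name, raw in candidates.items()}

for name, char in decoded.items():
    raw = candidates[name]
    print(f"{name} x'{raw.hex().upper()}' -> U+{ord(char):04X}")
```

Under this codec x'25' becomes U+000A, which file iteration treats as a line ending, while x'15' becomes U+0085, which it does not.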
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1319 Location: Bamberg, Germany
|
|
|
|
@vasanthz: Can you please try the split() thing out? That would also cover @sergeyken's suggestion with the count of NL's. |
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
Thank you for the link, unfortunately we don't have zoautil installed currently. I was thinking of installing it, but it requires APF authorization which I currently don't have access to. I will ask my sysprogs. |
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2589 Location: Silicon Valley
|
|
|
|
re: "Can you please try the split() thing out?"
Vasanth has not described the data set attributes, but normally ISPF will not put \n characters at the end of each line.
For example, if it is FB80 with blksize 3120, when the access method writes out the data, it bunches up 39 logical records into a single physical record, without any \n separator characters. |
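The blocking arithmetic can be sketched as follows. The record text is made up and the block is built in memory, not read from a dataset, so this only illustrates the layout described above.

```python
LRECL = 80
BLKSIZE = 3120

# Number of fixed-length logical records that fit in one physical block.
records_per_block = BLKSIZE // LRECL

# Build one full block: each record is padded with blanks to LRECL bytes
# and the records are laid end to end, with no separator bytes in between.
block = b"".join(
    f"RECORD {n}".ljust(LRECL).encode("cp037") for n in range(records_per_block)
)

# Without separators, record boundaries can only be found by slicing
# the block every LRECL bytes.
records = [block[i:i + LRECL].decode("cp037") for i in range(0, len(block), LRECL)]
print(records_per_block, len(block), records[0].rstrip())
```

With the block size of 24000 mentioned later in the thread, the same division gives 300 records per block.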
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
Quote: |
For example, if it is FB80 with blksize 3120, when the access method writes out the data, it bunches up 39 logical records into a single physical record, without any \n separator characters. |
The dataset is PS, FB, LRECL 80, Block Size 24000.
I can try to put more data into the file and see if it splits it by blocks. Thank you |
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2589 Location: Silicon Valley
|
|
|
|
Check the NULLS setting in your ISPF editor profile. Turning it on may change how Python reads the data.
If I recall correctly, you have to edit each line for the nulls to be added at the end. |
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2589 Location: Silicon Valley
|
|
|
|
re: The dataset is PS, FB, LRECL 80, Block Size 24000.
Try blksize 80. It is inefficient storage wise, but if that is what it takes for your program to work.... |
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
Quote: |
Try blksize 80. It is inefficient storage wise, but if that is what it takes for your program to work.... |
I tried it, but it didn't help; the input file still gets read as a single record.
I guess, as Joerg mentioned, zoautil is essential for working with mainframe datasets |
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
Quote: |
Check the NULLS setting in your ISPF editor profile. Turning it on may change how Python reads the data. |
Created two files with NULLS ON and NULLS OFF profile setting, still no joy. I will try to print the file in hex and see what information it has |
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1319 Location: Bamberg, Germany
|
|
|
|
Was/is it possible to apply the split() operation to the one single read record?
It was just a sample, I don't think you need that zoautil util. Please see also the other link regarding GitHub for reading large amounts of data. |
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10882 Location: italy
|
|
|
|
FB files/datasets do not have any control chars embedded
FB files/datasets should/must be treated as binary files
something along the lines of
Code: |
file = open ("FB80file", "rb")
print(file.read(80))
file.close() |
up to you to assign the record read to a byte buffer and loop until end of file |
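One way to write that loop, as a sketch: `read_fixed_records` is a hypothetical helper, the LRECL of 80 and the cp037 code page are taken from earlier in the thread, and an in-memory `io.BytesIO` stands in for the dataset so the example runs anywhere. Whether a binary open of `//DD:DDIN` on z/OS hands back the raw record bytes exactly like this is an assumption to verify on the target system.

```python
import io

def read_fixed_records(stream, lrecl=80, encoding="cp037"):
    """Yield one decoded record for every lrecl bytes read from a binary stream."""
    while True:
        chunk = stream.read(lrecl)
        if not chunk:  # an empty bytes object signals end of file
            break
        # Decode the EBCDIC bytes and drop the trailing pad blanks.
        yield chunk.decode(encoding).rstrip()

# Stand-in for the FB80 dataset: three records padded to 80 bytes each.
lines = [
    "THIS IS A FIRST LINE IN THE INPUT FILE",
    "THIS IS THE SECOND LINE IN THE INPUT FILE",
    "THIS IS THE THIRD LINE IN THE INPUT FILE",
]
data = b"".join(line.ljust(80).encode("cp037") for line in lines)

records = list(read_fixed_records(io.BytesIO(data)))
for record in records:
    print(record)
print(f"Number of lines in the file is {len(records)}")
```

Because the helper is a generator, it holds only one record in memory at a time, and decoding each chunk gives ordinary strings to work with.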
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1319 Location: Bamberg, Germany
|
|
|
|
enrico-sorichetti wrote: |
file = open ("FB80file", "rb") |
Remove the blank between the function name and its arguments. It can make a big difference. |
vasanthz
Global Moderator
Joined: 28 Aug 2007 Posts: 1744 Location: Tirupur, India
|
|
|
|
Thanks Enrico for the binary suggestion, but binary data is hard to work with.
The below code works OK, it uses the splitlines method, similar to the one Joerg suggested. The problem is, it reads the whole file and does the splitting. I will live with this for now and try to get zoautil installed.
Code: |
reader = open("//DD:DDIN","r",encoding='cp037')
#######
count = 0
for line in reader:
    listrecords = line.splitlines()
    for record in listrecords:
        count += 1
        print(record)
reader.close()
print(f'Number of lines in the file is {count}') |
Output:
Code: |
THIS IS A FIRST LINE IN THE INPUT FILE
THIS IS THE SECOND LINE IN THE INPUT FILE
THIS IS THE THIRD LINE IN THE INPUT FILE
Number of lines in the file is 3 |
Thank you everyone for useful tips and suggestions |
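A possible explanation for why `splitlines()` succeeds where plain iteration does not, offered as a sketch and not as a confirmed description of the z/OS Python runtime: if the records reach the program separated by the EBCDIC NL byte x'15', the cp037 codec turns that byte into U+0085, which file iteration does not treat as a line ending but `str.splitlines()` does. The example simulates this with an in-memory stream; the x'15' separators are an assumption.

```python
import io

lines = [
    "THIS IS A FIRST LINE IN THE INPUT FILE",
    "THIS IS THE SECOND LINE IN THE INPUT FILE",
    "THIS IS THE THIRD LINE IN THE INPUT FILE",
]

# Assumed layout: each record is followed by EBCDIC NL (x'15').
data = b"".join(line.encode("cp037") + b"\x15" for line in lines)

reader = io.TextIOWrapper(io.BytesIO(data), encoding="cp037")
chunks = list(reader)  # iteration only breaks on LF, CR and CRLF
reader.close()

# splitlines() also breaks on U+0085, so the single chunk comes apart here.
records = [record for chunk in chunks for record in chunk.splitlines()]
print(len(chunks), len(records))
```

Under that assumption the whole file still arrives as one string before it is split, which matches the drawback noted above.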