|
Author |
Message |
sreejeshcs
New User
Joined: 28 May 2007 Posts: 31 Location: Pune
|
|
|
|
hello,
We are migrating mainframe CSV files to Hadoop. I need help writing a REXX program that reads a list of file names from an input file and outputs the file name, file size, and number of lines for each one, so that we can automate the validation. The input to this program is a list of mainframe file names.
Thanks
Sreejesh Sreenivasan |
|
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10889 Location: italy
|
|
|
|
Tell us your level of REXX knowledge.
Define the extent of the help you need. |
|
|
|
sreejeshcs
New User
Joined: 28 May 2007 Posts: 31 Location: Pune
|
|
|
|
I have basic knowledge. If someone can guide me, I can write the program. |
|
|
|
prino
Senior Member
Joined: 07 Feb 2009 Posts: 1316 Location: Vilnius, Lithuania
|
|
|
|
The deaf leading the blind...
What doesn't your organisation trust? FTP, MQ, Connect:Direct?
No, it's better to get a user with basic knowledge to knock up something that is unlikely to verify anything. Number of lines might stay the same in a transfer, but file-size will change, as every line will get a CR/LF added.
Please close this topic, it does not belong in a forum for experts! |
|
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2147 Location: USA
|
|
|
|
sreejeshcs wrote: |
hello,
We are migrating mainframe CSV files to Hadoop. I need help writing a REXX program that reads a list of file names from an input file and outputs the file name, file size, and number of lines for each one, so that we can automate the validation. The input to this program is a list of mainframe file names.
Thanks
Sreejesh Sreenivasan |
Based on your question, my impression is that your familiarity not only with REXX, but with Information Technology (i.e. "computers") as a whole, stopped at the level of computer games guru. Is this correct?
If so, then you need to (1) RTFM, (2) ask your boss about the details of the job, and (3) go to the beginners' forum. |
|
|
|
sreejeshcs
New User
Joined: 28 May 2007 Posts: 31 Location: Pune
|
|
|
|
The code below is failing. I am able to read the file names from listfile, but I am not able to read the files that those names refer to.
Code: |
do index = 1 to listfile.0
  csvfile = strip(listfile.index)
  /* read input file */
  "execio * diskr csvfile (stem filecsv. fini"
  if rc \= 0 then do
    say 'error reading input csvfile. RC =' rc
    exit 8
  end
  say filecsv.0 'records read from file' infile
  outindex = outindex + 1
  outfile.jdx = strip(listfile.index)','filecsv.0
end |
|
|
|
|
sreejeshcs
New User
Joined: 28 May 2007 Posts: 31 Location: Pune
|
|
|
|
I tried the option below, but I am getting the error shown in the spool output. Can someone help?
Code: |
/* allocate csvfile */
"ALLOC FI(CSVDD) DA("csvfile") SHR REUSE"
Spool Output:
6 records read
6 datasets extracted
A.B.C.D
INVALID DATA SET NAME, A.B.C.D
MISSING DATA SET NAME OR *+
MISSING NAME OF DATA SET TO BE ALLOCATED
The input or output file CSVDD is not allocated. It cannot be opened for I/O.
EXECIO error while trying to GET or PUT a record.
error reading input file. RC = 20
READY
END |
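For readers who hit the same messages: EXECIO takes a DD name, not a data set name, so each data set has to be allocated to a DD before it can be read. The following is only a minimal sketch. It assumes the list of names is already allocated to a DD called LISTDD (a hypothetical name) and that each entry is a fully qualified data set name with no quotes and no stray non-display characters. The thread never says what caused the INVALID DATA SET NAME message or how it was fixed, so this is one plausible working form and not the poster's actual solution.

```rexx
/* REXX - sketch: count the records in each data set named in a list. */
/* Assumption: DD LISTDD (hypothetical) holds one data set name per   */
/* record; CSVDD is the DD name used in the post above.               */
"EXECIO * DISKR LISTDD (STEM listfile. FINIS"
if rc \= 0 then do
  say 'error reading the list of names, RC =' rc
  exit 8
end
outindex = 0
do index = 1 to listfile.0
  csvfile = strip(listfile.index)
  /* Single quotes stop TSO from adding the profile prefix to the name */
  "ALLOC FI(CSVDD) DA('"csvfile"') SHR REUSE"
  if rc \= 0 then do
    say 'allocation failed for' csvfile', RC =' rc
    iterate
  end
  /* CSVDD is the DD name; EXECIO never sees the data set name itself */
  "EXECIO * DISKR CSVDD (STEM filecsv. FINIS"
  readrc = rc
  "FREE FI(CSVDD)"
  if readrc \= 0 then do
    say 'error reading' csvfile', RC =' readrc
    iterate
  end
  outindex = outindex + 1
  outfile.outindex = csvfile','filecsv.0
  say outfile.outindex
end
outfile.0 = outindex
```

The sketch uses a single counter (`outindex`) for the output stem. The posted loop increments `outindex` but stores into `outfile.jdx`, and it prints `infile`, which is never set.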
|
|
|
|
sreejeshcs
New User
Joined: 28 May 2007 Posts: 31 Location: Pune
|
|
|
|
I resolved the issue. Can someone help me find the number of columns in a tab-delimited file in REXX? |
|
|
|
Nic Clouston
Global Moderator
Joined: 10 May 2007 Posts: 2454 Location: Hampshire, UK
|
|
|
|
Please use the code tags when posting code, data and anything else that needs a fixed pitch font.
When asking for help to resolve a problem, you should provide not only the code but also the trace.
To find the number of tabs in a record, read the record a byte at a time, counting the tab characters. You may need to look at the data with HEX ON to determine what the EBCDIC code for tab is - or look up your reference card.
An alternative way would be to use PARSE.
EDIT: Perhaps a better way would be to use POS in a loop, counting how many times the tab character is found. |
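The POS loop suggested above could look roughly like this. It is a sketch only: the sample record is made up, and the tab is assumed to be the EBCDIC horizontal tab, X'05' (ASCII data would use X'09').

```rexx
/* REXX - sketch: count the tab-separated columns in one record.     */
tab = '05'x                  /* EBCDIC tab; use '09'x for ASCII data */
record = 'A' || tab || 'B' || tab || tab || 'D'   /* made-up sample  */
tabs = 0
p = pos(tab, record)         /* position of the first tab, 0 if none */
do while p > 0
  tabs = tabs + 1
  p = pos(tab, record, p + 1)  /* search again just past the last hit */
end
say tabs + 1 'columns'       /* n tabs separate n + 1 columns */
```

For the sample record this finds 3 tabs and so reports 4 columns, with the empty third column counted.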
|
|
|
prino
Senior Member
Joined: 07 Feb 2009 Posts: 1316 Location: Vilnius, Lithuania
|
|
|
|
sreejeshcs wrote: |
The code below is failing. I am able to read the file names from listfile, but I am not able to read the files that those names refer to.
Code: |
do index = 1 to listfile.0
  csvfile = strip(listfile.index)
  /* read input file */
  "execio * diskr csvfile (stem filecsv. fini"
  if rc \= 0 then do
    say 'error reading input csvfile. RC =' rc
    exit 8
  end
  say filecsv.0 'records read from file' infile
  outindex = outindex + 1
  outfile.jdx = strip(listfile.index)','filecsv.0
end |
|
Hadoop = Big Data
The idea is completely stupid, using EXECIO to count the number of lines in a file. Your support staff will be ever so happy if you do this with a few dozen files containing a few million records each.
Please close this thread, it does not belong on this forum, which is for experts, and not dimwits like this git abusing the title "software engineer"! |
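The storage concern comes from `EXECIO *`, which loads every record into the stem before anything is counted. If EXECIO is used at all, reading a fixed number of records per call keeps only one chunk in storage at a time. This is a rough sketch that assumes the DD CSVDD is already allocated; the chunk size is an arbitrary choice. It still reads every record, so it may remain slow on very large data sets.

```rexx
/* REXX - sketch: count records without holding the whole data set.  */
/* Assumption: DD CSVDD is already allocated to the data set.        */
chunk = 10000              /* records per EXECIO call; arbitrary     */
total = 0
do forever
  "EXECIO" chunk "DISKR CSVDD (STEM rec."
  readrc = rc              /* 0 = full chunk read, 2 = end of file   */
  if readrc > 2 then leave /* anything above 2 is treated as an error */
  total = total + rec.0    /* rec.0 = records actually read this call */
  if readrc = 2 then leave
end
"EXECIO 0 DISKR CSVDD (FINIS"   /* close the data set */
if readrc > 2 then say 'read error, RC =' readrc
else say total 'records read'
```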
|
|
|
steve-myers
Active Member
Joined: 30 Nov 2013 Posts: 917 Location: The Universe
|
|
|
|
sreejeshcs wrote: |
...Can someone help me find the number of columns in a tab-delimited file in REXX? |
Meaningless. The tab character - if that is what you are using - is a single character which is interpreted as a tab character when the spreadsheet program reads the file and is used as a field separator. It is 1 character in the input, 1 character as transmitted, and 1 character in the file at the work station. |
|
|
|
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
|
|
|
|
Wouldn't DFSORT be a better choice? Why REXX? |
|
|
|
prino
Senior Member
Joined: 07 Feb 2009 Posts: 1316 Location: Vilnius, Lithuania
|
|
|
|
Nic Clouston wrote: |
Please use the code tags when posting code, data and anything else that needs a fixed pitch font.
When asking for help to resolve a problem, you should provide not only the code but also the trace.
To find the number of tabs in a record, read the record a byte at a time, counting the tab characters. You may need to look at the data with HEX ON to determine what the EBCDIC code for tab is - or look up your reference card.
An alternative way would be to use PARSE.
EDIT: Perhaps a better way would be to use POS in a loop, counting how many times the tab character is found. |
One single REXX statement will do it, and there are (at least) two of them. |
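The post does not say which statements are meant. One possibility, offered only as a guess, combines TRANSLATE and SPACE: turn blanks into a placeholder and tabs into blanks, squeeze the blanks out, and compare lengths. The sample record and the X'05' tab value (EBCDIC; X'09' for ASCII) are assumptions.

```rexx
/* REXX - sketch: count the tabs in a record with one statement.     */
tab = '05'x                  /* EBCDIC tab; use '09'x for ASCII data */
record = 'A B' || tab || 'C' || tab || tab || 'D'  /* made-up sample */
/* Blanks become 'FF'x so they survive; tabs become blanks and are   */
/* removed by SPACE(...,0). The drop in length is the tab count.     */
tabs = length(record) - length(space(translate(record, 'FF'x || ' ', ' ' || tab), 0))
say tabs 'tabs,' tabs + 1 'columns'
```

Because only the lengths are compared, it does not matter whether the record already contains X'FF' bytes.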
|
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10889 Location: italy
|
|
|
|
Quote: |
Can someone help me find the number of columns in a tab-delimited file in REXX? |
the definition of <number of columns> might get murky ...
each record might have different number of columns ( tabs/columns )
( empty columns at the end of the record might be missing - no separator stored ) |
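Since the count can differ from record to record, a validation would probably want the range of counts, not a single number. This is a small sketch using made-up records held in a stem; the X'05' tab value assumes EBCDIC data.

```rexx
/* REXX - sketch: report the range of column counts across records.  */
tab = '05'x                  /* EBCDIC tab; use '09'x for ASCII data */
rec.1 = 'A' || tab || 'B' || tab || 'C'
rec.2 = 'A' || tab || 'B'    /* trailing empty column not stored     */
rec.3 = 'A' || tab || tab || 'C'
rec.0 = 3
mincols = 0
maxcols = 0
do i = 1 to rec.0
  cols = 1                   /* a record with no tabs has one column */
  p = pos(tab, rec.i)
  do while p > 0
    cols = cols + 1
    p = pos(tab, rec.i, p + 1)
  end
  if i = 1 | cols < mincols then mincols = cols
  if cols > maxcols then maxcols = cols
end
if mincols = maxcols then say 'all records have' maxcols 'columns'
else say 'column count varies from' mincols 'to' maxcols
```

With these three sample records the counts are 3, 2 and 3, so the sketch reports a range of 2 to 3.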
|
|
|
steve-myers
Active Member
Joined: 30 Nov 2013 Posts: 917 Location: The Universe
|
|
|
|
Mr. Sorichetti is correct. Mainframe text editors (e.g. the ISPF editor) do not use tabs, at least within a data set.
Code: |
****** ***************************** Top of Data ******************************
==MSG> -CAUTION- Data contains invalid (non-display) characters. Use command
==MSG> ===> FIND P'.' to position cursor to these
000001 ----+----1----+----2----+----3
000002 124 122 256 9999
****** **************************** Bottom of Data **************************** |
The blanks between the numbers are actually EBCDIC tab characters. Now I send the data set to a PC and display it with Windoze notepad, and I see it spread out like this -
Code: |
----+----1----+----2----+----3
124     122     256     9999 |
Notepad obviously recognized the tab characters. |
|
|
|
|