I have an amount field in a file and I want to split the records into multiple records 250000 each into one file until the total number of records reaches 10. Even after 10 splits if the amount is greater than zero then write the record into a separate file with the remaining amount.
Joined: 10 May 2007 Posts: 2455 Location: Hampshire, UK
Is this a one-off for this particular record? If so you would be better off writing a simple program or even doing it manually. If not, will the value to be output on each split always be 250,000? If not, how is that calculated? Suppose you cannot get 10 records from splitting e.g. if your input record was:
Code:
12345 1250000
you would only get 5 splits and nothing in your second dataset (not file).
Yes the split limit is always 250,000. number of splits is based on the total amount. We will have to write to second file only if the amount is greater than 0 after 10 splits. If the amount exhausts after 5 splits, no need to write to second file. If the remaining amount after 10 splits is greater than 25000, we will only write one record for the total amount in the second file. I mean no splits required in the second file.
EDIT : Apparently my tests were not exhaustive enough. Thanks Bill for notifying me, I have edited my cards above. This assumes original amount to be split >= the individual split amount (250,000 in this case)
For example I have give that record. My file contains many of such records with key and amount fields. And amount is not a fixed one. If the amount is greater than 250000 then we should go for split. Else we will write one record and proceed to next. Also, if the amount split goes beyond 10 records then we will write 10 split records in the first file and another record with rest of the amount is written to second file.
Each record is repeated 10 times. And an offset to be subtracted from the original amount is created in incremental amounts (column-4 above). I have used the same column to store the subtraction results. So before subtraction, column 4 will look like this
So in the example above, we need all records to output-1 as long as the subtraction result is non-negative. Plus if your subtraction result is less than the incremental amount (250000), we are writing the leftover as an additional record. Because it cannot be split further.
We need to write output-2 only when the last record computation result is non-negative. ie, when we have a valid left over amount even after the 10th iteration.
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
magesh23586 wrote:
Arun Raj wrote:
This is a good alternative. But again the number of steps may not always decide if there is any savings or not.
You are reading the file twice. One pass is always better than two pass.
You have extended each record 10 times towards the end or record. Depends on what exactly you try to "save", and if we have run sufficient tests to substantiate that.
And what if for some reason in the future, someone decides to split into 50 amounts, you need to extend your record further more towards the right and code additional 40 lines in the OVERLAY
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
My main concern with this topic is the data. Asked for a representative sample, data was only shown which is exactly divisible by 250000 (despite the example in the first post).
If that second set of sample data is representative, then the task is easy, one step, no ICETOOL.
You note that all the values on the first output file are 250000, and on the first OUTFIL output 1-10 records (using the slash-operator) depending on the amount (if greater than 2.5 million, output 10, 2.25 million, output nine, etc).
On the second OUTFIL, INCLUDE= for GT 2.5 million, and subtract 2.5 million to give the residual value.
If, as I suspect, the numbers are not nice and "round", it doesn't take much to make the final one a calculation on the first OUTFIL (subtract the value "below" the one you are testing for).
With all the references to the same fields and constants, I'd definitely show the solution with symbols.
But then, do I want to prepare and test all that, only to have TS/OP explain further that the description of the data is not quite right?
Lots of IFTHENs, lots of code to create, only two calculations at maximum per record.
For a different requirement, I'd go for RESIZE over two passes of the data, but Arun is correct, for any given solution it is only known to perform better or worse than another with the actual data. Generally I'd expect the RESIZE to work better, but no guarantees, as it does depend on what goes along with it, and can, not so much in this example, depend on the data.
As to "over engineering" there's a certain amount of that in both proposed solutions :-)
All suggested solutions would suffer from an increase to 50 splits, but all the code for all the solutions could be "generated" should such a thing arise.
My specific advice is to wait for/obtain clarity from TS/OP before getting to code...
Here I am overlaying the input record starting from 105 position and my amount field in the input file resides in 28th position in PD format with a length of 7. And, I want the subtracted value in ZD format.
After execution I am not getting the subtracted value correctly. Please suggest if I am doing something wrong.
I have also tried putting the subtracted value in a separate subsequent position. Its still the same. Subtraction is not working fine here. Please advise what am I am doing wrong.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Yes, on a re-count you are correct. I have a presctiption for new glasses, but I threw it away thinking it was an old receipt from the supermarket...
Your "M" simply indicates you have calculated a negative number. Find the value of M in EBCDIC, the "D" there is the sign, and the number-looking thing is the number.
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
vnktrrd,
Apart from the negative sign, the number itself does not seem to be what you expected. Maybe the actual value presented to the computation was different from what you have shown.
a quick and dirty computation shows that
278304 + 249000 ==> 527304
and that
( -278304 ) - 249000 ==> - 527304
which is exactly what You got
( any decent IT practitioner would have noticed that )
Quote:
I am subtracting smaller number from greater number
Your analysis could not be more wrong ...
subtracting a negative number from a negative number gives a MORE negative number
and a negative number was what You had in some preceding computations
again ... the topic is becoming a waste of time for everybody and it should be locked as I did before