|
Author |
Message |
Ali_gezer
Active User
Joined: 06 Apr 2021 Posts: 123 Location: argentina
|
|
|
|
Hello, I have a question about this statement:
JOIN UNPAIRED,F1,F2
If my two output files are
1) a file with the records that match by key
2) a file with the records found only in F1
is it better for performance to use the following, since the non-matching records from F2 are then never written?
JOIN UNPAIRED,F1
Thanks. |
|
|
|
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1338 Location: Bamberg, Germany
|
|
|
|
Why use more resources than necessary? Performance also depends on the amount of data to be processed. |
|
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2146 Location: USA
|
|
|
|
JOIN UNPAIRED,F1 is enough. Adding F2 gives no benefit, only disadvantages.
Next, split the output into two OUTFILs by the match indicator (‘?’ in REFORMAT). |
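The suggestion above could look roughly like the following DFSORT control statements. This is only a sketch under assumptions that are not in the thread: 80-byte fixed-length records, the join key in columns 1-10 of both files, and the hypothetical output DD names MATCHED and F1ONLY. The '?' in REFORMAT adds a one-byte indicator that is 'B' for a paired record and '1' for a record found only in F1.

```jcl
* Assumed layout: 80-byte records, key in columns 1-10 of both files.
* F1 is read from DD SORTJNF1 and F2 from DD SORTJNF2.
  JOINKEYS FILE=F1,FIELDS=(1,10,A)
  JOINKEYS FILE=F2,FIELDS=(1,10,A)
* Keep paired records plus unpaired F1 records; unpaired F2 is dropped.
  JOIN UNPAIRED,F1
* Joined record = the F1 record plus the match indicator in column 81.
  REFORMAT FIELDS=(F1:1,80,?)
  OPTION COPY
* MATCHED and F1ONLY are hypothetical DD names.
  OUTFIL FNAMES=MATCHED,INCLUDE=(81,1,CH,EQ,C'B'),BUILD=(1,80)
  OUTFIL FNAMES=F1ONLY,INCLUDE=(81,1,CH,EQ,C'1'),BUILD=(1,80)
```

BUILD=(1,80) strips the indicator byte again, so both output files keep the assumed original record length.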
|
|
|
Ali_gezer
Active User
Joined: 06 Apr 2021 Posts: 123 Location: argentina
|
|
|
|
Joerg.Findeisen wrote: |
Why use more resources than necessary? Performance also depends on the amount of data to be processed. |
So specifying F1,F2 uses more resources than specifying only F1?
Can you briefly explain why?
Thanks. |
|
|
|
Ali_gezer
Active User
Joined: 06 Apr 2021 Posts: 123 Location: argentina
|
|
|
|
sergeyken wrote: |
JOIN UNPAIRED,F1 is enough. Adding F2 gives no benefit, only disadvantages.
Next, split the output into two OUTFILs by the match indicator (‘?’ in REFORMAT). |
Thanks, can you explain more about those disadvantages? |
|
|
|
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
|
|
|
|
You really need to understand the main task and the subtasks. Try all the options and test them against your input data, with the real volume of records you can expect, to measure performance. I would not expect any major deviations (especially when the data is already sorted and you specify SORTED,NOSEQCK), given the right join order, selecting only the necessary data, etc. |
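As an illustration of the SORTED,NOSEQCK remark, here is a minimal sketch. It assumes 80-byte records with the key in columns 1-10 (a hypothetical layout) and, more importantly, that both input files really are already in key order.

```jcl
* Assumption: both inputs are already in ascending order on columns 1-10.
* SORTED makes the subtask copy the file instead of sorting it;
* NOSEQCK skips the check that the records are in key sequence.
  JOINKEYS FILE=F1,FIELDS=(1,10,A),SORTED,NOSEQCK
  JOINKEYS FILE=F2,FIELDS=(1,10,A),SORTED,NOSEQCK
  JOIN UNPAIRED,F1
  REFORMAT FIELDS=(F1:1,80,?)
  OPTION COPY
```

With NOSEQCK, a file that is not actually in key order goes undetected and the join output will be wrong, so it is safer to drop NOSEQCK until the input order has been confirmed.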
|
|
|
sergeyken
Senior Member
Joined: 29 Apr 2008 Posts: 2146 Location: USA
|
|
|
|
Ali_gezer wrote: |
sergeyken wrote: |
JOIN UNPAIRED,F1 is enough. Adding F2 gives no benefit, only disadvantages.
Next, split the output into two OUTFILs by the match indicator (‘?’ in REFORMAT). |
Thanks, can you explain more about those disadvantages? |
#1. The obvious one:
With F2 included, an intermediate joined record becomes on average twice as long as a single file record. That requires more resources at each processing step and doubles the size of any work files involved, only for the extra data to be eliminated at the very end. When handling files with many millions of records, this may seriously affect overall performance.
The results may vary. You need to run plenty of tests with various options, involving your real data (or similar test data). |
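One way to picture the record-length point is through the REFORMAT statement, which defines the layout of the joined record. The sketch below assumes 80-byte records in both files (a hypothetical layout) and shows two alternative REFORMAT statements, of which a job would use only one. Reading the point above in terms of REFORMAT is itself an assumption.

```jcl
* Alternative A: carry both records in the joined record.
* Joined record length = 80 + 80 + 1 indicator byte = 161 bytes.
  REFORMAT FIELDS=(F1:1,80,F2:1,80,?)
*
* Alternative B: carry only the F1 record and the indicator.
* Joined record length = 80 + 1 = 81 bytes.
  REFORMAT FIELDS=(F1:1,80,?)
```

If the output files need only F1 data, alternative B passes roughly half as many bytes per record through the main task.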
|
|
|
Ali_gezer
Active User
Joined: 06 Apr 2021 Posts: 123 Location: argentina
|
|
|
|
Rohit Umarjikar wrote: |
You really need to understand the main task and the subtasks. Try all the options and test them against your input data, with the real volume of records you can expect, to measure performance. I would not expect any major deviations (especially when the data is already sorted and you specify SORTED,NOSEQCK), given the right join order, selecting only the necessary data, etc. |
In the guide I only found this:
''JOIN
If you don't specify a JOIN statement, only paired records from F1 and F2 are kept and processed by
the main task as the joined records (inner join). You can optionally specify a JOIN statement to have
the main task keep and process: unpaired F1 records as well as paired records (left outer join);
unpaired F2 records as well as paired records (right outer join); unpaired F1 and F2 records as well as
paired records (full outer join); only unpaired F1 records; only unpaired F2 records, or only unpaired
F1 and F2 records.''
I tried it with a file of 500,000 records and there is no difference between using
F1,F2 and F1.
Do you recommend that I specify F1? There are more than 10 sorts with JOINKEYS that specify F1,F2 but keep only the unmatched records from F1. |
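For reference, the join types listed in the guide excerpt above correspond to the following forms of the JOIN statement. These are alternatives shown side by side for comparison, and a single JOINKEYS application would contain at most one of them.

```jcl
* No JOIN statement: paired records only (inner join).
*
* Paired records plus unpaired F1 records (left outer join):
  JOIN UNPAIRED,F1
* Paired records plus unpaired F2 records (right outer join):
  JOIN UNPAIRED,F2
* Paired records plus unpaired F1 and F2 records (full outer join):
  JOIN UNPAIRED,F1,F2
* Only unpaired F1 records:
  JOIN UNPAIRED,F1,ONLY
* Only unpaired F2 records:
  JOIN UNPAIRED,F2,ONLY
* Only unpaired F1 and F2 records:
  JOIN UNPAIRED,F1,F2,ONLY
```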
|
|
|
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1338 Location: Bamberg, Germany
|
|
|
|
If you only need joined records and F1, specify exactly that. |
|
|
|
Ali_gezer
Active User
Joined: 06 Apr 2021 Posts: 123 Location: argentina
|
|
|
|
Joerg.Findeisen wrote: |
If you only need joined records and F1, specify exactly that. |
Can you recommend where I can find information to back up your recommendation? The sort user guide has no information on this, and I would like to know why it is better in terms of performance. |
|
|
|
Joerg.Findeisen
Senior Member
Joined: 15 Aug 2015 Posts: 1338 Location: Bamberg, Germany
|
|
|
|
You can take one step onto the beach in swim shorts or in mountain gear and notice no difference. If you have to walk a longer distance, you'll certainly notice why unnecessary things should be avoided for some tasks. |
|
|
|
Ali_gezer
Active User
Joined: 06 Apr 2021 Posts: 123 Location: argentina
|
|
|
|
Joerg.Findeisen wrote: |
You can take one step onto the beach in swim shorts or in mountain gear and notice no difference. If you have to walk a longer distance, you'll certainly notice why unnecessary things should be avoided for some tasks. |
Thanks.
I know that in the long run it will hurt performance, but what I wanted to know is why, with regard to the JOIN statement.
I can't find any information in the user guide, and I wanted to know why.
Thanks again. |
|
|
|
Rohit Umarjikar
Global Moderator
Joined: 21 Sep 2010 Posts: 3076 Location: NYC,USA
|
|
|
|
|