
JOIN STATEMENT PERFORMANCE.


IBM Mainframe Forums -> DFSORT/ICETOOL
Ali_gezer

Active User


Joined: 06 Apr 2021
Posts: 123
Location: argentina

Posted: Tue Nov 29, 2022 1:21 am

Hello, I have a question about the JOIN statement.

With this:
JOIN UNPAIRED,F1,F2

my two output files are:

1) a file with the records that match by key
2) a file with the records found only in F1

Would it be better for performance to use the following instead, since the non-matching records from F2 would then not be written at all?

JOIN UNPAIRED,F1

Thanks.
Joerg.Findeisen

Senior Member


Joined: 15 Aug 2015
Posts: 1255
Location: Bamberg, Germany

Posted: Tue Nov 29, 2022 3:15 am

Why use more resources than necessary? Performance also depends on the amount of data to be processed.
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2023
Location: USA

Posted: Tue Nov 29, 2022 5:40 am

JOIN UNPAIRED,F1 is enough. Adding F2 gives no benefit, only disadvantages.

Next, split the output into two OUTFILs by the match indicator ('?' in REFORMAT).
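
A minimal sketch of that split, assuming 80-byte fixed-length records keyed on columns 1-10 and hypothetical output DD names MATCH and F1ONLY (none of these values come from the thread). The '?' in REFORMAT writes a one-byte indicator: 'B' for a paired record, '1' for an unpaired F1 record.

```
* Key position/length, record length and DD names are assumptions.
  JOINKEYS FILE=F1,FIELDS=(1,10,A)
  JOINKEYS FILE=F2,FIELDS=(1,10,A)
* Keep paired records plus unpaired F1 records (left outer join).
  JOIN UNPAIRED,F1
* F1 record followed by the 1-byte match indicator in position 81.
  REFORMAT FIELDS=(F1:1,80,?)
  OPTION COPY
* Indicator 'B' = key found in both files.
  OUTFIL FNAMES=MATCH,INCLUDE=(81,1,CH,EQ,C'B'),BUILD=(1,80)
* Indicator '1' = key found in F1 only.
  OUTFIL FNAMES=F1ONLY,INCLUDE=(81,1,CH,EQ,C'1'),BUILD=(1,80)
```

BUILD=(1,80) drops the indicator byte again, so both output files keep the original F1 layout.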

Ali_gezer

Posted: Tue Nov 29, 2022 5:45 pm

Joerg.Findeisen wrote:
Why use more resources than necessary? Performance also depends on the amount of data to be processed.


So, does specifying F1,F2 use more resources than specifying just F1?
Can you briefly explain why?

Thanks.

Ali_gezer

Posted: Tue Nov 29, 2022 5:49 pm

sergeyken wrote:
JOIN UNPAIRED,F1 is enough. Adding F2 gives no benefit, only disadvantages.

Next, split the output into two OUTFILs by the match indicator ('?' in REFORMAT).


Thanks, can you explain more about those disadvantages?
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3053
Location: NYC,USA

Posted: Tue Nov 29, 2022 5:50 pm

You really need to understand the main task and the subtasks. Try all the options and test them against your input data, with the real volume of records you can expect, to measure performance. I would not expect any major deviations, especially if the data is already sorted (in that case specify SORTED,NOSEQCK), the join order is right, and only the necessary data is selected.
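
As a sketch of the SORTED,NOSEQCK suggestion, assuming both inputs are already in ascending order on a hypothetical 10-byte key in columns 1-10: SORTED tells DFSORT to copy the file instead of sorting it, and NOSEQCK skips the sequence check, so this is only safe when the order is guaranteed.

```
* Key position and length are assumptions for illustration.
  JOINKEYS FILE=F1,FIELDS=(1,10,A),SORTED,NOSEQCK
  JOINKEYS FILE=F2,FIELDS=(1,10,A),SORTED,NOSEQCK
  JOIN UNPAIRED,F1
```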

sergeyken

Posted: Tue Nov 29, 2022 6:16 pm

Ali_gezer wrote:
sergeyken wrote:
JOIN UNPAIRED,F1 is enough. Adding F2 gives no benefit, only disadvantages.

Next, split the output into two OUTFILs by the match indicator ('?' in REFORMAT).


Thanks, can you explain more about those disadvantages?


#1. The obvious one:

With F2 included, an intermediate joined record becomes on average twice as long as a record from a single file. That takes more resources at every processing step and doubles the size of the work files (if any are involved), only for the extra data to be discarded at the very end. With files of many millions of records, this can seriously affect overall performance.

The results may vary. You need to run plenty of tests with various options, using your real data (or similar test data).
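
To illustrate the point, a sketch of what the F1,F2 variant might look like, assuming 80-byte records in both files, a 10-byte key, a REFORMAT that also carries the F2 fields, and hypothetical DD names MATCH and F1ONLY. The joined record is twice as long, and unpaired F2 records (indicator '2') pass through the main task only to be selected by neither OUTFIL.

```
* Record lengths, key position and DD names are assumptions.
  JOINKEYS FILE=F1,FIELDS=(1,10,A)
  JOINKEYS FILE=F2,FIELDS=(1,10,A)
* Full outer join: unpaired F2 records are kept as well.
  JOIN UNPAIRED,F1,F2
* 80 bytes from each file plus the indicator in position 161.
  REFORMAT FIELDS=(F1:1,80,F2:1,80,?)
  OPTION COPY
  OUTFIL FNAMES=MATCH,INCLUDE=(161,1,CH,EQ,C'B'),BUILD=(1,80)
  OUTFIL FNAMES=F1ONLY,INCLUDE=(161,1,CH,EQ,C'1'),BUILD=(1,80)
* Records with indicator '2' match neither INCLUDE and are dropped.
```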

Ali_gezer

Posted: Tue Nov 29, 2022 6:43 pm

Rohit Umarjikar wrote:
You really need to understand the main task and the subtasks. Try all the options and test them against your input data, with the real volume of records you can expect, to measure performance. I would not expect any major deviations, especially if the data is already sorted (in that case specify SORTED,NOSEQCK), the join order is right, and only the necessary data is selected.


In the guide I only found this:
"JOIN
If you don't specify a JOIN statement, only paired records from F1 and F2 are kept and processed by
the main task as the joined records (inner join). You can optionally specify a JOIN statement to have
the main task keep and process: unpaired F1 records as well as paired records (left outer join);
unpaired F2 records as well as paired records (right outer join); unpaired F1 and F2 records as well as
paired records (full outer join); only unpaired F1 records; only unpaired F2 records, or only unpaired
F1 and F2 records."
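
For reference, the join types named in that passage correspond to the following JOIN statement forms. This is a summary sketch, not text from the guide; the statements are alternatives, and a JOINKEYS application uses at most one of them.

```
* No JOIN statement: paired records only (inner join).
* Paired plus unpaired F1 records (left outer join):
  JOIN UNPAIRED,F1
* Paired plus unpaired F2 records (right outer join):
  JOIN UNPAIRED,F2
* Paired plus unpaired F1 and F2 records (full outer join):
  JOIN UNPAIRED,F1,F2
* Only unpaired F1 records:
  JOIN UNPAIRED,F1,ONLY
* Only unpaired F2 records:
  JOIN UNPAIRED,F2,ONLY
* Only unpaired F1 and F2 records:
  JOIN UNPAIRED,F1,F2,ONLY
```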


I tried with a file of 500,000 records and saw no difference between
F1,F2 and F1.


Do you recommend that I change to F1? There are more than 10 JOINKEYS sorts that specify F1,F2 but keep only the unmatched F1 records.

Joerg.Findeisen

Posted: Tue Nov 29, 2022 6:49 pm

If you only need joined records and F1, specify exactly that.

Ali_gezer

Posted: Tue Nov 29, 2022 7:07 pm

Joerg.Findeisen wrote:
If you only need joined records and F1, specify exactly that.


Can you tell me where I can find information to back up your recommendation? The sort user guide says nothing about this, and I would like to know why it is better in terms of performance.

Joerg.Findeisen

Posted: Tue Nov 29, 2022 7:46 pm

You can do one step to the beach in swim shorts or with mountain gear, no difference. If you have to walk a longer distance, you'll certainly notice why unnecessary things should be avoided for some tasks.

Ali_gezer

Posted: Thu Dec 01, 2022 12:48 am

Joerg.Findeisen wrote:
You can do one step to the beach in swim shorts or with mountain gear, no difference. If you have to walk a longer distance, you'll certainly notice why unnecessary things should be avoided for some tasks.


Thanks.
I understand that in the long run it will hurt performance, but what I wanted to know is why that happens with the JOIN statement specifically.
I can't find any information about it in the user guide, and I wanted to know why.

Thanks again.

Rohit Umarjikar

Posted: Thu Dec 01, 2022 3:46 am

www.ibm.com/docs/en/zos/2.3.0?topic=files-joinkeys-application-examples
Please go through this and run tests to find out why.
All times are GMT + 6 Hours