Setting number of executors to a jobflow

hewills
Posts: 9
Joined: Tue Apr 18, 2017 4:18 pm
Location: CA

Postby hewills » Tue Apr 18, 2017 4:46 pm

I have a jobflow with 8 graphs that I'm running on Clover Server. Each graph truncates a target table, selects data from my source, and loads the target table. The graphs are connected and run synchronously in the jobflow.
I created a "global" parameter called EXECUTORS, with a value of 5.

I set the jobflow "number of executors" to this parameter, but the graphs still only run one at a time.
I also tried:
Setting this value on each of the graphs.
Setting all the graphs, plus the jobflow, with this parameter.
Making the parameter a string.
Making the parameter an integer.
Combination of all these things.

Every time it still only runs one graph at a time. From reading the documentation, I thought the jobflow would run more than 1.
What am I doing wrong? Pic of jobflow below.


[Attachment: jobflow.png]

jandikovae
Posts: 27
Joined: Fri Nov 04, 2016 8:51 am

Re: Setting number of executors to a jobflow

Postby jandikovae » Fri Apr 21, 2017 9:49 am

Hi,

Actually, the "Number of Executors" attribute applies when a single ExecuteGraph component executes one child graph multiple times. For example, you can have a ListFiles component at the beginning, and the ExecuteGraph component then starts a child graph for each file (that is, for each record coming from the ListFiles component). The "Number of Executors" attribute then sets the maximum number of runs of that child graph that can be in progress at the same time.
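
As a rough illustration of that behaviour (plain Python, not CloverETL code), the sketch below models one component that receives a list of records and runs the same "child graph" once per record through a bounded worker pool. The names `child_graph`, `execute_graph`, `ConcurrencyProbe` and the file names are hypothetical stand-ins, and the pool size plays the role of "Number of Executors".

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyProbe:
    """Counts how many child runs are in progress and remembers the peak."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self._running += 1
            self.peak = max(self.peak, self._running)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self._running -= 1
        return False


probe = ConcurrencyProbe()


def child_graph(file_name):
    # Hypothetical stand-in for the child graph started for one input record.
    with probe:
        time.sleep(0.02)
    return f"processed {file_name}"


def execute_graph(records, number_of_executors):
    # One component, many records: at most number_of_executors runs at once.
    with ThreadPoolExecutor(max_workers=number_of_executors) as pool:
        return list(pool.map(child_graph, records))


files = [f"file_{i}.csv" for i in range(8)]  # assumed output of a file listing
results = execute_graph(files, number_of_executors=5)
print(len(results), "runs, peak concurrency", probe.peak)
```

All eight records are processed, but the observed peak never exceeds the pool size. The limit only has something to bound because a single component is handed many records.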

In your design, the graphs do not run simultaneously because each ExecuteGraph component starts (executes its graph) only when it receives input data on its connected input port from the previous component. As long as each component needs data from the one before it, they have to be executed one after another and cannot run simultaneously.

My question would be: Is there a business reason to connect the ExecuteGraph components to each other? Could you, by any chance, let them be parallel with no input edge connected and just connect outputs to the TokenGather? This way they would all work at the same time. See the example picture below (in a design like that the components work in parallel).

[Attachment: Capture.PNG]
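
The difference between the chained and the parallel layout can be sketched in plain Python with `asyncio` (an analogy only, not how CloverETL schedules components). `execute_graph`, `Tracker` and the graph names are hypothetical; `asyncio.gather` stands in for collecting the outputs in one place, as TokenGather does in the picture.

```python
import asyncio


class Tracker:
    """Records the highest number of graphs running at the same moment."""

    def __init__(self):
        self.running = 0
        self.peak = 0


async def execute_graph(name, tracker):
    # Hypothetical stand-in for one ExecuteGraph component.
    tracker.running += 1
    tracker.peak = max(tracker.peak, tracker.running)
    await asyncio.sleep(0.01)  # placeholder for the truncate/select/load work
    tracker.running -= 1
    return name


async def chained(names):
    # Each step waits for the previous one, like components joined by edges.
    tracker = Tracker()
    for name in names:
        await execute_graph(name, tracker)
    return tracker.peak


async def parallel(names):
    # No step depends on another; results are only collected at the end.
    tracker = Tracker()
    finished = await asyncio.gather(*(execute_graph(n, tracker) for n in names))
    return tracker.peak, finished


graphs = [f"graph_{i}" for i in range(1, 9)]
chained_peak = asyncio.run(chained(graphs))
parallel_peak, finished = asyncio.run(parallel(graphs))
print("chained peak:", chained_peak, "parallel peak:", parallel_peak)
```

In the chained version only one graph is ever in progress, while in the parallel version all eight start before any of them finishes.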


I hope this helps. Have a nice day.

Eva
---
Eva Jandikova
CloverCARE Support
CloverETL | Rapid Data Integration

Visit us online at http://www.cloveretl.com
How to speed up communication with CloverCARE support

Postby hewills » Fri Apr 21, 2017 5:51 pm

Thanks for the reply Eva, that makes sense.

You're right that in my example the jobs could be run in parallel, but running too many at a time causes issues.
We have about 500 graphs that need to be executed, so I'm trying to figure out the best way to organize them.

I was hoping that there was a way to place ~100 graphs in one jobflow, and the jobflow itself would limit the number that run in parallel, using the 'number of executors' parameter.

Postby hewills » Mon Apr 24, 2017 7:50 pm

I decided to try setting up the "dynamic table load", where I feed a list of table names and parameters to one ExecuteGraph. This looks like a better solution for us.
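
A minimal sketch of that idea in plain Python, assuming each input record carries the parameters for one table. The table names, the `SOURCE_TABLE`/`TARGET_TABLE` keys and the generated SQL are hypothetical, and `load_table` only returns the statements a child graph might run instead of touching a database.

```python
from concurrent.futures import ThreadPoolExecutor

EXECUTORS = 5  # same value as the EXECUTORS parameter from the first post

# Hypothetical list of records, one per table, fed to a single executor.
TABLE_RECORDS = [
    {"SOURCE_TABLE": f"src.table_{i}", "TARGET_TABLE": f"dw.table_{i}"}
    for i in range(1, 13)
]


def load_table(params):
    """Stand-in for the parameterized child graph: plan the load for one table."""
    source = params["SOURCE_TABLE"]
    target = params["TARGET_TABLE"]
    return [
        f"TRUNCATE TABLE {target}",
        f"INSERT INTO {target} SELECT * FROM {source}",
    ]


# One generic job, many parameter sets, at most EXECUTORS in flight at a time.
with ThreadPoolExecutor(max_workers=EXECUTORS) as pool:
    plans = list(pool.map(load_table, TABLE_RECORDS))

print(len(plans), "tables planned; first step:", plans[0][0])
```

Adding another table then means adding a record to the list rather than another component to the jobflow.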

Postby jandikovae » Tue Apr 25, 2017 7:39 am

Hi,

You are right, this is definitely one of the options. This way you can use the "Number of Executors" attribute to control the number of simultaneously running instances of the child graph. However, similarly to "Number of Executors", you can also set a maximum number of running instances of the same graph/jobflow when it is not executed from a single component. This limit is managed by CloverETL Server.

To do so, go to the CloverETL Server application -> Sandboxes -> Config Properties. The properties that you might be looking for are as follows:
1. max_running_concurrently (max number of concurrently running instances of the job) and
2. enqueue_executions (boolean value; if it is true, executions above max_running_concurrently are enqueued, if it is false executions above max_running_concurrently fail).

Navigate to any sandbox, jobflow, or graph in the above-mentioned menu on the Server. Then, in the "Create new config property" section, choose a parameter from the list, enter the desired value, and add the parameter to the selected location.

Please note that the limit is applied per file, which means it will work only if you always call the same graph (with a different parameter, for example).
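
The semantics of the two properties, as described above, can be modelled with a small Python sketch. This is an illustration only, not the Server's implementation: `JobLimit`, `start_job`, `ExecutionRejected` and the file paths are hypothetical names, and a semaphore stands in for the Server's bookkeeping.

```python
import threading


class ExecutionRejected(RuntimeError):
    """Raised when the limit is reached and queuing is switched off."""


class JobLimit:
    def __init__(self, max_running_concurrently, enqueue_executions):
        self._slots = threading.Semaphore(max_running_concurrently)
        self._enqueue = enqueue_executions

    def start(self):
        # enqueue_executions=True: wait for a free slot (the run is queued).
        # enqueue_executions=False: give up immediately (the run fails).
        if not self._slots.acquire(blocking=self._enqueue):
            raise ExecutionRejected("max_running_concurrently reached")

    def finish(self):
        self._slots.release()


# Limits are kept per job file; files without an entry are not limited.
limits = {"graph/load_table.grf": JobLimit(2, enqueue_executions=False)}


def start_job(job_file):
    limit = limits.get(job_file)
    if limit is not None:
        limit.start()
    return job_file


start_job("graph/load_table.grf")
start_job("graph/load_table.grf")
try:
    start_job("graph/load_table.grf")
    third = "started"
except ExecutionRejected:
    third = "rejected"

other = start_job("graph/other.grf")  # a different file, so no limit applies
print("third run:", third, "| other file:", other)
```

With `enqueue_executions=True`, the third call would instead wait until one of the first two runs called `finish()`.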

Please give this a try and let me know if this is what you have been looking for.

Eva

Postby hewills » Tue Apr 25, 2017 11:42 pm

Now that I understand how the executors work, I don't think the 'max_running_concurrently' parameter is needed. But it's nice to know in case we need it for a different scenario.
I set up the ExecuteGraph with dynamic table parameters and metadata, and so far it's working great. Thanks!