Using allocation with DBInputTable

Support questions related to CloverETL Server

hewills
Posts: 9
Joined: Tue Apr 18, 2017 4:18 pm
Location: CA

Using allocation with DBInputTable

Postby hewills » Tue Apr 18, 2017 5:00 pm

I'm using Corp Plus Server, and saw that the data partition objects are available to use, including the Allocation parameter.
I've been testing how to use it on a DBInputTable object, where each allocation pulls a different set of data from our source table, and I use ParallelSimpleGather to bring them together to load.

In order to get it to work, I "build" a Query URL for each allocation node, based on their WORKER_ID, but it takes time.
I'm curious if there is another way to use allocations/partitions with SQL queries, table objects, etc...

So far, I've only discovered that I can reference WORKER_ID and WORKER_COUNT. Are there any other useful parameters involving allocations/partitions I could use?

vazquezrosariop
Posts: 92
Joined: Mon Feb 29, 2016 5:33 pm

Re: Using allocation with DBInputTable

Postby vazquezrosariop » Thu Apr 20, 2017 7:53 pm

Hi hewills,

Could you please provide me with following information:

  • How many records are you processing?
  • How long does this process usually take?
  • What Database Server are you utilizing (vendor, version)?
  • Can you please provide me with your graph, I would like to look over your use case as well are how you configured it (please remove any sensitive data)
---
Pedro Vazquez Rosario
CloverCARE Support
CloverETL | Rapid Data Integration

Visit us online at http://www.cloveretl.com
How to speed up communication with CloverCARE support

hewills
Posts: 9
Joined: Tue Apr 18, 2017 4:18 pm
Location: CA

Re: Using allocation with DBInputTable

Postby hewills » Fri Apr 21, 2017 6:07 pm

Hi, thanks for taking a look.

How many records are you processing?
~ 3,000,000

How long does this process usually take?
7.5 minutes

What Database Server are you utilizing (vendor, version)?
Oracle 12c database, on Red Hat 7.2
Clover Server is running on CentOS
I don't have access to the source server that I'm pulling data from, but it is probably similar.

Can you please provide me with your graph, I would like to look over your use case as well are how you configured it (please remove any sensitive data). See attached.

allocation_example.grf
(9.81 KiB) Downloaded 6 times

vazquezrosariop
Posts: 92
Joined: Mon Feb 29, 2016 5:33 pm

Re: Using allocation with DBInputTable

Postby vazquezrosariop » Fri Apr 21, 2017 8:18 pm

Hi hewills,

After looking over your graph, I have a few suggestions that will improve the overall performance:

  • I would recommend using a Normalizer component to be able to create multiple queries
  • I would recommend removing the ParallelGather and adding a ParallelPartition before the DBInputTable
  • The ParallelPartition will distribute incoming data records among different workers (please read more information here)

I have taken the liberty of adding a simple graph that simulates your use case

Edit: I changed the Query URL processing type to Discrete on the DBInputTable

send_optimization_data.grf
(4.75 KiB) Downloaded 6 times
---
Pedro Vazquez Rosario
CloverCARE Support
CloverETL | Rapid Data Integration

Visit us online at http://www.cloveretl.com
How to speed up communication with CloverCARE support

svecp
Posts: 25
Joined: Wed Nov 09, 2016 11:51 pm
Location: 2111 Wilson Blvd., Arlington VA 22201
Contact:

Re: Using allocation with DBInputTable

Postby svecp » Tue Apr 25, 2017 2:28 pm

There might be a major performance improvement in case that particular table is partitioned. You can query each partition separately, using "PARTITION" clause and instead of using ROW_NUMBER() you can select the whole partition. That should lower DB overhead considerably. See https://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_10002.htm#SQLRF01702.

But that really depends on your use case and DB structure.
--
Pavel Švec | CloverETL | Sales Engineer | 2111 Wilson Blvd | Suite 320 | Arlington, VA 22201

hewills
Posts: 9
Joined: Tue Apr 18, 2017 4:18 pm
Location: CA

Re: Using allocation with DBInputTable

Postby hewills » Tue May 02, 2017 5:02 pm

Thanks Pedro! I had a chance to use your design, and it sped up the graph as expected. As you said, using the 'ParallelPartition' seems to be key.
Using 'PARTITION' wouldn't be an option for the database we are currently working with. But I'll keep it in mind for the future.


cron