Determining the priority of ETL workflow activities and their parallel execution

The ETL process can be treated as a data-centric workflow. This paper discusses the execution of ETL workflows and proposes an algorithm that determines the priority of each activity in the workflow. Activities that share the same priority and do not depend on each other are assigned separate threads and executed in parallel, which improves the execution efficiency of the workflow. Experimental results show that the speedup (acceleration ratio) of the parallel algorithm over the serial algorithm rises as the number of data records increases, and approaches the ideal value when the number of records is large enough.
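The idea of priority-based parallel execution can be illustrated with a minimal sketch. The abstract does not specify the priority algorithm, so the version below is an assumption: it gives each activity a priority equal to the length of its longest dependency chain, which guarantees that activities with equal priority never depend on each other. The names `assign_priorities`, `run_workflow`, `deps`, and `actions` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def assign_priorities(deps):
    """Map each activity to a priority level (assumed scheme, not the paper's).

    deps maps an activity name to the set of activities it depends on.
    An activity with no dependencies gets priority 0; otherwise its priority
    is one more than the highest priority among its dependencies.
    """
    priority = {}

    def level(node, visiting=()):
        if node in priority:
            return priority[node]
        if node in visiting:
            raise ValueError(f"cyclic dependency at {node!r}")
        preds = deps.get(node, ())
        priority[node] = 1 + max(
            (level(p, visiting + (node,)) for p in preds), default=-1
        )
        return priority[node]

    for name in deps:
        level(name)
    return priority


def run_workflow(deps, actions, max_workers=4):
    """Run activities level by level; each level runs on a thread pool.

    actions maps an activity name to a zero-argument callable.
    Returns the batches in the order they were executed.
    """
    levels = defaultdict(list)
    for name, p in assign_priorities(deps).items():
        levels[p].append(name)

    executed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for p in sorted(levels):
            batch = sorted(levels[p])
            # Consuming the map iterator blocks until the whole level
            # finishes, so the next level only starts afterwards.
            list(pool.map(lambda name: actions[name](), batch))
            executed.append(batch)
    return executed


if __name__ == "__main__":
    # Hypothetical four-activity workflow: two extracts, a join, a load.
    deps = {
        "extract_a": set(),
        "extract_b": set(),
        "join": {"extract_a", "extract_b"},
        "load": {"join"},
    }
    actions = {name: (lambda n=name: print("running", n)) for name in deps}
    print(run_workflow(deps, actions))
```

In this sketch the two extract activities share priority 0 and run concurrently, while the join and load steps wait for the preceding level to finish. Waiting at each level boundary is a simplification; a scheduler that starts an activity as soon as its own dependencies finish could expose more parallelism.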