By Jessica Qiu
May 26, 2014 12:03 PM EDT
Recently, we handled an industry project that required importing a large amount of data from files into Oracle in a comparatively short time.
At first, we imported the data with Oracle's sqlldr, only to find that loading a large volume this way is surprisingly time-consuming: a table of 80 million records took 2.5 hours to import, which was too slow.
Later, by running sqlldr imports in parallel, we shortened the time to 0.8 hours. Here are the full details:
Train of thought
Split the data file to be imported into 10 shares. Then, using multi-task parallelism, execute one sqlldr command for each share. This requires preparing the same number of control files, one per share. Multiple clients then import data into the database at the same time.
Two things need attention: 1. How to generate multiple sqlldr commands and the corresponding control files, which is tedious if they are written one by one; 2. How to run them in parallel, which is even more tedious if they are launched one by one.
In this case, we use a tool named esProc to generate the commands and control files automatically and then run them in parallel.
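The splitting step can be sketched in Python. This is only an illustration, not the esProc script used in the project: the function name `split_file`, the `part_NN.dat` naming, and the round-robin strategy are all assumptions, and the sketch assumes one record per line.

```python
import os


def split_file(src_path, n_parts, out_dir):
    """Split src_path line by line into n_parts files, round-robin.

    Returns the list of part file paths. Round-robin keeps the parts
    balanced without having to count the lines first.
    """
    os.makedirs(out_dir, exist_ok=True)
    # Hypothetical naming scheme: part_01.dat, part_02.dat, ...
    part_paths = [
        os.path.join(out_dir, "part_%02d.dat" % (i + 1)) for i in range(n_parts)
    ]
    outputs = [open(p, "w", encoding="utf-8", newline="") for p in part_paths]
    try:
        with open(src_path, "r", encoding="utf-8", newline="") as src:
            for line_no, line in enumerate(src):
                # Line k goes to part k mod n_parts.
                outputs[line_no % n_parts].write(line)
    finally:
        for f in outputs:
            f.close()
    return part_paths
```

Calling `split_file("data.txt", 10, "parts")` would produce the 10 shares described above. Round-robin does not keep neighboring records together, which is fine for a bulk load where row order does not matter.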
Main program: responsible for task control, task distribution, and calling the sub-program.
Sub-program: generates the specific control file and sqlldr command, and executes the import command to complete the data loading.
Note: In this case, the parallelism feature of esProc is used to execute multiple sqlldr commands, and the system function is used to call system commands.
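For readers without esProc, the sub-program's job of producing one control file and one sqlldr command per share might look roughly like this in Python. This is a hedged sketch, not the author's script: the function names, the table name, the column list, and the comma delimiter are placeholders, and the original does not say which sqlldr options were actually used. Only the basic `userid`, `control`, and `log` parameters appear here.

```python
def control_file_text(data_path, table, columns, delimiter=","):
    """Return the text of a minimal SQL*Loader control file for one share.

    table, columns and delimiter are illustrative; the real values depend
    on the target table and the layout of the data file.
    """
    return (
        "LOAD DATA\n"
        "INFILE '%s'\n"
        "APPEND\n"
        "INTO TABLE %s\n"
        "FIELDS TERMINATED BY '%s'\n"
        "(%s)\n"
    ) % (data_path, table, delimiter, ", ".join(columns))


def sqlldr_command(userid, control_path, log_path):
    """Return the sqlldr argument list for one control file."""
    return [
        "sqlldr",
        "userid=" + userid,
        "control=" + control_path,
        "log=" + log_path,
    ]
```

Each share gets its own control file written from `control_file_text(...)` and its own command from `sqlldr_command(...)`, so nothing has to be written by hand.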
Because the parallel tasks are controlled programmatically, the number of parallel tasks can be set as needed to make full use of the machine's performance.
The figure below shows the sqlldr import speed at different degrees of parallelism. The increase is roughly linear: the more parallel tasks, the faster the import.
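A configurable degree of parallelism can be sketched in Python with a thread pool that launches external commands. This stands in for esProc's parallel feature and its system call and is not the original implementation; `run_parallel` and `max_workers` are names chosen for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, max_workers):
    """Run each command as a child process, at most max_workers at a time.

    Returns the exit codes in the same order as commands.
    """
    def run_one(cmd):
        # Each worker thread just waits on its child process, so threads
        # are sufficient here; the heavy work happens in the child.
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))
```

Passing a list of sqlldr argument lists together with `max_workers=10` would run ten imports at once. Changing that single number is all it takes to try a different degree of parallelism on a given machine.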