Friday, March 9, 2012

Indexes slowing down BULK INSERT

I've been doing some experiments with speeding up copying tables of
approximately 1 million rows between databases using BCP and BULK INSERT.

I noticed that the total time for removing the indexes (non-clustered) and
then recreating them after the BULK INSERT was significantly less than just
doing the BULK INSERT with the indexes left there, even though I specified
TABLOCK.

I would have expected SQL Server not to update the index until the insert
completed (given the table lock) and so removing the indexes would have no
effect. Can anyone explain why removing the indexes should speed it up?

This is on SQL Server 7.

Cheers
DaveDavid Sharp (no email address supplied) writes:
> I've been doing some experiments with speeding up copying tables of
> approximately 1 million rows between databases using BCP and BULK INSERT.
> I noticed that the total time for removing the indexes (non-clustered)
> and then recreating them after the BULK INSERT was significantly less
> than just doing the BULK INSERT with the indexes left there, even though
> I specified TABLOCK.
> I would have expected SQL Server not to update the index until the insert
> completed (given the table lock) and so removing the indexes would have no
> effect. Can anyone explain why removing the indexes should speed it up?

I have not studied this case very closely. But a few observations: if
you supplied a batch size with /b, SQL Server had no choice but to
maintain the indexes while loading, since each batch is committed
separately.

When running some bulk-loading recently, I notice that when loading on
a completely unindexed table, BCP reported the copied rows swiftly, and
then completely directly, whereas on indexed tables there was a delay
from when all rows had been loaded until the command had completed,
which I supposed was spent on rebuilding indexes. I did not use TABLOCK.

I should add that I was working on SQL 2000, and the behaviour I saw
may reflect an improvement from SQL7.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||"Erland Sommarskog" <sommar@.algonet.se> wrote in message
news:Xns941B1079D8D3Yazorman@.127.0.0.1...
> David Sharp (no email address supplied) writes:
> > I've been doing some experiments with speeding up copying tables of
> > approximately 1 million rows between databases using BCP and BULK
INSERT.
> > I noticed that the total time for removing the indexes (non-clustered)
> > and then recreating them after the BULK INSERT was significantly less
> > than just doing the BULK INSERT with the indexes left there, even though
> > I specified TABLOCK.
> > I would have expected SQL Server not to update the index until the
insert
> > completed (given the table lock) and so removing the indexes would have
no
> > effect. Can anyone explain why removing the indexes should speed it up?
> I have not studied this case very closely. But a few observations: if
> you supplied a batch size with /b, SQL Server had no choice but to
> maintain the indexes while loading, since each batch is committed
> separately.

I have studied this somewhat closely. :-)

And what Erland says about the /b is a critical part of it. We on a
quarterly basis have to load a multimillion set of rows. As an experiment
recently (to reconfirm my thoughts) I did a test load on a backup server.
It took I believe 3 days to do the load. This was without removing the
non-clustered indices first.

For the actual load I I removed the non-clustered indices (but kept the
clustered index since the data is BCP'd out of another table already in
order). This took well under 12 hours. (Actually not 100% sure how long it
took since I started it around midnight and the scripts finished sometime
before 9:00 AM). This included re-applying the indices to the table.

I don't know how much of a difference there would be if the data had been
completely unordered.

> When running some bulk-loading recently, I notice that when loading on
> a completely unindexed table, BCP reported the copied rows swiftly, and
> then completely directly, whereas on indexed tables there was a delay
> from when all rows had been loaded until the command had completed,
> which I supposed was spent on rebuilding indexes. I did not use TABLOCK.
> I should add that I was working on SQL 2000, and the behaviour I saw
> may reflect an improvement from SQL7.
> --
> Erland Sommarskog, SQL Server MVP, sommar@.algonet.se
> Books Online for SQL Server SP3 at
> http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment