Wednesday, March 7, 2012

Indexes on Joined Columns

I have the following select statement used to create a view:
select
st.col1,
st.col2
from
FirstTable ft
inner join
SecondTable st
on
ft.coluid = st.colfid
and
ft.colkey = st.colkey
My statistics are up to date.
On FirstTable I have a clustered (Unique) primary key comprised of a
composite of coluid and colkey (in that order) having 64,000 rows. Column
coluid has a selectivity ratio of 0.94 and colkey has a selectivity ratio of
0.
On SecondTable I have a clustered (non unique) primary key comprised of a
composite of colfid and colkey (in that order) having 168,000 rows. Column
colfid has a selectivity ratio of 0.35 and colkey has a selectivity ratio of
0.
The two columns returned (st.col1, st.col2) from the select statement are no
t
indexed.
I was under the impression from various readings that having indexes on join
s
are beneficial, but in testing, when I have the clustered indexes, versus
dropping the indexes and running the select statement, there is no differenc
e
in the output of statistics io.
Also the execution plan states that each table receives a clustered index
scan.
The only benefit I see from having indexes on the joins is in CPU time, with
indexes 187 ms., without 375 ms.
Is that the benefit (reduced CPU, but no reduction of Logical Reads) spoken
of when I read that indexes on joined columns are beneficial? Also, wouldn't
my two columns returned be covered by the clustered index, if the index were
ever used?
Message posted via droptable.com
http://www.droptable.com/Uwe/Forum...server/200605/1Um, basically, you want the entire contents of both tables. How about
filtering in some way? Also. if you are using SQL Server 2000, try adding
clustered index on (st.colfid, st.colkey, st.col1, st.col2). If you have
SQL 2005, create the nonclustered on (st.col1, st.col2) with included
columns on (st.colfid, st.colkey).
Tom
----
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Toronto, ON Canada
"cbrichards via droptable.com" <u3288@.uwe> wrote in message
news:60c001bcd2645@.uwe...
I have the following select statement used to create a view:
select
st.col1,
st.col2
from
FirstTable ft
inner join
SecondTable st
on
ft.coluid = st.colfid
and
ft.colkey = st.colkey
My statistics are up to date.
On FirstTable I have a clustered (Unique) primary key comprised of a
composite of coluid and colkey (in that order) having 64,000 rows. Column
coluid has a selectivity ratio of 0.94 and colkey has a selectivity ratio of
0.
On SecondTable I have a clustered (non unique) primary key comprised of a
composite of colfid and colkey (in that order) having 168,000 rows. Column
colfid has a selectivity ratio of 0.35 and colkey has a selectivity ratio of
0.
The two columns returned (st.col1, st.col2) from the select statement are
not
indexed.
I was under the impression from various readings that having indexes on
joins
are beneficial, but in testing, when I have the clustered indexes, versus
dropping the indexes and running the select statement, there is no
difference
in the output of statistics io.
Also the execution plan states that each table receives a clustered index
scan.
The only benefit I see from having indexes on the joins is in CPU time, with
indexes 187 ms., without 375 ms.
Is that the benefit (reduced CPU, but no reduction of Logical Reads) spoken
of when I read that indexes on joined columns are beneficial? Also, wouldn't
my two columns returned be covered by the clustered index, if the index were
ever used?
Message posted via droptable.com
http://www.droptable.com/Uwe/Forum...server/200605/1|||A clustered index scan is a table scan, so that is no surprise.
You cannot use statistics IO to tell which is the better plan. As you have
noticed, the IO is the same, but the CPU is different. What kind of join
does the plan say is being performed? Without indexes, my guess is that
you're getting a hash join, which does a single pass through each table
while it builds and analyzes hash lists. So the IO is only one pass through
each table. But the extra work of building and examining the hash lists, but
the extra memory required to stored the hashed information, can make that
type of join very resource intensive.
HTH
Kalen Delaney, SQL Server MVP
"cbrichards via droptable.com" <u3288@.uwe> wrote in message
news:60c001bcd2645@.uwe...
>I have the following select statement used to create a view:
> select
> st.col1,
> st.col2
> from
> FirstTable ft
> inner join
> SecondTable st
> on
> ft.coluid = st.colfid
> and
> ft.colkey = st.colkey
> My statistics are up to date.
> On FirstTable I have a clustered (Unique) primary key comprised of a
> composite of coluid and colkey (in that order) having 64,000 rows. Column
> coluid has a selectivity ratio of 0.94 and colkey has a selectivity ratio
> of
> 0.
> On SecondTable I have a clustered (non unique) primary key comprised of a
> composite of colfid and colkey (in that order) having 168,000 rows. Column
> colfid has a selectivity ratio of 0.35 and colkey has a selectivity ratio
> of
> 0.
> The two columns returned (st.col1, st.col2) from the select statement are
> not
> indexed.
> I was under the impression from various readings that having indexes on
> joins
> are beneficial, but in testing, when I have the clustered indexes, versus
> dropping the indexes and running the select statement, there is no
> difference
> in the output of statistics io.
> Also the execution plan states that each table receives a clustered index
> scan.
> The only benefit I see from having indexes on the joins is in CPU time,
> with
> indexes 187 ms., without 375 ms.
> Is that the benefit (reduced CPU, but no reduction of Logical Reads)
> spoken
> of when I read that indexes on joined columns are beneficial? Also,
> wouldn't
> my two columns returned be covered by the clustered index, if the index
> were
> ever used?
> --
> Message posted via droptable.com
> http://www.droptable.com/Uwe/Forum...server/200605/1|||I follow your logic in SQL Server 2000, of adding a clustered index on (st.
colfid, st.colkey, st.col1, st.col2). But I am not sure about your statemen
t
regarding SQL 2005. Are you saying create a nonclustered index on (st.col1,
st.col2, st.colfid, st.colkey)? If so, what is the difference between the
indexing from SQL 2000 to SQL 2005 that would make you want to cluster in SQ
L
2000 and adding an additional nonclustered index in SQL 2005?
Tom Moreau wrote:
>Also. if you are using SQL Server 2000, try adding
>clustered index on (st.colfid, st.colkey, st.col1, st.col2). If you have
>SQL 2005, create the nonclustered on (st.col1, st.col2) with included
>columns on (st.colfid, st.colkey).
Message posted via droptable.com
http://www.droptable.com/Uwe/Forum...server/200605/1|||I am receiving a merge join, with clustered index scans when the clustered
indexes are in use. Additionally I can change the clustered index scan for
the node representing FirstTable when I apply the filter "where ft.coluid >
0".
When the clustered indexes are removed I have table scans and a hash join.
Kalen Delaney wrote:[vbcol=seagreen]
>A clustered index scan is a table scan, so that is no surprise.
>You cannot use statistics IO to tell which is the better plan. As you have
>noticed, the IO is the same, but the CPU is different. What kind of join
>does the plan say is being performed? Without indexes, my guess is that
>you're getting a hash join, which does a single pass through each table
>while it builds and analyzes hash lists. So the IO is only one pass through
>each table. But the extra work of building and examining the hash lists, bu
t
>the extra memory required to stored the hashed information, can make that
>type of join very resource intensive.
>
>[quoted text clipped - 49 lines]
Message posted via http://www.droptable.com|||Sorry, I wasn't clear. I can change the clustered index scan for the node
representing FirstTable when I apply the filter "where ft.coluid 0" to a
clustered index seek.
cbrichards wrote:[vbcol=seagreen]
>I am receiving a merge join, with clustered index scans when the clustered
>indexes are in use. Additionally I can change the clustered index scan for
>the node representing FirstTable when I apply the filter "where ft.coluid >
>0".
>When the clustered indexes are removed I have table scans and a hash join.
>
>[quoted text clipped - 11 lines]
Message posted via droptable.com
http://www.droptable.com/Uwe/Forum...server/200605/1|||In SQL 2005, you have the option of "included columns". This makes covering
indexes more efficient, since the included columns exist only in the leaf
pages of the index. Basically, make the criteria from your JOIN and WHERE
clauses to be your index keys. Then make the included columns to be any
other columns from your SELECT list that don't already exist in your index
keys.
Tom
----
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Toronto, ON Canada
.
"cbrichards via droptable.com" <u3288@.uwe> wrote in message
news:60c845387d9c4@.uwe...
I follow your logic in SQL Server 2000, of adding a clustered index on (st.
colfid, st.colkey, st.col1, st.col2). But I am not sure about your
statement
regarding SQL 2005. Are you saying create a nonclustered index on (st.col1,
st.col2, st.colfid, st.colkey)? If so, what is the difference between the
indexing from SQL 2000 to SQL 2005 that would make you want to cluster in
SQL
2000 and adding an additional nonclustered index in SQL 2005?
Tom Moreau wrote:
>Also. if you are using SQL Server 2000, try adding
>clustered index on (st.colfid, st.colkey, st.col1, st.col2). If you have
>SQL 2005, create the nonclustered on (st.col1, st.col2) with included
>columns on (st.colfid, st.colkey).
Message posted via droptable.com
http://www.droptable.com/Uwe/Forum...server/200605/1

No comments:

Post a Comment