|
|
|||||||||
|
|||||||||
|
|||||||||
| |
|||
| |||||||||
![]() |
|
|
«
Previous Thread
|
Next Thread
»
|
Thread Tools | Search this Thread | Display Modes |
|
#1
|
|||
|
|||
|
I am doing a basic SQL statement that reads from 1 file and has a simple where statement as such...
select * from table1 where date1 = '1/8/2003' Table1 has over 8 million records and an index on the field in the where statement (date1). This works lightning fast. Problem is when I join another file that currently has only one record in it as such... select * from table1, table2 where date1 = date2 Seems simple enough, and with only one record in table2... I didn't think performance would be effected too much. WRONG! The statement went from sub-second results to taking over 4 minutes. I am doing the join because table2 may have multiple records. Anybody have a clue how to beat this problem? |
|
#2
|
|||
|
|||
|
Have you tried an inner join instead?:
select * from table1 inner join table2 on date1 = date2 where date1 = '1/8/2003' Also try creating an index on the specifity field that you're after in table 2. |
|
#3
|
|||
|
|||
|
I did try an inner join and also tried doing a nested SQL statement as such
select field1 from table1 where field1 in (select fieldA from table2) but same problem. I did stumble upon a solution, but now would like to know why from anyone that may know. When I reduced table2 from 2 fields to 1 field... the join executed in under 1 second. As soon as I add a second field to table 2 again... 4-5 minutes to execute. With only 1 record in table2... I don't comprehend why the drastic difference. Can anyone explain this? |
|
#4
|
|||
|
|||
|
Quote:
To the best of my knowledge, it's because you're comparing a single entry against the 8 million in the other table, which is the same as comparing a single string against those 8 million entries. The single row-table gets treated that way, however when you have 2 rows, it then has to perform a join between the two tables regardless of how you do it. Either through an inner join or through a nested select. Meaning you're making 16-million comparisons if every entry is duplicated across both tables. You're probably getting milisecond responses for a single entry search because the larger table is indexed, but indexing doesn't help when you have to duplicate everything. It could come down to memory usage too... with 16-million entries trying to be processed, there could be a lot of disk caching going on that would slow things down. Hope this helps... and if anything I've stated is incorrect, please correct me so I'll know in the future. |
![]() |
| Viewing: Dev Articles Community Forums > Databases > Microsoft SQL Server > SQL query performance SQL Server |
| Thread Tools | Search this Thread |
| Display Modes | Rate This Thread |
|
|
|
|