#1
SET Datatype Performance Problems
I have a small performance problem using the SET datatype.

I have a table with over 800,000 rows in which 34 columns are enum(0,1) and describe various categories. A search for a certain category or categories looks like:

Select count(id) from my_table where cat1=1 and cat10=1 and cat25=1;

Suppose cat1 is indexed and cat10 is not. Then

Select count(id) from my_table where cat1=1;

returns data in less than a second, while

Select count(id) from my_table where cat10=1;

might take over 5-6 seconds to return the value.

I can't index all 34 category columns because of the limit on how many indexes I can create (other fields besides the categories also require an index), so I looked into how to do this more efficiently. I came up with the idea of using the SET datatype and created a test table:

CREATE TABLE set_test(
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  cats SET('cat1','cat2','cat3', 'cat34')
);

I loaded it with 800,000 values from my main table, and searches like

Select count(id) from set_test where cats&1;

return the count in less than a second. So far so good.

So I added a new cats column to the main table, containing the same info I had in the set_test table. To my surprise,

Select count(id) from my_table where cats&1;

takes over 6 seconds to return the value. Of course, the table has 45 columns (including the old 34 category columns) instead of two, but does that really make so much of a difference? If so, what can I do to get decent performance?

Thanks in advance,
Juan
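For readers following along, here is a minimal sketch of what `cats&1` tests, and of how the three-category search from the post could be written against a bitmask. It assumes that each SET member maps to one bit in declaration order (cat1 is bit 0, so its value is 1), which is how MySQL stores SET values. SQLite with a plain INTEGER column stands in for MySQL's SET type here, and the four-member category list, the sample rows, and the helper names `mask_for`, `count_any`, and `count_all` are all hypothetical.

```python
import sqlite3

# Hypothetical category list; the SET definition in the post is abbreviated,
# so only four illustrative members are used here.
CATEGORIES = ["cat1", "cat2", "cat3", "cat4"]


def mask_for(names):
    """Return the integer bitmask for the given category names (cat1 -> bit 0)."""
    mask = 0
    for name in names:
        mask |= 1 << CATEGORIES.index(name)
    return mask


# Assumption: an INTEGER column emulates the SET column from the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE set_test (id INTEGER PRIMARY KEY, cats INTEGER NOT NULL)"
)
sample_rows = [
    (mask_for(["cat1"]),),
    (mask_for(["cat1", "cat3"]),),
    (mask_for(["cat2"]),),
    (mask_for(["cat1", "cat2", "cat3"]),),
]
conn.executemany("INSERT INTO set_test (cats) VALUES (?)", sample_rows)


def count_any(names):
    """Count rows that have at least one of the named categories set."""
    mask = mask_for(names)
    sql = "SELECT COUNT(id) FROM set_test WHERE (cats & ?) != 0"
    return conn.execute(sql, (mask,)).fetchone()[0]


def count_all(names):
    """Count rows that have every one of the named categories set."""
    mask = mask_for(names)
    sql = "SELECT COUNT(id) FROM set_test WHERE (cats & ?) = ?"
    return conn.execute(sql, (mask, mask)).fetchone()[0]
```

With a single category the two forms agree, so `cats&1` is the "any" form for cat1. For a search such as cat1 and cat10 and cat25, the "all" form is the one that matches the original AND query: build the combined mask and compare the masked value against the mask itself.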
#2
Sorry, I don't have a solution to offer you, just some commentary.

Wow, you've done some heavy research on this issue. I like your technique of "just try it and time the performance". Although this will probably reduce your performance problem, why do you have 34 fields in the table for categories? My broader question is: what happens if you need to add a 35th category?

I've been reading up on this SET datatype and it sounds interesting; I've never used it before. The documentation does mention that a SET causes the table to be unnormalized, and that you can't index a set.
#3
Depending on the specifics of your data, another design option is:
Code:
TABLE1
  id
  name
  foo
  bar
TABLE2
  table1_id
  category
Then you would insert something like:
Code:
INSERT INTO table1 ( name, foo, bar ) VALUES ( 'myname', 'myfoo', 'mybar');
INSERT INTO table2 VALUES (table1_id,'cat1'), (table1_id,'cat5'), (table1_id,'cat20');
if only cats 1, 5, and 20 are positives (table1_id stands for the id of the row just inserted into table1). Then just join the tables on table1.id = table2.table1_id to get only the positive results.
__________________
"A pawn is the most important piece on the chessboard -- to a pawn"
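The two-table design above could be sketched as follows. SQLite stands in for MySQL so the example runs on its own; the column types, the composite primary key on table2, and the helper names `add_item` and `count_with_all` are assumptions added for illustration and are not part of the post. The GROUP BY / HAVING query is one common way to find rows that have every requested category, not necessarily the query the poster had in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (
        id   INTEGER PRIMARY KEY,
        name TEXT,
        foo  TEXT,
        bar  TEXT
    );
    -- One row per (item, category) pair; only positive categories are stored.
    -- Assumption: the composite key leads with category so lookups by
    -- category can use it, and it also prevents duplicate pairs.
    CREATE TABLE table2 (
        table1_id INTEGER NOT NULL REFERENCES table1(id),
        category  TEXT NOT NULL,
        PRIMARY KEY (category, table1_id)
    );
    """
)


def add_item(name, foo, bar, categories):
    """Insert one table1 row plus one table2 row per positive category."""
    cur = conn.execute(
        "INSERT INTO table1 (name, foo, bar) VALUES (?, ?, ?)",
        (name, foo, bar),
    )
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO table2 (table1_id, category) VALUES (?, ?)",
        [(item_id, category) for category in categories],
    )
    return item_id


def count_with_all(categories):
    """Count table1 rows that have every one of the given categories."""
    wanted = sorted(set(categories))  # de-duplicate so the HAVING count is right
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT table1_id
            FROM table2
            WHERE category IN ({placeholders})
            GROUP BY table1_id
            HAVING COUNT(*) = ?
        ) AS matched
    """
    return conn.execute(sql, (*wanted, len(wanted))).fetchone()[0]


# Sample data: only the first item has all of cat1, cat5, and cat20.
add_item("myname", "myfoo", "mybar", ["cat1", "cat5", "cat20"])
add_item("other", "foo2", "bar2", ["cat1", "cat5"])
add_item("third", "foo3", "bar3", ["cat20"])
```

Adding a 35th category in this layout is just another value in the category column, with no schema change, which also addresses the question raised in post #2.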
Viewing: Dev Articles Community Forums > Databases > MySQL Development > SET Datatype Performance Problems