MySQL Development
 
#1
April 21st, 2005, 07:33 AM
redfive (Registered User, joined Apr 2005)
SET Datatype Performance Problems

I have a “little” performance problem using the SET datatype.

I have a table with over 800,000 rows in which 34 of the columns are ENUM('0','1') flags describing various categories.

So when I search for a certain category or categories the search would be like:

SELECT COUNT(id) FROM my_table WHERE cat1='1' AND cat10='1' AND cat25='1';

Say cat1 is indexed and cat10 is not. Then

SELECT COUNT(id) FROM my_table WHERE cat1='1';

returns in less than a second, while

SELECT COUNT(id) FROM my_table WHERE cat10='1';

might take 5-6 seconds or more to return the value.
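
One way to confirm that the gap comes from the index is to ask MySQL for the query plan. This is a minimal sketch that reuses the table and column names from the queries above and assumes cat1 is indexed and cat10 is not:

```sql
-- Assumes my_table has an index on cat1 and none on cat10, as described above.
EXPLAIN SELECT COUNT(id) FROM my_table WHERE cat1 = '1';
EXPLAIN SELECT COUNT(id) FROM my_table WHERE cat10 = '1';
```

In the EXPLAIN output, the key column names the index MySQL chose. NULL there, together with type ALL, means no index is used and every row is read.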

Since I can't index all 34 category columns, due to the limit on how many indexes I can create (other fields besides the categories also require an index), I looked into how to do this more efficiently. I came up with the idea of using the SET datatype. I created a test table:

CREATE TABLE set_test(
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
cats SET('cat1','cat2','cat3',……'cat34')
);


I loaded it up with 800,000 values from my main table and searches like

Select count(id) from set_test where cats&1;

Would return the count in less than a second.
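
For readers who have not used SET: MySQL stores a SET value as a bit field in which the first member has the value 1, the second 2, the third 4, and so on, so cats&1 tests the first member. The sketch below shows how other members could be tested. It assumes the 34 members are declared in order from cat1 to cat34; the definition above is abbreviated, so that ordering is an assumption.

```sql
-- cat10 would be the 10th member, so its bit value is 2^9 = 512 (assumed ordering).
SELECT COUNT(id) FROM set_test WHERE cats & 512;

-- The same test by member name, without working out the bit value.
SELECT COUNT(id) FROM set_test WHERE FIND_IN_SET('cat10', cats) > 0;

-- cat1 AND cat10 AND cat25 together: 1 + 512 + 16777216 = 16777729.
SELECT COUNT(id) FROM set_test WHERE (cats & 16777729) = 16777729;
```

Both forms evaluate an expression on the column for each row, so they are unlikely to benefit from an ordinary index and will typically scan the whole table.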

So far so good. So I added a 'cats' column to the main table, containing the same info I had in the set_test table. To my surprise:

Select count(id) from my_table where cats&1;

Would take over 6 secs to return the value.

Of course the main table has 45 columns (including the old 34 category columns) instead of two, but does it really make that much of a difference? If so, what can I do to get decent performance?


Thanks in advance,

Juan

#2
April 21st, 2005, 11:26 AM
MadCowDzz (joined Jan 2003, Toronto, Canada)
Sorry, I don't have a solution to offer you... rather some commentary...

Wow, you've done some heavy research on this issue...
I like your own technique of "just try it and time the performance".

Although this will probably decrease your performance problem, why do you have 34 fields in the table for categories?
My broader question is what happens if you need to add a 35th category?

Reading up on this SET Datatype, it sounds interesting... I've never used it before.

The documentation does mention that a SET causes the table to be unnormalized, and you can't index a set.

#3
April 21st, 2005, 12:54 PM
Madpawn (joined Dec 2004)
Depending on the specifics of your data, another design option is:

Code:
  TABLE1
  id
  name
  foo
  bar
  
  TABLE2
  table1_id
  category
  



Then you would insert something like:

Code:
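  -- Insert the parent row first.
  INSERT INTO
   table1
  (
   name,
   foo,
   bar
  ) VALUES (
   'myname',
   'myfoo',
   'mybar');
  
  -- LAST_INSERT_ID() returns the id generated by the insert above
  -- (assuming table1.id is AUTO_INCREMENT).
  INSERT INTO
   table2
   (table1_id, category)
  VALUES 
   (LAST_INSERT_ID(),'cat1'),
   (LAST_INSERT_ID(),'cat5'),
   (LAST_INSERT_ID(),'cat20');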
  


This example assumes that only categories 1, 5, and 20 apply to the row. You then join the tables on table1.id = table2.table1_id to get only the rows that have a given category.
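
A rough sketch of how searches might look with this design. The definition of table2, including the column types and the index name idx_category, is hypothetical and only illustrates one way to key the table. The derived table in the last query needs MySQL 4.1 or later.

```sql
-- Hypothetical definition of table2; types and index name are illustrative.
CREATE TABLE table2 (
  table1_id INT UNSIGNED NOT NULL,
  category  VARCHAR(10) NOT NULL,
  PRIMARY KEY (table1_id, category),  -- one row per (item, category) pair
  KEY idx_category (category)         -- lets searches by category use an index
);

-- Count the table1 rows that are in cat1.
SELECT COUNT(*) FROM table2 WHERE category = 'cat1';

-- Count the table1 rows that are in cat1 AND cat10 AND cat25.
SELECT COUNT(*) FROM (
  SELECT table1_id
  FROM table2
  WHERE category IN ('cat1', 'cat10', 'cat25')
  GROUP BY table1_id
  HAVING COUNT(*) = 3   -- all three categories present
) AS matched;
```

The HAVING COUNT(*) = 3 test relies on the primary key preventing duplicate (table1_id, category) pairs. Adding a 35th category under this design is just another value in the category column rather than a schema change.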
__________________
"A pawn is the most important piece on the chessboard -- to a pawn"


© 2003-2008 by Developer Shed. All rights reserved. DS Cluster 4 hosted by Hostway