PHP标签Tag的设计模式

没有太多时间进行全文翻译,就把重点挑出来,用自己的话串起来,名曰 选择性翻译。 以后可能会比较多的采用这种方式。

  社会书签的tag存储一直是一个比较麻烦的问题。

  一个好的数据表设计,不但要能准确查出tag,还应该支持tag的AND/OR/NOT查询。我们来看看解决方案。

  “MySQLicious” solution

  表结构

  存储实例

  Intersection (AND)

  “search+webservice+semweb”类的查询:

  SELECT *

  FROM `delicious`

  WHERE tags LIKE "%search%"

  AND tags LIKE "%webservice%"

  AND tags LIKE "%semweb%"

  Union (OR)

  “search|webservice|semweb”类的查询:

  SELECT *

  FROM `delicious`

  WHERE tags LIKE "%search%"

  OR tags LIKE "%webservice%"

  OR tags LIKE "%semweb%"

  Minus

  “search+webservice-semweb”类的查询

  SELECT *

  FROM `delicious`

  WHERE tags LIKE "%search%"

  AND tags LIKE "%webservice%"

  AND tags NOT LIKE "%semweb%"

  优点:

  只有一个表 SQL比较直接

  可以用mysql的全文检索来做,效率更高

  缺点:

  tag的数量受到限制,通常我们都用varchar,这种字段只256个字节长。否则,你需要用text类型,速度会变慢。(Easy注,PHP(PHP培训 php教程 )more的tag用的就是TinyText)

  Like ‘%things%’不精确,当然某些应用中,这反而是需要的

  “Scuttle” solution

  数据表

  Intersection (AND)

  Query for “bookmark+webservice+semweb”:

  SELECT b.*

  FROM scBookmarks b, scCategories c

  WHERE c.bId = b.bId

  AND (c.category IN (’bookmark’, ‘webservice’, ’semweb’))

  GROUP BY b.bId

  HAVING COUNT( b.bId )=3

  首先,所有书签-tag组合被搜出来 (c.category IN (‘bookmark‘, ‘webservice‘, ‘semweb‘)), ,然后选择其中包含三个的(HAVING COUNT(b.bId)=3)

  Union (OR)

  Query for “bookmark|webservice|semweb”:

  只需要去掉?AND查询中的HAVING子句:

  SELECT b.*

  FROM scBookmarks b, scCategories c

  WHERE c.bId = b.bId

  AND (c.category IN (’bookmark’, ‘webservice’, ’semweb’))

  GROUP BY b.bId

  Minus (Exclusion)

  Query for “bookmark+webservice-semweb”, that is: bookmark AND webservice AND NOT semweb.

  SELECT b. *

  FROM scBookmarks b, scCategories c

  WHERE b.bId = c.bId

  AND (c.category IN (’bookmark’, ‘webservice’))

  AND b.bId NOT

  IN (SELECT b.bId FROM scBookmarks b, scCategories c WHERE b.bId = c.bId AND c.category = ’semweb’)

  GROUP BY b.bId

  HAVING COUNT( b.bId ) =2

  好处: 我觉得这个方案比前一个方案好的最大理由是,可以有无限个tag。

  “Toxi” solution

  数据表

  Intersection (AND)

  Query for “bookmark+webservice+semweb”

  SELECT b.*

  FROM tagmap bt, bookmark b, tag t

  WHERE bt.tag_id = t.tag_id

  AND (t.name IN (’bookmark’, ‘webservice’, ’semweb’))

  AND b.id = bt.bookmark_id

  GROUP BY b.id

  HAVING COUNT( b.id )=3

  Union (OR)

  Query for “bookmark|webservice|semweb”

  SELECT b.*

  FROM tagmap bt, bookmark b, tag t

  WHERE bt.tag_id = t.tag_id

  AND (t.name IN (’bookmark’, ‘webservice’, ’semweb’))

  AND b.id = bt.bookmark_id

  GROUP BY b.id

  Minus (Exclusion)

  Query for “bookmark+webservice-semweb”, that is: bookmark AND webservice AND NOT semweb.

  SELECT b. *

  FROM bookmark b, tagmap bt, tag t

  WHERE b.id = bt.bookmark_id

  AND bt.tag_id = t.tag_id

  AND (t.name IN (’Programming’, ‘Algorithms’))

  AND b.id NOT IN (SELECT b.id FROM bookmark b, tagmap bt, tag t WHERE b.id = bt.bookmark_id AND bt.tag_id = t.tag_id AND t.name = ‘Python’)

  GROUP BY b.id

  HAVING COUNT( b.id ) =2

  Leaving out theHAVING COUNTleads to the Query for “bookmark|webservice-semweb”.

  好处:

  你可以给每个tag添加额外的信息

  这是最规范的方案,第三范式。

  坏处:

  删除tag时,你要从多个表中删除(Easy注,Mysql5的话,可以用trigger来做)

  然后我们把视线从功能转移到性能上。

  A+B

  250个tag

  999个tag

  A OR B

  250个tag

  添加速度比较

  测试代码下载?Download the source code (PHP) LGPL协议。

郑重声明:本站内容如果来自互联网及其他传播媒体,其版权均属原媒体及文章作者所有。转载目的在于传递更多信息及用于网络分享,并不代表本站赞同其观点和对其真实性负责,也不构成任何其他建议。