开发者

Which searching method with my MYSQL db

Im making a (small) site in php 开发者_高级运维& mysql. The mysql database consists of a single table with possibly tens of thousands of rows (in the future).

I only need to get results from one column which has a character limit of 200.

Considering that I don't want any stopwords or "common" word features; and the column contains filenames (some with special characters), what type of search would you recommend?

EDIT: Just some more clarification on filenames:

I don't want to do an exact match; say I have a name like this: [abc]_random_file.txt; searching for "abc random" OR "[abc] random" OR "abc txt" should all match. Hope that makes sense.

I've considered:

1----SQL like statement

2----SQL regexps (as shown on this website: http://www.iamcal.com/publish/articles/php/search/ )

3----Mysql Full-Text Search(MyISAM)

4----Third party search engines(really don't want to do this)

With #2 I can probably get the results I want, however I don't think it would work well with my table if it gets big?

Would appreciate any help; im a real beginner to all this and I've been googling all day :(


What do you need to search? Optimizations can often be found when you take in account the limitations of your searches. For instance, if you only need to find all the rows that start with a specific text, you can add a simple index and use LIKE, which will give instant results.

In general, if a simple equality or LIKE will do, those will probably perform best. A regexp will anyway do a full scan. However even if there were a 100,000 rows, it would still take up mere 20MB in RAM, so a full scan through all that will not be very slow (unless you're on an ancient server).

I'd say - try the simple approach first and see what happens. If the performance reaches unacceptable levels, you should be easily able to change your approach. After all - you did state that the website was small.

Added: Just read your update. Sounds like a job for MySQL's fulltext index. Try it to see if it works for you.


  1. SQL LIKE does not scale(Flickr also concluded)
  2. I am also affraid LIKE will fail on load.
  3. I think you should try to use this. I think this page might help you achieve your goal => http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html

'apple*'

Find rows that contain words such as “apple”, “apples”, “applesauce”, or “applet”.

  1. Even better would be 3rd-party.

I don't understand the reason for your rejection. I would probably use something third-party because no big site uses MySQL FULL TEXT for there search:

  • Also elasticsearch is a very nice product you should consider. I
  • Plurk for example implemented search via Sphinx in about one week => http://amix.dk/blog/post/19441

http://www.quora.com/Why-dont-any-of-the-popular-NoSQL-solutions-include-full-text-search

As someone who deploys and administers both full-text search and NoSQL servers for a good sized web site, here's my 2 cents:

Full-Text search is not an easy problem. It's fairly well understood, but it takes a fair amount of engineering effort and specialized knowledge to do really, really well and get all the details right. And nobody wants a half-assed search system built into their product--or one with weird limitations.

MySQL had full-text search that wasn't bad, but it was really only implemented in the MyISAM storage engine. But all the cool kids use InnoDB these days, so it's mostly useless or requires duplication of data. Yuck.

Personally, I'm a fan of Sphinx, which can be built into MySQL or used stand-alone, which is how we use it for Craigslist. We're also readying a MongoDB deployment that will be paired with Sphinx for full-text search. So you might think that I can't wait until 10gen implements full-text search into MongoDB, right?

Not really.

I subscribe to the Unix philosophy of having smaller tools that do fewer things but do them really well. What I've suggested to Dwight (and maybe Eliot too?) is that MongoDB should have a standard interface for plugging in various full-text search solutions. That is, it needs a way to notify external tools of new/changed/deleted documents so they can be handled appropriately. (Given MongoDB's oplog, this seems doable at some level.)

Now I know full well that Riak has an answer already. And I know that 10gen will build it into MongoDB anyway. As a provider of technology, they feel strongly that folks want a "complete" solution that works out of the box for a lot of people. But it (full-text) is never going to be the sort of thing that ships in version 1.0 of these products either. It's simply not a "core" feature.

It's also worth noting that from a performance and tuning point of view, search is different that document (or record) retrieval, which can complicate deployment matters too.


Did you tried Sphinx? refer this :http://www.ibm.com/developerworks/library/os-php-sphinxsearch/

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜