Which is the best search technique to search records? [closed]

I have 10,000,000 records. Which will be the best technique to search the records? Currently I am using full-text search, but it is slow. Please suggest.


There is no one-size-fits-all solution but you can try out:

Sphinx

How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.

Sphinx is a full-text search engine, distributed under GPL version 2. Commercial license is also available for embedded use.

Generally, it's a standalone search engine, meant to provide fast, size-efficient and relevant fulltext search functions to other applications. Sphinx was specially designed to integrate well with SQL databases and scripting languages. Currently built-in data sources support fetching data either via direct connection to MySQL or PostgreSQL, or using XML pipe mechanism (a pipe to indexer in special XML-based format which Sphinx recognizes).

As for the name, Sphinx is an acronym which is officially decoded as SQL Phrase Index. Yes, I know about CMU's Sphinx project.

http://www.sphinxsearch.com/
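For illustration, here is a minimal sketch of querying Sphinx from PHP over its SphinxQL (MySQL-protocol) interface. It assumes searchd is listening on the default port 9306 and that an index named records_idx (a made-up name) has already been built from your records table with the indexer tool:

```php
<?php
// Minimal sketch: querying Sphinx from PHP over its SphinxQL (MySQL-protocol) interface.
// Assumptions: searchd is listening on localhost:9306 and an index named
// "records_idx" (made-up name) was already built from the records table with indexer.

$sphinx = new mysqli('127.0.0.1', '', '', '', 9306);
if ($sphinx->connect_error) {
    die('Cannot reach searchd: ' . $sphinx->connect_error);
}

$term = $sphinx->real_escape_string('some search phrase');

// MATCH() runs the full-text query against the index; only document ids come back.
$result = $sphinx->query("SELECT id FROM records_idx WHERE MATCH('$term') LIMIT 20");

$ids = array();
while ($row = $result->fetch_assoc()) {
    $ids[] = (int) $row['id'];
}

// Fetch the full rows for these ids from the real MySQL table afterwards -
// a cheap primary-key lookup even on a 10M-row table.
print_r($ids);
```

The design point is that Sphinx only returns matching document ids; the heavy full-text work happens in its own index, and MySQL is left doing cheap primary-key lookups.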

Lucene PHP (Part of Zend Framework):

Zend_Search_Lucene is a general purpose text search engine written entirely in PHP 5. Since it stores its index on the filesystem and does not require a database server, it can add search capabilities to almost any PHP-driven website. Zend_Search_Lucene supports the following features:

  • Ranked searching - best results returned first
  • Many powerful query types: phrase queries, boolean queries, wildcard queries, proximity queries, range queries and many others.
  • Search by specific field (e.g., title, author, contents)

http://framework.zend.com/ http://framework.zend.com/manual/en/zend.search.lucene.overview.html
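A minimal indexing-and-search sketch with Zend_Search_Lucene, assuming Zend Framework 1 is on the include path and that ./search_index is a writable directory; the field names are illustrative only:

```php
<?php
// Minimal sketch of indexing and searching with Zend_Search_Lucene (ZF1).
// Assumptions: Zend Framework 1 is on the include path and ./search_index is
// a writable directory; the field names below are illustrative only.
require_once 'Zend/Search/Lucene.php';

// Build the index (re-run this part whenever the records change).
$index = Zend_Search_Lucene::create('./search_index');

$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::Keyword('record_id', '42'));
$doc->addField(Zend_Search_Lucene_Field::Text('title', 'Example record title'));
$doc->addField(Zend_Search_Lucene_Field::UnStored('contents', 'Full body text of the record'));
$index->addDocument($doc);
$index->commit();

// Search it later: hits come back ranked, best results first.
$index = Zend_Search_Lucene::open('./search_index');
$hits  = $index->find('title:Example');
foreach ($hits as $hit) {
    echo $hit->record_id, ' score=', $hit->score, "\n";
}
```

Keep in mind it is pure PHP: convenient because it needs no extra server, but for 10 million records a standalone engine like Sphinx or Solr will usually index and query faster.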


It depends on several simple questions:

  • What kind of data is processed? (Simple entries like "Firstname, Lastname" or more complex datasets?)
  • How is it structured? (A plain database table? Partitioned?)
  • What do you search for? (e.g., names in a telephone directory)


Because I haven't worked with datasets as large as this, here are some ideas that may work.

The first question is whether these records are static (GeoIP data, for example) or not.

  • I'd optimize the database as much as I can (try using EXPLAIN if you're using MySQL).
  • Look at every kind of query that could be run and optimize your database against those queries.
  • If the indexes are fine, I'd add some kind of cache that saves previous result sets, as sketched below. This is handy when your database isn't updated regularly.
  • You can run the job above via cron (for example, the most-used search queries and their results can be pre-cached too).
  • Adapt these ideas to your needs.

If you can provide some more details, maybe I can refine my tips.
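To make the EXPLAIN and caching points concrete, here is a rough sketch. It assumes (none of this comes from the question) a MyISAM table named records with a FULLTEXT index on a body column, PDO's MySQL driver, and a Memcached server on localhost:

```php
<?php
// Rough sketch of "check the plan, then cache the result set".
// Assumptions (not from the question): a MyISAM table `records` with a FULLTEXT
// index on `body`, PDO's MySQL driver, and a Memcached server on localhost:11211.
$pdo   = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8', 'user', 'pass');
$cache = new Memcached();
$cache->addServer('localhost', 11211);

$term = 'some search phrase';
$sql  = "SELECT id, title FROM records
         WHERE MATCH(body) AGAINST (:term IN BOOLEAN MODE)
         LIMIT 50";

// 1. During development, check the plan to confirm the FULLTEXT index is used.
$plan = $pdo->query('EXPLAIN ' . str_replace(':term', $pdo->quote($term), $sql));
print_r($plan->fetchAll(PDO::FETCH_ASSOC));

// 2. In production, serve repeated searches from the cache
//    (useful when the table is rarely updated).
$key  = 'search:' . md5($term);
$rows = $cache->get($key);
if ($rows === false) {
    $stmt = $pdo->prepare($sql);
    $stmt->execute(array(':term' => $term));
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    $cache->set($key, $rows, 600);   // keep for 10 minutes
}
print_r($rows);
```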


Use Solr. It's Lucene with some additions, easily accessible over HTTP. It's blazing fast in comparison to any full-text search done in MySQL.
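A rough sketch of querying Solr from PHP over plain HTTP, assuming a core reachable at http://localhost:8983/solr/records (a made-up core name) that has already been populated, for example via a DataImportHandler pulling from the MySQL table:

```php
<?php
// Rough sketch of querying Solr from PHP over plain HTTP.
// Assumptions: a Solr core at http://localhost:8983/solr/records (made-up core
// name) has already been filled, e.g. via DataImportHandler from the MySQL table.
$term = 'some search phrase';
$url  = 'http://localhost:8983/solr/records/select?' . http_build_query(array(
    'q'    => $term,
    'rows' => 20,
    'wt'   => 'json',   // ask Solr for a JSON response
));

$response = json_decode(file_get_contents($url), true);

// Matching documents are ranked by relevance, best first.
foreach ($response['response']['docs'] as $doc) {
    echo $doc['id'], "\n";
}
```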
