Search engine in PHP? [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center. Closed 12 years ago.

I want to create a search engine in PHP (like Google or Ask). Please tell me how I can create it. What is the logic behind it?


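<?php
    // Not a search engine of its own: this form simply forwards
    // the query to Google's search page as the "q" parameter.
    echo '
    <form action="http://www.google.com/search" method="get">
        <input type="text" name="q" id="q" />
        <input type="submit" value="Search" />
    </form>';
?>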


There are four basic functions that a search engine must perform:

  1. Gather a list of websites to crawl.
  2. Download the content of each of those websites, and build up a mapping of "keywords" to pages.
  3. Allow users to type in keywords, and then match those keywords against the mapping you built in step #2.
  4. Display the results from step #3 in an order that is relevant to the user.
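
Steps 2 and 3 can be sketched in a few lines of PHP. This is only a minimal illustration, assuming the pages have already been downloaded into an array: the URLs, the page texts, and the function names tokenize, buildIndex and search are all made up for the example, and matching is a plain "page contains every keyword" test with no ranking.

```php
<?php
// Hypothetical result of a crawl: URL => page text.
// A real engine would download these pages over HTTP (step 2).
$pages = [
    'http://example.com/a' => 'PHP is a scripting language for the web',
    'http://example.com/b' => 'A search engine maps keywords to pages',
    'http://example.com/c' => 'Build a search engine in PHP',
];

// Split text into unique lowercase keywords.
function tokenize(string $text): array
{
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique($words));
}

// Step 2: build the mapping keyword => list of URLs (an inverted index).
function buildIndex(array $pages): array
{
    $index = [];
    foreach ($pages as $url => $text) {
        foreach (tokenize($text) as $word) {
            $index[$word][] = $url;
        }
    }
    return $index;
}

// Step 3: return the URLs of pages that contain every keyword in the query.
function search(array $index, string $query): array
{
    $words = tokenize($query);
    if ($words === []) {
        return [];
    }
    $result = null;
    foreach ($words as $word) {
        $urls = $index[$word] ?? [];
        $result = $result === null ? $urls : array_intersect($result, $urls);
    }
    return array_values($result);
}

$index = buildIndex($pages);
print_r(search($index, 'search engine php'));
```

With the three sample pages, only http://example.com/c contains all of "search", "engine" and "php", so it is the only result. Keeping the whole index in a PHP array works for a handful of pages; anything larger would need the mapping stored in a database or a dedicated index.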

It sounds simple, and if you have a small number of pages to search then it typically is. The difficulty comes from scaling from hundreds of pages to the billions of pages on the internet today.

Most of the difficulty - and what makes Google better than many other engines - is not the technical ability to "search" billions of pages (that is, steps 1-3), but deciding which of those billions of pages to show at (or near) the top of the results (that's step #4).

For example, when you type "stack overflow" into Google, there are 2.1 million pages in their index that match those keywords. The thing that makes Google good is its algorithm for deciding that this Stack Overflow site should appear as the first result (as opposed to, say, the Wikipedia article on the subject).

The way they do that is the subject of many university student dissertations, white papers, books and speculation. Rest assured the actual algorithm is a closely guarded secret at Google, and I doubt there are many who know the intimate details of every aspect of it. It's also something that's constantly changing.
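
Since the real ranking algorithm is not public, step 4 can only be illustrated with a toy rule. The sketch below assumes the simplest possible one: score each matching page by how many times the query keywords occur in it, and list the highest score first. The sample pages and the function name rankPages are invented for the example, and this is not how Google orders its results.

```php
<?php
// Hypothetical pages that all match the query: URL => page text.
$pages = [
    'http://example.com/a' => 'stack overflow is a question and answer site',
    'http://example.com/b' => 'a stack overflow happens when the call stack grows past its limit',
    'http://example.com/c' => 'overflow',
];

// Score each page by counting occurrences of the query keywords,
// then return the URLs sorted from highest to lowest score.
function rankPages(array $pages, string $query): array
{
    $keywords = preg_split('/[^a-z0-9]+/', strtolower($query), -1, PREG_SPLIT_NO_EMPTY);
    $scores = [];
    foreach ($pages as $url => $text) {
        $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        $counts = array_count_values($words); // word => number of occurrences
        $score = 0;
        foreach ($keywords as $keyword) {
            $score += $counts[$keyword] ?? 0;
        }
        $scores[$url] = $score;
    }
    arsort($scores); // highest score first, keys (URLs) preserved
    return array_keys($scores);
}

print_r(rankPages($pages, 'stack overflow'));
```

Here page b scores 3 ("stack" twice, "overflow" once), page a scores 2 and page c scores 1, so they come back in the order b, a, c. Notice that the toy rule puts the article about the programming error above the site itself, which is exactly the kind of relevance problem that counting keywords cannot solve.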
