How to make a small engine like Wolfram|Alpha?

2022-12-29 19:46

Let's say I have three models/tables: operating_systems, words, and programming_languages:

# operating_systems
name:string created_by:string family:string
Windows     Microsoft         MS-DOS
Mac OS X    Apple             UNIX
Linux       Linus Torvalds    UNIX
UNIX        AT&T              UNIX

# words
word:string definitions:string
window      (serialized hash of definitions)
hello       (serialized hash of definitions)
UNIX        (serialized hash of definitions)

# programming_languages
name:string created_by:string example_code:text
C++         Bjarne Stroustrup #include <iostream> etc...
HelloWorld  Jeff Skeet        h
AnotherOne  Jon Atwood        imports 'SORULEZ.cs' etc...

When a user searches for hello, the system shows the definitions of 'hello'. This is relatively easy to implement. However, when a user searches for UNIX, the engine must choose between word and operating_system. Also, when a user searches for windows (lowercase 'w'), the engine should choose word, but should also show: Assuming 'windows' is a word. Use as an <a href="etc..">operating system</a> instead.

Can anyone point me in the right direction with parsing and choosing the topic of the search query? Thanks.

Note: it doesn't need to be able to perform calculations as WA can do.

Have a new index table called terms that contains a tokenised version of each valid term. That way, you only have to search one table.

# terms
Id Name     Type               Priority
1  window   word               false
2  Windows  operating_system   true

Then you can see how close a match the user's search term is. For example, "Windows" would be a 100% match with term 2, so assume that; it is also a close match to term 1, so suggest that as an alternative. You'd have to write your own rules engine that decides how closely a word matches (i.e. what gets assumed with "windows" vs "Windows"?). The Priority field could be the final decider if the rules engine can't decide, and could in theory be driven by user activity, so that the engine learns what users are more likely referring to.
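As a rough illustration of such a rules engine, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Term` class, the `stem`, `score` and `resolve` helpers, the score values, and the two extra UNIX rows (added only to show the Priority tie-break). The rule ordering is just one possible choice, picked so that lowercase "windows" resolves to the word, as the question asks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    id: int
    name: str
    type: str
    priority: bool


# Rows 1 and 2 come from the terms table above; rows 3 and 4 are
# hypothetical additions used to demonstrate the Priority tie-break.
TERMS = [
    Term(1, "window", "word", False),
    Term(2, "Windows", "operating_system", True),
    Term(3, "UNIX", "word", False),
    Term(4, "UNIX", "operating_system", True),
]


def stem(text):
    """Very crude singularisation: drop one trailing lowercase 's'."""
    return text[:-1] if len(text) > 1 and text.endswith("s") else text


def score(query, term):
    """Return a closeness score between 0.0 and 1.0 (values are arbitrary)."""
    if query == term.name:
        return 1.0  # identical, including case
    if stem(query) == stem(term.name):
        return 0.9  # same stem, same case
    if query.lower() == term.name.lower():
        return 0.8  # identical apart from case
    if stem(query).lower() == stem(term.name).lower():
        return 0.7  # same stem apart from case
    return 0.0


def resolve(query, terms=TERMS):
    """Return (assumed_term, alternative_terms) for a search query."""
    matches = [(score(query, t), t) for t in terms]
    matches = [(s, t) for s, t in matches if s > 0]
    # Best score first; Priority decides between equal scores.
    matches.sort(key=lambda st: (st[0], st[1].priority), reverse=True)
    if not matches:
        return None, []
    return matches[0][1], [t for _, t in matches[1:]]


if __name__ == "__main__":
    for q in ("Windows", "windows", "UNIX"):
        assumed, alternatives = resolve(q)
        print(q, "->", assumed.type, "| alternatives:",
              [t.type for t in alternatives])
```

With these rules, "Windows" is assumed to be the operating system with the word offered as an alternative, "windows" is assumed to be the word with the operating system offered instead, and "UNIX" ties on score, so Priority picks the operating system.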

What about building a cache in the form of a database table that holds all the keywords?

The search query would be something like this:

SELECT * FROM keywords WHERE keyword = '<YourKeyWord>'   /* mysql */

The keywords table would contain some kind of reference to your modules.

The advantage of this approach is, of course, fast searching.

You may use two queries in order to simulate the behaviour you ask for:

1. Exact match (no problem in MySQL)
2. Case-insensitive search
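The two-query idea can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module instead of MySQL so that it runs on its own, and the `module` column, the sample rows and the `lookup` helper are made up for the example. In MySQL, whether `=` is case-sensitive depends on the column's collation, so the exact-match query may need a case-sensitive collation there.

```python
import sqlite3

# In-memory stand-in for the keywords cache table (hypothetical layout:
# the keyword itself plus the name of the table/module it points to).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords (keyword TEXT, module TEXT)")
conn.executemany(
    "INSERT INTO keywords (keyword, module) VALUES (?, ?)",
    [
        ("Windows", "operating_systems"),
        ("UNIX", "operating_systems"),
        ("UNIX", "words"),
        ("window", "words"),
        ("hello", "words"),
    ],
)


def lookup(conn, query):
    """Return (exact_matches, case_insensitive_only_matches)."""
    # Query 1: exact match. SQLite compares TEXT case-sensitively by default.
    exact = conn.execute(
        "SELECT keyword, module FROM keywords "
        "WHERE keyword = ? ORDER BY module",
        (query,),
    ).fetchall()
    # Query 2: case-insensitive match, excluding rows already found above.
    loose = conn.execute(
        "SELECT keyword, module FROM keywords "
        "WHERE keyword = ? COLLATE NOCASE AND keyword <> ? ORDER BY module",
        (query, query),
    ).fetchall()
    return exact, loose


if __name__ == "__main__":
    for q in ("UNIX", "windows", "hello"):
        print(q, lookup(conn, q))
```

Rows from the first query can be treated as the assumed meaning, and rows from the second as the "use as ... instead" suggestions.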

Wolfram Alpha is far more complex than your example. I'm not certain of its inner workings (I have done very little reading on it), but I believe it is a very large and complex automated inference system. Such systems are rather trivial to implement (Prolog is basically a general-purpose one that you can load with whatever data you need), but they're very hard to make useful.

Tags: parsing, prediction, ruby-on-rails, wolframalpha
