How is the WordWeb English dictionary implemented?

We need some in-memory data structure to hold the English word dictionary. When the computer or WordWeb starts, the dictionary has to be read from disk into that in-memory data structure.

My question is: how do typical real-world dictionaries, such as WordWeb, populate the in-memory data structure from disk?

Ideally we would keep the dictionary on disk in the same form we need in memory, so that we don't have to spend time building the in-memory data structure and can simply read it off the disk. But for linked lists, pointers, etc., how do we store the same image on disk? Would something like relative addresses help here?
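One way to picture the "relative addresses" idea is a flat image in which every reference is a byte offset from the start of the image rather than a pointer. The sketch below is only an illustration under that assumption; the layout and the names build_image and lookup are made up for this example and are not WordWeb's actual format.

```python
import struct

HEADER = struct.Struct("<I")   # number of entries
SLOT = struct.Struct("<II")    # (word offset, definition offset) from start of image


def _read_str(image, offset):
    """Read a length-prefixed UTF-8 string stored at a byte offset."""
    (length,) = struct.unpack_from("<H", image, offset)
    start = offset + 2
    return bytes(image[start:start + length]).decode("utf-8")


def build_image(entries):
    """Pack {word: definition} into one blob that uses offsets instead of pointers."""
    words = sorted(entries, key=lambda w: w.encode("utf-8"))
    table_size = HEADER.size + SLOT.size * len(words)
    heap = bytearray()
    slots = []
    for word in words:
        offsets = []
        for text in (word, entries[word]):
            data = text.encode("utf-8")
            offsets.append(table_size + len(heap))
            heap += struct.pack("<H", len(data)) + data
        slots.append(SLOT.pack(*offsets))
    return HEADER.pack(len(words)) + b"".join(slots) + bytes(heap)


def lookup(image, word):
    """Binary-search the slot table directly in the image; nothing is rebuilt."""
    (count,) = HEADER.unpack_from(image, 0)
    target = word.encode("utf-8")
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        word_off, def_off = SLOT.unpack_from(image, HEADER.size + mid * SLOT.size)
        candidate = _read_str(image, word_off).encode("utf-8")
        if candidate == target:
            return _read_str(image, def_off)
        if candidate < target:
            lo = mid + 1
        else:
            hi = mid
    return None
```

Because the image contains no absolute addresses, the bytes written to disk are exactly the bytes searched in memory. The same lookup should also work on a memory-mapped file, in which case the operating system pages in only the parts that a search touches.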

Typically, is the entire dictionary read and stored in memory? Or is only part of it (for example an index or handles) kept in memory, with leaf-page I/O done when searching for a specific word?

If somebody can explain what that in-memory data structure typically is, please go ahead.

Thanks,


You mentioned pointers, so I'm assuming you're using C++. If that's the case and you want to read directly from disk into memory without having to "rebuild" your data structure, you might want to look into serialization; see the question "How do you serialize an object in C++?"

However, you generally don't want to load the entire dictionary anyway, especially if it's a user application. If a single user is looking up dictionary words, reading from disk happens so fast that the user will never notice the "delay." If you're servicing hundreds or thousands of requests, then it might make sense to cache the dictionary in memory.
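As a rough sketch of that trade-off, the class below keeps only a word-to-offset index in memory, reads each definition from disk on demand, and caches recent lookups. The file format (one "word<TAB>definition" line per entry) and the name DiskDictionary are assumptions made for this illustration only.

```python
from functools import lru_cache


class DiskDictionary:
    """Keeps only word -> byte offset in memory; definitions stay on disk."""

    def __init__(self, path, cache_size=1024):
        self._file = open(path, "rb")
        self._offsets = {}
        offset = 0
        for line in self._file:
            # Assumed format: b"word\tdefinition\n"
            word = line.split(b"\t", 1)[0].decode("utf-8")
            self._offsets[word] = offset
            offset += len(line)
        # Repeated lookups of the same word are served from the cache.
        self.lookup = lru_cache(maxsize=cache_size)(self._read_definition)

    def _read_definition(self, word):
        offset = self._offsets.get(word)
        if offset is None:
            return None
        self._file.seek(offset)
        line = self._file.readline().decode("utf-8").rstrip("\r\n")
        return line.split("\t", 1)[1]

    def close(self):
        self._file.close()
```

A desktop application could probably leave the cache small, since each miss costs one seek and one short read. A server handling many requests could raise cache_size or load everything up front.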

So how many users do you have?
What kind of load are you expecting to have on the application?


WordWeb uses an SQLite database as its backend. It makes sense to me to use a database system to store the content, so that it is easier to quickly fetch whatever the user is looking for.

WordWeb has word prediction as well, so it would issue a query to the database like

select word from dictionary where word like 'ab%';

On the other hand, when the user presses Enter for the word:

select meaning from dictionary where word = 'abandon';

You do not want to be deserializing the content from disk into memory while the user is typing or after they have pressed Enter to search. Since the data (a whole dictionary) is large, doing that for every word search will probably take longer than the user will tolerate.
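The two queries could be wrapped roughly as follows using Python's standard sqlite3 module. The table name entries, its columns, and the helper names are assumptions made for this sketch, not WordWeb's real schema.

```python
import sqlite3


def open_dictionary(rows):
    """Create an in-memory database from (word, meaning) pairs (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (word TEXT PRIMARY KEY, meaning TEXT NOT NULL)")
    conn.executemany("INSERT INTO entries (word, meaning) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def suggest(conn, prefix, limit=10):
    """Word prediction: return words starting with the typed prefix."""
    # Escape LIKE wildcards so user input such as '%' is matched literally.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    cur = conn.execute(
        "SELECT word FROM entries WHERE word LIKE ? ESCAPE '\\' ORDER BY word LIMIT ?",
        (escaped + "%", limit),
    )
    return [row[0] for row in cur]


def meaning(conn, word):
    """Exact lookup, run when the user presses Enter."""
    row = conn.execute("SELECT meaning FROM entries WHERE word = ?", (word,)).fetchone()
    return row[0] if row else None
```

Note that the prefix search needs LIKE rather than =, and that SQLite's LIKE is case-insensitive for ASCII letters by default. A real application would open a database file on disk instead of ":memory:".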


Otherwise, why not create a JSON file containing all the meanings, as a compact form of the dictionary?
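A minimal sketch of the JSON approach, assuming the file holds a single object that maps each word to its meaning (the function names here are hypothetical):

```python
import bisect
import json


def load_dictionary(path):
    """Load the whole JSON file; returns the word->meaning dict and a sorted word list."""
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    return entries, sorted(entries)


def words_with_prefix(sorted_words, prefix, limit=10):
    """Find up to `limit` words starting with `prefix` via binary search."""
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:start + limit]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches
```

This parses the entire file at startup, so it trades load time and memory for very simple lookups afterwards.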
