Web Programming language for very large lists?
In your experience, what is the best web programming language for sorting and comparing very large lists (i.e., tens of thousands of email addresses)?
I am most familiar with PHP. I think that it could get the job done, but I'm unsure about other languages and whether one of them might be better suited.
Thanks!
Is it possible to do the sorting inside of a database? They are designed to do dynamic sorting and comparison. I would suggest you move to a model that lets the DB handle this sort of activity.
If you really, really can't use a DB for some reason, then you should focus on algorithms over languages. Pick a language based on other criteria (personal familiarity, whether it supports your other tasks, whether it has an active support community, etc.) and figure out the best algorithm given that language's quirks. For instance, according to some of the discussion in https://stackoverflow.com/questions/309300/defend-php-convince-me-it-isnt-horrible, PHP has relatively poor recursion performance.
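To make the "algorithms over languages" point concrete, here is a minimal sketch in Python (chosen only for brevity; the same approach works in PHP with array functions). The function name `compare_email_lists` is hypothetical, and lowercasing addresses before comparing is an assumption that may not suit every mail system. Hash-based sets make the comparison roughly linear in the list sizes, and the final sort is O(n log n), which is quick for tens of thousands of entries.

```python
def compare_email_lists(list_a, list_b):
    """Return three sorted lists: addresses in both, only in A, only in B."""
    # Assumption: addresses are compared case-insensitively after trimming
    # whitespace. Duplicates within a list collapse into one entry.
    set_a = {email.strip().lower() for email in list_a}
    set_b = {email.strip().lower() for email in list_b}

    common = sorted(set_a & set_b)   # set intersection
    only_a = sorted(set_a - set_b)   # set difference
    only_b = sorted(set_b - set_a)
    return common, only_a, only_b


if __name__ == "__main__":
    a = ["bob@example.com", "Alice@example.com", "carol@example.com"]
    b = ["alice@example.com", "dave@example.com"]
    common, only_a, only_b = compare_email_lists(a, b)
    print("in both:", common)
    print("only in A:", only_a)
    print("only in B:", only_b)
```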
But seriously, use a database for this.
I would store the emails in a database, and use SQL to perform sorts and searches. That is what databases were designed for, and they will have intelligent solutions that will outperform anything most people could write in code.
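As a rough sketch of what letting the database do the work could look like, the example below uses Python's built-in `sqlite3` module with an in-memory database. The table names `list_a` and `list_b` and the helper function names are made up for illustration; a real setup would more likely use an existing MySQL or PostgreSQL schema, but the SQL idea is the same: `ORDER BY` for sorting and a join for comparison.

```python
import sqlite3


def build_db(list_a, list_b):
    """Load two email lists into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # PRIMARY KEY gives each table an index, which the sort and join can use.
    conn.execute("CREATE TABLE list_a (email TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE list_b (email TEXT PRIMARY KEY)")
    # INSERT OR IGNORE (SQLite syntax) silently skips duplicate addresses.
    conn.executemany("INSERT OR IGNORE INTO list_a VALUES (?)",
                     [(e,) for e in list_a])
    conn.executemany("INSERT OR IGNORE INTO list_b VALUES (?)",
                     [(e,) for e in list_b])
    return conn


def sorted_emails(conn):
    """Return list A in sorted order, with the database doing the sorting."""
    rows = conn.execute("SELECT email FROM list_a ORDER BY email")
    return [row[0] for row in rows]


def emails_in_both(conn):
    """Return the addresses that appear in both lists."""
    query = ("SELECT a.email FROM list_a a "
             "JOIN list_b b ON a.email = b.email ORDER BY a.email")
    return [row[0] for row in conn.execute(query)]


if __name__ == "__main__":
    conn = build_db(["bob@example.com", "alice@example.com"],
                    ["alice@example.com", "dave@example.com"])
    print(sorted_emails(conn))
    print(emails_in_both(conn))
```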
This doesn't depend on the programming language; it depends on the logic, such as indexes, table schemas, and caching mechanisms.
Language usually doesn't matter TOO much. Pick the one you are most comfortable with.
The final product is shaped by the builder, not the tools.
You can also use a trie (a prefix tree data structure) for sorting in memory.
Email addresses have a restricted character set (a-z, 0-9, _, ., etc.), so each trieNode would only need children for those characters. This topcoder tutorial on tries is a good starting point if you don't already know about them.
You have to go through all the strings to construct the trie.
Searching / Comparison takes O(l) time where l is the length of the string you are comparing.
Sorting requires traversing all the trieNodes of the trie using DFS (depth-first search), which takes O(|V| + |E|) time.
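The trie approach described above could be sketched as follows. This is an illustrative Python version, not taken from the tutorial; the class and method names are my own. It stores children in a dictionary rather than a fixed-size array, so it sorts each node's child keys during traversal, which adds a small factor bounded by the alphabet size. Note that duplicate addresses collapse into a single entry.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # maps a character to the child TrieNode
        self.is_end = False  # True if a stored string ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Add a string, creating nodes along its path as needed."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def contains(self, word):
        """Look up a string in O(l) steps, where l is its length."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_end

    def sorted_words(self):
        """Return all stored strings in order via an iterative pre-order DFS."""
        result = []
        stack = [(self.root, "")]
        while stack:
            node, prefix = stack.pop()
            if node.is_end:
                result.append(prefix)
            # Push children in reverse order so the smallest is popped first.
            for ch in sorted(node.children, reverse=True):
                stack.append((node.children[ch], prefix + ch))
        return result


if __name__ == "__main__":
    trie = Trie()
    for email in ["bob@example.com", "alice@example.com", "al@example.com"]:
        trie.insert(email)
    print(trie.contains("bob@example.com"))
    print(trie.sorted_words())
```

The traversal is iterative rather than recursive, which avoids deep call stacks in languages where recursion is slow or depth-limited.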
Your fastest option would be a compiled CGI program.