PHP Syllable Detection [closed]

2023-02-25 16:09 问答作者：

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

开发者_如何学Go

Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.

Closed 9 years ago.

Improve this question

I would like to find a way to be able to split a word into syllables with PHP. For example, the word "nevermore" ran through detect_syllables(), would return "nev-er-more." Are there any good APIs or something out there?

There's a useful PHd thesis paper by Frank Liang that describes an exceptionally accurate algorithm for this: written over 25 years ago, it's still valid. But I'm not aware of any implementation in PHP

EDIT

A quick google has identified this link to a Text Statistics library in PHP, which includes algorithms for syllable counting within words (among other readability measuring algorithms). You should be able to find the code for syllable splitting here.

I'm actually in the finishing stages of making a PHP Hyphenator class based upon Frank Liang's algorithm and the TeX dictionaries, which pretty much seems to be the appoach taken by all office suites. (Actually I found this topic while looking for a good name for it that wasn't already taken). With slowly improving support from browsers for the entity, it's becoming a realistic option to hyphenate content in websites.

Core functionality is working; splitting (and thus counting) and/or hyphenating text and/or HTML, parsing TeX hyphen dictionaries, caching those parsed dictionaries. Some planned features are still missing but nothing that stops you from using it. Also there's no good documentation, samples, formal unittest or vanity site yet.

I've created a github site for it here and will post the current version on it ASAP, so check back in a few days.

I've only tested it with Dutch (my native language) and US English, so it may still have some issues with languages using different character sets.

Note that Frank Liang's paper is on hyphenation, NOT on syllable detection. In addition, his thesis paper itself states that its success rate is around 89% for the dictionary he used, which is not going to be good enough for everyone. There really is no substitute for manually doing it for every single word it seems. It's not that efficient to have to require a complete one-to-one lookup table wordlist in order to do it, but these days storage space is far less expensive than CPU time anyways.

Perhaps someone might consider making a CAPTCHA-like service so that many users could be asked to provide the solution to every known word, with the results checked against each other, so that one person wouldn't have to do it all themselves. I'd hope the results would be released freely once complete.

继续阅读：php

PHP Syllable Detection [closed]

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

王昌瑞《潜梦追凶》剧组庆生新锐演员未来可期？

Is it allowed to ask users to enter credit card details for own payment method?

Escaping "<" in Perl-generated XML

imessage会显示已读吗？

微信重新建群怎么建？