FAST Search for Sharepoint Crawler issue with Dokuwiki pages

2023-03-15 04:59 问答作者：

My level of frustion is maxxing out over crawling Dokuwiki sites.

I have a content source using FAST search for SharePoint that i have set up to crawl a dokuwiki/doku.php site. My crawler rules are set to: http://servername/* , match case and include all items in this path with crawl complex urls.. testing the content source in the crawl rules shows that it will be crawled by the crawler. However..... The crawl always last for under 2 minutes and completes having only crawled the page I pointed to and no other link on that page. I have check with the Dokuwki admin and he has the robots text set to allow. when I look at the source on the pages I see that it says meta name="robots" content="index,follow"

so in order to test that the other linked pages were not a problem, I added those links to the content souce manually and recrawled.. examp开发者_Python百科le source page has three links

site A
site B
site C.

I added Site A,B and C urls to the crawl source. The results of this crawl are 4 successes, the primary souce page and the other links A,B, and C i manually added.

So my question is why wont the crawler crawl the link on the page? is this something I need to do with the crawler on my end or is it something to do with how namespaces are defined and links constructed with Dokuwiki?

Any help would be appreciated

Eric

Did you disable the delayed indexing options and rel=nofollow options?

The issue was around authentication even though no issues were reported suggesting it was authentication in the FAST Crawl Logs. The fix was adding a $freepass setting for the IP address of the Search indexing server so that Appache would not go through the authentication process for each page hit.

Thanks for the reply

Eric

继续阅读：dokuwiki sharepoint-2010

FAST Search for Sharepoint Crawler issue with Dokuwiki pages

My level of frustion is maxxing out over crawling Dokuwiki sites.

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

My level of frustion is maxxing out over crawling Dokuwiki sites.

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？