How to open a varbinary Word doc as HTML

I have a problem that I have not been able to solve in months. I store Word resumes in a varbinary(max) column, and I can retrieve them with a full-text search without trouble. However, the resumes are served as Word documents from an .ashx handler using the code below. I need to implement hit highlighting on the site so that users can see whether a returned resume is a good fit. I don't think this can be done from an .ashx file, so I think I need either to open the resume as HTML in an .aspx page and use JavaScript to do the highlighting, or to extract the text-only content of the Word document and wrap the hits in HTML tags before display. I can't find anything that addresses this problem, and I am hoping someone can point me in the right direction.

Thanks in advance for any advice.

Imports System.IO
Imports System.Web
Imports System.Data
Imports System.Data.SqlClient

Public Class ReadResume : Implements IHttpHandler

    Const conString As String = "Data Source=tcp:sql2k804.discountasp.net;Initial Catalog=SQL2008R2_284060_resumedata;User ID=SQL2008R2_284060_resumedata_user;Password=mypwd2314;"

    Public Sub ProcessRequest(ByVal context As HttpContext) Implements IHttpHandler.ProcessRequest
        ' The candidate id arrives in the "Para" query-string value.
        Dim CId As String = context.Request.QueryString("Para")

        ' Using blocks dispose the connection, command and reader even on errors.
        Using con As New SqlConnection(conString)
            Using cmd As New SqlCommand("Select ResumeDoc, DocTypeExtension From ResumeTable WHERE CandidateId=@CandidateId", con)
                cmd.Parameters.AddWithValue("@CandidateId", CId)
                con.Open()
                Using myReader As SqlDataReader = cmd.ExecuteReader()
                    If myReader.Read() Then
                        context.Response.Clear()
                        context.Response.ClearContent()
                        context.Response.ClearHeaders()
                        Dim resumeBytes() As Byte = CType(myReader("ResumeDoc"), Byte())
                        ' Stored extension; not used yet, but it could select the content type.
                        Dim doc_type As String = CType(myReader("DocTypeExtension"), String)
                        context.Response.ContentEncoding = System.Text.Encoding.UTF8
                        context.Response.ContentType = "application/msword"
                        ' Content-Disposition needs a disposition type; a bare title is not a valid value.
                        ' The .doc extension matches the application/msword content type above.
                        context.Response.AddHeader("Content-Disposition", "inline; filename=""Candidate Resume.doc""")
                        context.Response.BinaryWrite(resumeBytes)
                    End If
                End Using
            End Using
        End Using
    End Sub

    Public ReadOnly Property IsReusable() As Boolean Implements IHttpHandler.IsReusable
        Get
            Return False
        End Get
    End Property

End Class


You can use the Microsoft Office COM components to work with Word documents. For example, this article shows one way to convert Word to HTML: http://rongchaua.net/blog/c-convert-word-to-html/
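As a rough illustration of the COM route, here is a minimal sketch that asks Word to re-save a document as filtered HTML. It assumes Word is installed on the machine and that the project references Microsoft.Office.Interop.Word; the module and function names (WordToHtml, ConvertToHtmlFile) are hypothetical and not taken from the linked article. Microsoft does not recommend automating Office from server-side code, so treat this as something to try in a background job or a one-off conversion rather than as a finished handler.

```vbnet
Imports System.IO
Imports Word = Microsoft.Office.Interop.Word

Public Module WordToHtml

    ' Writes the stored bytes to a temporary file, lets Word save it as
    ' filtered HTML, and returns the path of the HTML file that Word wrote.
    ' "extension" is assumed to include the dot, for example ".doc".
    Public Function ConvertToHtmlFile(ByVal docBytes As Byte(), ByVal extension As String) As String
        Dim baseName As String = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"))
        Dim sourcePath As String = baseName & extension
        Dim htmlPath As String = baseName & ".html"
        File.WriteAllBytes(sourcePath, docBytes)

        Dim app As New Word.Application()
        Try
            ' CObj(...) passes plain values to the ByRef Object parameters.
            Dim doc As Word.Document = app.Documents.Open(FileName:=CObj(sourcePath), ReadOnly:=True)
            doc.SaveAs(FileName:=CObj(htmlPath), FileFormat:=Word.WdSaveFormat.wdFormatFilteredHTML)
            doc.Close(SaveChanges:=False)
        Finally
            ' Always shut Word down, otherwise WINWORD.EXE processes pile up.
            app.Quit(SaveChanges:=False)
            File.Delete(sourcePath)
        End Try

        Return htmlPath
    End Function

End Module
```

The function returns a file path rather than the HTML text because Word chooses the encoding of the saved file; the caller can read it with whatever encoding the file's meta tag declares.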

UPDATE: There are other solutions.

If you have only .docx (not .doc) documents, you can use this simple code to extract plain text from them: http://www.codeproject.com/KB/office/ExtractTextFromDOCXs.aspx The same code is also posted here: http://conceptdev.blogspot.com/2007/03/open-docx-using-c-to-extract-text-for.html
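To show how the plain-text route could feed the hit highlighting asked about in the question, here is a minimal sketch. It is not the code from the linked articles: it assumes .NET 4.5 or later with references to System.IO.Compression and System.Web, and the names DocxText, ExtractText and HighlightHits are hypothetical. A .docx file is a ZIP archive whose main body lives in word/document.xml, so the sketch reads that part and concatenates the w:t text runs paragraph by paragraph.

```vbnet
Imports System.IO
Imports System.IO.Compression
Imports System.Text
Imports System.Text.RegularExpressions
Imports System.Web
Imports System.Xml

Public Module DocxText

    Private Const WordNs As String = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

    ' Returns the text of a .docx document, one line per paragraph.
    Public Function ExtractText(ByVal docxBytes As Byte()) As String
        Using ms As New MemoryStream(docxBytes)
            Using zip As New ZipArchive(ms, ZipArchiveMode.Read)
                Dim entry As ZipArchiveEntry = zip.GetEntry("word/document.xml")
                If entry Is Nothing Then Return String.Empty

                Dim xml As New XmlDocument()
                Using s As Stream = entry.Open()
                    xml.Load(s)
                End Using

                Dim ns As New XmlNamespaceManager(xml.NameTable)
                ns.AddNamespace("w", WordNs)

                Dim sb As New StringBuilder()
                For Each para As XmlNode In xml.SelectNodes("//w:p", ns)
                    For Each run As XmlNode In para.SelectNodes(".//w:t", ns)
                        sb.Append(run.InnerText)
                    Next
                    sb.AppendLine()
                Next
                Return sb.ToString()
            End Using
        End Using
    End Function

    ' HTML-encodes the text and wraps each case-insensitive occurrence of
    ' "term" in a span that a stylesheet can highlight.
    Public Function HighlightHits(ByVal text As String, ByVal term As String) As String
        If String.IsNullOrEmpty(term) Then Return HttpUtility.HtmlEncode(text)

        Dim sb As New StringBuilder()
        Dim pos As Integer = 0
        For Each m As Match In Regex.Matches(text, Regex.Escape(term), RegexOptions.IgnoreCase)
            ' Encode the text between hits, then the hit itself inside the span.
            sb.Append(HttpUtility.HtmlEncode(text.Substring(pos, m.Index - pos)))
            sb.Append("<span class=""hit"">")
            sb.Append(HttpUtility.HtmlEncode(m.Value))
            sb.Append("</span>")
            pos = m.Index + m.Length
        Next
        sb.Append(HttpUtility.HtmlEncode(text.Substring(pos)))
        Return sb.ToString()
    End Function

End Module
```

An .aspx page could then call something like HighlightHits(ExtractText(resumeBytes), searchTerm) and write the result into a pre-formatted block. Matching on the raw text and encoding each piece separately keeps a search term from matching inside an HTML entity such as &amp;amp;. Text inside nested containers such as text boxes may appear twice with this simple XPath, and a plain substring match will not line up exactly with what SQL Server full-text search considers a hit.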

There are also commercial libraries for reading and writing Word documents: http://www.aspose.com/categories/.net-components/aspose.words-for-.net/default.aspx and http://www.cellbi.com/Products.aspx
