开发者

ICD-9 Code List in XML, CSV, or Database format [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.

We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.

开发者_高级运维

Closed 8 years ago.

Improve this question

I am looking for a complete list of ICD-9 Codes (Medical Codes) for Diseases and Procedures in a format that can be imported into a database and referenced programmatically. My question is basically exactly the same as Looking for resources for ICD-9 codes, but the original poster neglected to mention where exactly he "got ahold of" his complete list.

Google is definitely not my friend here as I have spent many hours googling the problem and have found many rich text type lists (such as the CDC) or websites where I can drill down to the complete list interactively, but I cannot find where to get the list that would populate these websites and can be parsed into a Database. I believe the files here ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD9-CM/2009/ have what I am looking for but the files are rich text format and contain a lot of garbage and formatting that would be difficult to remove accurately.

I know this has to have been done by others and I am trying to avoid duplicating other peoples effort but I just cannot find an xml/CSV/Excel list.


Centers for Medicaid & Medicare services provides excel files which contain just the codes and diagnosis, which can be imported directly into some SQL databases, sans conversion.

Zipped Excel files, by version number

(Update: New link based on comment below)


After removing the RTF it wasn't too hard to parse the file and turn it into a CSV. My resulting parsed files containing all 2009 ICD-9 codes for Diseases and Procedures are here: http://www.jacotay.com/files/Disease_and_ProcedureCodes_Parsed.zip My parser that I wrote is here: http://www.jacotay.com/files/RTFApp.zip Basically it is a two step process - take the files from the CDC FTP site, and remove the RTF from them, then select the RTF-free files and parse them into the CSV files. The code here is pretty rough because I only needed to get the results out once.

Here is the code for the parsing app in case the external links go down (back end to a form that lets you select a filename and click the buttons to make it go)

Public Class Form1

Private Sub btnBrowse_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnBrowse.Click
    Dim p As New OpenFileDialog With {.CheckFileExists = True, .Multiselect = False}
    Dim pResult = p.ShowDialog()
    If pResult = Windows.Forms.DialogResult.Cancel OrElse pResult = Windows.Forms.DialogResult.Abort Then
        Exit Sub
    End If
    txtFileName.Text = p.FileName
End Sub

Private Sub btnGo_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnGo.Click
    Dim pFile = New IO.FileInfo(txtFileName.Text)
    Dim FileText = IO.File.ReadAllText(pFile.FullName)
    FileText = RemoveRTF(FileText)
    IO.File.WriteAllText(Replace(pFile.FullName, pFile.Extension, "_fixed" & pFile.Extension), FileText)

End Sub


Function RemoveRTF(ByVal rtfText As String)
    Dim rtBox As System.Windows.Forms.RichTextBox = New System.Windows.Forms.RichTextBox

    '// Get the contents of the RTF file. Note that when it is
    '// stored in the string, it is encoded as UTF-16.
    rtBox.Rtf = rtfText
    Dim plainText = rtBox.Text

    Return plainText
End Function


Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim pFile = New IO.FileInfo(txtFileName.Text)
    Dim FileText = IO.File.ReadAllText(pFile.FullName)
    Dim DestFileLine As String = ""
    Dim DestFileText As New System.Text.StringBuilder

    'Need to parse at lines with numbers, lines with all caps are thrown away until next number
    FileText = Strings.Replace(FileText, vbCr, "")
    Dim pFileLines = FileText.Split(vbLf)
    Dim CurCode As String = ""
    For Each pLine In pFileLines
        If pLine.Length = 0 Then
            Continue For
        End If
        pLine = pLine.Replace(ChrW(9), " ")
        pLine = pLine.Trim

        Dim NonCodeLine As Boolean = False
        If IsNumeric(pLine.Substring(0, 1)) OrElse (pLine.Length > 3 AndAlso (pLine.Substring(0, 1) = "E" OrElse pLine.Substring(0, 1) = "V") AndAlso IsNumeric(pLine.Substring(1, 1))) Then
            Dim SpacePos As Int32
            SpacePos = InStr(pLine, " ")
            Dim NewCode As String
            NewCode = ""
            If SpacePos >= 3 Then
                NewCode = Strings.Left(pLine, SpacePos - 1)
            End If

            If SpacePos < 3 OrElse Strings.Mid(pLine, SpacePos - 1, 1) = "." OrElse InStr(NewCode, "-") > 0 Then
                NonCodeLine = True
            Else
                If CurCode <> "" Then
                    DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
                    DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
                    DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
                    CurCode = ""
                    DestFileLine = ""
                End If

                CurCode = NewCode
                DestFileLine = Strings.Mid(pLine, SpacePos + 1)
            End If
        Else
            NonCodeLine = True
        End If


        If NonCodeLine = True AndAlso CurCode <> "" Then 'If we are not on a code keep going, otherwise check it
            Dim pReg As New System.Text.RegularExpressions.Regex("[a-z]")
            Dim pRegCaps As New System.Text.RegularExpressions.Regex("[A-Z]")
            If pReg.IsMatch(pLine) OrElse pLine.Length <= 5 OrElse pRegCaps.IsMatch(pLine) = False OrElse (Strings.Left(pLine, 3) = "NOS" OrElse Strings.Left(pLine, 2) = "IQ") Then
                DestFileLine &= " " & pLine
            Else 'Is all caps word
                DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
                DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
                DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
                CurCode = ""
                DestFileLine = ""
            End If
        End If
    Next

    If CurCode <> "" Then
        DestFileLine = Strings.Replace(DestFileLine, ",", "&#44;")
        DestFileLine = Strings.Replace(DestFileLine, """", "&quot;").Trim
        DestFileText.AppendLine(CurCode & ",""" & DestFileLine & """")
        CurCode = ""
        DestFileLine = ""
    End If

    IO.File.WriteAllText(Replace(pFile.FullName, pFile.Extension, "_parsed" & pFile.Extension), DestFileText.ToString)
End Sub

End Class


Center for Medicare Services (CMS) is actually charged with ICD, so I think the CDC versions you guys reference may just be copies or reprocessed copies. Here is the (~hard to find) medicare page which i think contains the original raw data ("source of truth").

http://www.cms.gov/Medicare/Coding/ICD9ProviderDiagnosticCodes/codes.html

It looks like as of this post the latest version is v32. The zip you download will contain 4 plain-text files which map code-to-description (one file for every combination of DIAG|PROC and SHORT|LONG). It also contains two excel files (one each for DIAG_PROC) which have three columns so map code to both descriptions (long and short).


You can get the orginal RTF code files from here http://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/ICD9-CM/2009/

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜