Website search on Excel
I have a spreadsheet on excel with a list of product names. What I want to do is (1) seperate each of these product names by 5 rows, and (2) setup a website search that extracts data from a given website (clinicaltrials.gov) and populates it in the rows underneath each spreadsheet.
(2) is much more important and challenging for me at the moment. I know I have to run a loop that goes through all the product names. But before I focus on the loop, I need help figuring out how to write code that executes a website search.
Some help that I received:
The following Excel VBA code snipet will take a Cell with a constructed URL in the form of:
开发者_运维知识库="URL;http://clinicaltrials.gov/ct2/show?term="& [Cell Reference to Drug name here] &"&rank=1"
And output 4 lines such as:
Estimated Enrollment: 40
Study Start Date: Jan-11
Estimated Study Completion Date: Apr-12
Estimated Primary Completion Date: April 2012 (Final data collection date for primary outcome measure)
With ActiveSheet.QueryTables.Add(Connection:= _ ActiveCell.Text, Destination:=Cells(ActiveCell.Row, ActiveCell.Column + 1)) .Name = "Clinical Trials" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlSpecifiedTables .WebFormatting = xlWebFormattingNone .WebTables = "12" .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With
That URL you provided won't work. You need the NCT ID to get to the right page, not the drug name. Assume you have two drugs listed in A1:B2 and the proper NCT id is in column B
celebrex NCT00571701
naproxen NCT00586365
To use this code, set a reference to the Microsoft XML 5.0 library and the Microsoft Forms 2.0 library.
Sub GetClinical()
Dim i As Long
Dim lLast As Long
Dim oHttp As MSXML2.XMLHTTP50
Dim sHtml As String
Dim lDataStart As Long, lTblStart As Long, lTblEnd As Long
Dim doClip As DataObject
'Find the last cell in column A
lLast = Sheet1.Cells(Sheet1.Rows.Count, 1).End(xlUp).Row
Set oHttp = New MSXML2.XMLHTTP50
'Loop from the last cell to row 1 in column A
For i = lLast To 1 Step -1
'Insert 5 rows below
Sheet1.Cells(i, 1).Offset(1, 0).Resize(5).EntireRow.Insert
'get the web page
oHttp.Open "GET", "http://clinicaltrials.gov/ct2/show/" & Sheet1.Cells(i, 2).Value & "?rank=1"
oHttp.send
sHtml = oHttp.responseText
'Find the start and end to the table
lDataStart = InStr(1, sHtml, "Estimated Enrollment:")
lTblStart = InStr(lDataStart - 200, sHtml, "<table")
lTblEnd = InStr(lDataStart, sHtml, "</table>") + 8
'put the table in the clipboard
Set doClip = New DataObject
doClip.SetText Mid$(sHtml, lTblStart, lTblEnd - lTblStart)
doClip.PutInClipboard
'paste the table as text
Sheet1.Cells(i, 1).Offset(1, 0).Select
Sheet1.PasteSpecial "Text", , , , , , True
Next i
End Sub
If you don't have the NCT numbers, I don't think you'll be able to construct a workable URL. Also note that I find the table by looking for a particular string (Estimated Enrollment: - note the two spaces in between) and back up 200 characters. The 200 is arbitrary, but worked for both celebrex and naproxen. I can't guarantee their formatting will be consistent. They don't use table ids, so that makes it tough to find the right one.
Always make a backup of your data before running code that alters it.
If you run a search and look at the bottom of the results page, you'll see there's an option to download the results in various formats. For example this url will download all the fluoxetine results in tab-delimited format:
http://clinicaltrials.gov/ct2/results/download?down_stds=all&down_flds=all&down_fmt=tsv&term=fluoxetine
The only complication is that the results are zipped, so you'll need to save the file and unzip it first. Luckily for you I'd already had to do this... Create a folder called "files" in the same folder as your workbook, then add this code and test it out. Works OK for me.
Option Explicit
Sub Tester()
FetchUnzipOpen "fluoxetine"
End Sub
Sub FetchUnzipOpen(DrugName As String)
Dim s, sz 'don't dim these as strings-must be variants!
s = ThisWorkbook.Path & "\files"
sz = s & "\test.zip"
FetchFile "http://clinicaltrials.gov/ct2/results/download?down_stds=all&" & _
"down_flds=all&down_fmt=tsv&term=" & DrugName, sz
Unzip s, sz
'now you just need to open the data file (files/search_result.txt)
End Sub
Sub FetchFile(sURL As String, sPath)
Dim oXHTTP As Object
Dim oStream As Object
Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
Set oStream = CreateObject("ADODB.Stream")
Application.StatusBar = "Fetching " & sURL & " as " & sPath
oXHTTP.Open "GET", sURL, False
oXHTTP.send
With oStream
.Type = 1 'adTypeBinary
.Open
.Write oXHTTP.responseBody
.SaveToFile sPath, 2 'adSaveCreateOverWrite
.Close
End With
Set oXHTTP = Nothing
Set oStream = Nothing
Application.StatusBar = False
End Sub
Sub Unzip(sDest, sZip)
Dim o
Set o = CreateObject("Shell.Application")
o.NameSpace(sDest).CopyHere o.NameSpace(sZip).Items
End Sub
精彩评论