GetFormFieldNames not always working
I am trying to find out which form and element belongs too. The code that I now understand from this开发者_C百科 website:
http://www.cryer.co.uk/brian/delphi/twebbrowser/read_write_form_elements.htm
containing this code
function GetFormFieldNames(fromForm: IHTMLFormElement): TStringList;
var
index: integer;
field: IHTMLElement;
input: IHTMLInputElement;
select: IHTMLSelectElement;
text: IHTMLTextAreaElement;
begin
result := TStringList.Create;
for index := 0 to fromForm.length do
begin
field := fromForm.Item(index,'') as IHTMLElement;
if Assigned(field) then
begin
if field.tagName = 'INPUT' then
begin
// Input field.
input := field as IHTMLInputElement;
result.Add(input.name);
end
else if field.tagName = 'SELECT' then
begin
// Select field.
select := field as IHTMLSelectElement;
result.Add(select.name);
end
else if field.tagName = 'TEXTAREA' then
begin
// TextArea field.
text := field as IHTMLTextAreaElement;
result.Add(text.name);
end;
end;
end;
end;
seems to be working fine for most sites. However there are a few websites such as this one:
http://service.mail.com/registration.html#.1258-bluestripe-product1-undef
By looking at that code and comparing it with the active id, I can find the form it is in. However it does not work for that website. for some reason I think it has to do with htmldocument3 adn that this code is for htmldocument2. But I am not sure.
so my question is How can I extract a tstringlist from this website with all the elements names in them? hope you can help!
Edited: Added some code
begin
theForm := GetFormByNumber(webbrowser1.document as IHTMLDocument2,
0);
fields := GetFormFieldNames(theForm);
num := fields.IndexOf(theid);
end;
until (num <> -1);
One complication with locating form elements in a web page is that the page may contain frames and there may be forms in any of the frames. Basically, you have to iterate through all the frames and the forms in each frame. Once you get the form as an IHTMLFormElement, use Cryer's function to get the form element names.
The example link you gave does not have any frames and you should have had no problems getting your list of form elements, unless you tried to get the form by name because it had no name assigned. I had no problem getting the form element names and values using the following procedure
procedure GetForms(doc1: IHTMLDocument2; var sl: TStringList);
var
i, j, n: integer;
docForm: IHTMLFormElement;
slt: TStringList;
s: string;
begin
if doc1 = nil then
begin
ShowMessage('doc1 is empty [GetForms]');
Exit;
end;
slt := TStringList.Create;
n := NumberOfForms(doc1);
sl.Add('Forms: ' + IntToStr(n));
for i := 0 to n - 1 do
begin
docForm := GetFormByNumber(doc1, i);
sl.Add('Form Name: ' + docForm.Name);
slt.Clear;
slt := GetFormFieldNames(docForm);
for j := 0 to slt.Count - 1 do
begin
s := GetFieldValue(docForm, slt[j]);
sl.Add('Field Name: ' + slt[j] + ' value: "' + s + '"');
end;
end;
sl.Add('');
slt.Free;
end;
Cryer's example for navigating a frameset will not work for all web sites, see http://support.microsoft.com/support/kb/articles/Q196/3/40.ASP. The following function successfuly extracts a frame as an IHTMLDocument2 on all sites I have tried
function GetFrameByNumber(Doc:IHTMLDocument2; n:integer):IHTMLDocument2;
var
Container: IOleContainer;
Enumerator: ActiveX.IEnumUnknown;
Unknown: IUnknown;
Browser: IWebBrowser2;
Fetched: Longint;
NewDoc: IHTMLDocument2;
i : integer;
begin
// We cannot use the document's frames collection here, because
// it does not work in every case (i.e. Documents from a foreign domain).
// From: http://support.microsoft.com/support/kb/articles/Q196/3/40.ASP
i := 0;
if (Supports(Doc, IOleContainer, Container)) and
(Container.EnumObjects(OLECONTF_EMBEDDINGS, Enumerator) = S_OK) then
begin
while Enumerator.Next(1, Unknown, @Fetched) = S_OK do
begin
if (Supports(Unknown, IWebBrowser2, Browser)) and
(Supports(Browser.Document, IHTMLDocument2, NewDoc)) then
begin
// Here, NewDoc is an IHTMLDocument2 that you can query for
// all the links, text edits, etc.
if i=n then
begin
Result := NewDoc;
Exit;
end;
i := i+1;
end;
end;
end;
end;
Here is an example of how I have used GetForms and GetFrameByNumber
// from the TForm1 declaration
{ Public declarations }
wdoc: IHTMLDocument2;
procedure TForm1.btnAnalyzeClick(Sender: TObject);
begin
wdoc := WebBrowser.Document as IHTMLDocument2;
GetDoc(wdoc);
end;
procedure TForm1.GetDoc(doc1: IHTMLDocument2);
var
i, n: integer;
doc2: IHTMLDocument2;
frame_dispatch: IDispatch;
frame_win: IHTMLWindow2;
ole_index: olevariant;
sl: TStringList;
begin
if doc1 = nil then
begin
ShowMessage('Web doc is empty');
Exit;
end;
Form2.Memo1.Lines.Clear;
sl := TStringList.Create;
n := doc1.frames.length;
sl.Add('Frames: ' + IntToStr(n));
// check each frame for the data
if n = 0 then
GetForms(doc1, sl)
else
for i := 0 to n - 1 do
begin
sl.Add('--Frame: ' + IntToStr(i));
ole_index := i;
frame_dispatch := doc1.Frames.Item(ole_index);
if frame_dispatch <> nil then
begin
frame_win := frame_dispatch as IHTMLWindow2;
doc2 := frame_win.document;
// sl.Add(doc2.body.outerHTML);
GetForms(doc2,sl);
GetDoc(doc2);
end;
end;
// Form2 just contains a TMemo
Form2.Memo1.Lines.AddStrings(sl);
Form2.Show;
sl.Free;
end;
The logic in your example is faulty, 1. when there is only 1 form on the web page the list of form elements is never extracted, 2. the repeat loop will result in a access violation unless the the tag in "theid" is found
Here is your example cut down to successfully extract the form elements.
var
i : integer;
nforms : integer;
document : IHTMLDocument2;
theForm : IHTMLFormElement;
fields : TStringList;
theform1 : integer;
num : integer;
theid : string;
begin
fields := TStringList.Create;
theid := 'xx';
// original code follows
i := -1;
// nforms := NumberOfForms(webbrowser1.document as IHTMLDocument2);
// document := webbrowser1.document as IHTMLDocument2;
// if nforms = 1 then
// begin
// theForm := GetFormByNumber(webbrowser1.document as IHTMLDocument2, 0);
// theform1 := 0;
// end
// else
begin
// repeat
begin
inc(i);
theForm := GetFormByNumber(webbrowser1.document as IHTMLDocument2,
i);
fields := GetFormFieldNames(theForm);
num := fields.IndexOf(theid);
theform1 := i;
end;
// until (num <> -1);
end;
// end of original code
Memo1.Lines.Text := fields.Text;
fields.Free;
end;
Hm, are you sure this link contains any form elements? At least I did not see any visible ones. Perhaps they are hidden - did not check this myself, however.
Michael
精彩评论