GetFormFieldNames not always working

I am trying to find out which form an element belongs to. The code I am working from comes from this website:

http://www.cryer.co.uk/brian/delphi/twebbrowser/read_write_form_elements.htm

which contains this code:

function GetFormFieldNames(fromForm: IHTMLFormElement): TStringList;
var
  index: integer;
  field: IHTMLElement;
  input: IHTMLInputElement;
  select: IHTMLSelectElement;
  text: IHTMLTextAreaElement;
begin
  result := TStringList.Create;
  for index := 0 to fromForm.length - 1 do  // items are numbered 0..length-1
  begin
    field := fromForm.Item(index,'') as IHTMLElement;
    if Assigned(field) then
    begin
      if field.tagName = 'INPUT' then
      begin
        // Input field.
        input := field as IHTMLInputElement;
        result.Add(input.name);
      end
      else if field.tagName = 'SELECT' then
      begin
        // Select field.
        select := field as IHTMLSelectElement;
        result.Add(select.name);
      end
      else if field.tagName = 'TEXTAREA' then
      begin
        // TextArea field.
        text := field as IHTMLTextAreaElement;
        result.Add(text.name);
      end;
    end;
  end;

end;

It seems to work fine for most sites. However, there are a few websites where it does not, such as this one:

http://service.mail.com/registration.html#.1258-bluestripe-product1-undef

By comparing the names returned by that code with the id of the active element, I can find the form the element is in. However, it does not work for that website. I suspect it has to do with IHTMLDocument3, whereas this code is written for IHTMLDocument2, but I am not sure.

So my question is: how can I extract a TStringList containing all the element names from this website? Hope you can help!

Edited: Added some code

              begin

                theForm := GetFormByNumber(webbrowser1.document as IHTMLDocument2,
                  0);
                fields := GetFormFieldNames(theForm);
                num := fields.IndexOf(theid);
              end;
              until (num <> -1);


One complication with locating form elements in a web page is that the page may contain frames and there may be forms in any of the frames. Basically, you have to iterate through all the frames and the forms in each frame. Once you get the form as an IHTMLFormElement, use Cryer's function to get the form element names.

The example link you gave does not have any frames, so you should have had no problem getting your list of form elements, unless you tried to get the form by name, because it has no name assigned. I had no problem getting the form element names and values using the following procedure:

procedure GetForms(doc1: IHTMLDocument2; var sl: TStringList);
var
  i, j, n: integer;
  docForm: IHTMLFormElement;
  slt:  TStringList;
  s: string;
begin
  if doc1 = nil then
  begin
    ShowMessage('doc1 is empty [GetForms]');
    Exit;
  end;
  n := NumberOfForms(doc1);
  sl.Add('Forms: ' + IntToStr(n));
  for i := 0 to n - 1 do
  begin
    docForm := GetFormByNumber(doc1, i);
    sl.Add('Form Name: ' + docForm.Name);
    // GetFormFieldNames returns a new list, so it is freed here after use
    // rather than created up front (which would leak).
    slt := GetFormFieldNames(docForm);
    try
      for j := 0 to slt.Count - 1 do
      begin
        s := GetFieldValue(docForm, slt[j]);
        sl.Add('Field Name: ' + slt[j] + '  value: "' + s + '"');
      end;
    finally
      slt.Free;
    end;
  end;
  sl.Add('');
end;
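
If all you need is the form that owns one known element, it may be simpler to ask the element itself rather than listing every form. The sketch below is only an illustration under some assumptions: GetOwningForm is a hypothetical name, the element is looked up by its id through IHTMLDocument3.getElementById, and only INPUT, SELECT and TEXTAREA elements are handled (each of these exposes a read-only form property in MSHTML). Like the other functions here, it only sees the document it is given, not documents inside frames.

```delphi
// uses SysUtils, MSHTML
// Hypothetical helper: returns the form that owns the element with the
// given id, or nil if the element is missing, is not an INPUT, SELECT or
// TEXTAREA, or is not inside a form.
function GetOwningForm(const Doc: IHTMLDocument2;
  const ElementId: WideString): IHTMLFormElement;
var
  doc3: IHTMLDocument3;
  element: IHTMLElement;
  input: IHTMLInputElement;
  select: IHTMLSelectElement;
  text: IHTMLTextAreaElement;
begin
  Result := nil;
  // getElementById lives on IHTMLDocument3, not IHTMLDocument2.
  if not Supports(Doc, IHTMLDocument3, doc3) then
    Exit;
  element := doc3.getElementById(ElementId);
  if element = nil then
    Exit;
  if Supports(element, IHTMLInputElement, input) then
    Result := input.form
  else if Supports(element, IHTMLSelectElement, select) then
    Result := select.form
  else if Supports(element, IHTMLTextAreaElement, text) then
    Result := text.form;
end;
```

The returned IHTMLFormElement can then be passed to GetFormFieldNames as before.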

Cryer's example for navigating a frameset will not work for all web sites; see http://support.microsoft.com/support/kb/articles/Q196/3/40.ASP. The following function successfully extracts a frame as an IHTMLDocument2 on all sites I have tried:

function GetFrameByNumber(Doc:IHTMLDocument2; n:integer):IHTMLDocument2;
var
  Container: IOleContainer;
  Enumerator: ActiveX.IEnumUnknown;
  Unknown: IUnknown;
  Browser: IWebBrowser2;
  Fetched: Longint;
  NewDoc: IHTMLDocument2;
  i : integer;
begin
  // We cannot use the document's frames collection here, because
  // it does not work in every case (e.g. documents from a foreign domain).
  // From: http://support.microsoft.com/support/kb/articles/Q196/3/40.ASP
  Result := nil;  // returned when there is no frame number n
  i := 0;
  if (Supports(Doc, IOleContainer, Container)) and
     (Container.EnumObjects(OLECONTF_EMBEDDINGS, Enumerator) = S_OK) then
  begin
    while Enumerator.Next(1, Unknown, @Fetched) = S_OK do
    begin
      if (Supports(Unknown, IWebBrowser2, Browser)) and
         (Supports(Browser.Document, IHTMLDocument2, NewDoc)) then
      begin
        // Here, NewDoc is an IHTMLDocument2 that you can query for
        // all the links, text edits, etc.
        if i=n then
        begin
          Result := NewDoc;
          Exit;
        end;
        i := i+1;
      end;
    end;
  end;
end;

Here is an example of how I have used GetForms and GetFrameByNumber

// from the TForm1 declaration
    { Public declarations }
    wdoc: IHTMLDocument2;


procedure TForm1.btnAnalyzeClick(Sender: TObject);
begin
  wdoc := WebBrowser.Document as IHTMLDocument2;
  GetDoc(wdoc);
end;

procedure TForm1.GetDoc(doc1: IHTMLDocument2);
var
  i, n: integer;
  doc2: IHTMLDocument2;
  frame_dispatch: IDispatch;
  frame_win: IHTMLWindow2;
  ole_index: olevariant;
  sl: TStringList;
begin
  if doc1 = nil then
  begin
    ShowMessage('Web doc is empty');
    Exit;
  end;
  Form2.Memo1.Lines.Clear;
  sl := TStringList.Create;

  n := doc1.frames.length;
  sl.Add('Frames: ' + IntToStr(n));
  // check each frame for the data
  if n = 0 then
    GetForms(doc1, sl)
  else
    for i := 0 to n - 1 do
    begin
      sl.Add('--Frame: ' + IntToStr(i));
      ole_index := i;
      frame_dispatch := doc1.Frames.Item(ole_index);
      if frame_dispatch <> nil then
      begin
        frame_win := frame_dispatch as IHTMLWindow2;
        doc2 := frame_win.document;
//        sl.Add(doc2.body.outerHTML);
        GetForms(doc2,sl);
        GetDoc(doc2);
      end;
    end;

// Form2 just contains a TMemo
  Form2.Memo1.Lines.AddStrings(sl);
  Form2.Show;
  sl.Free;
end;
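
The GetDoc example above walks doc1.frames. A variant that relies on GetFrameByNumber instead could look like the sketch below. GetFormsInAllFrames is a hypothetical name, and the loop assumes that GetFrameByNumber returns nil once n is past the last frame (that is, that its Result starts out as nil).

```delphi
// uses SysUtils, Classes, MSHTML
// Hypothetical helper: lists the forms of Doc and then, recursively, the
// forms of every embedded frame, using GetFrameByNumber to reach the frames.
procedure GetFormsInAllFrames(const Doc: IHTMLDocument2; sl: TStringList);
var
  i: Integer;
  frameDoc: IHTMLDocument2;
begin
  if Doc = nil then
    Exit;
  GetForms(Doc, sl);
  i := 0;
  // Assumption: GetFrameByNumber returns nil when frame i does not exist.
  frameDoc := GetFrameByNumber(Doc, i);
  while frameDoc <> nil do
  begin
    sl.Add('--Frame: ' + IntToStr(i));
    GetFormsInAllFrames(frameDoc, sl);
    Inc(i);
    frameDoc := GetFrameByNumber(Doc, i);
  end;
end;
```

Each call to GetFrameByNumber restarts the enumeration, which is wasteful on pages with many frames but keeps the sketch short.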

The logic in your example is faulty: 1. when there is only one form on the web page, the list of form elements is never extracted; 2. the repeat loop will result in an access violation unless the tag in "theid" is found, because it keeps asking for forms past the last one.

Here is your example cut down to successfully extract the form elements.

var
  i : integer;
  nforms : integer;
  document : IHTMLDocument2;
  theForm : IHTMLFormElement;
  fields : TStringList;
  theform1 : integer;
  num : integer;
  theid : string;
begin
  // fields is not created here: GetFormFieldNames returns a new list,
  // and creating one first would leak it.
  theid := 'xx';

// original code follows
i := -1;
//    nforms := NumberOfForms(webbrowser1.document as IHTMLDocument2);
//    document := webbrowser1.document as IHTMLDocument2;
//    if nforms = 1 then
//    begin
//      theForm := GetFormByNumber(webbrowser1.document as IHTMLDocument2, 0);
//      theform1 := 0;
//    end
//    else
    begin
//              repeat
              begin
                inc(i);
                theForm := GetFormByNumber(webbrowser1.document as IHTMLDocument2,
                  i);
                fields := GetFormFieldNames(theForm);
                num := fields.IndexOf(theid);
                theform1 := i;
              end;
//              until (num <> -1);
    end;
// end of original code

  Memo1.Lines.Text := fields.Text;
  fields.Free;
end;
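
To search every form for the field without running past the last form, the repeat loop can be replaced by a bounded for loop. This is a minimal sketch, assuming Cryer's NumberOfForms, GetFormByNumber and GetFormFieldNames are available as used above. FindFormIndexByField is a hypothetical name.

```delphi
// uses Classes, MSHTML
// Hypothetical helper: returns the index of the first form that contains a
// field named FieldName, or -1 if no form on the page has such a field.
function FindFormIndexByField(const Doc: IHTMLDocument2;
  const FieldName: string): Integer;
var
  i: Integer;
  fields: TStringList;
begin
  Result := -1;
  if Doc = nil then
    Exit;
  // The loop is bounded by the number of forms, so it cannot run off the end.
  for i := 0 to NumberOfForms(Doc) - 1 do
  begin
    fields := GetFormFieldNames(GetFormByNumber(Doc, i));
    try
      if fields.IndexOf(FieldName) <> -1 then
      begin
        Result := i;
        Exit;  // the finally block still frees the list
      end;
    finally
      fields.Free;
    end;
  end;
end;
```

It could be called as num := FindFormIndexByField(webbrowser1.document as IHTMLDocument2, theid), and a result of -1 means the field was not found in any form of that document.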


Hm, are you sure this link contains any form elements? At least I did not see any visible ones. Perhaps they are hidden - did not check this myself, however.

Michael
