How to get inner HTML with Indy

I'm stuck on this problem: I need to get the time and date from the page presnycas.eu (to sync). The date is fine, but I cannot get the time. When I call the IdHTTP.Get(..) method, I get the HTML of the page, but the time is missing, like this:

<div class="boxik"> 
<table style="text-align: left; width: 700px; height: 116px;" border="0" cellpadding="2" cellspacing="0"> 
  <tbody> 
    <tr> 
      <td style="width: 400px;" colspan="1" rowspan="5"> 
            <div class="hodinyhlavni"> 

            <span id="servertime"></span> 
              // This is where the time should be - when viewed with 
              // developer tools in Chrome, it does show the time
              // (picture here http://img684.imageshack.us/img684/166/pagem.png)
            </div> 
      </td> 
      <td style="width: 0px;"> &nbsp;      
           07.07.2011
      </td> 

Right now I am using an awkward workaround: I load the page in a TWebBrowser and then call

Time:=StrToTime(WebBrowser1.OleObject.Document.GetElementByID('servertime').innerhtml);

but it's rather slow, and I would rather not use TWebBrowser at all.

So, how can I get the innerHTML of an element with a plain function call?

Thanks in advance


The most important part of this answer is: you need to understand HTML and JavaScript and figure out how the site works. Open the web site, right-click and choose "Show Source". You'll notice this near the top:

<script type="text/javascript">var currenttime = 'July 07, 2011 11:51:14'</script>

That looks like the time; in my case it is correct, but not adjusted to MY time zone. You can easily grab the plain HTML using Indy, and apparently that's enough. This quick code sample shows how to grab the HTML and parse the date and time with a small regular expression. If you're on Delphi XE, replace the PerlRegEx unit name (and the TPerlRegEx class name, if needed) with whatever XE's bundled regex unit uses. If you're on an older Delphi, that's no excuse NOT to use RegEx! Download TPerlRegEx; it's free and compatible with the XE version.

program Project29;

{$APPTYPE CONSOLE}

uses
  SysUtils, IdHTTP, PerlRegEx, SysConst;

function ExtractDayTime: TDateTime;
var
  H: TIdHTTP;
  Response: string;
  RegEx: TPerlRegEx;
  s: string;
  Month, Year, Day, Hour, Minute, Second: Word;
begin
  H := TIdHttp.Create(Nil);
  try
    Response := H.Get('http://presnycas.eu/');
    RegEx := TPerlRegEx.Create;
    try
      RegEx.RegEx := 'var\ currenttime\ \=\ \''(\w+)\ (\d{1,2})\,\ (\d{4})\ (\d{1,2})\:(\d{1,2})\:(\d{1,2})\''';
      RegEx.Subject := Response;
      if RegEx.Match then
        begin
          // Translate the month name (short or long form) to its number
          s := RegEx.Groups[1];
          if s = SShortMonthNameJan then Month := 1
          else if s = SShortMonthNameFeb then Month := 2
          else if s = SShortMonthNameMar then Month := 3
          else if s = SShortMonthNameApr then Month := 4
          else if s = SShortMonthNameMay then Month := 5
          else if s = SShortMonthNameJun then Month := 6
          else if s = SShortMonthNameJul then Month := 7
          else if s = SShortMonthNameAug then Month := 8
          else if s = SShortMonthNameSep then Month := 9
          else if s = SShortMonthNameOct then Month := 10
          else if s = SShortMonthNameNov then Month := 11
          else if s = SShortMonthNameDec then Month := 12

          else if s = SLongMonthNameJan then Month := 1
          else if s = SLongMonthNameFeb then Month := 2
          else if s = SLongMonthNameMar then Month := 3
          else if s = SLongMonthNameApr then Month := 4
          else if s = SLongMonthNameMay then Month := 5
          else if s = SLongMonthNameJun then Month := 6
          else if s = SLongMonthNameJul then Month := 7
          else if s = SLongMonthNameAug then Month := 8
          else if s = SLongMonthNameSep then Month := 9
          else if s = SLongMonthNameOct then Month := 10
          else if s = SLongMonthNameNov then Month := 11
          else if s = SLongMonthNameDec then Month := 12

          else
            raise Exception.CreateFmt('Don''t know what month is: %s', [s]);

          // Day, Year, Hour, Minute, Second
          Day := StrToInt(RegEx.Groups[2]);
          Year := StrToInt(RegEx.Groups[3]);
          Hour := StrToInt(RegEx.Groups[4]);
          Minute := StrToInt(RegEx.Groups[5]);
          Second := StrToInt(RegEx.Groups[6]);

          Result := EncodeDate(Year, Month, Day) + EncodeTime(Hour, Minute, Second, 0);

        end
      else
        raise Exception.Create('Can''t get time!');
    finally
      RegEx.Free;
    end;
  finally
    H.Free;
  end;
end;

begin
  WriteLn(DateTimeToStr(ExtractDayTime));
  ReadLn;
end.
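
For Delphi XE and later, the same parse could be sketched with the TRegEx record from the bundled RegularExpressions unit (named System.RegularExpressions when unit scope names are in use). This is only a sketch under stated assumptions: ParseCurrentTime, MonthNames and SampleHtml are names made up for illustration, the input is a hard-coded string rather than a live IdHTTP.Get response, and the page is assumed to always use full English month names.

```pascal
program ParseCurrentTimeDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, RegularExpressions;

const
  // Full English month names, as seen in the page's currenttime variable.
  MonthNames: array[1..12] of string = (
    'January', 'February', 'March', 'April', 'May', 'June', 'July',
    'August', 'September', 'October', 'November', 'December');

  // Hypothetical sample input; in real use this would be the IdHTTP.Get result.
  SampleHtml = '<script type="text/javascript">' +
    'var currenttime = ''July 07, 2011 11:51:14''</script>';

// Hypothetical helper: extracts the server time from the page source.
function ParseCurrentTime(const Html: string): TDateTime;
var
  M: TMatch;
  I, Month: Integer;
begin
  M := TRegEx.Match(Html,
    'var currenttime = ''(\w+) (\d{1,2}), (\d{4}) (\d{1,2}):(\d{1,2}):(\d{1,2})''');
  if not M.Success then
    raise Exception.Create('currenttime variable not found');

  // Look the month name up in the table, ignoring case.
  Month := 0;
  for I := Low(MonthNames) to High(MonthNames) do
    if SameText(M.Groups[1].Value, MonthNames[I]) then
    begin
      Month := I;
      Break;
    end;
  if Month = 0 then
    raise Exception.CreateFmt('Unknown month: %s', [M.Groups[1].Value]);

  // Groups: 2 = day, 3 = year, 4 = hour, 5 = minute, 6 = second.
  Result :=
    EncodeDate(StrToInt(M.Groups[3].Value), Month, StrToInt(M.Groups[2].Value)) +
    EncodeTime(StrToInt(M.Groups[4].Value), StrToInt(M.Groups[5].Value),
      StrToInt(M.Groups[6].Value), 0);
end;

begin
  WriteLn(FormatDateTime('yyyy-mm-dd hh:nn:ss', ParseCurrentTime(SampleHtml)));
end.
```

The lookup table replaces the long if/else chain; if the site ever switched to abbreviated month names, the table would need a second set of entries.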


I tried the link you specified (http://presnycas.eu/), and from the HTML I can see that the actual time is returned at another location in the page and is then incremented locally by JavaScript. So if you want to stay in sync, you need to fetch the new time periodically.

Look for this in the HTML (inside the HEAD element):

<head>
...
<script type="text/javascript">var currenttime = 'July 07, 2011 12:01:26'</script>
...
</head>
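
One way to avoid fetching the page too often is to remember the difference between the server clock and the local clock after each fetch, and apply it in between. The following is a minimal sketch of that idea, not something taken from the site: ClockOffset, SyncFromServerTime and CorrectedTime are hypothetical names, the times are made-up values, and network latency and time zone differences are ignored.

```pascal
program ClockOffsetDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils, DateUtils;

var
  // Server clock minus local clock, in TDateTime units (days).
  ClockOffset: TDateTime = 0;

// Hypothetical helper: call after each fetch with the parsed server time
// and the local time at which it was received.
procedure SyncFromServerTime(ServerTime, LocalTime: TDateTime);
begin
  ClockOffset := ServerTime - LocalTime;
end;

// Hypothetical helper: estimates the server time for a given local time.
function CorrectedTime(LocalTime: TDateTime): TDateTime;
begin
  Result := LocalTime + ClockOffset;
end;

var
  LocalAtFetch: TDateTime;
begin
  // Made-up values: the server reports 12:01:26 while the local clock shows 12:00:00.
  LocalAtFetch := EncodeDate(2011, 7, 7) + EncodeTime(12, 0, 0, 0);
  SyncFromServerTime(EncodeDate(2011, 7, 7) + EncodeTime(12, 1, 26, 0), LocalAtFetch);

  // Thirty local seconds later, the estimate has advanced by thirty seconds as well.
  WriteLn(FormatDateTime('hh:nn:ss', CorrectedTime(IncSecond(LocalAtFetch, 30))));
end.
```

In a real program, LocalTime would be Now at the moment the response arrives, and CorrectedTime(Now) would be used wherever the synced time is needed.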


How to get the HTML using Indy's TIdHTTP

var
  Form2: TForm2;
  xpto: TMemoryStream; // receives the raw response body
  xx: string;

implementation

{$R *.fmx}

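procedure TForm2.Button1Click(Sender: TObject);
begin
  // Release the stream left over from a previous click (Free is safe on nil)
  xpto.Free;
  xpto := TMemoryStream.Create;
  // Download the page into the memory stream
  IdHTTP1.Get('http://google.com', xpto);
  xpto.Position := 0;
end;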


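procedure TForm2.IdHTTP1WorkEnd(ASender: TObject; AWorkMode: TWorkMode);
var
  x: string;
begin
  // Only react once the response has been read
  if AWorkMode <> wmRead then
    Exit;

  // Copy the downloaded bytes into a string and show them
  SetString(x, PAnsiChar(xpto.Memory), xpto.Size);
  Memo1.Lines.Add(x);
end;

// For Android FireMonkey, replace PAnsiChar with MarshaledAString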
