HTML Component Library now supports XPath.

A query can either return an enumerable list of nodes (through an interface) or take an anonymous procedure that is called for each matching node.

   for Node in XN.XPath('//img') do
     Memo1.Lines.Add(Node.Attr['src']);

- list addresses of all images

   XN.XPath('//*[@style]',
      procedure(Node: THtmlNode)
      begin
        Memo1.Lines.Add(Node.Attr['style'])
      end
    );

- list styles of all elements having inline styles

   XN.XPath('//p[@align="right"]',
      procedure(Node: THtmlNode)
      begin
        Node.Attr['align']:='left';
      end
    );

- change alignment of all right-aligned paragraphs.

XPath is implemented in the base class (no GUI dependencies) and is optimized for speed and memory, so it can be used in server-side applications.

HTML parsing speed (DOM): 50 MB/sec.
Avg. XPath processing speed: 20 million nodes/sec.

The parser tolerates almost any error in HTML: missing quotes or angle brackets (<>), unclosed tags, incorrectly ordered tags, invalid attributes, etc.

Comments

  1. In my experience, XPath has a weird syntax, like RegEx: unless you use it every day, you have to learn it all over again each time you come back to it. I guess that's why jQuery dominates the web.

  2. Edwin Yip, good idea. I'll add jQuery syntax too.
