How do I get all elements by class with Html Agility Pack?

Content sourced from Stack Overflow; translated and used under the CC BY-SA 3.0 license.

For example:

var findclasses = _doc.DocumentNode.Descendants("div").Where(d => d.Attributes.Contains("class"));

Obviously, classes can be added to many more elements than just div, so I tried this:

var allLinksWithDivAndClass = _doc.DocumentNode.SelectNodes("//*[@class=\"float\"]");

However, this doesn't handle elements that have multiple classes where "float" is only one of them, like this:

class="className float anotherclassName"

Is there a way to handle all of these cases?

Answer:

Add another clause to the predicate:

var findclasses = _doc.DocumentNode
    .Descendants( "div" )
    .Where( d => 
        d.Attributes.Contains("class")
        &&
        d.Attributes["class"].Value.Contains("float")
    );

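Note that Contains("float") is a plain substring check, so it also matches a class such as "floating". I suggest creating an extension method, HasClass, and using it like this: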

IEnumerable<HtmlNode> hasFloatClass = _doc.DocumentNode
    .Descendants( "div" )
    .Where( div => div.HasClass( "float" ) );

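// Extension methods must be declared inside a non-generic static class.
public static Boolean HasClass(this HtmlNode element, String className)
{
    if( element == null ) throw new ArgumentNullException( nameof(element) );
    // A null, empty, or whitespace-only class name is rejected up front.
    if( String.IsNullOrWhiteSpace( className ) ) throw new ArgumentNullException( nameof(className) );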
    if( element.NodeType != HtmlNodeType.Element ) return false;

    HtmlAttribute classAttrib = element.Attributes["class"];
    if( classAttrib == null ) return false;

    Boolean hasClass = CheapClassListContains( classAttrib.Value, className, StringComparison.Ordinal );
    return hasClass;
}

/// <summary>Performs optionally-whitespace-padded string search without new string allocations.</summary>
/// <remarks>A regex might also work, but constructing a new regex every time this method is called would be expensive.</remarks>
private static Boolean CheapClassListContains(String haystack, String needle, StringComparison comparison)
{
    if( String.Equals( haystack, needle, comparison ) ) return true;
    Int32 idx = 0;
    while( idx + needle.Length <= haystack.Length )
    {
        idx = haystack.IndexOf( needle, idx, comparison );
        if( idx == -1 ) return false;

        Int32 end = idx + needle.Length;

        // Needle must be enclosed in whitespace or be at the start/end of string
        Boolean validStart = idx == 0               || Char.IsWhiteSpace( haystack[idx - 1] );
        Boolean validEnd   = end == haystack.Length || Char.IsWhiteSpace( haystack[end] );
        if( validStart && validEnd ) return true;

        idx++;
    }
    return false;
}
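
For comparison, the same whole-token check can be written more simply by splitting the attribute value on whitespace, at the cost of allocating an array on every call. This is only a minimal sketch: the class name ClassListSketch and the method ContainsToken are hypothetical names chosen for illustration and are not part of Html Agility Pack.

```csharp
using System;
using System.Linq;

public static class ClassListSketch
{
    // Hypothetical helper (not part of Html Agility Pack): returns true when
    // className is one of the whitespace-separated tokens in classAttributeValue.
    public static bool ContainsToken(string classAttributeValue, string className)
    {
        if (string.IsNullOrWhiteSpace(classAttributeValue)) return false;

        // A null separator array makes Split break on any whitespace character.
        return classAttributeValue
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Contains(className, StringComparer.Ordinal);
    }
}
```

With this sketch, ContainsToken("className float anotherclassName", "float") is true, while ContainsToken("floating left", "float") is false. The allocation-free CheapClassListContains above is the better fit when the check runs over many nodes.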
Answer:

You can solve this by using the 'contains' function in the XPath query, like this:

var allElementsWithClassFloat = 
   _doc.DocumentNode.SelectNodes("//*[contains(@class,'float')]");

To use this inside a function, do something like the following:

string classToFind = "float";    
var allElementsWithClassFloat = 
   _doc.DocumentNode.SelectNodes(string.Format("//*[contains(@class,'{0}')]", classToFind));
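
One caveat: contains(@class,'float') is a substring test, so it would also select an element with class="floating". A common XPath 1.0 idiom pads the normalized class list with spaces and searches for the padded name instead. The sketch below assumes the HtmlAgilityPack NuGet package is referenced; the class XPathClassSketch and the method SelectByClass are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class XPathClassSketch
{
    // Hypothetical helper: selects every element whose class attribute
    // contains className as a whole whitespace-separated token.
    // Assumes className has no apostrophes, since it is pasted into the XPath.
    public static IList<HtmlNode> SelectByClass(HtmlDocument doc, string className)
    {
        string xpath = string.Format(
            "//*[contains(concat(' ', normalize-space(@class), ' '), ' {0} ')]",
            className);

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);
        return nodes == null ? new List<HtmlNode>() : nodes.ToList();
    }
}
```

Calling XPathClassSketch.SelectByClass(_doc, "float") on a document that contains both class="className float anotherclassName" and class="floating" should return only the first element. Because SelectNodes can return null, the helper converts that case to an empty list so callers can iterate without a null check.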
