I'll explain what I want with sample code. My function GetDox
looks close, but it is still incomplete. Here is the test code.
'test begin...
'<dox>
' <member type="Public Sub" name="Increment" return="void">
' <param type="Integer" name="nBase" out="true" />
' <param type="Integer" name="nStep" out="false" />
' <purpose>
' purpose here...
' </purpose>
' </member>
' <member ... />
'</dox>
'other comments here...
Public Sub Increment(nBase, nStep) 'some example content
    nBase = nBase + nStep
End Sub
'<Unwonted_Item />
Dim source 'reading the same file just for simplification
With CreateObject("Scripting.FileSystemObject")
    With .OpenTextFile(WScript.ScriptFullName, 1, False)
        source = .ReadAll
    End With
End With
result = GetDox(source)
WScript.Echo result 'display our result
Function GetDox(sCode) 'unfinished function
    Dim regEx, Match, Matches, mVal, sEnd
    sEnd = "</dox>" & vbNewLine
    Set regEx = New RegExp
    regEx.Pattern = "('<dox>\n|'\s*<.*)" 'my ugly pattern
    regEx.IgnoreCase = True
    regEx.Global = True
    Set Matches = regEx.Execute(sCode)
    For Each Match In Matches
        mVal = Match.Value
        mVal = Replace(mVal, vbCr, vbNewLine)
        mVal = Right(mVal, Len(mVal) - 1)
        GetDox = GetDox & mVal
        If mVal = sEnd Then Exit For
    Next
End Function
This is what I get:
<dox>
<member type="Public Sub" name="Increment" return="void">
<param type="Integer" name="nBase" out="true" />
<param type="Integer" name="nStep" out="false" />
<purpose>
</purpose>
</member>
<member ... />
</dox>
This is what I need:
<dox>
<member type="Public Sub" name="Increment" return="void">
<param type="Integer" name="nBase" out="true" />
<param type="Integer" name="nStep" out="false" />
<purpose>
purpose here...
</purpose>
</member>
<member ... />
</dox>
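The line "purpose here..." is missing from my output. I know my whole RegExp.Pattern syntax is weak. I just want to select everything that starts with <dox> and ends with </dox>, including everything in between, but I'm still stuck on the pattern syntax.
P.S. Thanks to such great help (thank you, everyone), this is my working function now: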
Function GetDox(sCode)
    GetDox = vbNullString
    With New RegExp
        .Pattern = "<dox>[\s\S]*?</dox>"
        .IgnoreCase = True
        .Global = False
        With .Execute(sCode)
            If .Count = 0 Then Exit Function
            GetDox = .Item(0).Value
        End With
        .Pattern = "^'"
        .Global = True
        .Multiline = True
        GetDox = .Replace(GetDox, "")
    End With
End Function
Answered 2013-03-17 10:53:52
I'd first remove the leading single quotes:
regEx.Pattern = "^'"
regEx.Global = True
regEx.Multiline = True 'without this, ^ only matches at the very start of the string
sCode = regEx.Replace(sCode, "")
Then extract the XML text:
regEx.Pattern = "<dox>[\s\S]*?</dox>"
regEx.Global = False
regEx.IgnoreCase = True
Set m = regEx.Execute(sCode)
If m.Count > 0 Then GetDox = m(0).Value
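To see the two steps working together, here is a minimal sketch run against an inline string instead of a file. The `sample` text and the variable names `re` and `m` are made up for the illustration. The key points are that `.` does not match line breaks in VBScript's RegExp, which is why `[\s\S]` is used, and that the lazy `*?` stops at the first `</dox>`.

```vbscript
Option Explicit

' Hypothetical sample input: a commented XML block followed by an unrelated tag.
Dim sample
sample = "'<dox>" & vbCrLf & _
         "' <purpose>" & vbCrLf & _
         "' purpose here..." & vbCrLf & _
         "' </purpose>" & vbCrLf & _
         "'</dox>" & vbCrLf & _
         "'<Unwonted_Item />" & vbCrLf

Dim re, m
Set re = New RegExp
re.IgnoreCase = True

' Step 1: strip the leading single quote from every line.
re.Pattern = "^'"
re.Global = True
re.Multiline = True   ' lets ^ match at each line start
sample = re.Replace(sample, "")

' Step 2: take everything from <dox> up to the first </dox>.
' [\s\S] matches any character, line breaks included.
re.Pattern = "<dox>[\s\S]*?</dox>"
re.Global = False
Set m = re.Execute(sample)
If m.Count > 0 Then WScript.Echo m(0).Value
```

Run with `cscript`, this should echo the block including the "purpose here..." line and leave out the tag after `</dox>`.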
After that, you should read the XML into a DOM tree for further processing:
Set xml = CreateObject("Msxml2.DOMDocument.6.0")
xml.async = False
xml.loadXML result
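It is worth checking whether the text actually parsed, since `loadXML` returns False on malformed input rather than raising an error. A small sketch, with an inline XML string that is only a stand-in for the extracted `result`:

```vbscript
Option Explicit

Dim xml
Set xml = CreateObject("Msxml2.DOMDocument.6.0")
xml.async = False

' The literal below is a made-up stand-in for the text returned by GetDox.
If xml.loadXML("<dox><member name=""Increment"" /></dox>") Then
    WScript.Echo "Root element: " & xml.documentElement.nodeName
Else
    ' parseError describes what went wrong and where.
    WScript.Echo "Parse error on line " & xml.parseError.line & _
                 ": " & xml.parseError.reason
End If
```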
If your XML is in a separate file, you should load it directly from the file and use XPath expressions to extract the nodes, as @FrankSchmitt suggested in his comment.
Set xml = CreateObject("Msxml2.DOMDocument.6.0")
xml.async = False
xml.load "C:\path\to\your.xml"
Set nodes = xml.selectNodes("//dox")
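Once the document is loaded, the nodes can be walked without touching any regular expressions. The sketch below assumes the element and attribute names from the question's sample (`member`, `param`, `name`, `type`) and uses an inline string so it runs on its own:

```vbscript
Option Explicit

Dim xml, member, param
Set xml = CreateObject("Msxml2.DOMDocument.6.0")
xml.async = False

' Inline stand-in for the documentation block from the question.
xml.loadXML "<dox>" & _
            "<member type=""Public Sub"" name=""Increment"" return=""void"">" & _
            "<param type=""Integer"" name=""nBase"" out=""true"" />" & _
            "<param type=""Integer"" name=""nStep"" out=""false"" />" & _
            "</member>" & _
            "</dox>"

' List each member followed by its parameters.
For Each member In xml.selectNodes("//dox/member")
    WScript.Echo member.getAttribute("type") & " " & member.getAttribute("name")
    For Each param In member.selectNodes("param")
        WScript.Echo "  " & param.getAttribute("name") & _
                     " As " & param.getAttribute("type")
    Next
Next
```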
XML isn't line-oriented and shouldn't be parsed as if it were. If you handle it that way, things can break in interesting ways.
https://stackoverflow.com/questions/15456505