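I can't seem to work out a RegExp that extracts the bullet points between two groups of words in a Word document.
For example:
Risk Assessment:
Internal Audit
In this case I want to extract the bullet points between "Risk Assessment" and "Internal Audit", one item at a time, and assign each item to an Excel cell. As the code below shows, I have almost everything working except for the correct regex pattern. Any help would be great. Thanks in advance!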
Sub PopulateExcelTable()
    Dim fd As Office.FileDialog
    Set fd = Application.FileDialog(msoFileDialogFilePicker)
    With fd
        .AllowMultiSelect = False
        .Title = "Please select the file."
        .Filters.Clear
        .Filters.Add "Word 2007-2013", "*.docx"
        If .Show = True Then
            txtFileName = .SelectedItems(1)
        End If
    End With
    Dim WordApp As Word.Application
    Set WordApp = CreateObject("Word.Application")
    Dim WordDoc As Word.Document
    Set WordDoc = WordApp.Documents.Open(txtFileName)
    Dim str As String: str = WordDoc.Content.Text ' Assign entire document content to string
    Dim rex As New RegExp
    rex.Pattern = "\b[^Risk Assessment\s].*[^Internal Audit\s]"
    Dim i As Long: i = 1
    rex.Global = True
    For Each mtch In rex.Execute(str)
        Debug.Print mtch
        Range("A" & i).Value = mtch
        i = i + 1
    Next mtch
    WordDoc.Close
    WordApp.Quit
End Sub

Posted on 2014-04-23 22:52:48
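This may be a roundabout way of solving the problem, but it works.
The steps I'm taking:
Group the regexp pattern so that you can extract everything between the two phrases. Note: I don't see the code that links to the Excel workbook, so I'm assuming that part works.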
Dim rex As New RegExp
' Three groups: start phrase, everything in between, end phrase
rex.Pattern = "(\bRisk Assessment\s)(.*)(Internal\sAudit\s)"
rex.Global = True
rex.MultiLine = True
rex.IgnoreCase = True
Dim lineArray() As String
Dim myMatches As Object
Set myMatches = rex.Execute(str)
For Each mtch In myMatches
    'Debug.Print mtch.SubMatches(1)
    ' SubMatches(1) is the second group, i.e. the text between the two phrases
    lineArray = Split(mtch.SubMatches(1), vbLf)
    For x = LBound(lineArray) To UBound(lineArray)
        'Debug.Print lineArray(x)
        Range("A" & i).Value = lineArray(x)
        i = i + 1
    Next
Next mtch

My test page looked like this:

The results returned from the inner Debug.Print line were:
Item 1
Item 2
Item 3

https://stackoverflow.com/questions/23254201
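
A note on why the pattern in the question cannot work: square brackets define a character class, so [^Risk Assessment\s] matches one single character that is not any of those letters or whitespace. It does not match the phrase itself. Below is a minimal sketch of a more defensive variant of the grouping approach. It assumes the two headings appear literally as "Risk Assessment" and "Internal Audit", and that the line breaks in the document text may be vbCr, vbLf, or both. ItemsBetween and DemoItemsBetween are hypothetical names that are not part of the original code. The sketch uses [\s\S]*? rather than .* so that the captured group can span line breaks however the engine treats the dot, and so that it stops at the first closing phrase.

```vba
' Hypothetical helper: returns the non-empty lines found between the
' "Risk Assessment" and "Internal Audit" headings of the given text.
Function ItemsBetween(ByVal text As String) As Collection
    Dim result As New Collection
    Dim rex As Object
    ' Late binding, so no reference to the VBScript RegExp library is needed
    Set rex = CreateObject("VBScript.RegExp")
    rex.IgnoreCase = True
    rex.Global = False
    ' [\s\S]*? matches any characters, line breaks included, as few as possible
    rex.Pattern = "Risk Assessment:?\s*([\s\S]*?)Internal Audit"

    Dim matches As Object
    Set matches = rex.Execute(text)
    If matches.Count > 0 Then
        Dim body As String
        body = matches(0).SubMatches(0)
        ' Normalize every kind of line break to vbLf before splitting
        body = Replace(Replace(body, vbCrLf, vbLf), vbCr, vbLf)
        Dim part As Variant
        For Each part In Split(body, vbLf)
            ' Skip blank lines such as the one left before the closing heading
            If Len(Trim$(part)) > 0 Then result.Add Trim$(part)
        Next part
    End If
    Set ItemsBetween = result
End Function

' Hypothetical demo using an in-memory sample instead of a Word document
Sub DemoItemsBetween()
    Dim sample As String
    sample = "Risk Assessment:" & vbCr & "Item 1" & vbCr & _
             "Item 2" & vbCr & "Internal Audit" & vbCr
    Dim item As Variant
    For Each item In ItemsBetween(sample)
        Debug.Print item
    Next item
End Sub
```

In the original macro, the same loop could write each item to Range("A" & i) instead of calling Debug.Print, with str passed in place of sample.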