I want to use VBA in Excel 2010 to query UTF-8 encoded CSV files with the following database connection:
provider=Microsoft.Jet.OLEDB.4.0;;data source='xyz';Extended Properties="text;HDR=Yes;FMT=Delimited(,);CharacterSet=65001"
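All CSV files start with the BOM \xEF\xBB\xBF and a header line. Somehow the BOM is not recognized correctly, and the first column header is read as "?header_name", that is, a question mark is prepended. I have tried different CharacterSets, and I have also tried Microsoft.ACE.OLEDB.12.0, but so far nothing has worked.
Is this a known bug, or is there any way to get the correct first column header name without changing the encoding of the source files?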
Answered 2015-11-24 23:48:39
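The procedure below extracts the entire CSV file into a new Sheet and removes the BOM from the header. It takes the Path, Filename and BOM string as arguments to provide flexibility.
Use this procedure to call the query procedure.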
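Sub Qry_Csv_Utf8()
    Const kFile As String = "UTF8 .csv"
    Const kPath As String = "D:\StackOverFlow\Temp\"
    Dim sBOM As String
    ' VBA string literals have no "\x" escapes, so the BOM bytes EF BB BF
    ' are built with Chr (assumes a Western ANSI code page). Adjust this
    ' if the provider returns the BOM in another form, e.g. ChrW(&HFEFF).
    sBOM = Chr(&HEF) & Chr(&HBB) & Chr(&HBF)
    Call Ado_Qry_Csv(kPath, kFile, sBOM)
End Sub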
This is the query procedure.
Sub Ado_Qry_Csv(sPath As String, sFile As String, sBOM As String)
    Dim Wsh As Worksheet
    Dim AdoConnect As ADODB.Connection
    Dim AdoRcrdSet As ADODB.Recordset
    Dim i As Integer

    Rem Add New Sheet - Select option required
    'With ThisWorkbook          'Use this if procedure is resident in workbook receiving csv data
    'With Workbooks(WbkName)    'Use this if procedure is not in workbook receiving csv data
    With ActiveWorkbook         'I used this for testing purposes
        Set Wsh = .Sheets.Add(After:=.Sheets(.Sheets.Count))
        'Wsh.Name = NewSheetName    'rename new Sheet
    End With

    Set AdoConnect = New ADODB.Connection
    AdoConnect.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
        "Data Source=" & sPath & ";" & _
        "Extended Properties='text;HDR=Yes;FMT=Delimited(,);CharacterSet=65001'"

    Set AdoRcrdSet = New ADODB.Recordset
    AdoRcrdSet.Open Source:="SELECT * FROM [" & sFile & "]", _
        ActiveConnection:=AdoConnect, _
        CursorType:=adOpenDynamic, _
        LockType:=adLockReadOnly, _
        Options:=adCmdText

    Rem Enter Csv Records in Worksheet
    Rem Header row: write each field name with the BOM string removed
    For i = 0 To AdoRcrdSet.Fields.Count - 1
        Wsh.Cells(1, 1 + i).Value = _
            WorksheetFunction.Substitute(AdoRcrdSet.Fields(i).Name, sBOM, "")
    Next
    Wsh.Cells(2, 1).CopyFromRecordset AdoRcrdSet

    Rem Release the recordset and the connection
    AdoRcrdSet.Close
    AdoConnect.Close
    Set AdoRcrdSet = Nothing
    Set AdoConnect = Nothing
End Sub
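The header cleanup above depends on knowing exactly which string the provider returns in place of the BOM. As a minimal sketch, the hypothetical helper `StripBom` below (not part of the original answer) removes a leading BOM in any of three forms it might plausibly take: the single character U+FEFF, the three bytes EF BB BF read as ANSI characters, or the "?" described in the question. The "?" branch is an assumption based on that description, and it would also strip a genuine leading question mark from a header name.

```vba
' Hypothetical helper, not part of the original answer.
' Returns sName without a leading BOM, whichever form it arrived in.
Function StripBom(ByVal sName As String) As String
    Dim sAnsiBom As String
    ' Bytes EF BB BF read as three ANSI characters (assumes a Western code page)
    sAnsiBom = Chr(&HEF) & Chr(&HBB) & Chr(&HBF)

    If StrComp(Left$(sName, 1), ChrW(&HFEFF), vbBinaryCompare) = 0 Then
        sName = Mid$(sName, 2)      ' BOM decoded as the character U+FEFF
    ElseIf StrComp(Left$(sName, 3), sAnsiBom, vbBinaryCompare) = 0 Then
        sName = Mid$(sName, 4)      ' BOM left as three ANSI characters
    ElseIf Left$(sName, 1) = "?" Then
        sName = Mid$(sName, 2)      ' Assumption: BOM replaced by "?"
    End If

    StripBom = sName
End Function
```

It could stand in for the `WorksheetFunction.Substitute` call, for example `Wsh.Cells(1, 1).Value = StripBom(AdoRcrdSet.Fields(0).Name)`, applied to the first field only, since the BOM can only appear at the start of the file.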
Answered 2015-11-23 14:07:57
The only solution I found for this problem is to use a Schema.ini file.
My test csv file
Col_A;Col_B;Col_C
Some text example;123456789;3,14
Schema.ini for the test csv file
[UTF-8_Csv_With_BOM.csv]
Format=Delimited(;)
Col1=Col_A Text
Col2=Col_B Long
Col3=Col_C Double
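This Schema.ini file contains the name of the source csv file and describes my columns. Each column is specified by its name and type, but you can specify more information. This file must be in the same folder as the csv file. See the Schema.ini file documentation for more information.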
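Because Schema.ini has to sit in the same folder as the csv file, it can be convenient to create it from VBA just before opening the connection. The sketch below is one possible way to do that; `WriteSchemaIni` is a hypothetical helper, not part of the original answer, and it hard-codes the three columns of the test file. It assumes `sFolder` ends with a backslash, and it overwrites any Schema.ini already in that folder.

```vba
' Hypothetical helper, not part of the original answer.
' Writes a Schema.ini describing the test csv file into sFolder.
' Assumes sFolder ends with "\"; overwrites an existing Schema.ini.
Sub WriteSchemaIni(ByVal sFolder As String, ByVal sCsvName As String)
    Dim iFile As Integer
    iFile = FreeFile

    Open sFolder & "Schema.ini" For Output As #iFile
    Print #iFile, "[" & sCsvName & "]"      ' section name = csv file name
    Print #iFile, "Format=Delimited(;)"     ' semicolon-delimited columns
    Print #iFile, "Col1=Col_A Text"
    Print #iFile, "Col2=Col_B Long"
    Print #iFile, "Col3=Col_C Double"
    Close #iFile
End Sub
```

A call such as `WriteSchemaIni "c:\Temp\StackOverflow\", "UTF-8_Csv_With_BOM.csv"` would produce the same file as the one listed above.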
Finally, the VBA code that reads the csv file. Note HDR=No: this is because the column headers are defined in Schema.ini.
' Add reference to Microsoft ActiveX Data Objects 6.1 Library
Sub ReadCsv()
    Const filePath As String = "c:\Temp\StackOverflow\"
    Const fileName As String = "UTF-8_Csv_With_BOM.csv"
    Dim conn As ADODB.Connection
    Dim rs As New ADODB.Recordset

    Set conn = New ADODB.Connection
    conn.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='" & filePath & _
        "';Extended Properties='text;HDR=No;FMT=Delimited()';"

    With rs
        .ActiveConnection = conn
        .Open "SELECT * FROM [" & fileName & "]"
        If Not .BOF And Not .EOF Then
            While (Not .EOF)
                Debug.Print rs.Fields("Col_A") & " " & _
                    rs.Fields("Col_B") & " " & _
                    rs.Fields("Col_C")
                .MoveNext
            Wend
        End If
        .Close
    End With

    conn.Close
    Set conn = Nothing
End Sub
Output
Some text example 123456789 3,14
https://stackoverflow.com/questions/33820866