我目前正在适应一份新的工作,我与同事分享的大部分工作都是通过MS Excel完成的。我经常使用数据透视表,因此需要“堆叠”的数据,这正是R中reshape
(reshape2)包中melt()
函数的输出,我已经开始依赖它了。
谁能让我开始使用VBA宏来完成这项任务,或者已经有了这样的宏吗?
宏的大纲将是:
换句话说,输出看起来与简单地在R中执行以下两行的输出完全相同:
require(reshape)
melt(your.unstacked.dataframe, id.vars = 1:4)
下面是一个例子:
# unstacked data
> df1
Year Month Country Sport No_wins No_losses High_score Total_games
2 2010 5 USA Soccer 4 3 5 9
3 2010 6 USA Soccer 5 3 4 8
4 2010 5 CAN Soccer 2 9 7 11
5 2010 6 CAN Soccer 4 8 4 13
6 2009 5 USA Soccer 8 1 4 9
7 2009 6 USA Soccer 0 0 3 2
8 2009 5 CAN Soccer 2 0 6 3
9 2009 6 CAN Soccer 3 0 8 3
# stacking the data
> require(reshape)
> melt(df1, id.vars=1:4)
Year Month Country Sport variable value
1 2010 5 USA Soccer No_wins 4
2 2010 6 USA Soccer No_wins 5
3 2010 5 CAN Soccer No_wins 2
4 2010 6 CAN Soccer No_wins 4
5 2009 5 USA Soccer No_wins 8
6 2009 6 USA Soccer No_wins 0
7 2009 5 CAN Soccer No_wins 2
8 2009 6 CAN Soccer No_wins 3
9 2010 5 USA Soccer No_losses 3
10 2010 6 USA Soccer No_losses 3
11 2010 5 CAN Soccer No_losses 9
12 2010 6 CAN Soccer No_losses 8
13 2009 5 USA Soccer No_losses 1
14 2009 6 USA Soccer No_losses 0
15 2009 5 CAN Soccer No_losses 0
16 2009 6 CAN Soccer No_losses 0
17 2010 5 USA Soccer High_score 5
18 2010 6 USA Soccer High_score 4
19 2010 5 CAN Soccer High_score 7
20 2010 6 CAN Soccer High_score 4
21 2009 5 USA Soccer High_score 4
22 2009 6 USA Soccer High_score 3
23 2009 5 CAN Soccer High_score 6
24 2009 6 CAN Soccer High_score 8
25 2010 5 USA Soccer Total_games 9
26 2010 6 USA Soccer Total_games 8
27 2010 5 CAN Soccer Total_games 11
28 2010 6 CAN Soccer Total_games 13
29 2009 5 USA Soccer Total_games 9
30 2009 6 USA Soccer Total_games 2
31 2009 5 CAN Soccer Total_games 3
32 2009 6 CAN Soccer Total_games 3
发布于 2012-06-07 05:11:02
我在我的博客上有两篇文章,有可用的代码和可下载的工作簿,关于在Excel/VBA中做到这一点:
http://yoursumbuddy.com/data-normalizer
http://yoursumbuddy.com/data-normalizer-the-sql/
代码如下:
'Arguments
'List: The range to be normalized.
'RepeatingColsCount: The number of columns, starting with the leftmost,
' whose headings remain the same.
'NormalizedColHeader: The column header for the rolled-up category.
'DataColHeader: The column header for the normalized data.
'NewWorkbook: Put the sheet with the data in a new workbook?
'
'NOTE: The data must be in a contiguous range and the
'columns that will be repeated must be to the left,
'with the columns to be normalized to the right.
Sub NormalizeList(List As Excel.Range, RepeatingColsCount As Long, _
NormalizedColHeader As String, DataColHeader As String, _
Optional NewWorkbook As Boolean = False)
Dim FirstNormalizingCol As Long, NormalizingColsCount As Long
Dim ColsToRepeat As Excel.Range, ColsToNormalize As Excel.Range
Dim NormalizedRowsCount As Long
Dim RepeatingList() As String
Dim NormalizedList() As Variant
Dim ListIndex As Long, i As Long, j As Long
Dim wbSource As Excel.Workbook, wbTarget As Excel.Workbook
Dim wsTarget As Excel.Worksheet
With List
'If the normalized list won't fit, you must quit.
If .Rows.Count * (.Columns.Count - RepeatingColsCount) > .Parent.Rows.Count Then
MsgBox "The normalized list will be too many rows.", _
vbExclamation + vbOKOnly, "Sorry"
Exit Sub
End If
'You have the range to be normalized and the count of leftmost rows to be repeated.
'This section uses those arguments to set the two ranges to parse
'and the two corresponding arrays to fill
FirstNormalizingCol = RepeatingColsCount + 1
NormalizingColsCount = .Columns.Count - RepeatingColsCount
Set ColsToRepeat = .Cells(1).Resize(.Rows.Count, RepeatingColsCount)
Set ColsToNormalize = .Cells(1, FirstNormalizingCol).Resize(.Rows.Count, NormalizingColsCount)
NormalizedRowsCount = ColsToNormalize.Columns.Count * .Rows.Count
ReDim RepeatingList(1 To NormalizedRowsCount, 1 To RepeatingColsCount)
ReDim NormalizedList(1 To NormalizedRowsCount, 1 To 2)
End With
'Fill in every i elements of the repeating array with the repeating row labels.
For i = 1 To NormalizedRowsCount Step NormalizingColsCount
ListIndex = ListIndex + 1
For j = 1 To RepeatingColsCount
RepeatingList(i, j) = List.Cells(ListIndex, j).Value2
Next j
Next i
'We stepped over most rows above, so fill in other repeating array elements.
For i = 1 To NormalizedRowsCount
For j = 1 To RepeatingColsCount
If RepeatingList(i, j) = "" Then
RepeatingList(i, j) = RepeatingList(i - 1, j)
End If
Next j
Next i
'Fill in each element of the first dimension of the normalizing array
'with the former column header (which is now another row label) and the data.
With ColsToNormalize
For i = 1 To .Rows.Count
For j = 1 To .Columns.Count
NormalizedList(((i - 1) * NormalizingColsCount) + j, 1) = .Cells(1, j)
NormalizedList(((i - 1) * NormalizingColsCount) + j, 2) = .Cells(i, j)
Next j
Next i
End With
'Put the normal data in the same workbook, or a new one.
If NewWorkbook Then
Set wbTarget = Workbooks.Add
Set wsTarget = wbTarget.Worksheets(1)
Else
Set wbSource = List.Parent.Parent
With wbSource.Worksheets
Set wsTarget = .Add(after:=.Item(.Count))
End With
End If
With wsTarget
'Put the data from the two arrays in the new worksheet.
.Range("A1").Resize(NormalizedRowsCount, RepeatingColsCount) = RepeatingList
.Cells(1, FirstNormalizingCol).Resize(NormalizedRowsCount, 2) = NormalizedList
'At this point there will be repeated header rows, so delete all but one.
.Range("1:" & NormalizingColsCount - 1).EntireRow.Delete
'Add the headers for the new label column and the data column.
.Cells(1, FirstNormalizingCol).Value = NormalizedColHeader
.Cells(1, FirstNormalizingCol + 1).Value = DataColHeader
End With
End Sub
你可以这样称呼它:
Sub TestIt()
NormalizeList ActiveSheet.UsedRange, 4, "Variable", "Value", False
End Sub
发布于 2014-08-27 00:27:37
微软最近推出了Power Query,这是一个Excel插件,它为Excel中的数据操作添加了许多有趣的功能和功能,包括您正在寻找的功能。
插件中的实际功能称为“取消透视表列”,in this article对此进行了解释。这里是它的要点:
从表格中选择要取消透视的表格/区域“
columns you to melt/reshape
发布于 2012-08-16 13:57:48
如果任何人正在寻找一种可视化的方法来标准化excel数据,请参阅此视频教程:
https://stackoverflow.com/questions/10921791
复制相似问题