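Q: ValueError: Per-column arrays must each be 1-dimensional when trying to create a pandas DataFrame from a dictionary. Why?

Stack Overflow user

Asked 2022-03-22 18:54:55

Answers: 1 · Views: 9.7K · Followers: 0 · Votes: 7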

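I am trying to create a very simple pandas DataFrame from a dictionary. The dictionary has 3 items, and so should the DataFrame. They are:

a list with "shape" (3,)
a list/np.array (in different attempts) with shape (3, 3)
a constant, 100 (the same value for the whole column)

Here is the code that succeeds and displays the desired df: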

# from a dictionary
>>> import pandas as pd
>>> dict1 = {"x": [1, 2, 3],
...          "y": list(
...              [
...                  [2, 4, 6],
...                  [3, 6, 9],
...                  [4, 8, 12]
...              ]
...              ),
...          "z": 100}

>>> df1 = pd.DataFrame(dict1)
>>> df1
   x           y    z
0  1   [2, 4, 6]  100
1  2   [3, 6, 9]  100
2  3  [4, 8, 12]  100

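However, when I assign a NumPy ndarray (shape (3, 3)) to the key y and try to create a DataFrame from the dictionary, the line that creates the DataFrame raises an error. Below are the code I tried to run and the error I got (in separate code blocks for readability).

Code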

>>> import numpy as np
>>> import pandas as pd
>>> dict2 = {"x": [1, 2, 3],
...          "y": np.array(
...              [
...                  [2, 4, 6],
...                  [3, 6, 9],
...                  [4, 8, 12]
...              ]
...              ),
...          "z": 100}

>>> df2 = pd.DataFrame(dict2)  # see the block below for the error

Error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
d:\studies\compsci\pyscripts\study\pandas-realpython\data-delightful\01.intro.ipynb Cell 10' in <module>
      1 # from a dicitionary
      2 dict1 = {"x": [1, 2, 3],
      3          "y": np.array(
      4              [
   (...)
      9              ),
     10          "z": 100}
---> 12 df1 = pd.DataFrame(dict1)

File ~\anaconda3\envs\dst\lib\site-packages\pandas\core\frame.py:636, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    630     mgr = self._init_mgr(
    631         data, axes={"index": index, "columns": columns}, dtype=dtype, copy=copy
    632     )
    634 elif isinstance(data, dict):
    635     # GH#38939 de facto copy defaults to False only in non-dict cases
--> 636     mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
    637 elif isinstance(data, ma.MaskedArray):
    638     import numpy.ma.mrecords as mrecords

File ~\anaconda3\envs\dst\lib\site-packages\pandas\core\internals\construction.py:502, in dict_to_mgr(data, index, columns, dtype, typ, copy)
    494     arrays = [
    495         x
    496         if not hasattr(x, "dtype") or not isinstance(x.dtype, ExtensionDtype)
    497         else x.copy()
    498         for x in arrays
    499     ]
    500     # TODO: can we get rid of the dt64tz special case above?
--> 502 return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)

File ~\anaconda3\envs\dst\lib\site-packages\pandas\core\internals\construction.py:120, in arrays_to_mgr(arrays, columns, index, dtype, verify_integrity, typ, consolidate)
    117 if verify_integrity:
    118     # figure out the index, if necessary
    119     if index is None:
--> 120         index = _extract_index(arrays)
    121     else:
    122         index = ensure_index(index)

File ~\anaconda3\envs\dst\lib\site-packages\pandas\core\internals\construction.py:661, in _extract_index(data)
    659         raw_lengths.append(len(val))
    660     elif isinstance(val, np.ndarray) and val.ndim > 1:
--> 661         raise ValueError("Per-column arrays must each be 1-dimensional")
    663 if not indexes and not raw_lengths:
    664     raise ValueError("If using all scalar values, you must pass an index")

ValueError: Per-column arrays must each be 1-dimensional

Why does the second attempt end in an error, even though both arrays have the same dimensions? What is the workaround?

Tags: dataframe, numpy, python, pandas

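1 Answer

Stack Overflow user

Accepted answer

Posted 2022-03-22 23:28:37

If you look closely at the error message and take a quick look at the pandas source code (the _extract_index function shown in the traceback):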

    elif isinstance(val, np.ndarray) and val.ndim > 1:
        raise ValueError("Per-column arrays must each be 1-dimensional")

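You will see that pandas raises this error whenever a dictionary value is a NumPy array with more than one dimension, as in your example. A list works fine because a list never has more than one dimension, even when it is a list of lists.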

lst = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
len(lst)  # 3: a list only has a length, like shape (3,), not (3, 3) as with the NumPy array
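To make the contrast concrete, here is a minimal sketch that reuses the values from the question (the variable names rows, arr, df_ok and error_message are illustrative, not from the original post). It assumes a pandas version that still performs the ndim check shown above. The exact wording of the message may differ between versions, so the sketch only captures it rather than asserting a particular string.

```python
import numpy as np
import pandas as pd

rows = [[2, 4, 6], [3, 6, 9], [4, 8, 12]]
arr = np.array(rows)

# A plain list has no ndim attribute, so the 2-D check never applies to it.
has_ndim = hasattr(rows, "ndim")
# The array built from the same data reports two dimensions.
arr_ndim = arr.ndim

# The list of lists is accepted: each inner list becomes one cell of column "y".
df_ok = pd.DataFrame({"x": [1, 2, 3], "y": rows, "z": 100})

# The 2-D array is rejected with a ValueError.
try:
    pd.DataFrame({"x": [1, 2, 3], "y": arr, "z": 100})
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(has_ndim, arr_ndim, df_ok.shape, error_message)
```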

You can try np.array([1, 2, 3]) instead; it will work because the number of dimensions is 1. To check:

import numpy as np

arr = np.array([1, 2, 3])
print(arr.ndim)  # output is 1
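As a quick check of that claim, the following sketch passes a 1-D array as a dictionary value. The column names mirror the question; the data in y is made up for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([1, 2, 3])  # one dimension, shape (3,)

# A 1-D array is a valid column, and the scalar 100 is repeated for every row.
df = pd.DataFrame({"x": [1, 2, 3], "y": arr, "z": 100})
print(df)
```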

If you need to start from a NumPy array in the dictionary, you can convert it to a list with .tolist().
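Applied to the dictionary from the question, that suggestion might look like the sketch below. The names arr and df2 are illustrative, and the conversion happens at the point where the dictionary is built.

```python
import numpy as np
import pandas as pd

arr = np.array([[2, 4, 6], [3, 6, 9], [4, 8, 12]])

# .tolist() turns the (3, 3) array into a list of three Python lists,
# so pandas sees three row values rather than a 2-D array.
dict2 = {"x": [1, 2, 3], "y": arr.tolist(), "z": 100}

df2 = pd.DataFrame(dict2)
print(df2)
```

Each cell of column y then holds a plain Python list, which matches the frame the question shows as the desired result.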

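Votes: 7

Page content provided by Stack Overflow. Translation support for the source page was provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/71577514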
