
Relationships Among the Aggregate-Operation Structs in GreenPlum 7

yzsDBA
Published 2022-03-29 11:36:28

Executing an aggregate requires state information, which is managed by the AggState struct. The struct is defined as follows:

typedef struct AggState
{
  ScanState  ss;        /* its first field is NodeTag */
  List     *aggs;      /* all Aggref nodes in targetlist & quals */
  int      numaggs;    /* length of list (could be zero!) */
  int      numtrans;    /* number of pertrans items */
  AggStrategy aggstrategy;  /* strategy mode */
  AggSplit  aggsplit;    /* agg-splitting mode, see nodes.h */
  AggStatePerPhase phase;    /* pointer to current phase data */
  int      numphases;    /* number of phases (including phase 0) */
  int      current_phase;  /* current phase number */
  AggStatePerAgg peragg;    /* per-Aggref information */
  AggStatePerTrans pertrans;  /* per-Trans state information */
  ExprContext *hashcontext;  /* econtexts for long-lived data (hashtable) */
  ExprContext **aggcontexts;  /* econtexts for long-lived data (per GS) */
  ExprContext *tmpcontext;  /* econtext for input expressions */
#define FIELDNO_AGGSTATE_CURAGGCONTEXT 14
  ExprContext *curaggcontext; /* currently active aggcontext */
  AggStatePerAgg curperagg;  /* currently active aggregate, if any */
#define FIELDNO_AGGSTATE_CURPERTRANS 16
  AggStatePerTrans curpertrans;  /* currently active trans state, if any */
  bool    input_done;    /* indicates end of input */
  bool    agg_done;    /* indicates completion of Agg scan */
  int      projected_set;  /* The last projected grouping set */
#define FIELDNO_AGGSTATE_CURRENT_SET 20
  int      current_set;  /* The current grouping set being evaluated */
  Bitmapset  *grouped_cols;  /* grouped cols in current projection */
  List     *all_grouped_cols;  /* list of all grouped cols in DESC order */
  /* These fields are for grouping set phase data */
  int      maxsets;    /* The max number of sets in any phase */
  AggStatePerPhase phases;  /* array of all phases */
  Tuplesortstate *sort_in;  /* sorted input to phases > 1 */
  Tuplesortstate *sort_out;  /* input is copied here for next phase */
  TupleTableSlot *sort_slot;  /* slot for sort results */
  /* these fields are used in AGG_PLAIN and AGG_SORTED modes: */
  AggStatePerGroup *pergroups;  /* grouping set indexed array of per-group
                   * pointers */
  HeapTuple  grp_firstTuple; /* copy of first tuple of current group */
  /* these fields are used in AGG_HASHED and AGG_MIXED modes: */
  bool    table_filled;  /* hash table filled yet? */
  int      num_hashes;
  MemoryContext  hash_metacxt;  /* memory for hash table itself */
  struct HashTapeInfo *hash_tapeinfo; /* metadata for spill tapes */
  struct HashAggSpill *hash_spills; /* HashAggSpill for each grouping set,
                     exists only during first pass */
  TupleTableSlot *hash_spill_slot; /* slot for reading from spill files */
  List     *hash_batches;  /* hash batches remaining to be processed */
  bool    hash_ever_spilled;  /* ever spilled during this execution? */
  bool    hash_spill_mode;  /* we hit a limit during the current batch
                     and we must not create new groups */
  Size    hash_mem_limit;  /* limit before spilling hash table */
  uint64    hash_ngroups_limit;  /* limit before spilling hash table */
  int      hash_planned_partitions; /* number of partitions planned
                      for first pass */
  double    hashentrysize;  /* estimate revised during execution */
  Size    hash_mem_peak;  /* peak hash table memory usage */
  uint64    hash_ngroups_current;  /* number of groups currently in
                       memory in all hash tables */
  uint64    hash_disk_used; /* kB of disk space used */
  int      hash_batches_used;  /* batches used during entire execution */

  AggStatePerHash perhash;  /* array of per-hashtable data */
  AggStatePerGroup *hash_pergroup;  /* grouping set indexed array of
                     * per-group pointers */

  /* support for evaluation of agg input expressions: */
#define FIELDNO_AGGSTATE_ALL_PERGROUPS 49
  AggStatePerGroup *all_pergroups;  /* array of first ->pergroups, then
                     * ->hash_pergroup */
  ProjectionInfo *combinedproj;  /* projection machinery */
  int      group_id;    /* GROUP_ID in current projection. This is passed
                 * to GroupingSetId expressions, similar to the
                 * 'grouped_cols' value. */
  int      gset_id;
  /* if input tuple has an AggExprId, save the Attribute Number */
  Index       AggExprId_AttrNum;
} AggState;

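The relationships among these structs are shown in the figure below, using a projection that contains an aggregate as the example.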

The members of AggState are described in turn below.

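ScanState holds the PlanState, which describes the plan node of the aggregate operator. PlanState carries the projection information and a pointer to the plan tree node. The targetlist list of the Plan node contains the information related to the aggregate operation, for example the Aggref nodes; the aggref.args list records which column the aggregate is applied to.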

The aggs list in AggState stores the descriptions of all aggregate functions; each aggref in it ultimately points into the targetlist of the Plan.
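The chain described above (AggState, then ScanState, then PlanState, then Plan and its targetlist) can be sketched with a few toy types. This is only an illustration: every `Mini*` type and `mini_*` function below is a hypothetical, heavily simplified stand-in, and the real code uses List nodes and TargetEntry wrappers rather than plain arrays and an integer column number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real node types. */
typedef enum MiniNodeTag { T_MiniPlan = 1, T_MiniAggState = 2 } MiniNodeTag;

typedef struct MiniAggref
{
    const char *aggname;      /* e.g. "avg" */
    int         argcolumn;    /* stands in for aggref.args: the input column */
} MiniAggref;

typedef struct MiniPlan
{
    MiniNodeTag type;
    MiniAggref *targetlist;   /* the real Plan keeps a List here */
    int         ntargets;
} MiniPlan;

typedef struct MiniPlanState
{
    MiniNodeTag type;         /* first field, so a state pointer can be read as a tag */
    MiniPlan   *plan;         /* back-pointer to the plan tree node */
} MiniPlanState;

typedef struct MiniScanState
{
    MiniPlanState ps;         /* must stay first */
} MiniScanState;

typedef struct MiniAggState
{
    MiniScanState ss;         /* must stay first */
    MiniAggref  **aggs;       /* pointers into plan->targetlist */
    int           numaggs;
} MiniAggState;

/* Read the tag through a generic pointer; valid because the tag is the
 * first field of the first field of the first field. */
MiniNodeTag mini_node_tag(const void *node)
{
    return *(const MiniNodeTag *) node;
}

/* Find the column that aggregate number aggno operates on. */
int mini_agg_input_column(const MiniAggState *aggstate, int aggno)
{
    assert(aggno >= 0 && aggno < aggstate->numaggs);
    return aggstate->aggs[aggno]->argcolumn;
}
```

Because each struct embeds its parent as the first member, a pointer to the outermost struct can be treated as a pointer to any of the inner ones, which is what the "its first field is NodeTag" comment on `ss` refers to.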

aggstrategy specifies the aggregation strategy. There are four:

typedef enum AggStrategy
{
  AGG_PLAIN,          /* simple agg across all input rows */
  AGG_SORTED,          /* grouped agg, input must be sorted */
  AGG_HASHED,          /* grouped agg, use internal hashtable */
  AGG_MIXED          /* grouped agg, hash and sort both used */
} AggStrategy;
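As a small sketch of how the four strategies differ, the helpers below classify a strategy by whether it relies on a hash table or on sorting, following only the comments in the enum above. The two function names are hypothetical and are not part of the GreenPlum or PostgreSQL sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Same enum as shown in the article, repeated so the sketch compiles alone. */
typedef enum AggStrategy
{
    AGG_PLAIN,      /* simple agg across all input rows */
    AGG_SORTED,     /* grouped agg, input must be sorted */
    AGG_HASHED,     /* grouped agg, use internal hashtable */
    AGG_MIXED       /* grouped agg, hash and sort both used */
} AggStrategy;

/* Hypothetical helper: does this strategy build an internal hash table? */
bool strategy_uses_hashtable(AggStrategy s)
{
    return s == AGG_HASHED || s == AGG_MIXED;
}

/* Hypothetical helper: does this strategy rely on sorted input? */
bool strategy_uses_sorting(AggStrategy s)
{
    return s == AGG_SORTED || s == AGG_MIXED;
}
```

AGG_PLAIN answers false to both, since it aggregates all input rows into a single group and needs neither a hash table nor an ordering.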

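phase: the evaluation steps for the aggregates' intermediate (transition) functions, for example the summing function behind avg. No expression steps are generated for the final function; it is called directly inside finalize_aggregate.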

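peragg: metadata for the final functions of the aggregates. This is an array that describes the final function of every aggregate.

pertrans: metadata for the intermediate (transition) functions of the aggregates. This is also an array.

pergroups: the value returned by each transition function, that is, the running transition value kept for each group.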
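The division of labor among pertrans, peragg, and pergroups can be sketched with a toy avg. This is a minimal sketch under stated assumptions, not the executor's actual code: all `Toy*` types and `toy_*` functions are hypothetical, the real executor evaluates transition calls through generated expression steps rather than a plain loop, and this toy returns 0.0 for empty input instead of modeling NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Per-group running state: plays the role of a pergroups entry. */
typedef struct ToyAvgState
{
    double sum;
    long   count;
} ToyAvgState;

typedef void   (*ToyTransFn)(ToyAvgState *state, double input);
typedef double (*ToyFinalFn)(const ToyAvgState *state);

/* Metadata for a transition function: plays the role of a pertrans entry. */
typedef struct ToyPerTrans
{
    ToyTransFn transfn;
} ToyPerTrans;

/* Metadata for a final function: plays the role of a peragg entry. */
typedef struct ToyPerAgg
{
    int        transno;   /* which pertrans entry feeds this aggregate */
    ToyFinalFn finalfn;
} ToyPerAgg;

/* Transition step for avg: accumulate the sum and the row count. */
void toy_avg_trans(ToyAvgState *state, double input)
{
    state->sum += input;
    state->count++;
}

/* Final step for avg: turn the accumulated state into the result. */
double toy_avg_final(const ToyAvgState *state)
{
    return state->count == 0 ? 0.0 : state->sum / (double) state->count;
}

/* Run avg over one group of rows using the three toy arrays. */
double toy_run_avg(const double *rows, size_t nrows)
{
    ToyPerTrans pertrans[1] = { { toy_avg_trans } };
    ToyPerAgg   peragg[1]   = { { 0, toy_avg_final } };
    ToyAvgState pergroup[1] = { { 0.0, 0 } };
    int         transno     = peragg[0].transno;

    /* Transition function runs once per input row. */
    for (size_t i = 0; i < nrows; i++)
        pertrans[transno].transfn(&pergroup[transno], rows[i]);

    /* Final function runs once, directly, when the group is complete. */
    return peragg[0].finalfn(&pergroup[transno]);
}
```

Calling `toy_run_avg` on the four values 1, 2, 3, and 6 accumulates a sum of 12 and a count of 4 in the per-group state, and the final function then divides them.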

Originally published on 2022-03-18 on the yanzongshuaiDBA WeChat official account.
