On my previous post I have shown how to investigate particular function execution, actually I’ve just borrowed a ready to use piece of code here: Profiling with perf
In this post you will see how to go much deeper
Let’s consider this fuction ExecInitSeqScan as an example
You can see this function exported in ../gpdb/src/include/executor/nodeSeqscan.h and having a signature as
extern SeqScanState *ExecInitSeqScan(SeqScan *node, EState *estate, int eflags);parameters of the function are having the following definitions in turn
typedef Scan SeqScan; --plannodes.h --======================================================= typedef struct Scan --plannodes.h { Plan plan; Index scanrelid; /* relid is index into the range table */ } Scan; --=================================================== typedef struct Plan --plannodes.h { NodeTag type; /* * estimated execution costs for plan (see costsize.c for more info) */ Cost startup_cost; /* cost expended before fetching any tuples */ Cost total_cost; /* total cost (assuming all tuples fetched) */ /* * planner's estimate of result size of this plan step */ double plan_rows; /* number of rows plan is expected to emit */ int plan_width; /* average row width in bytes */ /* * information needed for parallel query */ bool parallel_aware; /* engage parallel-aware logic? */ bool parallel_safe; /* OK to use as part of parallel plan? */ /* * Common structural data for all Plan types. */ int plan_node_id; /* unique across entire final plan tree */ List *targetlist; /* target list to be computed at this node */ List *qual; /* implicitly-ANDed qual conditions */ struct Plan *lefttree; /* input plan tree(s) */ struct Plan *righttree; List *initPlan; /* Init Plan nodes (un-correlated expr * subselects) */ /* * Information for management of parameter-change-driven rescanning * * extParam includes the paramIDs of all external PARAM_EXEC params * affecting this plan node or its children. setParam params from the * node's initPlans are not included, but their extParams are. * * allParam includes all the extParam paramIDs, plus the IDs of local * params that affect the node (i.e., the setParams of its initplans). * These are _all_ the PARAM_EXEC params that affect this node. */ Bitmapset *extParam; Bitmapset *allParam; /* * MPP needs to keep track of the characteristics of flow of output * tuple of Plan nodes. */ Flow *flow; /* Flow description. Initially NULL. * Set during parallelization. */ /** * How much memory (in KB) should be used to execute this plan node? */ uint64 operatorMemKB; } Plan;When you have you code compiled with debug information – you can actually “perf” not only entry point of the fuction but rather many other lines of code, you can check which ones by
sudo perf probe --line ExecInitSeqScan -x /usr/local/gpdb/bin/postgres
Let’s proceed with our function of choice though
-- At first we delete potentially already set probe for ExecInitSeqScan sudo perf probe -d ExecInitSeqScan -- we tranverse those SeqScan node parameter via SeqScan->Plan-> Plan.plan_rows sudo perf probe -x /usr/local/gpdb/bin/postgres ExecInitSeqScan 'node->plan.plan_rows:s64'after that we can start probing (on running gpdb instance)
sudo perf record -e probe_postgres:ExecInitSeqScan -u gpadmin -- -u gbadmin means all processes under gpadmin userwe are starting some workload in a separate session/terminal, then stop recording via Ctrl-C
several probe results are saved in perf.data(default) file.
we can see “formatted “results as
sudo perf script -i perf.dataYou will see many rows like
postgres 25226 [003] 9568.804588: probe_postgres:ExecInitSeqScan: (9a7960) plan_rows=0 postgres 25313 [003] 9585.033424: probe_postgres:ExecInitSeqScan: (9a7960) plan_rows=0 postgres 25330 [000] 9585.193666: probe_postgres:ExecInitSeqScan: (9a7960) plan_rows=0 postgres 25335 [002] 9585.329603: probe_postgres:ExecInitSeqScan: (9a7960) plan_rows=0 postgres 25335 [002] 9585.362406: probe_postgres:ExecInitSeqScan: (9a7960) plan_rows=0