Re: Vacuum statistics - Alena Rybakina

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Alena Rybakina <[email protected]>
To: pgsql-hackers <[email protected]>
Cc: Alexander Korotkov <[email protected]>
Cc: Jim Nasby <[email protected]>
Cc: Ilia Evdokimov <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: Andrei Zubkov <[email protected]>
Cc: Masahiko Sawada <[email protected]>
Cc: Melanie Plageman <[email protected]>
Cc: jian he <[email protected]>
Cc: [email protected]
Cc: Sami Imseih <[email protected]>
Subject: Re: Vacuum statistics
Date: Mon, 17 Feb 2025 17:46:16 +0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<CAPpHfduwY8-fp34CuO9O57ouCs1K=Gn1rTnuG4AaWYhEo6nXyw@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CAPpHfds=woPcB9nPtMmu=g=U9q6-FHFh7fF_x=uhU3k2Oi03sA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CAPpHfduoJEuoixPTTg2tjhnXqrdobuMaQGxriqxJ9TjN1uxOuA@mail.gmail.com>
	<[email protected]>

On 04.02.2025 18:22, Alena Rybakina wrote:
> Hi! Thank you for your review!
>
> On 02.02.2025 23:43, Alexander Korotkov wrote:
>> On Mon, Jan 13, 2025 at 3:26 PM Alena Rybakina
>> <[email protected]> wrote:
>>> I noticed that the cfbot is bad, the reason seems to be related to 
>>> the lack of a parameter in 
>>> src/backend/utils/misc/postgresql.conf.sample. I added it, it should 
>>> help.
>> The patch doesn't apply cleanly.  Please rebase.
> I rebased them.
The patch needed a rebase again. There is nothing new since version 18, 
only a rebase.

-- 
Regards,
Alena Rybakina
Postgres Professional


Attachments:

  [text/x-patch] v19-0001-Implement-Self-Join-Elimination.patch (131.4K, 2-v19-0001-Implement-Self-Join-Elimination.patch)
  download | inline diff:
From fc069a3a6319b5bf40d2f0f1efceae1c9b7a68a8 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <[email protected]>
Date: Thu, 13 Feb 2025 00:56:03 +0200
Subject: [PATCH 1/4] Implement Self-Join Elimination
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The Self-Join Elimination (SJE) feature removes an inner join of a plain
table to itself in the query tree if it is proven that the join can be
replaced with a scan without impacting the query result.  Self-join and
inner relation get replaced with the outer in query, equivalence classes,
and planner info structures.  Also, the inner restrictlist moves to the
outer one with the removal of duplicated clauses.  Thus, this optimization
reduces the length of the range table list (this especially makes sense for
partitioned relations), reduces the number of restriction clauses and,
in turn, selectivity estimations, and potentially improves total planner
prediction for the query.

This feature is dedicated to avoiding redundancy, which can appear after
pull-up transformations or the creation of an EquivalenceClass-derived clause
like the below.

  SELECT * FROM t1 WHERE x IN (SELECT t3.x FROM t1 t3);
  SELECT * FROM t1 WHERE EXISTS (SELECT t3.x FROM t1 t3 WHERE t3.x = t1.x);
  SELECT * FROM t1,t2, t1 t3 WHERE t1.x = t2.x AND t2.x = t3.x;

In the future, we could also reduce redundancy caused by subquery pull-up
after unnecessary outer join removal in cases like the one below.

  SELECT * FROM t1 WHERE x IN
    (SELECT t3.x FROM t1 t3 LEFT JOIN t2 ON t2.x = t1.x);

Also, it can drastically help to join partitioned tables, removing entries
even before their expansion.

The SJE proof is based on innerrel_is_unique() machinery.

We can remove a self-join when for each outer row:

 1. At most, one inner row matches the join clause;
 2. Each matched inner row must be (physically) the same as the outer one;
 3. Inner and outer rows have the same row mark.

In this patch, we use the next approach to identify a self-join:

 1. Collect all merge-joinable join quals which look like a.x = b.x;
 2. Add to the list above the baseretrictinfo of the inner table;
 3. Check innerrel_is_unique() for the qual list.  If it returns false, skip
    this pair of joining tables;
 4. Check uniqueness, proved by the baserestrictinfo clauses. To prove the
    possibility of self-join elimination, the inner and outer clauses must
    match exactly.

The relation replacement procedure is not trivial and is partly combined
with the one used to remove useless left joins.  Tests covering this feature
were added to join.sql.  Some of the existing regression tests changed due
to self-join removal logic.

Discussion: https://postgr.es/m/flat/64486b0b-0404-e39e-322d-0801154901f3%40postgrespro.ru
Author: Andrey Lepikhov <[email protected]>
Author: Alexander Kuzmenkov <[email protected]>
Co-authored-by: Alexander Korotkov <[email protected]>
Co-authored-by: Alena Rybakina <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Simon Riggs <[email protected]>
Reviewed-by: Jonathan S. Katz <[email protected]>
Reviewed-by: David Rowley <[email protected]>
Reviewed-by: Thomas Munro <[email protected]>
Reviewed-by: Konstantin Knizhnik <[email protected]>
Reviewed-by: Heikki Linnakangas <[email protected]>
Reviewed-by: Hywel Carver <[email protected]>
Reviewed-by: Laurenz Albe <[email protected]>
Reviewed-by: Ronan Dunklau <[email protected]>
Reviewed-by: vignesh C <[email protected]>
Reviewed-by: Zhihong Yu <[email protected]>
Reviewed-by: Greg Stark <[email protected]>
Reviewed-by: Jaime Casanova <[email protected]>
Reviewed-by: Michał Kłeczek <[email protected]>
Reviewed-by: Alena Rybakina <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>
---
 doc/src/sgml/config.sgml                  |   16 +
 src/backend/optimizer/path/equivclass.c   |    3 +-
 src/backend/optimizer/path/indxpath.c     |   39 +
 src/backend/optimizer/plan/analyzejoins.c | 1240 +++++++++++++++++++--
 src/backend/optimizer/plan/planmain.c     |    5 +
 src/backend/optimizer/prep/prepunion.c    |    9 +-
 src/backend/rewrite/rewriteManip.c        |  126 ++-
 src/backend/utils/misc/guc_tables.c       |   10 +
 src/include/nodes/pathnodes.h             |   40 +-
 src/include/optimizer/optimizer.h         |    2 +
 src/include/optimizer/paths.h             |    3 +
 src/include/optimizer/planmain.h          |    6 +
 src/include/rewrite/rewriteManip.h        |    4 +
 src/test/regress/expected/equivclass.out  |   30 +
 src/test/regress/expected/join.out        | 1083 ++++++++++++++++++
 src/test/regress/expected/sysviews.out    |    3 +-
 src/test/regress/sql/equivclass.sql       |   16 +
 src/test/regress/sql/join.sql             |  494 ++++++++
 src/tools/pgindent/typedefs.list          |    2 +
 19 files changed, 2983 insertions(+), 148 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 60829b79d83..336630ce417 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5544,6 +5544,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable_self_join_elimination" xreflabel="enable_self_join_elimination">
+      <term><varname>enable_self_join_elimination</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_self_join_elimination</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's optimization which analyses
+        the query tree and replaces self joins with semantically equivalent
+        single scans.  Takes into consideration only plain tables.
+        The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-seqscan" xreflabel="enable_seqscan">
       <term><varname>enable_seqscan</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 7cafaca33c5..0f9ecf5ee8b 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -852,7 +852,8 @@ find_computable_ec_member(PlannerInfo *root,
 	exprvars = pull_var_clause((Node *) exprs,
 							   PVC_INCLUDE_AGGREGATES |
 							   PVC_INCLUDE_WINDOWFUNCS |
-							   PVC_INCLUDE_PLACEHOLDERS);
+							   PVC_INCLUDE_PLACEHOLDERS |
+							   PVC_INCLUDE_CONVERTROWTYPES);
 
 	foreach(lc, ec->ec_members)
 	{
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 6e2051efc65..a43ca16d683 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -4162,6 +4162,22 @@ bool
 relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 							  List *restrictlist,
 							  List *exprlist, List *oprlist)
+{
+	return relation_has_unique_index_ext(root, rel, restrictlist,
+										 exprlist, oprlist, NULL);
+}
+
+/*
+ * relation_has_unique_index_ext
+ *	  Same as relation_has_unique_index_for(), but supports extra_clauses
+ *	  parameter.  If extra_clauses isn't NULL, return baserestrictinfo clauses
+ *	  which were used to derive uniqueness.
+ */
+bool
+relation_has_unique_index_ext(PlannerInfo *root, RelOptInfo *rel,
+							  List *restrictlist,
+							  List *exprlist, List *oprlist,
+							  List **extra_clauses)
 {
 	ListCell   *ic;
 
@@ -4217,6 +4233,7 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 	{
 		IndexOptInfo *ind = (IndexOptInfo *) lfirst(ic);
 		int			c;
+		List	   *exprs = NIL;
 
 		/*
 		 * If the index is not unique, or not immediately enforced, or if it's
@@ -4268,6 +4285,24 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 				if (match_index_to_operand(rexpr, c, ind))
 				{
 					matched = true; /* column is unique */
+
+					if (bms_membership(rinfo->clause_relids) == BMS_SINGLETON)
+					{
+						MemoryContext oldMemCtx =
+							MemoryContextSwitchTo(root->planner_cxt);
+
+						/*
+						 * Add filter clause into a list allowing caller to
+						 * know if uniqueness have made not only by join
+						 * clauses.
+						 */
+						Assert(bms_is_empty(rinfo->left_relids) ||
+							   bms_is_empty(rinfo->right_relids));
+						if (extra_clauses)
+							exprs = lappend(exprs, rinfo);
+						MemoryContextSwitchTo(oldMemCtx);
+					}
+
 					break;
 				}
 			}
@@ -4310,7 +4345,11 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 
 		/* Matched all key columns of this index? */
 		if (c == ind->nkeycolumns)
+		{
+			if (extra_clauses)
+				*extra_clauses = exprs;
 			return true;
+		}
 	}
 
 	return false;
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index b33fc671775..3aa04d0d4e1 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,6 +22,7 @@
  */
 #include "postgres.h"
 
+#include "catalog/pg_class.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/optimizer.h"
@@ -30,27 +31,48 @@
 #include "optimizer/placeholder.h"
 #include "optimizer/planmain.h"
 #include "optimizer/restrictinfo.h"
+#include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
 
+/*
+ * Utility structure.  A sorting procedure is needed to simplify the search
+ * of SJE-candidate baserels referencing the same database relation.  Having
+ * collected all baserels from the query jointree, the planner sorts them
+ * according to the reloid value, groups them with the next pass and attempts
+ * to remove self-joins.
+ *
+ * Preliminary sorting prevents quadratic behavior that can be harmful in the
+ * case of numerous joins.
+ */
+typedef struct
+{
+	int			relid;
+	Oid			reloid;
+} SelfJoinCandidate;
+
+bool		enable_self_join_elimination;
+
 /* local functions */
 static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
-static void remove_rel_from_query(PlannerInfo *root, int relid,
-								  SpecialJoinInfo *sjinfo);
+static void remove_leftjoinrel_from_query(PlannerInfo *root, int relid,
+										  SpecialJoinInfo *sjinfo);
 static void remove_rel_from_restrictinfo(RestrictInfo *rinfo,
 										 int relid, int ojrelid);
 static void remove_rel_from_eclass(EquivalenceClass *ec,
-								   int relid, int ojrelid);
+								   int relid, int ojrelid, int subst);
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static bool rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel);
 static bool rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel,
-								List *clause_list);
+								List *clause_list, List **extra_clauses);
 static Oid	distinct_col_search(int colno, List *colnos, List *opids);
 static bool is_innerrel_unique_for(PlannerInfo *root,
 								   Relids joinrelids,
 								   Relids outerrelids,
 								   RelOptInfo *innerrel,
 								   JoinType jointype,
-								   List *restrictlist);
+								   List *restrictlist,
+								   List **extra_clauses);
+static int	self_join_candidates_cmp(const void *a, const void *b);
 
 
 /*
@@ -88,7 +110,7 @@ restart:
 		 */
 		innerrelid = bms_singleton_member(sjinfo->min_righthand);
 
-		remove_rel_from_query(root, innerrelid, sjinfo);
+		remove_leftjoinrel_from_query(root, innerrelid, sjinfo);
 
 		/* We verify that exactly one reference gets removed from joinlist */
 		nremoved = 0;
@@ -276,7 +298,7 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 	 * Now that we have the relevant equality join clauses, try to prove the
 	 * innerrel distinct.
 	 */
-	if (rel_is_distinct_for(root, innerrel, clause_list))
+	if (rel_is_distinct_for(root, innerrel, clause_list, NULL))
 		return true;
 
 	/*
@@ -288,36 +310,31 @@ join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo)
 
 
 /*
- * Remove the target relid and references to the target join from the
+ * Remove the target rel->relid and references to the target join from the
  * planner's data structures, having determined that there is no need
- * to include them in the query.
+ * to include them in the query. Optionally replace them with subst if subst
+ * is non-negative.
  *
- * We are not terribly thorough here.  We only bother to update parts of
- * the planner's data structures that will actually be consulted later.
+ * This function updates only parts needed for both left-join removal and
+ * self-join removal.
  */
 static void
-remove_rel_from_query(PlannerInfo *root, int relid, SpecialJoinInfo *sjinfo)
+remove_rel_from_query(PlannerInfo *root, RelOptInfo *rel,
+					  int subst, SpecialJoinInfo *sjinfo,
+					  Relids joinrelids)
 {
-	RelOptInfo *rel = find_base_rel(root, relid);
-	int			ojrelid = sjinfo->ojrelid;
-	Relids		joinrelids;
-	Relids		join_plus_commute;
-	List	   *joininfos;
+	int			relid = rel->relid;
+	int			ojrelid = (sjinfo != NULL) ? sjinfo->ojrelid : -1;
 	Index		rti;
 	ListCell   *l;
 
-	/* Compute the relid set for the join we are considering */
-	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
-	Assert(ojrelid != 0);
-	joinrelids = bms_add_member(joinrelids, ojrelid);
-
 	/*
 	 * Update all_baserels and related relid sets.
 	 */
-	root->all_baserels = bms_del_member(root->all_baserels, relid);
-	root->outer_join_rels = bms_del_member(root->outer_join_rels, ojrelid);
-	root->all_query_rels = bms_del_member(root->all_query_rels, relid);
-	root->all_query_rels = bms_del_member(root->all_query_rels, ojrelid);
+	root->all_baserels = adjust_relid_set(root->all_baserels, relid, subst);
+	root->outer_join_rels = adjust_relid_set(root->outer_join_rels, ojrelid, subst);
+	root->all_query_rels = adjust_relid_set(root->all_query_rels, relid, subst);
+	root->all_query_rels = adjust_relid_set(root->all_query_rels, ojrelid, subst);
 
 	/*
 	 * Likewise remove references from SpecialJoinInfo data structures.
@@ -341,20 +358,33 @@ remove_rel_from_query(PlannerInfo *root, int relid, SpecialJoinInfo *sjinfo)
 		sjinf->min_righthand = bms_copy(sjinf->min_righthand);
 		sjinf->syn_lefthand = bms_copy(sjinf->syn_lefthand);
 		sjinf->syn_righthand = bms_copy(sjinf->syn_righthand);
-		/* Now remove relid and ojrelid bits from the sets: */
-		sjinf->min_lefthand = bms_del_member(sjinf->min_lefthand, relid);
-		sjinf->min_righthand = bms_del_member(sjinf->min_righthand, relid);
-		sjinf->syn_lefthand = bms_del_member(sjinf->syn_lefthand, relid);
-		sjinf->syn_righthand = bms_del_member(sjinf->syn_righthand, relid);
-		sjinf->min_lefthand = bms_del_member(sjinf->min_lefthand, ojrelid);
-		sjinf->min_righthand = bms_del_member(sjinf->min_righthand, ojrelid);
-		sjinf->syn_lefthand = bms_del_member(sjinf->syn_lefthand, ojrelid);
-		sjinf->syn_righthand = bms_del_member(sjinf->syn_righthand, ojrelid);
-		/* relid cannot appear in these fields, but ojrelid can: */
-		sjinf->commute_above_l = bms_del_member(sjinf->commute_above_l, ojrelid);
-		sjinf->commute_above_r = bms_del_member(sjinf->commute_above_r, ojrelid);
-		sjinf->commute_below_l = bms_del_member(sjinf->commute_below_l, ojrelid);
-		sjinf->commute_below_r = bms_del_member(sjinf->commute_below_r, ojrelid);
+		/* Now remove relid from the sets: */
+		sjinf->min_lefthand = adjust_relid_set(sjinf->min_lefthand, relid, subst);
+		sjinf->min_righthand = adjust_relid_set(sjinf->min_righthand, relid, subst);
+		sjinf->syn_lefthand = adjust_relid_set(sjinf->syn_lefthand, relid, subst);
+		sjinf->syn_righthand = adjust_relid_set(sjinf->syn_righthand, relid, subst);
+
+		if (sjinfo != NULL)
+		{
+			Assert(subst <= 0 && ojrelid > 0);
+
+			/* Remove ojrelid bits from the sets: */
+			sjinf->min_lefthand = bms_del_member(sjinf->min_lefthand, ojrelid);
+			sjinf->min_righthand = bms_del_member(sjinf->min_righthand, ojrelid);
+			sjinf->syn_lefthand = bms_del_member(sjinf->syn_lefthand, ojrelid);
+			sjinf->syn_righthand = bms_del_member(sjinf->syn_righthand, ojrelid);
+			/* relid cannot appear in these fields, but ojrelid can: */
+			sjinf->commute_above_l = bms_del_member(sjinf->commute_above_l, ojrelid);
+			sjinf->commute_above_r = bms_del_member(sjinf->commute_above_r, ojrelid);
+			sjinf->commute_below_l = bms_del_member(sjinf->commute_below_l, ojrelid);
+			sjinf->commute_below_r = bms_del_member(sjinf->commute_below_r, ojrelid);
+		}
+		else
+		{
+			Assert(subst > 0 && ojrelid == -1);
+
+			ChangeVarNodes((Node *) sjinf->semi_rhs_exprs, relid, subst, 0);
+		}
 	}
 
 	/*
@@ -375,10 +405,10 @@ remove_rel_from_query(PlannerInfo *root, int relid, SpecialJoinInfo *sjinfo)
 	{
 		PlaceHolderInfo *phinfo = (PlaceHolderInfo *) lfirst(l);
 
-		Assert(!bms_is_member(relid, phinfo->ph_lateral));
+		Assert(sjinfo == NULL || !bms_is_member(relid, phinfo->ph_lateral));
 		if (bms_is_subset(phinfo->ph_needed, joinrelids) &&
 			bms_is_member(relid, phinfo->ph_eval_at) &&
-			!bms_is_member(ojrelid, phinfo->ph_eval_at))
+			(sjinfo == NULL || !bms_is_member(ojrelid, phinfo->ph_eval_at)))
 		{
 			root->placeholder_list = foreach_delete_current(root->placeholder_list,
 															l);
@@ -388,21 +418,112 @@ remove_rel_from_query(PlannerInfo *root, int relid, SpecialJoinInfo *sjinfo)
 		{
 			PlaceHolderVar *phv = phinfo->ph_var;
 
-			phinfo->ph_eval_at = bms_del_member(phinfo->ph_eval_at, relid);
-			phinfo->ph_eval_at = bms_del_member(phinfo->ph_eval_at, ojrelid);
+			phinfo->ph_eval_at = adjust_relid_set(phinfo->ph_eval_at, relid, subst);
+			phinfo->ph_eval_at = adjust_relid_set(phinfo->ph_eval_at, ojrelid, subst);
 			Assert(!bms_is_empty(phinfo->ph_eval_at));	/* checked previously */
 			/* Reduce ph_needed to contain only "relation 0"; see below */
 			if (bms_is_member(0, phinfo->ph_needed))
 				phinfo->ph_needed = bms_make_singleton(0);
 			else
 				phinfo->ph_needed = NULL;
-			phv->phrels = bms_del_member(phv->phrels, relid);
-			phv->phrels = bms_del_member(phv->phrels, ojrelid);
+
+			phinfo->ph_lateral = adjust_relid_set(phinfo->ph_lateral, relid, subst);
+
+			/*
+			 * ph_lateral might contain rels mentioned in ph_eval_at after the
+			 * replacement, remove them.
+			 */
+			phinfo->ph_lateral = bms_difference(phinfo->ph_lateral, phinfo->ph_eval_at);
+			/* ph_lateral might or might not be empty */
+
+			phv->phrels = adjust_relid_set(phv->phrels, relid, subst);
+			phv->phrels = adjust_relid_set(phv->phrels, ojrelid, subst);
 			Assert(!bms_is_empty(phv->phrels));
+
+			ChangeVarNodes((Node *) phv->phexpr, relid, subst, 0);
+
 			Assert(phv->phnullingrels == NULL); /* no need to adjust */
 		}
 	}
 
+	/*
+	 * Likewise remove references from EquivalenceClasses.
+	 */
+	foreach(l, root->eq_classes)
+	{
+		EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
+
+		if (bms_is_member(relid, ec->ec_relids) ||
+			(sjinfo == NULL || bms_is_member(ojrelid, ec->ec_relids)))
+			remove_rel_from_eclass(ec, relid, ojrelid, subst);
+	}
+
+	/*
+	 * Finally, we must recompute per-Var attr_needed and per-PlaceHolderVar
+	 * ph_needed relid sets.  These have to be known accurately, else we may
+	 * fail to remove other now-removable outer joins.  And our removal of the
+	 * join clause(s) for this outer join may mean that Vars that were
+	 * formerly needed no longer are.  So we have to do this honestly by
+	 * repeating the construction of those relid sets.  We can cheat to one
+	 * small extent: we can avoid re-examining the targetlist and HAVING qual
+	 * by preserving "relation 0" bits from the existing relid sets.  This is
+	 * safe because we'd never remove such references.
+	 *
+	 * So, start by removing all other bits from attr_needed sets and
+	 * lateral_vars lists.  (We already did this above for ph_needed.)
+	 */
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *otherrel = root->simple_rel_array[rti];
+		int			attroff;
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (otherrel == NULL)
+			continue;
+
+		Assert(otherrel->relid == rti); /* sanity check on array */
+
+		for (attroff = otherrel->max_attr - otherrel->min_attr;
+			 attroff >= 0;
+			 attroff--)
+		{
+			if (bms_is_member(0, otherrel->attr_needed[attroff]))
+				otherrel->attr_needed[attroff] = bms_make_singleton(0);
+			else
+				otherrel->attr_needed[attroff] = NULL;
+		}
+
+		if (subst > 0)
+			ChangeVarNodes((Node *) otherrel->lateral_vars, relid, subst, 0);
+	}
+}
+
+/*
+ * Remove the target relid and references to the target join from the
+ * planner's data structures, having determined that there is no need
+ * to include them in the query.
+ *
+ * We are not terribly thorough here.  We only bother to update parts of
+ * the planner's data structures that will actually be consulted later.
+ */
+static void
+remove_leftjoinrel_from_query(PlannerInfo *root, int relid,
+							  SpecialJoinInfo *sjinfo)
+{
+	RelOptInfo *rel = find_base_rel(root, relid);
+	int			ojrelid = sjinfo->ojrelid;
+	Relids		joinrelids;
+	Relids		join_plus_commute;
+	List	   *joininfos;
+	ListCell   *l;
+
+	/* Compute the relid set for the join we are considering */
+	joinrelids = bms_union(sjinfo->min_lefthand, sjinfo->min_righthand);
+	Assert(ojrelid != 0);
+	joinrelids = bms_add_member(joinrelids, ojrelid);
+
+	remove_rel_from_query(root, rel, -1, sjinfo, joinrelids);
+
 	/*
 	 * Remove any joinquals referencing the rel from the joininfo lists.
 	 *
@@ -465,18 +586,6 @@ remove_rel_from_query(PlannerInfo *root, int relid, SpecialJoinInfo *sjinfo)
 		}
 	}
 
-	/*
-	 * Likewise remove references from EquivalenceClasses.
-	 */
-	foreach(l, root->eq_classes)
-	{
-		EquivalenceClass *ec = (EquivalenceClass *) lfirst(l);
-
-		if (bms_is_member(relid, ec->ec_relids) ||
-			bms_is_member(ojrelid, ec->ec_relids))
-			remove_rel_from_eclass(ec, relid, ojrelid);
-	}
-
 	/*
 	 * There may be references to the rel in root->fkey_list, but if so,
 	 * match_foreign_keys_to_quals() will get rid of them.
@@ -492,42 +601,6 @@ remove_rel_from_query(PlannerInfo *root, int relid, SpecialJoinInfo *sjinfo)
 	/* And nuke the RelOptInfo, just in case there's another access path */
 	pfree(rel);
 
-	/*
-	 * Finally, we must recompute per-Var attr_needed and per-PlaceHolderVar
-	 * ph_needed relid sets.  These have to be known accurately, else we may
-	 * fail to remove other now-removable outer joins.  And our removal of the
-	 * join clause(s) for this outer join may mean that Vars that were
-	 * formerly needed no longer are.  So we have to do this honestly by
-	 * repeating the construction of those relid sets.  We can cheat to one
-	 * small extent: we can avoid re-examining the targetlist and HAVING qual
-	 * by preserving "relation 0" bits from the existing relid sets.  This is
-	 * safe because we'd never remove such references.
-	 *
-	 * So, start by removing all other bits from attr_needed sets.  (We
-	 * already did this above for ph_needed.)
-	 */
-	for (rti = 1; rti < root->simple_rel_array_size; rti++)
-	{
-		RelOptInfo *otherrel = root->simple_rel_array[rti];
-		int			attroff;
-
-		/* there may be empty slots corresponding to non-baserel RTEs */
-		if (otherrel == NULL)
-			continue;
-
-		Assert(otherrel->relid == rti); /* sanity check on array */
-
-		for (attroff = otherrel->max_attr - otherrel->min_attr;
-			 attroff >= 0;
-			 attroff--)
-		{
-			if (bms_is_member(0, otherrel->attr_needed[attroff]))
-				otherrel->attr_needed[attroff] = bms_make_singleton(0);
-			else
-				otherrel->attr_needed[attroff] = NULL;
-		}
-	}
-
 	/*
 	 * Now repeat construction of attr_needed bits coming from all other
 	 * sources.
@@ -607,13 +680,13 @@ remove_rel_from_restrictinfo(RestrictInfo *rinfo, int relid, int ojrelid)
  * level(s).
  */
 static void
-remove_rel_from_eclass(EquivalenceClass *ec, int relid, int ojrelid)
+remove_rel_from_eclass(EquivalenceClass *ec, int relid, int ojrelid, int subst)
 {
 	ListCell   *lc;
 
 	/* Fix up the EC's overall relids */
-	ec->ec_relids = bms_del_member(ec->ec_relids, relid);
-	ec->ec_relids = bms_del_member(ec->ec_relids, ojrelid);
+	ec->ec_relids = adjust_relid_set(ec->ec_relids, relid, subst);
+	ec->ec_relids = adjust_relid_set(ec->ec_relids, ojrelid, subst);
 
 	/*
 	 * Fix up the member expressions.  Any non-const member that ends with
@@ -625,11 +698,11 @@ remove_rel_from_eclass(EquivalenceClass *ec, int relid, int ojrelid)
 		EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc);
 
 		if (bms_is_member(relid, cur_em->em_relids) ||
-			bms_is_member(ojrelid, cur_em->em_relids))
+			(ojrelid != -1 && bms_is_member(ojrelid, cur_em->em_relids)))
 		{
 			Assert(!cur_em->em_is_const);
-			cur_em->em_relids = bms_del_member(cur_em->em_relids, relid);
-			cur_em->em_relids = bms_del_member(cur_em->em_relids, ojrelid);
+			cur_em->em_relids = adjust_relid_set(cur_em->em_relids, relid, subst);
+			cur_em->em_relids = adjust_relid_set(cur_em->em_relids, ojrelid, subst);
 			if (bms_is_empty(cur_em->em_relids))
 				ec->ec_members = foreach_delete_current(ec->ec_members, lc);
 		}
@@ -640,7 +713,10 @@ remove_rel_from_eclass(EquivalenceClass *ec, int relid, int ojrelid)
 	{
 		RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
 
-		remove_rel_from_restrictinfo(rinfo, relid, ojrelid);
+		if (ojrelid == -1)
+			ChangeVarNodes((Node *) rinfo, relid, subst, 0);
+		else
+			remove_rel_from_restrictinfo(rinfo, relid, ojrelid);
 	}
 
 	/*
@@ -844,9 +920,15 @@ rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel)
  * Note that the passed-in clause_list may be destructively modified!  This
  * is OK for current uses, because the clause_list is built by the caller for
  * the sole purpose of passing to this function.
+ *
+ * (*extra_clauses) to be set to the right sides of baserestrictinfo clauses,
+ * looking like "x = const" if distinctness is derived from such clauses, not
+ * joininfo clauses.  Pass NULL to the extra_clauses if this value is not
+ * needed.
  */
 static bool
-rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel, List *clause_list)
+rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel, List *clause_list,
+					List **extra_clauses)
 {
 	/*
 	 * We could skip a couple of tests here if we assume all callers checked
@@ -859,10 +941,11 @@ rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel, List *clause_list)
 	{
 		/*
 		 * Examine the indexes to see if we have a matching unique index.
-		 * relation_has_unique_index_for automatically adds any usable
+		 * relation_has_unique_index_ext automatically adds any usable
 		 * restriction clauses for the rel, so we needn't do that here.
 		 */
-		if (relation_has_unique_index_for(root, rel, clause_list, NIL, NIL))
+		if (relation_has_unique_index_ext(root, rel, clause_list, NIL, NIL,
+										  extra_clauses))
 			return true;
 	}
 	else if (rel->rtekind == RTE_SUBQUERY)
@@ -1176,9 +1259,35 @@ innerrel_is_unique(PlannerInfo *root,
 				   JoinType jointype,
 				   List *restrictlist,
 				   bool force_cache)
+{
+	return innerrel_is_unique_ext(root, joinrelids, outerrelids, innerrel,
+								  jointype, restrictlist, force_cache, NULL);
+}
+
+/*
+ * innerrel_is_unique_ext
+ *	  Do the same as innerrel_is_unique(), but also set to (*extra_clauses)
+ *	  additional clauses from a baserestrictinfo list used to prove the
+ *	  uniqueness.
+ *
+ * A non-NULL extra_clauses indicates that we're checking for self-join and
+ * correspondingly dealing with filtered clauses.
+ */
+bool
+innerrel_is_unique_ext(PlannerInfo *root,
+					   Relids joinrelids,
+					   Relids outerrelids,
+					   RelOptInfo *innerrel,
+					   JoinType jointype,
+					   List *restrictlist,
+					   bool force_cache,
+					   List **extra_clauses)
 {
 	MemoryContext old_context;
 	ListCell   *lc;
+	UniqueRelInfo *uniqueRelInfo;
+	List	   *outer_exprs = NIL;
+	bool		self_join = (extra_clauses != NULL);
 
 	/* Certainly can't prove uniqueness when there are no joinclauses */
 	if (restrictlist == NIL)
@@ -1193,17 +1302,28 @@ innerrel_is_unique(PlannerInfo *root,
 
 	/*
 	 * Query the cache to see if we've managed to prove that innerrel is
-	 * unique for any subset of this outerrel.  We don't need an exact match,
-	 * as extra outerrels can't make the innerrel any less unique (or more
-	 * formally, the restrictlist for a join to a superset outerrel must be a
-	 * superset of the conditions we successfully used before).
+	 * unique for any subset of this outerrel.  For non-self-join search, we
+	 * don't need an exact match, as extra outerrels can't make the innerrel
+	 * any less unique (or more formally, the restrictlist for a join to a
+	 * superset outerrel must be a superset of the conditions we successfully
+	 * used before). For self-join search, we require an exact match of
+	 * outerrels because we need extra clauses to be valid for our case. Also,
+	 * for self-join checking we've filtered the clauses list.  Thus, we can
+	 * match only the result cached for a self-join search for another
+	 * self-join check.
 	 */
 	foreach(lc, innerrel->unique_for_rels)
 	{
-		Relids		unique_for_rels = (Relids) lfirst(lc);
+		uniqueRelInfo = (UniqueRelInfo *) lfirst(lc);
 
-		if (bms_is_subset(unique_for_rels, outerrelids))
+		if ((!self_join && bms_is_subset(uniqueRelInfo->outerrelids, outerrelids)) ||
+			(self_join && bms_equal(uniqueRelInfo->outerrelids, outerrelids) &&
+			 uniqueRelInfo->self_join))
+		{
+			if (extra_clauses)
+				*extra_clauses = uniqueRelInfo->extra_clauses;
 			return true;		/* Success! */
+		}
 	}
 
 	/*
@@ -1220,7 +1340,8 @@ innerrel_is_unique(PlannerInfo *root,
 
 	/* No cached information, so try to make the proof. */
 	if (is_innerrel_unique_for(root, joinrelids, outerrelids, innerrel,
-							   jointype, restrictlist))
+							   jointype, restrictlist,
+							   self_join ? &outer_exprs : NULL))
 	{
 		/*
 		 * Cache the positive result for future probes, being sure to keep it
@@ -1233,10 +1354,16 @@ innerrel_is_unique(PlannerInfo *root,
 		 * supersets of them anyway.
 		 */
 		old_context = MemoryContextSwitchTo(root->planner_cxt);
+		uniqueRelInfo = makeNode(UniqueRelInfo);
+		uniqueRelInfo->outerrelids = bms_copy(outerrelids);
+		uniqueRelInfo->self_join = self_join;
+		uniqueRelInfo->extra_clauses = outer_exprs;
 		innerrel->unique_for_rels = lappend(innerrel->unique_for_rels,
-											bms_copy(outerrelids));
+											uniqueRelInfo);
 		MemoryContextSwitchTo(old_context);
 
+		if (extra_clauses)
+			*extra_clauses = outer_exprs;
 		return true;			/* Success! */
 	}
 	else
@@ -1282,7 +1409,8 @@ is_innerrel_unique_for(PlannerInfo *root,
 					   Relids outerrelids,
 					   RelOptInfo *innerrel,
 					   JoinType jointype,
-					   List *restrictlist)
+					   List *restrictlist,
+					   List **extra_clauses)
 {
 	List	   *clause_list = NIL;
 	ListCell   *lc;
@@ -1312,17 +1440,895 @@ is_innerrel_unique_for(PlannerInfo *root,
 			continue;			/* not mergejoinable */
 
 		/*
-		 * Check if clause has the form "outer op inner" or "inner op outer",
-		 * and if so mark which side is inner.
+		 * Check if the clause has the form "outer op inner" or "inner op
+		 * outer", and if so mark which side is inner.
 		 */
 		if (!clause_sides_match_join(restrictinfo, outerrelids,
 									 innerrel->relids))
 			continue;			/* no good for these input relations */
 
-		/* OK, add to list */
+		/* OK, add to the list */
 		clause_list = lappend(clause_list, restrictinfo);
 	}
 
 	/* Let rel_is_distinct_for() do the hard work */
-	return rel_is_distinct_for(root, innerrel, clause_list);
+	return rel_is_distinct_for(root, innerrel, clause_list, extra_clauses);
+}
+
+/*
+ * Update EC members to point to the remaining relation instead of the removed
+ * one, removing duplicates.
+ *
+ * Restriction clauses for base relations are already distributed to
+ * the respective baserestrictinfo lists (see
+ * generate_implied_equalities_for_column). The above code has already processed
+ * this list and updated these clauses to reference the remaining
+ * relation, so that we can skip them here based on their relids.
+ *
+ * Likewise, we have already processed the join clauses that join the
+ * removed relation to the remaining one.
+ *
+ * Finally, there might be join clauses tying the removed relation to
+ * some third relation.  We can't just delete the source clauses and
+ * regenerate them from the EC because the corresponding equality
+ * operators might be missing (see the handling of ec_broken).
+ * Therefore, we will update the references in the source clauses.
+ *
+ * Derived clauses can be generated again, so it is simpler just to
+ * delete them.
+ */
+static void
+update_eclasses(EquivalenceClass *ec, int from, int to)
+{
+	List	   *new_members = NIL;
+	List	   *new_sources = NIL;
+
+	foreach_node(EquivalenceMember, em, ec->ec_members)
+	{
+		bool		is_redundant = false;
+
+		if (!bms_is_member(from, em->em_relids))
+		{
+			new_members = lappend(new_members, em);
+			continue;
+		}
+
+		em->em_relids = adjust_relid_set(em->em_relids, from, to);
+		em->em_jdomain->jd_relids = adjust_relid_set(em->em_jdomain->jd_relids, from, to);
+
+		/* We only process inner joins */
+		ChangeVarNodes((Node *) em->em_expr, from, to, 0);
+
+		foreach_node(EquivalenceMember, other, new_members)
+		{
+			if (!equal(em->em_relids, other->em_relids))
+				continue;
+
+			if (equal(em->em_expr, other->em_expr))
+			{
+				is_redundant = true;
+				break;
+			}
+		}
+
+		if (!is_redundant)
+			new_members = lappend(new_members, em);
+	}
+
+	list_free(ec->ec_members);
+	ec->ec_members = new_members;
+
+	list_free(ec->ec_derives);
+	ec->ec_derives = NULL;
+
+	/* Update EC source expressions */
+	foreach_node(RestrictInfo, rinfo, ec->ec_sources)
+	{
+		bool		is_redundant = false;
+
+		if (!bms_is_member(from, rinfo->required_relids))
+		{
+			new_sources = lappend(new_sources, rinfo);
+			continue;
+		}
+
+		ChangeVarNodes((Node *) rinfo, from, to, 0);
+
+		/*
+		 * After switching the clause to the remaining relation, check it for
+		 * redundancy with existing ones. We don't have to check for
+		 * redundancy with derived clauses, because we've just deleted them.
+		 */
+		foreach_node(RestrictInfo, other, new_sources)
+		{
+			if (!equal(rinfo->clause_relids, other->clause_relids))
+				continue;
+
+			if (equal(rinfo->clause, other->clause))
+			{
+				is_redundant = true;
+				break;
+			}
+		}
+
+		if (!is_redundant)
+			new_sources = lappend(new_sources, rinfo);
+	}
+
+	list_free(ec->ec_sources);
+	ec->ec_sources = new_sources;
+	ec->ec_relids = adjust_relid_set(ec->ec_relids, from, to);
+}
+
+/*
+ * "Logically" compares two RestrictInfo's ignoring the 'rinfo_serial' field,
+ * which makes almost every RestrictInfo unique.  This type of comparison is
+ * useful when removing duplicates while moving RestrictInfo's from removed
+ * relation to remaining relation during self-join elimination.
+ *
+ * XXX: In the future, we might remove the 'rinfo_serial' field completely and
+ * get rid of this function.
+ */
+static bool
+restrict_infos_logically_equal(RestrictInfo *a, RestrictInfo *b)
+{
+	int			saved_rinfo_serial = a->rinfo_serial;
+	bool		result;
+
+	a->rinfo_serial = b->rinfo_serial;
+	result = equal(a, b);
+	a->rinfo_serial = saved_rinfo_serial;
+
+	return result;
+}
+
+/*
+ * This function adds all non-redundant clauses to the keeping relation
+ * during self-join elimination.  That is a contradictory operation. On the
+ * one hand, we reduce the length of the `restrict` lists, which can
+ * impact planning or executing time.  Additionally, we improve the
+ * accuracy of cardinality estimation.  On the other hand, it is one more
+ * place that can make planning time much longer in specific cases.  It
+ * would have been better to avoid calling the equal() function here, but
+ * it's the only way to detect duplicated inequality expressions.
+ *
+ * (*keep_rinfo_list) is given by pointer because it might be altered by
+ * distribute_restrictinfo_to_rels().
+ */
+static void
+add_non_redundant_clauses(PlannerInfo *root,
+						  List *rinfo_candidates,
+						  List **keep_rinfo_list,
+						  Index removed_relid)
+{
+	foreach_node(RestrictInfo, rinfo, rinfo_candidates)
+	{
+		bool		is_redundant = false;
+
+		Assert(!bms_is_member(removed_relid, rinfo->required_relids));
+
+		foreach_node(RestrictInfo, src, (*keep_rinfo_list))
+		{
+			if (!bms_equal(src->clause_relids, rinfo->clause_relids))
+				/* Can't compare trivially different clauses */
+				continue;
+
+			if (src == rinfo ||
+				(rinfo->parent_ec != NULL &&
+				 src->parent_ec == rinfo->parent_ec) ||
+				restrict_infos_logically_equal(rinfo, src))
+			{
+				is_redundant = true;
+				break;
+			}
+		}
+		if (!is_redundant)
+			distribute_restrictinfo_to_rels(root, rinfo);
+	}
+}
+
+/*
+ * Remove a relation after we have proven that it participates only in an
+ * unneeded unique self-join.
+ *
+ * Replace any links in planner info structures.
+ *
+ * Transfer join and restriction clauses from the removed relation to the
+ * remaining one. We change the Vars of the clause to point to the
+ * remaining relation instead of the removed one. The clauses that require
+ * a subset of joinrelids become restriction clauses of the remaining
+ * relation, and others remain join clauses. We append them to
+ * baserestrictinfo and joininfo, respectively, trying not to introduce
+ * duplicates.
+ *
+ * We also have to process the 'joinclauses' list here, because it
+ * contains EC-derived join clauses which must become filter clauses. It
+ * is not enough to just correct the ECs because the EC-derived
+ * restrictions are generated before join removal (see
+ * generate_base_implied_equalities).
+ *
+ * NOTE: Remember to keep the code in sync with PlannerInfo to be sure all
+ * cached relids and relid bitmapsets can be correctly cleaned during the
+ * self-join elimination procedure.
+ */
+static void
+remove_self_join_rel(PlannerInfo *root, PlanRowMark *kmark, PlanRowMark *rmark,
+					 RelOptInfo *toKeep, RelOptInfo *toRemove,
+					 List *restrictlist)
+{
+	List	   *joininfos;
+	ListCell   *lc;
+	int			i;
+	List	   *jinfo_candidates = NIL;
+	List	   *binfo_candidates = NIL;
+
+	Assert(toKeep->relid > 0);
+	Assert(toRemove->relid > 0);
+
+	/*
+	 * Replace the index of the removing table with the keeping one. The
+	 * technique of removing/distributing restrictinfo is used here to attach
+	 * just appeared (for keeping relation) join clauses and avoid adding
+	 * duplicates of those that already exist in the joininfo list.
+	 */
+	joininfos = list_copy(toRemove->joininfo);
+	foreach_node(RestrictInfo, rinfo, joininfos)
+	{
+		remove_join_clause_from_rels(root, rinfo, rinfo->required_relids);
+		ChangeVarNodes((Node *) rinfo, toRemove->relid, toKeep->relid, 0);
+
+		if (bms_membership(rinfo->required_relids) == BMS_MULTIPLE)
+			jinfo_candidates = lappend(jinfo_candidates, rinfo);
+		else
+			binfo_candidates = lappend(binfo_candidates, rinfo);
+	}
+
+	/*
+	 * Concatenate restrictlist to the list of base restrictions of the
+	 * removing table just to simplify the replacement procedure: all of them
+	 * weren't connected to any keeping relations and need to be added to some
+	 * rels.
+	 */
+	toRemove->baserestrictinfo = list_concat(toRemove->baserestrictinfo,
+											 restrictlist);
+	foreach_node(RestrictInfo, rinfo, toRemove->baserestrictinfo)
+	{
+		ChangeVarNodes((Node *) rinfo, toRemove->relid, toKeep->relid, 0);
+
+		if (bms_membership(rinfo->required_relids) == BMS_MULTIPLE)
+			jinfo_candidates = lappend(jinfo_candidates, rinfo);
+		else
+			binfo_candidates = lappend(binfo_candidates, rinfo);
+	}
+
+	/*
+	 * Now, add all non-redundant clauses to the keeping relation.
+	 */
+	add_non_redundant_clauses(root, binfo_candidates,
+							  &toKeep->baserestrictinfo, toRemove->relid);
+	add_non_redundant_clauses(root, jinfo_candidates,
+							  &toKeep->joininfo, toRemove->relid);
+
+	list_free(binfo_candidates);
+	list_free(jinfo_candidates);
+
+	/*
+	 * Arrange equivalence classes, mentioned removing a table, with the
+	 * keeping one: varno of removing table should be replaced in members and
+	 * sources lists. Also, remove duplicated elements if this replacement
+	 * procedure created them.
+	 */
+	i = -1;
+	while ((i = bms_next_member(toRemove->eclass_indexes, i)) >= 0)
+	{
+		EquivalenceClass *ec = (EquivalenceClass *) list_nth(root->eq_classes, i);
+
+		update_eclasses(ec, toRemove->relid, toKeep->relid);
+		toKeep->eclass_indexes = bms_add_member(toKeep->eclass_indexes, i);
+	}
+
+	/*
+	 * Transfer the targetlist and attr_needed flags.
+	 */
+
+	foreach(lc, toRemove->reltarget->exprs)
+	{
+		Node	   *node = lfirst(lc);
+
+		ChangeVarNodes(node, toRemove->relid, toKeep->relid, 0);
+		if (!list_member(toKeep->reltarget->exprs, node))
+			toKeep->reltarget->exprs = lappend(toKeep->reltarget->exprs, node);
+	}
+
+	for (i = toKeep->min_attr; i <= toKeep->max_attr; i++)
+	{
+		int			attno = i - toKeep->min_attr;
+
+		toRemove->attr_needed[attno] = adjust_relid_set(toRemove->attr_needed[attno],
+														toRemove->relid, toKeep->relid);
+		toKeep->attr_needed[attno] = bms_add_members(toKeep->attr_needed[attno],
+													 toRemove->attr_needed[attno]);
+	}
+
+	/*
+	 * If the removed relation has a row mark, transfer it to the remaining
+	 * one.
+	 *
+	 * If both rels have row marks, just keep the one corresponding to the
+	 * remaining relation because we verified earlier that they have the same
+	 * strength.
+	 */
+	if (rmark)
+	{
+		if (kmark)
+		{
+			Assert(kmark->markType == rmark->markType);
+
+			root->rowMarks = list_delete_ptr(root->rowMarks, rmark);
+		}
+		else
+		{
+			/* Shouldn't have inheritance children here. */
+			Assert(rmark->rti == rmark->prti);
+
+			rmark->rti = rmark->prti = toKeep->relid;
+		}
+	}
+
+	/*
+	 * Replace varno in all the query structures, except nodes RangeTblRef
+	 * otherwise later remove_rel_from_joinlist will yield errors.
+	 */
+	ChangeVarNodesExtended((Node *) root->parse, toRemove->relid, toKeep->relid, 0, false);
+
+	/* Replace links in the planner info */
+	remove_rel_from_query(root, toRemove, toKeep->relid, NULL, NULL);
+
+	/* At last, replace varno in root targetlist and HAVING clause */
+	ChangeVarNodes((Node *) root->processed_tlist, toRemove->relid, toKeep->relid, 0);
+	ChangeVarNodes((Node *) root->processed_groupClause, toRemove->relid, toKeep->relid, 0);
+
+	adjust_relid_set(root->all_result_relids, toRemove->relid, toKeep->relid);
+	adjust_relid_set(root->leaf_result_relids, toRemove->relid, toKeep->relid);
+
+	/*
+	 * There may be references to the rel in root->fkey_list, but if so,
+	 * match_foreign_keys_to_quals() will get rid of them.
+	 */
+
+	/*
+	 * Finally, remove the rel from the baserel array to prevent it from being
+	 * referenced again.  (We can't do this earlier because
+	 * remove_join_clause_from_rels will touch it.)
+	 */
+	root->simple_rel_array[toRemove->relid] = NULL;
+
+	/* And nuke the RelOptInfo, just in case there's another access path. */
+	pfree(toRemove);
+
+	/*
+	 * Now repeat construction of attr_needed bits coming from all other
+	 * sources.
+	 */
+	rebuild_placeholder_attr_needed(root);
+	rebuild_joinclause_attr_needed(root);
+	rebuild_eclass_attr_needed(root);
+	rebuild_lateral_attr_needed(root);
+}
+
+/*
+ * split_selfjoin_quals
+ *		Processes 'joinquals' by building two lists: one containing the quals
+ *		where the columns/exprs are on either side of the join match and
+ *		another one containing the remaining quals.
+ *
+ * 'joinquals' must only contain quals for a RTE_RELATION being joined to
+ * itself.
+ */
+static void
+split_selfjoin_quals(PlannerInfo *root, List *joinquals, List **selfjoinquals,
+					 List **otherjoinquals, int from, int to)
+{
+	List	   *sjoinquals = NIL;
+	List	   *ojoinquals = NIL;
+
+	foreach_node(RestrictInfo, rinfo, joinquals)
+	{
+		OpExpr	   *expr;
+		Node	   *leftexpr;
+		Node	   *rightexpr;
+
+		/* In general, clause looks like F(arg1) = G(arg2) */
+		if (!rinfo->mergeopfamilies ||
+			bms_num_members(rinfo->clause_relids) != 2 ||
+			bms_membership(rinfo->left_relids) != BMS_SINGLETON ||
+			bms_membership(rinfo->right_relids) != BMS_SINGLETON)
+		{
+			ojoinquals = lappend(ojoinquals, rinfo);
+			continue;
+		}
+
+		expr = (OpExpr *) rinfo->clause;
+
+		if (!IsA(expr, OpExpr) || list_length(expr->args) != 2)
+		{
+			ojoinquals = lappend(ojoinquals, rinfo);
+			continue;
+		}
+
+		leftexpr = get_leftop(rinfo->clause);
+		rightexpr = copyObject(get_rightop(rinfo->clause));
+
+		if (leftexpr && IsA(leftexpr, RelabelType))
+			leftexpr = (Node *) ((RelabelType *) leftexpr)->arg;
+		if (rightexpr && IsA(rightexpr, RelabelType))
+			rightexpr = (Node *) ((RelabelType *) rightexpr)->arg;
+
+		/*
+		 * Quite an expensive operation, narrowing the use case. For example,
+		 * when we have cast of the same var to different (but compatible)
+		 * types.
+		 */
+		ChangeVarNodes(rightexpr, bms_singleton_member(rinfo->right_relids),
+					   bms_singleton_member(rinfo->left_relids), 0);
+
+		if (equal(leftexpr, rightexpr))
+			sjoinquals = lappend(sjoinquals, rinfo);
+		else
+			ojoinquals = lappend(ojoinquals, rinfo);
+	}
+
+	*selfjoinquals = sjoinquals;
+	*otherjoinquals = ojoinquals;
+}
+
+/*
+ * Check for a case when uniqueness is at least partly derived from a
+ * baserestrictinfo clause. In this case, we have a chance to return only
+ * one row (if such clauses on both sides of SJ are equal) or nothing (if they
+ * are different).
+ */
+static bool
+match_unique_clauses(PlannerInfo *root, RelOptInfo *outer, List *uclauses,
+					 Index relid)
+{
+	foreach_node(RestrictInfo, rinfo, uclauses)
+	{
+		Expr	   *clause;
+		Node	   *iclause;
+		Node	   *c1;
+		bool		matched = false;
+
+		Assert(outer->relid > 0 && relid > 0);
+
+		/* Only filters like f(R.x1,...,R.xN) == expr we should consider. */
+		Assert(bms_is_empty(rinfo->left_relids) ^
+			   bms_is_empty(rinfo->right_relids));
+
+		clause = (Expr *) copyObject(rinfo->clause);
+		ChangeVarNodes((Node *) clause, relid, outer->relid, 0);
+
+		iclause = bms_is_empty(rinfo->left_relids) ? get_rightop(clause) :
+			get_leftop(clause);
+		c1 = bms_is_empty(rinfo->left_relids) ? get_leftop(clause) :
+			get_rightop(clause);
+
+		/*
+		 * Compare these left and right sides with the corresponding sides of
+		 * the outer's filters. If no one is detected - return immediately.
+		 */
+		foreach_node(RestrictInfo, orinfo, outer->baserestrictinfo)
+		{
+			Node	   *oclause;
+			Node	   *c2;
+
+			if (orinfo->mergeopfamilies == NIL)
+				/* Don't consider clauses that aren't similar to 'F(X)=G(Y)' */
+				continue;
+
+			Assert(is_opclause(orinfo->clause));
+
+			oclause = bms_is_empty(orinfo->left_relids) ?
+				get_rightop(orinfo->clause) : get_leftop(orinfo->clause);
+			c2 = (bms_is_empty(orinfo->left_relids) ?
+				  get_leftop(orinfo->clause) : get_rightop(orinfo->clause));
+
+			if (equal(iclause, oclause) && equal(c1, c2))
+			{
+				matched = true;
+				break;
+			}
+		}
+
+		if (!matched)
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * Find and remove unique self-joins in a group of base relations that have
+ * the same Oid.
+ *
+ * Returns a set of relids that were removed.
+ */
+static Relids
+remove_self_joins_one_group(PlannerInfo *root, Relids relids)
+{
+	Relids		result = NULL;
+	int			k;				/* Index of kept relation */
+	int			r = -1;			/* Index of removed relation */
+
+	while ((r = bms_next_member(relids, r)) > 0)
+	{
+		RelOptInfo *inner = root->simple_rel_array[r];
+
+		k = r;
+
+		while ((k = bms_next_member(relids, k)) > 0)
+		{
+			Relids		joinrelids = NULL;
+			RelOptInfo *outer = root->simple_rel_array[k];
+			List	   *restrictlist;
+			List	   *selfjoinquals;
+			List	   *otherjoinquals;
+			ListCell   *lc;
+			bool		jinfo_check = true;
+			PlanRowMark *omark = NULL;
+			PlanRowMark *imark = NULL;
+			List	   *uclauses = NIL;
+
+			/* A sanity check: the relations have the same Oid. */
+			Assert(root->simple_rte_array[k]->relid ==
+				   root->simple_rte_array[r]->relid);
+
+			/*
+			 * It is impossible to eliminate the join of two relations if they
+			 * belong to different rules of order. Otherwise, the planner
+			 * can't find any variants of the correct query plan.
+			 */
+			foreach(lc, root->join_info_list)
+			{
+				SpecialJoinInfo *info = (SpecialJoinInfo *) lfirst(lc);
+
+				if ((bms_is_member(k, info->syn_lefthand) ^
+					 bms_is_member(r, info->syn_lefthand)) ||
+					(bms_is_member(k, info->syn_righthand) ^
+					 bms_is_member(r, info->syn_righthand)))
+				{
+					jinfo_check = false;
+					break;
+				}
+			}
+			if (!jinfo_check)
+				continue;
+
+			/*
+			 * Check Row Marks equivalence. We can't remove the join if the
+			 * relations have row marks of different strength (e.g., one is
+			 * locked FOR UPDATE, and another just has ROW_MARK_REFERENCE for
+			 * EvalPlanQual rechecking).
+			 */
+			foreach(lc, root->rowMarks)
+			{
+				PlanRowMark *rowMark = (PlanRowMark *) lfirst(lc);
+
+				if (rowMark->rti == k)
+				{
+					Assert(imark == NULL);
+					imark = rowMark;
+				}
+				else if (rowMark->rti == r)
+				{
+					Assert(omark == NULL);
+					omark = rowMark;
+				}
+
+				if (omark && imark)
+					break;
+			}
+			if (omark && imark && omark->markType != imark->markType)
+				continue;
+
+			/*
+			 * We only deal with base rels here, so their relids bitset
+			 * contains only one member -- their relid.
+			 */
+			joinrelids = bms_add_member(joinrelids, r);
+			joinrelids = bms_add_member(joinrelids, k);
+
+			/*
+			 * PHVs should not impose any constraints on removing self-joins.
+			 */
+
+			/*
+			 * At this stage, joininfo lists of inner and outer can contain
+			 * only clauses required for a superior outer join that can't
+			 * influence this optimization. So, we can avoid to call the
+			 * build_joinrel_restrictlist() routine.
+			 */
+			restrictlist = generate_join_implied_equalities(root, joinrelids,
+															inner->relids,
+															outer, NULL);
+			if (restrictlist == NIL)
+				continue;
+
+			/*
+			 * Process restrictlist to separate the self-join quals from the
+			 * other quals. e.g., "x = x" goes to selfjoinquals and "a = b" to
+			 * otherjoinquals.
+			 */
+			split_selfjoin_quals(root, restrictlist, &selfjoinquals,
+								 &otherjoinquals, inner->relid, outer->relid);
+
+			Assert(list_length(restrictlist) ==
+				   (list_length(selfjoinquals) + list_length(otherjoinquals)));
+
+			/*
+			 * To enable SJE for the only degenerate case without any self
+			 * join clauses at all, add baserestrictinfo to this list. The
+			 * degenerate case works only if both sides have the same clause.
+			 * So doesn't matter which side to add.
+			 */
+			selfjoinquals = list_concat(selfjoinquals, outer->baserestrictinfo);
+
+			/*
+			 * Determine if the inner table can duplicate outer rows.  We must
+			 * bypass the unique rel cache here since we're possibly using a
+			 * subset of join quals. We can use 'force_cache' == true when all
+			 * join quals are self-join quals.  Otherwise, we could end up
+			 * putting false negatives in the cache.
+			 */
+			if (!innerrel_is_unique_ext(root, joinrelids, inner->relids,
+										outer, JOIN_INNER, selfjoinquals,
+										list_length(otherjoinquals) == 0,
+										&uclauses))
+				continue;
+
+			/*
+			 * 'uclauses' is the copy of outer->baserestrictinfo that are
+			 * associated with an index.  We proved by matching selfjoinquals
+			 * to a unique index that the outer relation has at most one
+			 * matching row for each inner row.  Sometimes that is not enough.
+			 * e.g. "WHERE s1.b = s2.b AND s1.a = 1 AND s2.a = 2" when the
+			 * unique index is (a,b).  Having non-empty uclauses, we must
+			 * validate that the inner baserestrictinfo contains the same
+			 * expressions, or we won't match the same row on each side of the
+			 * join.
+			 */
+			if (!match_unique_clauses(root, inner, uclauses, outer->relid))
+				continue;
+
+			/*
+			 * We can remove either relation, so remove the inner one in order
+			 * to simplify this loop.
+			 */
+			remove_self_join_rel(root, omark, imark, outer, inner, restrictlist);
+
+			result = bms_add_member(result, r);
+
+			/* We have removed the outer relation, try the next one. */
+			break;
+		}
+	}
+
+	return result;
+}
+
+/*
+ * Gather indexes of base relations from the joinlist and try to eliminate self
+ * joins.
+ */
+static Relids
+remove_self_joins_recurse(PlannerInfo *root, List *joinlist, Relids toRemove)
+{
+	ListCell   *jl;
+	Relids		relids = NULL;
+	SelfJoinCandidate *candidates = NULL;
+	int			i;
+	int			j;
+	int			numRels;
+
+	/* Collect indexes of base relations of the join tree */
+	foreach(jl, joinlist)
+	{
+		Node	   *jlnode = (Node *) lfirst(jl);
+
+		if (IsA(jlnode, RangeTblRef))
+		{
+			int			varno = ((RangeTblRef *) jlnode)->rtindex;
+			RangeTblEntry *rte = root->simple_rte_array[varno];
+
+			/*
+			 * We only consider ordinary relations as candidates to be
+			 * removed, and these relations should not have TABLESAMPLE
+			 * clauses specified.  Removing a relation with TABLESAMPLE clause
+			 * could potentially change the syntax of the query. Because of
+			 * UPDATE/DELETE EPQ mechanism, currently Query->resultRelation or
+			 * Query->mergeTargetRelation associated rel cannot be eliminated.
+			 */
+			if (rte->rtekind == RTE_RELATION &&
+				rte->relkind == RELKIND_RELATION &&
+				rte->tablesample == NULL &&
+				varno != root->parse->resultRelation &&
+				varno != root->parse->mergeTargetRelation)
+			{
+				Assert(!bms_is_member(varno, relids));
+				relids = bms_add_member(relids, varno);
+			}
+		}
+		else if (IsA(jlnode, List))
+		{
+			/* Recursively go inside the sub-joinlist */
+			toRemove = remove_self_joins_recurse(root, (List *) jlnode,
+												 toRemove);
+		}
+		else
+			elog(ERROR, "unrecognized joinlist node type: %d",
+				 (int) nodeTag(jlnode));
+	}
+
+	numRels = bms_num_members(relids);
+
+	/* Need at least two relations for the join */
+	if (numRels < 2)
+		return toRemove;
+
+	/*
+	 * In order to find relations with the same oid we first build an array of
+	 * candidates and then sort it by oid.
+	 */
+	candidates = (SelfJoinCandidate *) palloc(sizeof(SelfJoinCandidate) *
+											  numRels);
+	i = -1;
+	j = 0;
+	while ((i = bms_next_member(relids, i)) >= 0)
+	{
+		candidates[j].relid = i;
+		candidates[j].reloid = root->simple_rte_array[i]->relid;
+		j++;
+	}
+
+	qsort(candidates, numRels, sizeof(SelfJoinCandidate),
+		  self_join_candidates_cmp);
+
+	/*
+	 * Iteratively form a group of relation indexes with the same oid and
+	 * launch the routine that detects self-joins in this group and removes
+	 * excessive range table entries.
+	 *
+	 * At the end of the iteration, exclude the group from the overall relids
+	 * list. So each next iteration of the cycle will involve less and less
+	 * value of relids.
+	 */
+	i = 0;
+	for (j = 1; j < numRels + 1; j++)
+	{
+		if (j == numRels || candidates[j].reloid != candidates[i].reloid)
+		{
+			if (j - i >= 2)
+			{
+				/* Create a group of relation indexes with the same oid */
+				Relids		group = NULL;
+				Relids		removed;
+
+				while (i < j)
+				{
+					group = bms_add_member(group, candidates[i].relid);
+					i++;
+				}
+				relids = bms_del_members(relids, group);
+
+				/*
+				 * Try to remove self-joins from a group of identical entries.
+				 * Make the next attempt iteratively - if something is deleted
+				 * from a group, changes in clauses and equivalence classes
+				 * can give us a chance to find more candidates.
+				 */
+				do
+				{
+					Assert(!bms_overlap(group, toRemove));
+					removed = remove_self_joins_one_group(root, group);
+					toRemove = bms_add_members(toRemove, removed);
+					group = bms_del_members(group, removed);
+				} while (!bms_is_empty(removed) &&
+						 bms_membership(group) == BMS_MULTIPLE);
+				bms_free(removed);
+				bms_free(group);
+			}
+			else
+			{
+				/* Single relation, just remove it from the set */
+				relids = bms_del_member(relids, candidates[i].relid);
+				i = j;
+			}
+		}
+	}
+
+	Assert(bms_is_empty(relids));
+
+	return toRemove;
+}
+
+/*
+ * Compare self-join candidates by their oids.
+ */
+static int
+self_join_candidates_cmp(const void *a, const void *b)
+{
+	const SelfJoinCandidate *ca = (const SelfJoinCandidate *) a;
+	const SelfJoinCandidate *cb = (const SelfJoinCandidate *) b;
+
+	if (ca->reloid != cb->reloid)
+		return (ca->reloid < cb->reloid ? -1 : 1);
+	else
+		return 0;
+}
+
+/*
+ * Find and remove useless self joins.
+ *
+ * Search for joins where a relation is joined to itself. If the join clause
+ * for each tuple from one side of the join is proven to match the same
+ * physical row (or nothing) on the other side, that self-join can be
+ * eliminated from the query.  Suitable join clauses are assumed to be in the
+ * form of X = X, and can be replaced with NOT NULL clauses.
+ *
+ * For the sake of simplicity, we don't apply this optimization to special
+ * joins. Here is a list of what we could do in some particular cases:
+ * 'a a1 semi join a a2': is reduced to inner by reduce_unique_semijoins,
+ * and then removed normally.
+ * 'a a1 anti join a a2': could simplify to a scan with 'outer quals AND
+ * (IS NULL on join columns OR NOT inner quals)'.
+ * 'a a1 left join a a2': could simplify to a scan like inner but without
+ * NOT NULL conditions on join columns.
+ * 'a a1 left join (a a2 join b)': can't simplify this, because join to b
+ * can both remove rows and introduce duplicates.
+ *
+ * To search for removable joins, we order all the relations on their Oid,
+ * go over each set with the same Oid, and consider each pair of relations
+ * in this set.
+ *
+ * To remove the join, we mark one of the participating relations as dead
+ * and rewrite all references to it to point to the remaining relation.
+ * This includes modifying RestrictInfos, EquivalenceClasses, and
+ * EquivalenceMembers. We also have to modify the row marks. The join clauses
+ * of the removed relation become either restriction or join clauses, based on
+ * whether they reference any relations not participating in the removed join.
+ *
+ * 'joinlist' is the top-level joinlist of the query. If it has any
+ * references to the removed relations, we update them to point to the
+ * remaining ones.
+ */
+List *
+remove_useless_self_joins(PlannerInfo *root, List *joinlist)
+{
+	Relids		toRemove = NULL;
+	int			relid = -1;
+
+	if (!enable_self_join_elimination || joinlist == NIL ||
+		(list_length(joinlist) == 1 && !IsA(linitial(joinlist), List)))
+		return joinlist;
+
+	/*
+	 * Merge pairs of relations participated in self-join. Remove unnecessary
+	 * range table entries.
+	 */
+	toRemove = remove_self_joins_recurse(root, joinlist, toRemove);
+
+	if (unlikely(toRemove != NULL))
+	{
+		/* At the end, remove orphaned relation links */
+		while ((relid = bms_next_member(toRemove, relid)) >= 0)
+		{
+			int			nremoved = 0;
+
+			joinlist = remove_rel_from_joinlist(joinlist, relid, &nremoved);
+			if (nremoved != 1)
+				elog(ERROR, "failed to find relation %d in joinlist", relid);
+		}
+	}
+
+	return joinlist;
 }
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index ade23fd9d56..5467e094ca7 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -233,6 +233,11 @@ query_planner(PlannerInfo *root,
 	 */
 	reduce_unique_semijoins(root);
 
+	/*
+	 * Remove self joins on a unique column.
+	 */
+	joinlist = remove_useless_self_joins(root, joinlist);
+
 	/*
 	 * Now distribute "placeholders" to base rels as needed.  This has to be
 	 * done after join removal because removal could change whether a
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 7c27dc24e21..eab44da65b8 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -639,14 +639,17 @@ build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
 	 * otherwise do statistical estimation.
 	 *
 	 * XXX you don't really want to know about this: we do the estimation
-	 * using the subquery's original targetlist expressions, not the
+	 * using the subroot->parse's original targetlist expressions, not the
 	 * subroot->processed_tlist which might seem more appropriate.  The reason
 	 * is that if the subquery is itself a setop, it may return a
 	 * processed_tlist containing "varno 0" Vars generated by
 	 * generate_append_tlist, and those would confuse estimate_num_groups
 	 * mightily.  We ought to get rid of the "varno 0" hack, but that requires
 	 * a redesign of the parsetree representation of setops, so that there can
-	 * be an RTE corresponding to each setop's output.
+	 * be an RTE corresponding to each setop's output. Note, we use this not
+	 * subquery's targetlist but subroot->parse's targetlist, because it was
+	 * revised by self-join removal.  subquery's targetlist might contain the
+	 * references to the removed relids.
 	 */
 	if (pNumGroups)
 	{
@@ -659,7 +662,7 @@ build_setop_child_paths(PlannerInfo *root, RelOptInfo *rel,
 			*pNumGroups = rel->cheapest_total_path->rows;
 		else
 			*pNumGroups = estimate_num_groups(subroot,
-											  get_tlist_exprs(subquery->targetList, false),
+											  get_tlist_exprs(subroot->parse->targetList, false),
 											  rel->cheapest_total_path->rows,
 											  NULL,
 											  NULL);
diff --git a/src/backend/rewrite/rewriteManip.c b/src/backend/rewrite/rewriteManip.c
index a115b217c91..9433548d279 100644
--- a/src/backend/rewrite/rewriteManip.c
+++ b/src/backend/rewrite/rewriteManip.c
@@ -64,7 +64,6 @@ static bool locate_windowfunc_walker(Node *node,
 									 locate_windowfunc_context *context);
 static bool checkExprHasSubLink_walker(Node *node, void *context);
 static Relids offset_relid_set(Relids relids, int offset);
-static Relids adjust_relid_set(Relids relids, int oldrelid, int newrelid);
 static Node *add_nulling_relids_mutator(Node *node,
 										add_nulling_relids_context *context);
 static Node *remove_nulling_relids_mutator(Node *node,
@@ -543,6 +542,8 @@ offset_relid_set(Relids relids, int offset)
  * (identified by sublevels_up and rt_index), and change their varno fields
  * to 'new_index'.  The varnosyn fields are changed too.  Also, adjust other
  * nodes that contain rangetable indexes, such as RangeTblRef and JoinExpr.
+ * Specifying 'change_RangeTblRef' to false allows skipping RangeTblRef.
+ * See ChangeVarNodesExtended for details.
  *
  * NOTE: although this has the form of a walker, we cheat and modify the
  * nodes in-place.  The given expression tree should have been copied
@@ -554,6 +555,7 @@ typedef struct
 	int			rt_index;
 	int			new_index;
 	int			sublevels_up;
+	bool		change_RangeTblRef;
 } ChangeVarNodes_context;
 
 static bool
@@ -586,7 +588,7 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
 			cexpr->cvarno = context->new_index;
 		return false;
 	}
-	if (IsA(node, RangeTblRef))
+	if (IsA(node, RangeTblRef) && context->change_RangeTblRef)
 	{
 		RangeTblRef *rtr = (RangeTblRef *) node;
 
@@ -633,6 +635,75 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
 		}
 		return false;
 	}
+	if (IsA(node, RestrictInfo))
+	{
+		RestrictInfo *rinfo = (RestrictInfo *) node;
+		int			relid = -1;
+		bool		is_req_equal =
+			(rinfo->required_relids == rinfo->clause_relids);
+		bool		clause_relids_is_multiple =
+			(bms_membership(rinfo->clause_relids) == BMS_MULTIPLE);
+
+		if (bms_is_member(context->rt_index, rinfo->clause_relids))
+		{
+			expression_tree_walker((Node *) rinfo->clause, ChangeVarNodes_walker, (void *) context);
+			expression_tree_walker((Node *) rinfo->orclause, ChangeVarNodes_walker, (void *) context);
+
+			rinfo->clause_relids =
+				adjust_relid_set(rinfo->clause_relids, context->rt_index, context->new_index);
+			rinfo->num_base_rels = bms_num_members(rinfo->clause_relids);
+			rinfo->left_relids =
+				adjust_relid_set(rinfo->left_relids, context->rt_index, context->new_index);
+			rinfo->right_relids =
+				adjust_relid_set(rinfo->right_relids, context->rt_index, context->new_index);
+		}
+
+		if (is_req_equal)
+			rinfo->required_relids = rinfo->clause_relids;
+		else
+			rinfo->required_relids =
+				adjust_relid_set(rinfo->required_relids, context->rt_index, context->new_index);
+
+		rinfo->outer_relids =
+			adjust_relid_set(rinfo->outer_relids, context->rt_index, context->new_index);
+		rinfo->incompatible_relids =
+			adjust_relid_set(rinfo->incompatible_relids, context->rt_index, context->new_index);
+
+		if (rinfo->mergeopfamilies &&
+			bms_get_singleton_member(rinfo->clause_relids, &relid) &&
+			clause_relids_is_multiple &&
+			relid == context->new_index && IsA(rinfo->clause, OpExpr))
+		{
+			Expr	   *leftOp;
+			Expr	   *rightOp;
+
+			leftOp = (Expr *) get_leftop(rinfo->clause);
+			rightOp = (Expr *) get_rightop(rinfo->clause);
+
+			/*
+			 * For self-join elimination, changing varnos could transform
+			 * "t1.a = t2.a" into "t1.a = t1.a".  That is always true as long
+			 * as "t1.a" is not null.  We use qual() to check for such a case,
+			 * and then we replace the qual for a check for not null
+			 * (NullTest).
+			 */
+			if (leftOp != NULL && equal(leftOp, rightOp))
+			{
+				NullTest   *ntest = makeNode(NullTest);
+
+				ntest->arg = leftOp;
+				ntest->nulltesttype = IS_NOT_NULL;
+				ntest->argisrow = false;
+				ntest->location = -1;
+				rinfo->clause = (Expr *) ntest;
+				rinfo->mergeopfamilies = NIL;
+				rinfo->left_em = NULL;
+				rinfo->right_em = NULL;
+			}
+			Assert(rinfo->orclause == NULL);
+		}
+		return false;
+	}
 	if (IsA(node, AppendRelInfo))
 	{
 		AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -665,32 +736,32 @@ ChangeVarNodes_walker(Node *node, ChangeVarNodes_context *context)
 	return expression_tree_walker(node, ChangeVarNodes_walker, context);
 }
 
+/*
+ * ChangeVarNodesExtended - similar to ChangeVarNodes, but has additional
+ *							'change_RangeTblRef' param
+ *
+ * ChangeVarNodes changes a given node and all of its underlying nodes.
+ * However, self-join elimination (SJE) needs to skip the RangeTblRef node
+ * type.  During SJE's last step, remove_rel_from_joinlist() removes
+ * remaining RangeTblRefs with target relid.  If ChangeVarNodes() replaces
+ * the target relid before, remove_rel_from_joinlist() fails to identify
+ * the nodes to delete.
+ */
 void
-ChangeVarNodes(Node *node, int rt_index, int new_index, int sublevels_up)
+ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
+					   int sublevels_up, bool change_RangeTblRef)
 {
 	ChangeVarNodes_context context;
 
 	context.rt_index = rt_index;
 	context.new_index = new_index;
 	context.sublevels_up = sublevels_up;
+	context.change_RangeTblRef = change_RangeTblRef;
 
-	/*
-	 * Must be prepared to start with a Query or a bare expression tree; if
-	 * it's a Query, go straight to query_tree_walker to make sure that
-	 * sublevels_up doesn't get incremented prematurely.
-	 */
 	if (node && IsA(node, Query))
 	{
 		Query	   *qry = (Query *) node;
 
-		/*
-		 * If we are starting at a Query, and sublevels_up is zero, then we
-		 * must also fix rangetable indexes in the Query itself --- namely
-		 * resultRelation, mergeTargetRelation, exclRelIndex  and rowMarks
-		 * entries.  sublevels_up cannot be zero when recursing into a
-		 * subquery, so there's no need to have the same logic inside
-		 * ChangeVarNodes_walker.
-		 */
 		if (sublevels_up == 0)
 		{
 			ListCell   *l;
@@ -701,7 +772,6 @@ ChangeVarNodes(Node *node, int rt_index, int new_index, int sublevels_up)
 			if (qry->mergeTargetRelation == rt_index)
 				qry->mergeTargetRelation = new_index;
 
-			/* this is unlikely to ever be used, but ... */
 			if (qry->onConflict && qry->onConflict->exclRelIndex == rt_index)
 				qry->onConflict->exclRelIndex = new_index;
 
@@ -719,15 +789,22 @@ ChangeVarNodes(Node *node, int rt_index, int new_index, int sublevels_up)
 		ChangeVarNodes_walker(node, &context);
 }
 
+void
+ChangeVarNodes(Node *node, int rt_index, int new_index, int sublevels_up)
+{
+	ChangeVarNodesExtended(node, rt_index, new_index, sublevels_up, true);
+}
+
 /*
- * Substitute newrelid for oldrelid in a Relid set
+ * adjust_relid_set - substitute newrelid for oldrelid in a Relid set
  *
- * Note: some extensions may pass a special varno such as INDEX_VAR for
- * oldrelid.  bms_is_member won't like that, but we should tolerate it.
- * (Perhaps newrelid could also be a special varno, but there had better
- * not be a reason to inject that into a nullingrels or phrels set.)
+ * Attempt to remove oldrelid from a Relid set (as long as it's not a special
+ * varno).  If oldrelid was found and removed, insert newrelid into a Relid
+ * set (as long as it's not a special varno).  Therefore, when oldrelid is
+ * a special varno, this function does nothing.  When newrelid is a special
+ * varno, this function behaves as delete.
  */
-static Relids
+Relids
 adjust_relid_set(Relids relids, int oldrelid, int newrelid)
 {
 	if (!IS_SPECIAL_VARNO(oldrelid) && bms_is_member(oldrelid, relids))
@@ -736,7 +813,8 @@ adjust_relid_set(Relids relids, int oldrelid, int newrelid)
 		relids = bms_copy(relids);
 		/* Remove old, add new */
 		relids = bms_del_member(relids, oldrelid);
-		relids = bms_add_member(relids, newrelid);
+		if (!IS_SPECIAL_VARNO(newrelid))
+			relids = bms_add_member(relids, newrelid);
 	}
 	return relids;
 }
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 42728189322..cce73314609 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -988,6 +988,16 @@ struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_self_join_elimination", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enable removal of unique self-joins."),
+			NULL,
+			GUC_EXPLAIN | GUC_NOT_IN_SAMPLE
+		},
+		&enable_self_join_elimination,
+		true,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_group_by_reordering", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables reordering of GROUP BY keys."),
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 00c700cc3e7..fbf05322c75 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -201,6 +201,11 @@ typedef struct PlannerGlobal
  * Not all fields are printed.  (In some cases, there is no print support for
  * the field type; in others, doing so would lead to infinite recursion or
  * bloat dump output more than seems useful.)
+ *
+ * NOTE: When adding new entries containing relids and relid bitmapsets,
+ * remember to check that they will be correctly processed by
+ * the remove_self_join_rel function - relid of removing relation will be
+ * correctly replaced with the keeping one.
  *----------
  */
 #ifndef HAVE_PLANNERINFO_TYPEDEF
@@ -753,7 +758,7 @@ typedef struct PartitionSchemeData *PartitionScheme;
  * populate these fields, for base rels; but someday they might be used for
  * join rels too:
  *
- *		unique_for_rels - list of Relid sets, each one being a set of other
+ *		unique_for_rels - list of UniqueRelInfo, each one being a set of other
  *					rels for which this one has been proven unique
  *		non_unique_for_rels - list of Relid sets, each one being a set of
  *					other rels for which we have tried and failed to prove
@@ -992,7 +997,7 @@ typedef struct RelOptInfo
 	/*
 	 * cache space for remembering if we have proven this relation unique
 	 */
-	/* known unique for these other relid set(s) */
+	/* known unique for these other relid set(s) given in UniqueRelInfo(s) */
 	List	   *unique_for_rels;
 	/* known not unique for these set(s) */
 	List	   *non_unique_for_rels;
@@ -3463,4 +3468,35 @@ typedef struct AggTransInfo
 	bool		initValueIsNull;
 } AggTransInfo;
 
+/*
+ * UniqueRelInfo caches a fact that a relation is unique when being joined
+ * to other relation(s).
+ */
+typedef struct UniqueRelInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/*
+	 * The relation in consideration is unique when being joined with this set
+	 * of other relation(s).
+	 */
+	Relids		outerrelids;
+
+	/*
+	 * The relation in consideration is unique when considering only clauses
+	 * suitable for self-join (passed split_selfjoin_quals()).
+	 */
+	bool		self_join;
+
+	/*
+	 * Additional clauses from a baserestrictinfo list that were used to prove
+	 * the uniqueness.   We cache it for the self-join checking procedure: a
+	 * self-join can be removed if the outer relation contains strictly the
+	 * same set of clauses.
+	 */
+	List	   *extra_clauses;
+} UniqueRelInfo;
+
 #endif							/* PATHNODES_H */
diff --git a/src/include/optimizer/optimizer.h b/src/include/optimizer/optimizer.h
index bcf8ed645c2..78e05d88c8e 100644
--- a/src/include/optimizer/optimizer.h
+++ b/src/include/optimizer/optimizer.h
@@ -192,6 +192,8 @@ extern SortGroupClause *get_sortgroupref_clause_noerr(Index sortref,
 											 * output list */
 #define PVC_RECURSE_PLACEHOLDERS	0x0020	/* recurse into PlaceHolderVar
 											 * arguments */
+#define PVC_INCLUDE_CONVERTROWTYPES	0x0040	/* include ConvertRowtypeExprs in
+											 * output list */
 
 extern Bitmapset *pull_varnos(PlannerInfo *root, Node *node);
 extern Bitmapset *pull_varnos_of_level(PlannerInfo *root, Node *node, int levelsup);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 46955d128f0..bc5dfd7db41 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -72,6 +72,9 @@ extern void create_index_paths(PlannerInfo *root, RelOptInfo *rel);
 extern bool relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 										  List *restrictlist,
 										  List *exprlist, List *oprlist);
+extern bool relation_has_unique_index_ext(PlannerInfo *root, RelOptInfo *rel,
+										  List *restrictlist, List *exprlist,
+										  List *oprlist, List **extra_clauses);
 extern bool indexcol_is_bool_constant_for_query(PlannerInfo *root,
 												IndexOptInfo *index,
 												int indexcol);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index fee3378bbe3..5a930199611 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -20,6 +20,7 @@
 /* GUC parameters */
 #define DEFAULT_CURSOR_TUPLE_FRACTION 0.1
 extern PGDLLIMPORT double cursor_tuple_fraction;
+extern PGDLLIMPORT bool enable_self_join_elimination;
 
 /* query_planner callback to compute query_pathkeys */
 typedef void (*query_pathkeys_callback) (PlannerInfo *root, void *extra);
@@ -113,6 +114,11 @@ extern bool query_is_distinct_for(Query *query, List *colnos, List *opids);
 extern bool innerrel_is_unique(PlannerInfo *root,
 							   Relids joinrelids, Relids outerrelids, RelOptInfo *innerrel,
 							   JoinType jointype, List *restrictlist, bool force_cache);
+extern bool innerrel_is_unique_ext(PlannerInfo *root, Relids joinrelids,
+								   Relids outerrelids, RelOptInfo *innerrel,
+								   JoinType jointype, List *restrictlist,
+								   bool force_cache, List **uclauses);
+extern List *remove_useless_self_joins(PlannerInfo *root, List *jointree);
 
 /*
  * prototypes for plan/setrefs.c
diff --git a/src/include/rewrite/rewriteManip.h b/src/include/rewrite/rewriteManip.h
index 512823033b9..5ec475c63e9 100644
--- a/src/include/rewrite/rewriteManip.h
+++ b/src/include/rewrite/rewriteManip.h
@@ -15,6 +15,7 @@
 #define REWRITEMANIP_H
 
 #include "nodes/parsenodes.h"
+#include "nodes/pathnodes.h"
 
 struct AttrMap;					/* avoid including attmap.h here */
 
@@ -41,11 +42,14 @@ typedef enum ReplaceVarsNoMatchOption
 } ReplaceVarsNoMatchOption;
 
 
+extern Relids adjust_relid_set(Relids relids, int oldrelid, int newrelid);
 extern void CombineRangeTables(List **dst_rtable, List **dst_perminfos,
 							   List *src_rtable, List *src_perminfos);
 extern void OffsetVarNodes(Node *node, int offset, int sublevels_up);
 extern void ChangeVarNodes(Node *node, int rt_index, int new_index,
 						   int sublevels_up);
+extern void ChangeVarNodesExtended(Node *node, int rt_index, int new_index,
+								   int sublevels_up, bool change_RangeTblRef);
 extern void IncrementVarSublevelsUp(Node *node, int delta_sublevels_up,
 									int min_sublevels_up);
 extern void IncrementVarSublevelsUp_rtable(List *rtable,
diff --git a/src/test/regress/expected/equivclass.out b/src/test/regress/expected/equivclass.out
index 56227505009..ad8ab294ff6 100644
--- a/src/test/regress/expected/equivclass.out
+++ b/src/test/regress/expected/equivclass.out
@@ -434,6 +434,36 @@ explain (costs off)
    Filter: ((unique1 IS NOT NULL) AND (unique2 IS NOT NULL))
 (2 rows)
 
+-- Test that broken ECs are processed correctly during self join removal.
+-- Disable merge joins so that we don't get an error about missing commutator.
+-- Test both orientations of the join clause, because only one of them breaks
+-- the EC.
+set enable_mergejoin to off;
+explain (costs off)
+  select * from ec0 m join ec0 n on m.ff = n.ff
+  join ec1 p on m.ff + n.ff = p.f1;
+              QUERY PLAN               
+---------------------------------------
+ Nested Loop
+   Join Filter: ((n.ff + n.ff) = p.f1)
+   ->  Seq Scan on ec0 n
+   ->  Materialize
+         ->  Seq Scan on ec1 p
+(5 rows)
+
+explain (costs off)
+  select * from ec0 m join ec0 n on m.ff = n.ff
+  join ec1 p on p.f1::int8 = (m.ff + n.ff)::int8alias1;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Nested Loop
+   Join Filter: ((p.f1)::bigint = ((n.ff + n.ff))::int8alias1)
+   ->  Seq Scan on ec0 n
+   ->  Materialize
+         ->  Seq Scan on ec1 p
+(5 rows)
+
+reset enable_mergejoin;
 -- this could be converted, but isn't at present
 explain (costs off)
   select * from tenk1 where unique1 = unique1 or unique2 = unique2;
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 3ffc066b1f8..a57bb18c24f 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -5966,6 +5966,27 @@ select c.id, ss.a from c
          ->  Seq Scan on c
 (7 rows)
 
+-- check the case when the placeholder relates to an outer join and its
+-- inner in the press field but actually uses only the outer side of the join
+explain (costs off)
+SELECT q.val FROM b LEFT JOIN (
+  SELECT (q1.z IS NOT NULL) AS val
+  FROM b LEFT JOIN (
+    SELECT (t1.b_id IS NOT NULL) AS z FROM a t1 LEFT JOIN a t2 USING (id)
+    ) AS q1
+  ON true
+) AS q ON true;
+                QUERY PLAN                
+------------------------------------------
+ Nested Loop Left Join
+   ->  Seq Scan on b
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on b b_1
+               ->  Materialize
+                     ->  Seq Scan on a t1
+(7 rows)
+
 CREATE TEMP TABLE parted_b (id int PRIMARY KEY) partition by range(id);
 CREATE TEMP TABLE parted_b1 partition of parted_b for values from (0) to (10);
 -- test join removals on a partitioned table
@@ -6377,6 +6398,1068 @@ select * from
 ----+----+----+----
 (0 rows)
 
+--
+-- test that semi- or inner self-joins on a unique column are removed
+--
+-- enable only nestloop to get more predictable plans
+set enable_hashjoin to off;
+set enable_mergejoin to off;
+create table sj (a int unique, b int, c int unique);
+insert into sj values (1, null, 2), (null, 2, null), (2, 1, 1);
+analyze sj;
+-- Trivial self-join case.
+explain (costs off)
+select p.* from sj p, sj q where q.a = p.a and q.b = q.a - 1;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on sj q
+   Filter: ((a IS NOT NULL) AND (b = (a - 1)))
+(2 rows)
+
+select p.* from sj p, sj q where q.a = p.a and q.b = q.a - 1;
+ a | b | c 
+---+---+---
+ 2 | 1 | 1
+(1 row)
+
+-- Self-join removal performs after a subquery pull-up process and could remove
+-- such kind of self-join too. Check this option.
+explain (costs off)
+select * from sj p
+where exists (select * from sj q
+              where q.a = p.a and q.b < 10);
+                QUERY PLAN                
+------------------------------------------
+ Seq Scan on sj q
+   Filter: ((a IS NOT NULL) AND (b < 10))
+(2 rows)
+
+select * from sj p
+where exists (select * from sj q
+              where q.a = p.a and q.b < 10);
+ a | b | c 
+---+---+---
+ 2 | 1 | 1
+(1 row)
+
+-- Don't remove self-join for the case of equality of two different unique columns.
+explain (costs off)
+select * from sj t1, sj t2 where t1.a = t2.c and t1.b is not null;
+              QUERY PLAN               
+---------------------------------------
+ Nested Loop
+   Join Filter: (t1.a = t2.c)
+   ->  Seq Scan on sj t2
+   ->  Materialize
+         ->  Seq Scan on sj t1
+               Filter: (b IS NOT NULL)
+(6 rows)
+
+-- Ensure that relations with TABLESAMPLE clauses are not considered as
+-- candidates to be removed
+explain (costs off)
+select * from sj t1
+    join lateral
+      (select * from sj tablesample system(t1.b)) s
+    on t1.a = s.a;
+              QUERY PLAN               
+---------------------------------------
+ Nested Loop
+   ->  Seq Scan on sj t1
+   ->  Memoize
+         Cache Key: t1.a, t1.b
+         Cache Mode: binary
+         ->  Sample Scan on sj
+               Sampling: system (t1.b)
+               Filter: (t1.a = a)
+(8 rows)
+
+-- Ensure that SJE does not form a self-referential lateral dependency
+explain (costs off)
+select * from sj t1
+    left join lateral
+      (select t1.a as t1a, * from sj t2) s
+    on true
+where t1.a = s.a;
+        QUERY PLAN         
+---------------------------
+ Seq Scan on sj t2
+   Filter: (a IS NOT NULL)
+(2 rows)
+
+-- Degenerated case.
+explain (costs off)
+select * from
+  (select a as x from sj where false) as q1,
+  (select a as y from sj where false) as q2
+where q1.x = q2.y;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
+-- We can't use a cross-EC generated self join qual because of current logic of
+-- the generate_join_implied_equalities routine.
+explain (costs off)
+select * from sj t1, sj t2 where t1.a = t1.b and t1.b = t2.b and t2.b = t2.a;
+          QUERY PLAN          
+------------------------------
+ Nested Loop
+   Join Filter: (t1.a = t2.b)
+   ->  Seq Scan on sj t1
+         Filter: (a = b)
+   ->  Seq Scan on sj t2
+         Filter: (b = a)
+(6 rows)
+
+explain (costs off)
+select * from sj t1, sj t2, sj t3
+where t1.a = t1.b and t1.b = t2.b and t2.b = t2.a and
+      t1.b = t3.b and t3.b = t3.a;
+             QUERY PLAN             
+------------------------------------
+ Nested Loop
+   Join Filter: (t1.a = t3.b)
+   ->  Nested Loop
+         Join Filter: (t1.a = t2.b)
+         ->  Seq Scan on sj t1
+               Filter: (a = b)
+         ->  Seq Scan on sj t2
+               Filter: (b = a)
+   ->  Seq Scan on sj t3
+         Filter: (b = a)
+(10 rows)
+
+-- Double self-join removal.
+-- Use a condition on "b + 1", not on "b", for the second join, so that
+-- the equivalence class is different from the first one, and we can
+-- test the non-ec code path.
+explain (costs off)
+select *
+from  sj t1
+      join sj t2 on t1.a = t2.a and t1.b = t2.b
+	  join sj t3 on t2.a = t3.a and t2.b + 1 = t3.b + 1;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Seq Scan on sj t3
+   Filter: ((a IS NOT NULL) AND (b IS NOT NULL) AND ((b + 1) IS NOT NULL))
+(2 rows)
+
+-- subselect that references the removed relation
+explain (costs off)
+select t1.a, (select a from sj where a = t2.a and a = t1.a)
+from sj t1, sj t2
+where t1.a = t2.a;
+                QUERY PLAN                
+------------------------------------------
+ Seq Scan on sj t2
+   Filter: (a IS NOT NULL)
+   SubPlan 1
+     ->  Result
+           One-Time Filter: (t2.a = t2.a)
+           ->  Seq Scan on sj
+                 Filter: (a = t2.a)
+(7 rows)
+
+-- self-join under outer join
+explain (costs off)
+select * from sj x join sj y on x.a = y.a
+left join int8_tbl z on x.a = z.q1;
+             QUERY PLAN             
+------------------------------------
+ Nested Loop Left Join
+   Join Filter: (y.a = z.q1)
+   ->  Seq Scan on sj y
+         Filter: (a IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on int8_tbl z
+(6 rows)
+
+explain (costs off)
+select * from sj x join sj y on x.a = y.a
+left join int8_tbl z on y.a = z.q1;
+             QUERY PLAN             
+------------------------------------
+ Nested Loop Left Join
+   Join Filter: (y.a = z.q1)
+   ->  Seq Scan on sj y
+         Filter: (a IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on int8_tbl z
+(6 rows)
+
+explain (costs off)
+select * from (
+  select t1.*, t2.a as ax from sj t1 join sj t2
+  on (t1.a = t2.a and t1.c * t1.c = t2.c + 2 and t2.b is null)
+) as q1
+left join
+  (select t3.* from sj t3, sj t4 where t3.c = t4.c) as q2
+on q1.ax = q2.a;
+                                QUERY PLAN                                 
+---------------------------------------------------------------------------
+ Nested Loop Left Join
+   Join Filter: (t2.a = t4.a)
+   ->  Seq Scan on sj t2
+         Filter: ((b IS NULL) AND (a IS NOT NULL) AND ((c * c) = (c + 2)))
+   ->  Seq Scan on sj t4
+         Filter: (c IS NOT NULL)
+(6 rows)
+
+-- Test that placeholders are updated correctly after join removal
+explain (costs off)
+select * from (values (1)) x
+left join (select coalesce(y.q1, 1) from int8_tbl y
+	right join sj j1 inner join sj j2 on j1.a = j2.a
+	on true) z
+on true;
+                QUERY PLAN                
+------------------------------------------
+ Nested Loop Left Join
+   ->  Result
+   ->  Nested Loop Left Join
+         ->  Seq Scan on sj j2
+               Filter: (a IS NOT NULL)
+         ->  Materialize
+               ->  Seq Scan on int8_tbl y
+(7 rows)
+
+-- Test that references to the removed rel in lateral subqueries are replaced
+-- correctly after join removal
+explain (verbose, costs off)
+select t3.a from sj t1
+	join sj t2 on t1.a = t2.a
+	join lateral (select t1.a offset 0) t3 on true;
+             QUERY PLAN             
+------------------------------------
+ Nested Loop
+   Output: (t2.a)
+   ->  Seq Scan on public.sj t2
+         Output: t2.a, t2.b, t2.c
+         Filter: (t2.a IS NOT NULL)
+   ->  Result
+         Output: t2.a
+(7 rows)
+
+explain (verbose, costs off)
+select t3.a from sj t1
+	join sj t2 on t1.a = t2.a
+	join lateral (select * from (select t1.a offset 0) offset 0) t3 on true;
+             QUERY PLAN             
+------------------------------------
+ Nested Loop
+   Output: (t2.a)
+   ->  Seq Scan on public.sj t2
+         Output: t2.a, t2.b, t2.c
+         Filter: (t2.a IS NOT NULL)
+   ->  Result
+         Output: t2.a
+(7 rows)
+
+explain (verbose, costs off)
+select t4.a from sj t1
+	join sj t2 on t1.a = t2.a
+	join lateral (select t3.a from sj t3, (select t1.a) offset 0) t4 on true;
+             QUERY PLAN             
+------------------------------------
+ Nested Loop
+   Output: t3.a
+   ->  Seq Scan on public.sj t2
+         Output: t2.a, t2.b, t2.c
+         Filter: (t2.a IS NOT NULL)
+   ->  Seq Scan on public.sj t3
+         Output: t3.a
+(7 rows)
+
+-- Check updating of semi_rhs_exprs links from upper-level semi join to
+-- the removing relation
+explain (verbose, costs off)
+select t1.a from sj t1 where t1.b in (
+  select t2.b from sj t2 join sj t3 on t2.c=t3.c);
+                QUERY PLAN                
+------------------------------------------
+ Nested Loop Semi Join
+   Output: t1.a
+   Join Filter: (t1.b = t3.b)
+   ->  Seq Scan on public.sj t1
+         Output: t1.a, t1.b, t1.c
+   ->  Materialize
+         Output: t3.c, t3.b
+         ->  Seq Scan on public.sj t3
+               Output: t3.c, t3.b
+               Filter: (t3.c IS NOT NULL)
+(10 rows)
+
+--
+-- SJE corner case: uniqueness of an inner is [partially] derived from
+-- baserestrictinfo clauses.
+-- XXX: We really should allow SJE for these corner cases?
+--
+INSERT INTO sj VALUES (3, 1, 3);
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2 AND j2.a = 3;
+          QUERY PLAN          
+------------------------------
+ Nested Loop
+   Join Filter: (j1.b = j2.b)
+   ->  Seq Scan on sj j1
+         Filter: (a = 2)
+   ->  Seq Scan on sj j2
+         Filter: (a = 3)
+(6 rows)
+
+-- Return one row
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2 AND j2.a = 3;
+ a | b | c | a | b | c 
+---+---+---+---+---+---
+ 2 | 1 | 1 | 3 | 1 | 3
+(1 row)
+
+-- Remove SJ, define uniqueness by a constant
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2 AND j2.a = 2;
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on sj j2
+   Filter: ((b IS NOT NULL) AND (a = 2))
+(2 rows)
+
+-- Return one row
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2 AND j2.a = 2;
+ a | b | c | a | b | c 
+---+---+---+---+---+---
+ 2 | 1 | 1 | 2 | 1 | 1
+(1 row)
+
+-- Remove SJ, define uniqueness by a constant expression
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND j1.a = (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int
+  AND (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int = j2.a;
+                                                         QUERY PLAN                                                         
+----------------------------------------------------------------------------------------------------------------------------
+ Seq Scan on sj j2
+   Filter: ((b IS NOT NULL) AND (a = (((EXTRACT(dow FROM CURRENT_TIMESTAMP(0)) / '15'::numeric) + '3'::numeric))::integer))
+(2 rows)
+
+-- Return one row
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND j1.a = (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int
+  AND (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int = j2.a;
+ a | b | c | a | b | c 
+---+---+---+---+---+---
+ 3 | 1 | 3 | 3 | 1 | 3
+(1 row)
+
+-- Remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 1 AND j2.a = 1;
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on sj j2
+   Filter: ((b IS NOT NULL) AND (a = 1))
+(2 rows)
+
+-- Return no rows
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 1 AND j2.a = 1;
+ a | b | c | a | b | c 
+---+---+---+---+---+---
+(0 rows)
+
+-- Shuffle a clause. Remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND 1 = j1.a AND j2.a = 1;
+               QUERY PLAN                
+-----------------------------------------
+ Seq Scan on sj j2
+   Filter: ((b IS NOT NULL) AND (a = 1))
+(2 rows)
+
+-- Return no rows
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND 1 = j1.a AND j2.a = 1;
+ a | b | c | a | b | c 
+---+---+---+---+---+---
+(0 rows)
+
+-- SJE Corner case: a 'a.x=a.x' clause, have replaced with 'a.x IS NOT NULL'
+-- after SJ elimination it shouldn't be a mergejoinable clause.
+EXPLAIN (COSTS OFF)
+SELECT t4.*
+FROM (SELECT t1.*, t2.a AS a1 FROM sj t1, sj t2 WHERE t1.b = t2.b) AS t3
+JOIN sj t4 ON (t4.a = t3.a) WHERE t3.a1 = 42;
+           QUERY PLAN            
+---------------------------------
+ Nested Loop
+   Join Filter: (t1.b = t2.b)
+   ->  Seq Scan on sj t2
+         Filter: (a = 42)
+   ->  Seq Scan on sj t1
+         Filter: (a IS NOT NULL)
+(6 rows)
+
+SELECT t4.*
+FROM (SELECT t1.*, t2.a AS a1 FROM sj t1, sj t2 WHERE t1.b = t2.b) AS t3
+JOIN sj t4 ON (t4.a = t3.a) WHERE t3.a1 = 42;
+ a | b | c 
+---+---+---
+(0 rows)
+
+-- Functional index
+CREATE UNIQUE INDEX sj_fn_idx ON sj((a * a));
+-- Remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+	WHERE j1.b = j2.b AND j1.a*j1.a = 1 AND j2.a*j2.a = 1;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Seq Scan on sj j2
+   Filter: ((b IS NOT NULL) AND ((a * a) = 1))
+(2 rows)
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+	WHERE j1.b = j2.b AND j1.a*j1.a = 1 AND j2.a*j2.a = 2;
+          QUERY PLAN           
+-------------------------------
+ Nested Loop
+   Join Filter: (j1.b = j2.b)
+   ->  Seq Scan on sj j1
+         Filter: ((a * a) = 1)
+   ->  Seq Scan on sj j2
+         Filter: ((a * a) = 2)
+(6 rows)
+
+-- Restriction contains expressions in both sides, Remove SJ.
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND (j1.a*j1.a) = (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int
+  AND (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int = (j2.a*j2.a);
+                                                            QUERY PLAN                                                            
+----------------------------------------------------------------------------------------------------------------------------------
+ Seq Scan on sj j2
+   Filter: ((b IS NOT NULL) AND ((a * a) = (((EXTRACT(dow FROM CURRENT_TIMESTAMP(0)) / '15'::numeric) + '3'::numeric))::integer))
+(2 rows)
+
+-- Empty set of rows should be returned
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND (j1.a*j1.a) = (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int
+  AND (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int = (j2.a*j2.a);
+ a | b | c | a | b | c 
+---+---+---+---+---+---
+(0 rows)
+
+-- Restriction contains volatile function - disable SJE feature.
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND (j1.a*j1.c/3) = (random()/3 + 3)::int
+  AND (random()/3 + 3)::int = (j2.a*j2.c/3);
+                                                QUERY PLAN                                                 
+-----------------------------------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (j1.b = j2.b)
+   ->  Seq Scan on sj j1
+         Filter: (((a * c) / 3) = (((random() / '3'::double precision) + '3'::double precision))::integer)
+   ->  Seq Scan on sj j2
+         Filter: ((((random() / '3'::double precision) + '3'::double precision))::integer = ((a * c) / 3))
+(6 rows)
+
+-- Return one row
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND (j1.a*j1.c/3) = (random()/3 + 3)::int
+  AND (random()/3 + 3)::int = (j2.a*j2.c/3);
+ a | b | c | a | b | c 
+---+---+---+---+---+---
+ 3 | 1 | 3 | 3 | 1 | 3
+(1 row)
+
+-- Multiple filters
+CREATE UNIQUE INDEX sj_temp_idx1 ON sj(a,b,c);
+-- Remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+	WHERE j1.b = j2.b AND j1.a = 2 AND j1.c = 3 AND j2.a = 2 AND 3 = j2.c;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Seq Scan on sj j2
+   Filter: ((b IS NOT NULL) AND (a = 2) AND (c = 3))
+(2 rows)
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+	SELECT * FROM sj j1, sj j2
+	WHERE j1.b = j2.b AND 2 = j1.a AND j1.c = 3 AND j2.a = 1 AND 3 = j2.c;
+              QUERY PLAN               
+---------------------------------------
+ Nested Loop
+   Join Filter: (j1.b = j2.b)
+   ->  Seq Scan on sj j1
+         Filter: ((2 = a) AND (c = 3))
+   ->  Seq Scan on sj j2
+         Filter: ((c = 3) AND (a = 1))
+(6 rows)
+
+CREATE UNIQUE INDEX sj_temp_idx ON sj(a,b);
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2;
+          QUERY PLAN          
+------------------------------
+ Nested Loop
+   Join Filter: (j1.b = j2.b)
+   ->  Seq Scan on sj j1
+         Filter: (a = 2)
+   ->  Seq Scan on sj j2
+(5 rows)
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND 2 = j2.a;
+          QUERY PLAN          
+------------------------------
+ Nested Loop
+   Join Filter: (j1.b = j2.b)
+   ->  Seq Scan on sj j2
+         Filter: (2 = a)
+   ->  Seq Scan on sj j1
+(5 rows)
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND (j1.a = 1 OR j2.a = 1);
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ Nested Loop
+   Join Filter: ((j1.b = j2.b) AND ((j1.a = 1) OR (j2.a = 1)))
+   ->  Seq Scan on sj j1
+   ->  Materialize
+         ->  Seq Scan on sj j2
+(5 rows)
+
+DROP INDEX sj_fn_idx, sj_temp_idx1, sj_temp_idx;
+-- Test that OR predicated are updated correctly after join removal
+CREATE TABLE tab_with_flag ( id INT PRIMARY KEY, is_flag SMALLINT);
+CREATE INDEX idx_test_is_flag ON tab_with_flag (is_flag);
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM tab_with_flag
+WHERE
+	(is_flag IS NULL OR is_flag = 0)
+	AND id IN (SELECT id FROM tab_with_flag WHERE id IN (2, 3));
+                        QUERY PLAN                         
+-----------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on tab_with_flag
+         Recheck Cond: (id = ANY ('{2,3}'::integer[]))
+         Filter: ((is_flag IS NULL) OR (is_flag = 0))
+         ->  Bitmap Index Scan on tab_with_flag_pkey
+               Index Cond: (id = ANY ('{2,3}'::integer[]))
+(6 rows)
+
+DROP TABLE tab_with_flag;
+-- HAVING clause
+explain (costs off)
+select p.b from sj p join sj q on p.a = q.a group by p.b having sum(p.a) = 1;
+           QUERY PLAN            
+---------------------------------
+ HashAggregate
+   Group Key: q.b
+   Filter: (sum(q.a) = 1)
+   ->  Seq Scan on sj q
+         Filter: (a IS NOT NULL)
+(5 rows)
+
+-- update lateral references and range table entry reference
+explain (verbose, costs off)
+select 1 from (select x.* from sj x, sj y where x.a = y.a) q,
+  lateral generate_series(1, q.a) gs(i);
+                      QUERY PLAN                      
+------------------------------------------------------
+ Nested Loop
+   Output: 1
+   ->  Seq Scan on public.sj y
+         Output: y.a, y.b, y.c
+         Filter: (y.a IS NOT NULL)
+   ->  Function Scan on pg_catalog.generate_series gs
+         Output: gs.i
+         Function Call: generate_series(1, y.a)
+(8 rows)
+
+explain (verbose, costs off)
+select 1 from (select y.* from sj x, sj y where x.a = y.a) q,
+  lateral generate_series(1, q.a) gs(i);
+                      QUERY PLAN                      
+------------------------------------------------------
+ Nested Loop
+   Output: 1
+   ->  Seq Scan on public.sj y
+         Output: y.a, y.b, y.c
+         Filter: (y.a IS NOT NULL)
+   ->  Function Scan on pg_catalog.generate_series gs
+         Output: gs.i
+         Function Call: generate_series(1, y.a)
+(8 rows)
+
+-- Test that a non-EC-derived join clause is processed correctly. Use an
+-- outer join so that we can't form an EC.
+explain (costs off) select * from sj p join sj q on p.a = q.a
+  left join sj r on p.a + q.a = r.a;
+             QUERY PLAN             
+------------------------------------
+ Nested Loop Left Join
+   Join Filter: ((q.a + q.a) = r.a)
+   ->  Seq Scan on sj q
+         Filter: (a IS NOT NULL)
+   ->  Materialize
+         ->  Seq Scan on sj r
+(6 rows)
+
+-- FIXME this constant false filter doesn't look good. Should we merge
+-- equivalence classes?
+explain (costs off)
+select * from sj p, sj q where p.a = q.a and p.b = 1 and q.b = 2;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Seq Scan on sj q
+   Filter: ((a IS NOT NULL) AND (b = 2) AND (b = 1))
+(2 rows)
+
+-- Check that attr_needed is updated correctly after self-join removal. In this
+-- test, the join of j1 with j2 is removed. k1.b is required at either j1 or j2.
+-- If this info is lost, join targetlist for (k1, k2) will not contain k1.b.
+-- Use index scan for k1 so that we don't get 'b' from physical tlist used for
+-- seqscan. Also disable reordering of joins because this test depends on a
+-- particular join tree.
+create table sk (a int, b int);
+create index on sk(a);
+set join_collapse_limit to 1;
+set enable_seqscan to off;
+explain (costs off) select 1 from
+	(sk k1 join sk k2 on k1.a = k2.a)
+	join (sj j1 join sj j2 on j1.a = j2.a) on j1.b = k1.b;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop
+   Join Filter: (k1.b = j2.b)
+   ->  Nested Loop
+         ->  Index Scan using sk_a_idx on sk k1
+         ->  Index Only Scan using sk_a_idx on sk k2
+               Index Cond: (a = k1.a)
+   ->  Materialize
+         ->  Index Scan using sj_a_key on sj j2
+               Index Cond: (a IS NOT NULL)
+(9 rows)
+
+explain (costs off) select 1 from
+	(sk k1 join sk k2 on k1.a = k2.a)
+	join (sj j1 join sj j2 on j1.a = j2.a) on j2.b = k1.b;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Nested Loop
+   Join Filter: (k1.b = j2.b)
+   ->  Nested Loop
+         ->  Index Scan using sk_a_idx on sk k1
+         ->  Index Only Scan using sk_a_idx on sk k2
+               Index Cond: (a = k1.a)
+   ->  Materialize
+         ->  Index Scan using sj_a_key on sj j2
+               Index Cond: (a IS NOT NULL)
+(9 rows)
+
+reset join_collapse_limit;
+reset enable_seqscan;
+-- Check that clauses from the join filter list is not lost on the self-join removal
+CREATE TABLE emp1 (id SERIAL PRIMARY KEY NOT NULL, code int);
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT * FROM emp1 e1, emp1 e2 WHERE e1.id = e2.id AND e2.code <> e1.code;
+                QUERY PLAN                
+------------------------------------------
+ Seq Scan on public.emp1 e2
+   Output: e2.id, e2.code, e2.id, e2.code
+   Filter: (e2.code <> e2.code)
+(3 rows)
+
+-- Shuffle self-joined relations. Only in the case of iterative deletion
+-- attempts explains of these queries will be identical.
+CREATE UNIQUE INDEX ON emp1((id*id));
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM emp1 c1, emp1 c2, emp1 c3
+WHERE c1.id=c2.id AND c1.id*c2.id=c3.id*c3.id;
+               QUERY PLAN                
+-----------------------------------------
+ Aggregate
+   ->  Seq Scan on emp1 c3
+         Filter: ((id * id) IS NOT NULL)
+(3 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM emp1 c1, emp1 c2, emp1 c3
+WHERE c1.id=c3.id AND c1.id*c3.id=c2.id*c2.id;
+               QUERY PLAN                
+-----------------------------------------
+ Aggregate
+   ->  Seq Scan on emp1 c3
+         Filter: ((id * id) IS NOT NULL)
+(3 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM emp1 c1, emp1 c2, emp1 c3
+WHERE c3.id=c2.id AND c3.id*c2.id=c1.id*c1.id;
+               QUERY PLAN                
+-----------------------------------------
+ Aggregate
+   ->  Seq Scan on emp1 c3
+         Filter: ((id * id) IS NOT NULL)
+(3 rows)
+
+-- Check the usage of a parse tree by the set operations (bug #18170)
+EXPLAIN (COSTS OFF)
+SELECT c1.code FROM emp1 c1 LEFT JOIN emp1 c2 ON c1.id = c2.id
+WHERE c2.id IS NOT NULL
+EXCEPT ALL
+SELECT c3.code FROM emp1 c3;
+        QUERY PLAN         
+---------------------------
+ HashSetOp Except All
+   ->  Seq Scan on emp1 c2
+   ->  Seq Scan on emp1 c3
+(3 rows)
+
+-- Check that SJE removes references from PHVs correctly
+explain (costs off)
+select * from emp1 t1 left join
+    (select coalesce(t3.code, 1) from emp1 t2
+        left join (emp1 t3 join emp1 t4 on t3.id = t4.id)
+        on true)
+on true;
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop Left Join
+   ->  Seq Scan on emp1 t1
+   ->  Materialize
+         ->  Nested Loop Left Join
+               ->  Seq Scan on emp1 t2
+               ->  Materialize
+                     ->  Seq Scan on emp1 t4
+(7 rows)
+
+-- Check that SJE removes the whole PHVs correctly
+explain (verbose, costs off)
+select 1 from emp1 t1 left join
+    ((select 1 as x, * from emp1 t2) s1 inner join
+        (select * from emp1 t3) s2 on s1.id = s2.id)
+    on true
+where s1.x = 1;
+               QUERY PLAN               
+----------------------------------------
+ Nested Loop
+   Output: 1
+   ->  Seq Scan on public.emp1 t1
+         Output: t1.id, t1.code
+   ->  Materialize
+         Output: t3.id
+         ->  Seq Scan on public.emp1 t3
+               Output: t3.id
+               Filter: (1 = 1)
+(9 rows)
+
+-- Check that PHVs do not impose any constraints on removing self joins
+explain (verbose, costs off)
+select * from emp1 t1 join emp1 t2 on t1.id = t2.id left join
+    lateral (select t1.id as t1id, * from generate_series(1,1) t3) s on true;
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Nested Loop Left Join
+   Output: t2.id, t2.code, t2.id, t2.code, (t2.id), t3.t3
+   ->  Seq Scan on public.emp1 t2
+         Output: t2.id, t2.code
+   ->  Function Scan on pg_catalog.generate_series t3
+         Output: t3.t3, t2.id
+         Function Call: generate_series(1, 1)
+(7 rows)
+
+explain (verbose, costs off)
+select * from generate_series(1,10) t1(id) left join
+    lateral (select t1.id as t1id, t2.id from emp1 t2 join emp1 t3 on t2.id = t3.id)
+on true;
+                      QUERY PLAN                      
+------------------------------------------------------
+ Nested Loop Left Join
+   Output: t1.id, (t1.id), t3.id
+   ->  Function Scan on pg_catalog.generate_series t1
+         Output: t1.id
+         Function Call: generate_series(1, 10)
+   ->  Seq Scan on public.emp1 t3
+         Output: t3.id, t1.id
+(7 rows)
+
+-- Check that SJE replaces join clauses involving the removed rel correctly
+explain (costs off)
+select * from emp1 t1
+   inner join emp1 t2 on t1.id = t2.id
+    left join emp1 t3 on t1.id > 1 and t1.id < 2;
+                  QUERY PLAN                  
+----------------------------------------------
+ Nested Loop Left Join
+   Join Filter: ((t2.id > 1) AND (t2.id < 2))
+   ->  Seq Scan on emp1 t2
+   ->  Materialize
+         ->  Seq Scan on emp1 t3
+(5 rows)
+
+-- Check that SJE doesn't replace the target relation
+EXPLAIN (COSTS OFF)
+WITH t1 AS (SELECT * FROM emp1)
+UPDATE emp1 SET code = t1.code + 1 FROM t1
+WHERE t1.id = emp1.id RETURNING emp1.id, emp1.code, t1.code;
+                      QUERY PLAN                       
+-------------------------------------------------------
+ Update on emp1
+   ->  Nested Loop
+         ->  Seq Scan on emp1
+         ->  Index Scan using emp1_pkey on emp1 emp1_1
+               Index Cond: (id = emp1.id)
+(5 rows)
+
+INSERT INTO emp1 VALUES (1, 1), (2, 1);
+WITH t1 AS (SELECT * FROM emp1)
+UPDATE emp1 SET code = t1.code + 1 FROM t1
+WHERE t1.id = emp1.id RETURNING emp1.id, emp1.code, t1.code;
+ id | code | code 
+----+------+------
+  1 |    2 |    1
+  2 |    2 |    1
+(2 rows)
+
+TRUNCATE emp1;
+EXPLAIN (COSTS OFF)
+UPDATE sj sq SET b = 1 FROM sj as sz WHERE sq.a = sz.a;
+             QUERY PLAN              
+-------------------------------------
+ Update on sj sq
+   ->  Nested Loop
+         Join Filter: (sq.a = sz.a)
+         ->  Seq Scan on sj sq
+         ->  Materialize
+               ->  Seq Scan on sj sz
+(6 rows)
+
+CREATE RULE sj_del_rule AS ON DELETE TO sj
+  DO INSTEAD
+    UPDATE sj SET a = 1 WHERE a = old.a;
+EXPLAIN (COSTS OFF) DELETE FROM sj;
+              QUERY PLAN              
+--------------------------------------
+ Update on sj sj_1
+   ->  Nested Loop
+         Join Filter: (sj.a = sj_1.a)
+         ->  Seq Scan on sj sj_1
+         ->  Materialize
+               ->  Seq Scan on sj
+(6 rows)
+
+DROP RULE sj_del_rule ON sj CASCADE;
+-- Check that SJE does not mistakenly omit qual clauses (bug #18187)
+insert into emp1 values (1, 1);
+explain (costs off)
+select 1 from emp1 full join
+    (select * from emp1 t1 join
+        emp1 t2 join emp1 t3 on t2.id = t3.id
+        on true
+    where false) s on true
+where false;
+        QUERY PLAN        
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
+select 1 from emp1 full join
+    (select * from emp1 t1 join
+        emp1 t2 join emp1 t3 on t2.id = t3.id
+        on true
+    where false) s on true
+where false;
+ ?column? 
+----------
+(0 rows)
+
+-- Check that SJE does not mistakenly re-use knowledge of relation uniqueness
+-- made with different set of quals
+insert into emp1 values (2, 1);
+explain (costs off)
+select * from emp1 t1 where exists (select * from emp1 t2
+                                    where t2.id = t1.code and t2.code > 0);
+                 QUERY PLAN                  
+---------------------------------------------
+ Nested Loop
+   ->  Seq Scan on emp1 t1
+   ->  Index Scan using emp1_pkey on emp1 t2
+         Index Cond: (id = t1.code)
+         Filter: (code > 0)
+(5 rows)
+
+select * from emp1 t1 where exists (select * from emp1 t2
+                                    where t2.id = t1.code and t2.code > 0);
+ id | code 
+----+------
+  1 |    1
+  2 |    1
+(2 rows)
+
+-- We can remove the join even if we find the join can't duplicate rows and
+-- the base quals of each side are different.  In the following case we end up
+-- moving quals over to s1 to make it so it can't match any rows.
+create table sl(a int, b int, c int);
+create unique index on sl(a, b);
+vacuum analyze sl;
+-- Both sides are unique, but base quals are different
+explain (costs off)
+select * from sl t1, sl t2 where t1.a = t2.a and t1.b = 1 and t2.b = 2;
+          QUERY PLAN          
+------------------------------
+ Nested Loop
+   Join Filter: (t1.a = t2.a)
+   ->  Seq Scan on sl t1
+         Filter: (b = 1)
+   ->  Seq Scan on sl t2
+         Filter: (b = 2)
+(6 rows)
+
+-- Check NullTest in baserestrictinfo list
+explain (costs off)
+select * from sl t1, sl t2
+where t1.a = t2.a and t1.b = 1 and t2.b = 2
+  and t1.c IS NOT NULL and t2.c IS NOT NULL
+  and t2.b IS NOT NULL and t1.b IS NOT NULL
+  and t1.a IS NOT NULL and t2.a IS NOT NULL;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (t1.a = t2.a)
+   ->  Seq Scan on sl t1
+         Filter: ((c IS NOT NULL) AND (b IS NOT NULL) AND (a IS NOT NULL) AND (b = 1))
+   ->  Seq Scan on sl t2
+         Filter: ((c IS NOT NULL) AND (b IS NOT NULL) AND (a IS NOT NULL) AND (b = 2))
+(6 rows)
+
+explain (verbose, costs off)
+select * from sl t1, sl t2
+where t1.b = t2.b and t2.a = 3 and t1.a = 3
+  and t1.c IS NOT NULL and t2.c IS NOT NULL
+  and t2.b IS NOT NULL and t1.b IS NOT NULL
+  and t1.a IS NOT NULL and t2.a IS NOT NULL;
+                                         QUERY PLAN                                          
+---------------------------------------------------------------------------------------------
+ Seq Scan on public.sl t2
+   Output: t2.a, t2.b, t2.c, t2.a, t2.b, t2.c
+   Filter: ((t2.c IS NOT NULL) AND (t2.b IS NOT NULL) AND (t2.a IS NOT NULL) AND (t2.a = 3))
+(3 rows)
+
+-- Join qual isn't mergejoinable, but inner is unique.
+EXPLAIN (COSTS OFF)
+SELECT n2.a FROM sj n1, sj n2 WHERE n1.a <> n2.a AND n2.a = 1;
+          QUERY PLAN           
+-------------------------------
+ Nested Loop
+   Join Filter: (n1.a <> n2.a)
+   ->  Seq Scan on sj n2
+         Filter: (a = 1)
+   ->  Seq Scan on sj n1
+(5 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+(SELECT n2.a FROM sj n1, sj n2 WHERE n1.a <> n2.a) q0, sl
+WHERE q0.a = 1;
+          QUERY PLAN           
+-------------------------------
+ Nested Loop
+   Join Filter: (n1.a <> n2.a)
+   ->  Nested Loop
+         ->  Seq Scan on sl
+         ->  Seq Scan on sj n2
+               Filter: (a = 1)
+   ->  Seq Scan on sj n1
+(7 rows)
+
+-- Check optimization disabling if it will violate special join conditions.
+-- Two identical joined relations satisfies self join removal conditions but
+-- stay in different special join infos.
+CREATE TABLE sj_t1 (id serial, a int);
+CREATE TABLE sj_t2 (id serial, a int);
+CREATE TABLE sj_t3 (id serial, a int);
+CREATE TABLE sj_t4 (id serial, a int);
+CREATE UNIQUE INDEX ON sj_t3 USING btree (a,id);
+CREATE UNIQUE INDEX ON sj_t2 USING btree (id);
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj_t1
+JOIN (
+	SELECT sj_t2.id AS id FROM sj_t2
+	WHERE EXISTS
+		(
+		SELECT TRUE FROM sj_t3,sj_t4 WHERE sj_t3.a = 1 AND sj_t3.id = sj_t2.id
+		)
+	) t2t3t4
+ON sj_t1.id = t2t3t4.id
+JOIN (
+	SELECT sj_t2.id AS id FROM sj_t2
+	WHERE EXISTS
+		(
+		SELECT TRUE FROM sj_t3,sj_t4 WHERE sj_t3.a = 1 AND sj_t3.id = sj_t2.id
+		)
+	) _t2t3t4
+ON sj_t1.id = _t2t3t4.id;
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Nested Loop
+   Join Filter: (sj_t3.id = sj_t1.id)
+   ->  Nested Loop
+         Join Filter: (sj_t2.id = sj_t3.id)
+         ->  Nested Loop Semi Join
+               ->  Nested Loop
+                     ->  HashAggregate
+                           Group Key: sj_t3.id
+                           ->  Nested Loop
+                                 ->  Seq Scan on sj_t4
+                                 ->  Materialize
+                                       ->  Bitmap Heap Scan on sj_t3
+                                             Recheck Cond: (a = 1)
+                                             ->  Bitmap Index Scan on sj_t3_a_id_idx
+                                                   Index Cond: (a = 1)
+                     ->  Index Only Scan using sj_t2_id_idx on sj_t2 sj_t2_1
+                           Index Cond: (id = sj_t3.id)
+               ->  Nested Loop
+                     ->  Index Only Scan using sj_t3_a_id_idx on sj_t3 sj_t3_1
+                           Index Cond: ((a = 1) AND (id = sj_t3.id))
+                     ->  Seq Scan on sj_t4 sj_t4_1
+         ->  Index Only Scan using sj_t2_id_idx on sj_t2
+               Index Cond: (id = sj_t2_1.id)
+   ->  Seq Scan on sj_t1
+(24 rows)
+
+--
+-- Test RowMarks-related code
+--
+-- Both sides have explicit LockRows marks
+EXPLAIN (COSTS OFF)
+SELECT a1.a FROM sj a1,sj a2 WHERE (a1.a=a2.a) FOR UPDATE;
+           QUERY PLAN            
+---------------------------------
+ LockRows
+   ->  Seq Scan on sj a2
+         Filter: (a IS NOT NULL)
+(3 rows)
+
+reset enable_hashjoin;
+reset enable_mergejoin;
 --
 -- Test hints given on incorrect column references are useful
 --
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 352abc0bd42..83228cfca29 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -168,10 +168,11 @@ select name, setting from pg_settings where name like 'enable%';
  enable_partitionwise_aggregate | off
  enable_partitionwise_join      | off
  enable_presorted_aggregate     | on
+ enable_self_join_elimination   | on
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(23 rows)
+(24 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/sql/equivclass.sql b/src/test/regress/sql/equivclass.sql
index 28ed7910d01..7fc2159349b 100644
--- a/src/test/regress/sql/equivclass.sql
+++ b/src/test/regress/sql/equivclass.sql
@@ -259,6 +259,22 @@ drop user regress_user_ectest;
 explain (costs off)
   select * from tenk1 where unique1 = unique1 and unique2 = unique2;
 
+-- Test that broken ECs are processed correctly during self join removal.
+-- Disable merge joins so that we don't get an error about missing commutator.
+-- Test both orientations of the join clause, because only one of them breaks
+-- the EC.
+set enable_mergejoin to off;
+
+explain (costs off)
+  select * from ec0 m join ec0 n on m.ff = n.ff
+  join ec1 p on m.ff + n.ff = p.f1;
+
+explain (costs off)
+  select * from ec0 m join ec0 n on m.ff = n.ff
+  join ec1 p on p.f1::int8 = (m.ff + n.ff)::int8alias1;
+
+reset enable_mergejoin;
+
 -- this could be converted, but isn't at present
 explain (costs off)
   select * from tenk1 where unique1 = unique1 or unique2 = unique2;
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index c7349eab933..c29d13b9fed 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -2175,6 +2175,17 @@ select c.id, ss.a from c
   left join (select d.a from onerow, d left join b on d.a = b.id) ss
   on c.id = ss.a;
 
+-- check the case when the placeholder relates to an outer join and its
+-- inner in the press field but actually uses only the outer side of the join
+explain (costs off)
+SELECT q.val FROM b LEFT JOIN (
+  SELECT (q1.z IS NOT NULL) AS val
+  FROM b LEFT JOIN (
+    SELECT (t1.b_id IS NOT NULL) AS z FROM a t1 LEFT JOIN a t2 USING (id)
+    ) AS q1
+  ON true
+) AS q ON true;
+
 CREATE TEMP TABLE parted_b (id int PRIMARY KEY) partition by range(id);
 CREATE TEMP TABLE parted_b1 partition of parted_b for values from (0) to (10);
 
@@ -2409,6 +2420,489 @@ select * from
 select * from
   int8_tbl x join (int4_tbl x cross join int4_tbl y(ff)) j on q1 = f1; -- ok
 
+--
+-- test that semi- or inner self-joins on a unique column are removed
+--
+
+-- enable only nestloop to get more predictable plans
+set enable_hashjoin to off;
+set enable_mergejoin to off;
+
+create table sj (a int unique, b int, c int unique);
+insert into sj values (1, null, 2), (null, 2, null), (2, 1, 1);
+analyze sj;
+
+-- Trivial self-join case.
+explain (costs off)
+select p.* from sj p, sj q where q.a = p.a and q.b = q.a - 1;
+select p.* from sj p, sj q where q.a = p.a and q.b = q.a - 1;
+
+-- Self-join removal performs after a subquery pull-up process and could remove
+-- such kind of self-join too. Check this option.
+explain (costs off)
+select * from sj p
+where exists (select * from sj q
+              where q.a = p.a and q.b < 10);
+select * from sj p
+where exists (select * from sj q
+              where q.a = p.a and q.b < 10);
+
+-- Don't remove self-join for the case of equality of two different unique columns.
+explain (costs off)
+select * from sj t1, sj t2 where t1.a = t2.c and t1.b is not null;
+
+-- Ensure that relations with TABLESAMPLE clauses are not considered as
+-- candidates to be removed
+explain (costs off)
+select * from sj t1
+    join lateral
+      (select * from sj tablesample system(t1.b)) s
+    on t1.a = s.a;
+
+-- Ensure that SJE does not form a self-referential lateral dependency
+explain (costs off)
+select * from sj t1
+    left join lateral
+      (select t1.a as t1a, * from sj t2) s
+    on true
+where t1.a = s.a;
+
+-- Degenerated case.
+explain (costs off)
+select * from
+  (select a as x from sj where false) as q1,
+  (select a as y from sj where false) as q2
+where q1.x = q2.y;
+
+-- We can't use a cross-EC generated self join qual because of current logic of
+-- the generate_join_implied_equalities routine.
+explain (costs off)
+select * from sj t1, sj t2 where t1.a = t1.b and t1.b = t2.b and t2.b = t2.a;
+explain (costs off)
+select * from sj t1, sj t2, sj t3
+where t1.a = t1.b and t1.b = t2.b and t2.b = t2.a and
+      t1.b = t3.b and t3.b = t3.a;
+
+-- Double self-join removal.
+-- Use a condition on "b + 1", not on "b", for the second join, so that
+-- the equivalence class is different from the first one, and we can
+-- test the non-ec code path.
+explain (costs off)
+select *
+from  sj t1
+      join sj t2 on t1.a = t2.a and t1.b = t2.b
+	  join sj t3 on t2.a = t3.a and t2.b + 1 = t3.b + 1;
+
+-- subselect that references the removed relation
+explain (costs off)
+select t1.a, (select a from sj where a = t2.a and a = t1.a)
+from sj t1, sj t2
+where t1.a = t2.a;
+
+-- self-join under outer join
+explain (costs off)
+select * from sj x join sj y on x.a = y.a
+left join int8_tbl z on x.a = z.q1;
+
+explain (costs off)
+select * from sj x join sj y on x.a = y.a
+left join int8_tbl z on y.a = z.q1;
+
+explain (costs off)
+select * from (
+  select t1.*, t2.a as ax from sj t1 join sj t2
+  on (t1.a = t2.a and t1.c * t1.c = t2.c + 2 and t2.b is null)
+) as q1
+left join
+  (select t3.* from sj t3, sj t4 where t3.c = t4.c) as q2
+on q1.ax = q2.a;
+
+-- Test that placeholders are updated correctly after join removal
+explain (costs off)
+select * from (values (1)) x
+left join (select coalesce(y.q1, 1) from int8_tbl y
+	right join sj j1 inner join sj j2 on j1.a = j2.a
+	on true) z
+on true;
+
+-- Test that references to the removed rel in lateral subqueries are replaced
+-- correctly after join removal
+explain (verbose, costs off)
+select t3.a from sj t1
+	join sj t2 on t1.a = t2.a
+	join lateral (select t1.a offset 0) t3 on true;
+
+explain (verbose, costs off)
+select t3.a from sj t1
+	join sj t2 on t1.a = t2.a
+	join lateral (select * from (select t1.a offset 0) offset 0) t3 on true;
+
+explain (verbose, costs off)
+select t4.a from sj t1
+	join sj t2 on t1.a = t2.a
+	join lateral (select t3.a from sj t3, (select t1.a) offset 0) t4 on true;
+
+-- Check updating of semi_rhs_exprs links from upper-level semi join to
+-- the removing relation
+explain (verbose, costs off)
+select t1.a from sj t1 where t1.b in (
+  select t2.b from sj t2 join sj t3 on t2.c=t3.c);
+
+--
+-- SJE corner case: uniqueness of an inner is [partially] derived from
+-- baserestrictinfo clauses.
+-- XXX: We really should allow SJE for these corner cases?
+--
+
+INSERT INTO sj VALUES (3, 1, 3);
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2 AND j2.a = 3;
+-- Return one row
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2 AND j2.a = 3;
+
+-- Remove SJ, define uniqueness by a constant
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2 AND j2.a = 2;
+-- Return one row
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2 AND j2.a = 2;
+
+-- Remove SJ, define uniqueness by a constant expression
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND j1.a = (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int
+  AND (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int = j2.a;
+-- Return one row
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND j1.a = (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int
+  AND (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int = j2.a;
+
+-- Remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 1 AND j2.a = 1;
+-- Return no rows
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 1 AND j2.a = 1;
+
+-- Shuffle a clause. Remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND 1 = j1.a AND j2.a = 1;
+-- Return no rows
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND 1 = j1.a AND j2.a = 1;
+
+-- SJE Corner case: a 'a.x=a.x' clause, have replaced with 'a.x IS NOT NULL'
+-- after SJ elimination it shouldn't be a mergejoinable clause.
+EXPLAIN (COSTS OFF)
+SELECT t4.*
+FROM (SELECT t1.*, t2.a AS a1 FROM sj t1, sj t2 WHERE t1.b = t2.b) AS t3
+JOIN sj t4 ON (t4.a = t3.a) WHERE t3.a1 = 42;
+SELECT t4.*
+FROM (SELECT t1.*, t2.a AS a1 FROM sj t1, sj t2 WHERE t1.b = t2.b) AS t3
+JOIN sj t4 ON (t4.a = t3.a) WHERE t3.a1 = 42;
+
+-- Functional index
+CREATE UNIQUE INDEX sj_fn_idx ON sj((a * a));
+
+-- Remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+	WHERE j1.b = j2.b AND j1.a*j1.a = 1 AND j2.a*j2.a = 1;
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+	WHERE j1.b = j2.b AND j1.a*j1.a = 1 AND j2.a*j2.a = 2;
+
+-- Restriction contains expressions in both sides, Remove SJ.
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND (j1.a*j1.a) = (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int
+  AND (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int = (j2.a*j2.a);
+-- Empty set of rows should be returned
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND (j1.a*j1.a) = (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int
+  AND (EXTRACT(DOW FROM current_timestamp(0))/15 + 3)::int = (j2.a*j2.a);
+
+-- Restriction contains volatile function - disable SJE feature.
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND (j1.a*j1.c/3) = (random()/3 + 3)::int
+  AND (random()/3 + 3)::int = (j2.a*j2.c/3);
+-- Return one row
+SELECT * FROM sj j1, sj j2
+WHERE j1.b = j2.b
+  AND (j1.a*j1.c/3) = (random()/3 + 3)::int
+  AND (random()/3 + 3)::int = (j2.a*j2.c/3);
+
+-- Multiple filters
+CREATE UNIQUE INDEX sj_temp_idx1 ON sj(a,b,c);
+
+-- Remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2
+	WHERE j1.b = j2.b AND j1.a = 2 AND j1.c = 3 AND j2.a = 2 AND 3 = j2.c;
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+	SELECT * FROM sj j1, sj j2
+	WHERE j1.b = j2.b AND 2 = j1.a AND j1.c = 3 AND j2.a = 1 AND 3 = j2.c;
+
+CREATE UNIQUE INDEX sj_temp_idx ON sj(a,b);
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND j1.a = 2;
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND 2 = j2.a;
+
+-- Don't remove SJ
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj j1, sj j2 WHERE j1.b = j2.b AND (j1.a = 1 OR j2.a = 1);
+
+DROP INDEX sj_fn_idx, sj_temp_idx1, sj_temp_idx;
+
+-- Test that OR predicated are updated correctly after join removal
+CREATE TABLE tab_with_flag ( id INT PRIMARY KEY, is_flag SMALLINT);
+CREATE INDEX idx_test_is_flag ON tab_with_flag (is_flag);
+
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) FROM tab_with_flag
+WHERE
+	(is_flag IS NULL OR is_flag = 0)
+	AND id IN (SELECT id FROM tab_with_flag WHERE id IN (2, 3));
+DROP TABLE tab_with_flag;
+
+-- HAVING clause
+explain (costs off)
+select p.b from sj p join sj q on p.a = q.a group by p.b having sum(p.a) = 1;
+
+-- update lateral references and range table entry reference
+explain (verbose, costs off)
+select 1 from (select x.* from sj x, sj y where x.a = y.a) q,
+  lateral generate_series(1, q.a) gs(i);
+
+explain (verbose, costs off)
+select 1 from (select y.* from sj x, sj y where x.a = y.a) q,
+  lateral generate_series(1, q.a) gs(i);
+
+-- Test that a non-EC-derived join clause is processed correctly. Use an
+-- outer join so that we can't form an EC.
+explain (costs off) select * from sj p join sj q on p.a = q.a
+  left join sj r on p.a + q.a = r.a;
+
+-- FIXME this constant false filter doesn't look good. Should we merge
+-- equivalence classes?
+explain (costs off)
+select * from sj p, sj q where p.a = q.a and p.b = 1 and q.b = 2;
+
+-- Check that attr_needed is updated correctly after self-join removal. In this
+-- test, the join of j1 with j2 is removed. k1.b is required at either j1 or j2.
+-- If this info is lost, join targetlist for (k1, k2) will not contain k1.b.
+-- Use index scan for k1 so that we don't get 'b' from physical tlist used for
+-- seqscan. Also disable reordering of joins because this test depends on a
+-- particular join tree.
+create table sk (a int, b int);
+create index on sk(a);
+set join_collapse_limit to 1;
+set enable_seqscan to off;
+explain (costs off) select 1 from
+	(sk k1 join sk k2 on k1.a = k2.a)
+	join (sj j1 join sj j2 on j1.a = j2.a) on j1.b = k1.b;
+explain (costs off) select 1 from
+	(sk k1 join sk k2 on k1.a = k2.a)
+	join (sj j1 join sj j2 on j1.a = j2.a) on j2.b = k1.b;
+reset join_collapse_limit;
+reset enable_seqscan;
+
+-- Check that clauses from the join filter list is not lost on the self-join removal
+CREATE TABLE emp1 (id SERIAL PRIMARY KEY NOT NULL, code int);
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT * FROM emp1 e1, emp1 e2 WHERE e1.id = e2.id AND e2.code <> e1.code;
+
+-- Shuffle self-joined relations. Only in the case of iterative deletion
+-- attempts explains of these queries will be identical.
+CREATE UNIQUE INDEX ON emp1((id*id));
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM emp1 c1, emp1 c2, emp1 c3
+WHERE c1.id=c2.id AND c1.id*c2.id=c3.id*c3.id;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM emp1 c1, emp1 c2, emp1 c3
+WHERE c1.id=c3.id AND c1.id*c3.id=c2.id*c2.id;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM emp1 c1, emp1 c2, emp1 c3
+WHERE c3.id=c2.id AND c3.id*c2.id=c1.id*c1.id;
+
+-- Check the usage of a parse tree by the set operations (bug #18170)
+EXPLAIN (COSTS OFF)
+SELECT c1.code FROM emp1 c1 LEFT JOIN emp1 c2 ON c1.id = c2.id
+WHERE c2.id IS NOT NULL
+EXCEPT ALL
+SELECT c3.code FROM emp1 c3;
+
+-- Check that SJE removes references from PHVs correctly
+explain (costs off)
+select * from emp1 t1 left join
+    (select coalesce(t3.code, 1) from emp1 t2
+        left join (emp1 t3 join emp1 t4 on t3.id = t4.id)
+        on true)
+on true;
+
+-- Check that SJE removes the whole PHVs correctly
+explain (verbose, costs off)
+select 1 from emp1 t1 left join
+    ((select 1 as x, * from emp1 t2) s1 inner join
+        (select * from emp1 t3) s2 on s1.id = s2.id)
+    on true
+where s1.x = 1;
+
+-- Check that PHVs do not impose any constraints on removing self joins
+explain (verbose, costs off)
+select * from emp1 t1 join emp1 t2 on t1.id = t2.id left join
+    lateral (select t1.id as t1id, * from generate_series(1,1) t3) s on true;
+
+explain (verbose, costs off)
+select * from generate_series(1,10) t1(id) left join
+    lateral (select t1.id as t1id, t2.id from emp1 t2 join emp1 t3 on t2.id = t3.id)
+on true;
+
+-- Check that SJE replaces join clauses involving the removed rel correctly
+explain (costs off)
+select * from emp1 t1
+   inner join emp1 t2 on t1.id = t2.id
+    left join emp1 t3 on t1.id > 1 and t1.id < 2;
+
+-- Check that SJE doesn't replace the target relation
+EXPLAIN (COSTS OFF)
+WITH t1 AS (SELECT * FROM emp1)
+UPDATE emp1 SET code = t1.code + 1 FROM t1
+WHERE t1.id = emp1.id RETURNING emp1.id, emp1.code, t1.code;
+
+INSERT INTO emp1 VALUES (1, 1), (2, 1);
+
+WITH t1 AS (SELECT * FROM emp1)
+UPDATE emp1 SET code = t1.code + 1 FROM t1
+WHERE t1.id = emp1.id RETURNING emp1.id, emp1.code, t1.code;
+
+TRUNCATE emp1;
+
+EXPLAIN (COSTS OFF)
+UPDATE sj sq SET b = 1 FROM sj as sz WHERE sq.a = sz.a;
+
+CREATE RULE sj_del_rule AS ON DELETE TO sj
+  DO INSTEAD
+    UPDATE sj SET a = 1 WHERE a = old.a;
+EXPLAIN (COSTS OFF) DELETE FROM sj;
+DROP RULE sj_del_rule ON sj CASCADE;
+
+-- Check that SJE does not mistakenly omit qual clauses (bug #18187)
+insert into emp1 values (1, 1);
+explain (costs off)
+select 1 from emp1 full join
+    (select * from emp1 t1 join
+        emp1 t2 join emp1 t3 on t2.id = t3.id
+        on true
+    where false) s on true
+where false;
+select 1 from emp1 full join
+    (select * from emp1 t1 join
+        emp1 t2 join emp1 t3 on t2.id = t3.id
+        on true
+    where false) s on true
+where false;
+
+-- Check that SJE does not mistakenly re-use knowledge of relation uniqueness
+-- made with different set of quals
+insert into emp1 values (2, 1);
+explain (costs off)
+select * from emp1 t1 where exists (select * from emp1 t2
+                                    where t2.id = t1.code and t2.code > 0);
+select * from emp1 t1 where exists (select * from emp1 t2
+                                    where t2.id = t1.code and t2.code > 0);
+
+-- We can remove the join even if we find the join can't duplicate rows and
+-- the base quals of each side are different.  In the following case we end up
+-- moving quals over to s1 to make it so it can't match any rows.
+create table sl(a int, b int, c int);
+create unique index on sl(a, b);
+vacuum analyze sl;
+
+-- Both sides are unique, but base quals are different
+explain (costs off)
+select * from sl t1, sl t2 where t1.a = t2.a and t1.b = 1 and t2.b = 2;
+
+-- Check NullTest in baserestrictinfo list
+explain (costs off)
+select * from sl t1, sl t2
+where t1.a = t2.a and t1.b = 1 and t2.b = 2
+  and t1.c IS NOT NULL and t2.c IS NOT NULL
+  and t2.b IS NOT NULL and t1.b IS NOT NULL
+  and t1.a IS NOT NULL and t2.a IS NOT NULL;
+explain (verbose, costs off)
+select * from sl t1, sl t2
+where t1.b = t2.b and t2.a = 3 and t1.a = 3
+  and t1.c IS NOT NULL and t2.c IS NOT NULL
+  and t2.b IS NOT NULL and t1.b IS NOT NULL
+  and t1.a IS NOT NULL and t2.a IS NOT NULL;
+
+-- Join qual isn't mergejoinable, but inner is unique.
+EXPLAIN (COSTS OFF)
+SELECT n2.a FROM sj n1, sj n2 WHERE n1.a <> n2.a AND n2.a = 1;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM
+(SELECT n2.a FROM sj n1, sj n2 WHERE n1.a <> n2.a) q0, sl
+WHERE q0.a = 1;
+
+-- Check optimization disabling if it will violate special join conditions.
+-- Two identical joined relations satisfies self join removal conditions but
+-- stay in different special join infos.
+CREATE TABLE sj_t1 (id serial, a int);
+CREATE TABLE sj_t2 (id serial, a int);
+CREATE TABLE sj_t3 (id serial, a int);
+CREATE TABLE sj_t4 (id serial, a int);
+
+CREATE UNIQUE INDEX ON sj_t3 USING btree (a,id);
+CREATE UNIQUE INDEX ON sj_t2 USING btree (id);
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM sj_t1
+JOIN (
+	SELECT sj_t2.id AS id FROM sj_t2
+	WHERE EXISTS
+		(
+		SELECT TRUE FROM sj_t3,sj_t4 WHERE sj_t3.a = 1 AND sj_t3.id = sj_t2.id
+		)
+	) t2t3t4
+ON sj_t1.id = t2t3t4.id
+JOIN (
+	SELECT sj_t2.id AS id FROM sj_t2
+	WHERE EXISTS
+		(
+		SELECT TRUE FROM sj_t3,sj_t4 WHERE sj_t3.a = 1 AND sj_t3.id = sj_t2.id
+		)
+	) _t2t3t4
+ON sj_t1.id = _t2t3t4.id;
+
+--
+-- Test RowMarks-related code
+--
+
+-- Both sides have explicit LockRows marks
+EXPLAIN (COSTS OFF)
+SELECT a1.a FROM sj a1,sj a2 WHERE (a1.a=a2.a) FOR UPDATE;
+
+reset enable_hashjoin;
+reset enable_mergejoin;
+
 --
 -- Test hints given on incorrect column references are useful
 --
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b6c170ac249..bce4214503d 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2593,6 +2593,7 @@ SeenRelsEntry
 SelectLimit
 SelectStmt
 Selectivity
+SelfJoinCandidate
 SemTPadded
 SemiAntiJoinFactors
 SeqScan
@@ -4056,6 +4057,7 @@ unicodeStyleColumnFormat
 unicodeStyleFormat
 unicodeStyleRowFormat
 unicode_linestyle
+UniqueRelInfo
 unit_conversion
 unlogged_relation_entry
 utf_local_conversion_func
-- 
2.34.1



  [text/x-patch] v19-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch (71.1K, 3-v19-0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch)
  download | inline diff:
From 40ab9437e0efa6b35005c497d53dcc219acb7038 Mon Sep 17 00:00:00 2001
From: Alena Rybakina <[email protected]>
Date: Mon, 17 Feb 2025 17:13:55 +0300
Subject: [PATCH 2/4] Machinery for grabbing an extended vacuum statistics on 
 table relations.

Value of total_blks_hit, total_blks_read, total_blks_dirtied are number of
hitted, missed and dirtied pages in shared buffers during a vacuum operation
respectively.

total_blks_dirtied means 'dirtied only by this action'. So, if this page was
dirty before the vacuum operation, it doesn't count this page as 'dirtied'.

The tuples_deleted parameter is the number of tuples cleaned up by the vacuum
operation.

The delay_time value means total vacuum sleep time in vacuum delay point.
The pages_removed value is the number of pages by which the physical data
storage of the relation was reduced.
The value of pages_deleted parameter is the number of freed pages in the table
(file size may not have changed).

Tracking of IO during an (auto)vacuum operation.
Introduced variables blk_read_time and blk_write_time tracks only access to
buffer pages and flushing them to disk. Reading operation is trivial, but
writing measurement technique is not obvious.
So, during a vacuum writing time can be zero incremented because no any flushing
operations were performed.

System time and user time are parameters that describes how much time a vacuum
operation has spent in executing of code in user space and kernel space
accordingly. Also, accumulate total time of a vacuum that is a diff between
timestamps in start and finish points in the vacuum code.
Remember about idle time, when vacuum waited for IO and locks, so total time
isn't equal a sum of user and system time, but no less.

pages_frozen is a number of pages that are marked as frozen in vm during vacuum.
This parameter is incremented if page is marked as all-frozen.
pages_all_visible is a number of pages that are marked as all-visible in vm during
vacuum.

wraparound_failsafe_count is a number of times when the vacuum starts urgent cleanup
to prevent wraparound problem which is critical for the database.

Authors: Alena Rybakina <[email protected]>,
	 Andrei Lepikhov <[email protected]>,
	 Andrei Zubkov <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>, Masahiko Sawada <[email protected]>,
	     Ilia Evdokimov <[email protected]>, jian he <[email protected]>,
	     Kirill Reshke <[email protected]>, Alexander Korotkov <[email protected]>,
	     Jim Nasby <[email protected]>, Sami Imseih <[email protected]>
---
 src/backend/access/heap/vacuumlazy.c          | 149 +++++++++++-
 src/backend/access/heap/visibilitymap.c       |  10 +
 src/backend/catalog/system_views.sql          |  52 +++-
 src/backend/commands/vacuum.c                 |   4 +
 src/backend/commands/vacuumparallel.c         |   1 +
 src/backend/utils/activity/pgstat.c           |  12 +-
 src/backend/utils/activity/pgstat_relation.c  |  46 +++-
 src/backend/utils/adt/pgstatfuncs.c           | 147 ++++++++++++
 src/backend/utils/error/elog.c                |  13 +
 src/backend/utils/misc/guc_tables.c           |   9 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/catalog/pg_proc.dat               |  18 ++
 src/include/commands/vacuum.h                 |   1 +
 src/include/pgstat.h                          |  80 +++++-
 src/include/utils/elog.h                      |   1 +
 .../vacuum-extending-in-repetable-read.out    |  53 ++++
 src/test/isolation/isolation_schedule         |   1 +
 .../vacuum-extending-in-repetable-read.spec   |  53 ++++
 src/test/regress/expected/rules.out           |  44 +++-
 .../expected/vacuum_tables_statistics.out     | 227 ++++++++++++++++++
 src/test/regress/parallel_schedule            |   5 +
 .../regress/sql/vacuum_tables_statistics.sql  | 183 ++++++++++++++
 22 files changed, 1094 insertions(+), 16 deletions(-)
 create mode 100644 src/test/isolation/expected/vacuum-extending-in-repetable-read.out
 create mode 100644 src/test/isolation/specs/vacuum-extending-in-repetable-read.spec
 create mode 100644 src/test/regress/expected/vacuum_tables_statistics.out
 create mode 100644 src/test/regress/sql/vacuum_tables_statistics.sql

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 911f68b413d..839bd41be62 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -290,6 +290,7 @@ typedef struct LVRelState
 	/* Error reporting state */
 	char	   *dbname;
 	char	   *relnamespace;
+	Oid			reloid;
 	char	   *relname;
 	char	   *indname;		/* Current index name */
 	BlockNumber blkno;			/* used only for heap operations */
@@ -408,6 +409,8 @@ typedef struct LVRelState
 	 * been permanently disabled.
 	 */
 	BlockNumber eager_scan_remaining_fails;
+
+	int32		wraparound_failsafe_count; /* the number of times to prevent workaround problem */
 } LVRelState;
 
 
@@ -419,6 +422,18 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
+/*
+ * Counters and usage data for extended stats tracking.
+ */
+typedef struct LVExtStatCounters
+{
+	TimestampTz starttime;
+	WalUsage	walusage;
+	BufferUsage bufusage;
+	double		VacuumDelayTime;
+	PgStat_Counter blocks_fetched;
+	PgStat_Counter blocks_hit;
+} LVExtStatCounters;
 
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
@@ -475,7 +490,106 @@ static void update_vacuum_error_info(LVRelState *vacrel,
 static void restore_vacuum_error_info(LVRelState *vacrel,
 									  const LVSavedErrInfo *saved_vacrel);
 
+/* ----------
+ * extvac_stats_start() -
+ *
+ * Save cut-off values of extended vacuum counters before start of a relation
+ * processing.
+ * ----------
+ */
+static void
+extvac_stats_start(Relation rel, LVExtStatCounters *counters)
+{
+	TimestampTz	starttime;
+
+	if(!pgstat_track_vacuum_statistics)
+		return;
+
+	memset(counters, 0, sizeof(LVExtStatCounters));
+
+	starttime = GetCurrentTimestamp();
+
+	counters->starttime = starttime;
+	counters->walusage = pgWalUsage;
+	counters->bufusage = pgBufferUsage;
+	counters->VacuumDelayTime = VacuumDelayTime;
+	counters->blocks_fetched = 0;
+	counters->blocks_hit = 0;
+
+	if (!rel->pgstat_info || !pgstat_track_counts)
+		/*
+		 * if something goes wrong or user doesn't want to track a database
+		 * activity - just suppress it.
+		 */
+		return;
+
+	counters->blocks_fetched = rel->pgstat_info->counts.blocks_fetched;
+	counters->blocks_hit = rel->pgstat_info->counts.blocks_hit;
+}
+
+/* ----------
+ * extvac_stats_end() -
+ *
+ *	Called to finish an extended vacuum statistic gathering and form a report.
+ * ----------
+ */
+static void
+extvac_stats_end(Relation rel, LVExtStatCounters *counters,
+				  ExtVacReport *report)
+{
+	WalUsage	walusage;
+	BufferUsage	bufusage;
+	TimestampTz endtime;
+	long		secs;
+	int			usecs;
+
+	if(!pgstat_track_vacuum_statistics)
+		return;
+
+	/* Calculate diffs of global stat parameters on WAL and buffer usage. */
+	memset(&walusage, 0, sizeof(WalUsage));
+	WalUsageAccumDiff(&walusage, &pgWalUsage, &counters->walusage);
+
+	memset(&bufusage, 0, sizeof(BufferUsage));
+	BufferUsageAccumDiff(&bufusage, &pgBufferUsage, &counters->bufusage);
+
+	endtime = GetCurrentTimestamp();
+	TimestampDifference(counters->starttime, endtime, &secs, &usecs);
+
+	memset(report, 0, sizeof(ExtVacReport));
+
+	/*
+	 * Fill additional statistics on a vacuum processing operation.
+	 */
+	report->total_blks_read = bufusage.local_blks_read + bufusage.shared_blks_read;
+	report->total_blks_hit = bufusage.local_blks_hit + bufusage.shared_blks_hit;
+	report->total_blks_dirtied = bufusage.local_blks_dirtied + bufusage.shared_blks_dirtied;
+	report->total_blks_written = bufusage.shared_blks_written;
+
+	report->wal_records = walusage.wal_records;
+	report->wal_fpi = walusage.wal_fpi;
+	report->wal_bytes = walusage.wal_bytes;
+
+	report->blk_read_time = INSTR_TIME_GET_MILLISEC(bufusage.local_blk_read_time);
+	report->blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage.shared_blk_read_time);
+	report->blk_write_time = INSTR_TIME_GET_MILLISEC(bufusage.local_blk_write_time);
+	report->blk_write_time = INSTR_TIME_GET_MILLISEC(bufusage.shared_blk_write_time);
+	report->delay_time = VacuumDelayTime - counters->VacuumDelayTime;
 
+	report->total_time = secs * 1000. + usecs / 1000.;
+
+	if (!rel->pgstat_info || !pgstat_track_counts)
+		/*
+		 * if something goes wrong or an user doesn't want to track a database
+		 * activity - just suppress it.
+		 */
+		return;
+
+	report->blks_fetched =
+		rel->pgstat_info->counts.blocks_fetched - counters->blocks_fetched;
+	report->blks_hit =
+		rel->pgstat_info->counts.blocks_hit - counters->blocks_hit;
+}
 
 /*
  * Helper to set up the eager scanning state for vacuuming a single relation.
@@ -631,7 +745,14 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	WalUsage	startwalusage = pgWalUsage;
 	BufferUsage startbufferusage = pgBufferUsage;
 	ErrorContextCallback errcallback;
+	LVExtStatCounters extVacCounters;
+	ExtVacReport extVacReport;
 	char	  **indnames = NULL;
+	ExtVacReport allzero;
+
+	/* Initialize vacuum statistics */
+	memset(&allzero, 0, sizeof(ExtVacReport));
+	extVacReport = allzero;
 
 	verbose = (params->options & VACOPT_VERBOSE) != 0;
 	instrument = (verbose || (AmAutoVacuumWorkerProcess() &&
@@ -651,7 +772,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 
 	pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM,
 								  RelationGetRelid(rel));
-
+	extvac_stats_start(rel, &extVacCounters);
 	/*
 	 * Setup error traceback support for ereport() first.  The idea is to set
 	 * up an error context callback to display additional information on any
@@ -668,6 +789,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	vacrel->dbname = get_database_name(MyDatabaseId);
 	vacrel->relnamespace = get_namespace_name(RelationGetNamespace(rel));
 	vacrel->relname = pstrdup(RelationGetRelationName(rel));
+	vacrel->reloid = RelationGetRelid(rel);
 	vacrel->indname = NULL;
 	vacrel->phase = VACUUM_ERRCB_PHASE_UNKNOWN;
 	vacrel->verbose = verbose;
@@ -756,6 +878,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	vacrel->vm_new_visible_pages = 0;
 	vacrel->vm_new_visible_frozen_pages = 0;
 	vacrel->vm_new_frozen_pages = 0;
+	vacrel->wraparound_failsafe_count = 0;
 	vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel);
 
 	/*
@@ -914,6 +1037,26 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 						vacrel->NewRelfrozenXid, vacrel->NewRelminMxid,
 						&frozenxid_updated, &minmulti_updated, false);
 
+	/* Make generic extended vacuum stats report */
+	extvac_stats_end(rel, &extVacCounters, &extVacReport);
+
+	if(pgstat_track_vacuum_statistics)
+	{
+		/* Fill heap-specific extended stats fields */
+		extVacReport.pages_scanned = vacrel->scanned_pages;
+		extVacReport.pages_removed = vacrel->removed_pages;
+		extVacReport.vm_new_frozen_pages = vacrel->vm_new_frozen_pages;
+		extVacReport.vm_new_visible_pages = vacrel->vm_new_visible_pages;
+		extVacReport.vm_new_visible_frozen_pages = vacrel->vm_new_visible_frozen_pages;
+		extVacReport.tuples_deleted = vacrel->tuples_deleted;
+		extVacReport.tuples_frozen = vacrel->tuples_frozen;
+		extVacReport.recently_dead_tuples = vacrel->recently_dead_tuples;
+		extVacReport.missed_dead_tuples = vacrel->missed_dead_tuples;
+		extVacReport.missed_dead_pages = vacrel->missed_dead_pages;
+		extVacReport.index_vacuum_count = vacrel->num_index_scans;
+		extVacReport.wraparound_failsafe_count = vacrel->wraparound_failsafe_count;
+	}
+
 	/*
 	 * Report results to the cumulative stats system, too.
 	 *
@@ -929,7 +1072,8 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 						 Max(vacrel->new_live_tuples, 0),
 						 vacrel->recently_dead_tuples +
 						 vacrel->missed_dead_tuples,
-						 starttime);
+						 starttime,
+						 &extVacReport);
 	pgstat_progress_end_command();
 
 	if (instrument)
@@ -2944,6 +3088,7 @@ lazy_check_wraparound_failsafe(LVRelState *vacrel)
 		int64		progress_val[2] = {0, 0};
 
 		VacuumFailsafeActive = true;
+		vacrel->wraparound_failsafe_count ++;
 
 		/*
 		 * Abandon use of a buffer access strategy to allow use of all of
diff --git a/src/backend/access/heap/visibilitymap.c b/src/backend/access/heap/visibilitymap.c
index 745a04ef26e..07623a045fa 100644
--- a/src/backend/access/heap/visibilitymap.c
+++ b/src/backend/access/heap/visibilitymap.c
@@ -91,6 +91,7 @@
 #include "access/xloginsert.h"
 #include "access/xlogutils.h"
 #include "miscadmin.h"
+#include "pgstat.h"
 #include "port/pg_bitutils.h"
 #include "storage/bufmgr.h"
 #include "storage/smgr.h"
@@ -160,6 +161,15 @@ visibilitymap_clear(Relation rel, BlockNumber heapBlk, Buffer vmbuf, uint8 flags
 
 	if (map[mapByte] & mask)
 	{
+		/*
+		 * As part of vacuum stats, track how often all-visible or all-frozen
+		 * bits are cleared.
+		 */
+		if (map[mapByte] >> mapOffset & flags & VISIBILITYMAP_ALL_VISIBLE)
+			pgstat_count_vm_rev_all_visible(rel);
+		if (map[mapByte] >> mapOffset & flags & VISIBILITYMAP_ALL_FROZEN)
+			pgstat_count_vm_rev_all_frozen(rel);
+
 		map[mapByte] &= ~mask;
 
 		MarkBufferDirty(vmbuf);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index eff0990957e..7526c8973ee 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -700,7 +700,9 @@ CREATE VIEW pg_stat_all_tables AS
             pg_stat_get_total_vacuum_time(C.oid) AS total_vacuum_time,
             pg_stat_get_total_autovacuum_time(C.oid) AS total_autovacuum_time,
             pg_stat_get_total_analyze_time(C.oid) AS total_analyze_time,
-            pg_stat_get_total_autoanalyze_time(C.oid) AS total_autoanalyze_time
+            pg_stat_get_total_autoanalyze_time(C.oid) AS total_autoanalyze_time,
+            pg_stat_get_rev_all_frozen_pages(C.oid) AS rev_all_frozen_pages,
+            pg_stat_get_rev_all_visible_pages(C.oid) AS rev_all_visible_pages
     FROM pg_class C LEFT JOIN
          pg_index I ON C.oid = I.indrelid
          LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
@@ -1394,3 +1396,51 @@ CREATE VIEW pg_stat_subscription_stats AS
 
 CREATE VIEW pg_wait_events AS
     SELECT * FROM pg_get_wait_events();
+--
+-- Show extended cumulative statistics on a vacuum operation over all tables and
+-- databases of the instance.
+-- Use Invalid Oid "0" as an input relation id to get stat on each table in a
+-- database.
+--
+
+CREATE VIEW pg_stat_vacuum_tables AS
+SELECT
+  ns.nspname AS schemaname,
+  rel.relname AS relname,
+  stats.relid as relid,
+
+  stats.total_blks_read AS total_blks_read,
+  stats.total_blks_hit AS total_blks_hit,
+  stats.total_blks_dirtied AS total_blks_dirtied,
+  stats.total_blks_written AS total_blks_written,
+
+  stats.rel_blks_read AS rel_blks_read,
+  stats.rel_blks_hit AS rel_blks_hit,
+
+  stats.pages_scanned AS pages_scanned,
+  stats.pages_removed AS pages_removed,
+  stats.vm_new_frozen_pages AS vm_new_frozen_pages,
+  stats.vm_new_visible_pages AS vm_new_visible_pages,
+  stats.vm_new_visible_frozen_pages AS vm_new_visible_frozen_pages,
+  stats.missed_dead_pages AS missed_dead_pages,
+  stats.tuples_deleted AS tuples_deleted,
+  stats.tuples_frozen AS tuples_frozen,
+  stats.recently_dead_tuples AS recently_dead_tuples,
+  stats.missed_dead_tuples AS missed_dead_tuples,
+
+  stats.wraparound_failsafe AS wraparound_failsafe,
+  stats.index_vacuum_count AS index_vacuum_count,
+  stats.wal_records AS wal_records,
+  stats.wal_fpi AS wal_fpi,
+  stats.wal_bytes AS wal_bytes,
+
+  stats.blk_read_time AS blk_read_time,
+  stats.blk_write_time AS blk_write_time,
+
+  stats.delay_time AS delay_time,
+  stats.total_time AS total_time
+
+FROM pg_class rel
+  JOIN pg_namespace ns ON ns.oid = rel.relnamespace,
+  LATERAL pg_stat_get_vacuum_tables(rel.oid) stats
+WHERE rel.relkind = 'r';
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 0239d9bae65..aea2cf307da 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -114,6 +114,9 @@ pg_atomic_uint32 *VacuumSharedCostBalance = NULL;
 pg_atomic_uint32 *VacuumActiveNWorkers = NULL;
 int			VacuumCostBalanceLocal = 0;
 
+/* Cumulative storage to report total vacuum delay time. */
+double VacuumDelayTime = 0; /* msec. */
+
 /* non-export function prototypes */
 static List *expand_vacuum_rel(VacuumRelation *vrel,
 							   MemoryContext vac_context, int options);
@@ -2497,6 +2500,7 @@ vacuum_delay_point(bool is_analyze)
 			exit(1);
 
 		VacuumCostBalance = 0;
+		VacuumDelayTime += msec;
 
 		/*
 		 * Balance and update limit values for autovacuum workers. We must do
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 2b9d548cdeb..7924c526cb0 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -1054,6 +1054,7 @@ parallel_vacuum_main(dsm_segment *seg, shm_toc *toc)
 	/* Set cost-based vacuum delay */
 	VacuumUpdateCosts();
 	VacuumCostBalance = 0;
+	VacuumDelayTime = 0;
 	VacuumCostBalanceLocal = 0;
 	VacuumSharedCostBalance = &(shared->cost_balance);
 	VacuumActiveNWorkers = &(shared->active_nworkers);
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 3168b825e25..826a5685b2a 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -191,7 +191,7 @@ static void pgstat_reset_after_failure(void);
 static bool pgstat_flush_pending_entries(bool nowait);
 
 static void pgstat_prep_snapshot(void);
-static void pgstat_build_snapshot(void);
+static void pgstat_build_snapshot(PgStat_Kind statKind);
 static void pgstat_build_snapshot_fixed(PgStat_Kind kind);
 
 static inline bool pgstat_is_kind_valid(PgStat_Kind kind);
@@ -204,7 +204,7 @@ static inline bool pgstat_is_kind_valid(PgStat_Kind kind);
 
 bool		pgstat_track_counts = false;
 int			pgstat_fetch_consistency = PGSTAT_FETCH_CONSISTENCY_CACHE;
-
+bool		pgstat_track_vacuum_statistics = true;
 
 /* ----------
  * state shared with pgstat_*.c
@@ -261,7 +261,6 @@ static bool pgstat_is_initialized = false;
 static bool pgstat_is_shutdown = false;
 #endif
 
-
 /*
  * The different kinds of built-in statistics.
  *
@@ -901,7 +900,6 @@ pgstat_reset_of_kind(PgStat_Kind kind)
 		pgstat_reset_entries_of_kind(kind, ts);
 }
 
-
 /* ------------------------------------------------------------
  * Fetching of stats
  * ------------------------------------------------------------
@@ -970,7 +968,7 @@ pgstat_fetch_entry(PgStat_Kind kind, Oid dboid, uint64 objid)
 
 	/* if we need to build a full snapshot, do so */
 	if (pgstat_fetch_consistency == PGSTAT_FETCH_CONSISTENCY_SNAPSHOT)
-		pgstat_build_snapshot();
+		pgstat_build_snapshot(PGSTAT_KIND_INVALID);
 
 	/* if caching is desired, look up in cache */
 	if (pgstat_fetch_consistency > PGSTAT_FETCH_CONSISTENCY_NONE)
@@ -1086,7 +1084,7 @@ pgstat_snapshot_fixed(PgStat_Kind kind)
 		pgstat_clear_snapshot();
 
 	if (pgstat_fetch_consistency == PGSTAT_FETCH_CONSISTENCY_SNAPSHOT)
-		pgstat_build_snapshot();
+		pgstat_build_snapshot(PGSTAT_KIND_INVALID);
 	else
 		pgstat_build_snapshot_fixed(kind);
 
@@ -1137,7 +1135,7 @@ pgstat_prep_snapshot(void)
 }
 
 static void
-pgstat_build_snapshot(void)
+pgstat_build_snapshot(PgStat_Kind statKind)
 {
 	dshash_seq_status hstat;
 	PgStatShared_HashEntry *p;
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index d64595a165c..0272dd1f393 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -47,6 +47,8 @@ static void add_tabstat_xact_level(PgStat_TableStatus *pgstat_info, int nest_lev
 static void ensure_tabstat_xact_level(PgStat_TableStatus *pgstat_info);
 static void save_truncdrop_counters(PgStat_TableXactStatus *trans, bool is_drop);
 static void restore_truncdrop_counters(PgStat_TableXactStatus *trans);
+static void pgstat_accumulate_extvac_stats(ExtVacReport *dst, ExtVacReport *src,
+							   bool accumulate_reltype_specific_info);
 
 
 /*
@@ -209,7 +211,7 @@ pgstat_drop_relation(Relation rel)
 void
 pgstat_report_vacuum(Oid tableoid, bool shared,
 					 PgStat_Counter livetuples, PgStat_Counter deadtuples,
-					 TimestampTz starttime)
+					 TimestampTz starttime, ExtVacReport *params)
 {
 	PgStat_EntryRef *entry_ref;
 	PgStatShared_Relation *shtabentry;
@@ -235,6 +237,8 @@ pgstat_report_vacuum(Oid tableoid, bool shared,
 	tabentry->live_tuples = livetuples;
 	tabentry->dead_tuples = deadtuples;
 
+	pgstat_accumulate_extvac_stats(&tabentry->vacuum_ext, params, true);
+
 	/*
 	 * It is quite possible that a non-aggressive VACUUM ended up skipping
 	 * various pages, however, we'll zero the insert counter here regardless.
@@ -872,6 +876,9 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 	tabentry->blocks_fetched += lstats->counts.blocks_fetched;
 	tabentry->blocks_hit += lstats->counts.blocks_hit;
 
+	tabentry->rev_all_frozen_pages += lstats->counts.rev_all_frozen_pages;
+	tabentry->rev_all_visible_pages += lstats->counts.rev_all_visible_pages;
+
 	/* Clamp live_tuples in case of negative delta_live_tuples */
 	tabentry->live_tuples = Max(tabentry->live_tuples, 0);
 	/* Likewise for dead_tuples */
@@ -995,3 +1002,40 @@ restore_truncdrop_counters(PgStat_TableXactStatus *trans)
 		trans->tuples_deleted = trans->deleted_pre_truncdrop;
 	}
 }
+
+static void
+pgstat_accumulate_extvac_stats(ExtVacReport *dst, ExtVacReport *src,
+							   bool accumulate_reltype_specific_info)
+{
+	dst->total_blks_read += src->total_blks_read;
+	dst->total_blks_hit += src->total_blks_hit;
+	dst->total_blks_dirtied += src->total_blks_dirtied;
+	dst->total_blks_written += src->total_blks_written;
+	dst->wal_bytes += src->wal_bytes;
+	dst->wal_fpi += src->wal_fpi;
+	dst->wal_records += src->wal_records;
+	dst->blk_read_time += src->blk_read_time;
+	dst->blk_write_time += src->blk_write_time;
+	dst->delay_time += src->delay_time;
+	dst->total_time += src->total_time;
+
+	if (!accumulate_reltype_specific_info)
+		return;
+
+	dst->blks_fetched += src->blks_fetched;
+	dst->blks_hit += src->blks_hit;
+
+	dst->pages_scanned += src->pages_scanned;
+	dst->pages_removed += src->pages_removed;
+	dst->vm_new_frozen_pages += src->vm_new_frozen_pages;
+	dst->vm_new_visible_pages += src->vm_new_visible_pages;
+	dst->vm_new_visible_frozen_pages += src->vm_new_visible_frozen_pages;
+	dst->tuples_deleted += src->tuples_deleted;
+	dst->tuples_frozen += src->tuples_frozen;
+	dst->recently_dead_tuples += src->recently_dead_tuples;
+	dst->index_vacuum_count += src->index_vacuum_count;
+	dst->wraparound_failsafe_count += src->wraparound_failsafe_count;
+	dst->missed_dead_pages += src->missed_dead_pages;
+	dst->missed_dead_tuples += src->missed_dead_tuples;
+
+}
\ No newline at end of file
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e9096a88492..e112762ed2f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -106,6 +106,12 @@ PG_STAT_GET_RELENTRY_INT64(tuples_updated)
 /* pg_stat_get_vacuum_count */
 PG_STAT_GET_RELENTRY_INT64(vacuum_count)
 
+/* pg_stat_get_rev_frozen_pages */
+PG_STAT_GET_RELENTRY_INT64(rev_all_frozen_pages)
+
+/* pg_stat_get_rev_all_visible_pages */
+PG_STAT_GET_RELENTRY_INT64(rev_all_visible_pages)
+
 #define PG_STAT_GET_RELENTRY_FLOAT8(stat)						\
 Datum															\
 CppConcat(pg_stat_get_,stat)(PG_FUNCTION_ARGS)					\
@@ -2244,3 +2250,144 @@ pg_stat_have_stats(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(pgstat_have_entry(kind, dboid, objid));
 }
+
+
+/*
+ * Get the vacuum statistics for the heap tables.
+ */
+Datum
+pg_stat_get_vacuum_tables(PG_FUNCTION_ARGS)
+{
+	#define PG_STAT_GET_VACUUM_TABLES_STATS_COLS 26
+
+	Oid						relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry     *tabentry;
+	ExtVacReport 			*extvacuum;
+	TupleDesc				 tupdesc;
+	Datum					 values[PG_STAT_GET_VACUUM_TABLES_STATS_COLS] = {0};
+	bool					 nulls[PG_STAT_GET_VACUUM_TABLES_STATS_COLS] = {0};
+	char					 buf[256];
+	int						 i = 0;
+	ExtVacReport allzero;
+
+	/* Initialise attributes information in the tuple descriptor */
+	tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_VACUUM_TABLES_STATS_COLS);
+
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "relid",
+					   INT4OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_blks_read",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_blks_hit",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_blks_dirtied",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_blks_written",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "rel_blks_read",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "rel_blks_hit",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "pages_scanned",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "pages_removed",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "vm_new_frozen_pages",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "vm_new_visible_pages",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "vm_new_visible_frozen_pages",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "missed_dead_pages",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "tuples_deleted",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "tuples_frozen",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "recently_dead_tuples",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "missed_dead_tuples",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "wraparound_failsafe_count",
+					   INT4OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "index_vacuum_count",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "wal_records",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "wal_fpi",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "wal_bytes",
+					   NUMERICOID, -1, 0);
+
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "blk_read_time",
+					   FLOAT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "blk_write_time",
+					   FLOAT8OID, -1, 0);
+
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "delay_time",
+					   FLOAT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_time",
+					   FLOAT8OID, -1, 0);
+
+	Assert(i == PG_STAT_GET_VACUUM_TABLES_STATS_COLS);
+
+	BlessTupleDesc(tupdesc);
+
+	tabentry = pgstat_fetch_stat_tabentry(relid);
+
+	if (tabentry == NULL)
+	{
+		/* If the subscription is not found, initialise its stats */
+		memset(&allzero, 0, sizeof(ExtVacReport));
+		extvacuum = &allzero;
+	}
+	else
+	{
+		extvacuum = &(tabentry->vacuum_ext);
+	}
+
+	i = 0;
+
+	values[i++] = ObjectIdGetDatum(relid);
+
+	values[i++] = Int64GetDatum(extvacuum->total_blks_read);
+	values[i++] = Int64GetDatum(extvacuum->total_blks_hit);
+	values[i++] = Int64GetDatum(extvacuum->total_blks_dirtied);
+	values[i++] = Int64GetDatum(extvacuum->total_blks_written);
+
+	values[i++] = Int64GetDatum(extvacuum->blks_fetched -
+									extvacuum->blks_hit);
+	values[i++] = Int64GetDatum(extvacuum->blks_hit);
+
+	values[i++] = Int64GetDatum(extvacuum->pages_scanned);
+	values[i++] = Int64GetDatum(extvacuum->pages_removed);
+	values[i++] = Int64GetDatum(extvacuum->vm_new_frozen_pages);
+	values[i++] = Int64GetDatum(extvacuum->vm_new_visible_pages);
+	values[i++] = Int64GetDatum(extvacuum->vm_new_visible_frozen_pages);
+	values[i++] = Int64GetDatum(extvacuum->missed_dead_pages);
+	values[i++] = Int64GetDatum(extvacuum->tuples_deleted);
+	values[i++] = Int64GetDatum(extvacuum->tuples_frozen);
+	values[i++] = Int64GetDatum(extvacuum->recently_dead_tuples);
+	values[i++] = Int64GetDatum(extvacuum->missed_dead_tuples);
+	values[i++] = Int32GetDatum(extvacuum->wraparound_failsafe_count);
+	values[i++] = Int64GetDatum(extvacuum->index_vacuum_count);
+
+	values[i++] = Int64GetDatum(extvacuum->wal_records);
+	values[i++] = Int64GetDatum(extvacuum->wal_fpi);
+
+	/* Convert to numeric, like pg_stat_statements */
+	snprintf(buf, sizeof buf, UINT64_FORMAT, extvacuum->wal_bytes);
+	values[i++] = DirectFunctionCall3(numeric_in,
+									  CStringGetDatum(buf),
+									  ObjectIdGetDatum(0),
+									  Int32GetDatum(-1));
+
+	values[i++] = Float8GetDatum(extvacuum->blk_read_time);
+	values[i++] = Float8GetDatum(extvacuum->blk_write_time);
+	values[i++] = Float8GetDatum(extvacuum->delay_time);
+	values[i++] = Float8GetDatum(extvacuum->total_time);
+
+	Assert(i == PG_STAT_GET_VACUUM_TABLES_STATS_COLS);
+
+	/* Returns the record as Datum */
+	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
\ No newline at end of file
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 860bbd40d42..4da8d3f87fd 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -1619,6 +1619,19 @@ getinternalerrposition(void)
 	return edata->internalpos;
 }
 
+/*
+ * Return elevel of errors
+ */
+int
+geterrelevel(void)
+{
+	ErrorData  *edata = &errordata[errordata_stack_depth];
+
+	/* we don't bother incrementing recursion_depth */
+	CHECK_STACK_DEPTH();
+
+	return edata->elevel;
+}
 
 /*
  * Functions to allow construction of error message strings separately from
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index cce73314609..6aa98f99317 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1499,6 +1499,15 @@ struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"track_vacuum_statistics", PGC_SUSET, STATS_CUMULATIVE,
+			gettext_noop("Collects vacuum statistics for table relations."),
+			NULL
+		},
+		&pgstat_track_vacuum_statistics,
+		true,
+		NULL, NULL, NULL
+	},
 	{
 		{"track_wal_io_timing", PGC_SUSET, STATS_CUMULATIVE,
 			gettext_noop("Collects timing statistics for WAL I/O activity."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index d472987ed46..248ef6c9a36 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -640,6 +640,7 @@
 #track_wal_io_timing = off
 #track_functions = none			# none, pl, all
 #stats_fetch_consistency = cache	# cache, none, snapshot
+#track_vacuum_statistics = off
 
 
 # - Monitoring -
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 9e803d610d7..3cc2eff3437 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12464,4 +12464,22 @@
   proargtypes => 'int4',
   prosrc => 'gist_stratnum_common' },
 
+{ oid => '8001',
+  descr => 'pg_stat_get_vacuum_tables returns vacuum stats values for table',
+  proname => 'pg_stat_get_vacuum_tables', prorows => 1000, provolatile => 's', prorettype => 'record',proisstrict => 'f',
+  proretset => 't',
+  proargtypes => 'oid',
+  proallargtypes => '{oid,oid,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,int4,int8,int8,int8,numeric,float8,float8,float8,float8}',
+  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{reloid,relid,total_blks_read,total_blks_hit,total_blks_dirtied,total_blks_written,rel_blks_read,rel_blks_hit,pages_scanned,pages_removed,vm_new_frozen_pages,vm_new_visible_pages,vm_new_visible_frozen_pages,missed_dead_pages,tuples_deleted,tuples_frozen,recently_dead_tuples,missed_dead_tuples,wraparound_failsafe,index_vacuum_count,wal_records,wal_fpi,wal_bytes,blk_read_time,blk_write_time,delay_time,total_time}',
+  prosrc => 'pg_stat_get_vacuum_tables' },
+
+  { oid => '8002', descr => 'statistics: number of times the all-visible pages in the visibility map was removed for pages of table',
+  proname => 'pg_stat_get_rev_all_visible_pages', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_rev_all_visible_pages' },
+  { oid => '8003', descr => 'statistics: number of times the all-frozen pages in the visibility map was removed for pages of table',
+  proname => 'pg_stat_get_rev_all_frozen_pages', provolatile => 's',
+  proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
+  prosrc => 'pg_stat_get_rev_all_frozen_pages' },
 ]
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 1571a66c6bf..a5fe622d932 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -326,6 +326,7 @@ extern PGDLLIMPORT double vacuum_max_eager_freeze_failure_rate;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
 extern PGDLLIMPORT int VacuumCostBalanceLocal;
+extern PGDLLIMPORT double VacuumDelayTime;
 
 extern PGDLLIMPORT bool VacuumFailsafeActive;
 extern PGDLLIMPORT double vacuum_cost_delay;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 53f2a8458e6..d369284101a 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -112,6 +112,53 @@ typedef struct PgStat_BackendSubEntry
 	PgStat_Counter conflict_count[CONFLICT_NUM_TYPES];
 } PgStat_BackendSubEntry;
 
+/* ----------
+ *
+ * ExtVacReport
+ *
+ * Additional statistics of vacuum processing over a heap relation.
+ * pages_removed is the amount by which the physically shrank,
+ * if any (ie the change in its total size on disk)
+ * pages_deleted refer to free space within the index file
+ * ----------
+ */
+typedef struct ExtVacReport
+{
+	/* number of blocks missed, hit, dirtied and written during a vacuum of specific relation */
+	int64		total_blks_read;
+	int64		total_blks_hit;
+	int64		total_blks_dirtied;
+	int64		total_blks_written;
+
+	/* blocks missed and hit for just the heap during a vacuum of specific relation */
+	int64		blks_fetched;
+	int64		blks_hit;
+
+	/* Vacuum WAL usage stats */
+	int64		wal_records;	/* wal usage: number of WAL records */
+	int64		wal_fpi;		/* wal usage: number of WAL full page images produced */
+	uint64		wal_bytes;		/* wal usage: size of WAL records produced */
+
+	/* Time stats. */
+	double		blk_read_time;	/* time spent reading pages, in msec */
+	double		blk_write_time; /* time spent writing pages, in msec */
+	double		delay_time;		/* how long vacuum slept in vacuum delay point, in msec */
+	double		total_time;		/* total time of a vacuum operation, in msec */
+
+	int64		pages_scanned;		/* heap pages examined (not skipped by VM) */
+	int64		pages_removed;		/* heap pages removed by vacuum "truncation" */
+	int64		vm_new_frozen_pages;		/* pages marked in VM as frozen */
+	int64		vm_new_visible_pages;	/* pages marked in VM as all-visible */
+	int64		vm_new_visible_frozen_pages;	/* pages marked in VM as all-visible and frozen */
+	int64		missed_dead_tuples;		/* tuples not pruned by vacuum due to failure to get a cleanup lock */
+	int64		missed_dead_pages;		/* pages with missed dead tuples */
+	int64		tuples_deleted;		/* tuples deleted by vacuum */
+	int64		tuples_frozen;		/* tuples frozen up by vacuum */
+	int64		recently_dead_tuples;	/* deleted tuples that are still visible to some transaction */
+	int64		index_vacuum_count;	/* the number of index vacuumings */
+	int32		wraparound_failsafe_count;	/* the number of times to prevent workaround problem */
+} ExtVacReport;
+
 /* ----------
  * PgStat_TableCounts			The actual per-table counts kept by a backend
  *
@@ -154,6 +201,16 @@ typedef struct PgStat_TableCounts
 
 	PgStat_Counter blocks_fetched;
 	PgStat_Counter blocks_hit;
+
+	PgStat_Counter rev_all_visible_pages;
+	PgStat_Counter rev_all_frozen_pages;
+
+	/*
+	 * Additional cumulative stat on vacuum operations.
+	 * Use an expensive structure as an abstraction for different types of
+	 * relations.
+	 */
+	ExtVacReport	vacuum_ext;
 } PgStat_TableCounts;
 
 /* ----------
@@ -212,7 +269,7 @@ typedef struct PgStat_TableXactStatus
  * ------------------------------------------------------------
  */
 
-#define PGSTAT_FILE_FORMAT_ID	0x01A5BCB3
+#define PGSTAT_FILE_FORMAT_ID	0x01A5BCB4
 
 typedef struct PgStat_ArchiverStats
 {
@@ -394,6 +451,8 @@ typedef struct PgStat_StatDBEntry
 	PgStat_Counter parallel_workers_launched;
 
 	TimestampTz stat_reset_timestamp;
+
+	ExtVacReport vacuum_ext;		/* extended vacuum statistics */
 } PgStat_StatDBEntry;
 
 typedef struct PgStat_StatFuncEntry
@@ -472,6 +531,11 @@ typedef struct PgStat_StatTabEntry
 	PgStat_Counter total_autovacuum_time;
 	PgStat_Counter total_analyze_time;
 	PgStat_Counter total_autoanalyze_time;
+
+	PgStat_Counter rev_all_visible_pages;
+	PgStat_Counter rev_all_frozen_pages;
+
+	ExtVacReport vacuum_ext;
 } PgStat_StatTabEntry;
 
 typedef struct PgStat_WalStats
@@ -656,7 +720,7 @@ extern void pgstat_unlink_relation(Relation rel);
 
 extern void pgstat_report_vacuum(Oid tableoid, bool shared,
 								 PgStat_Counter livetuples, PgStat_Counter deadtuples,
-								 TimestampTz starttime);
+								 TimestampTz starttime, ExtVacReport *params);
 extern void pgstat_report_analyze(Relation rel,
 								  PgStat_Counter livetuples, PgStat_Counter deadtuples,
 								  bool resetcounter, TimestampTz starttime);
@@ -707,6 +771,17 @@ extern void pgstat_report_analyze(Relation rel,
 		if (pgstat_should_count_relation(rel))						\
 			(rel)->pgstat_info->counts.blocks_hit++;				\
 	} while (0)
+/* accumulate unfrozen all-visible and all-frozen pages */
+#define pgstat_count_vm_rev_all_visible(rel)						\
+	do {															\
+		if (pgstat_should_count_relation(rel))						\
+			(rel)->pgstat_info->counts.rev_all_visible_pages++;	\
+	} while (0)
+#define pgstat_count_vm_rev_all_frozen(rel)						\
+	do {															\
+		if (pgstat_should_count_relation(rel))						\
+			(rel)->pgstat_info->counts.rev_all_frozen_pages++;	\
+	} while (0)
 
 extern void pgstat_count_heap_insert(Relation rel, PgStat_Counter n);
 extern void pgstat_count_heap_update(Relation rel, bool hot, bool newpage);
@@ -795,6 +870,7 @@ extern PgStat_WalStats *pgstat_fetch_stat_wal(void);
 extern PGDLLIMPORT bool pgstat_track_counts;
 extern PGDLLIMPORT int pgstat_track_functions;
 extern PGDLLIMPORT int pgstat_fetch_consistency;
+extern PGDLLIMPORT bool pgstat_track_vacuum_statistics;
 
 
 /*
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 7161f5c6ad6..9e080747a92 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -230,6 +230,7 @@ extern int	geterrlevel(void);
 extern int	geterrposition(void);
 extern int	getinternalerrposition(void);
 
+extern int	geterrelevel(void);
 
 /*----------
  * Old-style error reporting API: to be used in this way:
diff --git a/src/test/isolation/expected/vacuum-extending-in-repetable-read.out b/src/test/isolation/expected/vacuum-extending-in-repetable-read.out
new file mode 100644
index 00000000000..87f7e40b4a6
--- /dev/null
+++ b/src/test/isolation/expected/vacuum-extending-in-repetable-read.out
@@ -0,0 +1,53 @@
+unused step name: s2_delete
+Parsed test spec with 2 sessions
+
+starting permutation: s2_insert s2_print_vacuum_stats_table s1_begin_repeatable_read s2_update s2_insert_interrupt s2_vacuum s2_print_vacuum_stats_table s1_commit s2_checkpoint s2_vacuum s2_print_vacuum_stats_table
+step s2_insert: INSERT INTO test_vacuum_stat_isolation(id, ival) SELECT ival, ival%10 FROM generate_series(1,1000) As ival;
+step s2_print_vacuum_stats_table: 
+    SELECT
+    vt.relname, vt.tuples_deleted, vt.recently_dead_tuples, vt.missed_dead_tuples, vt.missed_dead_pages, vt.tuples_frozen
+    FROM pg_stat_vacuum_tables vt, pg_class c
+    WHERE vt.relname = 'test_vacuum_stat_isolation' AND vt.relid = c.oid;
+
+relname                   |tuples_deleted|recently_dead_tuples|missed_dead_tuples|missed_dead_pages|tuples_frozen
+--------------------------+--------------+--------------------+------------------+-----------------+-------------
+test_vacuum_stat_isolation|             0|                   0|                 0|                0|            0
+(1 row)
+
+step s1_begin_repeatable_read: 
+  BEGIN transaction ISOLATION LEVEL REPEATABLE READ;
+  select count(ival) from test_vacuum_stat_isolation where id>900;
+
+count
+-----
+  100
+(1 row)
+
+step s2_update: UPDATE test_vacuum_stat_isolation SET ival = ival + 2 where id > 900;
+step s2_insert_interrupt: INSERT INTO test_vacuum_stat_isolation values (1,1);
+step s2_vacuum: VACUUM test_vacuum_stat_isolation;
+step s2_print_vacuum_stats_table: 
+    SELECT
+    vt.relname, vt.tuples_deleted, vt.recently_dead_tuples, vt.missed_dead_tuples, vt.missed_dead_pages, vt.tuples_frozen
+    FROM pg_stat_vacuum_tables vt, pg_class c
+    WHERE vt.relname = 'test_vacuum_stat_isolation' AND vt.relid = c.oid;
+
+relname                   |tuples_deleted|recently_dead_tuples|missed_dead_tuples|missed_dead_pages|tuples_frozen
+--------------------------+--------------+--------------------+------------------+-----------------+-------------
+test_vacuum_stat_isolation|             0|                 100|                 0|                0|            0
+(1 row)
+
+step s1_commit: COMMIT;
+step s2_checkpoint: CHECKPOINT;
+step s2_vacuum: VACUUM test_vacuum_stat_isolation;
+step s2_print_vacuum_stats_table: 
+    SELECT
+    vt.relname, vt.tuples_deleted, vt.recently_dead_tuples, vt.missed_dead_tuples, vt.missed_dead_pages, vt.tuples_frozen
+    FROM pg_stat_vacuum_tables vt, pg_class c
+    WHERE vt.relname = 'test_vacuum_stat_isolation' AND vt.relid = c.oid;
+
+relname                   |tuples_deleted|recently_dead_tuples|missed_dead_tuples|missed_dead_pages|tuples_frozen
+--------------------------+--------------+--------------------+------------------+-----------------+-------------
+test_vacuum_stat_isolation|           100|                 100|                 0|                0|          101
+(1 row)
+
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 143109aa4da..e93dd4f626c 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -95,6 +95,7 @@ test: timeouts
 test: vacuum-concurrent-drop
 test: vacuum-conflict
 test: vacuum-skip-locked
+test: vacuum-extending-in-repetable-read
 test: stats
 test: horizons
 test: predicate-hash
diff --git a/src/test/isolation/specs/vacuum-extending-in-repetable-read.spec b/src/test/isolation/specs/vacuum-extending-in-repetable-read.spec
new file mode 100644
index 00000000000..5893d89573d
--- /dev/null
+++ b/src/test/isolation/specs/vacuum-extending-in-repetable-read.spec
@@ -0,0 +1,53 @@
+# Test for checking recently_dead_tuples, tuples_deleted and frozen tuples in pg_stat_vacuum_tables.
+# recently_dead_tuples values are counted when vacuum hasn't cleared tuples because they were deleted recently.
+# recently_dead_tuples aren't increased after releasing lock compared with tuples_deleted, which increased
+# by the value of the cleared tuples that the vacuum managed to clear.
+
+setup
+{
+    CREATE TABLE test_vacuum_stat_isolation(id int, ival int) WITH (autovacuum_enabled = off);
+    SET track_io_timing = on;
+    SET track_vacuum_statistics TO 'on';
+}
+
+teardown
+{
+    DROP TABLE test_vacuum_stat_isolation CASCADE;
+    RESET track_io_timing;
+    RESET track_vacuum_statistics;
+}
+
+session s1
+step s1_begin_repeatable_read   {
+  BEGIN transaction ISOLATION LEVEL REPEATABLE READ;
+  select count(ival) from test_vacuum_stat_isolation where id>900;
+  }
+step s1_commit                  { COMMIT; }
+
+session s2
+step s2_insert                  { INSERT INTO test_vacuum_stat_isolation(id, ival) SELECT ival, ival%10 FROM generate_series(1,1000) As ival; }
+step s2_update                  { UPDATE test_vacuum_stat_isolation SET ival = ival + 2 where id > 900; }
+step s2_delete                  { DELETE FROM test_vacuum_stat_isolation where id > 900; }
+step s2_insert_interrupt        { INSERT INTO test_vacuum_stat_isolation values (1,1); }
+step s2_vacuum                  { VACUUM test_vacuum_stat_isolation; }
+step s2_checkpoint              { CHECKPOINT; }
+step s2_print_vacuum_stats_table
+{
+    SELECT
+    vt.relname, vt.tuples_deleted, vt.recently_dead_tuples, vt.missed_dead_tuples, vt.missed_dead_pages, vt.tuples_frozen
+    FROM pg_stat_vacuum_tables vt, pg_class c
+    WHERE vt.relname = 'test_vacuum_stat_isolation' AND vt.relid = c.oid;
+}
+
+permutation
+    s2_insert
+    s2_print_vacuum_stats_table
+    s1_begin_repeatable_read
+    s2_update
+    s2_insert_interrupt
+    s2_vacuum
+    s2_print_vacuum_stats_table
+    s1_commit
+    s2_checkpoint
+    s2_vacuum
+    s2_print_vacuum_stats_table
\ No newline at end of file
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 5baba8d39ff..7cc94fbf275 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1808,7 +1808,9 @@ pg_stat_all_tables| SELECT c.oid AS relid,
     pg_stat_get_total_vacuum_time(c.oid) AS total_vacuum_time,
     pg_stat_get_total_autovacuum_time(c.oid) AS total_autovacuum_time,
     pg_stat_get_total_analyze_time(c.oid) AS total_analyze_time,
-    pg_stat_get_total_autoanalyze_time(c.oid) AS total_autoanalyze_time
+    pg_stat_get_total_autoanalyze_time(c.oid) AS total_autoanalyze_time,
+    pg_stat_get_rev_all_frozen_pages(c.oid) AS rev_all_frozen_pages,
+    pg_stat_get_rev_all_visible_pages(c.oid) AS rev_all_visible_pages
    FROM ((pg_class c
      LEFT JOIN pg_index i ON ((c.oid = i.indrelid)))
      LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))
@@ -2200,7 +2202,9 @@ pg_stat_sys_tables| SELECT relid,
     total_vacuum_time,
     total_autovacuum_time,
     total_analyze_time,
-    total_autoanalyze_time
+    total_autoanalyze_time,
+    rev_all_frozen_pages,
+    rev_all_visible_pages
    FROM pg_stat_all_tables
   WHERE ((schemaname = ANY (ARRAY['pg_catalog'::name, 'information_schema'::name])) OR (schemaname ~ '^pg_toast'::text));
 pg_stat_user_functions| SELECT p.oid AS funcid,
@@ -2252,9 +2256,43 @@ pg_stat_user_tables| SELECT relid,
     total_vacuum_time,
     total_autovacuum_time,
     total_analyze_time,
-    total_autoanalyze_time
+    total_autoanalyze_time,
+    rev_all_frozen_pages,
+    rev_all_visible_pages
    FROM pg_stat_all_tables
   WHERE ((schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (schemaname !~ '^pg_toast'::text));
+pg_stat_vacuum_tables| SELECT ns.nspname AS schemaname,
+    rel.relname,
+    stats.relid,
+    stats.total_blks_read,
+    stats.total_blks_hit,
+    stats.total_blks_dirtied,
+    stats.total_blks_written,
+    stats.rel_blks_read,
+    stats.rel_blks_hit,
+    stats.pages_scanned,
+    stats.pages_removed,
+    stats.vm_new_frozen_pages,
+    stats.vm_new_visible_pages,
+    stats.vm_new_visible_frozen_pages,
+    stats.missed_dead_pages,
+    stats.tuples_deleted,
+    stats.tuples_frozen,
+    stats.recently_dead_tuples,
+    stats.missed_dead_tuples,
+    stats.wraparound_failsafe,
+    stats.index_vacuum_count,
+    stats.wal_records,
+    stats.wal_fpi,
+    stats.wal_bytes,
+    stats.blk_read_time,
+    stats.blk_write_time,
+    stats.delay_time,
+    stats.total_time
+   FROM (pg_class rel
+     JOIN pg_namespace ns ON ((ns.oid = rel.relnamespace))),
+    LATERAL pg_stat_get_vacuum_tables(rel.oid) stats(relid, total_blks_read, total_blks_hit, total_blks_dirtied, total_blks_written, rel_blks_read, rel_blks_hit, pages_scanned, pages_removed, vm_new_frozen_pages, vm_new_visible_pages, vm_new_visible_frozen_pages, missed_dead_pages, tuples_deleted, tuples_frozen, recently_dead_tuples, missed_dead_tuples, wraparound_failsafe, index_vacuum_count, wal_records, wal_fpi, wal_bytes, blk_read_time, blk_write_time, delay_time, total_time)
+  WHERE (rel.relkind = 'r'::"char");
 pg_stat_wal| SELECT wal_records,
     wal_fpi,
     wal_bytes,
diff --git a/src/test/regress/expected/vacuum_tables_statistics.out b/src/test/regress/expected/vacuum_tables_statistics.out
new file mode 100644
index 00000000000..b5ea9c9ab1e
--- /dev/null
+++ b/src/test/regress/expected/vacuum_tables_statistics.out
@@ -0,0 +1,227 @@
+--
+-- Test cumulative vacuum stats system
+--
+-- Check the wall statistics collected during vacuum operation:
+-- number of frozen and visible pages set by vacuum;
+-- number of frozen and visible pages removed by backend.
+-- Statistic wal_fpi is not displayed in this test because its behavior is unstable.
+--
+-- conditio sine qua non
+SHOW track_counts;  -- must be on
+ track_counts 
+--------------
+ on
+(1 row)
+
+\set sample_size 10000
+-- not enabled by default, but we want to test it...
+SET track_functions TO 'all';
+-- Test that vacuum statistics will be empty when parameter is off.
+SET track_vacuum_statistics TO 'off';
+CREATE TABLE vestat (x int) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+DELETE FROM vestat WHERE x % 2 = 0;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+-- Must be empty.
+SELECT relname,total_blks_read, total_blks_hit, total_blks_dirtied, total_blks_written,rel_blks_read, rel_blks_hit,
+pages_scanned, pages_removed, vm_new_frozen_pages, vm_new_visible_pages, vm_new_visible_frozen_pages, missed_dead_pages,
+tuples_deleted, tuples_frozen, recently_dead_tuples, missed_dead_tuples, index_vacuum_count,
+wal_records, wal_fpi, wal_bytes, blk_read_time, blk_write_time,delay_time, total_time
+FROM pg_stat_vacuum_tables vt
+WHERE vt.relname = 'vestat';
+ relname | total_blks_read | total_blks_hit | total_blks_dirtied | total_blks_written | rel_blks_read | rel_blks_hit | pages_scanned | pages_removed | vm_new_frozen_pages | vm_new_visible_pages | vm_new_visible_frozen_pages | missed_dead_pages | tuples_deleted | tuples_frozen | recently_dead_tuples | missed_dead_tuples | index_vacuum_count | wal_records | wal_fpi | wal_bytes | blk_read_time | blk_write_time | delay_time | total_time 
+---------+-----------------+----------------+--------------------+--------------------+---------------+--------------+---------------+---------------+---------------------+----------------------+-----------------------------+-------------------+----------------+---------------+----------------------+--------------------+--------------------+-------------+---------+-----------+---------------+----------------+------------+------------
+ vestat  |               0 |              0 |                  0 |                  0 |             0 |            0 |             0 |             0 |                   0 |                    0 |                           0 |                 0 |              0 |             0 |                    0 |                  0 |                  0 |           0 |       0 |         0 |             0 |              0 |          0 |          0
+(1 row)
+
+RESET track_vacuum_statistics;
+DROP TABLE vestat CASCADE;
+SHOW track_vacuum_statistics;  -- must be on
+ track_vacuum_statistics 
+-------------------------
+ on
+(1 row)
+
+-- ensure pending stats are flushed
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+--SET stats_fetch_consistency = snapshot;
+CREATE TABLE vestat (x int) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+SELECT oid AS roid from pg_class where relname = 'vestat' \gset
+DELETE FROM vestat WHERE x % 2 = 0;
+-- Before the first vacuum execution extended stats view is empty.
+SELECT vt.relname,vm_new_frozen_pages,tuples_deleted,relpages,pages_scanned,pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+ relname | vm_new_frozen_pages | tuples_deleted | relpages | pages_scanned | pages_removed 
+---------+---------------------+----------------+----------+---------------+---------------
+ vestat  |                   0 |              0 |      455 |             0 |             0
+(1 row)
+
+SELECT relpages AS rp
+FROM pg_class c
+WHERE relname = 'vestat' \gset
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP OFF) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+-- The table and index extended vacuum statistics should show us that
+-- vacuum frozed pages and clean up pages, but pages_removed stayed the same
+-- because of not full table have cleaned up
+SELECT vt.relname,vm_new_frozen_pages > 0 AS vm_new_frozen_pages,tuples_deleted > 0 AS tuples_deleted,relpages-:rp = 0 AS relpages,pages_scanned > 0 AS pages_scanned,pages_removed = 0 AS pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+ relname | vm_new_frozen_pages | tuples_deleted | relpages | pages_scanned | pages_removed 
+---------+---------------------+----------------+----------+---------------+---------------
+ vestat  | f                   | t              | t        | t             | t
+(1 row)
+
+SELECT vm_new_frozen_pages AS fp,tuples_deleted AS td,relpages AS rp, pages_scanned AS ps, pages_removed AS pr
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid \gset
+-- Store WAL advances into variables
+SELECT wal_records AS hwr,wal_bytes AS hwb,wal_fpi AS hfpi FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+-- Look into WAL records deltas.
+SELECT wal_records > 0 AS dWR, wal_bytes > 0 AS dWB
+FROM pg_stat_vacuum_tables WHERE relname = 'vestat';
+ dwr | dwb 
+-----+-----
+ t   | t
+(1 row)
+
+DELETE FROM vestat;;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP OFF) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+-- pages_removed must be increased
+SELECT vt.relname,vm_new_frozen_pages-:fp > 0 AS vm_new_frozen_pages,tuples_deleted-:td > 0 AS tuples_deleted,relpages -:rp = 0 AS relpages,pages_scanned-:ps > 0 AS pages_scanned,pages_removed-:pr > 0 AS pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+ relname | vm_new_frozen_pages | tuples_deleted | relpages | pages_scanned | pages_removed 
+---------+---------------------+----------------+----------+---------------+---------------
+ vestat  | f                   | t              | f        | t             | t
+(1 row)
+
+SELECT vm_new_frozen_pages AS fp,tuples_deleted AS td,relpages AS rp, pages_scanned AS ps, pages_removed AS pr
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid \gset
+-- Store WAL advances into variables
+SELECT wal_records-:hwr AS dwr, wal_bytes-:hwb AS dwb, wal_fpi-:hfpi AS dfpi
+FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+-- WAL advance should be detected.
+SELECT :dwr > 0 AS dWR, :dwb > 0 AS dWB;
+ dwr | dwb 
+-----+-----
+ t   | t
+(1 row)
+
+-- Store WAL advances into variables
+SELECT wal_records AS hwr,wal_bytes AS hwb,wal_fpi AS hfpi FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+DELETE FROM vestat WHERE x % 2 = 0;
+-- VACUUM FULL doesn't report to stat collector. So, no any advancements of statistics
+-- are detected here.
+VACUUM FULL vestat;
+-- It is necessary to check the wal statistics
+CHECKPOINT;
+-- Store WAL advances into variables
+SELECT wal_records-:hwr AS dwr2, wal_bytes-:hwb AS dwb2, wal_fpi-:hfpi AS dfpi2
+FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+-- WAL and other statistics advance should not be detected.
+SELECT :dwr2=0 AS dWR, :dfpi2=0 AS dFPI, :dwb2=0 AS dWB;
+ dwr | dfpi | dwb 
+-----+------+-----
+ t   | t    | t
+(1 row)
+
+SELECT vt.relname,vm_new_frozen_pages-:fp = 0 AS vm_new_frozen_pages,tuples_deleted-:td = 0 AS tuples_deleted,relpages -:rp < 0 AS relpages,pages_scanned-:ps = 0 AS pages_scanned,pages_removed-:pr = 0 AS pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+ relname | vm_new_frozen_pages | tuples_deleted | relpages | pages_scanned | pages_removed 
+---------+---------------------+----------------+----------+---------------+---------------
+ vestat  | t                   | t              | f        | t             | t
+(1 row)
+
+SELECT vm_new_frozen_pages AS fp,tuples_deleted AS td,relpages AS rp, pages_scanned AS ps,pages_removed AS pr
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid \gset
+-- Store WAL advances into variables
+SELECT wal_records AS hwr,wal_bytes AS hwb,wal_fpi AS hfpi FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+DELETE FROM vestat;
+TRUNCATE vestat;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP OFF) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+-- Store WAL advances into variables after removing all tuples from the table
+SELECT wal_records-:hwr AS dwr3, wal_bytes-:hwb AS dwb3, wal_fpi-:hfpi AS dfpi3
+FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+--There are nothing changed
+SELECT :dwr3>0 AS dWR, :dfpi3=0 AS dFPI, :dwb3>0 AS dWB;
+ dwr | dfpi | dwb 
+-----+------+-----
+ t   | t    | t
+(1 row)
+
+--
+-- Now, the table and index is compressed into zero number of pages. Check it
+-- in vacuum extended statistics.
+-- The vm_new_frozen_pages, pages_scanned values shouldn't be changed
+--
+SELECT vt.relname,vm_new_frozen_pages-:fp = 0 AS vm_new_frozen_pages,tuples_deleted-:td = 0 AS tuples_deleted,relpages -:rp = 0 AS relpages,pages_scanned-:ps = 0 AS pages_scanned,pages_removed-:pr = 0 AS pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+ relname | vm_new_frozen_pages | tuples_deleted | relpages | pages_scanned | pages_removed 
+---------+---------------------+----------------+----------+---------------+---------------
+ vestat  | t                   | t              | f        | t             | t
+(1 row)
+
+DROP TABLE vestat CASCADE;
+CREATE TABLE vestat (x int) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+-- must be empty
+SELECT vm_new_frozen_pages, vm_new_visible_pages, rev_all_frozen_pages,rev_all_visible_pages,vm_new_visible_frozen_pages
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid;
+ vm_new_frozen_pages | vm_new_visible_pages | rev_all_frozen_pages | rev_all_visible_pages | vm_new_visible_frozen_pages 
+---------------------+----------------------+----------------------+-----------------------+-----------------------------
+                   0 |                    0 |                    0 |                     0 |                           0
+(1 row)
+
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+-- backend defreezed pages
+SELECT vm_new_frozen_pages > 0 AS vm_new_frozen_pages,vm_new_visible_pages > 0 AS vm_new_visible_pages,vm_new_visible_frozen_pages > 0 AS vm_new_visible_frozen_pages,rev_all_frozen_pages = 0 AS rev_all_frozen_pages,rev_all_visible_pages = 0 AS rev_all_visible_pages
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid;
+ vm_new_frozen_pages | vm_new_visible_pages | vm_new_visible_frozen_pages | rev_all_frozen_pages | rev_all_visible_pages 
+---------------------+----------------------+-----------------------------+----------------------+-----------------------
+ f                   | t                    | f                           | t                    | t
+(1 row)
+
+SELECT vm_new_frozen_pages AS pf, vm_new_visible_pages AS pv,vm_new_visible_frozen_pages AS pvf, rev_all_frozen_pages AS hafp,rev_all_visible_pages AS havp
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid \gset
+UPDATE vestat SET x = x + 1001;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+SELECT vm_new_frozen_pages > :pf AS vm_new_frozen_pages,vm_new_visible_pages > :pv AS vm_new_visible_pages,vm_new_visible_frozen_pages > :pvf AS vm_new_visible_frozen_pages,rev_all_frozen_pages > :hafp AS rev_all_frozen_pages,rev_all_visible_pages > :havp AS rev_all_visible_pages
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid;
+ vm_new_frozen_pages | vm_new_visible_pages | vm_new_visible_frozen_pages | rev_all_frozen_pages | rev_all_visible_pages 
+---------------------+----------------------+-----------------------------+----------------------+-----------------------
+ f                   | t                    | f                           | f                    | f
+(1 row)
+
+SELECT vm_new_frozen_pages AS pf, vm_new_visible_pages AS pv, vm_new_visible_frozen_pages AS pvf, rev_all_frozen_pages AS hafp,rev_all_visible_pages AS havp
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid \gset
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+-- vacuum freezed pages
+SELECT vm_new_frozen_pages = :pf AS vm_new_frozen_pages,vm_new_visible_pages = :pv AS vm_new_visible_pages,vm_new_visible_frozen_pages = :pvf AS vm_new_visible_frozen_pages, rev_all_frozen_pages = :hafp AS rev_all_frozen_pages,rev_all_visible_pages = :havp AS rev_all_visible_pages
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid;
+ vm_new_frozen_pages | vm_new_visible_pages | vm_new_visible_frozen_pages | rev_all_frozen_pages | rev_all_visible_pages 
+---------------------+----------------------+-----------------------------+----------------------+-----------------------
+ t                   | t                    | t                           | t                    | t
+(1 row)
+
+DROP TABLE vestat CASCADE;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index e63ee2cf2bb..6493e9b4681 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -136,3 +136,8 @@ test: fast_default
 # run tablespace test at the end because it drops the tablespace created during
 # setup that other tests may use.
 test: tablespace
+
+# ----------
+# Check vacuum statistics
+# ----------
+test: vacuum_tables_statistics
\ No newline at end of file
diff --git a/src/test/regress/sql/vacuum_tables_statistics.sql b/src/test/regress/sql/vacuum_tables_statistics.sql
new file mode 100644
index 00000000000..5bc34bec64b
--- /dev/null
+++ b/src/test/regress/sql/vacuum_tables_statistics.sql
@@ -0,0 +1,183 @@
+--
+-- Test cumulative vacuum stats system
+--
+-- Check the wall statistics collected during vacuum operation:
+-- number of frozen and visible pages set by vacuum;
+-- number of frozen and visible pages removed by backend.
+-- Statistic wal_fpi is not displayed in this test because its behavior is unstable.
+--
+
+-- conditio sine qua non
+SHOW track_counts;  -- must be on
+\set sample_size 10000
+
+-- not enabled by default, but we want to test it...
+SET track_functions TO 'all';
+
+-- Test that vacuum statistics will be empty when parameter is off.
+SET track_vacuum_statistics TO 'off';
+
+CREATE TABLE vestat (x int) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+
+DELETE FROM vestat WHERE x % 2 = 0;
+
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+
+-- Must be empty.
+SELECT relname,total_blks_read, total_blks_hit, total_blks_dirtied, total_blks_written,rel_blks_read, rel_blks_hit,
+pages_scanned, pages_removed, vm_new_frozen_pages, vm_new_visible_pages, vm_new_visible_frozen_pages, missed_dead_pages,
+tuples_deleted, tuples_frozen, recently_dead_tuples, missed_dead_tuples, index_vacuum_count,
+wal_records, wal_fpi, wal_bytes, blk_read_time, blk_write_time,delay_time, total_time
+FROM pg_stat_vacuum_tables vt
+WHERE vt.relname = 'vestat';
+
+RESET track_vacuum_statistics;
+DROP TABLE vestat CASCADE;
+
+SHOW track_vacuum_statistics;  -- must be on
+
+-- ensure pending stats are flushed
+SELECT pg_stat_force_next_flush();
+
+--SET stats_fetch_consistency = snapshot;
+CREATE TABLE vestat (x int) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+
+SELECT oid AS roid from pg_class where relname = 'vestat' \gset
+
+DELETE FROM vestat WHERE x % 2 = 0;
+-- Before the first vacuum execution extended stats view is empty.
+SELECT vt.relname,vm_new_frozen_pages,tuples_deleted,relpages,pages_scanned,pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+SELECT relpages AS rp
+FROM pg_class c
+WHERE relname = 'vestat' \gset
+
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP OFF) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+
+-- The table and index extended vacuum statistics should show us that
+-- vacuum frozed pages and clean up pages, but pages_removed stayed the same
+-- because of not full table have cleaned up
+SELECT vt.relname,vm_new_frozen_pages > 0 AS vm_new_frozen_pages,tuples_deleted > 0 AS tuples_deleted,relpages-:rp = 0 AS relpages,pages_scanned > 0 AS pages_scanned,pages_removed = 0 AS pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+SELECT vm_new_frozen_pages AS fp,tuples_deleted AS td,relpages AS rp, pages_scanned AS ps, pages_removed AS pr
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid \gset
+
+-- Store WAL advances into variables
+SELECT wal_records AS hwr,wal_bytes AS hwb,wal_fpi AS hfpi FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+
+-- Look into WAL records deltas.
+SELECT wal_records > 0 AS dWR, wal_bytes > 0 AS dWB
+FROM pg_stat_vacuum_tables WHERE relname = 'vestat';
+
+DELETE FROM vestat;;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP OFF) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+
+-- pages_removed must be increased
+SELECT vt.relname,vm_new_frozen_pages-:fp > 0 AS vm_new_frozen_pages,tuples_deleted-:td > 0 AS tuples_deleted,relpages -:rp = 0 AS relpages,pages_scanned-:ps > 0 AS pages_scanned,pages_removed-:pr > 0 AS pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+SELECT vm_new_frozen_pages AS fp,tuples_deleted AS td,relpages AS rp, pages_scanned AS ps, pages_removed AS pr
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid \gset
+
+-- Store WAL advances into variables
+SELECT wal_records-:hwr AS dwr, wal_bytes-:hwb AS dwb, wal_fpi-:hfpi AS dfpi
+FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+
+-- WAL advance should be detected.
+SELECT :dwr > 0 AS dWR, :dwb > 0 AS dWB;
+
+-- Store WAL advances into variables
+SELECT wal_records AS hwr,wal_bytes AS hwb,wal_fpi AS hfpi FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+DELETE FROM vestat WHERE x % 2 = 0;
+-- VACUUM FULL doesn't report to stat collector. So, no any advancements of statistics
+-- are detected here.
+VACUUM FULL vestat;
+-- It is necessary to check the wal statistics
+CHECKPOINT;
+
+-- Store WAL advances into variables
+SELECT wal_records-:hwr AS dwr2, wal_bytes-:hwb AS dwb2, wal_fpi-:hfpi AS dfpi2
+FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+
+-- WAL and other statistics advance should not be detected.
+SELECT :dwr2=0 AS dWR, :dfpi2=0 AS dFPI, :dwb2=0 AS dWB;
+
+SELECT vt.relname,vm_new_frozen_pages-:fp = 0 AS vm_new_frozen_pages,tuples_deleted-:td = 0 AS tuples_deleted,relpages -:rp < 0 AS relpages,pages_scanned-:ps = 0 AS pages_scanned,pages_removed-:pr = 0 AS pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+SELECT vm_new_frozen_pages AS fp,tuples_deleted AS td,relpages AS rp, pages_scanned AS ps,pages_removed AS pr
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid \gset
+
+-- Store WAL advances into variables
+SELECT wal_records AS hwr,wal_bytes AS hwb,wal_fpi AS hfpi FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+
+DELETE FROM vestat;
+TRUNCATE vestat;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP OFF) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+
+-- Store WAL advances into variables after removing all tuples from the table
+SELECT wal_records-:hwr AS dwr3, wal_bytes-:hwb AS dwb3, wal_fpi-:hfpi AS dfpi3
+FROM pg_stat_vacuum_tables WHERE relname = 'vestat' \gset
+
+--There are nothing changed
+SELECT :dwr3>0 AS dWR, :dfpi3=0 AS dFPI, :dwb3>0 AS dWB;
+
+--
+-- Now, the table and index is compressed into zero number of pages. Check it
+-- in vacuum extended statistics.
+-- The vm_new_frozen_pages, pages_scanned values shouldn't be changed
+--
+SELECT vt.relname,vm_new_frozen_pages-:fp = 0 AS vm_new_frozen_pages,tuples_deleted-:td = 0 AS tuples_deleted,relpages -:rp = 0 AS relpages,pages_scanned-:ps = 0 AS pages_scanned,pages_removed-:pr = 0 AS pages_removed
+FROM pg_stat_vacuum_tables vt, pg_class c
+WHERE vt.relname = 'vestat' AND vt.relid = c.oid;
+
+DROP TABLE vestat CASCADE;
+CREATE TABLE vestat (x int) WITH (autovacuum_enabled = off, fillfactor = 10);
+
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+
+-- must be empty
+SELECT vm_new_frozen_pages, vm_new_visible_pages, rev_all_frozen_pages,rev_all_visible_pages,vm_new_visible_frozen_pages
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid;
+
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+
+-- backend defreezed pages
+SELECT vm_new_frozen_pages > 0 AS vm_new_frozen_pages,vm_new_visible_pages > 0 AS vm_new_visible_pages,vm_new_visible_frozen_pages > 0 AS vm_new_visible_frozen_pages,rev_all_frozen_pages = 0 AS rev_all_frozen_pages,rev_all_visible_pages = 0 AS rev_all_visible_pages
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid;
+SELECT vm_new_frozen_pages AS pf, vm_new_visible_pages AS pv,vm_new_visible_frozen_pages AS pvf, rev_all_frozen_pages AS hafp,rev_all_visible_pages AS havp
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid \gset
+
+UPDATE vestat SET x = x + 1001;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+
+SELECT vm_new_frozen_pages > :pf AS vm_new_frozen_pages,vm_new_visible_pages > :pv AS vm_new_visible_pages,vm_new_visible_frozen_pages > :pvf AS vm_new_visible_frozen_pages,rev_all_frozen_pages > :hafp AS rev_all_frozen_pages,rev_all_visible_pages > :havp AS rev_all_visible_pages
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid;
+SELECT vm_new_frozen_pages AS pf, vm_new_visible_pages AS pv, vm_new_visible_frozen_pages AS pvf, rev_all_frozen_pages AS hafp,rev_all_visible_pages AS havp
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid \gset
+
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+
+-- vacuum freezed pages
+SELECT vm_new_frozen_pages = :pf AS vm_new_frozen_pages,vm_new_visible_pages = :pv AS vm_new_visible_pages,vm_new_visible_frozen_pages = :pvf AS vm_new_visible_frozen_pages, rev_all_frozen_pages = :hafp AS rev_all_frozen_pages,rev_all_visible_pages = :havp AS rev_all_visible_pages
+FROM pg_stat_vacuum_tables, pg_stat_all_tables WHERE pg_stat_vacuum_tables.relname = 'vestat' and pg_stat_vacuum_tables.relid = pg_stat_all_tables.relid;
+
+DROP TABLE vestat CASCADE;
\ No newline at end of file
-- 
2.34.1



  [text/x-patch] v19-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch (55.8K, 4-v19-0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch)
  download | inline diff:
From eed2ae2928f4082e9678e94d0adb4aa1386f9e66 Mon Sep 17 00:00:00 2001
From: Alena Rybakina <[email protected]>
Date: Mon, 17 Feb 2025 17:28:55 +0300
Subject: [PATCH 3/4] Machinery for grabbing an extended vacuum statistics on 
 databases.

Database vacuum statistics information is the collected general
vacuum statistics indexes and tables owned by the databases, which
they belong to.

In addition to the fact that there are far fewer databases in a system
than relations, vacuum statistics for a database contain fewer statistics
than relations, but they are enough to indicate that something may be
wrong in the system and prompt the administrator to enable extended
monitoring for relations.

So, buffer, wal, statistics of I/O time of read and writen blocks
statistics will be observed because they are collected for both
tables, indexes. In addition, we show the number of errors caught
during operation of the vacuum only for the error level.

wraparound_failsafe_count is a number of times when the vacuum starts
urgent cleanup to prevent wraparound problem which is critical for
the database.

Authors: Alena Rybakina <[email protected]>,
   Andrei Lepikhov <[email protected]>,
   Andrei Zubkov <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>, Masahiko Sawada <[email protected]>,
       Ilia Evdokimov <[email protected]>, jian he <[email protected]>,
       Kirill Reshke <[email protected]>, Alexander Korotkov <[email protected]>,
       Jim Nasby <[email protected]>, Sami Imseih <[email protected]>
---
 src/backend/access/heap/vacuumlazy.c          | 290 ++++++++++++++----
 src/backend/catalog/system_views.sql          |  32 ++
 src/backend/commands/vacuumparallel.c         |  14 +
 src/backend/utils/activity/pgstat.c           |   4 +
 src/backend/utils/activity/pgstat_relation.c  |  48 ++-
 src/backend/utils/adt/pgstatfuncs.c           | 133 +++++++-
 src/backend/utils/misc/guc_tables.c           |   2 +-
 src/include/catalog/pg_proc.dat               |   9 +
 src/include/commands/vacuum.h                 |  25 ++
 src/include/pgstat.h                          |  58 +++-
 .../vacuum-extending-in-repetable-read.out    |   4 +-
 src/test/regress/expected/rules.out           |  22 ++
 .../expected/vacuum_index_statistics.out      | 183 +++++++++++
 src/test/regress/parallel_schedule            |   1 +
 .../regress/sql/vacuum_index_statistics.sql   | 151 +++++++++
 15 files changed, 871 insertions(+), 105 deletions(-)
 create mode 100644 src/test/regress/expected/vacuum_index_statistics.out
 create mode 100644 src/test/regress/sql/vacuum_index_statistics.sql

diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c
index 839bd41be62..a0c68ad2c8d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -411,6 +411,8 @@ typedef struct LVRelState
 	BlockNumber eager_scan_remaining_fails;
 
 	int32		wraparound_failsafe_count; /* the number of times to prevent workaround problem */
+
+	ExtVacReport extVacReport;
 } LVRelState;
 
 
@@ -422,19 +424,6 @@ typedef struct LVSavedErrInfo
 	VacErrPhase phase;
 } LVSavedErrInfo;
 
-/*
- * Counters and usage data for extended stats tracking.
- */
-typedef struct LVExtStatCounters
-{
-	TimestampTz starttime;
-	WalUsage	walusage;
-	BufferUsage bufusage;
-	double		VacuumDelayTime;
-	PgStat_Counter blocks_fetched;
-	PgStat_Counter blocks_hit;
-} LVExtStatCounters;
-
 /* non-export function prototypes */
 static void lazy_scan_heap(LVRelState *vacrel);
 static void heap_vacuum_eager_scan_setup(LVRelState *vacrel,
@@ -556,27 +545,25 @@ extvac_stats_end(Relation rel, LVExtStatCounters *counters,
 	endtime = GetCurrentTimestamp();
 	TimestampDifference(counters->starttime, endtime, &secs, &usecs);
 
-	memset(report, 0, sizeof(ExtVacReport));
-
 	/*
 	 * Fill additional statistics on a vacuum processing operation.
 	 */
-	report->total_blks_read = bufusage.local_blks_read + bufusage.shared_blks_read;
-	report->total_blks_hit = bufusage.local_blks_hit + bufusage.shared_blks_hit;
-	report->total_blks_dirtied = bufusage.local_blks_dirtied + bufusage.shared_blks_dirtied;
-	report->total_blks_written = bufusage.shared_blks_written;
+	report->total_blks_read += bufusage.local_blks_read + bufusage.shared_blks_read;
+	report->total_blks_hit += bufusage.local_blks_hit + bufusage.shared_blks_hit;
+	report->total_blks_dirtied += bufusage.local_blks_dirtied + bufusage.shared_blks_dirtied;
+	report->total_blks_written += bufusage.shared_blks_written;
 
-	report->wal_records = walusage.wal_records;
-	report->wal_fpi = walusage.wal_fpi;
-	report->wal_bytes = walusage.wal_bytes;
+	report->wal_records += walusage.wal_records;
+	report->wal_fpi += walusage.wal_fpi;
+	report->wal_bytes += walusage.wal_bytes;
 
-	report->blk_read_time = INSTR_TIME_GET_MILLISEC(bufusage.local_blk_read_time);
+	report->blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage.local_blk_read_time);
 	report->blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage.shared_blk_read_time);
-	report->blk_write_time = INSTR_TIME_GET_MILLISEC(bufusage.local_blk_write_time);
-	report->blk_write_time = INSTR_TIME_GET_MILLISEC(bufusage.shared_blk_write_time);
-	report->delay_time = VacuumDelayTime - counters->VacuumDelayTime;
+	report->blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage.local_blk_write_time);
+	report->blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage.shared_blk_write_time);
+	report->delay_time += VacuumDelayTime - counters->VacuumDelayTime;
 
-	report->total_time = secs * 1000. + usecs / 1000.;
+	report->total_time += secs * 1000. + usecs / 1000.;
 
 	if (!rel->pgstat_info || !pgstat_track_counts)
 		/*
@@ -585,12 +572,97 @@ extvac_stats_end(Relation rel, LVExtStatCounters *counters,
 		 */
 		return;
 
-	report->blks_fetched =
+	report->blks_fetched +=
 		rel->pgstat_info->counts.blocks_fetched - counters->blocks_fetched;
-	report->blks_hit =
+	report->blks_hit +=
 		rel->pgstat_info->counts.blocks_hit - counters->blocks_hit;
 }
 
+void
+extvac_stats_start_idx(Relation rel, IndexBulkDeleteResult *stats,
+				   LVExtStatCountersIdx *counters)
+{
+	if(!pgstat_track_vacuum_statistics)
+		return;
+
+	/* Set initial values for common heap and index statistics*/
+	extvac_stats_start(rel, &counters->common);
+	counters->pages_deleted = counters->tuples_removed = 0;
+
+	if (stats != NULL)
+	{
+		/*
+		 * XXX: Why do we need this code here? If it is needed, I feel lack of
+		 * comments, describing the reason.
+		 */
+		counters->tuples_removed = stats->tuples_removed;
+		counters->pages_deleted = stats->pages_deleted;
+	}
+}
+
+void
+extvac_stats_end_idx(Relation rel, IndexBulkDeleteResult *stats,
+					 LVExtStatCountersIdx *counters, ExtVacReport *report)
+{
+	memset(report, 0, sizeof(ExtVacReport));
+
+	extvac_stats_end(rel, &counters->common, report);
+	report->type = PGSTAT_EXTVAC_INDEX;
+
+	if (stats != NULL)
+	{
+		/*
+		 * if something goes wrong or an user doesn't want to track a database
+		 * activity - just suppress it.
+		 */
+
+		/* Fill index-specific extended stats fields */
+		report->tuples_deleted =
+							stats->tuples_removed - counters->tuples_removed;
+		report->index.pages_deleted =
+							stats->pages_deleted - counters->pages_deleted;
+	}
+}
+
+/* Accumulate vacuum statistics for heap.
+ *
+  * Because of complexity of vacuum processing: it switch procesing between
+  * the heap relation to index relations and visa versa, we need to store
+  * gathered statistics information for heap relations several times before
+  * the vacuum starts processing the indexes again.
+  *
+  * It is necessary to gather correct statistics information for heap and indexes
+  * otherwice the index statistics information would be added to his parent heap
+  * statistics information and it would be difficult to analyze it later.
+  *
+ * We can't subtract union vacuum statistics information for index from the heap relations
+  * because of total and delay time time statistics collecting during parallel vacuum
+  * procudure.
+*/
+static void
+accumulate_heap_vacuum_statistics(LVExtStatCounters *extVacCounters, LVRelState *vacrel)
+{
+	if (!pgstat_track_vacuum_statistics)
+		return;
+
+	/* Fill heap-specific extended stats fields */
+	vacrel->extVacReport.type = PGSTAT_EXTVAC_TABLE;
+	vacrel->extVacReport.table.pages_scanned += vacrel->scanned_pages;
+	vacrel->extVacReport.table.pages_removed += vacrel->removed_pages;
+	vacrel->extVacReport.table.vm_new_frozen_pages += vacrel->vm_new_frozen_pages;
+	vacrel->extVacReport.table.vm_new_visible_pages += vacrel->vm_new_visible_pages;
+	vacrel->extVacReport.table.vm_new_visible_frozen_pages += vacrel->vm_new_visible_frozen_pages;
+	vacrel->extVacReport.tuples_deleted += vacrel->tuples_deleted;
+	vacrel->extVacReport.table.tuples_frozen += vacrel->tuples_frozen;
+	vacrel->extVacReport.table.recently_dead_tuples += vacrel->recently_dead_tuples;
+	vacrel->extVacReport.table.recently_dead_tuples += vacrel->recently_dead_tuples;
+	vacrel->extVacReport.table.missed_dead_tuples += vacrel->missed_dead_tuples;
+	vacrel->extVacReport.table.missed_dead_pages += vacrel->missed_dead_pages;
+	vacrel->extVacReport.table.index_vacuum_count += vacrel->num_index_scans;
+	vacrel->extVacReport.table.wraparound_failsafe_count += vacrel->wraparound_failsafe_count;
+}
+
+
 /*
  * Helper to set up the eager scanning state for vacuuming a single relation.
  * Initializes the eager scan management related members of the LVRelState.
@@ -746,13 +818,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	BufferUsage startbufferusage = pgBufferUsage;
 	ErrorContextCallback errcallback;
 	LVExtStatCounters extVacCounters;
-	ExtVacReport extVacReport;
 	char	  **indnames = NULL;
-	ExtVacReport allzero;
-
-	/* Initialize vacuum statistics */
-	memset(&allzero, 0, sizeof(ExtVacReport));
-	extVacReport = allzero;
 
 	verbose = (params->options & VACOPT_VERBOSE) != 0;
 	instrument = (verbose || (AmAutoVacuumWorkerProcess() &&
@@ -772,7 +838,6 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 
 	pgstat_progress_start_command(PROGRESS_COMMAND_VACUUM,
 								  RelationGetRelid(rel));
-	extvac_stats_start(rel, &extVacCounters);
 	/*
 	 * Setup error traceback support for ereport() first.  The idea is to set
 	 * up an error context callback to display additional information on any
@@ -959,6 +1024,8 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	 */
 	lazy_scan_heap(vacrel);
 
+	extvac_stats_start(rel, &extVacCounters);
+
 	/*
 	 * Free resources managed by dead_items_alloc.  This ends parallel mode in
 	 * passing when necessary.
@@ -1037,26 +1104,6 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 						vacrel->NewRelfrozenXid, vacrel->NewRelminMxid,
 						&frozenxid_updated, &minmulti_updated, false);
 
-	/* Make generic extended vacuum stats report */
-	extvac_stats_end(rel, &extVacCounters, &extVacReport);
-
-	if(pgstat_track_vacuum_statistics)
-	{
-		/* Fill heap-specific extended stats fields */
-		extVacReport.pages_scanned = vacrel->scanned_pages;
-		extVacReport.pages_removed = vacrel->removed_pages;
-		extVacReport.vm_new_frozen_pages = vacrel->vm_new_frozen_pages;
-		extVacReport.vm_new_visible_pages = vacrel->vm_new_visible_pages;
-		extVacReport.vm_new_visible_frozen_pages = vacrel->vm_new_visible_frozen_pages;
-		extVacReport.tuples_deleted = vacrel->tuples_deleted;
-		extVacReport.tuples_frozen = vacrel->tuples_frozen;
-		extVacReport.recently_dead_tuples = vacrel->recently_dead_tuples;
-		extVacReport.missed_dead_tuples = vacrel->missed_dead_tuples;
-		extVacReport.missed_dead_pages = vacrel->missed_dead_pages;
-		extVacReport.index_vacuum_count = vacrel->num_index_scans;
-		extVacReport.wraparound_failsafe_count = vacrel->wraparound_failsafe_count;
-	}
-
 	/*
 	 * Report results to the cumulative stats system, too.
 	 *
@@ -1066,14 +1113,37 @@ heap_vacuum_rel(Relation rel, VacuumParams *params,
 	 * It seems like a good idea to err on the side of not vacuuming again too
 	 * soon in cases where the failsafe prevented significant amounts of heap
 	 * vacuuming.
+	 *
+	 * We are ready to send vacuum statistics information for heap relations.
 	 */
-	pgstat_report_vacuum(RelationGetRelid(rel),
+	if(pgstat_track_vacuum_statistics)
+	{
+		/* Make generic extended vacuum stats report and
+		 * fill heap-specific extended stats fields.
+		 */
+		extvac_stats_end(vacrel->rel, &extVacCounters, &(vacrel->extVacReport));
+		accumulate_heap_vacuum_statistics(&extVacCounters, vacrel);
+
+		pgstat_report_vacuum(RelationGetRelid(rel),
 						 rel->rd_rel->relisshared,
 						 Max(vacrel->new_live_tuples, 0),
 						 vacrel->recently_dead_tuples +
-						 vacrel->missed_dead_tuples,
+ 						 vacrel->missed_dead_tuples,
 						 starttime,
-						 &extVacReport);
+						 &(vacrel->extVacReport));
+
+	}
+	else
+	{
+		pgstat_report_vacuum(RelationGetRelid(rel),
+							 rel->rd_rel->relisshared,
+							 Max(vacrel->new_live_tuples, 0),
+							 vacrel->recently_dead_tuples +
+							 vacrel->missed_dead_tuples,
+							 starttime,
+							 NULL);
+	}
+
 	pgstat_progress_end_command();
 
 	if (instrument)
@@ -1346,6 +1416,7 @@ lazy_scan_heap(LVRelState *vacrel)
 		PROGRESS_VACUUM_MAX_DEAD_TUPLE_BYTES
 	};
 	int64		initprog_val[3];
+	LVExtStatCounters extVacCounters;
 
 	/* Report that we're scanning the heap, advertising total # of blocks */
 	initprog_val[0] = PROGRESS_VACUUM_PHASE_SCAN_HEAP;
@@ -1369,6 +1440,13 @@ lazy_scan_heap(LVRelState *vacrel)
 										vacrel,
 										sizeof(uint8));
 
+	/*
+	 * Due to the fact that vacuum heap processing needs their index vacuuming
+	 * we need to track them separately and accumulate heap vacuum statistics
+	 * separately. So last processes are related to only heap vacuuming process.
+	 */
+	extvac_stats_start(vacrel->rel, &extVacCounters);
+
 	while (true)
 	{
 		Buffer		buf;
@@ -1415,8 +1493,25 @@ lazy_scan_heap(LVRelState *vacrel)
 
 			/* Perform a round of index and heap vacuuming */
 			vacrel->consider_bypass_optimization = false;
+
+			/*
+			 * Lazy vacuum stage includes index vacuuming and cleaning up stage, so
+			 * we prefer tracking them separately.
+			 * Before starting to process the indexes save the current heap statistics
+			*/
+			extvac_stats_end(vacrel->rel, &extVacCounters, &(vacrel->extVacReport));
+			accumulate_heap_vacuum_statistics(&extVacCounters, vacrel);
 			lazy_vacuum(vacrel);
 
+			/*
+			 * After completion lazy vacuum, we start again tracking vacuum statistics for
+			 * heap-related objects like FSM, VM, provide heap prunning.
+			 * It seems dangerously that we have start tracking but there are no end, but
+			 * it is safe. The end tracking is located before lazy vacuum stage in the same
+			 * loop or after it.
+ 			*/
+			extvac_stats_start(vacrel->rel, &extVacCounters);
+
 			/*
 			 * Vacuum the Free Space Map to make newly-freed space visible on
 			 * upper-level FSM pages. Note that blkno is the previously
@@ -1639,6 +1734,13 @@ lazy_scan_heap(LVRelState *vacrel)
 
 	read_stream_end(stream);
 
+	/*
+	 * Vacuum can process lazy vacuum again and we save heap statistics now
+	 * just in case in tend to avoid collecting vacuum index statistics again.
+	 */
+	extvac_stats_end(vacrel->rel, &extVacCounters, &(vacrel->extVacReport));
+	accumulate_heap_vacuum_statistics(&extVacCounters, vacrel);
+
 	/*
 	 * Do index vacuuming (call each index's ambulkdelete routine), then do
 	 * related heap vacuuming
@@ -1646,6 +1748,12 @@ lazy_scan_heap(LVRelState *vacrel)
 	if (vacrel->dead_items_info->num_items > 0)
 		lazy_vacuum(vacrel);
 
+	/*
+	 * We need to take into account heap vacuum statistics during process of
+	 * FSM.
+ 	 */
+	extvac_stats_start(vacrel->rel, &extVacCounters);
+
 	/*
 	 * Vacuum the remainder of the Free Space Map.  We must do this whether or
 	 * not there were indexes, and whether or not we bypassed index vacuuming.
@@ -1658,6 +1766,10 @@ lazy_scan_heap(LVRelState *vacrel)
 	/* report all blocks vacuumed */
 	pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, rel_pages);
 
+	/* Before starting final index clan up stage save heap statistics */
+	extvac_stats_end(vacrel->rel, &extVacCounters, &(vacrel->extVacReport));
+	accumulate_heap_vacuum_statistics(&extVacCounters, vacrel);
+
 	/* Do final index cleanup (call each index's amvacuumcleanup routine) */
 	if (vacrel->nindexes > 0 && vacrel->do_index_cleanup)
 		lazy_cleanup_all_indexes(vacrel);
@@ -2575,6 +2687,7 @@ static void
 lazy_vacuum(LVRelState *vacrel)
 {
 	bool		bypass;
+	LVExtStatCounters extVacCounters;
 
 	/* Should not end up here with no indexes */
 	Assert(vacrel->nindexes > 0);
@@ -2587,6 +2700,9 @@ lazy_vacuum(LVRelState *vacrel)
 		return;
 	}
 
+	/* Set initial statistics values to gather vacuum statistics for the heap */
+	extvac_stats_start(vacrel->rel, &extVacCounters);
+
 	/*
 	 * Consider bypassing index vacuuming (and heap vacuuming) entirely.
 	 *
@@ -2643,6 +2759,14 @@ lazy_vacuum(LVRelState *vacrel)
 				  TidStoreMemoryUsage(vacrel->dead_items) < 32 * 1024 * 1024);
 	}
 
+	/*
+	 * Vacuum is likely to vacuum indexes again, so save vacuum statistics for
+	 * heap relations now.
+	 * The vacuum process below doesn't contain any useful statistics information
+	 * for heap if indexes won't be processed, but we will track them separately.
+	 */
+	extvac_stats_end(vacrel->rel, &extVacCounters, &(vacrel->extVacReport));
+
 	if (bypass)
 	{
 		/*
@@ -2659,11 +2783,21 @@ lazy_vacuum(LVRelState *vacrel)
 	}
 	else if (lazy_vacuum_all_indexes(vacrel))
 	{
-		/*
-		 * We successfully completed a round of index vacuuming.  Do related
-		 * heap vacuuming now.
-		 */
-		lazy_vacuum_heap_rel(vacrel);
+		/* Now the vacuum is going to process heap relation, so
+		 * we need to set intial statistic values for tracking.
+		*/
+
+		/* Set initial statistics values to gather vacuum statistics for the heap */
+		extvac_stats_start(vacrel->rel, &extVacCounters);
+
+ 		/*
+ 		 * We successfully completed a round of index vacuuming.  Do related
+ 		 * heap vacuuming now.
+ 		 */
+ 		lazy_vacuum_heap_rel(vacrel);
+
+		extvac_stats_end(vacrel->rel, &extVacCounters, &(vacrel->extVacReport));
+		accumulate_heap_vacuum_statistics(&extVacCounters, vacrel);
 	}
 	else
 	{
@@ -3200,6 +3334,11 @@ lazy_vacuum_one_index(Relation indrel, IndexBulkDeleteResult *istat,
 {
 	IndexVacuumInfo ivinfo;
 	LVSavedErrInfo saved_err_info;
+	LVExtStatCountersIdx extVacCounters;
+	ExtVacReport extVacReport;
+
+	/* Set initial statistics values to gather vacuum statistics for the index */
+	extvac_stats_start_idx(indrel, istat, &extVacCounters);
 
 	ivinfo.index = indrel;
 	ivinfo.heaprel = vacrel->rel;
@@ -3226,6 +3365,15 @@ lazy_vacuum_one_index(Relation indrel, IndexBulkDeleteResult *istat,
 	istat = vac_bulkdel_one_index(&ivinfo, istat, vacrel->dead_items,
 								  vacrel->dead_items_info);
 
+	if(pgstat_track_vacuum_statistics)
+	{
+		/* Make extended vacuum stats report for index */
+		extvac_stats_end_idx(indrel, istat, &extVacCounters, &extVacReport);
+		pgstat_report_vacuum(RelationGetRelid(indrel),
+								indrel->rd_rel->relisshared,
+								0, 0, 0, &extVacReport);
+	}
+
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
 	pfree(vacrel->indname);
@@ -3250,6 +3398,11 @@ lazy_cleanup_one_index(Relation indrel, IndexBulkDeleteResult *istat,
 {
 	IndexVacuumInfo ivinfo;
 	LVSavedErrInfo saved_err_info;
+	LVExtStatCountersIdx extVacCounters;
+	ExtVacReport extVacReport;
+
+	/* Set initial statistics values to gather vacuum statistics for the index */
+	extvac_stats_start_idx(indrel, istat, &extVacCounters);
 
 	ivinfo.index = indrel;
 	ivinfo.heaprel = vacrel->rel;
@@ -3275,6 +3428,15 @@ lazy_cleanup_one_index(Relation indrel, IndexBulkDeleteResult *istat,
 
 	istat = vac_cleanup_one_index(&ivinfo, istat);
 
+	if(pgstat_track_vacuum_statistics)
+	{
+		/* Make extended vacuum stats report for index */
+		extvac_stats_end_idx(indrel, istat, &extVacCounters, &extVacReport);
+		pgstat_report_vacuum(RelationGetRelid(indrel),
+								indrel->rd_rel->relisshared,
+								0, 0, 0, &extVacReport);
+	}
+
 	/* Revert to the previous phase information for error traceback */
 	restore_vacuum_error_info(vacrel, &saved_err_info);
 	pfree(vacrel->indname);
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 7526c8973ee..d8c7c179094 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1444,3 +1444,35 @@ FROM pg_class rel
   JOIN pg_namespace ns ON ns.oid = rel.relnamespace,
   LATERAL pg_stat_get_vacuum_tables(rel.oid) stats
 WHERE rel.relkind = 'r';
+
+CREATE VIEW pg_stat_vacuum_indexes AS
+SELECT
+  rel.oid as relid,
+  ns.nspname AS schemaname,
+  rel.relname AS relname,
+
+  total_blks_read AS total_blks_read,
+  total_blks_hit AS total_blks_hit,
+  total_blks_dirtied AS total_blks_dirtied,
+  total_blks_written AS total_blks_written,
+
+  rel_blks_read AS rel_blks_read,
+  rel_blks_hit AS rel_blks_hit,
+
+  pages_deleted AS pages_deleted,
+  tuples_deleted AS tuples_deleted,
+
+  wal_records AS wal_records,
+  wal_fpi AS wal_fpi,
+  wal_bytes AS wal_bytes,
+
+  blk_read_time AS blk_read_time,
+  blk_write_time AS blk_write_time,
+
+  delay_time AS delay_time,
+  total_time AS total_time
+FROM
+  pg_class rel
+  JOIN pg_namespace ns ON ns.oid = rel.relnamespace,
+  LATERAL pg_stat_get_vacuum_indexes(rel.oid) stats
+WHERE rel.relkind = 'i';
\ No newline at end of file
diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
index 7924c526cb0..000388a565f 100644
--- a/src/backend/commands/vacuumparallel.c
+++ b/src/backend/commands/vacuumparallel.c
@@ -868,6 +868,8 @@ parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
 	IndexBulkDeleteResult *istat = NULL;
 	IndexBulkDeleteResult *istat_res;
 	IndexVacuumInfo ivinfo;
+	LVExtStatCountersIdx extVacCounters;
+	ExtVacReport extVacReport;
 
 	/*
 	 * Update the pointer to the corresponding bulk-deletion result if someone
@@ -876,6 +878,9 @@ parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
 	if (indstats->istat_updated)
 		istat = &(indstats->istat);
 
+	/* Set initial statistics values to gather vacuum statistics for the index */
+	extvac_stats_start_idx(indrel, &(indstats->istat), &extVacCounters);
+
 	ivinfo.index = indrel;
 	ivinfo.heaprel = pvs->heaprel;
 	ivinfo.analyze_only = false;
@@ -904,6 +909,15 @@ parallel_vacuum_process_one_index(ParallelVacuumState *pvs, Relation indrel,
 				 RelationGetRelationName(indrel));
 	}
 
+	if(pgstat_track_vacuum_statistics)
+	{
+		/* Make extended vacuum stats report for index */
+		extvac_stats_end_idx(indrel, istat_res, &extVacCounters, &extVacReport);
+		pgstat_report_vacuum(RelationGetRelid(indrel),
+								indrel->rd_rel->relisshared,
+								0, 0, 0, &extVacReport);
+	}
+
 	/*
 	 * Copy the index bulk-deletion result returned from ambulkdelete and
 	 * amvacuumcleanup to the DSM segment if it's the first cycle because they
diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 826a5685b2a..363cbf2bb04 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -1180,6 +1180,10 @@ pgstat_build_snapshot(PgStat_Kind statKind)
 		if (p->dropped)
 			continue;
 
+		if (statKind != PGSTAT_KIND_INVALID && statKind != p->key.kind)
+			/* Load stat of specific type, if defined */
+			continue;
+
 		Assert(pg_atomic_read_u32(&p->refcount) > 0);
 
 		stats_data = dsa_get_address(pgStatLocal.dsa, p->body);
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index 0272dd1f393..cd4ffb50bca 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -1007,6 +1007,9 @@ static void
 pgstat_accumulate_extvac_stats(ExtVacReport *dst, ExtVacReport *src,
 							   bool accumulate_reltype_specific_info)
 {
+	if(!pgstat_track_vacuum_statistics)
+		return;
+
 	dst->total_blks_read += src->total_blks_read;
 	dst->total_blks_hit += src->total_blks_hit;
 	dst->total_blks_dirtied += src->total_blks_dirtied;
@@ -1022,20 +1025,35 @@ pgstat_accumulate_extvac_stats(ExtVacReport *dst, ExtVacReport *src,
 	if (!accumulate_reltype_specific_info)
 		return;
 
-	dst->blks_fetched += src->blks_fetched;
-	dst->blks_hit += src->blks_hit;
-
-	dst->pages_scanned += src->pages_scanned;
-	dst->pages_removed += src->pages_removed;
-	dst->vm_new_frozen_pages += src->vm_new_frozen_pages;
-	dst->vm_new_visible_pages += src->vm_new_visible_pages;
-	dst->vm_new_visible_frozen_pages += src->vm_new_visible_frozen_pages;
-	dst->tuples_deleted += src->tuples_deleted;
-	dst->tuples_frozen += src->tuples_frozen;
-	dst->recently_dead_tuples += src->recently_dead_tuples;
-	dst->index_vacuum_count += src->index_vacuum_count;
-	dst->wraparound_failsafe_count += src->wraparound_failsafe_count;
-	dst->missed_dead_pages += src->missed_dead_pages;
-	dst->missed_dead_tuples += src->missed_dead_tuples;
+	if (dst->type == PGSTAT_EXTVAC_INVALID)
+		dst->type = src->type;
+
+	Assert(src->type == PGSTAT_EXTVAC_INVALID || src->type == dst->type);
+
+	if (dst->type == src->type)
+	{
+		dst->blks_fetched += src->blks_fetched;
+		dst->blks_hit += src->blks_hit;
 
+		if (dst->type == PGSTAT_EXTVAC_TABLE)
+		{
+			dst->table.pages_scanned += src->table.pages_scanned;
+			dst->table.pages_removed += src->table.pages_removed;
+			dst->table.vm_new_frozen_pages += src->table.vm_new_frozen_pages;
+			dst->table.vm_new_visible_pages += src->table.vm_new_visible_pages;
+			dst->table.vm_new_visible_frozen_pages += src->table.vm_new_visible_frozen_pages;
+			dst->tuples_deleted += src->tuples_deleted;
+			dst->table.tuples_frozen += src->table.tuples_frozen;
+			dst->table.recently_dead_tuples += src->table.recently_dead_tuples;
+			dst->table.index_vacuum_count += src->table.index_vacuum_count;
+			dst->table.missed_dead_pages += src->table.missed_dead_pages;
+			dst->table.missed_dead_tuples += src->table.missed_dead_tuples;
+			dst->table.wraparound_failsafe_count += src->table.wraparound_failsafe_count;
+		}
+		else if (dst->type == PGSTAT_EXTVAC_INDEX)
+		{
+			dst->index.pages_deleted += src->index.pages_deleted;
+			dst->tuples_deleted += src->tuples_deleted;
+		}
+	}
 }
\ No newline at end of file
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index e112762ed2f..80e867d773f 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -2358,18 +2358,19 @@ pg_stat_get_vacuum_tables(PG_FUNCTION_ARGS)
 									extvacuum->blks_hit);
 	values[i++] = Int64GetDatum(extvacuum->blks_hit);
 
-	values[i++] = Int64GetDatum(extvacuum->pages_scanned);
-	values[i++] = Int64GetDatum(extvacuum->pages_removed);
-	values[i++] = Int64GetDatum(extvacuum->vm_new_frozen_pages);
-	values[i++] = Int64GetDatum(extvacuum->vm_new_visible_pages);
-	values[i++] = Int64GetDatum(extvacuum->vm_new_visible_frozen_pages);
-	values[i++] = Int64GetDatum(extvacuum->missed_dead_pages);
+	values[i++] = Int64GetDatum(extvacuum->table.pages_scanned);
+	values[i++] = Int64GetDatum(extvacuum->table.pages_removed);
+	values[i++] = Int64GetDatum(extvacuum->table.vm_new_frozen_pages);
+	values[i++] = Int64GetDatum(extvacuum->table.vm_new_visible_pages);
+	values[i++] = Int64GetDatum(extvacuum->table.vm_new_visible_frozen_pages);
+	values[i++] = Int64GetDatum(extvacuum->table.missed_dead_pages);
 	values[i++] = Int64GetDatum(extvacuum->tuples_deleted);
-	values[i++] = Int64GetDatum(extvacuum->tuples_frozen);
-	values[i++] = Int64GetDatum(extvacuum->recently_dead_tuples);
-	values[i++] = Int64GetDatum(extvacuum->missed_dead_tuples);
-	values[i++] = Int32GetDatum(extvacuum->wraparound_failsafe_count);
-	values[i++] = Int64GetDatum(extvacuum->index_vacuum_count);
+	values[i++] = Int64GetDatum(extvacuum->table.tuples_frozen);
+	values[i++] = Int64GetDatum(extvacuum->table.recently_dead_tuples);
+	values[i++] = Int64GetDatum(extvacuum->table.missed_dead_tuples);
+
+	values[i++] = Int32GetDatum(extvacuum->table.wraparound_failsafe_count);
+	values[i++] = Int64GetDatum(extvacuum->table.index_vacuum_count);
 
 	values[i++] = Int64GetDatum(extvacuum->wal_records);
 	values[i++] = Int64GetDatum(extvacuum->wal_fpi);
@@ -2388,6 +2389,116 @@ pg_stat_get_vacuum_tables(PG_FUNCTION_ARGS)
 
 	Assert(i == PG_STAT_GET_VACUUM_TABLES_STATS_COLS);
 
+	/* Returns the record as Datum */
+	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
+}
+
+/*
+ * Get the vacuum statistics for the heap tables.
+ */
+Datum
+pg_stat_get_vacuum_indexes(PG_FUNCTION_ARGS)
+{
+	#define PG_STAT_GET_VACUUM_INDEX_STATS_COLS	16
+
+	Oid						relid = PG_GETARG_OID(0);
+	PgStat_StatTabEntry     *tabentry;
+	ExtVacReport 			*extvacuum;
+	TupleDesc				 tupdesc;
+	Datum					 values[PG_STAT_GET_VACUUM_INDEX_STATS_COLS] = {0};
+	bool					 nulls[PG_STAT_GET_VACUUM_INDEX_STATS_COLS] = {0};
+	char					 buf[256];
+	int						 i = 0;
+	ExtVacReport allzero;
+
+	/* Initialise attributes information in the tuple descriptor */
+	tupdesc = CreateTemplateTupleDesc(PG_STAT_GET_VACUUM_INDEX_STATS_COLS);
+
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "relid",
+					   INT4OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_ blks_read",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_blks_hit",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_blks_dirtied",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_blks_written",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "rel_blks_read",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "rel_blks_hit",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "pages_deleted",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "tuples_deleted",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "wal_records",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "wal_fpi",
+					   INT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "wal_bytes",
+					   NUMERICOID, -1, 0);
+
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "blk_read_time",
+					   FLOAT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "blk_write_time",
+					   FLOAT8OID, -1, 0);
+
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "delay_time",
+					   FLOAT8OID, -1, 0);
+	TupleDescInitEntry(tupdesc, (AttrNumber) ++i, "total_time",
+					   FLOAT8OID, -1, 0);
+
+	Assert(i == PG_STAT_GET_VACUUM_INDEX_STATS_COLS);
+
+	BlessTupleDesc(tupdesc);
+
+	tabentry = pgstat_fetch_stat_tabentry(relid);
+
+	if (tabentry == NULL)
+	{
+		/* If the subscription is not found, initialise its stats */
+		memset(&allzero, 0, sizeof(ExtVacReport));
+		extvacuum = &allzero;
+	}
+	else
+	{
+		extvacuum = &(tabentry->vacuum_ext);
+	}
+
+	i = 0;
+
+	values[i++] = ObjectIdGetDatum(relid);
+
+	values[i++] = Int64GetDatum(extvacuum->total_blks_read);
+	values[i++] = Int64GetDatum(extvacuum->total_blks_hit);
+	values[i++] = Int64GetDatum(extvacuum->total_blks_dirtied);
+	values[i++] = Int64GetDatum(extvacuum->total_blks_written);
+
+	values[i++] = Int64GetDatum(extvacuum->blks_fetched -
+									extvacuum->blks_hit);
+	values[i++] = Int64GetDatum(extvacuum->blks_hit);
+
+	values[i++] = Int64GetDatum(extvacuum->index.pages_deleted);
+	values[i++] = Int64GetDatum(extvacuum->tuples_deleted);
+
+	values[i++] = Int64GetDatum(extvacuum->wal_records);
+	values[i++] = Int64GetDatum(extvacuum->wal_fpi);
+
+	/* Convert to numeric, like pg_stat_statements */
+	snprintf(buf, sizeof buf, UINT64_FORMAT, extvacuum->wal_bytes);
+	values[i++] = DirectFunctionCall3(numeric_in,
+									  CStringGetDatum(buf),
+									  ObjectIdGetDatum(0),
+									  Int32GetDatum(-1));
+
+	values[i++] = Float8GetDatum(extvacuum->blk_read_time);
+	values[i++] = Float8GetDatum(extvacuum->blk_write_time);
+	values[i++] = Float8GetDatum(extvacuum->delay_time);
+	values[i++] = Float8GetDatum(extvacuum->total_time);
+
+	Assert(i == PG_STAT_GET_VACUUM_INDEX_STATS_COLS);
+
 	/* Returns the record as Datum */
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
 }
\ No newline at end of file
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 6aa98f99317..43edb9169d2 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -1501,7 +1501,7 @@ struct config_bool ConfigureNamesBool[] =
 	},
 	{
 		{"track_vacuum_statistics", PGC_SUSET, STATS_CUMULATIVE,
-			gettext_noop("Collects vacuum statistics for table relations."),
+			gettext_noop("Collects vacuum statistics for relations."),
 			NULL
 		},
 		&pgstat_track_vacuum_statistics,
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 3cc2eff3437..c6805c79ca1 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -12482,4 +12482,13 @@
   proname => 'pg_stat_get_rev_all_frozen_pages', provolatile => 's',
   proparallel => 'r', prorettype => 'int8', proargtypes => 'oid',
   prosrc => 'pg_stat_get_rev_all_frozen_pages' },
+{ oid => '8004',
+  descr => 'pg_stat_get_vacuum_indexes return stats values',
+  proname => 'pg_stat_get_vacuum_indexes', prorows => 1000, provolatile => 's', prorettype => 'record',proisstrict => 'f',
+  proretset => 't',
+  proargtypes => 'oid',
+  proallargtypes => '{oid,oid,int8,int8,int8,int8,int8,int8,int8,int8,int8,int8,numeric,float8,float8,float8,float8}',
+  proargmodes => '{i,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{reloid,relid,total_blks_read,total_blks_hit,total_blks_dirtied,total_blks_written,rel_blks_read,rel_blks_hit,pages_deleted,tuples_deleted,wal_records,wal_fpi,wal_bytes,blk_read_time,blk_write_time,delay_time,total_time}',
+  prosrc => 'pg_stat_get_vacuum_indexes' }
 ]
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index a5fe622d932..95a60534c12 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -25,6 +25,7 @@
 #include "storage/buf.h"
 #include "storage/lock.h"
 #include "utils/relcache.h"
+#include "pgstat.h"
 
 /*
  * Flags for amparallelvacuumoptions to control the participation of bulkdelete
@@ -295,6 +296,26 @@ typedef struct VacDeadItemsInfo
 	int64		num_items;		/* current # of entries */
 } VacDeadItemsInfo;
 
+/*
+ * Counters and usage data for extended stats tracking.
+ */
+typedef struct LVExtStatCounters
+{
+	TimestampTz starttime;
+	WalUsage	walusage;
+	BufferUsage bufusage;
+	double		VacuumDelayTime;
+	PgStat_Counter blocks_fetched;
+	PgStat_Counter blocks_hit;
+} LVExtStatCounters;
+
+typedef struct LVExtStatCountersIdx
+{
+	LVExtStatCounters common;
+	int64		pages_deleted;
+	int64		tuples_removed;
+} LVExtStatCountersIdx;
+
 /* GUC parameters */
 extern PGDLLIMPORT int default_statistics_target;	/* PGDLLIMPORT for PostGIS */
 extern PGDLLIMPORT int vacuum_freeze_min_age;
@@ -406,4 +427,8 @@ extern double anl_random_fract(void);
 extern double anl_init_selection_state(int n);
 extern double anl_get_next_S(double t, int n, double *stateptr);
 
+extern void extvac_stats_start_idx(Relation rel, IndexBulkDeleteResult *stats,
+					   LVExtStatCountersIdx *counters);
+extern void extvac_stats_end_idx(Relation rel, IndexBulkDeleteResult *stats,
+					 LVExtStatCountersIdx *counters, ExtVacReport *report);
 #endif							/* VACUUM_H */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index d369284101a..115cb736300 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -112,11 +112,19 @@ typedef struct PgStat_BackendSubEntry
 	PgStat_Counter conflict_count[CONFLICT_NUM_TYPES];
 } PgStat_BackendSubEntry;
 
+/* Type of ExtVacReport */
+typedef enum ExtVacReportType
+{
+	PGSTAT_EXTVAC_INVALID = 0,
+	PGSTAT_EXTVAC_TABLE = 1,
+	PGSTAT_EXTVAC_INDEX = 2
+} ExtVacReportType;
+
 /* ----------
  *
  * ExtVacReport
  *
- * Additional statistics of vacuum processing over a heap relation.
+ * Additional statistics of vacuum processing over a relation.
  * pages_removed is the amount by which the physically shrank,
  * if any (ie the change in its total size on disk)
  * pages_deleted refer to free space within the index file
@@ -145,18 +153,44 @@ typedef struct ExtVacReport
 	double		delay_time;		/* how long vacuum slept in vacuum delay point, in msec */
 	double		total_time;		/* total time of a vacuum operation, in msec */
 
-	int64		pages_scanned;		/* heap pages examined (not skipped by VM) */
-	int64		pages_removed;		/* heap pages removed by vacuum "truncation" */
-	int64		vm_new_frozen_pages;		/* pages marked in VM as frozen */
-	int64		vm_new_visible_pages;	/* pages marked in VM as all-visible */
-	int64		vm_new_visible_frozen_pages;	/* pages marked in VM as all-visible and frozen */
-	int64		missed_dead_tuples;		/* tuples not pruned by vacuum due to failure to get a cleanup lock */
-	int64		missed_dead_pages;		/* pages with missed dead tuples */
 	int64		tuples_deleted;		/* tuples deleted by vacuum */
-	int64		tuples_frozen;		/* tuples frozen up by vacuum */
-	int64		recently_dead_tuples;	/* deleted tuples that are still visible to some transaction */
-	int64		index_vacuum_count;	/* the number of index vacuumings */
-	int32		wraparound_failsafe_count;	/* the number of times to prevent workaround problem */
+
+	ExtVacReportType type;		/* heap, index, etc. */
+
+	/* ----------
+	 *
+	 * There are separate metrics of statistic for tables and indexes,
+	 * which collect during vacuum.
+	 * The union operator allows to combine these statistics
+	 * so that each metric is assigned to a specific class of collected statistics.
+	 * Such a combined structure was called per_type_stats.
+	 * The name of the structure itself is not used anywhere,
+	 * it exists only for understanding the code.
+	 * ----------
+	*/
+	union
+	{
+		struct
+		{
+			int64		pages_scanned;		/* heap pages examined (not skipped by VM) */
+			int64		pages_removed;		/* heap pages removed by vacuum "truncation" */
+			int64		pages_frozen;		/* pages marked in VM as frozen */
+			int64		pages_all_visible;	/* pages marked in VM as all-visible */
+			int64		tuples_frozen;		/* tuples frozen up by vacuum */
+			int64		recently_dead_tuples;	/* deleted tuples that are still visible to some transaction */
+			int64		vm_new_frozen_pages;		/* pages marked in VM as frozen */
+			int64		vm_new_visible_pages;	/* pages marked in VM as all-visible */
+			int64		vm_new_visible_frozen_pages;	/* pages marked in VM as all-visible and frozen */
+			int64		missed_dead_tuples;		/* tuples not pruned by vacuum due to failure to get a cleanup lock */
+			int64		missed_dead_pages;		/* pages with missed dead tuples */
+			int64		index_vacuum_count;	/* number of index vacuumings */
+			int32		wraparound_failsafe_count;	/* the number of times to prevent workaround problem */
+		}			table;
+		struct
+		{
+			int64		pages_deleted;		/* number of pages deleted by vacuum */
+		}			index;
+	} /* per_type_stats */;
 } ExtVacReport;
 
 /* ----------
diff --git a/src/test/isolation/expected/vacuum-extending-in-repetable-read.out b/src/test/isolation/expected/vacuum-extending-in-repetable-read.out
index 87f7e40b4a6..6d960423912 100644
--- a/src/test/isolation/expected/vacuum-extending-in-repetable-read.out
+++ b/src/test/isolation/expected/vacuum-extending-in-repetable-read.out
@@ -34,7 +34,7 @@ step s2_print_vacuum_stats_table:
 
 relname                   |tuples_deleted|recently_dead_tuples|missed_dead_tuples|missed_dead_pages|tuples_frozen
 --------------------------+--------------+--------------------+------------------+-----------------+-------------
-test_vacuum_stat_isolation|             0|                 100|                 0|                0|            0
+test_vacuum_stat_isolation|             0|                 600|                 0|                0|            0
 (1 row)
 
 step s1_commit: COMMIT;
@@ -48,6 +48,6 @@ step s2_print_vacuum_stats_table:
 
 relname                   |tuples_deleted|recently_dead_tuples|missed_dead_tuples|missed_dead_pages|tuples_frozen
 --------------------------+--------------+--------------------+------------------+-----------------+-------------
-test_vacuum_stat_isolation|           100|                 100|                 0|                0|          101
+test_vacuum_stat_isolation|           300|                 600|                 0|                0|          303
 (1 row)
 
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 7cc94fbf275..709e1d6a22b 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -2261,6 +2261,28 @@ pg_stat_user_tables| SELECT relid,
     rev_all_visible_pages
    FROM pg_stat_all_tables
   WHERE ((schemaname <> ALL (ARRAY['pg_catalog'::name, 'information_schema'::name])) AND (schemaname !~ '^pg_toast'::text));
+pg_stat_vacuum_indexes| SELECT rel.oid AS relid,
+    ns.nspname AS schemaname,
+    rel.relname,
+    stats.total_blks_read,
+    stats.total_blks_hit,
+    stats.total_blks_dirtied,
+    stats.total_blks_written,
+    stats.rel_blks_read,
+    stats.rel_blks_hit,
+    stats.pages_deleted,
+    stats.tuples_deleted,
+    stats.wal_records,
+    stats.wal_fpi,
+    stats.wal_bytes,
+    stats.blk_read_time,
+    stats.blk_write_time,
+    stats.delay_time,
+    stats.total_time
+   FROM (pg_class rel
+     JOIN pg_namespace ns ON ((ns.oid = rel.relnamespace))),
+    LATERAL pg_stat_get_vacuum_indexes(rel.oid) stats(relid, total_blks_read, total_blks_hit, total_blks_dirtied, total_blks_written, rel_blks_read, rel_blks_hit, pages_deleted, tuples_deleted, wal_records, wal_fpi, wal_bytes, blk_read_time, blk_write_time, delay_time, total_time)
+  WHERE (rel.relkind = 'i'::"char");
 pg_stat_vacuum_tables| SELECT ns.nspname AS schemaname,
     rel.relname,
     stats.relid,
diff --git a/src/test/regress/expected/vacuum_index_statistics.out b/src/test/regress/expected/vacuum_index_statistics.out
new file mode 100644
index 00000000000..e00a0fc683c
--- /dev/null
+++ b/src/test/regress/expected/vacuum_index_statistics.out
@@ -0,0 +1,183 @@
+--
+-- Test cumulative vacuum stats system
+--
+-- Check the wall statistics collected during vacuum operation:
+-- number of frozen and visible pages set by vacuum;
+-- number of frozen and visible pages removed by backend.
+-- Statistic wal_fpi is not displayed in this test because its behavior is unstable.
+--
+-- conditio sine qua non
+SHOW track_counts;  -- must be on
+ track_counts 
+--------------
+ on
+(1 row)
+
+\set sample_size 10000
+-- not enabled by default, but we want to test it...
+SET track_functions TO 'all';
+-- Test that vacuum statistics will be empty when parameter is off.
+SET track_vacuum_statistics TO 'off';
+CREATE TABLE vestat (x int) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+DELETE FROM vestat WHERE x % 2 = 0;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+-- Must be empty.
+SELECT *
+FROM pg_stat_vacuum_indexes vt
+WHERE vt.relname = 'vestat';
+ relid | schemaname | relname | total_blks_read | total_blks_hit | total_blks_dirtied | total_blks_written | rel_blks_read | rel_blks_hit | pages_deleted | tuples_deleted | wal_records | wal_fpi | wal_bytes | blk_read_time | blk_write_time | delay_time | total_time 
+-------+------------+---------+-----------------+----------------+--------------------+--------------------+---------------+--------------+---------------+----------------+-------------+---------+-----------+---------------+----------------+------------+------------
+(0 rows)
+
+RESET track_vacuum_statistics;
+DROP TABLE vestat CASCADE;
+SHOW track_vacuum_statistics;  -- must be on
+ track_vacuum_statistics 
+-------------------------
+ on
+(1 row)
+
+-- ensure pending stats are flushed
+SELECT pg_stat_force_next_flush();
+ pg_stat_force_next_flush 
+--------------------------
+ 
+(1 row)
+
+\set sample_size 10000
+SET vacuum_freeze_min_age = 0;
+SET vacuum_freeze_table_age = 0;
+--SET stats_fetch_consistency = snapshot;
+CREATE TABLE vestat (x int primary key) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+SELECT oid AS ioid from pg_class where relname = 'vestat_pkey' \gset
+DELETE FROM vestat WHERE x % 2 = 0;
+-- Before the first vacuum execution extended stats view is empty.
+SELECT vt.relname,relpages,pages_deleted,tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+   relname   | relpages | pages_deleted | tuples_deleted 
+-------------+----------+---------------+----------------
+ vestat_pkey |       30 |             0 |              0
+(1 row)
+
+SELECT relpages AS irp
+FROM pg_class c
+WHERE relname = 'vestat_pkey' \gset
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP ON) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+-- The table and index extended vacuum statistics should show us that
+-- vacuum frozed pages and clean up pages, but pages_removed stayed the same
+-- because of not full table have cleaned up
+SELECT vt.relname,relpages-:irp = 0 AS relpages,pages_deleted = 0 AS pages_deleted,tuples_deleted > 0 AS tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+   relname   | relpages | pages_deleted | tuples_deleted 
+-------------+----------+---------------+----------------
+ vestat_pkey | t        | t             | t
+(1 row)
+
+SELECT vt.relname,relpages AS irp,pages_deleted AS ipd,tuples_deleted AS itd
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid \gset
+-- Store WAL advances into variables
+SELECT wal_records AS iwr,wal_bytes AS iwb,wal_fpi AS ifpi FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+-- Look into WAL records deltas.
+SELECT wal_records > 0 AS diWR, wal_bytes > 0 AS diWB
+FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey';
+ diwr | diwb 
+------+------
+ t    | t
+(1 row)
+
+DELETE FROM vestat;;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP ON) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+-- pages_removed must be increased
+SELECT vt.relname,relpages-:irp = 0 AS relpages,pages_deleted-:ipd > 0 AS pages_deleted,tuples_deleted-:itd > 0 AS tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+   relname   | relpages | pages_deleted | tuples_deleted 
+-------------+----------+---------------+----------------
+ vestat_pkey | t        | t             | t
+(1 row)
+
+SELECT vt.relname,relpages AS irp,pages_deleted AS ipd,tuples_deleted AS itd
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid \gset
+-- Store WAL advances into variables
+SELECT wal_records-:iwr AS diwr, wal_bytes-:iwb AS diwb, wal_fpi-:ifpi AS difpi
+FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+-- WAL advance should be detected.
+SELECT :diwr > 0 AS diWR, :diwb > 0 AS diWB;
+ diwr | diwb 
+------+------
+ t    | t
+(1 row)
+
+-- Store WAL advances into variables
+SELECT wal_records AS iwr,wal_bytes AS iwb,wal_fpi AS ifpi FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+DELETE FROM vestat WHERE x % 2 = 0;
+-- VACUUM FULL doesn't report to stat collector. So, no any advancements of statistics
+-- are detected here.
+VACUUM FULL vestat;
+-- It is necessary to check the wal statistics
+CHECKPOINT;
+-- Store WAL advances into variables
+SELECT wal_records-:iwr AS diwr2, wal_bytes-:iwb AS diwb2, wal_fpi-:ifpi AS difpi2
+FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+-- WAL and other statistics advance should not be detected.
+SELECT :diwr2=0 AS diWR, :difpi2=0 AS iFPI, :diwb2=0 AS diWB;
+ diwr | ifpi | diwb 
+------+------+------
+ t    | t    | t
+(1 row)
+
+SELECT vt.relname,relpages-:irp < 0 AS relpages,pages_deleted-:ipd = 0 AS pages_deleted,tuples_deleted-:itd = 0 AS tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+   relname   | relpages | pages_deleted | tuples_deleted 
+-------------+----------+---------------+----------------
+ vestat_pkey | t        | t             | t
+(1 row)
+
+SELECT vt.relname,relpages AS irp,pages_deleted AS ipd,tuples_deleted AS itd
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid \gset
+-- Store WAL advances into variables
+SELECT wal_records AS iwr,wal_bytes AS iwb,wal_fpi AS ifpi FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+DELETE FROM vestat;
+TRUNCATE vestat;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP ON) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+-- Store WAL advances into variables after removing all tuples from the table
+SELECT wal_records-:iwr AS diwr3, wal_bytes-:iwb AS diwb3, wal_fpi-:ifpi AS difpi3
+FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+--There are nothing changed
+SELECT :diwr3=0 AS diWR, :difpi3=0 AS iFPI, :diwb3=0 AS diWB;
+ diwr | ifpi | diwb 
+------+------+------
+ t    | t    | t
+(1 row)
+
+--
+-- Now, the table and index is compressed into zero number of pages. Check it
+-- in vacuum extended statistics.
+-- The pages_frozen, pages_scanned values shouldn't be changed
+--
+SELECT vt.relname,relpages-:irp = 0 AS relpages,pages_deleted-:ipd = 0 AS pages_deleted,tuples_deleted-:itd = 0 AS tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+   relname   | relpages | pages_deleted | tuples_deleted 
+-------------+----------+---------------+----------------
+ vestat_pkey | f        | t             | t
+(1 row)
+
+DROP TABLE vestat;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 6493e9b4681..bdb73ccab6b 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -140,4 +140,5 @@ test: tablespace
 # ----------
 # Check vacuum statistics
 # ----------
+test: vacuum_index_statistics
 test: vacuum_tables_statistics
\ No newline at end of file
diff --git a/src/test/regress/sql/vacuum_index_statistics.sql b/src/test/regress/sql/vacuum_index_statistics.sql
new file mode 100644
index 00000000000..ae146e1d23f
--- /dev/null
+++ b/src/test/regress/sql/vacuum_index_statistics.sql
@@ -0,0 +1,151 @@
+--
+-- Test cumulative vacuum stats system
+--
+-- Check the wall statistics collected during vacuum operation:
+-- number of frozen and visible pages set by vacuum;
+-- number of frozen and visible pages removed by backend.
+-- Statistic wal_fpi is not displayed in this test because its behavior is unstable.
+--
+-- conditio sine qua non
+SHOW track_counts;  -- must be on
+
+\set sample_size 10000
+
+-- not enabled by default, but we want to test it...
+SET track_functions TO 'all';
+
+-- Test that vacuum statistics will be empty when parameter is off.
+SET track_vacuum_statistics TO 'off';
+
+CREATE TABLE vestat (x int) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+
+DELETE FROM vestat WHERE x % 2 = 0;
+
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) vestat;
+
+-- Must be empty.
+SELECT *
+FROM pg_stat_vacuum_indexes vt
+WHERE vt.relname = 'vestat';
+
+RESET track_vacuum_statistics;
+DROP TABLE vestat CASCADE;
+
+SHOW track_vacuum_statistics;  -- must be on
+
+-- ensure pending stats are flushed
+SELECT pg_stat_force_next_flush();
+
+\set sample_size 10000
+SET vacuum_freeze_min_age = 0;
+SET vacuum_freeze_table_age = 0;
+--SET stats_fetch_consistency = snapshot;
+CREATE TABLE vestat (x int primary key) WITH (autovacuum_enabled = off, fillfactor = 10);
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+ANALYZE vestat;
+
+SELECT oid AS ioid from pg_class where relname = 'vestat_pkey' \gset
+
+DELETE FROM vestat WHERE x % 2 = 0;
+-- Before the first vacuum execution extended stats view is empty.
+SELECT vt.relname,relpages,pages_deleted,tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+SELECT relpages AS irp
+FROM pg_class c
+WHERE relname = 'vestat_pkey' \gset
+
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP ON) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+
+-- The table and index extended vacuum statistics should show us that
+-- vacuum frozed pages and clean up pages, but pages_removed stayed the same
+-- because of not full table have cleaned up
+SELECT vt.relname,relpages-:irp = 0 AS relpages,pages_deleted = 0 AS pages_deleted,tuples_deleted > 0 AS tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+SELECT vt.relname,relpages AS irp,pages_deleted AS ipd,tuples_deleted AS itd
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid \gset
+
+-- Store WAL advances into variables
+SELECT wal_records AS iwr,wal_bytes AS iwb,wal_fpi AS ifpi FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+
+-- Look into WAL records deltas.
+SELECT wal_records > 0 AS diWR, wal_bytes > 0 AS diWB
+FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey';
+
+DELETE FROM vestat;;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP ON) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+
+-- pages_removed must be increased
+SELECT vt.relname,relpages-:irp = 0 AS relpages,pages_deleted-:ipd > 0 AS pages_deleted,tuples_deleted-:itd > 0 AS tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+SELECT vt.relname,relpages AS irp,pages_deleted AS ipd,tuples_deleted AS itd
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid \gset
+
+-- Store WAL advances into variables
+SELECT wal_records-:iwr AS diwr, wal_bytes-:iwb AS diwb, wal_fpi-:ifpi AS difpi
+FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+
+-- WAL advance should be detected.
+SELECT :diwr > 0 AS diWR, :diwb > 0 AS diWB;
+
+-- Store WAL advances into variables
+SELECT wal_records AS iwr,wal_bytes AS iwb,wal_fpi AS ifpi FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+
+INSERT INTO vestat SELECT x FROM generate_series(1,:sample_size) as x;
+DELETE FROM vestat WHERE x % 2 = 0;
+-- VACUUM FULL doesn't report to stat collector. So, no any advancements of statistics
+-- are detected here.
+VACUUM FULL vestat;
+-- It is necessary to check the wal statistics
+CHECKPOINT;
+
+-- Store WAL advances into variables
+SELECT wal_records-:iwr AS diwr2, wal_bytes-:iwb AS diwb2, wal_fpi-:ifpi AS difpi2
+FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+
+-- WAL and other statistics advance should not be detected.
+SELECT :diwr2=0 AS diWR, :difpi2=0 AS iFPI, :diwb2=0 AS diWB;
+
+SELECT vt.relname,relpages-:irp < 0 AS relpages,pages_deleted-:ipd = 0 AS pages_deleted,tuples_deleted-:itd = 0 AS tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+SELECT vt.relname,relpages AS irp,pages_deleted AS ipd,tuples_deleted AS itd
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid \gset
+
+-- Store WAL advances into variables
+SELECT wal_records AS iwr,wal_bytes AS iwb,wal_fpi AS ifpi FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+
+DELETE FROM vestat;
+TRUNCATE vestat;
+VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128, INDEX_CLEANUP ON) vestat;
+-- it is necessary to check the wal statistics
+CHECKPOINT;
+
+-- Store WAL advances into variables after removing all tuples from the table
+SELECT wal_records-:iwr AS diwr3, wal_bytes-:iwb AS diwb3, wal_fpi-:ifpi AS difpi3
+FROM pg_stat_vacuum_indexes WHERE relname = 'vestat_pkey' \gset
+
+--There are nothing changed
+SELECT :diwr3=0 AS diWR, :difpi3=0 AS iFPI, :diwb3=0 AS diWB;
+
+--
+-- Now, the table and index is compressed into zero number of pages. Check it
+-- in vacuum extended statistics.
+-- The pages_frozen, pages_scanned values shouldn't be changed
+--
+SELECT vt.relname,relpages-:irp = 0 AS relpages,pages_deleted-:ipd = 0 AS pages_deleted,tuples_deleted-:itd = 0 AS tuples_deleted
+FROM pg_stat_vacuum_indexes vt, pg_class c
+WHERE vt.relname = 'vestat_pkey' AND vt.relid = c.oid;
+
+DROP TABLE vestat;
-- 
2.34.1



  [text/x-patch] v19-0004-Add-documentation-about-the-system-views-that-are-us.patch (24.5K, 5-v19-0004-Add-documentation-about-the-system-views-that-are-us.patch)
  download | inline diff:
From ac4783ee2451a57c9127eecdb3186e4dcf09b074 Mon Sep 17 00:00:00 2001
From: Alena Rybakina <[email protected]>
Date: Mon, 17 Feb 2025 17:29:48 +0300
Subject: [PATCH 4/4] Add documentation about the system views that are used in
 the machinery of vacuum statistics.

---
 doc/src/sgml/system-views.sgml | 755 +++++++++++++++++++++++++++++++++
 1 file changed, 755 insertions(+)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index be81c2b51d2..b05cb04a235 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -5069,4 +5069,759 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
   </table>
  </sect1>
 
+<sect1 id="view-pg-stat-vacuum-database">
+  <title><structname>pg_stat_vacuum_database</structname></title>
+
+  <indexterm zone="view-pg-stat-vacuum-database">
+   <primary>pg_stat_vacuum_database</primary>
+  </indexterm>
+
+  <para>
+   The view <structname>pg_stat_vacuum_database</structname> will contain
+   one row for each database in the current cluster, showing statistics about
+   vacuuming that database.
+  </para>
+
+  <table>
+   <title><structname>pg_stat_vacuum_database</structname> Columns</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>dbid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of a database
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_read</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of database blocks read by vacuum operations
+        performed on this database
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_hit</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of times database blocks were found in the
+        buffer cache by vacuum operations
+        performed on this database
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_dirtied</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of database blocks dirtied by vacuum operations
+        performed on this database
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_written</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of database blocks written by vacuum operations
+        performed on this database
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_records</structfield> <type>int8</type>
+      </para>
+      <para>
+        Total number of WAL records generated by vacuum operations
+        performed on this database
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_fpi</structfield> <type>int8</type>
+      </para>
+      <para>
+        Total number of WAL full page images generated by vacuum operations
+        performed on this database
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_bytes</structfield> <type>numeric</type>
+      </para>
+      <para>
+        Total amount of WAL bytes generated by vacuum operations
+        performed on this database
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>blk_read_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        Time spent reading database blocks by vacuum operations performed on
+        this database, in milliseconds (if <xref linkend="guc-track-io-timing"/> is enabled,
+        otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>blk_write_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        Time spent writing database blocks by vacuum operations performed on
+        this database, in milliseconds (if <xref linkend="guc-track-io-timing"/> is enabled,
+        otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>delay_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        Time spent sleeping in a vacuum delay point by vacuum operations performed on
+        this database, in milliseconds (see <xref linkend="runtime-config-resource-vacuum-cost"/>
+        for details)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>system_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        System CPU time of vacuuming this database, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>user_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        User CPU time of vacuuming this database, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        Total time of vacuuming this database, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wraparound_failsafe_count</structfield> <type>int4</type>
+      </para>
+      <para>
+        Number of times the vacuum was run to prevent a wraparound problem.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>errors</structfield> <type>int4</type>
+      </para>
+      <para>
+        Number of times vacuum operations performed on this database
+        were interrupted on any errors
+      </para></entry>
+     </row>
+    </tbody>
+   </tgroup>
+  </table>
+ </sect1>
+
+  <sect1 id="view-pg-stat-vacuum-indexes">
+  <title><structname>pg_stat_vacuum_indexes</structname></title>
+
+  <indexterm zone="view-pg-stat-vacuum-indexes">
+   <primary>pg_stat_vacuum_indexes</primary>
+  </indexterm>
+
+  <para>
+   The view <structname>pg_stat_vacuum_indexes</structname> will contain
+   one row for each index in the current database (including TOAST
+   table indexes), showing statistics about vacuuming that specific index.
+  </para>
+
+  <table>
+   <title><structname>pg_stat_vacuum_indexes</structname> Columns</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>relid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of an index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>schema</structfield> <type>name</type>
+      </para>
+      <para>
+        Name of the schema this index is in
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>relname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_read</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of database blocks read by vacuum operations
+        performed on this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_hit</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of times database blocks were found in the
+        buffer cache by vacuum operations
+        performed on this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_dirtied</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of database blocks dirtied by vacuum operations
+        performed on this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_written</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of database blocks written by vacuum operations
+        performed on this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rel_blks_read</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of blocks vacuum operations read from this
+        index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rel_blks_hit</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of times blocks of this index were already found
+        in the buffer cache by vacuum operations, so that a read was not necessary
+        (this only includes hits in the
+        project; buffer cache, not the operating system's file system cache)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>pages_deleted</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of pages deleted by vacuum operations
+        performed on this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_deleted</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of dead tuples vacuum operations deleted from this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_records</structfield> <type>int8</type>
+      </para>
+      <para>
+        Total number of WAL records generated by vacuum operations
+        performed on this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_fpi</structfield> <type>int8</type>
+      </para>
+      <para>
+        Total number of WAL full page images generated by vacuum operations
+        performed on this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_bytes</structfield> <type>numeric</type>
+      </para>
+      <para>
+        Total amount of WAL bytes generated by vacuum operations
+        performed on this index
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>blk_read_time</structfield> <type>int8</type>
+      </para>
+      <para>
+        Time spent reading database blocks by vacuum operations performed on
+        this index, in milliseconds (if <xref linkend="guc-track-io-timing"/> is enabled,
+        otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>blk_write_time</structfield> <type>int8</type>
+      </para>
+      <para>
+        Time spent writing database blocks by vacuum operations performed on
+        this index, in milliseconds (if <xref linkend="guc-track-io-timing"/> is enabled,
+        otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>delay_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        Time spent sleeping in a vacuum delay point by vacuum operations performed on
+        this index, in milliseconds (see <xref linkend="runtime-config-resource-vacuum-cost"/>
+        for details)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>system_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        System CPU time of vacuuming this index, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>user_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        User CPU time of vacuuming this index, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        Total time of vacuuming this index, in milliseconds
+      </para></entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+ </sect1>
+
+ <sect1 id="view-pg-stat-vacuum-tables">
+  <title><structname>pg_stat_vacuum_tables</structname></title>
+
+  <indexterm zone="view-pg-stat-vacuum-tables">
+   <primary>pg_stat_vacuum_tables</primary>
+  </indexterm>
+
+  <para>
+   The view <structname>pg_stat_vacuum_tables</structname> will contain
+   one row for each table in the current database (including TOAST
+   tables), showing statistics about vacuuming that specific table.
+  </para>
+
+  <table>
+   <title><structname>pg_stat_vacuum_tables</structname> Columns</title>
+   <tgroup cols="1">
+    <thead>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       Column Type
+      </para>
+      <para>
+       Description
+      </para></entry>
+     </row>
+    </thead>
+
+    <tbody>
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>relid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of a table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>schema</structfield> <type>name</type>
+      </para>
+      <para>
+        Name of the schema this table is in
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>relname</structfield> <type>name</type>
+      </para>
+      <para>
+       Name of this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_read</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of database blocks read by vacuum operations
+        performed on this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_hit</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of times database blocks were found in the
+        buffer cache by vacuum operations
+        performed on this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_dirtied</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of blocks written directly by vacuum or auto vacuum.
+        Blocks that are dirtied by a vacuum process can be written out by another process.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_blks_written</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of database blocks written by vacuum operations
+        performed on this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rel_blks_read</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of blocks vacuum operations read from this
+        table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>rel_blks_hit</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of times blocks of this table were already found
+        in the buffer cache by vacuum operations, so that a read was not necessary
+        (this only includes hits in the
+        project; buffer cache, not the operating system's file system cache)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>pages_scanned</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of pages examined by vacuum operations
+        performed on this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>pages_removed</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of pages removed from the physical storage by vacuum operations
+        performed on this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>vm_new_frozen_pages</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of the number of pages newly set all-frozen by vacuum
+        in the visibility map.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>vm_new_visible_pages</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of the number of pages newly set all-visible by vacuum
+        in the visibility map.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>vm_new_visible_frozen_pages</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of the number of pages newly set all-visible and all-frozen
+        by vacuum in the visibility map.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_deleted</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of dead tuples vacuum operations deleted from this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>tuples_frozen</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of tuples of this table that vacuum operations marked as
+        frozen
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>recently_dead_tuples</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of dead tuples vacuum operations left in this table due
+        to their visibility in transactions
+      </para></entry>
+     </row>
+
+    <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>missed_dead_tuples</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of fully DEAD (not just RECENTLY_DEAD) tuples  that could not be
+        pruned due to failure to acquire a cleanup lock on a heap page.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>index_vacuum_count</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of times indexes on this table were vacuumed
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wraparound_failsafe_count</structfield> <type>int4</type>
+      </para>
+      <para>
+        Number of times the vacuum was run to prevent a wraparound problem.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>missed_dead_pages</structfield> <type>int8</type>
+      </para>
+      <para>
+        Number of pages that had at least one missed_dead_tuples.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_records</structfield> <type>int8</type>
+      </para>
+      <para>
+        Total number of WAL records generated by vacuum operations
+        performed on this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_fpi</structfield> <type>int8</type>
+      </para>
+      <para>
+        Total number of WAL full page images generated by vacuum operations
+        performed on this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>wal_bytes</structfield> <type>numeric</type>
+      </para>
+      <para>
+        Total amount of WAL bytes generated by vacuum operations
+        performed on this table
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>blk_read_time</structfield> <type>int8</type>
+      </para>
+      <para>
+        Time spent reading database blocks by vacuum operations performed on
+        this table, in milliseconds (if <xref linkend="guc-track-io-timing"/> is enabled,
+        otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>blk_write_time</structfield> <type>int8</type>
+      </para>
+      <para>
+        Time spent writing database blocks by vacuum operations performed on
+        this table, in milliseconds (if <xref linkend="guc-track-io-timing"/> is enabled,
+        otherwise zero)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>delay_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        Time spent sleeping in a vacuum delay point by vacuum operations performed on
+        this table, in milliseconds (see <xref linkend="runtime-config-resource-vacuum-cost"/>
+        for details)
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>system_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        System CPU time of vacuuming this table, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>user_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        User CPU time of vacuuming this table, in milliseconds
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>total_time</structfield> <type>float8</type>
+      </para>
+      <para>
+        Total time of vacuuming this table, in milliseconds
+      </para></entry>
+     </row>
+
+    </tbody>
+   </tgroup>
+  </table>
+  <para>Columns <structfield>total_*</structfield>, <structfield>wal_*</structfield>
+    and <structfield>blk_*</structfield> include data on vacuuming indexes on this table, while columns
+    <structfield>system_time</structfield> and <structfield>user_time</structfield> only include data
+    on vacuuming the heap.</para>
+ </sect1>
 </chapter>
-- 
2.34.1

view thread (37+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Vacuum statistics
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox