Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uUdPn-006CSp-4D for pgsql-hackers@arkaria.postgresql.org; Thu, 26 Jun 2025 03:36:43 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uUdPk-007DBx-L2 for pgsql-hackers@arkaria.postgresql.org; Thu, 26 Jun 2025 03:36:41 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uUdPk-007DBi-23 for pgsql-hackers@lists.postgresql.org; Thu, 26 Jun 2025 03:36:40 +0000 Received: from mail-qt1-x82b.google.com ([2607:f8b0:4864:20::82b]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uUdPi-0042Dx-1U for pgsql-hackers@lists.postgresql.org; Thu, 26 Jun 2025 03:36:39 +0000 Received: by mail-qt1-x82b.google.com with SMTP id d75a77b69052e-4a43e277198so4309291cf.1 for ; Wed, 25 Jun 2025 20:36:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1750908997; x=1751513797; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=9LlByAS7kgf//LuwpiY6bp6/ah7+xkVvYGGS1cnVOaY=; b=hefmizcKe/HKQSlHE7ndgs5zImeuREc6Bwbr4iVfcrv8PcA+NCYvAk2pJ0DW816p5j UOrshFJ9RJxbOdSCfaDSIktAP5h1rdVJpLt2mwSSYSzHx7nYYMF4Aylm/llR/pBI5/QT K1IDLjps9GHBW7U8BDKOtm9LbZHJkBlGZpcfqUx6nARcAipC6ZuLTwH7D64Wx0vSxa12 0PcssZkekY0Xtuh7Po1Kn6H117uYt4AO6C0WwqHlnUPR7Gqa5yuq9JgDKay1XXh4SONr ZVX3KU6izRd8Ct+cUu8aIbI9FfXndeTABu6jlUj9ZWBUS47m1Do2xwvPxeOMwBPtn5XP Hjnw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750908997; x=1751513797; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=9LlByAS7kgf//LuwpiY6bp6/ah7+xkVvYGGS1cnVOaY=; b=rK/10d+WlMTfBrw7R9cev/+dOO4bXPxzlbyMd/KxA/ltRB1CEjOE7vwGQho0s8WvRr Zxg9RPSFFdTG4ToN1WSEHbe5pnDOhnW83YRmC8B/vc2srxSzJ7FMR47gN+qLVt3r9tS4 8LLpjCHN8eMKUQRSkcz7MrCuZPIwzMTJmTpLryrsB52LTO7ZjYOEi9fPWLbH0Ue6LcI1 GETq+2RIV3eFM4rDs6gwtpjQCbYhczV0IyQO7sLaIbgc8COb74TsOARAUFHbD5SDNecE YKdG4/nNxZmB55M5ou3rR0KKydsaypp4iGtorj8aRYr0Y+BvZ5JLSUNyrJ1349AEQ2Px PgFQ== X-Forwarded-Encrypted: i=1; AJvYcCXkFOHRbf+oz8q2nn7IHBgI0/Ui2gHxG9GWMmOAX186MF9TPpNEXJoKXiAiPNixDtTTUaHHvvuF7xHGhyxN@lists.postgresql.org X-Gm-Message-State: AOJu0YwomN0BG+k5Lvo+bziPuWHBSOxJr5gSnfuzze6mdmiWdwBGuNp1 wM6uFjv/vpEWdrUcp9lQVV33owIlWSyln2a4ChgIfEfGHjKt1bAv+oPPrxgGZAyfLW5S2QfYKSJ gWHWEc6K8t8f56xga6LB4lHV3ZfqRPVs= X-Gm-Gg: ASbGncuJxSa2qzRJIlOqo0gTcHJlWKF1vWSsxU6uxmQnzEq6IkLylE662C672JxNmZG gxoPZFHFtG9ejcb5ShlH2tS/sTKXCKqj/VNVE4UYF9FjxHQTZ8BkOiu+4YPBXh+cRcavr4E2C2z QIv1MxvR2oCzBTbHtdSJMGbSjTwCzAyOi8znJFdHApFyQmLImn9Jzol9avMr7fBvNoEFGuSu/13 wx03g== X-Google-Smtp-Source: AGHT+IHCRSnRpu1zTcc1B/JTgzTRdPIb1BQCMV7zbivAsdGd9a3BIGa6kZzNCfzo+/+oxHfOaGoIaB7hiq8x6bFMh78= X-Received: by 2002:ac8:5915:0:b0:4a4:4202:e77e with SMTP id d75a77b69052e-4a7c0608f71mr84732751cf.6.1750908997140; Wed, 25 Jun 2025 20:36:37 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Peter Smith Date: Thu, 26 Jun 2025 13:36:10 +1000 X-Gm-Features: Ac12FXwm5NBbCDuVK6eSaOMI4sE8NWlujXTaDBmW3PAwjVd2oxIpQ87d2ZTLcz0 Message-ID: Subject: Re: Skipping schema changes in publication To: Shlok Kyal Cc: Amit Kapila , "Zhijie Hou (Fujitsu)" , vignesh C , YeXiu <1518981153@qq.com>, Ian Lawrence Barwick , Bharath Rupireddy , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Shlok. Below are some review comments for v14-0003 ====== 1. GENERAL Since the new syntax uses EXCEPT, then, in my opinion, you should try to use that same term where possible when describing things. I understand it is hard to do this in text and I agree often it makes more sense to say "exclude" columns etc, but OTOH in the code there are lots of places where you could have named vars/params differently: e.g. 'except_collist' instead of 'exclude_collist' might have been better. ====== Commit message 2. Column list specifed with EXCEPT is stored in column "prattrs" in table "pg_publication_rel" and also column "prexcept" is set to "true", to maintain the column list that user wants to exclude from the publication. ~ That paragraph could do with some rewording. For example, AFAIK, "prattrs" is for all column lists -- not just except col-lists, but the way it is described here sounds different. Also, /specifed/specified/ ====== doc/src/sgml/catalogs.sgml 3. (52.42. pg_publication_rel) - True if the relation must be excluded + True if the relation or column list must be excluded. If publication is + created FOR ALL TABLES and it is specified as true, + the relation should be excluded. Else if it is true the columns in + prattrs should be excluded from being published. I felt this could be expressed more simply without mentioning anything about FOR ALL TABLES. SUGGESTION True if the column list or relation must be excluded from publication. If a column list is specified in prattrs, then exclude only those columns. If prattrs is NULL, then exclude the entire relation. ====== doc/src/sgml/logical-replication.sgml 4. (29.5. Column Lists) - Each publication can optionally specify which columns of each table are - replicated to subscribers. The table on the subscriber side must have at - least all the columns that are published. If no column list is specified, - then all columns on the publisher are replicated. + Each publication can optionally specify which columns of each table should be + replicated or excluded from replication. On the subscriber side, the table + must include at least all the columns that are published. If no column list + is provided, all columns from the publisher are replicated by default. See for details on the syntax. I felt this patch may have changed too much text. IMO, you only needed to say "... are replicated or excluded from replication.". The other changes did not seem necessary. ~~~ 5. - If no column list is specified, any columns added to the table later are - automatically replicated. This means that having a column list which names - all columns is not the same as having no column list at all. + If no column list or a column list with EXCEPT is specified, any columns + added to the table later are automatically replicated. This means that having + a column list which names all columns is not the same as having no + column list at all. If an column list is specified, any columns added to the + table later are automatically replicated. 5a. "This means that having a column list which names all columns is not the same as having no column list at all." -- That note does not make sense when you say EXCEPT. I think some rewording is needed here. ~ 5b. "If an column list is specified, any columns added to the table later are automatically replicated.". This made no sense -- some words missing? ~~~ 6. Generated columns can also be specified in a column list. This allows generated columns to be published, regardless of the publication parameter - publish_generated_columns. See - for details. + publish_generated_columns. Generated columns can + be included in column list specified with EXCEPT clause if publication + parameter + + publish_generated_columns is not set to + none. Specified generated columns will not be published. + See for details. I am not so sure about this. It seemed overly strict to me. Why can't it simply say: "Generated columns can also be specified in a column list. This allows generated columns to be published or excluded, regardless of the publication parameter..." Specifically, I don't know why you need to say: Generated columns can be included in column list specified with EXCEPT clause if publication parameter publish_generated_columns is not set to none. Specified generated columns will not be published. IIUC, then EXCEPT (gencol1, gencol2) is saying to exclude the named cols. So if param is "stored", then the named cols will be excluded. OTOH, if param is "none" then all generated cols will be excluded anyway, so why not just allow the EXCEPT (gencol,gencol2) here as well, because the result will be the same. ~~~ 7. (29.5.1. Examples) - Create a table t1 to be used in the following example. + Create tables t1, t2 to be used in the + following example. /Create tables t1, t2/Create tables t1 and t2/ ~~~ 8. Create a publication p1. A column list is defined for - table t1 to reduce the number of columns that will be - replicated. Notice that the order of column names in the column list does - not matter. + table t1 and a column list is defined for table + t2 with EXCEPT clause to reduce the number of columns that will be + replicated. Notice that the order of column names in the column lists does not matter. BEFORE A column list is defined for table t1 and a column list is defined for table t2... SUGGESTION (added comma, etc.) A column list is defined for table t1, and another column list is defined for table t2... ~~~ 9. The final example still says: "Only data from the column list of publication p1 is replicated." That doesn't seem quite appropriate now that you also have an EXCEPT column list. SUGGESTION: Only data specified by the column lists of publication p1 is replicated. ====== doc/src/sgml/ref/create_publication.sgml 10. + + When a column list is specified with EXCEPT, the named columns are not + replicated. Specifying a column list has no effect on + TRUNCATE commands. + I felt that to be clearer the preceding paragraph should be changed as follows: /When a column list is specified, only the named columns are replicated./When a column list without EXCEPT is specified, only the named columns are replicated./ ~~~ 11. CREATE PUBLICATION (NOTES section) 11a. The NOTES talk about replica identity columns -- should you mention EXCEPT here? ~ 11b. The NOTES talk about generated columns -- should you mention EXCEPT here? ====== src/backend/catalog/pg_publication.c 12. check_and_fetch_column_list + if (!isnull) + except = DatumGetBool(cfdatum); + + *except_columns = except && !pub->alltables; AFAICT, you can Assert(!pub->alltables) because you already checked that earlier up front. So you don't need 'except' var either. Just assign *except_cols up front and then overwrite it later if true. SUGGESTION: *except_cols = false; if (pub->alltables) return false; ... if (!isnull) *except_cols = DatumGetBool(cfdatum); ~~~ 13. publication_add_relation /* Validate and translate column names into a Bitmapset of attnums. */ - attnums = pub_collist_validate(pri->relation, pri->columns); + attnums = pub_collist_validate(pri->relation, pri->columns, + pri->except && !pub->alltables, + pub->pubgencols_type); I am wondering why we are even calling a function to validate column lists if pub->alltables was true. AFAIK, that combination of column-lists and FOR ALL TABLES is not even possible, so the code seems strange. ~~~ 14. pub_exclude_collist_validate . + /* + * Check if column list specified with EXCEPT have any stored + * generated column and 'publish_generated_columns' is not set to + * 'stored'. + */ + if (except_columns && + TupleDescAttr(tupdesc, attnum - 1)->attgenerated == ATTRIBUTE_GENERATED_STORED && + pubgencols_type != PUBLISH_GENCOLS_STORED) + ereport(ERROR, + errcode(ERRCODE_INVALID_COLUMN_REFERENCE), + errmsg("cannot use stored generated column \"%s\" in publication column list specified with EXCEPT when \"%s\" set to \"%s\"", + colname, "publish_generated_columns", "stored")); As mentioned in the above DOCS comments, I was having doubts about why we have this error. If the parameter says "none", then generated columns will not be replicated, so why should we care if the user also says EXCEPT(gencol1,gencol2). Either way, the result will be the same; the generated column will not be published. ~~~ 15. GetRelationPublications { HeapTuple tup = &pubrellist->members[i]->tuple; Oid pubid = ((Form_pg_publication_rel) GETSTRUCT(tup))->prpubid; + HeapTuple pubtup = SearchSysCache1(PUBLICATIONOID, ObjectIdGetDatum(pubid)); + bool is_table_excluded = ((Form_pg_publication) GETSTRUCT(pubtup))->puballtables && + ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept; - if (except_flag == ((Form_pg_publication_rel) GETSTRUCT(tup))->prexcept) + if (except_flag == is_table_excluded) result = lappend_oid(result, pubid); + + ReleaseS I'm not 100% sure you need the additional 'pubtup'... Can't you just look at the "prattrs" field to see if a column-list was specified? If "prattrs" is null and "prexcept" is true, isn't that the same combination as what you are looking for here? ~~~ 16. pg_get_publication_tables + columnsDatum = SysCacheGetAttr(PUBLICATIONRELMAP, pubtuple, + Anum_pg_publication_rel_prattrs, + &(nulls[2])); + + /* if column list is specified with EXCEPT */ + if (!pub->alltables && except) + columns = pub_collist_to_bitmapset(NULL, columnsDatum, NULL); + else + values[2] = columnsDatum; 16a. Something seems fishy here. Isn't there a pathway where you missed assigning value[2] to anything? ~ 16b. Also, I feel there should be some other boolean variable used here instead of checking bot (!pub->alltables && except) in multiple places. ====== src/backend/replication/pgoutput/pgoutput.c 17. RelationSyncEntry + + /* Indicate if no column is included in the publication */ + bool no_cols_published; Maybe this can have a more explanatory comment to explain why it is needed? ~~~ 18. check_and_init_gencol + bool found = false; + bool except_columns = false; + + found = check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, + NULL, &except_columns); + /* * The column list takes precedence over the * 'publish_generated_columns' parameter. Those will be checked later, - * see pgoutput_column_list_init. + * see pgoutput_column_list_init. But when a column list is specified + * with EXCEPT, it should be checked. */ - if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL)) + if (found && !except_columns) continue; The variable 'found' seems a poor name; how about 'has_column_list' or similar? ~~~ 19. pgoutput_change + /* + * If all columns of a table is present in column list specified with + * EXCEPT, skip publishing the changes. + */ + if (relentry->no_cols_published) + return; /is present/are present/ ====== src/bin/pg_dump/pg_dump.c 20. getPublicationTables + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs)) pubrinfo[j].dobj.objType = DO_PUBLICATION_EXCEPT_REL; + else + pubrinfo[j].dobj.objType = DO_PUBLICATION_REL; pubrinfo[j].dobj.catId.tableoid = atooid(PQgetvalue(res, i, i_tableoid)); @@ -4797,6 +4797,7 @@ getPublicationTables(Archive *fout, TableInfo tblinfo[], int numTables) pubrinfo[j].pubrelqual = NULL; else pubrinfo[j].pubrelqual = pg_strdup(PQgetvalue(res, i, i_prrelqual)); + pubrinfo[j].pubexcept = (strcmp(prexcept, "t") == 0); Why not assign pubrinfo[j].pubexcept earlier so you don't have to repeat the strcmp? ~~~ 21. - if (strcmp(prexcept, "t") == 0) + if (strcmp(prexcept, "t") == 0 && PQgetisnull(res, i, i_prattrs)) simple_ptr_list_append(&exceptinfo, &pubrinfo[j]); Why not assign pubrinfo[j].pubexcept earlier so you don't have to repeat the strcmp? Same also for the PQgetisnull(res, i, i_prattrs))... ~~~ 22. dumpPublicationTable if (pubrinfo->pubrattrs) - appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); + { + if (pubrinfo->pubexcept) + appendPQExpBuffer(query, " EXCEPT (%s)", pubrinfo->pubrattrs); + else + appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); + } SUGGESTION { if (pubrinfo->pubexcept) appendPQExpBuffer(query, " EXCEPT"); appendPQExpBuffer(query, " (%s)", pubrinfo->pubrattrs); } ====== Kind Regards, Peter Smith. Fujitsu Australia