Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uW7gB-008YFX-8f for pgsql-hackers@arkaria.postgresql.org; Mon, 30 Jun 2025 06:07:47 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uW7g9-00ERO1-Ao for pgsql-hackers@arkaria.postgresql.org; Mon, 30 Jun 2025 06:07:45 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uW7g8-00ERMK-Oh for pgsql-hackers@lists.postgresql.org; Mon, 30 Jun 2025 06:07:45 +0000 Received: from mail-qt1-x82d.google.com ([2607:f8b0:4864:20::82d]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uW7g7-004i6L-0f for pgsql-hackers@lists.postgresql.org; Mon, 30 Jun 2025 06:07:44 +0000 Received: by mail-qt1-x82d.google.com with SMTP id d75a77b69052e-4a442a3a2bfso34991841cf.1 for ; Sun, 29 Jun 2025 23:07:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1751263662; x=1751868462; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=GwK6/3QTgHqxOg4q20BqXBXmNlBNm0dABujHMvcgpKE=; b=DOzMWb6mbMG+rOSnfQOtpmvSnb/lcWrkptrGb8s7SozclWLsWL51lBGirAARWOw7n9 SQGPlmhXycHQ0mLWcPUhlDSdAFuTvBPsXzEtaCfWlo2/uVFjiztVrtcLUwFIGjxu4mZc YgR7r9x8+INTB0CfIR3Fkl3mPhHuE30siagRzw0GN2M3E6tahFqf1AHUFA39gt+6d4cs vcR9wfd5UIJswns1KUMY65tGerOl/sVtLudMnR7YCh0WqxQow1I7laenjB0UhEfQYwuh nsyaJIcJ1jEgmD+8hGpdS62P63+9+P9MfHN1Oj5hMXeiG9tBj7pmZxyTi5eykiB/jVot 4Aog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1751263662; x=1751868462; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=GwK6/3QTgHqxOg4q20BqXBXmNlBNm0dABujHMvcgpKE=; b=P+otWrqYy6arhRQzchskHuK5gl9XF5cDsARxxgWibqL5LR4V6KvNHRd4NVV06PfmHV PeG1Vdl5jK1vk2D29LhM4o4QkNI3+WaGqswgFTWjTyrybvCnPCrUgQgeyG3Jloi9W3Zy 8Qax5P62pLQ/yF/8vs4W1lXg3F4cCakpfsK9Be4B4GPmBGAgKlhjhr9b+kYsLK0fxsKB q/HLpmiS6cXuR5pPYg9I60GZamQ8dT0DRPzKG+NFGCnIuJChGoONt26rlL65rfmDbfLS MCV1DSM+Jqv793fbq1pweDfqExdBjmx0d/O2oSpMYwqweq+L1x1qS0FDWgPZ4cHHdKof 7c2g== X-Forwarded-Encrypted: i=1; AJvYcCUWgS+GctOBPaW3Qg/lorQ17YWSEaK6LHm6u/Sbtvebsx29qj0x/AN8xComwLB0o6jE6ZxdlH+l15F2LDEu@lists.postgresql.org X-Gm-Message-State: AOJu0YxiGmnpvG8YXU+a3QqxA3e7G9YdfKhxpJ5AadztiqSWQI5RRiil tr1Q3AqzSVfJhOubdrJBstmfZmS2WGCdb34LpHdQ29Uj0VB6VwA6ncB6RoA420GiDpc2Z941Vi9 VSQ4unWbmeqRvpyXwkf+Q64EZa7X8iLE= X-Gm-Gg: ASbGncsQARi/07IixGwP/U5zcGXsdBZ4rvcY2rN1XnWN+gUNwkU5B9X3Ufqj8vSdcuB lLF9FjhQ0Nqvcf//daP6J2it9VTVI0mFGypurN/dh3osmyX7nF5nLUErR677Y+4oDZImnSB10qs LthSIdL+Nd+BAVlBdv06vvN5MYI/jNUd/in27ES7Zh/B8= X-Google-Smtp-Source: AGHT+IFyDPQGvrIR/5zMOjiwkVWYm/iJo9CWxDbjOjCrAAfE50ikCdYZStXik+iOC2hacqTLugmNu4bLIc2QFA651lI= X-Received: by 2002:ac8:7c46:0:b0:477:1ee1:23d9 with SMTP id d75a77b69052e-4a7fc9f5910mr201099661cf.20.1751263661978; Sun, 29 Jun 2025 23:07:41 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Peter Smith Date: Mon, 30 Jun 2025 16:07:15 +1000 X-Gm-Features: Ac12FXx6xes3PKLvOoOidDUHIh_t5bABcPuTEvscfkaT7nqW5EXMshkWqVmJ-oU Message-ID: Subject: Re: Skipping schema changes in publication To: Shlok Kyal Cc: Amit Kapila , "Zhijie Hou (Fujitsu)" , vignesh C , YeXiu <1518981153@qq.com>, Ian Lawrence Barwick , Bharath Rupireddy , PostgreSQL Hackers Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Shlok. Some review comments for v15-0003. ====== doc/src/sgml/catalogs.sgml 1. - True if the relation must be excluded + True if the column list or relation must be excluded from publication. + If a column list is specified in prattrs, then + exclude only those columns. If prattrs is NULL, + then exclude the entire relation. I noticed other fields on this page say "null" instead of "NULL". It seems like "null" is more conventional. ====== doc/src/sgml/logical-replication.sgml 2. If no column list is specified, any columns added to the table later are automatically replicated. This means that having a column list which names - all columns is not the same as having no column list at all. + all columns is not the same as having no column list at all. Similarly, if an + column list is specified with EXCEPT, any columns added to the table later + are also replicated automatically. 2a. CURRENTLY If no column list or a column list with EXCEPT is specified, any columns added to the table later are automatically replicated. This means that having a column list which names all columns is not the same as having no column list at all. If an column list is specified, any columns added to the table later are automatically replicated. ~ That still doesn't quite make sense. I think instead of saying "This means..." it needs to say something a bit like below: However, a normal column list (without EXCEPT) only replicates the specified columns and no more. Therefore, having a column list that names all columns is not the same as having no column list at all, as more columns may be added to the table later. ~ 2b. And the final sentence "If an column list..." looks like a cut/paste error (??) ~ 2c. Maybe here EXCEPT should be written as EXCEPT ~~~ 2.5A. The description about generated columns still says this: CURRENT: Generated columns can also be specified in a column list. This allows generated columns to be published, regardless of the publication parameter publish_generated_columns. See Section 29.6 for details. ~ But I don't think it is quite correct. IMO gencols behaviour is much more subtle... e.g. a) Normal collist - these named cols are published REGARDLESS of the 'publish_generated_cols' parameter (same as before) b) EXCEPT collist - you can specify gencols in the list REGARDLESS of the 'publish_generated_cols' parameter, because since they are named as "except" then they will not be published anyhow.... c) BUT for EXCEPT collist case, I think any gencols that are *not* covered by that EXCEPT collist should follow the rules according to the 'publish_generated_cols' parameter. So, it is much more tricky than the docs currently say: Also 2.5B. - The text says "See Section 29.6 for details," but there are no examples of these combinations (e.g. EXCEPT collist and diff parameter setting) 2.5C, - The regression tests also need to be more complex to cover these 2.5D. - You might need to add something in the CREATE PUBLICATION "NOTES" section after all -- even if it just refers to here. ~~~ 3. Create a publication p1. A column list is defined for - table t1 to reduce the number of columns that will be - replicated. Notice that the order of column names in the column list does - not matter. + table t1, and another column list is defined for table + t2 using the EXCEPT clause to reduce the number of + columns that will be replicated. Note that the order of column names in + the column lists does not matter. -/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d); +/* pub # */ CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d), t2 EXCEPT (d, a); Maybe here EXCEPT should be written as EXCEPT ====== doc/src/sgml/ref/create_publication.sgml 4. - When a column list is specified, only the named columns are replicated. - The column list can contain stored generated columns as well. If the - column list is omitted, the publication will replicate all non-generated - columns (including any added in the future) by default. Stored generated - columns can also be replicated if publish_generated_columns - is set to stored. Specifying a column list has no - effect on TRUNCATE commands. See + When a column list without EXCEPT is specified, only the named columns are + replicated. The column list can contain stored generated columns as well. + If the column list is omitted, the publication will replicate + all non-generated columns (including any added in the future) by default. + Stored generated columns can also be replicated if + publish_generated_columns is set to + stored. Specifying a column list has no effect on + TRUNCATE commands. See for details about column lists. Maybe here EXCEPT should be written as EXCEPT ~~~ 5. + + When a column list is specified with EXCEPT, the named columns are not + replicated. Specifying a column list has no effect on + TRUNCATE commands. + Maybe here EXCEPT should be written as EXCEPT. ** Note all the extra subtleties that I mentioned in the review comment #2.5 above --- e.g. IMO any *un-listed* gencols still should follow the parameter rules. ~~~ 6. Any column list must include the REPLICA IDENTITY columns - in order for UPDATE or DELETE - operations to be published. There are no column list restrictions if the - publication publishes only INSERT operations. + and any column list specified with EXCEPT must not include the + REPLICA IDENTITY columns in order for + UPDATE or DELETE operations to be + published. There are no column list restrictions if the publication publishes + only INSERT operations. 6a. CURRENT: Any column list must include the REPLICA IDENTITY columns, and any column list specified with EXCEPT must not include the REPLICA IDENTITY columns in order for UPDATE or DELETE operations to be published. ~ I felt that might be better expressed the other way around. Also, it might be better to say "not name" instead of "not include" because EXCEPT + include seemed a bit contrary. SUGGESTION (maybe like this) In order for UPDATE or DELETE operations to work, all the REPLICA IDENTITY columns must be published. So, any column list must name all REPLICA IDENTITY columns, and any EXCEPT column list must not name any REPLICA IDENTITY columns. ~~ 6b. Maybe here EXCEPT should be written as EXCEPT ====== src/backend/catalog/pg_publication.c check_and_fetch_column_list: 7. + /* Lookup the except attribute */ + cfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, cftuple, + Anum_pg_publication_rel_prexcept, &isnull); + + if (!isnull) + { + Assert(!pub->alltables); + *except_columns = DatumGetBool(cfdatum); + } + I felt it would be safer to also assign *except_columns = false; up-front so the caller could be sure this flag was meaningful on return. ~~~ pub_form_cols_map: 8. Maybe use snake case like for other params, so /excepcols/except_cols/ ~~~ pg_get_publication_tables: 9. I felt all the logic in this function maybe can be simpler: e.g. If you just have "Bitmapset *except_columns = NULL;" then null nmeans there is no except columns; otherwise there is. This means you don't need a separate 'bool except_column' variable. e.g. Assign the Bitmapset *except_columns after you already have the values[2], instead of doing it later. e.g. The skip code if (except_columns && bms_is_member(att->attnum, columns)) could just check the list member, I think, without the additional bool. ~~~ 10. + /* + * We fetch pubtuple if publication is not FOR ALL TABLES and not + * FOR TABLES IN SCHEMA. So if prexcept is true, it indicate that + * prattrs contains columns to be excluded for replication. + */ + if (!isnull) + except_columns = DatumGetBool(exceptDatum); /indicate/indicates/ ====== src/backend/parser/gram.y 11. + | TABLE relation_expr EXCEPT opt_except_column_list OptWhereClause + { + $$ = makeNode(PublicationObjSpec); + $$->pubobjtype = PUBLICATIONOBJ_TABLE; + $$->pubtable = makeNode(PublicationTable); + $$->pubtable->relation = $2; + $$->pubtable->columns = $4; + $$->pubtable->whereClause = $5; + $$->pubtable->except = true; + $$->location = @1; + } I wasn't expecting you would need another 'opt_except_column_list' and all the code duplication that causes. AFAIK, the syntax is identical for 'opt_column_list' apart from the preceding EXCEPT so I thought all you need is to allow the 'opt_column_list' to have an optional EXCEPT qualifier. ====== src/backend/replication/pgoutput/pgoutput.c 12. + + /* + * Indicates whether no columns are published for a given relation. With + * the introduction of the EXCEPT clause in column lists, it is now + * possible to define a publication that excludes all columns of a table. + * However, the 'columns' attribute cannot represent this case, since a + * NULL value implies that all columns are published. To distinguish this + * scenario, the 'no_cols_published' flag is introduced. + */ + bool no_cols_published; } RelationSyncEntry; But, what about when Bitmapset *columns is not null, but has no bits set -- doesn't that mean the same as "no columns"? ====== src/include/catalog/pg_publication.h 13. extern Bitmapset *pub_form_cols_map(Relation relation, - PublishGencolsType include_gencols_type); + PublishGencolsType include_gencols_type, + Bitmapset *exceptcols); Maybe snake-case like the other params: /exceptcols/except_cols/ ====== src/test/regress/sql/publication.sql 14. +-- Verify that publication is created with EXCEPT +CREATE PUBLICATION testpub_except FOR TABLE pub_test_except1, pub_sch1.pub_test_except2 EXCEPT (b, c); +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; + I think tests should also use psql \dRp+ commands in places to show that the "describe" stuff is working correctly. ~~~ 15. +-- Check for invalid cases +CREATE PUBLICATION testpub_except2 FOR TABLES IN SCHEMA pub_sch1, TABLE pub_test_except1 EXCEPT (b, c); +CREATE PUBLICATION testpub_except2 FOR TABLE pub_test_except1 EXCEPT; Should explain more about what you are testing here: a) cannot use EXCEPT col-lists combined with TABLES IN SCHEMA b) syntax error EXCEPT without a col-list ~~~ 16. +-- Verify that publication can be altered with EXCEPT +ALTER PUBLICATION testpub_except SET TABLE pub_test_except1 EXCEPT (a, b), pub_sch1.pub_test_except2; +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; The comment is a bit misleading because there are many kinds of "alter". Maybe say more like Verify ok - ALTER PUBLICATION ... SET ... EXCEPT (col-list) ~~~ 17. +-- Verify ALTER PUBLICATION ... DROP +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1 EXCEPT (a, b); +ALTER PUBLICATION testpub_except DROP TABLE pub_test_except1; Should explain more: +-- Verify fails - ALTER PUBLICATION ... DROP ... EXCEPT (col-list) +-- Verify ok - ALTER PUBLICATION ... DROP ... ~~~ 18. +ALTER PUBLICATION testpub_except ADD TABLE pub_test_except1 EXCEPT (c, d); +SELECT * FROM pg_publication_tables WHERE pubname = 'testpub_except'; Missing comment: +-- Verify ok - ALTER PUBLICATION ... ADD ... EXCEPT (col-list) ~~~ 19. +-- Verify excluded columns cannot be part of REPLICA IDENTITY +ALTER TABLE pub_test_except1 REPLICA IDENTITY FULL; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c); +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX pub_test_except1_a_idx; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; +DROP INDEX pub_test_except1_a_idx; +CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a); +ALTER TABLE pub_test_except1 REPLICA IDENTITY USING INDEX pub_test_except1_a_idx; +UPDATE pub_test_except1 SET a = 3 WHERE a = 1; + +DROP INDEX pub_test_except1_a_idx; 19a. IIUC, really there are multiple tests here, so I think it should all be split and commented separately. a) Verify that EXCEPT col-list cannot contain RI cols (when using RI FULL) b) Verify that EXCEPT col-list cannot contain RI cols (when using INDEX) c) Verify that so long as no clash between RI cols and the EXCEPT col-list, then it is ok ~ 19b. IMO, some index names could be better: CREATE UNIQUE INDEX pub_test_except1_a_idx ON pub_test_except1 (a, c); How about 'pub_test_except1_ac_idx'? ~~~ 20. +DROP PUBLICATION testpub_except; +DROP TABLE pub_test_except1; +DROP TABLE pub_sch1.pub_test_except2; Add a "cleanup" comment. ====== Kind Regards, Peter Smith. Fujitsu Australia