Date: Tue, 24 Jun 2025 11:10:24 +0000
From: Bertrand Drouvot
To: Tomas Vondra
Cc: Christoph Berg, Andres Freund, Tomas Vondra, pgsql-hackers@lists.postgresql.org
Subject: Re: pgsql: Introduce pg_shmem_allocations_numa view

Hi,

On Tue, Jun 24, 2025 at 11:20:15AM +0200, Tomas Vondra wrote:
> On 6/24/25 10:24, Bertrand Drouvot wrote:
> > Yeah, same for me with pg_get_shmem_allocations_numa(). It works if
> > pg_numa_query_pages() is done on chunks <= 16 pages but fails if done on more
> > than 16 pages.
> >
> > It's also confirmed by test_chunk_size.c attached:
> >
> > $ gcc-11 -m32 -o test_chunk_size test_chunk_size.c
> > $ ./test_chunk_size
> > 1 pages: SUCCESS (0 errors)
> > 2 pages: SUCCESS (0 errors)
> > 3 pages: SUCCESS (0 errors)
> > 4 pages: SUCCESS (0 errors)
> > 5 pages: SUCCESS (0 errors)
> > 6 pages: SUCCESS (0 errors)
> > 7 pages: SUCCESS (0 errors)
> > 8 pages: SUCCESS (0 errors)
> > 9 pages: SUCCESS (0 errors)
> > 10 pages: SUCCESS (0 errors)
> > 11 pages: SUCCESS (0 errors)
> > 12 pages: SUCCESS (0 errors)
> > 13 pages: SUCCESS (0 errors)
> > 14 pages: SUCCESS (0 errors)
> > 15 pages: SUCCESS (0 errors)
> > 16 pages: SUCCESS (0 errors)
> > 17 pages: 1 errors
> > Threshold: 17 pages
> >
> > No error if -m32 is not used.
> >
> > We could work by chunks (16?) on 32 bits but would probably produce performance
> > degradation (we mention it in the doc though). Also would always 16 be a correct
> > chunk size?
>
> I don't see how this would solve anything?
>
> AFAICS the problem is the two places are confused about how large the
> array elements are, and get to interpret that differently.

> I don't see how using smaller array makes this correct. That it works is
> more a matter of luck,

Not sure it's luck; maybe the wrong pointer arithmetic has no effect if the
batch size is <= 16.

So we have kernel_move_pages() -> do_pages_stat() (because nodes is NULL here
for us, as we call "numa_move_pages(pid, count, pages, NULL, status, 0);").

So, if we look at do_pages_stat() ([1]), we can see that it uses a hardcoded
"#define DO_PAGES_STAT_CHUNK_NR 16UL" and that this pointer arithmetic:

"
    pages += chunk_nr;
    status += chunk_nr;
"

is done at the end of each loop iteration. With a batch size <= 16, nr_pages
drops to zero after the first chunk and the loop exits, so the advanced
pointers are never used.

So if this pointer arithmetic is not correct (it seems that it should advance
by 16 * sizeof(compat_uptr_t) instead), then it has no effect as long as the
batch size is <= 16.

Does test_chunk_size also fail at 17 for you?

[1]: https://github.com/torvalds/linux/blob/master/mm/migrate.c

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com