Message-ID: <19422eb3-54dc-4afb-8046-5eee906edacd@iki.fi>
Date: Mon, 7 Apr 2025 13:04:39 +0300
From: Heikki Linnakangas
Subject: Re: AIX support
To: Srirama Kucherlapati, "pgsql-hackers@postgresql.org"
Cc: Robert Haas, Bruce Momjian, Peter Eisentraut, Alvaro Herrera,
 Laurenz Albe, Noah Misch, Michael Paquier, Andres Freund, Thomas Munro,
 "tvk1271@gmail.com", Tom Lane, Tristan Partin, wenhui qiu,
 "postgres-ibm-aix@wwpdl.vnet.ibm.com"
References:
 <6e6ce337-93b9-4922-9a89-be2133738fe6@iki.fi>

On 05/04/2025 21:29, Srirama Kucherlapati wrote:
>>> - WRT to the MEMSET_LOOP_LIMIT flag, this is set to “0”, which would
>>> internally use
>
>> Yes, I understand what it does. But why? Whatever benchmarking was done
>> back in 2006 by is no longer relevant.
>
> We ran the program, mentioned in the below link and collected the
> benchmark stats on our node (POWER_10).
>
> https://postgrespro.com/list/thread-id/1673194
>
> The native AIX memset() seems to performs better. The benchmark seems to
> be still relevant, so I think we should continue to use the existing
> optimization for AIX.

At least the benchmark program needs to be updated to match what
MemSet() looks like nowadays. The changes may be just cosmetic, but it
is better to check.

We should also check the effect on MemSetAligned(). That might matter
more for performance in practice.

A third thing to check is the performance of MemSet() when the pointer
is, in fact, aligned.

The other question is: what do the results look like on other platforms?
How much difference does the libc implementation make, vs. the compiler
and CPU architecture? If the difference is related to the compiler or
CPU architecture, then this doesn't belong in the AIX template, but
somewhere else.

> Below are the stats (64bit Object mode).
>
>>> ./memset-aix
>
>         sizeof(int)  = 4
>         sizeof(long) = 8

MemSet() uses 'long', so the int tests are not relevant. I have omitted
them below.

>         memset by int (size=8) : 0.280301
>         Loop by long (size=8) : 0.202650
>
>         memset by int (size=16) : 0.280979
>         Loop by long (size=16) : 0.246879
>
>         memset by int (size=32) : 0.331691
>         Loop by long (size=32) : 0.422261

OK, MemSet() is faster with very small sizes; the crossover is somewhere
between 16 and 32 bytes. I'm actually surprised the compiler doesn't
replace the memset() call with a few store instructions at these sizes.

>         memset by int (size=1024) : 0.904048
>         Loop by long (size=1024) : 24.149871

So with larger sizes, memset() wins hands down. I'm surprised how big
the difference is, because I actually expected the compiler to detect
the memory-zeroing loop and replace it with some fancy vector
instructions (does powerpc have any?). Or a call to memset(); I've seen
compilers convert loops to memset() and vice versa.

My gut feeling is actually that we should remove the MemSet() macro
altogether and just use memset() everywhere. Compilers are much better
at optimizing it in 2025 than they were back in 2002. I'd love to see
some rigorous benchmarks across different platforms and compilers to
demonstrate that, and then just get rid of MemSet().

MemSetAligned() might still be worth keeping. Sometimes we know that a
piece of memory is aligned, but the compiler does not. But maybe even
that should just assert and hint the compiler that the input is aligned,
and then call memset().

If you'd like to help the community in general, it would be much
appreciated if you could do some more rigorous benchmarking along those
lines, not just for AIX, and start a new thread to discuss it. That
would be the best way to resolve this.

The narrower question of what the AIX template should do comes down to
whether there's some *AIX-specific* performance difference.
The generated powerpc assembly code is presumably the same on AIX and
other operating systems, so it comes down to whether there's some big
difference between AIX's memset() implementation and glibc's.

>> diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
>
>> Why is this change needed?
>
>> Yes, I know we've been over this many times already. I still don't
>> understand why it's needed. The onus is on you to explain it adequately,
>> in comments in the patch, so that I and others understand it. Or even
>> better, remove it if it's not necessary.
>
> If you recall, we previously considered replacing this assembly code
> with __sync_lock_test_and_set(). However, as you mentioned earlier,
> this should be handled in a separate patch. For now, I'll make a
> note and submit a separate patch for this later, as originally
> planned. Below is the reference to older discussion.

Yes, I do recall. Please read my comment above again: this all needs to
be explained in comments in the code. To be precise, I have these
questions:

- Does GCC on AIX (still) use the IBM assembler?
- Does the IBM assembler still not understand the label syntax?
- Is there some other label syntax that would work on the IBM assembler?
- Is it possible to use the GNU assembler instead?

>>> +# -blibpath must contain ALL directories where we should look for libraries
>>> +libpath := $(shell echo $(subst -L,:,$(filter -L/%,$(LDFLAGS))) | sed -e's/ //g'):/usr/lib:/lib
>
>> Is this still sensible on modern AIX systems? What happens if you leave
>> it out?
>
> This is required as it is looking for the possible non-default
> directories for the linker at the runtime. This is used along with
> rpath. As suggested, I tested this by removing the libpath, but at
> run time the linker is not able to find the dependent libraries path
> as a result, the binaries are not getting loaded. After doing some
> research, AIX uses a stricter, more *explicit* approach.
> The runtime linker expects to tell it exactly where to look using
> -blibpath.

OK, some comments would be in order to explain that, maybe with links to
the relevant AIX documentation.

--
Heikki Linnakangas
Neon (https://neon.tech)
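
For reference, the "assert, hint the compiler, then call memset()" idea
for MemSetAligned() mentioned above could look roughly like the sketch
below. This is only an illustration, not PostgreSQL code: the name
zero_long_aligned() is made up, and __builtin_assume_aligned() is a
GCC/Clang builtin, so its availability with the compilers used on AIX is
an assumption that would need checking. Without the builtin the function
still works, it just gives the compiler no extra information.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: zero 'len' bytes at 'ptr'.
 * The caller promises that 'ptr' is aligned on a sizeof(long) boundary.
 */
static inline void
zero_long_aligned(void *ptr, size_t len)
{
	/* Catch a broken promise in assert-enabled builds. */
	assert(((uintptr_t) ptr % sizeof(long)) == 0);

#if defined(__GNUC__) || defined(__clang__)
	/* Tell the optimizer about the alignment (GCC/Clang builtin). */
	ptr = __builtin_assume_aligned(ptr, sizeof(long));
#endif

	/* Leave the actual zeroing strategy to the compiler and libc. */
	memset(ptr, 0, len);
}
```

A caller would use it like a plain memset() to zero, for example
zero_long_aligned(&some_struct, sizeof(some_struct)) for a struct that
is known to be long-aligned. Whether the hint makes any measurable
difference is exactly what the benchmarking would have to show.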