From: Nazir Bilal Yavuz
Date: Thu, 7 Aug 2025 14:15:06 +0300
Subject: Re: Speed up COPY FROM text/CSV parsing using SIMD
To: Shinya Kato
Cc: pgsql-hackers@postgresql.org
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="00000000000097ed71063bc492fb"

--00000000000097ed71063bc492fb
Content-Type: text/plain; charset="UTF-8"

Hi,

Thank you for working on this!

On Thu, 7 Aug 2025 at 04:49, Shinya Kato wrote:
>
> Hi hackers,
>
> I have implemented SIMD optimization for the COPY FROM (FORMAT {csv,
> text}) command and observed approximately a 5% performance
> improvement.
> Please see the detailed test results below.

I have been working on the same idea. I was not moving input_buf_ptr as
far as possible, so I think your approach is better.

Also, I ran a benchmark on the text format, with line lengths in the
table ranging from 1 byte to 1 megabyte. The peak improvement is at a
line length of 4096 bytes, where it is more than 20% [1]. I saw no
regression with your patch.

> Idea
> ====
> The current text/CSV parser processes input byte-by-byte, checking
> whether each byte is a special character (\n, \r, quote, escape) or a
> regular character, and transitions states in a state machine. This
> sequential processing is inefficient and likely causes frequent branch
> mispredictions due to the many if statements.
>
> I thought this problem could be addressed by leveraging SIMD and
> vectorized operations for faster processing.
>
> Implementation Overview
> =======================
> 1. Create a vector of special characters (e.g., Vector8 nl =
> vector8_broadcast('\n');).
> 2. Load the input buffer into a Vector8 variable called chunk.
> 3. Perform vectorized operations between chunk and the special
> character vectors to check if the buffer contains any special
> characters.
> 4-1. If no special characters are found, advance the input_buf_ptr by
> sizeof(Vector8).
> 4-2. If special characters are found, advance the input_buf_ptr as far
> as possible, then fall back to the original text/CSV parser for
> byte-by-byte processing.
> ...
> Thought?
> I would appreciate feedback on the implementation and any suggestions
> for further improvement.

I have a couple of ideas that I was working on:

---

+ * However, SIMD optimization cannot be applied in the following cases:
+ * - Inside quoted fields, where escape sequences and closing quotes
+ * require sequential processing to handle correctly.

I think you can continue SIMD inside quoted fields. The only important
thing is that you need to set last_was_esc to false when SIMD skips the
chunk.
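
As a rough illustration of the chunked scan and the last_was_esc reset
described above, here is a minimal standalone sketch. It is not the patch:
CHUNK_SIZE, special_mask(), first_set_bit() and skip_plain_bytes() are
hypothetical names, and a scalar loop stands in for the Vector8 comparisons
so that the snippet compiles without PostgreSQL headers. Resetting
last_was_esc on a partial advance is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sizeof(Vector8); real vector widths vary. */
#define CHUNK_SIZE 16

/*
 * Build a bitmask with bit i set when chunk[i] is one of the special bytes.
 * This scalar loop imitates comparing a chunk against broadcast vectors and
 * extracting a per-byte mask; it is not PostgreSQL's Vector8 API.
 */
static uint32_t
special_mask(const unsigned char *chunk,
             const unsigned char *specials, size_t nspecials)
{
    uint32_t    mask = 0;

    for (size_t i = 0; i < CHUNK_SIZE; i++)
        for (size_t j = 0; j < nspecials; j++)
            if (chunk[i] == specials[j])
                mask |= (uint32_t) 1 << i;
    return mask;
}

/* Index of the lowest set bit; the mask must be nonzero. */
static size_t
first_set_bit(uint32_t mask)
{
    size_t      pos = 0;

    assert(mask != 0);
    while ((mask & 1) == 0)
    {
        mask >>= 1;
        pos++;
    }
    return pos;
}

/*
 * Advance from pos over bytes that need no special handling and return the
 * new position. Whole chunks without special bytes are skipped; when a chunk
 * contains one, we stop at its first special byte so the caller's
 * byte-by-byte parser can take over. Requires pos <= len.
 */
static size_t
skip_plain_bytes(const char *buf, size_t len, size_t pos,
                 const unsigned char *specials, size_t nspecials,
                 bool *last_was_esc)
{
    while (len - pos >= CHUNK_SIZE)
    {
        uint32_t    mask = special_mask((const unsigned char *) buf + pos,
                                        specials, nspecials);

        if (mask != 0)
        {
            size_t      advance = first_set_bit(mask);

            /* Assumption: any skipped byte is plain, so it was not an escape. */
            if (advance > 0)
                *last_was_esc = false;
            return pos + advance;
        }

        /* No special bytes in this chunk: skip it and clear the escape flag. */
        pos += CHUNK_SIZE;
        *last_was_esc = false;
    }
    return pos;
}
```

Because the escape character is part of the special set, a chunk with no
matches cannot end in an escape, which is why clearing last_was_esc after a
skipped chunk is safe in this sketch.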
---

+ * - When the remaining buffer size is smaller than the size of a SIMD
+ * vector register, as SIMD operations require processing data in
+ * fixed-size chunks.

You run SIMD when 'copy_buf_len - input_buf_ptr >= sizeof(Vector8)', but
you only call CopyLoadInputBuf() when 'input_buf_ptr >= copy_buf_len ||
need_data', so you basically have to wait for at least sizeof(Vector8)
characters to pass before SIMD can run again. In the worst case, if
CopyLoadInputBuf() puts one character less than sizeof(Vector8) into the
buffer, you can never run SIMD. I think we need to make sure that
CopyLoadInputBuf() loads at least sizeof(Vector8) characters into
input_buf so that we do not encounter that problem.

---

What do you think about adding SIMD to the CopyReadAttributesText() and
CopyReadAttributesCSV() functions? When I add your SIMD approach to the
CopyReadAttributesText() function, the improvement on the 4096-byte line
length input [1] goes from 20% to 30%.

---

I shared my ideas as a Feedback.txt file (.txt to stay off CFBot's radar
for this thread). I hope these help; please let me know if you have any
questions.

-- Regards, Nazir Bilal Yavuz Microsoft --00000000000097ed71063bc492fb Content-Type: text/plain; charset="US-ASCII"; name="Feedback.txt" Content-Disposition: attachment; filename="Feedback.txt" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_me1aqsxg0 RnJvbSBiMTNmNGNkZjEzNGVlZjVmYmVjZjllYTA2ZjliMWM5OTg5MGI3YzAyIE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBOYXppciBCaWxhbCBZYXZ1eiA8YnlhdnV6ODFAZ21haWwuY29t PgpEYXRlOiBUaHUsIDcgQXVnIDIwMjUgMTM6Mjc6MzQgKzAzMDAKU3ViamVjdDogW1BBVENIXSBG ZWVkYmFjawoKLS0tCiBzcmMvYmFja2VuZC9jb21tYW5kcy9jb3B5ZnJvbXBhcnNlLmMgfCA1NSAr KysrKysrKysrKysrKysrKysrKysrKysrKy0tCiAxIGZpbGUgY2hhbmdlZCwgNTEgaW5zZXJ0aW9u cygrKSwgNCBkZWxldGlvbnMoLSkKCmRpZmYgLS1naXQgYS9zcmMvYmFja2VuZC9jb21tYW5kcy9j b3B5ZnJvbXBhcnNlLmMgYi9zcmMvYmFja2VuZC9jb21tYW5kcy9jb3B5ZnJvbXBhcnNlLmMKaW5k ZXggNWFiYTBmYTZjYjcuLmRhZTVjMWY2OThjIDEwMDY0NAotLS0gYS9zcmMvYmFja2VuZC9jb21t YW5kcy9jb3B5ZnJvbXBhcnNlLmMKKysrIGIvc3JjL2JhY2tlbmQvY29tbWFuZHMvY29weWZyb21w YXJzZS5jCkBAIC02NzAsOCArNjcwLDEyIEBAIENvcHlMb2FkSW5wdXRCdWYoQ29weUZyb21TdGF0 ZSBjc3RhdGUpCiAJCS8qIElmIHdlIG5vdyBoYXZlIHNvbWUgdW5jb252ZXJ0ZWQgZGF0YSwgdHJ5 IHRvIGNvbnZlcnQgaXQgKi8KIAkJQ29weUNvbnZlcnRCdWYoY3N0YXRlKTsKIAotCQkvKiBJZiB3 ZSBub3cgaGF2ZSBzb21lIG1vcmUgaW5wdXQgYnl0ZXMgcmVhZHksIHJldHVybiB0aGVtICovCi0J CWlmIChJTlBVVF9CVUZfQllURVMoY3N0YXRlKSA+IG5ieXRlcykKKwkJLyoKKwkJICogSWYgd2Ug bm93IGhhdmUgYXQgbGVhc3Qgc2l6ZW9mKFZlY3RvcjgpIGlucHV0IGJ5dGVzIHJlYWR5LCByZXR1 cm4KKwkJICogdGhlbS4gVGhpcyBpcyBiZW5lZmljaWFsIGZvciBTSU1EIHByb2Nlc3NpbmcgaW4g dGhlCisJCSAqIENvcHlSZWFkTGluZVRleHQoKSBmdW5jdGlvbi4KKwkJICovCisJCWlmIChJTlBV VF9CVUZfQllURVMoY3N0YXRlKSA+IG5ieXRlcyArIHNpemVvZihWZWN0b3I4KSkKIAkJCXJldHVy bjsKIAogCQkvKgpAQCAtMTMyMiw3ICsxMzI2LDcgQEAgQ29weVJlYWRMaW5lVGV4dChDb3B5RnJv bVN0YXRlIGNzdGF0ZSwgYm9vbCBpc19jc3YpCiAJCSAqIHVuc2FmZSB3aXRoIHRoZSBvbGQgdjIg Q09QWSBwcm90b2NvbCwgYnV0IHdlIGRvbid0IHN1cHBvcnQgdGhhdAogCQkgKiBhbnltb3JlLgog CQkgKi8KLQkJaWYgKGlucHV0X2J1Zl9wdHIgPj0gY29weV9idWZfbGVuIHx8IG5lZWRfZGF0YSkK 
KwkJaWYgKGlucHV0X2J1Zl9wdHIgKyBzaXplb2YoVmVjdG9yOCkgPj0gY29weV9idWZfbGVuIHx8 IG5lZWRfZGF0YSkKIAkJewogCQkJUkVGSUxMX0xJTkVCVUY7CiAKQEAgLTEzNTksNyArMTM2Myw3 IEBAIENvcHlSZWFkTGluZVRleHQoQ29weUZyb21TdGF0ZSBjc3RhdGUsIGJvb2wgaXNfY3N2KQog CQkgKiAgIHZlY3RvciByZWdpc3RlciwgYXMgU0lNRCBvcGVyYXRpb25zIHJlcXVpcmUgcHJvY2Vz c2luZyBkYXRhIGluCiAJCSAqICAgZml4ZWQtc2l6ZSBjaHVua3MuCiAJCSAqLwotCQlpZiAoIWlu X3F1b3RlICYmIGNvcHlfYnVmX2xlbiAtIGlucHV0X2J1Zl9wdHIgPj0gc2l6ZW9mKFZlY3Rvcjgp KQorCQlpZiAoY29weV9idWZfbGVuIC0gaW5wdXRfYnVmX3B0ciA+PSBzaXplb2YoVmVjdG9yOCkp CiAJCXsKIAkJCVZlY3RvcjgJCWNodW5rOwogCQkJVmVjdG9yOAkJbWF0Y2g7CkBAIC0xMzk1LDYg KzEzOTksNyBAQCBDb3B5UmVhZExpbmVUZXh0KENvcHlGcm9tU3RhdGUgY3N0YXRlLCBib29sIGlz X2NzdikKIAkJCXsKIAkJCQkvKiBObyBzcGVjaWFsIGNoYXJhY3RlcnMgZm91bmQsIHNvIHNraXAg dGhlIGVudGlyZSBjaHVuayAqLwogCQkJCWlucHV0X2J1Zl9wdHIgKz0gc2l6ZW9mKFZlY3Rvcjgp OworCQkJCWxhc3Rfd2FzX2VzYyA9IGZhbHNlOwogCQkJCWNvbnRpbnVlOwogCQkJfQogCQl9CkBA IC0xNjUwLDYgKzE2NTUsMTEgQEAgQ29weVJlYWRBdHRyaWJ1dGVzVGV4dChDb3B5RnJvbVN0YXRl IGNzdGF0ZSkKIAljaGFyCSAgICpjdXJfcHRyOwogCWNoYXIJICAgKmxpbmVfZW5kX3B0cjsKIAor I2lmbmRlZiBVU0VfTk9fU0lNRAorCVZlY3RvcjgJCWJzID0gdmVjdG9yOF9icm9hZGNhc3QoJ1xc Jyk7CisJVmVjdG9yOAkJZGVsaW0gPSB2ZWN0b3I4X2Jyb2FkY2FzdChkZWxpbWMpOzsKKyNlbmRp ZgorCiAJLyoKIAkgKiBXZSBuZWVkIGEgc3BlY2lhbCBjYXNlIGZvciB6ZXJvLWNvbHVtbiB0YWJs ZXM6IGNoZWNrIHRoYXQgdGhlIGlucHV0CiAJICogbGluZSBpcyBlbXB0eSwgYW5kIHJldHVybi4K QEAgLTE3MTcsNiArMTcyNyw0MyBAQCBDb3B5UmVhZEF0dHJpYnV0ZXNUZXh0KENvcHlGcm9tU3Rh dGUgY3N0YXRlKQogCQl7CiAJCQljaGFyCQljOwogCisjaWZuZGVmIFVTRV9OT19TSU1ECisJCWlm IChsaW5lX2VuZF9wdHIgLSBjdXJfcHRyID49IHNpemVvZihWZWN0b3I4KSkKKwkJeworCQkJVmVj dG9yOAkJY2h1bms7CisJCQlWZWN0b3I4CQltYXRjaDsKKwkJCXVpbnQzMgkJbWFzazsKKworCQkJ LyogTG9hZCBhIGNodW5rIG9mIGRhdGEgaW50byBhIHZlY3RvciByZWdpc3RlciAqLworCQkJdmVj dG9yOF9sb2FkKCZjaHVuaywgKGNvbnN0IHVpbnQ4ICopIGN1cl9wdHIpOworCisJCQkvKiBDcmVh dGUgYSBtYXNrIG9mIGFsbCBzcGVjaWFsIGNoYXJhY3RlcnMgd2UgbmVlZCB0byBzdG9wIGF0ICov 
CisJCQltYXRjaCA9IHZlY3Rvcjhfb3IodmVjdG9yOF9lcShjaHVuaywgYnMpLCB2ZWN0b3I4X2Vx KGNodW5rLCBkZWxpbSkpOworCisJCQkvKiBDaGVjayBpZiB3ZSBmb3VuZCBhbnkgc3BlY2lhbCBj aGFyYWN0ZXJzICovCisJCQltYXNrID0gdmVjdG9yOF9oaWdoYml0X21hc2sobWF0Y2gpOworCQkJ aWYgKG1hc2sgIT0gMCkKKwkJCXsKKwkJCQkvKgorCQkJCSAqIEZvdW5kIGEgc3BlY2lhbCBjaGFy YWN0ZXIuIEFkdmFuY2UgdXAgdG8gdGhhdCBwb2ludCBhbmQgbGV0CisJCQkJICogdGhlIHNjYWxh ciBjb2RlIGhhbmRsZSBpdC4KKwkJCQkgKi8KKwkJCQlpbnQgYWR2YW5jZSA9IHBnX3JpZ2h0bW9z dF9vbmVfcG9zMzIobWFzayk7CisJCQkJbWVtY3B5KG91dHB1dF9wdHIsIGN1cl9wdHIsIGFkdmFu Y2UpOworCQkJCW91dHB1dF9wdHIgKz0gYWR2YW5jZTsKKwkJCQljdXJfcHRyICs9IGFkdmFuY2U7 CisJCQl9CisJCQllbHNlCisJCQl7CisJCQkJLyogTm8gc3BlY2lhbCBjaGFyYWN0ZXJzIGZvdW5k LCBzbyBza2lwIHRoZSBlbnRpcmUgY2h1bmsgKi8KKwkJCQltZW1jcHkob3V0cHV0X3B0ciwgY3Vy X3B0ciwgc2l6ZW9mKFZlY3RvcjgpKTsKKwkJCQlvdXRwdXRfcHRyICs9IHNpemVvZihWZWN0b3I4 KTsKKwkJCQljdXJfcHRyICs9IHNpemVvZihWZWN0b3I4KTsKKwkJCQljb250aW51ZTsKKwkJCX0K KwkJfQorI2VuZGlmCisKIAkJCWVuZF9wdHIgPSBjdXJfcHRyOwogCQkJaWYgKGN1cl9wdHIgPj0g bGluZV9lbmRfcHRyKQogCQkJCWJyZWFrOwotLSAKMi41MC4xCgo= --00000000000097ed71063bc492fb--