Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sQTbR-00A0hr-1d for pgsql-general@arkaria.postgresql.org; Sun, 07 Jul 2024 15:15:01 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sQTbP-00DJwJ-FR for pgsql-general@arkaria.postgresql.org; Sun, 07 Jul 2024 15:14:59 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sQTbP-00DJwA-4W for pgsql-general@lists.postgresql.org; Sun, 07 Jul 2024 15:14:59 +0000 Received: from fhigh8-smtp.messagingengine.com ([103.168.172.159]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sQTbH-000s2b-St for pgsql-general@lists.postgresql.org; Sun, 07 Jul 2024 15:14:57 +0000 Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 48865114029E; Sun, 7 Jul 2024 11:14:51 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Sun, 07 Jul 2024 11:14:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=aklaver.com; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm1; t=1720365291; x=1720451691; bh=jUnnxyxvGEgBHBXL/tbSbVVl52Ex4CKUFlChVy8hRDo=; b= DGcHmENlbMhZKNVr4NQyGFmCBRVSBK65M5nR/0KVTnWFsIHkqYHDK4C1KONh0jxL Q2BPqAQpsRPULtdQOrSc4HaFTXOPVYqehP+niL8Ms9S9mznojzp1vDuh8x8lcSNA twoHnZ/lCt4pUrAmn6JxV59eAfsibWeWOxp47FwC7c5SNf50PjBu9dpEdOHjiyFZ qykgnC0eVR3Gy2C4OLYMldfaK2SNP8hy8UUpGkEHcxFOlmM6aP1NOSK6ePITtPYI z7PW4J6xDtrwKljvD3xex0V0Mk36PiV1pEEMFgETqh4izseA/8yooktlHGQrg75/ 2Kp+DG/HEIf79lB9I7n3kg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm2; t=1720365291; x= 1720451691; bh=jUnnxyxvGEgBHBXL/tbSbVVl52Ex4CKUFlChVy8hRDo=; b=j 00y4rdRKDlZmzyPCyrD3rAI2J5MuSSVPeuZtyL6I620Luj38m9rrO9+txNPBp922 A+DNf9Y/h6P0QMNu8P9c/gfXiyj8G6XxKaTJQpNAKb9pfGubqx/we4YFf0cO93j0 wCTpewU0Rad7vQXvBlmuw3XYU0I+NNiR1/dqm0+ns+ADYbWR2Bh/LNDTZFS8RFTt 2Dc8ETCr1uF2piCMM8FJkJ+agpSURmk6c5SE2FRsiM7eDEREccvHGqvs7KNcElzm JNT865m3GLB/6J1HHkGmbLEGJqI5S4/j1euDGOnhe0vMAaWnQD0I3XLTTDjIZ7gk RPFd2fBJ3b9XvVAwHXgrA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrvdehgdekkecutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecunecujfgurhepkfffgggfuffvvehfhfgjtgfgsehtke ertddtvdejnecuhfhrohhmpeetughrihgrnhcumfhlrghvvghruceorggurhhirghnrdhk lhgrvhgvrhesrghklhgrvhgvrhdrtghomheqnecuggftrfgrthhtvghrnhepfeegfeeiue dtgffgteeggfehkeejheetieeliefgteeikeejvdeiveeigfehvedtnecuvehluhhsthgv rhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomheprggurhhirghnrdhklhgrvh gvrhesrghklhgrvhgvrhdrtghomh X-ME-Proxy: Feedback-ID: i76984098:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Sun, 7 Jul 2024 11:14:50 -0400 (EDT) Message-ID: <5937dba2-8ac5-4c88-a419-7bd4f428e9bf@aklaver.com> Date: Sun, 7 Jul 2024 08:14:50 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: Load a csv or a avro? To: sud Cc: pgsql-general References: <11d5753c-578e-41a2-af17-6de956f03058@aklaver.com> Content-Language: en-US From: Adrian Klaver In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 7/6/24 13:09, sud wrote: > On Fri, Jul 5, 2024 at 8:24 PM Adrian Klaver > wrote: > > On 7/5/24 02:08, sud wrote: > > Hello all, > > > > Its postgres database. We have option of getting files in csv > and/or in > > avro format messages from another system to load it into our > postgres > > database. The volume will be 300million messages per day across many > > files in batches. > > Are dumping the entire contents of each file or are you pulling a > portion of the data out? > > > > Yes, all the fields in the file have to be loaded to the columns in the > tables in postgres. But how will that matter here for deciding if we > should ask the data in .csv or .avro format from the outside system to > load into the postgres database in row and column format? Again my > understanding was that irrespective of anything , the .csv file load > will always faster as because the data is already stored in row and > column format as compared to the .avro file in which the parser has to > perform additional job to make it row and column format or map it to the > columns of the database table. Is my understanding correct here? If you are going to use complete rows and all rows then COPY of CSV in Postgres would be your best choice. -- Adrian Klaver adrian.klaver@aklaver.com