From: Ian Lawrence Barwick
Date: Wed, 12 Jan 2022 23:35:16 +0900
Subject: Re: How to read an external pdf file from postgres?
To: Amine Tengilimoglu
Cc: "pgsql-general@postgresql.org >> PG-General Mailing List"

On Wed, 12 Jan 2022 at 20:16, Amine Tengilimoglu wrote:
>
> Hi;
>
> I want to read an external pdf file from postgres. pdf file will exist
> on the disk. postgres only know the disk full path as metadata. Is there
> any software or extension that can be used for this? Or do we have to
> develop software for it? Or what is the best approach for this? I'd
> appreciate it if anyone with experience could make suggestions.

By "read" do you mean "open the file and meaningfully extract data from it"?

If so, speaking from prior experience, don't. And if you really have to, make
sure the source PDF is guaranteed to be in a well-defined, predictable format
enforceable by contract law and/or people with sharp pointy sticks.

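
To illustrate what "well-defined, predictable format" could mean in practice,
here is a minimal sketch in Python, to be run outside the database before any
import. It only checks that a file starts with the PDF magic bytes and that
text already extracted from it (by whatever tool) matches an agreed layout.
The line layout, the regular expression and both function names are
hypothetical, chosen for illustration; nothing here comes from the original
thread, and it does not extract text from the PDF itself.

```python
import re
from pathlib import Path

# Hypothetical agreed layout: one record per line, for example
# "INV-000123;2022-01-12;199.90". Adjust to whatever the contract says.
EXPECTED_LINE = re.compile(r"INV-\d{6};\d{4}-\d{2}-\d{2};\d+\.\d{2}")


def looks_like_pdf(path):
    """Return True if the file starts with the PDF magic bytes '%PDF-'.

    This is a strict check; it says nothing about whether the rest of
    the file is a valid or usable PDF.
    """
    with Path(path).open("rb") as fh:
        return fh.read(5) == b"%PDF-"


def validate_extracted_text(text):
    """Turn already-extracted text into rows, rejecting unexpected lines.

    Blank lines are skipped. The first line that does not match the
    agreed layout raises ValueError, so nothing half-parsed gets
    anywhere near the database.
    """
    rows = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if not EXPECTED_LINE.fullmatch(line):
            raise ValueError(
                f"line {lineno} does not match the agreed layout: {line!r}"
            )
        rows.append(tuple(line.split(";")))
    return rows
```

Only rows that pass this kind of gate would then be handed to an ordinary
INSERT or COPY; anything else goes back to whoever produced the PDF.
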
I have successfully suppressed the memories of whatever it was I once had to
do with reading data from PDFs, but although the data was eventually imported
into PostgreSQL, there was a lot of mangling, probably involving a Perl module
(other languages are probably available), before it got anywhere near the
database.

Regards

Ian Barwick

-- 
EnterpriseDB: https://www.enterprisedb.com