public inbox for [email protected]  
help / color / mirror / Atom feed
Archiving solutions
7+ messages / 4 participants
[nested] [flat]

* Archiving solutions
@ 2025-01-12 20:42  kasem adel <[email protected]>
  0 siblings, 2 replies; 7+ messages in thread

From: kasem adel @ 2025-01-12 20:42 UTC (permalink / raw)
  To: pgsql-admin

Dears ,

Appreciate your support I have 2.4 TB database and I need to implement
archiving solutions to prevent data growthing.

Thanks


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Archiving solutions
@ 2025-01-12 20:55  Imran Khan <[email protected]>
  parent: kasem adel <[email protected]>
  1 sibling, 1 reply; 7+ messages in thread

From: Imran Khan @ 2025-01-12 20:55 UTC (permalink / raw)
  To: kasem adel <[email protected]>; +Cc: pgsql-admin

Hi,

Plan to use table partitioning and do regular housekeeping of the cluster.

Thanks,
Imran

On Sun, Jan 12, 2025, 11:42 PM kasem adel <[email protected]> wrote:

> Dears ,
>
> Appreciate your support I have 2.4 TB database and I need to implement
> archiving solutions to prevent data growthing.
>
> Thanks
>


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Archiving solutions
@ 2025-01-13 01:56  Ron Johnson <[email protected]>
  parent: kasem adel <[email protected]>
  1 sibling, 1 reply; 7+ messages in thread

From: Ron Johnson @ 2025-01-13 01:56 UTC (permalink / raw)
  To: pgsql-admin

COPY exists, and is very fast.

In my experience, DELETE is quite fast when you:
1. have a supporting index, and
2. don't try to delete too much in one statement.

Those are the only PG comments that I needed to archive a 6TB DB down to
2TB.

On Sun, Jan 12, 2025 at 3:42 PM kasem adel <[email protected]> wrote:

> Dears ,
>
> Appreciate your support I have 2.4 TB database and I need to implement
> archiving solutions to prevent data growthing.
>
> Thanks
>


-- 
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Archiving solutions
@ 2025-01-13 04:33  Rajesh Kumar <[email protected]>
  parent: Ron Johnson <[email protected]>
  0 siblings, 1 reply; 7+ messages in thread

From: Rajesh Kumar @ 2025-01-13 04:33 UTC (permalink / raw)
  To: Ron Johnson <[email protected]>; +Cc: pgsql-admin

What are regular housekeeping activities @Imran?

On Mon, 13 Jan 2025, 07:27 Ron Johnson, <[email protected]> wrote:

> COPY exists, and is very fast.
>
> In my experience, DELETE is quite fast when you:
> 1. have a supporting index, and
> 2. don't try to delete too much in one statement.
>
> Those are the only PG comments that I needed to archive a 6TB DB down to
> 2TB.
>
> On Sun, Jan 12, 2025 at 3:42 PM kasem adel <[email protected]> wrote:
>
>> Dears ,
>>
>> Appreciate your support I have 2.4 TB database and I need to implement
>> archiving solutions to prevent data growthing.
>>
>> Thanks
>>
>
>
> --
> Death to <Redacted>, and butter sauce.
> Don't boil me, I'm still alive.
> <Redacted> lobster!
>


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Archiving solutions
@ 2025-01-13 04:48  Imran Khan <[email protected]>
  parent: Rajesh Kumar <[email protected]>
  0 siblings, 0 replies; 7+ messages in thread

From: Imran Khan @ 2025-01-13 04:48 UTC (permalink / raw)
  To: Rajesh Kumar <[email protected]>; +Cc: Ron Johnson <[email protected]>; pgsql-admin

Hi Rajesh,

Housekeeping activities in PostgreSQL are essential to ensure the database
remains high performing, secure, and reliable over time. Below is a
categorized list of key housekeeping tasks:


---

1. Routine Maintenance Tasks

Vacuuming:

Use VACUUM to reclaim storage and update visibility maps.

Run VACUUM FULL for aggressive storage reclamation (requires downtime).


Analyze Statistics:

Run ANALYZE to update query planner statistics for efficient query
execution.

Use autovacuum for automatic vacuum and analyze.


Reindexing:

Use REINDEX to rebuild corrupted or bloated indexes.

Schedule periodic reindexing for heavily updated tables.




---

2. Backup and Recovery

Regular Backups:

Use pg_basebackup or pg_dump for full backups.

Implement incremental backups if using third-party tools (e.g., pgBackRest,
Barman).


Test Recovery:

Periodically restore backups in a test environment to ensure recovery
readiness.




---

3. Monitoring and Logging

Review Logs:

Check PostgreSQL logs for errors, slow queries, and unusual activities.

Adjust log_min_duration_statement to capture slow queries.


Monitor Performance Metrics:

Use tools like pg_stat_activity, pg_stat_replication, and
pg_stat_progress_vacuum.

Implement monitoring solutions (e.g., pgAdmin, Prometheus/Grafana,
pgwatch2).




---

4. Database Optimization

Index Maintenance:

Identify unused or bloated indexes using pg_stat_user_indexes.

Drop unused indexes and optimize queries.


Query Optimization:

Use EXPLAIN or EXPLAIN ANALYZE to analyze query performance.

Optimize slow queries by rewriting or indexing.


Partitioning:

Use table partitioning for large datasets to improve performance and
manageability.




---

5. Configuration Management

Parameter Tuning:

Adjust key parameters like work_mem, maintenance_work_mem,
effective_cache_size, and max_connections.

Regularly review and tune configuration files (postgresql.conf and
pg_hba.conf).


Autovacuum Settings:

Ensure autovacuum is enabled and tuned for your workload.

Monitor pg_stat_autovacuum for activity.




---

6. Security and User Management

User Roles and Permissions:

Regularly review roles and privileges.

Remove unused accounts and enforce strong passwords.


SSL/TLS Management:

Enable SSL for secure connections.

Rotate certificates periodically.


Audit Logs:

Use extensions like pgAudit for detailed logging of user activities.




---

7. Archiving and WAL Management

WAL Archiving:

Enable archive_mode and configure archive_command for point-in-time
recovery (PITR).


Cleanup Old WAL Files:

Ensure old WAL files are deleted or archived properly to free up storage.




---

8. Database Growth Management

Table and Index Bloat Analysis:

Use pgstattuple or pg_repack to identify and manage bloat.


Disk Usage Monitoring:

Regularly monitor disk usage with pg_database_size() and pg_relation_size().




---

9. Updates and Patches

Apply Updates:

Keep PostgreSQL updated with the latest patches for bug fixes and security.


Extension Updates:

Update installed extensions (e.g., PostGIS, pglogical) to their latest
versions.




---

10. Miscellaneous Tasks

Replication Monitoring:

Monitor replication lag and the health of replicas using
pg_stat_replication.


System Resource Monitoring:

Monitor CPU, memory, and I/O usage to ensure the system is not bottlenecked.


Disk Defragmentation:

Perform disk defragmentation (OS level) if necessary.




---

These activities, when performed periodically, will help keep your
PostgreSQL database healthy and performing optimally. Automating some of
these tasks via scripts or tools can further reduce manual intervention.


On Mon, Jan 13, 2025, 7:33 AM Rajesh Kumar <[email protected]>
wrote:

> What are regular housekeeping activities @Imran?
>
> On Mon, 13 Jan 2025, 07:27 Ron Johnson, <[email protected]> wrote:
>
>> COPY exists, and is very fast.
>>
>> In my experience, DELETE is quite fast when you:
>> 1. have a supporting index, and
>> 2. don't try to delete too much in one statement.
>>
>> Those are the only PG comments that I needed to archive a 6TB DB down to
>> 2TB.
>>
>> On Sun, Jan 12, 2025 at 3:42 PM kasem adel <[email protected]> wrote:
>>
>>> Dears ,
>>>
>>> Appreciate your support I have 2.4 TB database and I need to implement
>>> archiving solutions to prevent data growthing.
>>>
>>> Thanks
>>>
>>
>>
>> --
>> Death to <Redacted>, and butter sauce.
>> Don't boil me, I'm still alive.
>> <Redacted> lobster!
>>
>


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Archiving solutions
@ 2025-01-13 23:08  kasem adel <[email protected]>
  parent: Imran Khan <[email protected]>
  0 siblings, 1 reply; 7+ messages in thread

From: kasem adel @ 2025-01-13 23:08 UTC (permalink / raw)
  To: Imran Khan <[email protected]>; pgsql-admin

Dear imeran ,

Thanks for your support.

Dears,

Appreciate your support in simple solution to implement it.

Thanks



في الأحد، ١٢ يناير ٢٠٢٥ ١١:٢٣ م Imran Khan <[email protected]> كتب:

> Hi,
>
>  How the data can be stored. Do you have historical data? If yes , then
> you can keep that data in archived tables or if that data is still required
> by business then store it in partitions monthly wise . You need to design
> the data storage with proper planning. Also, if you have dynamic and ever
> changing data and you have date records then store with respect to
> partitions and retain for last 3 or 6 months then move to archived tables
> or purge if not required .. keep track of indexes and capture long running
> SQL statements through pg_stat_statements extension and run explain plan to
> get the proper indexes created based on the output of that plan. Also, keep
> track of memory related parameters to check if any memory leak occurs
> resulting in EXCESSIVE swap space usage on OS.
>
>  That's my opinion but we have more experts here who can help us to
> understand more.
>
> Thanks,
> Imran
>
> On Sun, Jan 12, 2025, 11:57 PM kasem adel <[email protected]> wrote:
>
>> Hi,
>>
>> What is the regular housekeeping.
>>
>> Thanks
>>
>> في الأحد، ١٢ يناير ٢٠٢٥ ١٠:٥٥ م Imran Khan <[email protected]> كتب:
>>
>>> Hi,
>>>
>>> Plan to use table partitioning and do regular housekeeping of the
>>> cluster.
>>>
>>> Thanks,
>>> Imran
>>>
>>> On Sun, Jan 12, 2025, 11:42 PM kasem adel <[email protected]> wrote:
>>>
>>>> Dears ,
>>>>
>>>> Appreciate your support I have 2.4 TB database and I need to implement
>>>> archiving solutions to prevent data growthing.
>>>>
>>>> Thanks
>>>>
>>>


^ permalink  raw  reply  [nested|flat] 7+ messages in thread

* Re: Archiving solutions
@ 2025-01-24 03:36  Rajesh Kumar <[email protected]>
  parent: kasem adel <[email protected]>
  0 siblings, 0 replies; 7+ messages in thread

From: Rajesh Kumar @ 2025-01-24 03:36 UTC (permalink / raw)
  To: kasem adel <[email protected]>; +Cc: Imran Khan <[email protected]>; pgsql-admin

Thank you Imran

On Tue, 14 Jan 2025, 04:38 kasem adel, <[email protected]> wrote:

> Dear imeran ,
>
> Thanks for your support.
>
> Dears,
>
> Appreciate your support in simple solution to implement it.
>
> Thanks
>
>
>
> في الأحد، ١٢ يناير ٢٠٢٥ ١١:٢٣ م Imran Khan <[email protected]> كتب:
>
>> Hi,
>>
>>  How the data can be stored. Do you have historical data? If yes , then
>> you can keep that data in archived tables or if that data is still required
>> by business then store it in partitions monthly wise . You need to design
>> the data storage with proper planning. Also, if you have dynamic and ever
>> changing data and you have date records then store with respect to
>> partitions and retain for last 3 or 6 months then move to archived tables
>> or purge if not required .. keep track of indexes and capture long running
>> SQL statements through pg_stat_statements extension and run explain plan to
>> get the proper indexes created based on the output of that plan. Also, keep
>> track of memory related parameters to check if any memory leak occurs
>> resulting in EXCESSIVE swap space usage on OS.
>>
>>  That's my opinion but we have more experts here who can help us to
>> understand more.
>>
>> Thanks,
>> Imran
>>
>> On Sun, Jan 12, 2025, 11:57 PM kasem adel <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> What is the regular housekeeping.
>>>
>>> Thanks
>>>
>>> في الأحد، ١٢ يناير ٢٠٢٥ ١٠:٥٥ م Imran Khan <[email protected]> كتب:
>>>
>>>> Hi,
>>>>
>>>> Plan to use table partitioning and do regular housekeeping of the
>>>> cluster.
>>>>
>>>> Thanks,
>>>> Imran
>>>>
>>>> On Sun, Jan 12, 2025, 11:42 PM kasem adel <[email protected]> wrote:
>>>>
>>>>> Dears ,
>>>>>
>>>>> Appreciate your support I have 2.4 TB database and I need to implement
>>>>> archiving solutions to prevent data growthing.
>>>>>
>>>>> Thanks
>>>>>
>>>>


^ permalink  raw  reply  [nested|flat] 7+ messages in thread


end of thread, other threads:[~2025-01-24 03:36 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-01-12 20:42 Archiving solutions kasem adel <[email protected]>
2025-01-12 20:55 ` Imran Khan <[email protected]>
2025-01-13 23:08   ` kasem adel <[email protected]>
2025-01-24 03:36     ` Rajesh Kumar <[email protected]>
2025-01-13 01:56 ` Ron Johnson <[email protected]>
2025-01-13 04:33   ` Rajesh Kumar <[email protected]>
2025-01-13 04:48     ` Imran Khan <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox