From: Ian Lawrence Barwick <[email protected]>
To: David Johnston <[email protected]>
Cc: pgsql-docs <[email protected]>
Subject: Re: PATCH: Warn users about tablespace abuse data loss risk
Date: Wed, 12 Feb 2014 15:16:58 +0900
Message-ID: <CAB8KJ=iFi6V5xRojdWeSC+CWNksRsAB9MNkjB9Th37aQg38w8Q@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<CAB8KJ=hR+tmQAxdV8Gv3tJUN3cUp5o_x4t-mj5Ub=ZhZbYBcig@mail.gmail.com>
	<[email protected]>

2014-02-12 14:06 GMT+09:00 David Johnston <[email protected]>:
> Ian Lawrence Barwick wrote
>> 2014-02-12 12:52 GMT+09:00 Craig Ringer <craig@>:
>>> Hi all
>>>
>>> I've just seen another case of data loss due to misuse of /
>>> misunderstanding of tablespaces:
>>>
>>> http://dba.stackexchange.com/questions/58704/how-do-i-access-a-old-saved-tablespace-after-reinstalli...
>>>
>>> and it's prompted me to write some docs amendments to make it more
>>> obvious that *you shouldn't do that*.
>>>
>>> Not that it'll stop people, but it'll at least mean they can't say we
>>> didn't warn them.
>>>
>>> This is actually quite important, because many users are used to MySQL's
>>> MyISAM, where each table contains its own metadata and is readable by
>>> simply copying the table into a different MySQL install's data
>>> directory. It doesn't even have to be the same version! Users are
>>> clearly surprised that PostgreSQL tablespaces don't have the same
>>> properties.
>>>
>>> Thoughts?
>>
>> People still use MyISAM!?
>>
>> I had a similar issue pop up at work a while back, having something
>> explicit to point to is definitely a good idea.
>>
>> Suggestion for the first paragraph of the patch (sorry I can't provide it
>> in patch form right now):
>>
>>   Even if they are located outside the main PostgreSQL data directory,
>>   tablespaces are an integral part of the database cluster and
>>   <emphasis>cannot</emphasis> be treated as an autonomous collection of
>>   data files. They rely on metadata contained in the main data directory,
>>   without which they are useless. In particular, tablespaces cannot be
>>   reattached to a different database cluster, and backing up individual
>>   tablespaces makes no sense as a backup/redundancy method. Similarly,
>>   if you lose a tablespace (file deletion, disk failure, etc) the main
>>   database may become unreadable or fail to start.
>>

> While providing additional warnings is good and necessary it may also help
> to be more descriptive as to in what situations tablespaces are appropriate
> and/or necessary so that people leave with a better understanding of why the
> feature exists and not just trying to know what not to use it for.  It goes
> against the more prescriptive tone of the documentation generally but both
> approaches work well together to tackle the knowledge/understanding gap some
> users seem to have.

The warning would appear on this page:

  http://www.postgresql.org/docs/current/static/manage-ag-tablespaces.html

which describes what tablespaces *can* do, but unless you're familiar with the
structure of the PostgreSQL data directories, it's not obvious what you *can't*
do. I recall reading a blog post a while back about tablespaces being "archived"
to the cloud with disastrous results, and a quick search pulls up
stuff like this:

  http://stackoverflow.com/questions/3534415/moving-postgres-tablespaces-and-tables-across-ec2-instanc...

so it's definitely not a niche issue. Something "official" to link to would be
very useful in this kind of situation. That doesn't preclude the general
documentation being improved, of course.
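
For anyone not familiar with the data directory layout, here is a rough
sketch of why a tablespace is not self-contained. The cluster records each
tablespace as a symbolic link under pg_tblspc in the main data directory,
named after the tablespace's OID; the files inside the tablespace are only
numbered, and the catalogs that give them meaning live in the main data
directory. The Python below is illustrative only: the function names are
made up for this example, and it assumes a Unix-like system where the
entries are ordinary symlinks.

```python
import os
from pathlib import Path


def tablespace_links(pgdata):
    """Return {oid_name: target_path} for each symlink in pgdata/pg_tblspc.

    Hypothetical helper: pgdata is the path to the main data directory.
    """
    links = {}
    tblspc_dir = Path(pgdata) / "pg_tblspc"
    for entry in sorted(tblspc_dir.iterdir()):
        # Each user-defined tablespace appears as a symlink named by OID.
        if entry.is_symlink():
            links[entry.name] = os.readlink(entry)
    return links


def missing_tablespaces(pgdata):
    """Return the OID names whose symlink target directory no longer exists."""
    return [
        oid
        for oid, target in tablespace_links(pgdata).items()
        if not os.path.isdir(target)
    ]
```

Run against a data directory, missing_tablespaces() would list tablespaces
whose storage has gone away. The reverse case cannot be repaired by any
script like this: a tablespace directory without its original main data
directory has nothing to map its numbered files back to tables.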

Regards

Ian Barwick
