* OSD-Based Object Stubs
From: Marcel Lauhoff @ 2015-05-27  8:39 UTC
  To: ceph-devel

Hi,

I wrote a prototype for an OSD-based object stub feature. An object stub
is an object with its data moved /elsewhere/. I hope to get some
feedback, especially on whether I'm on the right path here and whether it
is a feature you are interested in.



Code is in my "osd-stubs" branch:
 https://github.com/ceph/ceph/compare/master...irq0:osd-stubs
 https://github.com/irq0/ceph/tree/osd-stubs

 Tools to toy around with osd-stubs + web server to send stubs to:
 https://github.com/irq0/ceph_osd-stub_tools



Related:
- https://wiki.ceph.com/Planning/Blueprints/%3CSIDEBOARD%3E/osd:_tiering:_object_redirects



Implementation:

Adds two new OSD OPs:
- STUB :: Move data away; Save location in xattr; Set
          object_info_t::stub_state to 'remote'
- UNSTUB :: Get data back; Remove xattr; Set object_info_t::stub_state
            to 'local'


STUB is meant to be called by an external archive agent. UNSTUB is called
implicitly when OPs come in that need the object's data. Operations are
classified as "may_use_obj_data", similar to how op->may_{write,read,cache} work.

The implicit UNSTUB is implemented by prepending an UNSTUB operation to the
incoming OP list if "may_use_obj_data() and stub_state == REMOTE".
This sadly causes a warning in the client saying that
the reply doesn't match the request. The client is of course right, but I found
it to be the simplest way to try that feature.
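
The implicit UNSTUB check described above can be sketched roughly as
follows. This is a minimal Python model, not the prototype's C++ code:
the Op class, the DATA_OPS set, and the prepare_ops helper are
hypothetical names. Only may_use_obj_data, stub_state, and the UNSTUB op
name come from the description.

```python
from dataclasses import dataclass
from enum import Enum


class StubState(Enum):
    LOCAL = "local"
    REMOTE = "remote"


# Hypothetical classification: op names assumed to need the object's data.
DATA_OPS = {"READ", "WRITE"}


@dataclass(frozen=True)
class Op:
    name: str

    def may_use_obj_data(self) -> bool:
        return self.name in DATA_OPS


def prepare_ops(ops: list[Op], stub_state: StubState) -> list[Op]:
    """Prepend UNSTUB when any op needs data that currently lives remotely."""
    needs_data = any(op.may_use_obj_data() for op in ops)
    if needs_data and stub_state is StubState.REMOTE:
        return [Op("UNSTUB"), *ops]
    return list(ops)


read = [Op("READ")]
print([op.name for op in prepare_ops(read, StubState.REMOTE)])
print([op.name for op in prepare_ops(read, StubState.LOCAL)])
print([op.name for op in prepare_ops([Op("GETXATTR")], StubState.REMOTE)])
```

In this model the op list handed to the OSD is one entry longer than
what the client sent, which is presumably why the client complains that
the reply does not match the request.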

External storage in the prototype is just simple HTTP: PUT on STUB;
GET+DELETE on UNSTUB.

The operations implement a kind of converter: STUB reads from the
primary OSD, stores the object on the remote, and then issues TRUNCATE(0)
and SETXATTR OPs. Similarly, UNSTUB retrieves the object, then does
WRITEFULL and RMXATTR.
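
To make the converter flow concrete, here is a rough Python model of the
two operations. It is only a sketch under assumptions: the remote store
is a plain dict standing in for the HTTP server (PUT, GET, DELETE), and
the StoredObject class, the stub/unstub functions, the "stub.location"
xattr key, and the URL are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes = b""
    xattrs: dict[str, str] = field(default_factory=dict)
    stub_state: str = "local"


LOCATION_XATTR = "stub.location"  # hypothetical xattr name


def stub(obj: StoredObject, remote: dict[str, bytes], url: str) -> None:
    """STUB: copy data to the remote, then TRUNCATE(0) + SETXATTR."""
    remote[url] = obj.data              # HTTP PUT in the prototype
    obj.data = b""                      # TRUNCATE(0)
    obj.xattrs[LOCATION_XATTR] = url    # SETXATTR
    obj.stub_state = "remote"


def unstub(obj: StoredObject, remote: dict[str, bytes]) -> None:
    """UNSTUB: fetch data back, then WRITEFULL + RMXATTR."""
    url = obj.xattrs[LOCATION_XATTR]
    data = remote[url]                  # HTTP GET
    del remote[url]                     # HTTP DELETE
    obj.data = data                     # WRITEFULL
    del obj.xattrs[LOCATION_XATTR]      # RMXATTR
    obj.stub_state = "local"


remote_store: dict[str, bytes] = {}
obj = StoredObject(data=b"hello")
stub(obj, remote_store, "http://archive.example/obj1")
print(obj.stub_state, len(obj.data))
unstub(obj, remote_store)
print(obj.stub_state, obj.data)
```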


~marcel

--
Marcel Lauhoff
Mail/XMPP: ml@irq0.org
http://irq0.org

* Re: OSD-Based Object Stubs
From: Haomai Wang @ 2015-05-27  9:24 UTC
  To: Marcel Lauhoff; +Cc: ceph-devel@vger.kernel.org

I guess it should be something like what Sam designed at CDS
(https://wiki.ceph.com/Planning/Blueprints/Infernalis/osd%3A_Tiering_II_(Warm-%3ECold))




-- 
Best Regards,

Wheat

* Re: OSD-Based Object Stubs
From: Robert LeBlanc @ 2015-05-27 16:37 UTC
  To: Marcel Lauhoff; +Cc: ceph-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

At first I thought this was to allow the OSDs to stub the location of
the real data after a CRUSH map change so that it didn't have to
relocate the data right away (or at all) and reduce the number of map
changes. Then I realized it was for cold storage tiering. Is the idea
to be able to move data off to near-line storage like tape?

Thanks,
- ----------------
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1



-----BEGIN PGP SIGNATURE-----
Version: Mailvelope v0.13.1
Comment: https://www.mailvelope.com

wsFcBAEBCAAQBQJVZfK8CRDmVDuy+mK58QAAQUAP/jCje1UlnbYJYlLc2aLC
OkXBPRTQH1+kza7uloibEpZLaaQpJ3IP+ttBjzbUz3w4DC6Gh31nJgWzidFL
qji/b9qIRNBRquypR575K17PtK2xuudFnIObcaGCgmQ36XHcMqjFZYON6I4Q
wE+kptkFCbSzo+5JBZsJwsjrQNpVMonSxaSZ08WNamslIWhKuLpFfWF49YyW
Aq9YiSehXxiWpUU880FyDY/WNLqflWrWUSXXt+T/cYzzGb9BXAH4jEJl2Dtn
QIothZZeabDs+QpF9dxL3teOlHhcqHjPdPOK102/67E+hZz91PTVzf5r3Egs
B2OEL3AbjsYk3A5EZK5rxxd4Judc+glDYKq4H+wVFhVFiuBrcv/UsIFx/ZHH
9nJijKQFu/JKJzrkl0ubZDZodXSyR7h+dBUnPwiKNZvBWqwR5BaYJepEl82H
hTr3GslhLg1MpaMGqST7yOtR7BxZTb4+U5LhCUZXW5JFvjkNM0kS8VgtnkPP
OsPLETq8kTAsnxCzbRtwFIZSW0236Rf2z9naazbgmaZW/uwjlz2rWaxSw8zA
+uYBo6dEO7GhhUPrgJW/EiVkqEpwXRqd2hYSeiD63Bz8/efxSExAaoFgHPRt
fDRJKHheDmzzYRtOHX21QTV6Lnw9hpjdmC27WAmEfvTNCmLWrzXeF4SS/g1g
3/+W
=7BjB
-----END PGP SIGNATURE-----

* Re: OSD-Based Object Stubs
From: Gregory Farnum @ 2015-05-28  5:56 UTC
  To: Marcel Lauhoff; +Cc: ceph-devel@vger.kernel.org

On Wed, May 27, 2015 at 1:39 AM, Marcel Lauhoff <ml@irq0.org> wrote:
> Related:
> - https://wiki.ceph.com/Planning/Blueprints/%3CSIDEBOARD%3E/osd:_tiering:_object_redirects

Do you have a shorter summary than the code of how these stub and
unstub operations relate to the object redirects? We didn't make a
great deal of use of them but the basic data structures are mostly
present in the codebase, are interpreted in at least some of the right
places, and were definitely intended to cover this kind of use case.
:)
-Greg

* Re: OSD-Based Object Stubs
From: Marcel Lauhoff @ 2015-05-28  9:45 UTC
  To: Robert LeBlanc; +Cc: ceph-devel


Robert LeBlanc <robert@leblancnet.us> writes:

> At first I thought this was to allow the OSDs to stub the location of
> the real data after a CRUSH map change so that it didn't have to
> relocate the data right away (or at all) and reduce the number of map
> changes. Then I realized it was for cold storage tiering. Is the idea
> to be able to move data off to near-line storage like tape?

Yes, cold storage. Basically, it lets the OSD move data elsewhere, such as
to tape or some archive system.

~marcel
--
Marcel Lauhoff
Mail/XMPP: ml@irq0.org
http://irq0.org

* Re: OSD-Based Object Stubs
From: Marcel Lauhoff @ 2015-05-28 10:01 UTC
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org


Gregory Farnum <greg@gregs42.com> writes:

> Do you have a shorter summary than the code of how these stub and
> unstub operations relate to the object redirects? We didn't make a
> great deal of use of them but the basic data structures are mostly
> present in the codebase, are interpreted in at least some of the right
> places, and were definitely intended to cover this kind of use case.
> :)
> -Greg


As far as I understand the redirect feature, it is about pointing to
other objects inside the Ceph cluster. The stubs feature allows
pointing to anything; in the concept code that is an HTTP server.

Stubs also use an IMHO simpler approach to getting objects back: it is
the task of the OSD. Stubbed objects just take longer to access, because
they have to be unstubbed first.
Redirects, on the other hand, leave this to the client: object redirected
-> tell the client to retrieve it elsewhere.

~marcel

--
Marcel Lauhoff
Mail/XMPP: ml@irq0.org
http://irq0.org

* Re: OSD-Based Object Stubs
From: Gregory Farnum @ 2015-06-09 23:58 UTC
  To: Marcel Lauhoff; +Cc: ceph-devel@vger.kernel.org

On Thu, May 28, 2015 at 3:01 AM, Marcel Lauhoff <ml@irq0.org> wrote:
>
> As far as I understand the redirect feature, it is about pointing to
> other objects inside the Ceph cluster. The stubs feature allows
> pointing to anything; in the concept code that is an HTTP server.
>
> Stubs also use an IMHO simpler approach to getting objects back: it is
> the task of the OSD. Stubbed objects just take longer to access, because
> they have to be unstubbed first.
> Redirects, on the other hand, leave this to the client: object redirected
> -> tell the client to retrieve it elsewhere.

Ah, of course.

I got a chance to look at this briefly today. Some notes:

* You're using synchronous reads. That will prevent use of stubbing on
EC pools (which only do async reads, as they might need to hit another
OSD for the data), which seems sad.
* There seems to be a race if you need to unstub an op for two
separate requests that come in simultaneously, with nothing preventing
both of them from initiating the unstub.
* You can inject an unstub for read ops, but that turns them into a
write. That will cause problems in various cases where the object
isn't writeable yet.
* Why does a delete need the object data?
* You definitely wouldn't want to unstub data for scrubbing.
* There's a CEPH_OSD_OP_STAT which looks at what's in the object info;
that is broken here because you're using the normal truncation path.
There probably needs to be more cleverness or machinery distinguishing
between the "local" size used and the size of the object represented.
* I am concerned about recovery, but I think as long as the stub is
just a normal zero-sized object with an extra flag in the object info
that shouldn't be a big deal.
* I think snapshots are probably busted with this; did you check how
they interact?
-Greg

* Re: OSD-Based Object Stubs
From: Marcel Lauhoff @ 2015-06-20 10:18 UTC
  To: Gregory Farnum; +Cc: ceph-devel@vger.kernel.org


Hi,

thanks for the comments!

Gregory Farnum <greg@gregs42.com> writes:

> I got a chance to look at this briefly today. Some notes:
>
> * You're using synchronous reads. That will prevent use of stubbing on
> EC pools (which only do async reads, as they might need to hit another
> OSD for the data), which seems sad.
Good point. I haven't looked at how EC pools work yet. I assumed that
a stub feature would be quite different for the two pool types and tried
replicated pools first.

> * There seems to be a race if you need to unstub an op for two
> separate requests that come in simultaneously, with nothing preventing
> both of them from initiating the unstub.
Right. I should probably add some "in flight" states there.
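
One way such an "in flight" state could work is sketched below: the
first request that finds the object in the remote state flips it to an
intermediate state and starts the fetch, and later requests queue behind
it instead of starting a second fetch. This is an illustrative Python
model only; the UNSTUBBING state, the StubTracker class, and its method
names are assumptions, not part of the prototype.

```python
from enum import Enum, auto


class StubState(Enum):
    LOCAL = auto()
    REMOTE = auto()
    UNSTUBBING = auto()  # hypothetical "in flight" state


class StubTracker:
    def __init__(self) -> None:
        self.state = StubState.REMOTE
        self.waiters: list[str] = []   # requests parked until data is back
        self.fetches_started = 0

    def on_request(self, request_id: str) -> bool:
        """Return True if the request can run now, False if it was queued."""
        if self.state is StubState.LOCAL:
            return True
        if self.state is StubState.REMOTE:
            # Only the first request triggers the remote fetch.
            self.state = StubState.UNSTUBBING
            self.fetches_started += 1
        self.waiters.append(request_id)
        return False

    def on_unstub_done(self) -> list[str]:
        """Mark data as local and hand back the queued requests in order."""
        self.state = StubState.LOCAL
        ready, self.waiters = self.waiters, []
        return ready


tracker = StubTracker()
tracker.on_request("req-1")
tracker.on_request("req-2")
print(tracker.fetches_started)
print(tracker.on_unstub_done())
```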

> * You can inject an unstub for read ops, but that turns them into a
> write. That will cause problems in various cases where the object
> isn't writeable yet.
I thought I fixed that by doing "ctx->op->set_write()" in the implicit
unstub code.

> * Why does a delete need the object data?
That was just a shortcut: the quite simplistic remote API only has
put and get. An unstub before the delete also deletes the remote object.

> * You definitely wouldn't want to unstub data for scrubbing.
What's the alternative? Should the remote do the scrubbing, or should
scrub just skip the stubbed object?

> * There's a CEPH_OSD_OP_STAT which looks at what's in the object info;
> that is broken here because you're using the normal truncation path.
> There probably needs to be more cleverness or machinery distinguishing
> between the "local" size used and the size of the object represented.
Of course.
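
The distinction between the local size and the represented size could
look roughly like this. The sketch is a Python model with made-up field
names (logical_size, local_size); it does not reflect the actual
object_info_t layout, only the idea that a stat should report the size
the object represents while space accounting uses the bytes kept locally.

```python
from dataclasses import dataclass


@dataclass
class ObjectInfo:
    logical_size: int      # size a client should see from a stat
    local_size: int        # bytes actually stored on the OSD
    stub_state: str = "local"


def stub_truncate(info: ObjectInfo) -> None:
    """Drop local data for a stub without touching the logical size."""
    info.local_size = 0
    info.stub_state = "remote"


def stat_size(info: ObjectInfo) -> int:
    """What a stat-like operation would return to the client."""
    return info.logical_size


info = ObjectInfo(logical_size=4096, local_size=4096)
stub_truncate(info)
print(stat_size(info), info.local_size)
```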

> * I think snapshots are probably busted with this; did you check how
> they interact?
With this implementation I think they really are. Stubs+Snapshots could
be a nice thing for backups: just stub a read-only snapshot.

> -Greg

~marcel

--
Marcel Lauhoff
Mail/XMPP: ml@irq0.org
http://irq0.org

* Re: OSD-Based Object Stubs
From: Gregory Farnum @ 2015-06-23 12:38 UTC
  To: Marcel Lauhoff; +Cc: ceph-devel@vger.kernel.org

On Sat, Jun 20, 2015 at 11:18 AM, Marcel Lauhoff <ml@irq0.org> wrote:
>
> Hi,
>
> thanks for the comments!
>
> Gregory Farnum <greg@gregs42.com> writes:
>
>> I got a chance to look at this briefly today. Some notes:
>>
>> * You're using synchronous reads. That will prevent use of stubbing on
>> EC pools (which only do async reads, as they might need to hit another
>> OSD for the data), which seems sad.
> Good point. I haven't looked at how EC pools work yet. I assumed that
> a stub feature would be quite different for the two pool types and tried
> replicated pools first.

I'm not sure that will be necessary, actually. The advantage of only
doing GET/PUT (unstub/stub) is that you're doing only full-object
reads and writes; it doesn't require any of the features EC pools
don't provide.

>> * There seems to be a race if you need to unstub an op for two
>> separate requests that come in simultaneously, with nothing preventing
>> both of them from initiating the unstub.
> Right. I should probably add some "in flight" states there.
>
>> * You can inject an unstub for read ops, but that turns them into a
>> write. That will cause problems in various cases where the object
>> isn't writeable yet.
> I thought I fixed that by doing "ctx->op->set_write()" in the implicit
> unstub code.

No, the implicit unstub will have to be more involved than that. :(
RADOS writes aren't allowed to return any data to the user except for
a return code, and I believe that's enforced at the end by clearing
out/ignoring any of the return bufferlists we would otherwise pack up.
This is because we have to be able to return the exact same stuff on
replayed ops, in case the acting set of OSDs changes without the
client getting a response. Now, the unstub is a bit different in that
the data doesn't change in response to the user requiring an unstub,
but I think it still has some parallelism issues in that scenario.

>
>> * Why does a delete need the object data?
> That was just a shortcut: the quite simplistic remote API only has
> put and get. An unstub before the delete also deletes the remote object.
>
>> * You definitely wouldn't want to unstub data for scrubbing.
> What's the alternative? Should the remote do the scrubbing, or should
> scrub just skip the stubbed object?

I think you'd want to scrub both the "full" and "stub" metadata for
the object, but rely on the stub target to keep the actual bundle of
bytes safe.
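
A rough model of that idea: for a stubbed object, scrub compares only
metadata across replicas and leaves the data check to the stub target.
Everything here (the Replica class, the scrub_object function, CRC32 as
the data digest, the xattr key and URL) is an assumption chosen for
illustration, not how Ceph's scrub is implemented.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Replica:
    data: bytes
    xattrs: dict[str, str] = field(default_factory=dict)
    stub_state: str = "local"


def scrub_object(replicas: list[Replica]) -> bool:
    """Return True when all replicas agree on what scrub is allowed to check."""
    first = replicas[0]
    for other in replicas[1:]:
        # Metadata is always compared, stubbed or not.
        if other.stub_state != first.stub_state or other.xattrs != first.xattrs:
            return False
        # Data digests are compared only while the data is local.
        if first.stub_state == "local":
            if zlib.crc32(other.data) != zlib.crc32(first.data):
                return False
    return True


loc = {"stub.location": "http://archive.example/obj1"}
stubs = [Replica(b"", dict(loc), "remote"), Replica(b"", dict(loc), "remote")]
print(scrub_object(stubs))
```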

>
>> * There's a CEPH_OSD_OP_STAT which looks at what's in the object info;
>> that is broken here because you're using the normal truncation path.
>> There probably needs to be more cleverness or machinery distinguishing
>> between the "local" size used and the size of the object represented.
> Of course.
>
>> * I think snapshots are probably busted with this; did you check how
>> they interact?
> With this implementation I think they really are. Stubs+Snapshots could
> be a nice thing for backups: just stub a read-only snapshot.

Right, so all of these things will need to be worked out well before
we contemplate merging, and some of them are complicated enough that
they might require changing the core implementation to handle. You
probably don't want to delay it. :)
-Greg
