Conversation

@mtrmac
Collaborator

@mtrmac mtrmac commented Dec 3, 2018

This adds types.BlobInfoCache recording the encountered compressed/uncompressed digest relationships and docker:// blob locations, and using them to:

  • Mount blobs on docker:// destinations instead of copying the data again
  • Reuse local layers on containers-storage: destinations instead of pulling them
  • Avoid blob decompression when converting from schema1 just to figure out the DiffID values

The blob info cache is stored in /var/lib/containers/blob-info-cache-v1.boltdb for EUID=0 users, and $XDG_DATA_HOME/containers/blob-info-cache-v1.boltdb for unprivileged users.
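For orientation, the interface is roughly this shape (a simplified sketch; see the `types.BlobInfoCache` documentation added by this PR for the authoritative definition):

```go
package types // sketch of the interface added by this PR (simplified)

import digest "github.com/opencontainers/go-digest"

// BlobInfoCache records compressed/uncompressed digest pairs and known blob
// locations, so that later copies can reuse or mount blobs instead of
// transferring them again.
type BlobInfoCache interface {
	// UncompressedDigest returns the uncompressed digest corresponding to
	// anyDigest, or "" if unknown.
	UncompressedDigest(anyDigest digest.Digest) digest.Digest
	// RecordDigestUncompressedPair records that anyDigest is a (possibly
	// compressed) representation of the blob with the given uncompressed digest.
	RecordDigestUncompressedPair(anyDigest, uncompressed digest.Digest)
	// RecordKnownLocation records that a blob with blobDigest exists at
	// location within the given transport scope (e.g. a docker:// repository).
	RecordKnownLocation(transport ImageTransport, scope BICTransportScope, blobDigest digest.Digest, location BICLocationReference)
	// CandidateLocations returns prioritized locations that may contain
	// blobDigest; with canSubstitute, equivalent blobs (e.g. differently
	// compressed versions) may be returned as well.
	CandidateLocations(transport ImageTransport, scope BICTransportScope, blobDigest digest.Digest, canSubstitute bool) []BICReplacementCandidate
}
```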

See individual commit messages for more details.

WIP items:

  • When canSubstitute, dockerImageDestination makes a mount request even within the same repo, which is unnecessary.
  • Documentation of the BlobInfoCache interface and possibly other code
  • Tests for corner cases of the BoltDB code
  • Logging on failures in the BoltDB code
@mtrmac mtrmac force-pushed the blob-info-caching branch 2 times, most recently from 0071274 to 090ac79 on December 3, 2018 at 17:48
@rhatdan
Member

rhatdan commented Dec 3, 2018

@nalind PTAL

@mtrmac
Collaborator Author

mtrmac commented Dec 3, 2018

Tests are failing here due to the API change; an updated Skopeo runs the tests in containers/skopeo#570 .

@mtrmac
Collaborator Author

mtrmac commented Dec 3, 2018

The blob info cache is stored in /var/lib/containers/blob-info-cache-v1.boltdb for EUID=0 users, and $XDG_DATA_HOME/containers/blob-info-cache-v1.boltdb for unprivileged users.

This might not be the right thing to do - apparently rootless buildah runs in a user namespace, which causes it to see EUID==0 and to try to write to /var/lib/containers. That does not break builds, because any error causes the code to fall back to memory-only caches, but it does make the optimizations unavailable.
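For reference, the default path selection is essentially this shape (a minimal sketch, not the exact code from this PR):

```go
// Minimal sketch of the default cache path selection described above; the
// actual implementation in this PR may differ in details (e.g. error handling).
package blobinfocache

import (
	"os"
	"path/filepath"
)

func defaultCachePath() string {
	if os.Geteuid() == 0 {
		// Note: inside a rootless user namespace this also reports EUID==0,
		// so this branch is chosen there, the write to /var/lib/containers
		// fails, and the code falls back to a memory-only cache.
		return "/var/lib/containers/blob-info-cache-v1.boltdb"
	}
	dataHome := os.Getenv("XDG_DATA_HOME")
	if dataHome == "" {
		dataHome = filepath.Join(os.Getenv("HOME"), ".local", "share")
	}
	return filepath.Join(dataHome, "containers", "blob-info-cache-v1.boltdb")
}
```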

return err
}
return nil
}) // FIXME? Log error?
Member

Why don't you just logrus.Error when this happens?

Collaborator Author

Yes, probably. That’s one of the WIP parts.

Collaborator Author

(The slightly tricky aspect of this is that failures using the cache are never fatal (which might not be a good idea, to be fair), so a single failure could result in dozens of identical or similar error messages per copy; we probably want to throttle them somehow, or maybe even mark the cache as broken after a few failures.)
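One possible shape for such throttling (purely illustrative; nothing like this exists in the PR):

```go
// Purely illustrative sketch of per-copy error throttling; hypothetical names.
package blobinfocache

import "github.com/sirupsen/logrus"

type throttledErrors struct {
	seen, max int // e.g. max = 3
}

func (t *throttledErrors) logError(err error) {
	if t.seen >= t.max {
		return // enough failures reported for this copy already
	}
	t.seen++
	logrus.Errorf("Blob info cache failure (non-fatal): %v", err)
	if t.seen == t.max {
		logrus.Error("Further blob info cache failures will not be reported")
	}
}
```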

return err
}
return nil
}) // FIXME? Log error?
Member

Why not logrus.Error()?

}
return nil
}); err != nil { // Including os.IsNotExist(err)
return []types.BICReplacementCandidate{} // FIXME? Log?
Member

Shouldn't we log everything except IsNotExist?
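i.e., something along these lines (a sketch of the suggested pattern, with hypothetical helper names):

```go
// Sketch of the suggested pattern: a missing cache file is expected on
// first use, anything else deserves a log message.
if err := bdc.view(func(tx *bolt.Tx) error {
	// ... collect candidates ...
	return nil
}); err != nil {
	if !os.IsNotExist(err) {
		logrus.Errorf("Error reading blob info cache: %v", err)
	}
	return []types.BICReplacementCandidate{}
}
```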

return types.BlobInfo{}, "", errors.Wrapf(err, "Error trying to reuse blob %s at destination", srcInfo.Digest)
}
if reused {
ic.c.Printf("Skipping fetch of repeat blob %s\n", srcInfo.Digest)
Member

Can you reword this?

Skipping blob %s fetch, already exists?
Is this something we actually need to see, or would it be better to just see what is pulled?

Collaborator Author

(“fetch” in this message includes pushes.)

This can definitely be improved, at least s/blob/layer/, and maybe we don’t need the digests in the output at all. That’s all fairly independent from the functionality of this PR.

Collaborator Author

(“fetch” in this message includes pushes.)

i.e. all of pull/push/copy go through this code path and may skip transferring a blob between transports here.

Member

Does this just become noise on the screen; would the user care? It seems to be more like debugging information.

Member

What about “Blob transfer skip”? It is more generic than “fetch” and more valid in the context of pushes.

@rhatdan
Member

rhatdan commented Dec 5, 2018

@mtrmac Is this still WIP? Can we get it vendored into skopeo so we can get tests to pass?

@mtrmac
Collaborator Author

mtrmac commented Dec 5, 2018

@mtrmac Is this still WIP?

Yes, at least for documentation.

Can we get it vendored into skopeo so we can get tests to pass?

Already exists in containers/skopeo#570.

@rhatdan
Member

rhatdan commented Dec 5, 2018

I'm willing to take it without documentation, so we can move the testing along, and then open a separate PR for docs.

Why are the tests failing?

mtrmac added 14 commits December 6, 2018 18:59
This will be used to record information about copied layers, and avoid the copies
if possible.

Does not add any users, so does not change behavior yet.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This will be useful in the next commit. (If mocks are useful at all, let's start reusing them more widely.)

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The helpers have no users yet; they will be added momentarily along with the implementations to be tested.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This will be called by at least two of the implementations.  Has no users
yet, but they will be added momentarily.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This is the primary production implementation.

Does not add any users yet, so should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This exists for tests, as a fallback when we can't use
the BoltDB locations, and possibly for non-copy.Image
users.

Has no users yet, so does not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This exists primarily for the benefit of Image.ConfigBlob
and similar readers, which don't really benefit from the DiffID matching,
blob mounting, and the like, or callers like ImageDestination.PutBlob,
which actively wants to avoid using any extra cached data (because the
cache lookup should have already happened and now it would just be
redundant).

Does not add any users yet, so should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This returns a default BoltDB instance (which almost everyone
should use to maximally benefit from the cache).  The default location
is chosen to exist near the containers/storage GraphRoot (i.e.
typically on the same partition, with the same SELinux context),
both for root and rootless modes of podman.

This adds no users yet, so should not immediately change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The cache is not yet used, and initializing it can only create a few directories
(without aborting the copy operation on failure), so this should not noticeably
affect behavior right now.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Now that we are not using an in-memory cache, the query may be more expensive.
OTOH we want to query the on-disk BlobInfoCache in any case, and the kernel is
likely to cache short-term disk accesses, so an extra in-memory cache would
likely only make things more expensive overall.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This will, primarily, make it easier for the transport to use
alternate locations without having to somehow carry state from
HasBlob to ReapplyBlob.

Also, all instances of ReapplyBlob have been either trivial or redundant,
and there is a single primary caller of HasBlob/ReapplyBlob (+ a few
in-transport users who are even a bit cleaner with the move to
TryReusingBlob).

Should not change behavior, apart from not doing the
storageImageDestination.HasBlob check redundantly in
storageImageDestination.ReapplyBlob.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
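Roughly, the interface change is of this shape (signatures abbreviated; see types.ImageDestination for the authoritative form, and note that later commits in this PR add BlobInfoCache and canSubstitute parameters):

```go
// Before: two calls on types.ImageDestination, with any state implicitly
// carried between them inside the transport.
type destinationBefore interface {
	HasBlob(ctx context.Context, info types.BlobInfo) (bool, int64, error)
	ReapplyBlob(ctx context.Context, info types.BlobInfo) (types.BlobInfo, error)
}

// After: a single call that both checks for and applies a reusable blob.
type destinationAfter interface {
	TryReusingBlob(ctx context.Context, info types.BlobInfo) (bool, types.BlobInfo, error)
}
```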
For now, none of the transports actually use it, so should not change behavior.

copy.Image uses its existing cache object; config parsing in image/* usually uses NoCache because it wouldn't appreciably benefit anyway.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This will allow TryReusingBlob to substitute an equivalent blob for the required one based on cache information when possible, or to skip the substitution when it would break signatures.

In addition, we set canSubstitute to false when signing an image, to make 100% sure
the signed blob is safe (e.g. that we are not signing a third-party-compressed version
which has been maliciously designed to attack some decompressor implementations).

Nothing implements this, so does not change behavior yet.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
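In caller terms, the rule is roughly this (an illustrative fragment with hypothetical variable names; the real logic lives in copy.Image):

```go
// Substituting an equivalent (e.g. differently compressed) blob would change
// the manifest we produce, so only allow it when not signing the result.
canSubstitute := options.SignBy == ""
reused, reusedInfo, err := dest.TryReusingBlob(ctx, srcInfo, blobInfoCache, canSubstitute)
```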
This will allow us to reliably detect that we have processed the full
input (i.e. that PutBlob did not get the data from elsewhere and ignore it).

The value is not used yet, so this does not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…during a copy

We have the data already available, so just use it.

This primarily helps when pushing the same local layer twice to the
same destination while compressing it.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…arerToken

This is a trivial move, does not change behavior; but it will allow
setupRequestAuth to be a bit more complex about the scopes and token caching
in the future.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Now that the code has been moved, integrate it better into its uses.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This will make it easier to cache several different tokens in a single client.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This allows using cross-repository mounts, which require
a token for both reading the source and reading and writing to the destination.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
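For context, a cross-repository mount is a single registry API call, but the bearer token presented must cover both repositories; a sketch of the request shape per the Docker registry HTTP API v2 (a hypothetical helper, not code from this PR):

```go
package docker

import (
	"fmt"
	"net/http"
)

// mountBlob sketches a cross-repository blob mount. The bearer token (omitted
// here) must carry scopes for both repositories, i.e.
//   repository:<sourceRepo>:pull repository:<destRepo>:pull,push
func mountBlob(client *http.Client, registry, destRepo, sourceRepo, blobDigest string) (bool, error) {
	url := fmt.Sprintf("https://%s/v2/%s/blobs/uploads/?mount=%s&from=%s",
		registry, destRepo, blobDigest, sourceRepo)
	resp, err := client.Post(url, "", nil)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// 201 Created: the blob was mounted, no data transfer is needed.
	// 202 Accepted: the mount was refused and a regular upload was started.
	return resp.StatusCode == http.StatusCreated, nil
}
```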
…ryReusingBlob

We will add more calls to blobExists from TryReusingBlob soon.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The implementation already computes both digests locally, so just
record the data so that all transports benefit.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Because storage implements a flat layer namespace, and it already
contains a layer<->DiffID mapping, we don't need to record everything
redundantly in the cache and use cache.CandidateLocations; just
use the cache for digest->DiffID lookups, and look up a layer
by the DiffID using the existing data.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
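The lookup described above is roughly this (an illustrative fragment with hypothetical names; cache.UncompressedDigest is the BlobInfoCache method, and LayersByUncompressedDigest is the existing containers/storage query):

```go
// Map the (possibly compressed) digest to its uncompressed DiffID via the
// cache, then ask containers/storage for matching layers directly, instead
// of recording per-location candidates in the cache.
if uncompressed := cache.UncompressedDigest(blobDigest); uncompressed != "" {
	if layers, err := store.LayersByUncompressedDigest(uncompressed); err == nil && len(layers) > 0 {
		// Reuse layers[0] instead of pulling the blob again.
	}
}
```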
@mtrmac mtrmac force-pushed the blob-info-caching branch from 5c5e6b9 to 3a6fc86 on December 6, 2018 at 18:02
@mtrmac
Collaborator Author

mtrmac commented Dec 6, 2018

Updated to avoid a no-op mount within a single repo, and added documentation to the interface/implementations; I can live with this being merged.

Tests fail here due to the API change; they should eventually pass in containers/skopeo#570 (and after merging this, a variant of containers/skopeo#570 that vendors from this repo's master will need to be merged quickly).

@mtrmac mtrmac changed the title WIP: Blob info caching + mount/reuse Dec 6, 2018
@runcom
Member

runcom commented Dec 6, 2018

Reviewing this in a bit

@runcom
Member

runcom commented Dec 6, 2018

I had just one comment, which we can live with for now/until the beginning of January.
This PR also passes our demo, and the code is pretty solid as well.

I'm gonna merge this; if you have anything against it, please file issues/PRs.

@runcom
Member

runcom commented Dec 6, 2018

Tests in skopeo pass as well: containers/skopeo#570
