zstd compress serialized decls #9147

Open
Atry wants to merge 2 commits into facebook:master from Atry:export-D38624288


@Atry Atry commented Aug 15, 2022

Contributor

Summary:
Should we unconditionally compress decls (when serializing)?

Pro: in practice, serialized decls are almost always cached somewhere, so
compression saves space, and recompression is needed only when a source file changes.

Con: this has not yet been evaluated in integration tests.
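
To illustrate the caching argument, here is a minimal sketch, not code from this diff. It uses Python's standard `zlib` as a stand-in for zstd, and `CompressedDeclCache` and the `serialize` callback are hypothetical names. It only shows that when compressed decls are keyed by source content, compression reruns only for sources that changed.

```python
import hashlib
import zlib


class CompressedDeclCache:
    """Hypothetical cache: source-content hash -> compressed serialized decls."""

    def __init__(self, serialize):
        # serialize is a hypothetical callback: source bytes -> serialized decl bytes.
        self._serialize = serialize
        self._blobs = {}
        self.compressions = 0  # counts how often we actually compress

    def store(self, source: bytes) -> bytes:
        """Return the compressed decls, compressing only for unseen content."""
        key = hashlib.sha256(source).hexdigest()
        if key not in self._blobs:
            # zlib stands in for zstd here; the caching logic is the same.
            self._blobs[key] = zlib.compress(self._serialize(source))
            self.compressions += 1
        return self._blobs[key]

    def load(self, source: bytes) -> bytes:
        """Return the decompressed serialized decls for this source."""
        return zlib.decompress(self.store(source))
```

Storing the same source twice compresses once; storing an edited source compresses again.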

Differential Revision: D38624288

Summary:
With Eval.EnableDecl==true, hackc may request type, func, constant, or module
decls from other files while it is compiling bytecode for a given file. Whole-program
builds are now distributed, and hphpc compiles source code to bytecode remotely
in groups of ~500 files at a time.

To make this work, bytecode compilation (aka "parsing") is now iterative. A new
BatchDeclProvider accumulates the list of requested decls it did not have, and this
list of missing decls drives the iteration in steps:
1. Initially, each source file's list of decls is empty.
2. After compiling, if any missing decls turn out to be available in the global UnitIndex,
add them to the source file's list and try to compile that group of files again.
3. Stop iterating when no additional decls are discovered. This doesn't mean the
missing symbol list was empty; it just means those symbols are permanently missing.

In practice, hackc currently only requests decls it sees in the original source file, so
this loop terminates after two iterations. In the future, hackc may request transitive
decls (e.g. fetch A, then A's parents, and so on), causing more iteration.
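
The iteration described above can be sketched roughly as follows. This is an illustrative model only, not the HHVM implementation: the `BatchDeclProvider` class here is a simplified stand-in that borrows the name, `compile_file` and `compile_group` are hypothetical helpers, and the UnitIndex is modeled as a plain dict from symbol name to declaring file.

```python
from dataclasses import dataclass, field


@dataclass
class BatchDeclProvider:
    """Simplified stand-in: serves known decls and records the ones it lacks."""
    available: set
    missing: set = field(default_factory=set)

    def get_decl(self, symbol: str) -> bool:
        if symbol in self.available:
            return True
        self.missing.add(symbol)  # remember the request we could not satisfy
        return False


def compile_file(requested_symbols, provider):
    """Hypothetical stand-in for hackc: request a decl for every symbol seen."""
    for symbol in requested_symbols:
        provider.get_decl(symbol)


def compile_group(sources, unit_index):
    """sources: path -> symbols the file references; unit_index: symbol -> path."""
    decls = {path: set() for path in sources}  # step 1: start with empty lists
    iterations = 0
    while True:
        iterations += 1
        discovered = False
        for path, symbols in sources.items():
            provider = BatchDeclProvider(available=set(decls[path]))
            compile_file(symbols, provider)
            # Step 2: keep only the missing decls that the global index knows.
            found = {s for s in provider.missing if s in unit_index}
            if found:
                decls[path] |= found
                discovered = True
        if not discovered:
            # Step 3: nothing new was found; anything still missing stays missing.
            return decls, iterations
```

With one file referencing `Foo`, `Bar`, and an unknown `Nope`, and an index containing only `Foo` and `Bar`, this model stops after two passes, which matches the two-iteration behavior described above.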

As an optimization, if Eval.EnableDecl==false, IndexJob no longer stores any decls in
external storage, since nothing will fetch them.

Differential Revision: D37280035

fbshipit-source-id: 67a4aac29c0fe2aa19eec87161755b30f1c8debb

Summary:
Should we unconditionally compress decls (when serializing)?

Pro: in practice, serialized decls are almost always cached somewhere, so
compression saves space, and recompression is needed only when a source file changes.

Con: this has not yet been evaluated in integration tests.

Differential Revision: D38624288

fbshipit-source-id: 4fd512cf03f6d12e2d1bdfea66329e4c6b7267b4
@facebook-github-bot

Contributor

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

Contributor

This pull request was exported from Phabricator. Differential Revision: D38624288

@facebook-github-bot

Contributor

Hi @Atry!

Thank you for your pull request.

We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
