Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[compiler] useMemo calls directly induce memoization blocks #30177

Open
wants to merge 3 commits into
base: gh/mvitousek/8/base
Choose a base branch
from

Conversation

mvitousek
Copy link
Contributor

@mvitousek mvitousek commented Jul 2, 2024

Stack from ghstack (oldest at bottom):

Summary: To support the always-bailing-out and change-detection modes for the compiler, and to potentially support end-user codebases in some situations, we previously built a mode where user-level useMemos weren't dropped. This, however, results in codegen that is quite vastly different from the compiler's default mode, and which is likely to exercise different bugs.

This diff introduces a new mode that attempts to preserve user-level memoization in a way that is more compatible with the normal output of the compiler, dropping the literal useMemo calls and producing reactive scopes. The result of this is different from the existing @ensurePreserveMemoizationGuarantees in that the reactive scope is marked as originating from a useMemo, and cannot be merged with other memoization blocks, and that some operations are memoized that are not necessarily memoized today: specifically, obj.x and f(). This is to account for the fact that current useMemo calls may call non-idempotent functions inside useMemo--this is a violation of React's rules and is unsupported, but this mode attempts to support this behavior to make the compiler's behavior as close to user-level behavior as possible.

We build the user-level reactive scopes by simply adding all memoizable instructions between StartMemo and FinishMemo to their own reactive scope, possibly overwriting an existing scope. We do so before the scopes have been populated with dependencies or outputs so those passes can operate over these new scopes as normal.

Summary: To support the always-bailing-out and change-detection modes for the compiler, and to potentially support end-user codebases in some situations, we previously built a mode where user-level useMemos weren't dropped. This, however, results in codegen that is quite vastly different from the compiler's default mode, and which is likely to exercise different bugs.

This diff introduces a new mode that attempts to preserve user-level memoization in a way that is more compatible with the normal output of the compiler, dropping the literal useMemo calls and producing reactive scopes. The result of this is different from the existing @ensurePreserveMemoizationGuarantees in that the reactive scope is marked as originating from a useMemo, and cannot be merged with other memoization blocks, and that some operations are memoized that are not necessarily memoized today: specifically, `obj.x` and `f()`. This is to account for the fact that current useMemo calls may call non-idempotent functions inside useMemo--this is a violation of React's rules and is unsupported, but this mode attempts to support this behavior to make the compiler's behavior as close to user-level behavior as possible.

We build the user-level reactive scopes by simply adding all memoizable instructions between `StartMemo` and `FinishMemo` to their own reactive scope, possibly overwriting an existing scope. We do so before the scopes have been populated with dependencies or outputs so those passes can operate over these new scopes as normal.

[ghstack-poisoned]
Copy link

vercel bot commented Jul 2, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
react-compiler-playground ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jul 2, 2024 5:09pm
@react-sizebot
Copy link

react-sizebot commented Jul 2, 2024

Comparing: 100dfd7...adc8705

Critical size changes

Includes critical production bundles, as well as any change greater than 2%:

Name +/- Base Current +/- gzip Base gzip Current gzip
oss-stable/react-dom/cjs/react-dom.production.js = 6.68 kB 6.68 kB +0.05% 1.83 kB 1.83 kB
oss-stable/react-dom/cjs/react-dom-client.production.js = 497.99 kB 497.99 kB = 89.27 kB 89.27 kB
oss-experimental/react-dom/cjs/react-dom.production.js = 6.69 kB 6.69 kB +0.05% 1.83 kB 1.83 kB
oss-experimental/react-dom/cjs/react-dom-client.production.js = 502.81 kB 502.81 kB = 89.97 kB 89.97 kB
facebook-www/ReactDOM-prod.classic.js = 597.08 kB 597.08 kB = 105.33 kB 105.33 kB
facebook-www/ReactDOM-prod.modern.js = 571.42 kB 571.42 kB = 101.27 kB 101.27 kB
test_utils/ReactAllWarnings.js Deleted 62.88 kB 0.00 kB Deleted 15.69 kB 0.00 kB

Significant size changes

Includes any change greater than 0.2%:

Expand to show
Name +/- Base Current +/- gzip Base gzip Current gzip
test_utils/ReactAllWarnings.js Deleted 62.88 kB 0.00 kB Deleted 15.69 kB 0.00 kB

Generated by 🚫 dangerJS against adc8705

@mvitousek mvitousek requested a review from mofeiZ July 2, 2024 06:37
Summary: To support the always-bailing-out and change-detection modes for the compiler, and to potentially support end-user codebases in some situations, we previously built a mode where user-level useMemos weren't dropped. This, however, results in codegen that is quite vastly different from the compiler's default mode, and which is likely to exercise different bugs.

This diff introduces a new mode that attempts to preserve user-level memoization in a way that is more compatible with the normal output of the compiler, dropping the literal useMemo calls and producing reactive scopes. The result of this is different from the existing ensurePreserveMemoizationGuarantees in that the reactive scope is marked as originating from a useMemo, and cannot be merged with other memoization blocks, and that some operations are memoized that are not necessarily memoized today: specifically, `obj.x` and `f()`. This is to account for the fact that current useMemo calls may call non-idempotent functions inside useMemo--this is a violation of React's rules and is unsupported, but this mode attempts to support this behavior to make the compiler's behavior as close to user-level behavior as possible.

We build the user-level reactive scopes by simply adding all memoizable instructions between `StartMemo` and `FinishMemo` to their own reactive scope, possibly overwriting an existing scope. We do so before the scopes have been populated with dependencies or outputs so those passes can operate over these new scopes as normal.

[ghstack-poisoned]
Summary: To support the always-bailing-out and change-detection modes for the compiler, and to potentially support end-user codebases in some situations, we previously built a mode where user-level useMemos weren't dropped. This, however, results in codegen that is quite vastly different from the compiler's default mode, and which is likely to exercise different bugs.

This diff introduces a new mode that attempts to preserve user-level memoization in a way that is more compatible with the normal output of the compiler, dropping the literal useMemo calls and producing reactive scopes. The result of this is different from the existing ensurePreserveMemoizationGuarantees in that the reactive scope is marked as originating from a useMemo, and cannot be merged with other memoization blocks, and that some operations are memoized that are not necessarily memoized today: specifically, `obj.x` and `f()`. This is to account for the fact that current useMemo calls may call non-idempotent functions inside useMemo--this is a violation of React's rules and is unsupported, but this mode attempts to support this behavior to make the compiler's behavior as close to user-level behavior as possible.

We build the user-level reactive scopes by simply adding all memoizable instructions between `StartMemo` and `FinishMemo` to their own reactive scope, possibly overwriting an existing scope. We do so before the scopes have been populated with dependencies or outputs so those passes can operate over these new scopes as normal.

[ghstack-poisoned]
} else if (instruction.value.kind === "FinishMemoize") {
currentScope = null;
} else if (currentScope != null) {
const operands = collectMutableOperands(fn, instruction, true);
Copy link
Contributor

@mofeiZ mofeiZ Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic overall makes sense but I'm a bit confused here -- why do we want to collect mutable or allocating instructions here instead of assigning a scope to all lvalues / operands. Similarly, when would we not want to create a reactive scope to represent the manual memo (i.e. the lazy scope creation logic)?

I can't imagine anyone writing this code, but I think this mode should probably preserve its useMemo semantics (playground link)

// @enablePreserveExistingManualUseMemo
let myVar = 2;

function increment() {
  "use no forget";
  myVar += 1;
}
function useFoo() {
  useEffect(increment);
  const res = useMemo(() => myVar, []);
  return res;
}
Copy link
Contributor

@mofeiZ mofeiZ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall makes sense!

Short summary of some offline discussions: this mode will let us enable the compiler in "ultra safe" mode for initial testing and validation. This mode preserves all existing memo and maybe doesn't even insert new memo. This means:

  • This means we can start compiling in runtime validation before/without starting rollout for devX and testing. For current rollouts, both compiler on+off builds can now get runtime validation like hook guards or change detection.
  • This also lets us use a much bigger set of tests to sanity check the compiler. If we're preserving existing memo and adding no new memo, the compiler should essentially be the identity transform (and nothing should break).

As a followup, we probably should freeze / end the mutable range of useMemo dependencies and declarations (see playground example)

function useFoo() {
  const myVar = {};
  const res = useMemo(() => myVar.length, [myVar]);
  
  // myVar should be inferred as a read here
  maybeMutate(myVar);
  return res;
}

Note that this is not just important for avoiding bailouts but also to make sure these useMemos get their own (non-nested; non-overlapping) reactive scope ranges. It might also be worth writing a pass that runs after all the scope merging passes to ensure that source scope ranges never overlap with other scope ranges (Disregard, nesting source scopes within other scopes is correct as the source memo is still retained)

Here is another example why we might want to end mutable ranges of scope dependencies before the scope begins. The implementation in this PR derives the mutable range of the newly created scope to be {min(scopeOperands.map(o => o.mutableRange.start), max(scopeOperands.map(o => o.mutableRange.end)} (without freezing deps).

  • This means that the newly created scopes may begin before the useMemo. If we expect to use this mode as an identity transform (i.e. creating only source scopes), this doesn't quite work because the new memo scopes will include additional instructions.
  • If the newly created scopes overlap with other scopes (overlap := overlapping but not nested mutable ranges), MergeOverlappingScopes must merge it to produce valid scope blocks.
function useFoo(a) {
  // here, x gets assigned scope 0 with range (0, 2) by InferReactiveScopeVars.
0:  const x = { a }; 
  // and y gets assigned scope 1 with range (0, 2) by MemoizeExistingUseMemos.
  // note that the resulting output has scopes 0 and 1 merged because they have
  // identical mutable ranges
1:  const y = useMemo(() => foo(x), []);
2:  return y;
}

As a sidenote, the PromoteUsedTemporaries pass (can't think of any others right now but perhaps @gsathya can correct) relies on dependencies to be valid to produce valid programs. This should be fine as useMemo operands shouldn't consume rvalues ("unnamed" identifiers) from prior instructions. We should also just rewrite this logic anyways (see bug repro), but wanted to note that manually manipulating scopes might make other passes / existing invariants more complex to reason about

@@ -1429,6 +1429,8 @@ export type ReactiveScope = {
merged: Set<ScopeId>;

loc: SourceLocation;

source: boolean;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should check source in all passes that act on scopes.

// flattenScopesWithHooksOrUse
function useFoo() {
  // Assume that useARandomFunction is not a hook
  const res = useMemo(() => useARandomFunction(), []);
  return res;
}
// flattenReactiveLoops
function useFoo() {
  const res = []
  for (let i = 0; i < 5; i++) {
    res.push(useMemo(() => foo(), []));
  }
  return res;
}

(etc)

Copy link
Contributor

@josephsavona josephsavona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall makes sense, but i have similar questions to @mofeiZ about the details of assigning the scopes.

@@ -1429,6 +1429,8 @@ export type ReactiveScope = {
merged: Set<ScopeId>;

loc: SourceLocation;

source: boolean;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name feels a bit too abbreviated to be immediately obvious what it means (when looking at a usage site in particular). Maybe fromManualMemoization or something descriptive?

let currentScope = scope;
for (const instruction of block.instructions) {
if (instruction.value.kind === "StartMemoize") {
currentScope = { kind: "pending", id: nextId() };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add an assertion that currentScope == null, since useMemo can't be nested

Comment on lines +417 to +419
if (current.scope.source !== next.scope.source) {
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this seems like it would allow merging consecutive manual memoization units, is that what we want? i would have expected we'd want to leave them alone, return false if source is true for current/next

Comment on lines +26 to +29
let ctr = 0;
function nextId(): number {
return ctr++;
}
Copy link
Contributor

@josephsavona josephsavona Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this not a module-global, so that the count resets? Or even better, it seems like we should be able to just get env.nextScopeId as soon as we encounter a StartMemoize instruction, since we know that will map 1:1 with the scopes we create?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
5 participants