
Improving Data Handling in Hydrogen and Remix

October 3, 2023

At Shopify we spend a lot of time helping merchants build more performant web applications. More often than not, performance starts with how data is loaded onto each of your pages. In this post, we’ll dive into different strategies for loading and caching data. By getting these right, you can make your webpage faster and more user-friendly.

Let’s start off by creating a product detail page using Hydrogen. Hydrogen, built on Remix, uses loaders for server data queries. It also has a GraphQL client to request data from Shopify’s Storefront API:

Code Example


import {json} from '@shopify/remix-oxygen';
import {useLoaderData} from '@remix-run/react';

export async function loader({context}) {
  const {product} = await context.storefront.query(`#graphql
  {
    product(id: "gid://shopify/Product/7982905098262") {
      id
      title
      description
      featuredImage {
        id
        url
      }
    }
  }
  `);
  return json({product});
}

export default function () {
  const {product} = useLoaderData();
  return <section>
    <h1>{product.title}</h1>
    <p>{product.description}</p>
  </section>
}

GraphQL makes it easy for us to customize a query with everything we need for a page. So consider all that we might want on our page. We’ll obviously need basic product information like the title, description, a few images, and the price. But we’d also like to show at the bottom of the page a list of recommended products associated with the current one. We’d also like the page to render differently when the current product is already added to the cart. For example, instead of an “Add to Cart” button, we’ll render an “adjust quantity” component.

Putting this all together makes a query that is quite large, so for brevity I’ll use a shorthand:

Code Example


export async function loader({context}) {
  const {product} = await context.storefront.query(`#graphql
    /** Basic product information **/
    /** Product price and availability **/
    /** Product Recommendations **/
    /** Cart **/
  `);
  return json({product});
}

Not only is the query itself large, but the response is large too, and the query is slower than we’d like. A common way to improve this is to cache the request, and we can do the same with ours:

Code Example


export async function loader({context}) {
  const {product} = await context.storefront.query(`#graphql
    /** Basic product information **/
    /** Product price and availability **/
    /** Product Recommendations **/
    /** Cart **/
  `, {cache: context.storefront.CacheLong()});
  return json({product});
}

Side-note: When you write queries with Hydrogen, we automatically add a default cache for you.

While caching vastly improves performance, we have also introduced a few serious problems. First, because the request includes the cart, we have cached a personalized request. This means that one visitor’s cart is served to everyone who visits your app. You should never cache a personalized request inside a Remix loader. But if we don’t cache at all, then the performance of our page suffers. We get around this limitation by making two requests instead of only one:

Code Example


export async function loader({context}) {
  const {product} = await context.storefront.query(`#graphql
    /** Basic product information **/
    /** Product price and availability **/
    /** Product Recommendations **/
  `, {cache: context.storefront.CacheLong()});

  const {cart} = await context.storefront.query(`#graphql
    /** Cart **/
  `, {cache: context.storefront.CacheNone()});

  return json({product, cart});
}
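To see why the cart has to stay out of the cache, here is a minimal sketch that runs without Hydrogen. It assumes a shared server-side cache whose key does not identify the visitor. The names `sharedCache`, `fetchCart`, and `cachedQuery` are hypothetical stand-ins, not Hydrogen APIs.

```javascript
// Hypothetical in-memory stand-in for a cache shared by all visitors.
const sharedCache = new Map();

// Fake backend: returns the cart that belongs to the given session.
async function fetchCart(sessionId) {
  return {owner: sessionId, lines: [`item-for-${sessionId}`]};
}

// Caches by query name only, so the key says nothing about who is asking.
async function cachedQuery(queryName, sessionId, {cache}) {
  if (cache && sharedCache.has(queryName)) {
    return sharedCache.get(queryName);
  }
  const result = await fetchCart(sessionId);
  if (cache) sharedCache.set(queryName, result);
  return result;
}

async function demo() {
  // Alice's request fills the cache, so Bob is handed Alice's cart.
  const alice = await cachedQuery('Cart', 'alice', {cache: true});
  const bobCached = await cachedQuery('Cart', 'bob', {cache: true});
  // Bypassing the cache gives Bob his own cart.
  const bobUncached = await cachedQuery('Cart', 'bob', {cache: false});
  return {alice, bobCached, bobUncached};
}
```

Calling `demo()` shows the leak: the cached lookup for Bob returns the object stored for Alice, while the uncached lookup returns Bob's own cart.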

However, we still face another issue related to caching product availability. Certain product properties rarely change and can be cached for longer periods, while others change frequently. Even these frequently changing properties can benefit from caching, but they should be revalidated more often. To address this, we can divide our query once more:

Code Example


export async function loader({context}) {
  const {product} = await context.storefront.query(`#graphql
    /** Basic product information **/
    /** Product Recommendations **/
  `, {cache: context.storefront.CacheLong()});

  const {availability} = await context.storefront.query(`#graphql
    /** Product price and availability **/
  `, {cache: context.storefront.CacheShort()});

  const {cart} = await context.storefront.query(`#graphql
    /** Cart **/
  `, {cache: context.storefront.CacheNone()});

  return json({product, cart, availability});
}
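The idea of giving each kind of data its own lifetime can be sketched with a tiny time-to-live cache. This is an illustration under stated assumptions, not how Hydrogen's cache works: `createTtlCache`, the injected clock, and the lifetimes (one hour and ten seconds) are all hypothetical.

```javascript
// Hypothetical TTL cache with an injectable clock (in seconds).
function createTtlCache(now) {
  const entries = new Map();
  return {
    async get(key, ttlSeconds, load) {
      const hit = entries.get(key);
      // Serve the stored value while it is younger than its lifetime.
      if (hit && now() - hit.storedAt < ttlSeconds) return hit.value;
      const value = await load(); // miss or expired: go back to the source
      entries.set(key, {value, storedAt: now()});
      return value;
    },
  };
}

async function demo() {
  let clock = 0;
  let stock = 5;
  let titleLoads = 0;
  const cache = createTtlCache(() => clock);
  const loadTitle = async () => {
    titleLoads += 1;
    return 'Snowboard';
  };
  const loadStock = async () => stock;
  const LONG = 3600; // illustrative lifetime for rarely changing data
  const SHORT = 10; // illustrative lifetime for frequently changing data

  await cache.get('title', LONG, loadTitle);
  const first = await cache.get('stock', SHORT, loadStock);
  stock = 2; // the source changes...
  clock = 5; // ...but the short-lived entry has not expired yet
  const withinTtl = await cache.get('stock', SHORT, loadStock);
  clock = 11; // now the short-lived entry is expired and gets reloaded
  const afterTtl = await cache.get('stock', SHORT, loadStock);
  await cache.get('title', LONG, loadTitle); // still served from cache
  return {first, withinTtl, afterTtl, titleLoads};
}
```

In this sketch the stock value is stale for at most ten seconds, while the title is loaded from the source only once.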

We’ve divided our initial query into three parts, each with its own caching strategy. However, this has led to a 'request waterfall', where each request is made one after the other. Unless one request's data is needed to make another, it's better to execute requests simultaneously within your data loaders. Let’s do that with Promise.all():

Code Example


export async function loader({ context }) {
  const [{ product }, { availability }, { cart }] = await Promise.all([
    context.storefront.query(`#graphql
      /** Basic product information **/
      /** Product Recommendations **/
     `,
      { cache: context.storefront.CacheLong() },
    ),
    context.storefront.query(`#graphql
      /** Product price and availability **/
     `,
      { cache: context.storefront.CacheShort() },
    ),
    context.storefront.query(`#graphql
      /** Cart **/
     `,
      { cache: context.storefront.CacheNone() },
    ),
  ]);

  return json({ product, cart, availability });
}
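The difference between a waterfall and parallel requests can be checked without a real storefront. The sketch below assumes a hypothetical `createFakeClient` whose `query` simulates a short network delay and records how many requests are in flight at once.

```javascript
// Hypothetical stand-in for a network client; not the Hydrogen storefront client.
function createFakeClient() {
  let inFlight = 0;
  let maxInFlight = 0;
  return {
    async query(name) {
      inFlight += 1;
      maxInFlight = Math.max(maxInFlight, inFlight);
      await new Promise((resolve) => setTimeout(resolve, 10)); // simulated latency
      inFlight -= 1;
      return {name};
    },
    get maxInFlight() {
      return maxInFlight;
    },
  };
}

// Each await finishes before the next request starts.
async function waterfall(client) {
  const product = await client.query('product');
  const availability = await client.query('availability');
  const cart = await client.query('cart');
  return [product, availability, cart];
}

// All three requests start immediately; results keep the input order.
async function parallel(client) {
  return Promise.all([
    client.query('product'),
    client.query('availability'),
    client.query('cart'),
  ]);
}
```

With this fake client, `waterfall` never has more than one request in flight, while `parallel` has all three in flight together, so the total wait is roughly that of the slowest request rather than the sum of all of them.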

With effective caching and parallel data loading, our page should load much quicker. But can we optimize it further? Let’s distinguish between primary, essential data and secondary data for the page. What’s absolutely necessary to render a product detail page, and what isn’t? This can vary depending on the app. In our case, product recommendations aren't essential. If they fail or are slow to load, we still want a fully functional product page. So, let's “defer” loading the product recommendations and the cart:

Code Example


import {defer} from '@shopify/remix-oxygen';

export async function loader({ context }) {
  // Notice there is no await
  const cart = context.storefront.query(
    `#graphql
      /** Cart **/
     `,
    { cache: context.storefront.CacheNone() },
  );
  // Notice there is no await
  const recommendations = context.storefront.query(
    `#graphql
      /** Product Recommendations **/
     `,
    { cache: context.storefront.CacheLong() },
  );

  const [{ product }, { availability }] = await Promise.all([
    context.storefront.query(
      `#graphql
      /** Basic product information **/
     `,
      { cache: context.storefront.CacheLong() },
    ),
    context.storefront.query(
      `#graphql
      /** Product price and availability **/
     `,
      { cache: context.storefront.CacheShort() },
    ),
  ]);

  return defer({ product, cart, availability, recommendations });
}

There are a few things that we changed. Instead of returning `json` from the loader, we are returning `defer`. Defer allows you to serialize a promise from the loader to the browser. So instead of awaiting both the cart and recommendation promises, we pass each directly to defer. We can then access each promise within our component in the browser:

Code Example


import {Suspense} from 'react';
import {useLoaderData, Await} from '@remix-run/react';

/** ... **/

export default function () {
  const {product, cart, availability, recommendations} = useLoaderData();

  return <section>
    <h1>{product.title}</h1>
    <p>{product.description}</p>
    {/* The fallback renders until the deferred promise settles */}
    <Suspense fallback={<LoadingRecommendations />}>
      <Await resolve={recommendations}>
        {(resolvedRecommendations) => <Recommendations recommendations={resolvedRecommendations} />}
      </Await>
    </Suspense>
  </section>;
}

While the promise is pending, the Suspense boundary’s fallback is displayed. Once the promise resolves, the actual component is rendered. It may be tempting to defer everything on a page, but this is usually a bad decision. Deferring data improves the time to first byte, but the page that first reaches the browser then contains a pending user interface. That interface is not actionable, and in the worst cases it causes jank when the deferred content eventually loads. It’s better to show the user immediately actionable content. This is why primary data should be awaited within the loader, and only secondary data should be deferred.
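The ordering that results from awaiting primary data and deferring secondary data can be sketched in plain JavaScript, without Remix. Everything here is hypothetical: `loaderSketch` mimics the shape of a loader, `delay` simulates network latency, and the "rendering" is just an entry in an event log.

```javascript
// Resolves with `value` after `ms` milliseconds (simulated latency).
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Hypothetical loader shape: primary data is awaited, secondary data
// is handed back as a still-pending promise.
async function loaderSketch(events) {
  // Notice there is no await: the slow request runs in the background.
  const recommendations = delay(30, ['Board wax']).then((value) => {
    events.push('recommendations resolved');
    return value;
  });
  const product = await delay(5, {title: 'Snowboard'});
  events.push('loader returned');
  return {product, recommendations};
}

async function demo() {
  const events = [];
  const data = await loaderSketch(events);
  // Primary data is usable right away, before recommendations exist.
  events.push(`page shell rendered: ${data.product.title}`);
  await data.recommendations;
  events.push('recommendations rendered');
  return events;
}
```

In this sketch the page shell is available as soon as the fast primary request completes, and the slow secondary request only delays the part of the page that depends on it.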

When developing a page, it's crucial to classify the types of data involved in rendering. First, identify data that changes often and data that rarely changes. Both can benefit from caching, but frequently changing data should be revalidated more often. Second, distinguish between primary and secondary data. Primary data is essential for the page to function, while secondary data, though useful, isn't critical. For instance, product details are primary, while recommendations are secondary. Efficiently managing these data types, such as deferring secondary data and prioritizing primary data, can significantly enhance page performance.
