Performance testing environment focused on WP core repo #45

Closed
PaulHHM opened this issue Dec 7, 2021 · 16 comments
Labels
Needs Discussion: Anything that needs a discussion/agreement
[Type] Epic: A high-level project / epic that will encompass several sub-issues

Comments

@PaulHHM

PaulHHM commented Dec 7, 2021

This issue is to discuss creating a standard VM for performance testing of WordPress.

The idea is to create a reproducible VM (using Docker, Vagrant, or similar) for testing WordPress with all tooling included, plus 'base WordPress installs' that could encompass a generic blogging setup (possibly using https://wordpress.org/plugins/fakerpress/), a generic ecommerce setup (hundreds of products, categories, etc.), and a generic business full-site editor setup (Divi/Elementor/etc.).

For a local VM, a few options exist that could be 'the standard':

https://hub.docker.com/_/wordpress
https://bitnami.com/stack/wordpress/installer
https://marketplace.cloud.vmware.com/services/details/turnkey-wordpress-appliance/?slug=true
http://vccw.cc
https://varyingvagrantvagrants.org

Each reference installation would also need a common list of installed plugins, as well as a common database (pages, posts, media). Thoughts?

@mehigh

mehigh commented Dec 14, 2021

Including some plug-ins from the most popular ones will give a snapshot that's closer to the reality
https://wordpress.org/plugins/browse/popular/

  1. Contact Form 7
  2. Yoast SEO
  3. Akismet
  4. WooCommerce
  5. Jetpack
  6. MonsterInsights

Maybe more? I recall reading at some point that the average site has upwards of 20 plug-ins active, but I might be wrong.

A level of consideration for performance plug-ins... like LiteSpeed Cache (would checking that we don't step on any toes with the popular performance plug-ins be of interest?)
Thinking that excluding from the exploration the page builders / classic editors might be best as it's an initiative with the goal of going into Core.

@davidegreenwald

davidegreenwald commented Dec 14, 2021

👋 10up's WP Local Docker is an open-source solution for building repeatable, customizable local dev environments: https://github.com/10up/wp-local-docker-v2

Now compatible with M1 chips as well.

@dustinrue

What is the overall goal of the performance testing? When would these performance tests be run? Who would run these tests?

Are you looking to spot performance regressions in core itself? If that is the case, then I would suggest not installing any plugins, because once you install other plugins you can't tell whether a performance regression was introduced by WP, by the plugin, or by a combination of the two.

I guess I think the goals of the performance testing need to be defined further, so that a person can better narrow down the best solution.

@felixarntz
Member

@boogah pointed out here that it is now unclear whether Docker will continue to be used for the official WordPress development environment, so it's worth taking note of that for this issue as well: https://make.wordpress.org/core/2021/11/24/wordpress-development-environment/

That article doesn't clarify or define any new requirements though; at this point it's more at the proposal stage. While we should keep an eye out for this, we should focus this issue on figuring out what would make a WordPress performance testing environment, not whether Docker is the right tool or not for WordPress overall. So unless there is a technical reason to use a tool other than Docker, I'd suggest we stick with that for now, to align with WordPress core, Gutenberg etc. The performance plugin itself uses Docker for its development environment too.

@felixarntz
Member

What @dustinrue is asking above is precisely what we need to answer here first IMO:

What is the overall goal of the performance testing? When would these performance tests be run? Who would run these tests?

My 2 cents on these questions are:

  • What is the goal? I would start simple with a single goal: Create a performance testing environment for WordPress core, and only WordPress core. While we should definitely look at plugins and themes too and allow for them to be tested, that makes things much more complicated and IMO would be better to do as an iteration. I also like the idea of some "realistic" WordPress setup, but that also goes a step further, comes with more complexity, and could be explored as a follow-up.
  • When would the tests be run? We could aim to create a custom package or GitHub action using this environment, which would allow comparing a before/after state. We could then add a new GitHub workflow to https://github.com/WordPress/wordpress-develop, where we would run this action on every new commit and pull request. Not necessarily as a build check that could fail, but at least for informational purposes.
    • As a starting point, we could fork https://github.com/WordPress/wordpress-develop and set it up there, to evaluate how reliable it is - one of the biggest challenges will be to make the performance tests themselves consistent so that the results don't vary too greatly between different runs of the same test.
  • Who would run these tests? Per my point above, my suggestion would be a GitHub workflow in WordPress core, for starters. Eventually, as we expand the setup to allow for plugins and themes to use it (as a follow-up issue), it could then be used by any plugin and theme developers to measure performance of their extension.
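
To make the before/after idea a bit more concrete, here is a minimal sketch of how two sets of timings could be compared while also flagging runs that vary too much to trust. Everything in it is an assumption for illustration: the function names, the use of the median, and the 10% noise threshold are hypothetical and not an agreed approach.

```python
from statistics import median


def relative_spread(samples):
    """Return (max - min) / median as a rough run-to-run noise measure."""
    mid = median(samples)
    return (max(samples) - min(samples)) / mid


def compare_runs(before, after, noise_threshold=0.10):
    """Compare two lists of timings (e.g. milliseconds per request).

    Returns both medians, the relative change of the median, and a flag
    saying whether either sample set is too noisy to trust.
    The 10% default threshold is an arbitrary placeholder.
    """
    before_mid = median(before)
    after_mid = median(after)
    change = (after_mid - before_mid) / before_mid
    noisy = (
        relative_spread(before) > noise_threshold
        or relative_spread(after) > noise_threshold
    )
    return {
        "before_median": before_mid,
        "after_median": after_mid,
        "relative_change": change,
        "too_noisy": noisy,
    }
```

For example, `compare_runs([100, 102, 101], [90, 91, 92])` reports a median drop from 101 to 91 (roughly 10%) with `too_noisy` set to `False`, while a single outlier such as `[100, 150, 101]` would set the flag and signal that the comparison should not be relied on.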

Curious to hear your thoughts.

@josephscott

We could aim to create a custom package or GitHub action using this environment, which would allow comparing a before/after state.

How consistent are the environments for GitHub actions? In order to have high confidence in the results of a before/after comparison we should control for as many variables as possible. From the test agent end that usually isn't too bad: use the same browser, network limits, device type, etc. For the server hosting WordPress we are looking more at CPU, memory, storage, and database.

@josephscott

In regards to the details, I agree that we should start with the basics: default WordPress, default theme (2022). As that gets fleshed out and any testing adjustments that come up get made, we could then consider what additional variations should be tested.

@lolautruche

Hello!
As a follow-up to #44, I repeat here the offer from Platform.sh 😉.
The project has already been created, with @josephscott as an admin. We may add additional users of course!
We usually suggest using Composer flavors of WP for several reasons, but we might use a vanilla instance as well (to be decided).

P.sh offers different templates for WP:

We can use one of them or start from scratch.
One of the advantages of using a template is that everything is also pre-configured for local development using Lando.

@felixarntz
Member

@josephscott

How consistent are the environments for GitHub actions?

I think it would depend more on where exactly the environment would be hosted, and on which service we use for measurement and how. What I meant was that a GitHub action would merely trigger the latest code to be deployed into that setup/environment (whatever we define here), and then a performance test would be run against it. So the result of the test itself would be independent of GitHub actions.

@josephscott

Thank you, I understand better what you mean now.

@PaulHHM
Author

PaulHHM commented Dec 16, 2021

Including some plug-ins from the most popular ones will give a snapshot that's closer to the reality https://wordpress.org/plugins/browse/popular/

Yes, I feel that including a specific set of plugins that would replicate three different scenarios would be best. One for a generic e-commerce site, one for a generic blog site, and one for a generic company site.

A level of consideration for performance plug-ins... like LiteSpeed Cache

I think that including any performance plugins would make it harder to report on the performance impact (positive or negative) of any specific change.

Thinking that excluding from the exploration the page builders / classic editors might be best as it's an initiative with the goal of going into Core.

I don't think excluding plugins that are in use by a vast percentage of end users would be realistic. Remember, though, that this issue is to explore creating a reproducible environment for someone to test whether a specific change is beneficial or not in a 'real world' scenario, not whether a plugin has a positive/negative effect on WordPress. In fact, the installed plugins are really irrelevant and they can/should change often (e.g. quarterly) so as to not seem partial to any one group/company/project.

By way of example, a simple "generic company website" could consist of the following:

WordPress
(random theme)
(random pagebuilder)
(random form builder)
(random social media sharing plugin)
5-7 pages (home, about, services, news, blog page, contact page) with generic/lorem ipsum text and placeholder images
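
One way to reconcile "random" picks with a reproducible environment would be to seed the selection with a rotation label, so everyone testing in the same period gets the same setup. The sketch below is only an illustration under that assumption: the candidate names are placeholders (not recommendations), and `build_scenario` and the `"2022-Q1"` style label are hypothetical.

```python
import random

# Hypothetical candidate pools; the names are placeholders, not real picks.
CANDIDATES = {
    "theme": ["theme-a", "theme-b", "theme-c"],
    "page_builder": ["builder-a", "builder-b"],
    "form_builder": ["forms-a", "forms-b", "forms-c"],
    "social_sharing": ["share-a", "share-b"],
}

# Page list taken from the example above.
PAGES = ["home", "about", "services", "news", "blog", "contact"]


def build_scenario(rotation_label, candidates=CANDIDATES):
    """Pick one candidate per slot, deterministically for a given label."""
    # Seeding with the label makes the "random" choice repeatable.
    rng = random.Random(rotation_label)
    # Sort the slots so the result does not depend on dict ordering.
    picks = {slot: rng.choice(options) for slot, options in sorted(candidates.items())}
    return {"rotation": rotation_label, "picks": picks, "pages": list(PAGES)}
```

Calling `build_scenario("2022-Q1")` twice returns the same picks, while changing the label to the next quarter reshuffles them, which matches the idea of rotating plugins without favoring any one project.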

Kinda like this random site that I pulled from a google search - https://tlctotallawncare.com / https://www.whatruns.com/website/tlctotallawncare.com

I'm probably not the best person to ask about a 'generic blog website' or 'generic ecommerce website' but I envision both of those having different requirements for plugins.

What is the overall goal of the performance testing?

My thought is that it would allow people - in a "real world" scenario - to test whether a change has a positive or negative impact on the running of WordPress. It should be easy enough that computer users such as ourselves can test/benchmark the website before the change and after the change using a predefined WordPress database/setup.

When would the tests be run?

I'd like the tests to be able to be run on the developer's system while they are programming and/or by the casual user who wants to test out if a change in code impacts the running of a WordPress installation. Enabling more people to 'press a button' or 'follow these steps' and get a self-contained installation of WordPress running locally to test a specific change could only help WordPress and this performance group.

The benefit of running locally is (I feel) a more controlled environment with regards to "same browser, network limits, device type, CPU, memory, storage, database".

@felixarntz
Member

@PaulHHM

My thought is that it would allow people - in a "real world" scenario - to test whether a change has a positive or negative impact on the running of WordPress. It should be easy enough that computer users such as ourselves can test/benchmark the website before the change and after the change using a predefined WordPress database/setup.

I am not clear what specific goal such "real world" scenario test environments would address. I agree that they could be potential options for some benchmarking, e.g. we could say something like "over the last x months, performance on the average blogger setup has improved by x%". But those "real world" scenarios would hardly be a reliable indicator for measuring the impact of a particular change.

In order to measure the effects of a change in core, we need to rely on "just core". The same applies to plugins and themes: to measure them, we need to rely on the particular plugin/theme. Sometimes a plugin or theme change may rely on a new core API, so in those cases we would need that particular plugin or theme and core together.

For real-world testing, I think we need actual real-world sites ("field data") to assess how the changes affect WordPress sites overall.

I'd like the tests to be able to be run on the developer's system while they are programming and/or by the casual user who wants to test out if a change in code impacts the running of a WordPress installation.

Agreed. Eventually the environment and measurement tool we define here should be usable by core developers, plugin developers, and theme developers. I think it's worth having it both as a tool you can run/trigger locally, as well as a tool that can run within a GitHub action so that this could be automated and "force-run" on every change.

Per my suggestion in #45 (comment), I don't think we can do it all at once though. I think we need to first build an environment that allows testing core, which is the simplest. That will allow us to really focus on the measurement tooling and nail that. Afterwards we should expand the environment to also allow testing individual plugins and themes.

@felixarntz moved this from Triage to To do in [Focus] Measurement on Dec 22, 2021
@felixarntz
Member

For reference, we now have #63 to discuss specifically which metrics we want to measure and #64 to discuss how (which tooling approach) we want to measure them. Let's keep this issue focused on the exact scope of where we're going to dedicate our initial efforts; the what and the how should be discussed in those two issues, so that this one doesn't become too confusing.

Which metrics we care about and how we would measure them applies to whichever approach we take for our first measurement project, so I suggest we focus on #63 and #64 first.

@eclarke1 added the Needs Discussion label and removed the [Type] Discussion label on Jan 17, 2022
@dainemawer
Contributor

Something to take into account here while talking about environments... as much as we need a desktop environment (that's a given), we should be more focused on how these environments can be configured to better reflect mobile. As a general rule of thumb, performance metrics will always be better on desktop than on mobile.

This is because desktops in general are just far more powerful in terms of resources (RAM, CPU, etc.), so what we really want to ensure here is that whatever approach we take can easily be spoofed / tested for realistic mobile capabilities. If we get performance right on mobile, it will naturally complement desktop.

@eclarke1 added the [Type] Epic label on Feb 15, 2022
@bethanylang changed the title Standard WP performance testing environment Apr 26, 2022
@joemcgill
Member

Since this issue was initially created, we now have some documented ways to set up performance testing environments for different scenarios in our Team Handbook: https://make.wordpress.org/performance/handbook/measuring-performance/

Unfortunately, I don't think a one-size-fits-all solution for setting up environments to mimic real-world setups (including plugins, etc.) is achievable; that is best left to people to set up based on their particular use case.

In the future, WordPress Playground could be used for this type of easy setup, but for now I'm going to close this as no longer applicable.
