A web site I manage has frequently been overloaded by thousands of identical requests from about a thousand Automattic IPs (192.0.64.0–192.0.127.255; Automattic owns WordPress.com) for a feed requested via a WordPress widget on a site hosted at WordPress.com. The load can reach several thousand identical requests per hour. Example request:
"GET /news/feed/ HTTP/1.1" 304 - "https://www.xyz.org/news/feed/" "WordPress.com; https://abc1.com"
Most of the reply codes are 304 (Not Modified), so the requests don't actually cause the feed to be generated anew, but even so the load is often noticeable (and seems totally unnecessary).
There is another WordPress site with a widget for a feed, and it too makes a lot of requests, but from only one IP and not as many (which may of course simply reflect the popularity of the two sites as well as the hosting set-up). Example:
"GET /news/category/scotland/feed/ HTTP/1.1" 304 - "-" "WordPress/6.8.2; https://abc2.org"
Curiously, the WordPress.com requests send the feed's own URL as the referer. WordPress.com explained that this is something the SimplePie feed parser, which they use, does, although the WordPress/6.8.2 agent doesn't do so.
There are a couple of other Automattic/WordPress fetchers that occasionally appear:
"GET /news/feed/ HTTP/1.1" 200 54932 "https://www.xyz.org/news/feed/" "Automattic Feed Fetcher 1.0"
"GET /news/feed/rss2/ HTTP/1.1" 301 - "https://www.xyz.org/news/feed/rss2/" "Automattic Feed Fetcher 1.0" (redirected to /news/feed/)
"GET /news/feed/ HTTP/1.1" 200 54932 "https://www.xyz.org/news/feed/" "wp.com feedbot/1.0 (+https://wp.com)"
Another fetcher, Inoreader, sends the domain root rather than the feed URL as the referer:
"GET /news/feed/ HTTP/1.1" 304 - "https://www.xyz.org/" "Inoreader/1.0 (+http://www.inoreader.com/feed-fetcher; x subscribers; )"
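To see how the requests are spread across these agents, the access log can be tallied by user agent and status code. The following is only a sketch in Python: it assumes the log uses Apache's combined format (the examples above show the tail of such lines), and the names LOG_TAIL, tally and tally_file are made up for illustration.

```python
import re
from collections import Counter

# Tail of an Apache "combined" log line:
# "request" status size "referer" "user-agent"
LOG_TAIL = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tally(lines):
    """Count requests per (user agent, status code); skip lines that don't match."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if match:
            counts[(match["agent"], match["status"])] += 1
    return counts


def tally_file(path):
    """Tally every line of the log file at `path` (a hypothetical location)."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return tally(log)
```

For example, tally_file("access.log").most_common(10) would list the ten busiest agent/status pairs; the path is a placeholder for wherever the server's access log actually lives.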
Back to the original issue: after much discussion with WordPress.com, it’s clear that they don’t see any problem with their 1,000 IPs repeatedly making exactly the same request. So the only answer was to increase caching, which does seem to have relieved the load.
First, I added this to the Apache config file:
<Location "/news/feed/">
<IfModule mod_headers.c>
# Remove the ETag and reset Vary to Accept-Encoding only
Header unset ETag
Header unset Vary
Header append Vary Accept-Encoding
# Allow clients and shared caches to reuse the feed for one day
Header set Cache-Control "max-age=86400, public"
</IfModule>
FileETag None
</Location>
<Location "/news/category/scotland/feed/">
<IfModule mod_headers.c>
# Same settings for the category feed
Header unset ETag
Header unset Vary
Header append Vary Accept-Encoding
Header set Cache-Control "max-age=86400, public"
</IfModule>
FileETag None
</Location>
(86,400 seconds is one day.) That probably overlaps with the following, but I have them both:
<IfModule mod_headers.c>
Header unset ETag
Header unset Vary
Header append Vary Accept-Encoding
<FilesMatch "\.(rss|txt|xml)$">
Header set Cache-Control "max-age=86400, public"
</FilesMatch>
</IfModule>
The following was already in the config file, but I increased the expiry times to match the above:
<IfModule mod_mime.c>
AddType application/rss .rss
AddType application/rss+xml .rss
</IfModule>
<IfModule mod_expires.c>
ExpiresByType application/rss A86400
ExpiresByType application/rss+xml A86400
</IfModule>
Finally, I increased the expiry time for the WP Super Cache plug-in to 86,400 seconds as well.
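One way to confirm that the new headers are actually being served is to request the feed and look at the response. The following is a rough Python sketch rather than a finished tool: max_age_seconds and fetch_cache_headers are hypothetical helper names, and the User-Agent string is an arbitrary placeholder.

```python
import re
import urllib.request


def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None if absent."""
    match = re.search(
        r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control or "", re.IGNORECASE
    )
    return int(match.group(1)) if match else None


def fetch_cache_headers(url):
    """Send a HEAD request and return the caching-related response headers."""
    request = urllib.request.Request(
        url,
        method="HEAD",
        headers={"User-Agent": "cache-check/0.1"},  # placeholder agent string
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        names = ("Cache-Control", "Expires", "ETag", "Vary", "Last-Modified")
        return {name: response.headers.get(name) for name in names}
```

Calling fetch_cache_headers("https://www.xyz.org/news/feed/") and passing its Cache-Control value to max_age_seconds should give 86400 if the settings above are in effect, assuming nothing in front of Apache rewrites the header.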