How to understand the Google Safe Browsing Diagnostic report for malicious or hacked websites
When the Google web crawler visits a site and gets attacked by malware, Google flags the site as suspicious with a "This
site may harm your computer" warning in search results. The search result links no longer go to the site, but go instead to an
explanatory page about the warning. The Firefox browser, which looks up sites in the Google Safe Browsing database, displays a "Reported
Attack Site!" warning, with a link to the explanation. By either of these routes, you can end up at a Google Safe
Browsing Diagnostic report.
Another way to view the Safe Browsing Diagnostic, for any site, is to enter this URL in your browser address bar. Replace
EXAMPLE.COM with the name of the site:
http://www.google.com/safebrowsing/diagnostic?site=EXAMPLE.COM
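If you need to check several sites or sections of a site, building the address by hand gets tedious. The following is a minimal sketch that assembles the URL shown above. The function name `diagnostic_url` is made up for illustration, and the base address is simply the one quoted in this article, so it may have changed since this was written.

```python
from urllib.parse import urlencode

# Base address as quoted in the article; treat it as an assumption that it still works.
DIAGNOSTIC_BASE = "http://www.google.com/safebrowsing/diagnostic"


def diagnostic_url(site: str) -> str:
    """Return the diagnostic report URL for a host or host/path (hypothetical helper)."""
    site = site.strip()
    # Drop a leading scheme so "http://example.com/forum" and "example.com/forum" behave alike.
    for scheme in ("http://", "https://"):
        if site.lower().startswith(scheme):
            site = site[len(scheme):]
            break
    # Keep "/" unescaped so a path such as example.com/forum stays readable.
    return DIAGNOSTIC_BASE + "?" + urlencode({"site": site}, safe="/")


if __name__ == "__main__":
    for name in ("example.com", "blog.example.com", "example.com/forum"):
        print(diagnostic_url(name))
```

Passing a subdomain or a directory path lets you request the report for only part of a site, which matters when the whole site is not flagged.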
The report is short and lacks explanation, but it contains useful information for webmasters who are trying to
clean up legitimate sites that hackers have turned dangerous.
Below are explanations of each section of the Safe Browsing Diagnostic report. This page was previously part of a
long article about how to diagnose the reason a site is flagged and how to get the Google warning removed.
What is the current listing status for _____?
Site is listed as suspicious - visiting this web site may harm your computer.
This tells you whether your site is listed right now as suspicious. If it is, it means that Google has determined
that at least one of your pages, by one method or another, under at least some circumstances, is causing visitors to get
attacked by malware. There will be warnings (as described above) in Google search results and in the Firefox and Chrome
browsers. Internet Explorer does not use the Google Safe Browsing database (it uses a Microsoft database), so IE might not give
any warning message. That does not mean the site is clean. If the Google Safe Browsing diagnostic says that visiting a site can
cause malicious content to be downloaded to your computer without your permission,
you can be almost 100% sure that Google's assessment is correct.
If this report disagrees with what you see in search results (for example, you know that your site currently is flagged
in search results, but the diagnostic says it is not listed as suspicious), it's possible your site has more than one
diagnostic report and you need to find the other(s). There are at least two situations where you can have more than one diagnostic
report:
- Only part of your site, not the whole site, is flagged. In the search results, click the link to one of the pages that
is flagged, to get the diagnostic report for that part of the site, such as example.com, example.com/forum, or
blog.example.com.
- In the past, it was possible to have separate diagnostic reports (which sometimes did not agree with each other) for
example.com and www.example.com. Google seems to have resolved that problem in most cases, but check this possibility
anyway if the diagnostic does not seem to be accurate for your situation.
If the diagnostic report says your site is not listed as suspicious, but you still get warnings in Firefox, it
is due to a delay in Firefox updating from the Google database, and is normally resolved within a day or so.
Part of this site was listed for suspicious activity 9 time(s) over the past 90 days.
This tells you the recent history. In the example above, the site has been flagged and unflagged 9 separate times, which
is a lot. Its webmaster has probably been removing malicious code over and over again but not fixing the site's security
vulnerabilities, so the site keeps getting hacked repeatedly.
What happened when Google visited this site?
Of the 110 pages we tested on the site over the past 90 days, 11 page(s) resulted in malicious software being downloaded and
installed without user consent.
This is mostly self-explanatory. It gives you an indication of how widespread the infection is in the pages of your site. You can
get a partial listing of the pages Google considers suspicious in Webmaster Tools at
Google Webmaster Central.
The last time Google visited this site was on 2009-11-20, and the last time
suspicious content was found on this site was on 2009-11-20.
When the first and second dates are the same, it means that the most recent review found malware. The site is still
infected.
The last time Google visited this site was on 2009-11-20, and the last time
suspicious content was found on this site was on 2009-11-18.
This means that the most recent review did not find malware. If your site is still shown as "suspicious" even though the
last scan did not find malware, the status should change to "not suspicious" within approximately 1 day, unless the site
has been flagged many times recently. In that case, there might be a delay of several days
while Google waits to see if the site stays clean. Another reason for a delay is if you deleted the infected pages instead
of cleaning them. Google wants to see cleaned pages. It does not want you to delete pages, get the flag removed, and then
put infected pages back online.
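The date comparison described above can be written down as a small check. This is only a sketch of the reasoning in this section: the function name is hypothetical, and it assumes you have copied the two dates from the report by hand in the YYYY-MM-DD form the report uses.

```python
from datetime import date
from typing import Optional


def latest_scan_found_malware(last_visit: str, last_suspicious: Optional[str]) -> bool:
    """Return True if the most recent Google visit is the one that found suspicious content.

    Both arguments are dates copied from the report, e.g. "2009-11-20".
    Pass None for last_suspicious if the report says content was never found.
    """
    if last_suspicious is None:
        return False
    # Same date (or later) means the latest review still saw malware.
    return date.fromisoformat(last_suspicious) >= date.fromisoformat(last_visit)


if __name__ == "__main__":
    print(latest_scan_found_malware("2009-11-20", "2009-11-20"))  # dates match
    print(latest_scan_found_malware("2009-11-20", "2009-11-18"))  # last scan was clean
```

A result of False only tells you that the latest scan was clean. As noted above, the flag itself can still take a day or more to clear.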
Malicious software includes 1 scripting exploit(s), 1 trojan(s). Successful infection resulted in an average of 1 new
process(es) on the target machine.
This itemizes the kinds of malware that attacked the Google crawler when it visited your pages.
Malicious software is hosted on 2 domain(s), including gumblar.cn/, beladen.net/.
When your pages cause malware to be loaded into a visitor's browser, it means just that: they cause it to
happen. It does not necessarily mean the actual virus code is in your page. It probably isn't, and it probably is not even
in some other file on your website. Usually, the virus code is stored at some other site. But if your page contains an
iframe that fetches its content from that other site, it will cause the malicious code to be loaded into the
visitor's browser.
This line in the diagnostic is the list of sites where the malware is actually hosted (stored). The visitor's browser is
fetching the virus code from there. If you are hunting for malicious iframes in your website files, these domain names are
ones you should be hunting for. Unfortunately, they might be encoded in a way that makes them hard to find with a text
search, and it is also possible that other domains, rather than these, are referenced in your iframes. The reason is that
sometimes there is a chain or sequence of events, involving other intermediary websites, that eventually, but not
immediately, causes malware from the above sites to be loaded. I will discuss intermediaries in the next section.
This list of hosting domains can be very helpful. In the first of the examples above, the reference to gumblar.cn
means that it is certain your site was hacked as the result of a virus infection on the PC of one of your website
administrators, which stole the FTP password. In the second example, the reference to beladen.net means that it is
not just your website that is compromised; the entire server is infected, and so are all the websites on it. A web search
on the domain names you find in this list can help discover what type of infection your website has and also can indicate
what type of security vulnerability it has that allowed it to be infected. Unfortunately, it doesn't often lead to such
definitive conclusions as it does for gumblar or beladen.
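Because the domain names may be encoded, a plain text search can miss them. The sketch below searches text for each domain in plain form and in a few simple encodings that injected scripts commonly use (percent-encoding, JavaScript \x escapes, HTML numeric entities, and comma-separated character codes). The function names are made up for illustration, and the list of encodings is an assumption, not a complete catalog. Heavily obfuscated code, or code that splits the name into pieces, will not be found this way.

```python
import os
from typing import Dict, Iterator, List, Tuple


def encoded_variants(domain: str) -> Dict[str, str]:
    """Return the domain in plain form and a few simple encoded forms."""
    codes = [ord(ch) for ch in domain]
    return {
        "plain": domain,
        "percent-encoded": "".join("%%%02x" % c for c in codes),  # %67%75...
        "js-hex-escape": "".join("\\x%02x" % c for c in codes),   # \x67\x75...
        "html-entity": "".join("&#%d;" % c for c in codes),       # &#103;&#117;...
        "char-codes": ",".join(str(c) for c in codes),            # 103,117,...
    }


def scan_text(text: str, domains: List[str]) -> List[Tuple[str, str]]:
    """Return (domain, variant name) pairs for every variant found in text."""
    haystack = text.lower()  # case-insensitive: hex digits may be upper case
    hits = []
    for domain in domains:
        for name, needle in encoded_variants(domain.lower()).items():
            if needle in haystack:
                hits.append((domain, name))
    return hits


def scan_tree(root: str, domains: List[str]) -> Iterator[Tuple[str, str, str]]:
    """Walk a local copy of the site and yield (path, domain, variant name) hits."""
    for folder, _dirs, files in os.walk(root):
        for filename in files:
            path = os.path.join(folder, filename)
            try:
                # latin-1 maps every byte to a character, so decoding never fails.
                with open(path, "r", encoding="latin-1") as handle:
                    text = handle.read()
            except OSError:
                continue
            for domain, variant in scan_text(text, domains):
                yield path, domain, variant


if __name__ == "__main__":
    sample = '<script>document.write(unescape("%67%75%6d%62%6c%61%72%2e%63%6e"))</script>'
    print(scan_text(sample, ["gumblar.cn", "beladen.net"]))
```

To check a downloaded copy of your site, you would call `scan_tree` with the folder path and the domain names from your own report. A clean result does not prove the files are clean, for the reasons given above.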
3 domain(s) appear to be functioning as intermediaries for distributing malware to visitors of this site,
including...
As mentioned above, when your site is loaded in a browser, elements in your page such as iframes can trigger a chain of
events that bring malicious content to the visitor's browser. That chain could involve several hops, through several
different websites, before the malicious code gets delivered.
For example, let's say your page contains an iframe that loads a page from site A. That page consists of JavaScript code
that fetches and executes a VBScript from site B, which in turn fetches a Trojan downloader from site C. The downloader is
the payload: the first part of the actual malicious software, which might consist of several parts carrying several
different types of attacks.
In this scenario, your site will certainly be flagged, because it initiates the sequence that causes the malicious content
to be loaded into the visitor's browser.
Sites A and B are intermediaries in the chain, and site C is the host of the code that carries out the
attack.
When you are searching your code for hidden iframes, search for domains listed in this section of the report as intermediaries, in addition to the ones listed in the previous section as hosts.
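Another way to approach the search is to list every outside domain that a page pulls content from and compare that list with the domains in the report. The following is a rough sketch using Python's standard HTML parser. The class and function names are hypothetical, and it only sees iframes and scripts that are written directly in the HTML. It will not see an iframe that a script creates when the page runs.

```python
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urlparse


class SrcHostCollector(HTMLParser):
    """Collect the host names used in the src attribute of iframe and script tags."""

    def __init__(self) -> None:
        super().__init__()
        self.hosts: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("iframe", "script"):
            return
        for name, value in attrs:
            if name == "src" and value:
                host = urlparse(value).hostname  # None for relative URLs
                if host:
                    self.hosts.append((tag, host))


def external_hosts(html: str, own_host: str) -> List[str]:
    """Return the sorted outside hosts that the page loads iframes or scripts from."""
    own = own_host.lower()
    collector = SrcHostCollector()
    collector.feed(html)
    return sorted(
        {
            host
            for _tag, host in collector.hosts
            if host != own and not host.endswith("." + own)
        }
    )


if __name__ == "__main__":
    page = (
        '<p>Welcome</p>'
        '<iframe src="http://site-a.example/x" width="1" height="1"></iframe>'
        '<script src="/local.js"></script>'
    )
    print(external_hosts(page, "example.com"))
```

Any host in the output that you do not recognize is worth a web search, whether or not it appears in the report.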
This site was hosted on 1 network(s) including...
This tells you the internet network where your site is hosted. You might recognize the name of your webhost here, or
the name of a larger network that your host is part of. This does not seem to be particularly useful information. Any
large network will have many compromised websites in it, and no large network will consist 100% of compromised websites.
Has this site acted as an intermediary resulting in further distribution of malware?
Over the past 90 days, _____ appeared to function as an intermediary for the infection of __ site(s) including _____, _____,
...
Is your site one of the intermediaries as described in the previous section? In addition to the general scenario presented
above, here are two more specific ones:
- Let's say you host a PHP script that other sites call to get dynamically generated content from you. Your site gets
hacked, and someone injects iframes into your PHP code that point to a third site that hosts malicious code. As long as your own site's
pages don't call your own PHP script, you're not causing malware to be loaded into a visitor's browser, and you're not the
host of the malware, either, but your PHP script is facilitating the distribution of malware by acting as a middle link. You're an intermediary.
- Let's say you are an advertising distributor. You accept ads submitted to you by companies who want to advertise, and you
place those ads on the sites in your publisher network. One of your advertisers submits a malicious ad. When the ad appears on
your publisher websites, you're an intermediary. This Safe Browsing report is an example of an advertiser listed
as an intermediary at the time of this writing. Note that although they are not flagged as suspicious, and their own pages
aren't flagged in search results with "This site may harm your computer", they can be causing their publisher
network sites to get flagged.
Has this site hosted malware?
This part of the report usually says No. As mentioned earlier, most sites, even compromised ones, do not actually
host (contain) the virus code. The hackers store the virus code at a central location. Then they hack many sites,
injecting iframe code that points to the central location. With this arrangement, they can change the virus code quickly
and easily. The changes get propagated throughout the internet without their having to re-hack thousands of sites to
update the code to the new and improved version.
If your report says Yes, your site is hosting malware, then you are one of the chosen few where they actually are storing
the virus code. When web surfers load pages from other sites, those pages contain iframes that point
to your site and fetch the virus code from your site. Obviously, you need to find where the virus code is being
stored in your website files.
If your report says No, this site has not hosted malware, that does not mean your site is clean. It only means your site is not a central location
where the virus code is being stored.
Notes
- In the scenario described earlier, where your site is flagged because its pages initiate the sequence of malware delivery and "sites A
and B are intermediaries, and C is the host", the intermediaries and hosts will not necessarily be flagged as suspicious.
This is counterintuitive, because intermediaries and hosts are a danger to the internet: they either act as conduits for
the flow of malware or store it for use in attacks against web surfers or other websites.
Flagging these sites as suspicious would reduce the danger. It would alert the webmasters (at least the
ones who are innocent victims) that their sites need to be cleaned and better secured. Without such a
warning, many webmasters of intermediary or host sites have no idea that they have a
problem.
The best sense I can make of this situation is that the Google search result warning is intended to help protect web surfers by
giving them information they can do something about: they can avoid going to a flagged site.
Intermediary and host sites usually play their part through "orphan" files hidden inside their sites. These are files
that have no ordinary hyperlinks pointing to them from anywhere on the internet. Because web surfers cannot reach these
files by following links, and because Google's intent is to protect web surfers who use
its search results (not necessarily to "make the internet safer"), Google does not bother to flag intermediaries and hosts. Once
a web surfer visits a site that initiates the delivery of malware, the chain through the intermediaries and hosts is automatic. There is nothing
a web surfer could do about it even with advance warning, so there is no point in
creating such a warning.
There is something a web surfer can do for protection, however: turn JavaScript off. In many cases, that will prevent
the browser from being redirected into the chain of intermediaries and hosts, and prevent the malware from being delivered.
Unusual situations observed
1) Firefox blocked access to a "Reported Attack Site", but the site was not flagged in Google search results. The Safe Browsing
Diagnostic report said suspicious content was "never found", yet it also said that the suspicious software was hosted on 3
domains, and gave their names. The reason: the website was hacked and did redirect visitors to the sites that Google knew were
malicious. However, the malicious sites had already been shut down, so they weren't serving any actual malware.