29.2. Zend_Service_Akismet

29.2.1. Introduction

Zend_Service_Akismet provides a client for the Akismet API. The Akismet service is used to determine if incoming data is potentially spam; it also exposes methods for submitting data as known spam or as false positives (ham). Originally intended to help categorize and identify spam for WordPress, it can be used for any type of data.

Akismet requires an API key for usage. You can get one by signing up for a WordPress.com account. You do not need to activate a blog; simply acquiring the account will provide you with the API key.

Akismet also requires every request to include a URL to the resource for which data is being filtered. Because of Akismet's origins in WordPress, this resource is called the blog URL. Pass this value as the second argument to the constructor; you may reset it at any time using the setBlogUrl() accessor, or override it for a single call by specifying a 'blog' key in the data passed to the various methods.

29.2.2. Verify an API key

Zend_Service_Akismet::verifyKey($key) is used to verify that an Akismet API key is valid. In most cases, you will not need to check, but if you need a sanity check, or to determine if a newly acquired key is active, you may do so with this method.

<?php
require_once 'Zend/Service/Akismet.php';

// Instantiate with the API key and a URL to the application or resource being
// used
$akismet = new Zend_Service_Akismet($apiKey, 'http://framework.zend.com/wiki/');
if ($akismet->verifyKey($apiKey)) {
    echo "Key is valid.\n";
} else {
    echo "Key is not valid\n";
}
?>

If called with no arguments, verifyKey() uses the API key provided to the constructor.

verifyKey() implements Akismet's verify-key REST method.

29.2.3. Check for spam

Zend_Service_Akismet::isSpam($data) is used to determine if the data provided is considered spam by Akismet. It accepts an associative array as the sole argument. That array requires the following keys be set:

  • user_ip, the IP address of the user submitting the data (not your IP address, but that of a user on your site).

  • user_agent, the reported User-Agent string (browser and version) of the user submitting the data.

The following keys are also recognized specifically by the API:

  • blog, the fully qualified URL to the resource or application. If not specified, the URL provided to the constructor will be used.

  • referrer, the content of the HTTP_REFERER header at the time of submission. (Note spelling; it does not follow the header name.)

  • permalink, the permalink location, if any, of the entry the data was submitted to.

  • comment_type, the type of data provided. Values explicitly defined in the API include 'comment', 'trackback', 'pingback', and an empty string (''), but it may be any value.

  • comment_author, name of the person submitting the data.

  • comment_author_email, email of the person submitting the data.

  • comment_author_url, URL or home page of the person submitting the data.

  • comment_content, actual data content submitted.

You may also submit any other environmental variables you feel might be a factor in determining if data is spam. Akismet suggests the contents of the entire $_SERVER array.
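The keys listed above can be collected from the request in one place. The following is a minimal sketch, not part of Zend_Service_Akismet: buildAkismetData() is a hypothetical helper, and it assumes the standard REMOTE_ADDR, HTTP_USER_AGENT, and HTTP_REFERER entries of a $_SERVER-style array are available.

```php
<?php
// Hypothetical helper (not part of Zend_Service_Akismet): assembles the array
// passed to isSpam() from a $_SERVER-style array and the submitted fields.
function buildAkismetData(array $server, array $fields)
{
    $data = array(
        // Required keys: the visitor's address and browser string.
        'user_ip'    => isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : '',
        'user_agent' => isset($server['HTTP_USER_AGENT']) ? $server['HTTP_USER_AGENT'] : '',
    );

    // Akismet's key is spelled "referrer"; the header is spelled HTTP_REFERER.
    if (isset($server['HTTP_REFERER'])) {
        $data['referrer'] = $server['HTTP_REFERER'];
    }

    // Copy only the optional keys the caller actually supplied.
    $optional = array(
        'blog', 'permalink', 'comment_type', 'comment_author',
        'comment_author_email', 'comment_author_url', 'comment_content',
    );
    foreach ($optional as $key) {
        if (isset($fields[$key])) {
            $data[$key] = $fields[$key];
        }
    }

    return $data;
}

// Sample values for illustration; in a real request, pass $_SERVER instead.
$data = buildAkismetData(
    array(
        'REMOTE_ADDR'     => '111.222.111.222',
        'HTTP_USER_AGENT' => 'Mozilla/5.0',
        'HTTP_REFERER'    => 'http://example.com/contact',
    ),
    array(
        'comment_type'    => 'contact',
        'comment_content' => "I'm not a spammer, honest!",
    )
);
```

The resulting $data array has the shape isSpam() expects. Whether to forward further $_SERVER entries is left to the caller.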

The isSpam() method returns either true or false, and throws an exception if the API key is invalid.

Example 29.1. isSpam() Usage

<?php
$data = array(
    'user_ip'              => '111.222.111.222',
    'user_agent'           => 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-GB; rv:1.8.1) Gecko/20061010 Firefox/2.0',
    'comment_type'         => 'contact',
    'comment_author'       => 'John Doe',
    'comment_author_email' => 'nospam@myhaus.net',
    'comment_content'      => "I'm not a spammer, honest!"
);
if ($akismet->isSpam($data)) {
    echo "Sorry, but we think you're a spammer.";
} else {
    echo "Welcome to our site!";
}
?>

isSpam() implements the comment-check Akismet API method.
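Because isSpam() throws an exception when the API key is invalid, callers may want to decide in advance what happens to a submission that could not be checked. The sketch below shows one possible approach. classifySubmission() and StubAkismet are hypothetical names introduced for illustration; the stub stands in for a Zend_Service_Akismet instance so the example runs without contacting the service.

```php
<?php
// Hypothetical wrapper: $akismet is assumed to be a Zend_Service_Akismet
// instance, or any object with a compatible isSpam() method.
function classifySubmission($akismet, array $data)
{
    try {
        return $akismet->isSpam($data) ? 'spam' : 'ok';
    } catch (Exception $e) {
        // Raised, for example, when the API key is invalid. Holding the
        // submission for manual review is a policy choice made here,
        // not library behavior.
        return 'review';
    }
}

// Stand-in used only to demonstrate the wrapper without calling Akismet.
class StubAkismet
{
    private $result;

    public function __construct($result)
    {
        $this->result = $result;
    }

    public function isSpam(array $data)
    {
        if ($this->result === 'invalid') {
            throw new Exception('Invalid API key');
        }
        return $this->result;
    }
}

$data = array('user_ip' => '111.222.111.222', 'user_agent' => 'Mozilla/5.0');

echo classifySubmission(new StubAkismet(true), $data), "\n";      // spam
echo classifySubmission(new StubAkismet(false), $data), "\n";     // ok
echo classifySubmission(new StubAkismet('invalid'), $data), "\n"; // review
```

In an application, the stub would be replaced by the $akismet object created earlier.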

29.2.4. Submitting known spam

Occasionally spam data will get through the filter. If in your review of incoming data you discover spam that you feel should have been caught, you can submit it to Akismet to help improve their filter.

Zend_Service_Akismet::submitSpam() takes the same data array as passed to isSpam(), but does not return a value. An exception will be raised if the API key used is invalid.

Example 29.2. submitSpam() Usage

<?php
$data = array(
    'user_ip'              => '111.222.111.222',
    'user_agent'           => 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-GB; rv:1.8.1) Gecko/20061010 Firefox/2.0',
    'comment_type'         => 'contact',
    'comment_author'       => 'John Doe',
    'comment_author_email' => 'nospam@myhaus.net',
    'comment_content'      => "I'm not a spammer, honest!"
);
$akismet->submitSpam($data);
?>

submitSpam() implements the submit-spam Akismet API method.

29.2.5. Submitting false positives (ham)

Occasionally data will be trapped erroneously as spam by Akismet. For this reason, you should probably keep a log of all data trapped as spam by Akismet and review it periodically. If you find such occurrences, you can submit the data to Akismet as "ham", or a false positive (ham is good, spam is not).
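One way to keep such a log is sketched below. The helpers logTrappedData() and readTrappedData() are hypothetical, and the file format (one JSON object per line) is an assumption made for illustration; Zend_Service_Akismet itself does not provide logging.

```php
<?php
// Hypothetical helpers for a review log; not part of Zend_Service_Akismet.

// Appends one trapped submission to the log as a single JSON line.
function logTrappedData($logFile, array $data)
{
    $json = json_encode($data);
    if ($json === false) {
        return false; // Data could not be encoded (e.g. invalid UTF-8).
    }
    // FILE_APPEND keeps earlier entries; LOCK_EX guards concurrent writers.
    return file_put_contents($logFile, $json . "\n", FILE_APPEND | LOCK_EX) !== false;
}

// Reads every logged submission back as an array of data arrays.
function readTrappedData($logFile)
{
    $entries = array();
    if (!is_readable($logFile)) {
        return $entries;
    }
    $lines = file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($lines as $line) {
        $entry = json_decode($line, true);
        if (is_array($entry)) {
            $entries[] = $entry;
        }
    }
    return $entries;
}

// A temporary file is used here so the example runs anywhere.
$logFile = tempnam(sys_get_temp_dir(), 'akismet');
logTrappedData($logFile, array(
    'user_ip'         => '111.222.111.222',
    'user_agent'      => 'Mozilla/5.0',
    'comment_content' => "I'm not a spammer, honest!",
));
$entries = readTrappedData($logFile);
```

Each entry a reviewer judges to be legitimate can then be passed unchanged to submitHam().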

Zend_Service_Akismet::submitHam() takes the same data array as passed to isSpam() or submitSpam(), and, like submitSpam(), does not return a value. An exception will be raised if the API key used is invalid.

Example 29.3. submitHam() Usage

<?php
$data = array(
    'user_ip'              => '111.222.111.222',
    'user_agent'           => 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-GB; rv:1.8.1) Gecko/20061010 Firefox/2.0',
    'comment_type'         => 'contact',
    'comment_author'       => 'John Doe',
    'comment_author_email' => 'nospam@myhaus.net',
    'comment_content'      => "I'm not a spammer, honest!"
);
$akismet->submitHam($data);
?>

submitHam() implements the submit-ham Akismet API method.

29.2.6. Zend-specific Accessor Methods

While the Akismet API only specifies four methods, Zend_Service_Akismet has several additional accessors that may be used for modifying internal properties.

  • getBlogUrl() and setBlogUrl() allow you to retrieve and modify the blog URL used in requests.

  • getApiKey() and setApiKey() allow you to retrieve and modify the API key used in requests.

  • getCharset() and setCharset() allow you to retrieve and modify the character set used to make the request.

  • getPort() and setPort() allow you to retrieve and modify the TCP port used to make the request.

  • getUserAgent() and setUserAgent() allow you to retrieve and modify the HTTP user agent used to make the request. Note: this is not the user_agent used in data submitted to the service, but rather the value provided in the HTTP User-Agent header when making a request to the service.

    The value used to set the user agent should be of the form some user agent/version | Akismet/version. The default is Zend Framework/0.7.0 | Akismet/1.11.
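    As an illustration of that format, the sketch below builds such a string. formatAkismetUserAgent() is a hypothetical helper, and the application name and version numbers are placeholder values.

```php
<?php
// Hypothetical helper: formats a value for setUserAgent() in the
// "some user agent/version | Akismet/version" form described above.
function formatAkismetUserAgent($application, $applicationVersion, $akismetVersion)
{
    return sprintf('%s/%s | Akismet/%s', $application, $applicationVersion, $akismetVersion);
}

$userAgent = formatAkismetUserAgent('My Application', '1.0', '1.11');
echo $userAgent, "\n"; // My Application/1.0 | Akismet/1.11

// With a Zend_Service_Akismet instance in $akismet:
// $akismet->setUserAgent($userAgent);
```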