Weapons for hire
Danny Lim / Singapore
If you feel that you are drowning in spam, do not despair. Lifelines, in
the form of technologies like fingerprinting techniques, Bayesian filters and
heuristics, can pull you out of the ditch.
"Each one is part of a puzzle that, when completely formed, makes up a complete
anti-spam solution that is very effective in blocking spam and preventing false
positives," reassured Michael Lok, CEO and president of Akili Networks,
the authorised distributor for Barracuda Networks in the Asia-Pacific.
Black and white lists
Black and white lists are simple to implement and can be an effective first
defence against spam.
White lists are set up to accept e-mail only from domain names and addresses
that are placed on a list. Black lists work on the opposite principle, barring
certain e-mail addresses and domains.
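As a rough illustration of the two principles, the sketch below checks a sender against both lists. The function name, the list contents and the "unknown" verdict are assumptions made for this example; a strict white list, as described above, would simply reject anything that is not accepted.

```python
def classify_sender(sender, white_list, black_list):
    """Return 'accept', 'reject' or 'unknown' for a sender address."""
    address = sender.strip().lower()
    domain = address.rpartition("@")[2]
    # White list entries are checked first, so a trusted address is never
    # blocked by a broader black list entry.
    if address in white_list or domain in white_list:
        return "accept"
    if address in black_list or domain in black_list:
        return "reject"
    return "unknown"


# Hypothetical list contents, for illustration only.
white = {"partner.example.com", "boss@example.org"}
black = {"spam.example.net"}

print(classify_sender("Alice@partner.example.com", white, black))
print(classify_sender("offers@spam.example.net", white, black))
```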
According to Lok, there are two types of black lists: black list IPs, in which
organisations manually keep a list of the IP addresses of known spammers so
that e-mail from those addresses is blocked; and real-time black hole lists
(RBLs), which check every incoming e-mail's IP address against a list of
IP addresses. If the IP address is on the list, the e-mail is identified as
spam and blocked.
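Real-time lists are commonly published over DNS: the sending IP address is reversed, appended to the list's zone name and looked up, and any answer means the address is listed. The sketch below follows that convention. The zone name rbl.example.org and the function names are placeholders for this example, not a real service or product API.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name used to ask a black hole list about an IPv4 address."""
    # IPv4Address raises ValueError for anything that is not a valid address.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    # The octets are reversed and followed by the list's zone name.
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the list answers for this address, False otherwise."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # The name did not resolve, so the address is treated as not listed.
        return False


print(dnsbl_query_name("192.0.2.99", "rbl.example.org"))
```

The resolver is passed in as a parameter so the check can be exercised without touching the network.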
Black and white lists, however, are not feasible for companies with large user
bases.
"It is just too tedious to compile lists of known spammers and filter them
out because those lists are growing quickly and getting larger by the day," said
Ang Ah Sin, Trend Micro's regional marketing manager for Asia South.
Fingerprinting
Fingerprinting techniques examine the characteristics, or fingerprint, of messages
previously identified as spam, and use this information to identify the same
or similar messages each time they are intercepted.
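One simple way to picture a fingerprint is as a hash of the message body after superficial differences have been stripped out. The sketch below takes that approach with SHA-256. It is an assumption made for illustration; commercial products are likely to use fuzzier signatures that tolerate larger changes.

```python
import hashlib
import re


def fingerprint(body):
    """Hash a message body after stripping case, digits and punctuation."""
    # Normalising first means trivial edits (spacing, case, a changed
    # tracking number) still produce the same fingerprint.
    letters_only = re.sub(r"[^a-z\s]", "", body.lower())
    normalised = " ".join(letters_only.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Fingerprints of messages previously identified as spam (sample data).
known_spam = {fingerprint("Buy cheap watches now!!! Offer 1234")}


def is_known_spam(body):
    return fingerprint(body) in known_spam


print(is_known_spam("buy CHEAP   watches now. Offer 9876"))
```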
"As they are continuously updated, these real-time fingerprint checks provide
a method of identifying spam with near zero false positives," said Lok.
"Companies, however, should realise that creating fingerprints of e-mail messages
can be a resource-intensive task, especially in high-volume enterprise environments.
They are also relatively ineffective against rare or unusual spam messages,"
said Mark Trudinger, SurfControl's vice president for Asia.
Lexical analysis
Lexical analysis works by examining the context for all of the words and phrases
in a message. The presence of a particular suspicious word or phrase by itself
does not necessarily mean that the message is spam. Instead, each word or phrase
is assigned a weight depending primarily on the context in which it is found.
"For example, 'Viagra' in the context of a discussion of its
generic name 'sildenafil citrate' would most likely be in a legitimate
e-mail, while its presence anywhere near the word 'free' is a good
indicator that the message is spam," said Trudinger.
Once the whole message is analysed, the weights for the found elements are combined,
and the resulting score is compared to a preset threshold. If the score is above
the threshold, the message is considered spam.
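The weighting and threshold steps can be sketched as follows. The rules, weights, window size and threshold are invented for this example and carry no authority; real products ship far larger, tuned rule sets.

```python
import re

# Hypothetical rules: (word, context word, weight). The weight is added
# when the word appears with the context word within WINDOW words of it.
RULES = [
    ("viagra", "free", 5.0),
    ("viagra", "sildenafil", -4.0),
    ("winner", "claim", 3.0),
]
WINDOW = 5
THRESHOLD = 4.0


def lexical_score(text):
    """Sum the weights of every rule whose word and context both appear."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    score = 0.0
    for word, context, weight in RULES:
        for i, candidate in enumerate(words):
            if candidate == word:
                nearby = words[max(0, i - WINDOW): i + WINDOW + 1]
                if context in nearby:
                    score += weight
    return score


def is_spam(text):
    # A message is considered spam only if its score is above the threshold.
    return lexical_score(text) > THRESHOLD


print(is_spam("Get FREE Viagra today"))
print(is_spam("Viagra, generic name sildenafil citrate, was approved"))
```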
Lexical analysis can also be applied to catch variations of words and phrases,
a trick that is becoming very popular among spammers. For instance, it is
possible to catch not only "Viagra", but also "V1agra" or
"Vayagra".
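One way to catch such variations is to undo common character substitutions and then match a pattern that tolerates alternative spellings. The substitution table and the pattern below are assumptions chosen to cover the three spellings mentioned here, not an exhaustive rule.

```python
import re

# Hypothetical table mapping look-alike characters back to letters.
LOOKALIKES = str.maketrans({
    "1": "i", "!": "i", "|": "i",
    "@": "a", "4": "a",
    "0": "o", "3": "e", "$": "s",
})

# Accepts "viagra" as well as phonetic spellings such as "vayagra".
VIAGRA_VARIANTS = re.compile(r"\bv(?:i|y|ay)agra\b")


def mentions_viagra(text):
    """Return True if the text contains a recognised spelling variant."""
    normalised = text.lower().translate(LOOKALIKES)
    return VIAGRA_VARIANTS.search(normalised) is not None


for sample in ("Viagra", "V1agra", "Vayagra", "vagrant"):
    print(sample, mentions_viagra(sample))
```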
The overall effectiveness and efficiency of lexical analysis algorithms in filtering
spam is highly dependent on the quality of the rules, and their assigned weights.
Bayesian filtering
Bayesian spam filtering breaks a message down into individual words, which
are then used to calculate the statistical probability that the message is
spam. To become highly effective, the Bayesian engine must be constantly
trained with both spam and legitimate e-mail.
"Before mail can be filtered using the Bayesian method, a database needs to be
created with words and tokens ($ signs, domains, IP addresses, etc), collected
from legitimate and spam e-mail," said Ang. "A probability value is then given
to them, which takes into account how often that word occurs in spam mail as
opposed to legitimate e-mail."
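The training database and the per-token probability can be sketched in a few lines. This is one common simplified variant, assuming equal prior odds for spam and legitimate mail and add-one smoothing; the class and method names are invented for the example and do not describe any vendor's engine.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Words plus a few non-word tokens such as "$" signs.
    return re.findall(r"[a-z0-9']+|\$", text.lower())


class BayesFilter:
    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.spam_messages = 0
        self.ham_messages = 0

    def train(self, text, is_spam):
        tokens = set(tokenize(text))  # count each token once per message
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_messages += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_messages += 1

    def token_spam_probability(self, token):
        # Add-one smoothing keeps unseen tokens away from 0 and 1.
        p_spam = (self.spam_tokens[token] + 1) / (self.spam_messages + 2)
        p_ham = (self.ham_tokens[token] + 1) / (self.ham_messages + 2)
        return p_spam / (p_spam + p_ham)

    def spam_probability(self, text):
        # Combine the token probabilities as log-odds to avoid underflow.
        log_odds = 0.0
        for token in set(tokenize(text)):
            p = self.token_spam_probability(token)
            log_odds += math.log(p) - math.log(1 - p)
        log_odds = max(-50.0, min(50.0, log_odds))  # keep exp() in range
        return 1 / (1 + math.exp(-log_odds))


spam_filter = BayesFilter()
spam_filter.train("free money now $ $", is_spam=True)
spam_filter.train("win free prize", is_spam=True)
spam_filter.train("meeting agenda for monday", is_spam=False)
spam_filter.train("lunch on monday?", is_spam=False)
print(round(spam_filter.spam_probability("free prize"), 3))
```

With only four training messages the numbers mean little; the point is that every newly classified message updates the counts, which is why the engine improves with continued training.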
Heuristics
"Heuristics has been successfully used to identify unknown viruses, and is now
widely used to fight spam," said Trudinger.
In the context of e-mail, heuristics is a method of applying successive tests
to a message to determine if it is likely to be spam.
Heuristics is not so much a unique method as a framework for combining various
tests, and assigning relative scores to their results. The sum of the resulting
scores of the tests performed on a message is compared to a preset threshold. If
this threshold is reached, no further tests need to be performed.
In addition to investigating the content of the e-mail, message attributes like
time and date, size, number of attachments, and MIME-types can also be tested.
In an enterprise environment, the best way to structure tests within a heuristics
framework is to perform the least resource-intensive and most accurate tests
first, and only run successive tests if the previous ones return inconclusive
or negative results.
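A heuristics framework of this kind can be sketched as an ordered list of scored tests with an early exit. The tests, scores, cost ranking and threshold below are invented for illustration, and the message is modelled as a plain dictionary for brevity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    cost: int      # relative resource cost, 1 = cheapest (assumed scale)
    score: float   # points added when the rule fires
    check: Callable[[dict], bool]


# Hypothetical rules covering message attributes and content.
RULES = [
    Rule("many_attachments", 1, 2.0, lambda m: m["attachments"] > 5),
    Rule("odd_hour", 1, 1.0, lambda m: m["hour"] < 5),
    Rule("shouting_subject", 2, 2.5, lambda m: m["subject"].isupper()),
    Rule("suspicious_phrase", 3, 3.0, lambda m: "act now" in m["body"].lower()),
]
THRESHOLD = 5.0


def evaluate(message, rules=RULES, threshold=THRESHOLD):
    """Run rules cheapest first; stop as soon as the threshold is reached."""
    total = 0.0
    ran = []
    for rule in sorted(rules, key=lambda r: r.cost):
        ran.append(rule.name)
        if rule.check(message):
            total += rule.score
        if total >= threshold:
            break  # verdict reached, so the costlier rules are skipped
    return total >= threshold, total, ran


suspect = {"attachments": 8, "hour": 3, "subject": "BUY NOW",
           "body": "Act now for a discount"}
print(evaluate(suspect))
```

In this run the content scan, the most expensive rule, is never executed because the cheaper attribute checks already reach the threshold.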
Honeypots
Honeypots, or decoy e-mail addresses, are used for collecting large amounts
of spam. These decoy e-mail addresses do not belong to actual end-users, but
are made public to attract spammers who will think the address is legitimate.
Once the spam is collected, identification techniques, such as hashing systems
or fingerprinting, are used to process the spam and create a database of known
spam.
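The collection step can be pictured as follows: anything delivered to a decoy address is hashed and added to the database, and later mail to real users is checked against it. The decoy addresses, the normalisation and the function names are assumptions made for this sketch.

```python
import hashlib

# Hypothetical decoy addresses that no real user owns.
DECOY_ADDRESSES = {"sales-archive@example.com", "j.doe1987@example.com"}
known_spam_hashes = set()


def body_hash(body):
    """Hash the body after normalising case and whitespace."""
    normalised = " ".join(body.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def receive(recipient, body):
    """Return True if the message should be treated as spam."""
    digest = body_hash(body)
    if recipient.lower() in DECOY_ADDRESSES:
        # Nobody legitimate writes to a decoy, so whatever arrives is
        # recorded in the database of known spam.
        known_spam_hashes.add(digest)
        return True
    return digest in known_spam_hashes
```

A message sent to a real user is let through until the same text has also been seen at a decoy address, after which later copies are flagged.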
Selecting the technologies
When it is time to part with its money, an organisation should look for a solution
that uses a multi-layered approach to fight spam.
"No single spam-filtering technology can work on its own. Solutions should
also be easily configurable with systems already in place at the organisation
and should not require a lot of client-side maintenance," said Lok.
The solution must also be easy to update with the latest spam
definitions because spammers are always on the lookout for new ways to evade
the filters and other anti-spam technologies, and it is nearly impossible for
IT staff to be on top of these changes all the time.
An anti-spam solution should therefore have processes already in place to keep
the solution current on new spam definitions and forms.
Deployment of the anti-spam technologies can be done in two ways.
According to Ang, large organisations should deploy these technologies at the
gateway and mail server level to filter out spam before it is able to make
its way to the desktops.
Small companies without in-house e-mail servers should install their anti-spam
solution (usually integrated into the anti-virus software) on individual computers.
Beyond the technologies
Implementation of the suitable technologies, however, is not the end of your
problems.
Anthony Lim, brand director, Security, Asia South, Computer Associates, said
that as human beings play a key role in spam, standards, policies, responsibilities
and compliance are needed.
"All the technology in the world cannot save you if they are not implemented,
maintained or run properly," he said.
Education, enforcement, as well as management involvement and support are also
very important. On a higher level, legislation is needed as well to protect
individuals and organisations.
This article first appeared in Asia Computer Weekly.