What is email filtering?
Email filtering is the process of analyzing inbound and/or outbound emails to determine whether they should be delivered, based on security rules and requirements.
It relies on a range of analysis techniques and technologies, and can be deployed in different configurations, from on-premise to cloud.
4.258 billion active email users in 2024
93% of data breaches are initiated through phishing emails
162 billion spam emails sent each day
46% of emails worldwide were identified as spam
How does an email filtering solution classify emails?
Email filtering relies on a combination of algorithms and rules to determine whether an email should be delivered to the recipient’s inbox, quarantined or blocked. Here are a few of the techniques used for filtering.
Header Analysis
Email headers contain interesting information that can be used to determine the nature of the email.
Sender information, such as the email address and sending domain, is analyzed and can be checked against black or grey lists of senders previously detected as potentially dangerous.
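The header check described above can be sketched with Python's standard email module. This is a minimal illustration, not how any particular product works: the BLOCKED_DOMAINS set, the function names and the sample message are all hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical blocklist; a real filter would rely on maintained reputation feeds.
BLOCKED_DOMAINS = {"malicious.example"}


def sender_domain(raw_email: str) -> str:
    """Return the lowercase domain of the From header, or '' if none is found."""
    msg = message_from_string(raw_email)
    _, address = parseaddr(msg.get("From", ""))
    if "@" not in address:
        return ""
    return address.rpartition("@")[2].lower()


def is_blocked_sender(raw_email: str) -> bool:
    """True when the sending domain appears on the blocklist."""
    return sender_domain(raw_email) in BLOCKED_DOMAINS


# Hypothetical sample message used for illustration only.
RAW = "From: Support <help@malicious.example>\nSubject: Reset\n\nBody"
```

Note that the From header is easy to forge, so a check like this is usually only one signal among several.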
Content Analysis
Keywords in the subject and body of the email can trigger filters.
For instance, recent QR code attacks often use a pretext related to multi-factor authentication activation. Keywords such as MFA or 2FA can trigger a deeper inspection of the email.
Links and URLs within the email are also analyzed using dedicated tools combining reputation and content analysis.
Attachments are also scanned with anti-malware software to determine their nature and, in specific cases such as forbidden file extensions, can be blocked altogether.
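As a rough sketch of two of the content checks above, the snippet below flags emails mentioning MFA or 2FA for deeper inspection and lists attachments whose extension is blocked. The keyword pattern, the extension list and the function names are assumptions chosen for illustration.

```python
import re
from pathlib import PurePath

# Assumed trigger keywords, taken from the MFA/2FA example in the text.
SUSPICIOUS_KEYWORDS = re.compile(r"\b(mfa|2fa)\b", re.IGNORECASE)

# Hypothetical list of extensions that are blocked outright.
BLOCKED_EXTENSIONS = {".exe", ".scr", ".js"}


def needs_deeper_inspection(subject: str, body: str) -> bool:
    """True when the subject or body contains one of the trigger keywords."""
    return bool(SUSPICIOUS_KEYWORDS.search(f"{subject}\n{body}"))


def blocked_attachments(filenames: list[str]) -> list[str]:
    """Return the attachment names whose final extension is on the blocklist."""
    return [
        name for name in filenames
        if PurePath(name).suffix.lower() in BLOCKED_EXTENSIONS
    ]
```

Only the last extension is checked, so a name like invoice.pdf.exe is still caught as an executable.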
Metadata Examination
Metadata, such as originating IP addresses or unusual sending times, can be used to block or quarantine suspicious emails.
For instance, an email sent outside of work hours or from an unusual location can be flagged as suspicious.
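The work-hours example can be sketched as follows. The 08:00 to 18:00 window and the weekend rule are assumptions, and the hour is read in the timezone given by the Date header itself, which a real filter would probably normalize.

```python
from email.utils import parsedate_to_datetime

# Assumed working hours (24-hour clock); adjust to the organization's schedule.
WORK_START, WORK_END = 8, 18


def sent_outside_work_hours(date_header: str) -> bool:
    """True when the Date header falls on a weekend or outside working hours."""
    sent = parsedate_to_datetime(date_header)
    is_weekend = sent.weekday() >= 5  # Monday is 0, so 5 and 6 are Saturday and Sunday
    return is_weekend or not (WORK_START <= sent.hour < WORK_END)
```

For example, sent_outside_work_hours("Tue, 05 Mar 2024 03:12:00 +0000") returns True, since 03:12 is before the assumed start of the working day.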
Machine Learning Filtering
Using machine learning, filters learn to recognize patterns and user behaviors; any email deviating from these established patterns is flagged as suspicious.
Bayesian Filtering
Using statistical analysis of the keywords in an email and comparing them to known spam, filters can classify incoming or outgoing emails as spam or legitimate.
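A minimal sketch of this idea is a naive Bayes classifier with add-one (Laplace) smoothing, shown below. The training messages are made up, the tokenizer is a plain whitespace split, and the code assumes the training set contains at least one spam and one legitimate message.

```python
import math
from collections import Counter


def train(messages):
    """Count words and messages per class from (text, is_spam) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    msg_counts = Counter()
    for text, is_spam in messages:
        word_counts[is_spam].update(text.lower().split())
        msg_counts[is_spam] += 1
    return word_counts, msg_counts


def spam_probability(text, word_counts, msg_counts):
    """Estimate P(spam | text) with naive Bayes and add-one smoothing."""
    vocab = set(word_counts[True]) | set(word_counts[False])
    total_msgs = sum(msg_counts.values())
    log_scores = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        score = math.log(msg_counts[label] / total_msgs)  # class prior
        for word in text.lower().split():
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalize in log space to avoid underflow on long messages.
    top = max(log_scores.values())
    spam, ham = (math.exp(log_scores[k] - top) for k in (True, False))
    return spam / (spam + ham)


# Made-up training data for illustration only.
TRAINING = [
    ("win free prize now", True),
    ("free money win", True),
    ("meeting agenda attached", False),
    ("lunch meeting tomorrow", False),
]
WORD_COUNTS, MSG_COUNTS = train(TRAINING)
```

With this toy data, spam_probability("free prize", WORD_COUNTS, MSG_COUNTS) comes out above 0.5 and spam_probability("meeting tomorrow", WORD_COUNTS, MSG_COUNTS) below it; a filter would then compare the value against a threshold of its choosing.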
Heuristic Filtering
Heuristic email filtering uses rule-based filtering. For instance, an email containing a specific string of characters like “!!!” can be marked as spam automatically.
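A rule-based check of this kind could look like the sketch below. The rule names and patterns are hypothetical; only the “!!!” rule comes from the example above.

```python
import re

# Hypothetical rule set: each rule name maps to a pattern that marks spam.
RULES = {
    "repeated_exclamation": re.compile(r"!{3,}"),
    "urgent_call_to_action": re.compile(r"\bact now\b", re.IGNORECASE),
}


def matched_rules(text: str) -> list[str]:
    """Return the names of all rules whose pattern appears in the text."""
    return [name for name, pattern in RULES.items() if pattern.search(text)]


def is_spam(text: str) -> bool:
    """Mark the email as spam as soon as any rule matches."""
    return bool(matched_rules(text))
```

Returning the matched rule names, rather than a bare yes or no, makes it easier to explain to an administrator why an email was marked.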
Collaborative Filtering
Some email filtering systems use reporting data from other users to identify suspicious emails. Large email providers benefit from a network effect to better protect their users.
If several users report an email as suspicious, it will be inspected and its properties will be used to identify and block it across other clients.
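One way to picture this is a tracker that fingerprints each reported email and flags it once enough distinct users have reported it. The threshold of three, the SHA-256 fingerprint of the normalized body and the class name are all assumptions made for this sketch.

```python
import hashlib
from collections import defaultdict

REPORT_THRESHOLD = 3  # assumed number of distinct reporters needed to flag an email


class ReportTracker:
    """Track which users reported which email, keyed by a content fingerprint."""

    def __init__(self, threshold: int = REPORT_THRESHOLD):
        self.threshold = threshold
        self._reporters = defaultdict(set)

    @staticmethod
    def fingerprint(body: str) -> str:
        """Hash the body after lowercasing and collapsing whitespace."""
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, user: str, body: str) -> None:
        """Record that a user reported this email (repeat reports count once)."""
        self._reporters[self.fingerprint(body)].add(user)

    def is_flagged(self, body: str) -> bool:
        """True once enough distinct users have reported the same content."""
        reporters = self._reporters.get(self.fingerprint(body), ())
        return len(reporters) >= self.threshold
```

An exact hash only matches near-identical copies; real systems would likely use fuzzier similarity measures so that small variations between campaign emails are still grouped together.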
Behavioral Analysis
Sender reputation and recipient engagement are two behavioral metrics that can be used to classify emails.
Sender reputation, based on sending patterns and previous feedback from recipients, helps categorize the nature of the email.
Recipient engagement, from reporting an email as suspicious to deleting it, helps filters understand the quality and nature of emails. Some email filtering solutions use this in their classification process.
Challenge-based filtering
Like CAPTCHAs on web pages, some filters require a challenge to be solved, either by software or by a human, in order to classify the email.
User-defined rules
On top of these techniques, filters can be combined with custom rules defined by the user to alter their behavior.
For instance, users can “whitelist” specific IP addresses or email headers so that phishing simulations bypass the filters and are delivered to their users.
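An IP-based rule of this kind can be sketched with Python's ipaddress module. The network below is a reserved documentation range standing in for a phishing simulation provider, and the names are hypothetical.

```python
import ipaddress

# Hypothetical allowlist; 203.0.113.0/24 is a range reserved for documentation.
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def bypasses_filter(sending_ip: str) -> bool:
    """True when the sending IP belongs to one of the allowlisted networks."""
    ip = ipaddress.ip_address(sending_ip)
    return any(ip in network for network in ALLOWED_NETWORKS)
```

Keeping such a list as narrow as possible matters, since any email sent from an allowlisted address skips the other checks.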
How does an email filter access emails?
Email filters, depending on their deployment, usually access emails in one of two ways.
They can act as an email gateway, usually by being listed directly in the domain’s MX records: they then receive incoming emails directly and choose to deliver, quarantine or block them depending on their diagnosis.
As emails won’t land in the user’s inbox before being filtered, this is often seen as the most secure deployment. It is, however, more complex to set up and can create delivery and security problems during rollout because of DNS changes and propagation.
The second way is to use mail flow rules or an API connection to the email provider so that any incoming email is inspected by the filter.
Because of how they are implemented, API connections can sometimes leave emails in the user’s inbox for a short time; if a bug occurs during that window, potentially dangerous emails could stay in the inbox and create a security risk.
However, deployment is much simpler and creates fewer deliverability problems.
How do users use email filters?
Regular users don’t interact much with email filters.
They can usually access a report button that lets them report suspicious emails for deeper analysis, and sometimes a “Junk” folder within their mailbox where the filter puts suspicious emails.
Administrators have broader access: they can configure rules and, depending on the filter’s capabilities, access a quarantine where they can manually review emails that the filter considers suspicious.
What are the types of email filtering appliances?
There are three main types of email filtering appliances: on-premise, cloud and hybrid.
On-premise appliances are sometimes required for regulatory purposes or to keep all email data internal to the company. They involve a dedicated piece of hardware that processes inbound and outbound emails locally to analyze and classify them.
This provides higher privacy, as all emails, which can contain sensitive information, are processed on the premises of the organization. On the downside, it requires maintenance and is harder to scale should the volume of emails and the number of email accounts to secure increase.
Cloud appliances are increasingly common and allow the company to rely on a cloud service to filter its emails. Emails are routed through the cloud appliance, which handles the filtering.
This provides a more scalable infrastructure with lower maintenance costs, but it means that potentially sensitive data passes through the cloud.
Finally, some organizations run hybrid appliances, either because of migration and legacy operations or because specific high-criticality or regulated accounts are handled by an on-premise appliance while the rest are filtered by a cloud appliance.
What are the limitations of email filtering?
Like all security systems, email filtering isn’t perfect.
The two main problems that email filtering can create are false positives and false negatives.
- False positives block legitimate, potentially important emails by detecting them as suspicious, which delays or discards important communication. A filter that is too sensitive can hinder productivity and email reliability.
- False negatives let dangerous emails be delivered. Attackers constantly innovate to bypass filters and can succeed in sending dangerous phishing emails or malware attachments to their victims.
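To make these two error types measurable, a filter's decisions can be compared with the real nature of each email. The sketch below computes the false-positive and false-negative rates from (flagged, malicious) pairs; the function name and the sample data are made up for illustration.

```python
def error_rates(results):
    """Return (false_positive_rate, false_negative_rate).

    results holds (flagged_by_filter, actually_malicious) boolean pairs.
    """
    results = list(results)
    fp = sum(1 for flagged, malicious in results if flagged and not malicious)
    tn = sum(1 for flagged, malicious in results if not flagged and not malicious)
    fn = sum(1 for flagged, malicious in results if not flagged and malicious)
    tp = sum(1 for flagged, malicious in results if flagged and malicious)
    # Share of legitimate emails that were wrongly flagged.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # Share of malicious emails that were wrongly delivered.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Made-up outcomes: two caught threats, one missed, one legitimate email blocked.
SAMPLE = [
    (True, True), (True, True), (False, True),
    (True, False), (False, False), (False, False),
]
```

Tuning a filter usually trades one rate against the other: making it stricter lowers the false-negative rate but tends to raise the false-positive rate.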