Forcepoint URL Database - NGFW and Dynamic Edge

Accurate, current and comprehensive security and web categories for NGFW

The Forcepoint URL Database contains the industry's most accurate, current and comprehensive classification of URLs. We use proprietary classification software and human inspection techniques to categorize and maintain definitions of more than 95 URL categories in more than 50 languages.

URL categories help us ensure real-time protection against today's targeted and advanced threats. We update them according to intelligence provided by Forcepoint ThreatSeeker Intelligence (formerly ThreatSeeker Intelligence Cloud), Forcepoint Security Labs researchers and customer feedback.

Since July 2019, the categories used in NGFW have been optimised to provide an improved user experience. The full list of categories is still used in the background, but NGFW presents the following categories to enterprise customers.

Features

Security Categories

Categories under the Security level are known to pose a security threat. The Security categories are:

  • Advanced Malware: Protects against outbound and inbound network transmissions for command and control, data exfiltration, payload execution and infection.
  • Botnets: Sites that host the command-and-control centers for networks of bots that have been installed onto users' computers. (Excludes web crawlers.)
  • Compromised Websites: Sites that are vulnerable and known to host injected malicious code or unwanted content.
  • Malicious Websites: Sites that are infected with a malicious link or iFrame.
  • Emerging Exploits: Sites found to be hosting known and potential exploit code.
  • Mobile Malware: Protects against malicious websites and applications designed to run on mobile devices.
  • Phishing and Other Frauds: Sites that counterfeit legitimate sites to elicit financial or other private information from users.
  • Spyware: Sites that download software that generates HTTP traffic (other than simple user identification and validation) without a user's knowledge.
  • Web and Email Spam: Sites whose links are sent in unsolicited commercial email, either as part of campaigns to promote products or services or to entice readers to click through to surveys or similar sites. Also includes sites that display comment spam.
  • Proxy Avoidance: Sites that provide information about how to bypass proxy server features or to gain access to URLs in any way that bypasses the proxy server.
  • Custom-Encrypted Uploads: Outbound network transmissions of documents, payloads, and data that have been encrypted using custom encryption methods.
  • Files Containing Passwords: Documents and data that include lists of network passwords such as Unix and Windows user passwords; also, documents that potentially contain lists of usernames and passwords.
  • Potentially Exploited Documents: Documents containing content with suspicious characteristics that could lead to the exploitation of a machine.

 

Reputation

Categories under the Reputation level have security implications. The Reputation categories are:

  • Dynamic DNS: Sites that mask their identity using Dynamic DNS services, often associated with advanced persistent threats (APTs).
  • Elevated Exposure: Sites that camouflage their true nature or that include elements suggesting latent malicious intent.
  • Potentially Unwanted Software: Sites that use technologies that alter the operation of a user's hardware, software, or network to decrease the owner's control, with the intent to gain fraudulent access and with potential malicious intent.
  • Newly Registered Websites: Sites whose domain name was registered recently.
  • Suspicious Content: Sites found to contain suspicious content.
  • Suspicious Embedded Link: Sites suspected of being infected with a malicious link.
  • Parked Domain: Sites that are expired, offered for sale, or known to display targeted links and advertisements.
  • Unauthorized Mobile Marketplaces: Protects against websites that may distribute applications unauthorized by the mobile OS manufacturer, the handheld device manufacturer or the network provider. (Traffic visiting websites in this category may indicate jail-broken or rooted phones.)
  • Uncategorized: Sites not categorized in the Master Database.
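As an illustration only, the Security and Reputation lists above can be represented as a simple lookup from category name to level. The sketch below is not part of any NGFW interface: the names `LEVELS` and `level_of` are hypothetical, and only the category names are taken from the lists above.

```python
from typing import Optional

# Category names copied from the Security and Reputation lists above.
# LEVELS and level_of are hypothetical names used only for this sketch.
LEVELS = {
    "Security": {
        "Advanced Malware", "Botnets", "Compromised Websites",
        "Malicious Websites", "Emerging Exploits", "Mobile Malware",
        "Phishing and Other Frauds", "Spyware", "Web and Email Spam",
        "Proxy Avoidance", "Custom-Encrypted Uploads",
        "Files Containing Passwords", "Potentially Exploited Documents",
    },
    "Reputation": {
        "Dynamic DNS", "Elevated Exposure", "Potentially Unwanted Software",
        "Newly Registered Websites", "Suspicious Content",
        "Suspicious Embedded Link", "Parked Domain",
        "Unauthorized Mobile Marketplaces", "Uncategorized",
    },
}


def level_of(category: str) -> Optional[str]:
    """Return the level a category belongs to, or None if it is in neither list."""
    for level, categories in LEVELS.items():
        if category in categories:
            return level
    return None
```

A lookup like this could be used, for example, to group log entries by level when reviewing which categories a policy should block.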

Legal Liability

Categories under the Legal Liability label contain content related to a potential age restriction or legal infringement. The Legal Liability categories are:

  • Adult Material: Sites that contain content that in some jurisdictions is considered suitable only for adults.
  • Adult Content: Sites that display full or partial nudity in a sexual context, but not sexual activity; erotica; sexual paraphernalia; sex-oriented businesses including clubs, nightclubs, escort services; and sites supporting the online purchase of such goods and services.
  • Lingerie and Swimsuit: Sites that offer images of models in suggestive but not lewd costume, with semi-nudity permitted. Includes classic 'cheesecake', calendar and pinup art and photography. Includes sites offering lingerie or swimwear for sale.
  • Nudity: Sites that offer depictions of nude or seminude human forms, singly or in groups, not overtly sexual in intent or effect.
  • Sex: Sites that depict or graphically describe sexual acts or activity, including exhibitionism; sites offering direct links to such sites. This category includes data from specialist agencies such as the Internet Watch Foundation (https://www.iwf.org.uk/member/forcepoint).
  • Sex Education: Sites that offer information about sex and sexuality, with no pornographic intent.
  • Illegal or Questionable: Sites that provide instruction in or promote nonviolent crime or unethical or dishonest behavior or the avoidance of prosecution.
  • Marijuana: Sites that provide information about or promote the cultivation, preparation or use of marijuana.
  • Abused Drugs: Sites that promote or provide information about the use of prohibited drugs, except marijuana, or the abuse or unsanctioned use of controlled or regulated drugs; also, paraphernalia associated with such use or abuse.
  • Tasteless: Sites with content that is gratuitously offensive or shocking, but not violent or frightening. Includes sites devoted in part or whole to scatology and similar topics or to improper language, humor or behavior.
  • Violence: Sites that feature or promote violence or bodily harm, including self-inflicted harm; or that gratuitously display images of death, gore or injury; or that feature images or descriptions that are grotesque or frightening and of no redeeming value.
  • Weapons: Sites that provide information about, promote, or support the sale of weapons and related items.
  • Militancy and Extremist: Sites that offer information about or promote or are sponsored by groups advocating anti-government beliefs or action. This category includes data from official government sources such as the UK Counter Terrorism Internet Referral Unit (CTIRU).
  • Gambling: Sites that provide information about or promote gambling or support online gambling, involving a risk of losing money.
  • Hacking: Sites that provide information about or promote illegal or questionable access to or use of computer or communication equipment, software or databases.
  • Alcohol and Tobacco: Sites that provide information about, promote or support the sale of alcoholic beverages or tobacco products or associated paraphernalia.
  • Intolerance: Sites that condone intolerance towards any individual or group.

Bandwidth

Categories under the Bandwidth label are known to consume bandwidth resources. The Bandwidth categories are:

  • Internet Radio and TV: Sites that provide online radio or television programming.
  • Peer-to-Peer File Sharing: Sites that provide client software to enable peer-to-peer file sharing and transfer.
  • Personal Network Storage and Backup: Sites that store personal files on web servers for backup or exchange.
  • Streaming Media: Sites that enable streaming of media content, including real-time monitoring and educational and entertainment-related streaming.
  • Media File Download
  • Application and Software Download: Sites that enable download of software applications or file download servers.

Baseline

Categories under the Baseline label are related to general web access traffic. The Baseline categories are:

  • Abortion: Sites addressing the issue of abortion, pro-life or pro-choice.
  • Advocacy Groups: Sites that promote change or reform in public policy, public opinion, social practice, economic activities and relationships.
  • Business and Economy: Sites sponsored by or devoted to business firms, business associations, industry groups, or business in general including job search and recruitment.
  • Education: Sites related to educational or cultural institutions, including educational and reference material.
  • Entertainment: Sites related to motion picture, television, books, humor, online games and restaurants.
  • Financial Data and Services: Sites offering news and quotations on stocks, bonds, and other investment vehicles, investment advice, and online trading. Includes banks, credit unions, credit cards, and insurance.
  • Government: Sites sponsored by branches, bureaus, or agencies of any level of government, including the armed forces and political parties.
  • Health: Sites that provide information or advice on personal health and medical services, nutrition and prescribed legal medications.
  • Hosted Business Applications: Sites that provide access to business-oriented web applications and allow storage of sensitive data, including virtual workspaces for the purpose of collaboration and conferencing.
  • Information Technology: Sites that provide information regarding computing and technology, translation of websites and computer security tools.
    • Generative AI - Conversation: Sites that specialize in machine-generated conversational content for the purpose of general information, user assistance or entertainment. Includes sites hosting virtual agents and narrow domain conversational applications using AI with ability to generate new content.
    • Generative AI - Multimedia: Sites that specialize in machine-generated multimedia content such as images, videos or audio. Includes sites that provide information, tools or services related to text-to-speech, video, music, sound or image editing applications using AI with ability to generate new content.
    • Generative AI - Text & Code: Sites that provide machine-generated text with broad domain applications (including code and translation) using AI and generating new content. Includes sites that provide tools or services that make suggestions, edits, review or create summaries based on user prompts and interactions.
    • Other AI ML Applications: Sites that provide tools or services related to artificial intelligence and machine learning. Includes sites hosting applications with personal productivity or business purposes using AI but not typically capable of generating new content.
  • Internet Communication: Sites that provide email services for general or corporate use, enable exchange of messages or web chat, and the ability to make phone calls via the internet. Also, sites that reward users for online activity such as viewing websites, advertisements or email.
  • News and Media: Sites that offer current news and opinion, including those sponsored by newspapers, general-circulation magazines or other media.
  • Office - Apps: Office function that enables a user to collaborate via various applications.
  • Office - Documents: Office function that enables a user to collaborate via document applications.
  • Office - Drive: Office function that enables a user to collaborate via virtual storage.
  • Office - Mail: Office function that enables a user to collaborate via email and messaging.
  • Religion: Sites that provide information about or promote religions and religious beliefs and practices.
  • Shopping: Sites that support the online purchase of consumer goods and services, including real estate, online auction sites and online marketing.
  • Social Networking: Sites of web communities that provide users with means for expression and interaction, including Facebook, Twitter, YouTube, LinkedIn and similar. Includes use of functions that enable users to interact with other users, post photos and videos and exchange messages.
  • Social Organizations: Sites sponsored by or that support or offer information about organizations devoted chiefly to socializing or common interests including philanthropy and professional advancement.
  • Society and Lifestyles: Message boards and sites that provide information about matters of daily life, including entertainment, hobbies, dating, GLBT community and sports hunting.
  • Sports: Sites that provide information about or promote sports, active games and recreation.
  • Travel: Sites that provide information about or promote travel-related services and destinations.
  • Web Infrastructure: Sites related to website architecture, web hosting services, web traffic analysis, dynamically generated URLs, URLs that do not resolve to an IP address, and private IP addresses as defined in RFC 1918.
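
For reference, RFC 1918 defines three private IPv4 ranges: 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. The following is a minimal sketch, using Python's standard `ipaddress` module, of how a host could be tested against those ranges. It only illustrates the RFC 1918 definition and is not a description of how the URL Database classifies addresses; the function name `is_rfc1918` is hypothetical.

```python
import ipaddress

# The three private IPv4 ranges defined in RFC 1918.
RFC1918_NETWORKS = (
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
)


def is_rfc1918(host: str) -> bool:
    """Return True if host is an IPv4 literal inside an RFC 1918 range."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal, for example a hostname
    if addr.version != 4:
        return False  # RFC 1918 covers IPv4 only
    return any(addr in net for net in RFC1918_NETWORKS)
```

The explicit network list is used rather than the module's `is_private` attribute because `is_private` also covers other reserved ranges, such as loopback and link-local addresses, that fall outside RFC 1918.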