What problem does Malware Discoverer address?
Example of one discovered malware campaign. The entry-level domains (leftmost) use fake news as bait to lure users into clicking.
How does Malware Discoverer work?
Malware Discoverer is powered by an unsupervised discovery system that is able to trace coordinated redirection campaigns. The algorithm includes three components:
- A crawler to collect redirection paths from a seed of domains
- A clustering step to identify suspicious domains that share common redirection paths
- A search expander to discover more domains co-hosted with suspicious domains
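The clustering component can be illustrated with a minimal sketch. It assumes each crawled redirection path is an ordered list of domains, and it groups entry domains that pass through the same downstream hop. The function name `cluster_by_shared_hops`, the `min_entries` threshold, and the sample domains are hypothetical; this is not necessarily how Malware Discoverer itself clusters domains.

```python
from collections import defaultdict


def cluster_by_shared_hops(paths, min_entries=2):
    """Group entry domains that redirect through the same downstream domain.

    `paths` is a list of redirection paths, each an ordered list of domains
    (assumed format). A downstream domain reached from at least `min_entries`
    distinct entry domains is treated as a sign of coordination.
    """
    entries_by_hop = defaultdict(set)
    for path in paths:
        if len(path) < 2:
            continue  # no redirection happened, nothing to cluster on
        entry = path[0]
        for hop in path[1:]:
            entries_by_hop[hop].add(entry)
    return {
        hop: sorted(entries)
        for hop, entries in entries_by_hop.items()
        if len(entries) >= min_entries
    }


# Hypothetical crawl output, for illustration only.
paths = [
    ["news-a.example", "hop.example", "landing.example"],
    ["news-b.example", "hop.example", "landing.example"],
    ["solo.example", "other.example"],
]
clusters = cluster_by_shared_hops(paths)
```

Here `news-a.example` and `news-b.example` end up in the same cluster because both redirect through `hop.example`, while `solo.example` is left out because nothing else shares its path.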
Malware Discoverer is a fully automated system. After the initial data crawl, it calls a Python program to load the data, calculate summary statistics, and generate redirection network graphs (see the image above for an example). The system automatically generates a daily threat intelligence report, which is published on this website and sent to subscribers via email.
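As a rough illustration of the post-crawl step, the sketch below derives weighted redirection edges and a few summary statistics from a list of redirection paths. The input format, the function names, and the chosen statistics are assumptions for illustration; the actual program may compute different figures.

```python
from collections import Counter


def redirection_edges(paths):
    """Count how often each (source, destination) redirect is observed."""
    edges = Counter()
    for path in paths:
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return edges


def summarize(paths):
    """Return simple summary statistics for a set of redirection paths."""
    edges = redirection_edges(paths)
    domains = {domain for path in paths for domain in path}
    total_hops = sum(len(path) - 1 for path in paths if path)
    return {
        "paths": len(paths),
        "domains": len(domains),
        "edges": len(edges),
        "mean_hops": total_hops / len(paths) if paths else 0.0,
    }


# Hypothetical crawl output, for illustration only.
paths = [
    ["a.example", "b.example", "c.example"],
    ["d.example", "b.example", "c.example"],
]
edges = redirection_edges(paths)
stats = summarize(paths)
```

The weighted edge list is the kind of input a graph library would need to draw the redirection network; the edge weight shows how many crawled paths used that redirect.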
How is Malware Discoverer seeded?
Currently, our system tracks active domains on five IPs every day. One IP belongs to a domain parking company; the other four belong to so-called “bullet-proof” hosting providers.
To learn more about how we identified those five IPs, see our report on malware campaigns that distribute suspicious Chrome extensions. If you want us to track other IPs, please contact us.
What do we analyze in the threat intelligence report?
Our reports focus on the coordinated redirection behavior of those malware campaigns. We break down domains and IPs into three categories: tier one covers entry-level domains/IPs, tier two covers intermediate redirection hops, and tier three covers final landing domains/IPs. For each tier, the report covers:
- Top 10 domains (registrar, name server, and organization from WHOIS record)
- Top 10 IPs (hostname, city, country, and organization from ipinfo)
- Visualization of redirection network, providing full context about the information flow
- Google Safe Browsing and VirusTotal results
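The tier breakdown can be sketched as follows, assuming a redirection path is an ordered list of domains where the first element is the entry domain (tier one), the last is the landing domain (tier three), and everything in between is an intermediate hop (tier two). The function names and sample data are hypothetical, and the WHOIS, ipinfo, Google Safe Browsing, and VirusTotal lookups are deliberately left out.

```python
from collections import Counter


def tier_counts(paths):
    """Count domain occurrences per tier across all redirection paths."""
    tiers = {1: Counter(), 2: Counter(), 3: Counter()}
    for path in paths:
        if len(path) < 2:
            continue  # a path with no redirect has no tier structure
        tiers[1][path[0]] += 1    # entry-level domain
        tiers[3][path[-1]] += 1   # final landing domain
        for hop in path[1:-1]:    # intermediate redirection hops
            tiers[2][hop] += 1
    return tiers


def top_domains(paths, n=10):
    """Return the n most frequent domains for each tier."""
    return {tier: counts.most_common(n) for tier, counts in tier_counts(paths).items()}


# Hypothetical crawl output, for illustration only.
paths = [
    ["news-a.example", "hop.example", "landing.example"],
    ["news-b.example", "hop.example", "landing.example"],
    ["news-c.example", "landing.example"],
]
top = top_domains(paths)
```

Note that under this assumption the same domain can appear in more than one tier if it plays different roles in different paths.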
Current data collections
How can this threat intelligence benefit me?
Our detection is not designed to be comprehensive. First, we do not track all IP addresses and domains. Second, even if we did, some malicious domains never redirect. Nevertheless, we believe that Malware Discoverer is a valuable threat intelligence tool: fewer than 1% of the domains we discovered are labelled as malicious by Google Safe Browsing. We hope that by sharing our method and data, we can receive more constructive feedback from the community and together make malware detection more efficient.
We encourage you to take a look at our reports and graphs. If you find them helpful, contact us and we will share the daily threat intelligence report with you.
Have other ideas? / Want to subscribe to the threat intelligence report? / Contact
Zhouhan Chen, NYU Center for Data Science, firstname.lastname@example.org, Personal Website