Channel News and Analysis - Channel Insider
Empowering the next generation Channel
 

Bull’s Eye Awards
Nominations Open for Channel Insider 2009 Bull’s Eye Awards
Nominations are now open for the Channel Insider 2009 Bull’s Eye Awards, which recognize excellence in customer service, technology prowess, business acumen, channel leadership, communications and community building, and innovation among vendors, solution providers, distributors and channel services companies.



Sponsored Links
  • Control VM Sprawl, What You Don’t Know Can Hurt You
  • FREE Sophos Encryption Tool: Encrypt, compress and share files easily
  • LSI 6Gb/s Portfolio Expands to Include SATA+SAS HBAs
  • Reduce the cost of managing your mobile workers.
  • Find out 7 Ways to Drive Data Center Efficiency
  • SonicWALL breaks through network and email gridlock
  • Save up to 40% on calling costs with Avaya Aura™



  •  

    Microsoft Research Automates Hunt for Search Engine Spam

    in Channel News and Analysis


    Article Rating:starstarstarstarstar / 0
    Article Views: 852

    Rate This Article:
    Add This Article To:
    Researchers at Microsoft are working on an ambitious new project to hunt down and neutralize large-scale search engine and blog comment spammers.

    Researchers at Microsoft are working on an ambitious new project to hunt down and neutralize large-scale search engine spammers.

    The Redmond, Wash., software giant's Cybersecurity and Systems Management Research Group has taken the wraps off Strider Search Defender, an experimental project that automates the discovery of search spammers through non-content analysis.

    The project integrates technology from two previous Microsoft Research prototypes—Strider HoneyMonkey and Strider URL Tracer—and promises a new approach to removing junk results from search engine queries.

    "The Web is so badly spammed, you can find a spam site on just about every search query," said Yi-Min Wang, the researcher heading up the project at Microsoft, in an interview with eWEEK. "We think this approach can pinpoint the big spammers and use their own tactic against them."

    According to data from Automattic Kismet, a tool that helps bloggers thwart comment spammers, a whopping 93 percent of all blog comments are spam. With Strider Search Defender, Wang's team is taking a context-based approach that uses URL-redirection analysis to pinpoint spammers.

    Resource Library:

    "For the spammers to be successful, they have to post millions of fake comments on message boards and blogs. That's the only way to get picked up by search engines. If we can find a way to pinpoint them before they get indexed by search engines, the problem is solved," Wang said.

    "They want to be found by search engines, that's why they're spamming. Well, now we're finding you," he added.

    The problem is tied to the use of spam blogs, or splogs, to earn money from pay-per-click advertising programs offered by Google, Yahoo and MSN. Content on fake blogs often contain text stolen from legitimate Web sites and include an unusually high number of links to sites associated with the splog creator. The sole purpose is to boost the search engine rank of the affiliated sites and cash in on ad impressions from unsuspecting surfers.

    Read more here about the Strider TypoPatrol and URL Tracer projects.

    During the early stages of the Microsoft research, Wang discovered that successful large-scale spammers create a huge number of "doorway pages" on reputable domains to trick search engine users into clicking on a fake site. It is well-known that Google's BlogSpot, Yahoo's GeoCities and AOL's Hometown services are all used by spammers to create doorway pages.

    The doorway pages are then spammed to millions of forums, blog comments and archived newsgroups, pushing the page up the search engine results for certain target keywords. A user clicking on a doorway-page link in search listings gets redirected to a target page controlled by the spammer or, in some cases, Wang explained, the browser is instructed to either redirect to or fetch ads listing operated by the spammer.

    Next Page: "Monkey program" analyzes traffic.

    The Microsoft Research team is now proposing to treat each spam page as a dynamic program rather than a static page and use a "monkey program" to analyze the traffic resulting from visiting each page with an actual browser. "By identifying those domains that serve target pages for a large number of doorway pages, we can catch major spammers' domains together with all their doorway pages and doorway domains," Wang explained.

    Read more here about Microsoft's Strider HoneyMonkey project.

    Strider Search Defender starts with a seed list of confirmed spam URLs and uses a homegrown tool called Spam Hunter to run link queries on search engines. This is an automated process that pinpoints the forums and guest books on which the known spam URLs were posted. On these pages, additional spam links are scrapped to automatically generate a list of spam URLs.

    To filter out false positives, Microsoft feeds the list of potential spam URLs to the Strider URL Tracer, a tool released earlier this year by Microsoft to help trademark owners find typo-squatting domains of their Web sites.

    Using the URL Tracer, Wang's team can launch an actual browser to visit each URL and record all secondary URLs visited as a result. At the end of that automated scan, the researchers can figure out which target-page domains are associated with a large number of doorway-page URLs.

    In one scenario, Wang said the Spam Hunter collected more than 17,000 BlogSpot URLs and fed them into the URL Tracer. The group was able to identify the top 25 target-page domains that are behind the Google-hosted splogs. The top six are particularly active, Wang said, identifying them as s-e-arch.com, speedsearcher.net, abcsearcher.com, eash.info, paysefeed.net and veryfastsearch.com, which collectively were responsible for approximately 45 percent of the BlogSpot URLs.

    Wang said the Strider Search Defender project has already helped to remove junk results from MSN Search. "The more widely spammed a URL is, the easier it is for the Spam Hunter to find it. Once a spammed forum is identified, it becomes a 'HoneyForum' that can be used to capture new spam URLs in new comment postings," he said. "Ideally, since there is a delay between spamming and its effect on search engine results, our spam hunter should be able to identify new spam URLs and notify the search engine before the URLs enter top search results."

    Check out eWEEK.com's for the latest security news, reviews and analysis. And for insights on security coverage around the Web, take a look at eWEEK.com Security Center Editor Larry Seltzer's Weblog.





    Discuss Microsoft Research Automates Hunt for Search Engine Spam
     
    >>> Be the FIRST to comment on this article!
     

     
     
    >>> More Channel News and Analysis Articles          >>> More By Ryan Naraine
     


     


    [ci] feeds
    XML
    Add Channel News, Product Reviews, Trends and Analysis to your RSS newsreader or My Yahoo!


    HTML PLAIN TEXT

    Keep on top of news for VARs and Resellers with CI's Weekly Newsletter and Alerts.

     


    CHANNEL RESOURCE CENTER
     
     
    Enterprise Mobility Zone
    The Enterprise Mobility Zone (EMZ) blog is a tool designed to help senior IT executives discuss, create and deploy next-generation mobile strategies in their organizations.
    Go beyond yesterday's tactical approach to mobility!
     
    Build A More Efficient Data Center
    Demands are growing but budgets are not. Solve your pressing IT issues using the resources you already have. Determine which technologies can help you drive efficiencies and how they are applied. Gain a quick ROI on new initiatives
    Find out how
    Let Enterprise TechBrief do the work for you. Aggregated content, tech news, product reviews, vendor updates, how-to’s—all you need to boost your efficiencies and cut costs, all from one place.
    enterprisetechbrief.com