Google-fu master J0hnny Long announced the Google Hack Honeypot (GHH) last week. The introduction states:
"GHH emulates a vulnerable web application by allowing itself to be indexed by search engines. It's hidden from casual page viewers, but is found through the use of a crawler or search engine. It does this through the use of a transparent link which isn't detected by casual browsing but is found when a search engine crawler indexes a site. The transparent link (when well crafted) will reduce false positives and avoid a fingerprint of the honeypot.
The honeypot connects to a configuration file, and the configuration file writes to a log file which is chosen during configuration. The log file contains information about the host, including IP address, referral information, and user agent.
Using the information gathered in the log file, an administrator can learn more about attackers doing reconnaissance against their site. An administrator can cross reference logs and view a better picture of specific attackers."
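To make the logging and cross-referencing steps in that description concrete, here is a minimal Python sketch. It is not GHH's code (GHH itself is written in PHP); the tab-separated record layout and the names `format_hit` and `group_by_ip` are assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timezone


def format_hit(ip, referrer, user_agent, when=None):
    """Build one tab-separated log line for a honeypot hit (hypothetical format)."""
    when = when or datetime.now(timezone.utc)
    fields = [when.isoformat(), ip, referrer or "-", user_agent or "-"]
    # Referrer and user agent are attacker-controlled, so strip the
    # characters that would otherwise break the one-line-per-hit layout.
    clean = [
        f.replace("\t", " ").replace("\r", " ").replace("\n", " ")
        for f in fields
    ]
    return "\t".join(clean)


def group_by_ip(lines):
    """Cross-reference log lines by source IP address."""
    hits = defaultdict(list)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 4:
            continue  # skip malformed lines rather than guess at their meaning
        when, ip, referrer, user_agent = parts
        hits[ip].append((when, referrer, user_agent))
    return dict(hits)
```

An administrator could append each `format_hit` line to the chosen log file and later pass the file's lines to `group_by_ip` to see every recorded hit from a given address.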
This is an ingenious idea. You run GHH to learn which Google hacking techniques potential intruders are using against your Web site. While the original Honeynet was designed to watch intruders scan and attack hosts, GHH watches for people who use search engines to find and attack vulnerable Web applications.
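One way to see which search an intruder used is to pull the query out of the logged referrer. The sketch below is my own illustration, not part of GHH. It assumes the referrer is a Google search URL carrying the search terms in a `q` parameter, which a browser or search engine is not guaranteed to send. The function name `search_query_from_referrer` is hypothetical, and only `google.com` hosts are matched.

```python
from urllib.parse import urlparse, parse_qs


def search_query_from_referrer(referrer):
    """Return the search terms from a Google referrer URL, or None."""
    if not referrer:
        return None
    parsed = urlparse(referrer)
    host = (parsed.hostname or "").lower()
    # Assumption: only google.com and its subdomains are of interest here;
    # country-specific domains would need their own check.
    if host != "google.com" and not host.endswith(".google.com"):
        return None
    terms = parse_qs(parsed.query).get("q")
    return terms[0] if terms else None
```

For example, a referrer of `http://www.google.com/search?q=intitle%3Aindex.of+passwd` would yield the query `intitle:index.of passwd`.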
Installation doesn't appear too complicated. Currently Apache and PHP are supported, with IIS and .NET "coming in a future release."
If anyone tries this out, please comment here.