A spider (also called a crawler or robot) is an automated program that visits your website by following links from other web pages or in response to a request submitted directly to a search engine. By following or “crawling” links throughout the Internet, the spider grabs content from the visited sites and adds it to the search engine’s index.
When visiting a website, the robot (like any browser) sends an identifying string to the server; the HTTP protocol provides the “User-Agent” header field for this purpose. This lets you recognize a spider’s visit in the server log file (e.g. the user-agent string of the Google spider “Googlebot” is: Googlebot/2.1).
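As an illustration, the following minimal sketch shows how spider visits could be picked out of a log file by their user-agent string. It assumes a web server access log in the common “combined” format, where the User-Agent is the last quoted field of each line; the file name access.log is only an example.

```python
import re

# In the "combined" log format the User-Agent is the last quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')

def find_spider_visits(path, marker="Googlebot"):
    """Yield (line_number, user_agent) for every request whose
    User-Agent string contains the given marker."""
    with open(path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            match = USER_AGENT_RE.search(line)
            if match and marker in match.group(1):
                yield number, match.group(1)

if __name__ == "__main__":
    # "access.log" is a hypothetical file name; adjust it to your server setup.
    for number, agent in find_spider_visits("access.log"):
        print(f"line {number}: {agent}")
```

Run against a real log, this lists every request whose user-agent contains “Googlebot”; the same approach works for any other spider whose user-agent string you know.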