The latest wobserver.com release makes it possible to follow links (currently up to 10) that originate from a page. The links are followed, and the retrieved pages are checked for changes or for detected terms. All changes are reported in one cumulative email.
That new feature enables you as a user of wobserver.com to do the following:
- Follow a page whose URL changes regularly (e.g. weekly offer pages at web shops)
- Follow several linked pages at once in order to cover pages further downstream.
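To picture what happens to the followed pages, here is a minimal sketch of change and term detection that collects everything into one report. It is not wobserver.com's actual implementation: the function names, the SHA-256 fingerprint and the report format are all assumptions made for illustration.

```python
import hashlib


def fingerprint(page_text):
    """Hash the page text so two versions can be compared cheaply."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def check_pages(pages, known, terms):
    """Build one cumulative report for all followed pages.

    pages: dict mapping url -> freshly fetched page text
    known: dict mapping url -> fingerprint from the previous run
           (hypothetical storage, updated in place)
    terms: list of terms to look for, matched case-insensitively
    """
    lines = []
    for url, text in pages.items():
        new_fp = fingerprint(text)
        # A page counts as changed only if it was seen before with a different hash.
        if url in known and known[url] != new_fp:
            lines.append(f"CHANGED: {url}")
        known[url] = new_fp
        for term in terms:
            if term.lower() in text.lower():
                lines.append(f"TERM '{term}' found: {url}")
    return "\n".join(lines)
```

The report string returned here stands in for the body of the single email; a page seen for the first time is only recorded, not reported as changed.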
So let me explain how to set this up in wobserver.com.
Once registered, go to Add Job. In Add Job, select the job type TERMDETECTSCANLINK or CHANGEDETECTSCANLINK, depending on whether you want to detect terms or check pages for changes. The following image shows what it should look like:
The two important attributes for this post are Scan-Link url and Scan-Link name. Scan-Link url selects the links on the web page whose URL contains the given pattern. Scan-Link name selects links by a pattern in the link name (the clickable text that is often underlined on a web page). Wobserver.com sends all links found by the scan (maximum: 10) to its crawler. The crawler fetches the pages and applies the next step, which is the check that belongs to the job type: change detection or term detection.
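The Scan-Link url step can be sketched roughly as follows. This is an illustration only: it assumes the pattern is matched as a plain substring of the link's href and that the first 10 hits in page order are kept, neither of which is confirmed here. The names `LinkCollector` and `scan_links` are hypothetical.

```python
from html.parser import HTMLParser

MAX_LINKS = 10  # the limit stated in this post


class LinkCollector(HTMLParser):
    """Collect (href, link name) pairs for every <a> element, in page order."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only text inside an open <a href=...> belongs to a link name.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scan_links(html, url_pattern):
    """Return at most MAX_LINKS hrefs that contain url_pattern (assumed substring match)."""
    collector = LinkCollector()
    collector.feed(html)
    hits = [href for href, _name in collector.links if url_pattern in href]
    return hits[:MAX_LINKS]
```

Calling `scan_links(page_html, "item")` would return the hrefs that contain "item", which are the URLs the crawler would then fetch.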
Here comes a more concrete example:
Suppose you want to watch for comment changes on Hacker News. A comment link on Hacker News looks like this: http://news.ycombinator.com/item?id=1234567. The id changes from article to article, but the word “item” remains the same. Therefore, in wobserver.com you would set up the job in the following way:
Since the comment links on Hacker News carry the label “comments”, the job could also be defined this way:
In this example, the attribute Scan-Link name is filled in, so wobserver.com scans the link names (labels) instead. If both attributes are given (url pattern and name pattern), the scan hits are combined with a logical “OR”: a link is followed if either pattern matches.
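The “OR” combination can be illustrated with a small sketch. It assumes the links have already been extracted as (href, link name) pairs, that both patterns are plain substrings, and that an empty pattern means "not set". The function `select_links` is hypothetical and not part of wobserver.com.

```python
def select_links(links, url_pattern="", name_pattern="", limit=10):
    """Pick links where the URL pattern OR the name pattern matches.

    links: list of (href, link name) pairs in page order
    An empty pattern is treated as "not set" and never matches (assumption).
    """
    selected = []
    for href, name in links:
        url_hit = bool(url_pattern) and url_pattern in href
        name_hit = bool(name_pattern) and name_pattern in name
        if url_hit or name_hit:
            selected.append(href)
    return selected[:limit]


# Made-up sample links, loosely modelled on a Hacker News front page.
links = [
    ("item?id=1", "12 comments"),
    ("user?id=pg", "pg"),
    ("newest", "comments"),
    ("item?id=2", "discuss"),
]
print(select_links(links, url_pattern="item"))
print(select_links(links, name_pattern="comments"))
print(select_links(links, url_pattern="item", name_pattern="comments"))
```

With both patterns set, the third call selects every link that the first or the second call selects, without duplicates.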
Selecting links on the page
The follow-link functionality scans the first 10 links on the page. But what if you don’t want the first 10 links to be scanned and would rather define the selection more precisely?
Then you should make use of the Define Area feature. As an example, let’s scan for changes on the pages linked in the footer of Hacker News.
First, fill in the necessary attributes and set Scan-Link url to ‘ycombinator’.
Then you press Define Area. Scroll to the bottom of Hacker News and select the footer:
Wobserver.com selects that particular area and scans for links in that area only. This is the way to determine where on the page link scanning is done.
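Restricting the scan to one area can be pictured like this. The sketch assumes the selected area corresponds to a single HTML element with a known `id` and that the markup is well formed (every non-void tag is closed). How wobserver.com actually stores a defined area is not described here, so `AreaLinkCollector`, `links_in_area` and the `area_id` parameter are purely illustrative.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class AreaLinkCollector(HTMLParser):
    """Collect hrefs only from links nested inside the element with id == area_id."""

    def __init__(self, area_id):
        super().__init__()
        self.area_id = area_id
        self.depth = 0  # greater than 0 while the parser is inside the area
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth > 0:
            if tag not in VOID_TAGS:
                self.depth += 1
        elif attrs.get("id") == self.area_id:
            self.depth = 1  # entering the selected area
        if self.depth > 0 and tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID_TAGS:
            self.depth -= 1


def links_in_area(html, area_id, url_pattern="", limit=10):
    """Return up to `limit` hrefs from the area, optionally filtered by url_pattern."""
    collector = AreaLinkCollector(area_id)
    collector.feed(html)
    hits = [h for h in collector.hrefs if not url_pattern or url_pattern in h]
    return hits[:limit]
```

Links outside the chosen element are ignored even when they match the pattern, which mirrors the idea of scanning only the selected footer.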
Have fun! Feedback is welcome (via Tumblr ask or the Wobserver.com contact form).