Holger's Java API
java.lang.Object
  com.antelmann.net.SampleCrawlerSetting
public class SampleCrawlerSetting
As its name suggests, SampleCrawlerSetting is a sample implementation of CrawlerSetting. JSpider currently uses it as its default CrawlerSetting.
See Also: JSpider, Serialized Form

| Field Summary | |
|---|---|
| boolean | currentSiteOnly |
| static String[] | defaultRestrictURLPattern |
| int | depth |
| boolean | includeHTMLCode |
| String[] | includeTextPattern |
| String[] | restrictURLPattern |

| Constructor Summary | |
|---|---|
| SampleCrawlerSetting() | Searches all files 3 levels deep in the current site only. |
| SampleCrawlerSetting(int depth, boolean currentSiteOnly, String[] restrictURLPattern, String[] includeTextPattern, boolean includeHTMLCode) | |
| SampleCrawlerSetting(int depth, String includeTextPattern) | |

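The three constructors above can be pictured with a small stand-in class. This is only a sketch: `SettingSketch` and `DEFAULT_RESTRICT_URL_PATTERN` are hypothetical names, not part of com.antelmann.net. The only documented behavior is that the no-argument constructor searches 3 levels deep in the current site only. The empty default patterns, the `false` default for `includeHTMLCode`, and the way the two-argument constructor wraps its single pattern in an array are all assumptions.

```java
import java.util.Arrays;

// Hypothetical stand-in mirroring the documented fields and constructors.
// It is not the com.antelmann.net source.
class SettingSketch {
    // Assumption: the real contents of defaultRestrictURLPattern are not documented.
    static final String[] DEFAULT_RESTRICT_URL_PATTERN = {};

    int depth;
    boolean currentSiteOnly;
    String[] restrictURLPattern;
    String[] includeTextPattern;
    boolean includeHTMLCode;

    // Documented: searches all files 3 levels deep in the current site only.
    // Assumption: "all files" means no text pattern and no HTML-code search.
    SettingSketch() {
        this(3, true, DEFAULT_RESTRICT_URL_PATTERN, new String[0], false);
    }

    // Assumption: the single pattern becomes a one-element array and the
    // remaining fields keep the defaults used by the no-argument constructor.
    SettingSketch(int depth, String includeTextPattern) {
        this(depth, true, DEFAULT_RESTRICT_URL_PATTERN,
             new String[] { includeTextPattern }, false);
    }

    // Full constructor: every documented field is set explicitly.
    SettingSketch(int depth, boolean currentSiteOnly, String[] restrictURLPattern,
                  String[] includeTextPattern, boolean includeHTMLCode) {
        this.depth = depth;
        this.currentSiteOnly = currentSiteOnly;
        this.restrictURLPattern = restrictURLPattern;
        this.includeTextPattern = includeTextPattern;
        this.includeHTMLCode = includeHTMLCode;
    }

    @Override
    public String toString() {
        return "SettingSketch[depth=" + depth
            + ", currentSiteOnly=" + currentSiteOnly
            + ", includeTextPattern=" + Arrays.toString(includeTextPattern) + "]";
    }
}

class SettingSketchDemo {
    public static void main(String[] args) {
        System.out.println(new SettingSketch());
        System.out.println(new SettingSketch(2, "crawler"));
    }
}
```

With the real class on the classpath, the equivalent calls would be `new SampleCrawlerSetting()` and `new SampleCrawlerSetting(2, "crawler")`.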
| Method Summary | |
|---|---|
| boolean | followLinks(URL url, URL referer, int depth, List resultURLList, List closedURLList, List searchURLWrapperList): determines whether the given URL is to be searched for links that are then examined in the next level. |
| boolean | isActive(): if this setting is inactive, followLinks() always returns false. |
| boolean | matchesCriteria(URL url, URL referer, int depth, List resultURLList, List closedURLList): decides whether the URL itself or its content qualifies for what this CrawlerSetting searches for. Because it is called on every URL encountered, it is also the place for any custom parsing this CrawlerSetting wants to do. |
| void | setActive(boolean flag) |
| String | toString() |

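The relationship between setActive(), isActive() and followLinks() can be sketched as follows. Only the first check (an inactive setting never follows links) comes from the documentation above. The depth limit and the same-host comparison are assumptions about how the `depth` and `currentSiteOnly` fields might be applied, and `GatedSettingSketch` with its `url` helper is a hypothetical name, not the library's class.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; only the isActive() rule is taken from the documentation.
class GatedSettingSketch {
    private boolean active = true;
    private final int maxDepth;            // assumed meaning of the 'depth' field
    private final boolean currentSiteOnly;

    GatedSettingSketch(int maxDepth, boolean currentSiteOnly) {
        this.maxDepth = maxDepth;
        this.currentSiteOnly = currentSiteOnly;
    }

    void setActive(boolean flag) { active = flag; }

    boolean isActive() { return active; }

    boolean followLinks(URL url, URL referer, int depth, List<URL> resultURLList,
                        List<URL> closedURLList, List<?> searchURLWrapperList) {
        // Documented: if inactive, followLinks() always returns false.
        if (!active) return false;
        // Assumption: stop expanding once the configured depth is reached.
        if (depth >= maxDepth) return false;
        // Assumption: "current site only" means the host matches the referer's host.
        if (currentSiteOnly && referer != null
                && !url.getHost().equalsIgnoreCase(referer.getHost())) return false;
        return true;
    }

    // Helper that turns a string into a URL without a checked exception.
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }
}

class GatedSettingDemo {
    public static void main(String[] args) {
        GatedSettingSketch setting = new GatedSettingSketch(3, true);
        URL root = GatedSettingSketch.url("http://example.com/");
        URL page = GatedSettingSketch.url("http://example.com/docs/index.html");
        List<URL> none = new ArrayList<>();
        System.out.println(setting.followLinks(page, root, 1, none, none, none)); // true
        setting.setActive(false);
        System.out.println(setting.followLinks(page, root, 1, none, none, none)); // false
    }
}
```

The sketch compares host names as strings rather than calling `URL.equals`, because `URL.equals` may trigger a name lookup.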
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
public static final String[] defaultRestrictURLPattern
public int depth
public boolean currentSiteOnly
public String[] restrictURLPattern
public String[] includeTextPattern
public boolean includeHTMLCode
| Constructor Detail |
|---|
public SampleCrawlerSetting()

public SampleCrawlerSetting(int depth,
                            String includeTextPattern)

public SampleCrawlerSetting(int depth,
                            boolean currentSiteOnly,
                            String[] restrictURLPattern,
                            String[] includeTextPattern,
                            boolean includeHTMLCode)
| Method Detail |
|---|
public void setActive(boolean flag)
public boolean isActive()
public boolean followLinks(URL url,
                           URL referer,
                           int depth,
                           List resultURLList,
                           List closedURLList,
                           List searchURLWrapperList)
    Description copied from interface: CrawlerSetting
    followLinks() determines whether the given URL is to be searched for its links to be examined further in the next level.
    Specified by:
        followLinks in interface CrawlerSetting
    Parameters:
        url - the URL that is to be examined for its links
        referer - url's referer URL
        depth - distance from the original root URL where the search began
        resultURLList - List of URLs that have already been found to match this CrawlerSetting's criteria
        closedURLList - List of URLs that have already been found not to match the CrawlerSetting's criteria
        searchURLWrapperList - List of Spider.URLWrapper objects already identified to be examined in the next level
    See Also:
        Spider.URLWrapper

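public boolean matchesCriteria(URL url,
                               URL referer,
                               int depth,
                               List resultURLList,
                               List closedURLList)
    Description copied from interface: CrawlerSetting
    This method decides whether either the URL itself or its content qualifies for what this CrawlerSetting searches for; as this function is also called on every URL encountered, it is also the place for any custom parsing this CrawlerSetting wants to do.
    Specified by:
        matchesCriteria in interface CrawlerSetting
    Parameters:
        url - the URL in question to satisfy the criteria
        referer - url's referer URL
        depth - link distance from the original root URL where the search began
        resultURLList - List of URLs that have already been found to match this CrawlerSetting's criteria
        closedURLList - List of URLs that have already been found not to match the CrawlerSetting's criteria

public String toString()
    Overrides:
        toString in class Object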
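
To show how the parameters described above might fit together, here is a minimal level-by-level crawl over an in-memory link graph. It is a sketch under several assumptions, not how JSpider or Spider actually work: `CrawlSketch`, its nested `Setting` interface and `TextAndDepthSetting` are hypothetical names, plain strings stand in for java.net.URL and Spider.URLWrapper so that the example runs offline, and the crawler (rather than the setting) is assumed to maintain the result and closed lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CrawlSketch {
    // Hypothetical mirror of the two documented CrawlerSetting methods, using strings.
    interface Setting {
        boolean matchesCriteria(String url, String referer, int depth,
                                List<String> resultURLList, List<String> closedURLList);
        boolean followLinks(String url, String referer, int depth,
                            List<String> resultURLList, List<String> closedURLList,
                            List<String[]> searchURLWrapperList);
    }

    // Example setting: match URLs containing a text, follow links below a depth limit.
    static class TextAndDepthSetting implements Setting {
        final String text;
        final int maxDepth;
        TextAndDepthSetting(String text, int maxDepth) {
            this.text = text;
            this.maxDepth = maxDepth;
        }
        public boolean matchesCriteria(String url, String referer, int depth,
                                       List<String> resultURLList, List<String> closedURLList) {
            return url.contains(text);
        }
        public boolean followLinks(String url, String referer, int depth,
                                   List<String> resultURLList, List<String> closedURLList,
                                   List<String[]> searchURLWrapperList) {
            return depth < maxDepth;
        }
    }

    // Visits the graph one level at a time; each entry is a {url, referer} pair.
    static List<String> crawl(String root, Map<String, List<String>> links, Setting setting) {
        List<String> results = new ArrayList<>();   // URLs that matched
        List<String> closed = new ArrayList<>();    // URLs that did not match
        List<String[]> current = new ArrayList<>();
        current.add(new String[] { root, null });
        int depth = 0;
        while (!current.isEmpty()) {
            List<String[]> next = new ArrayList<>(); // entries for the next level
            for (String[] entry : current) {
                String url = entry[0];
                String referer = entry[1];
                if (results.contains(url) || closed.contains(url)) continue; // already seen
                // matchesCriteria is called on every URL encountered.
                if (setting.matchesCriteria(url, referer, depth, results, closed)) {
                    results.add(url);
                } else {
                    closed.add(url);
                }
                // followLinks decides whether this URL's links join the next level.
                if (setting.followLinks(url, referer, depth, results, closed, next)) {
                    for (String link : links.getOrDefault(url, Collections.<String>emptyList())) {
                        next.add(new String[] { link, url });
                    }
                }
            }
            current = next;
            depth++;
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = new HashMap<>();
        links.put("/", Arrays.asList("/a", "/b"));
        links.put("/a", Arrays.asList("/a/java"));
        links.put("/b", Arrays.asList("/"));
        System.out.println(crawl("/", links, new TextAndDepthSetting("java", 2))); // prints [/a/java]
    }
}
```

Lowering the depth limit to 1 in this example means the links found on `/a` are never queued, so the result list stays empty.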