Sunrise
2018-02-15 17:27:15
Distributed Crawler System UML Diagram
Outline/Content
Result
Serializable
Task
Split
Route
Serializable
RouteResult
Crawler
+ attribute1:type = defaultValue
+ attribute2:type
- attribute3:type
+ run()
+ beforeExecute()
+ execute()
+ afterExecute()
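The four Crawler operations suggest a template-method lifecycle. The following is a minimal sketch under that assumption, not the diagram author's implementation: only `run()`, `beforeExecute()`, `execute()` and `afterExecute()` come from the diagram, while `Runnable`, the `trace` list, and every class name prefixed with `Sketch` are hypothetical.

```java
// Hypothetical sketch: only the four method names come from the diagram.
import java.util.ArrayList;
import java.util.List;

abstract class SketchCrawler implements Runnable {
    // Records the order of lifecycle calls so the flow can be inspected.
    protected final List<String> trace = new ArrayList<>();

    // Template method: fixes the order in which the three hooks are called.
    @Override
    public final void run() {
        beforeExecute();
        try {
            execute();
        } finally {
            // Assumption: cleanup should run even if execute() throws.
            afterExecute();
        }
    }

    public void beforeExecute() { trace.add("before"); }

    public abstract void execute();

    public void afterExecute() { trace.add("after"); }

    public List<String> getTrace() { return trace; }
}

// Hypothetical concrete crawler, loosely named after ANRMCrawler in the diagram.
class SketchAnrmCrawler extends SketchCrawler {
    @Override
    public void execute() { trace.add("execute"); }
}

class CrawlerLifecycleDemo {
    public static void main(String[] args) {
        SketchAnrmCrawler crawler = new SketchAnrmCrawler();
        crawler.run();
        System.out.println(crawler.getTrace()); // prints [before, execute, after]
    }
}
```

Marking `run()` as `final` keeps subclasses from reordering the hooks; whether the original design enforces this is not stated in the diagram.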
instruction
Robot
+ TaskExecutor
+ Producer
+ main(String []args)
DBUtil
+ operation1(params):returnType
- operation2(params)
- operation3()
RouteVessel
ANRMCrawler
TaskResult
+ ResultId
+ TaskId
+ Result(JSON)
+ ExceptionType
+ ExceptionDesc
+ CaptureStartDTM
+ CaptureEndDTM
+ CreateDTM
+ LastUpdateDTM
+ UpdateBy
ProxyLog
+ TaskId
+ ProxyHost
+ ProxyPort
+ ProxyUsername
+ ProxyPassword
+ CreateDTM
+ LastUpdateDTM
+ Author
NEWS
DBOperator
Configure
TaskExecutor
TaskSaver
+ saveTaskResult()
TaskPicker
+ getOneTask()
BaseTaskLog
BaseTaskResult
CT
MySQLOperator
+ url
+ port
+ username
SSEUtil
RawDataUtil
RouteSplit
RouteVesselInstruction
DomainConverter
LogUtil
Vessel
Util
Serializable
Snapshot
+ TaskId
+ Raw(BASE64)
+ CreateDTM
+ LastUpdateDTM
+ UpdateBy
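The `Raw(BASE64)` annotation implies that the captured page is stored as Base64 text. A minimal sketch of that round trip, assuming UTF-8 page content and the standard `java.util.Base64` codec; the class name `SketchSnapshot` and its methods are hypothetical, and the timestamp and `UpdateBy` fields are omitted for brevity.

```java
// Hypothetical sketch: only the TaskId and Raw(BASE64) fields come from the diagram.
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class SketchSnapshot {
    private final String taskId;
    private final String raw; // page content, stored as Base64 text

    private SketchSnapshot(String taskId, String raw) {
        this.taskId = taskId;
        this.raw = raw;
    }

    // Encodes the captured page before it is stored.
    static SketchSnapshot fromPage(String taskId, String page) {
        byte[] bytes = page.getBytes(StandardCharsets.UTF_8);
        return new SketchSnapshot(taskId, Base64.getEncoder().encodeToString(bytes));
    }

    String getTaskId() { return taskId; }

    String getRaw() { return raw; }

    // Decodes the stored Base64 text back into the original page.
    String decodeRaw() {
        return new String(Base64.getDecoder().decode(raw), StandardCharsets.UTF_8);
    }
}

class SnapshotDemo {
    public static void main(String[] args) {
        SketchSnapshot snapshot = SketchSnapshot.fromPage("task-1", "hi");
        System.out.println(snapshot.getRaw());       // prints aGk=
        System.out.println(snapshot.decodeRaw());    // prints hi
    }
}
```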
BaseTaskRawData
TaskLog
+ TaskId
+ Action
+ Message
+ CreateDTM
+ UpdateBy
BaseTask
CrawlerTask
+ TaskId
+ Project (Route/Vessel/CT)
+ Content(JSON)
+ Status(New/Capturing/CaptureFailed/Captured/Responsed/Timeout)
+ CreateDTM
+ LastUpdateDTM
+ UpdateBy(Hostname)
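The CrawlerTask fields map naturally onto a serializable Java class with two enums. This is a sketch under stated assumptions rather than the original code: the enum values are copied from the diagram (spelling included), but the field types, the `moveTo` helper, and all `Sketch`-prefixed names are hypothetical.

```java
// Hypothetical sketch: field names and enum values come from the diagram; types are assumed.
import java.io.Serializable;
import java.time.LocalDateTime;

// Values copied from the diagram's Status(...) annotation.
enum SketchTaskStatus { New, Capturing, CaptureFailed, Captured, Responsed, Timeout }

// Values copied from the diagram's Project(...) annotation.
enum SketchProject { Route, Vessel, CT }

class SketchCrawlerTask implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String taskId;            // assumed to be a string id
    private final SketchProject project;
    private final String content;           // JSON payload kept as raw text
    private SketchTaskStatus status = SketchTaskStatus.New;
    private final LocalDateTime createDtm = LocalDateTime.now();
    private LocalDateTime lastUpdateDtm = createDtm;
    private String updateBy;                // hostname of the node that last touched the task

    SketchCrawlerTask(String taskId, SketchProject project, String content, String hostname) {
        this.taskId = taskId;
        this.project = project;
        this.content = content;
        this.updateBy = hostname;
    }

    // Changes the status and stamps who changed it and when.
    void moveTo(SketchTaskStatus next, String hostname) {
        this.status = next;
        this.lastUpdateDtm = LocalDateTime.now();
        this.updateBy = hostname;
    }

    String getTaskId() { return taskId; }
    SketchProject getProject() { return project; }
    String getContent() { return content; }
    SketchTaskStatus getStatus() { return status; }
    LocalDateTime getCreateDtm() { return createDtm; }
    LocalDateTime getLastUpdateDtm() { return lastUpdateDtm; }
    String getUpdateBy() { return updateBy; }
}

class CrawlerTaskDemo {
    public static void main(String[] args) {
        SketchCrawlerTask task =
            new SketchCrawlerTask("task-1", SketchProject.Vessel, "{}", "node-a");
        task.moveTo(SketchTaskStatus.Capturing, "node-b");
        System.out.println(task.getStatus() + " by " + task.getUpdateBy()); // prints Capturing by node-b
    }
}
```

The sketch does not restrict which status may follow which, because the diagram lists the states without defining transitions.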
CrawlerConfigure
+ TargetURL
+ CrawlerId(ANRM)
+ Project (Route/Vessel/CT)
+ Class
+ Enable(True/False)
+ UseProxy(True/False)
+ ThreadNum(0-999)
+ MaxThreadNum
+ IntervalTimeSeconds
+ ThreadTimeoutSeconds
+ TaskExpireAfterHours(7*24Hours)
+ CreateDTM
+ LastUpdateDTM
+ UpdateBy
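Two of the CrawlerConfigure annotations carry constraints that can be checked in code: `ThreadNum(0-999)` and `TaskExpireAfterHours(7*24Hours)`. The sketch below covers only a subset of the fields and rests on two assumptions: that 0-999 is an inclusive valid range, and that 7*24 hours is a default value. The class name, the constructor shape, and the example URL are hypothetical.

```java
// Hypothetical sketch: a subset of the diagram's fields, with assumed types and validation.
class SketchCrawlerConfigure {
    // Assumption: "(7*24Hours)" in the diagram denotes the default expiry, i.e. 168 hours.
    static final int DEFAULT_TASK_EXPIRE_AFTER_HOURS = 7 * 24;
    // Assumption: "(0-999)" in the diagram is an inclusive range.
    static final int MIN_THREAD_NUM = 0;
    static final int MAX_THREAD_NUM = 999;

    final String targetUrl;
    final String crawlerId;
    final boolean enable;
    final boolean useProxy;
    final int threadNum;
    final int taskExpireAfterHours;

    SketchCrawlerConfigure(String targetUrl, String crawlerId,
                           boolean enable, boolean useProxy, int threadNum) {
        if (threadNum < MIN_THREAD_NUM || threadNum > MAX_THREAD_NUM) {
            throw new IllegalArgumentException(
                "ThreadNum must be within 0-999, got " + threadNum);
        }
        this.targetUrl = targetUrl;
        this.crawlerId = crawlerId;
        this.enable = enable;
        this.useProxy = useProxy;
        this.threadNum = threadNum;
        this.taskExpireAfterHours = DEFAULT_TASK_EXPIRE_AFTER_HOURS;
    }
}

class CrawlerConfigureDemo {
    public static void main(String[] args) {
        // The URL is a placeholder, not a target named in the diagram.
        SketchCrawlerConfigure config =
            new SketchCrawlerConfigure("http://example.com/", "ANRM", true, false, 4);
        System.out.println(config.taskExpireAfterHours); // prints 168
    }
}
```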
VesselSplit
ProxyUtil
TaskSender
+ sendTaskResult()
BaseProxyLog