# IOError is raised by the HttpCompression middleware when trying to
# decompress an empty response
EXCEPTIONS_TO_RETRY = (defer.TimeoutError, TimeoutError, DNSLookupError,
                       ConnectionRefusedError, ConnectionDone, ConnectError,
                       ConnectionLost, TCPTimedOutError, ResponseFailed,
                       IOError, TunnelError)
def __init__(self, settings):
    if not settings.getbool('RETRY_ENABLED'):
        raise NotConfigured
    self.max_retry_times = settings.getint('RETRY_TIMES')
    self.retry_http_codes = set(int(x) for x in settings.getlist('RETRY_HTTP_CODES'))
    self.priority_adjust = settings.getint('RETRY_PRIORITY_ADJUST')
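To make the parsing in `__init__` concrete, here is a minimal sketch that mimics it with a plain dict instead of Scrapy's settings object. The setting values, the `parse_retry_settings` helper, and the `RuntimeError` standing in for `NotConfigured` are all assumptions made for illustration, not part of Scrapy.

```python
# Hypothetical settings values, chosen only for illustration.
settings = {
    "RETRY_ENABLED": True,
    "RETRY_TIMES": 2,
    "RETRY_HTTP_CODES": ["500", 502, "503"],  # strings and ints both work
    "RETRY_PRIORITY_ADJUST": -1,
}


def parse_retry_settings(settings):
    """Mimic the __init__ logic above using a plain dict (not Scrapy's API)."""
    if not settings.get("RETRY_ENABLED"):
        # Stand-in for `raise NotConfigured` in the real middleware.
        raise RuntimeError("retry middleware disabled")
    return {
        "max_retry_times": int(settings["RETRY_TIMES"]),
        # Every code is coerced to int, so "500" and 500 are equivalent.
        "retry_http_codes": set(int(x) for x in settings["RETRY_HTTP_CODES"]),
        "priority_adjust": int(settings["RETRY_PRIORITY_ADJUST"]),
    }


parsed = parse_retry_settings(settings)
print(parsed)
```

The point of the `set(int(x) ...)` step is that status codes arriving as strings, for example from the command line, still match the integer `response.status`.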
def process_response(self, request, response, spider):
    if (
        request.meta.get('dont_redirect', False)
        or response.status in getattr(spider, 'handle_httpstatus_list', [])
        or response.status in request.meta.get('handle_httpstatus_list', [])
        or request.meta.get('handle_httpstatus_all', False)
    ):
        return response
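The condition above decides when a response is handed back untouched instead of being followed as a redirect. The sketch below reproduces that check as a standalone function so each branch can be tried in isolation. The `passes_through` name and the `SimpleNamespace` stand-ins for spiders are hypothetical; only the four conditions come from the excerpt.

```python
from types import SimpleNamespace


def passes_through(status, meta, spider):
    """Return True when the response would be returned as-is (no redirect)."""
    return bool(
        meta.get("dont_redirect", False)
        or status in getattr(spider, "handle_httpstatus_list", [])
        or status in meta.get("handle_httpstatus_list", [])
        or meta.get("handle_httpstatus_all", False)
    )


# Stand-ins for spider objects: one without the attribute, one with it.
plain_spider = SimpleNamespace()
strict_spider = SimpleNamespace(handle_httpstatus_list=[301, 302])

print(passes_through(302, {}, plain_spider))
print(passes_through(302, {"dont_redirect": True}, plain_spider))
print(passes_through(302, {}, strict_spider))
```

In a real spider the same effect would come from setting these keys on `request.meta` or defining `handle_httpstatus_list` on the spider class.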
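Definition of 301 Moved Permanently
The 301 status code indicates that the target resource has been permanently moved to a new URI, and any future reference to this resource should use the new URI.

Definition of 302 Found
The 302 status code indicates that the target resource has temporarily moved to a different URI. Because the redirect is temporary, the client should keep using the original URI for later requests. The server puts the different URI in the Location field of the response header, and the browser can use that URI to redirect automatically.

Definition of 303 See Other
The 303 status code indicates that the server is redirecting the browser to another resource, whose URI is given in the Location field of the response header. Semantically, the redirect target is not the resource you requested but a description of it. 303 is commonly used to redirect a POST request to a GET request: for example, you upload some personal information, and the server replies with a 303 that sends you to an "upload succeeded" page.

Definition of 307 Temporary Redirect
The definition of 307 is essentially the same as 302. The only difference is that 307 does not allow the browser to turn a request that was originally a POST into a GET when following the redirect.

Definition of 308 Permanent Redirect
The definition of 308 is essentially the same as 301. The only difference is that 308 does not allow the browser to turn a request that was originally a POST into a GET when following the redirect.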
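The differences between these five codes come down to what happens to the request method when the redirect is followed. The sketch below encodes the behaviour browsers commonly show: 307 and 308 keep the method, 303 switches to GET, and 301 and 302 are typically rewritten from POST to GET. The `method_after_redirect` function is a hypothetical helper written for this illustration, not something taken from Scrapy or any library.

```python
def method_after_redirect(status, method):
    """Return the method a typical browser uses after following a redirect."""
    method = method.upper()
    if status in (307, 308):
        # The method must be preserved: a POST stays a POST.
        return method
    if status == 303:
        # The follow-up request is a GET (a HEAD stays a HEAD).
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Historically, browsers rewrite POST to GET here.
        return "GET" if method == "POST" else method
    raise ValueError(f"{status} is not one of the redirects described above")


for code in (301, 302, 303, 307, 308):
    print(code, method_after_redirect(code, "POST"))
```

This is why 307 and 308 are the safer choice when a form submission or API call has to reach the new location with its body intact.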
1×× Informational
    100 Continue
    101 Switching Protocols
    102 Processing

2×× Success
    200 OK
    201 Created
    202 Accepted
    203 Non-authoritative Information
    204 No Content
    205 Reset Content
    206 Partial Content
    207 Multi-Status
    208 Already Reported
    226 IM Used

3×× Redirection
    300 Multiple Choices
    301 Moved Permanently
    302 Found
    303 See Other
    304 Not Modified
    305 Use Proxy
    307 Temporary Redirect
    308 Permanent Redirect

4×× Client Error
    400 Bad Request
    401 Unauthorized
    402 Payment Required
    403 Forbidden
    404 Not Found
    405 Method Not Allowed
    406 Not Acceptable
    407 Proxy Authentication Required
    408 Request Timeout
    409 Conflict
    410 Gone
    411 Length Required
    412 Precondition Failed
    413 Payload Too Large
    414 Request-URI Too Long
    415 Unsupported Media Type
    416 Requested Range Not Satisfiable
    417 Expectation Failed
    418 I’m a teapot
    421 Misdirected Request
    422 Unprocessable Entity
    423 Locked
    424 Failed Dependency
    426 Upgrade Required
    428 Precondition Required
    429 Too Many Requests
    431 Request Header Fields Too Large
    444 Connection Closed Without Response
    451 Unavailable For Legal Reasons
    499 Client Closed Request

5×× Server Error
    500 Internal Server Error
    501 Not Implemented
    502 Bad Gateway
    503 Service Unavailable
    504 Gateway Timeout
    505 HTTP Version Not Supported
    506 Variant Also Negotiates
    507 Insufficient Storage
    508 Loop Detected
    510 Not Extended
    511 Network Authentication Required
    599 Network Connect Timeout Error
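Since the class of a status code is determined by its first digit, the grouping in this list can be computed rather than looked up. A minimal sketch, where `status_class` is a hypothetical helper name and the class labels are the five headings used in the list:

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map a status code to its class using the hundreds digit."""
    try:
        return STATUS_CLASSES[code // 100]
    except KeyError:
        raise ValueError(f"{code} is outside the 1xx-5xx range") from None


for code in (101, 204, 308, 429, 599):
    print(code, status_class(code))
```

This also covers the non-standard entries in the list, such as 444, 499 and 599, because they still fall within the 4×× and 5×× ranges.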
from scrapy.downloadermiddlewares.downloadtimeout import DownloadTimeoutMiddleware