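An executable jar, a Telegram bot, runs on a Linux server. The program works around the clock, parsing links from the internet. Everything works for a few days (2-3 days), and then this error occurs: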
Mar 17, 2020 6:19:26 AM org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Detected dialect: W3C
Marionette threw an error: <unprintable error>
org.openqa.selenium.WebDriverException: java.net.SocketTimeoutException: timeout
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'autoru-1579587188088-s-1vcpu-1gb-ams3-01', ip: '127.0.1.1', os.name: 'Linux', os.arch: 'amd64', os.version: '4.15.0-88-generic', java.version: '1.8.0_242'
Driver info: driver.version: RemoteWebDriver
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:92)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:552)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:609)
at org.openqa.selenium.remote.RemoteWebDriver.getPageSource(RemoteWebDriver.java:438)
at com.company.bot.bot.Parsing.WebSurfing.connect(WebSurfing.java:37)
at com.company.bot.bot.Parsing.ParsinTop.start(ParsinTop.java:41)
at com.company.bot.bot.Parsing.ParsinTop.run(ParsinTop.java:19)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: timeout
at okio.Okio$4.newTimeoutException(Okio.java:232)
at okio.AsyncTimeout.exit(AsyncTimeout.java:285)
at okio.AsyncTimeout$2.read(AsyncTimeout.java:241)
at okio.RealBufferedSource.indexOf(RealBufferedSource.java:355)
at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:227)
at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:215)
at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189)
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:88)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at org.openqa.selenium.remote.internal.OkHttpClient$Factory$1.lambda$createClient$1(OkHttpClient.java:152)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:200)
at okhttp3.RealCall.execute(RealCall.java:77)
at org.openqa.selenium.remote.internal.OkHttpClient.execute(OkHttpClient.java:103)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:155)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83)
... 7 more
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.read(SocketInputStream.java:204)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at okio.Okio$2.read(Okio.java:140)
at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
... 32 more
Mar 17, 2020 3:19:59 PM org.openqa.selenium.os.OsProcess destroy
INFO: Unable to drain process streams. Ignoring but the exception being swallowed follows.
org.apache.commons.exec.ExecuteException: The stop timeout of 2000 ms was exceeded (Exit value: -559038737)
at org.apache.commons.exec.PumpStreamHandler.stopThread(PumpStreamHandler.java:295)
at org.apache.commons.exec.PumpStreamHandler.stop(PumpStreamHandler.java:181)
at org.openqa.selenium.os.OsProcess.destroy(OsProcess.java:136)
at org.openqa.selenium.os.CommandLine.destroy(CommandLine.java:153)
at org.openqa.selenium.remote.service.DriverService.stop(DriverService.java:232)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:95)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:552)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:609)
at org.openqa.selenium.remote.RemoteWebDriver.quit(RemoteWebDriver.java:452)
at com.company.bot.bot.Parsing.WebSurfing.connect(WebSurfing.java:58)
at com.company.bot.bot.Parsing.ParsinTop.start(ParsinTop.java:41)
at com.company.bot.bot.Parsing.ParsinTop.run(ParsinTop.java:19)
at java.lang.Thread.run(Thread.java:748)
After that, the browser is restarted (in my case Firefox driven by geckodriver). After the restart, a new error is thrown:
Mar 17, 2020 3:19:59 PM org.openqa.selenium.os.OsProcess destroy
INFO: Unable to drain process streams. Ignoring but the exception being swallowed follows.
org.apache.commons.exec.ExecuteException: The stop timeout of 2000 ms was exceeded (Exit value: -559038737)
at org.apache.commons.exec.PumpStreamHandler.stopThread(PumpStreamHandler.java:295)
at org.apache.commons.exec.PumpStreamHandler.stop(PumpStreamHandler.java:181)
at org.openqa.selenium.os.OsProcess.destroy(OsProcess.java:136)
at org.openqa.selenium.os.CommandLine.destroy(CommandLine.java:153)
at org.openqa.selenium.remote.service.DriverService.stop(DriverService.java:232)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:95)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:552)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:609)
at org.openqa.selenium.remote.RemoteWebDriver.quit(RemoteWebDriver.java:452)
at com.company.bot.bot.Parsing.WebSurfing.connect(WebSurfing.java:58)
at com.company.bot.bot.Parsing.ParsinTop.start(ParsinTop.java:41)
at com.company.bot.bot.Parsing.ParsinTop.run(ParsinTop.java:19)
at java.lang.Thread.run(Thread.java:748)
Mar 17, 2020 3:19:59 PM org.openqa.selenium.os.OsProcess destroy
SEVERE: Unable to kill process java.lang.UNIXProcess@78019293
...
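But instead of working, it launches the browser many times in a row, after which conflicts occur. The newly opened browsers fail with an error and do not close; they hang and overload the server: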
org.openqa.selenium.WebDriverException: connection refused
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: 'autoru-1579587188088-s-1vcpu-1gb-ams3-01', ip: '127.0.1.1', os.name: 'Linux', os.arch: 'amd64', os.version: '4.15.0-88-generic', java.version: '1.8.0_242'
Driver info: driver.version: FirefoxDriver
remote stacktrace:
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.openqa.selenium.remote.W3CHandshakeResponse.lambda$errorHandler$0(W3CHandshakeResponse.java:62)
at org.openqa.selenium.remote.HandshakeResponse.lambda$getResponseFunction$0(HandshakeResponse.java:30)
at org.openqa.selenium.remote.ProtocolHandshake.lambda$createSession$0(ProtocolHandshake.java:126)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Spliterators$ArraySpliterator.tryAdvance(Spliterators.java:958)
at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:531)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:128)
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:74)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:136)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:552)
at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:213)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:131)
at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:147)
at com.company.bot.bot.Parsing.WebSurfing.connect(WebSurfing.java:25)
at com.company.bot.bot.Parsing.ParsinTop.start(ParsinTop.java:41)
at com.company.bot.bot.Parsing.ParsinTop.run(ParsinTop.java:19)
at java.lang.Thread.run(Thread.java:748)
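One way to keep a failed browser start from turning into a burst of relaunches is to cap the number of attempts and wait between them. The sketch below is only an illustration under assumptions: `BoundedRetry` and its parameters are hypothetical names, not part of Selenium, and it assumes the repeated launches come from the calling code retrying immediately. It relies on the fact that `WebDriverException` is a `RuntimeException`.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries an action a limited number of times with a growing pause. */
final class BoundedRetry {
    private BoundedRetry() {}

    /**
     * Calls the action up to maxAttempts times. After each failure except the last,
     * sleeps for the current delay and then doubles it. Rethrows the last failure.
     */
    static <T> T call(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delay);
                    } catch (InterruptedException ie) {
                        // Preserve the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted while waiting to retry", ie);
                    }
                    delay *= 2;
                }
            }
        }
        throw last;
    }
}
```

With the class from this question, a call might look like `BoundedRetry.call(() -> new FirefoxDriver(options), 3, 5000)`. Once the attempts are used up the exception propagates, so the caller can skip the link rather than start yet another browser.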
The class that does the parsing and launches the browser:
import java.util.ArrayList;
import java.util.HashMap;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.firefox.FirefoxOptions;

public class WebSurfing extends ArrayList<String> {
    public HashMap<String, String> connect(String url) {
        HashMap<String, String> hm = new HashMap<>();
        System.setProperty("webdriver.gecko.driver", "/usr/bin/geckodriver");
        FirefoxOptions options = new FirefoxOptions();
        options.addArguments("--headless");
        // A new headless Firefox is started on every call to connect().
        WebDriver driver = new FirefoxDriver(options);
        try {
            url = url.replace("https://auto.ru/cars/", "https://auto.ru/moskva/cars/");
            driver.get(url);
            System.out.println("Connect to " + url);
            // Click the first blue button on the page.
            driver.findElements(By.xpath("//div[@class='button button_blue']")).get(0).click();
            System.out.println("After accept");
            Thread.sleep(10000);
            try {
                Document document = Jsoup.parse(driver.getPageSource());
                Elements elements = document.getElementsByClass("Link CardDealerName-module__dealerName");
                // "Результат" = result (dealer link), "Конкурент" = competitor (dealer name).
                hm.put("Результат", elements.get(0).attr("href"));
                System.out.println(hm.get("Результат"));
                hm.put("Конкурент", elements.get(0).text());
                System.out.println(hm.get("Конкурент"));
            } catch (Exception e) {
                hm.put("Результат", " ");
                hm.put("Конкурент", " ");
                e.printStackTrace();
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        } finally {
            // Close the window, then end the session; failures here are ignored.
            try {
                driver.close();
            } catch (Exception e) {}
            try {
                driver.quit();
            } catch (Exception e) {}
        }
        return hm;
    }
}
Maybe I am doing something wrong. What is the problem?
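I switched the browser to GoogleDriver, and everything has been running fine for 4 days. I launch the browser with parameters.
The bot is up and running. The database works. There are no errors. Admittedly, the speed of checking links is limited. I think this is because I open and close the browser before every request; if I just closed the tab and opened a new one, it would most likely run faster. So the problem was probably in the browser, or memory ran out; perhaps FirefoxDriver overloaded the server at some point and everything fell over. At the same time, FirefoxDriver got through 3-4 links per reload, and GoogleDriver lags behind in this respect, possibly because of the parameters I set.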
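The idea of not opening and closing the browser for every request can be sketched as a small wrapper that keeps one session alive, replaces it after a fixed number of uses, and drops it when a task fails. This is a minimal sketch under assumptions, not a tested fix: `RecyclingSession`, `maxUses`, and the factory and disposer arguments are hypothetical names introduced here, and whether periodic recycling actually keeps memory use in check on this server is untested.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical helper, not part of Selenium: holds one long-lived session and
 * replaces it after maxUses successful tasks or after a failed task.
 */
final class RecyclingSession<T> {
    private final Supplier<T> factory;   // assumed usage: () -> new FirefoxDriver(options)
    private final Consumer<T> disposer;  // assumed usage: WebDriver::quit
    private final int maxUses;
    private T current;
    private int uses;

    RecyclingSession(Supplier<T> factory, Consumer<T> disposer, int maxUses) {
        if (maxUses < 1) {
            throw new IllegalArgumentException("maxUses must be >= 1");
        }
        this.factory = factory;
        this.disposer = disposer;
        this.maxUses = maxUses;
    }

    /** Runs one task on the shared session, creating it first if none is open. */
    synchronized <R> R use(Function<T, R> task) {
        if (current == null) {
            current = factory.get();
            uses = 0;
        }
        try {
            R result = task.apply(current);
            uses++;
            if (uses >= maxUses) {
                dispose(); // recycle so a long-running session does not grow without bound
            }
            return result;
        } catch (RuntimeException e) {
            dispose(); // a session that just failed is not reused
            throw e;
        }
    }

    /** Disposes the current session, if any; safe to call more than once. */
    synchronized void dispose() {
        T old = current;
        current = null;
        uses = 0;
        if (old != null) {
            try {
                disposer.accept(old);
            } catch (RuntimeException ignored) {
                // A session that fails to shut down cleanly is still dropped.
            }
        }
    }
}
```

In the bot this might be created once as `new RecyclingSession<WebDriver>(() -> new FirefoxDriver(options), WebDriver::quit, 50)`, with each link handled through `session.use(driver -> { driver.get(url); return driver.getPageSource(); })`. The value 50 is arbitrary and would need tuning against the memory available on the server.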