
Apache Spark - Databricks Connect connection timeout through a proxy

Posted: 2022-02-21 15:49:16

Tags: java, scala, sql

I am using databricks-connect behind a proxy. The proxy is configured to interrupt connections every 30 seconds during long polling, so running longer jobs through db-connect raises the exception below (a minimal sketch of the failing pattern follows the stack trace):

WARN SparkServiceRPCClient: Retrying failed request f519919e-dc9f-446f-b542-48331672e8cb (9 of 10 max retries): java.util.concurrent.ExecutionException: java.io.EOFException: HttpConnectionOverHTTP@55af9b55::DecryptedEndPoint@192f8bce{/127.0.0.1:3128<->/127.0.0.1:58030,OPEN,fill=-,flush=-,to=30338/60000}

WARN SparkServiceRPCClient: Retrying failed request f519919e-dc9f-446f-b542-48331672e8cb (10 of 10 max retries): java.util.concurrent.ExecutionException: java.io.EOFException: HttpConnectionOverHTTP@3f110c27::DecryptedEndPoint@7a1385a1{/127.0.0.1:3128<->/127.0.0.1:58034,OPEN,fill=-,flush=-,to=30002/60000}

WARN SparkServiceRPCClient: Retrying failed request f519919e-dc9f-446f-b542-48331672e8cb (11 of 10 max retries): java.util.concurrent.ExecutionException: java.io.EOFException: HttpConnectionOverHTTP@42ad0604::DecryptedEndPoint@43e866b{/127.0.0.1:3128<->/127.0.0.1:58038,OPEN,fill=-,flush=-,to=30123/60000}

WARN SparkServiceRPCClient: Exceeded max number of RPC retries for f519919e-dc9f-446f-b542-48331672e8cb
    An error occurred while calling o188.save.
    : java.util.concurrent.ExecutionException: java.io.EOFException: HttpConnectionOverHTTP@6777d9fd::DecryptedEndPoint@7fd9b24e{/127.0.0.1:3128<->/127.0.0.1:58046,OPEN,fill=-,flush=-,to=30445/60000}
    at org.sparkproject.jetty.client.util.FutureResponseListener.getResult(FutureResponseListener.java:118)
    at org.sparkproject.jetty.client.util.FutureResponseListener.get(FutureResponseListener.java:101)
    at com.databricks.service.DBAPIClient.post(DBAPIClient.scala:67)
    at com.databricks.service.SparkServiceRPCClient.$anonfun$doPost$1(SparkServiceRPCClient.scala:113)
    at com.databricks.service.SparkServiceRPCClient.handleResponse(SparkServiceRPCClient.scala:120)
    at com.databricks.service.SparkServiceRPCClient.doPost(SparkServiceRPCClient.scala:113)
    at com.databricks.service.SparkServiceRPCClient.executeRPC0(SparkServiceRPCClient.scala:79)
    at com.databricks.service.SparkServiceRemoteFuncRunner.withRpcRetries(SparkServiceRemoteFuncRunner.scala:227)
    at com.databricks.service.SparkServiceRemoteFuncRunner.executeRPC(SparkServiceRemoteFuncRunner.scala:156)
    at com.databricks.service.SparkServiceRemoteFuncRunner.executeRPCHandleCancels(SparkServiceRemoteFuncRunner.scala:287)
    at com.databricks.service.SparkServiceRemoteFuncRunner.$anonfun$execute0$1(SparkServiceRemoteFuncRunner.scala:118)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.service.SparkServiceRemoteFuncRunner.withRetry(SparkServiceRemoteFuncRunner.scala:135)
    at com.databricks.service.SparkServiceRemoteFuncRunner.execute0(SparkServiceRemoteFuncRunner.scala:113)
    at com.databricks.service.SparkServiceRemoteFuncRunner.$anonfun$execute$1(SparkServiceRemoteFuncRunner.scala:86)
    at com.databricks.spark.util.Log4jUsageLogger.recordOperation(UsageLogger.scala:237)
    at com.databricks.spark.util.UsageLogging.recordOperation(UsageLogger.scala:401)
    at com.databricks.spark.util.UsageLogging.recordOperation$(UsageLogger.scala:380)
    at com.databricks.service.SparkServiceRPCClientStub.recordOperation(SparkServiceRPCClientStub.scala:61)
    at com.databricks.service.SparkServiceRemoteFuncRunner.execute(SparkServiceRemoteFuncRunner.scala:78)
    at com.databricks.service.SparkServiceRemoteFuncRunner.execute$(SparkServiceRemoteFuncRunner.scala:67)
    at com.databricks.service.SparkServiceRPCClientStub.execute(SparkServiceRPCClientStub.scala:61)
    at com.databricks.service.SparkServiceRPCClientStub.executePlan(SparkServiceRPCClientStub.scala:201)
    at org.apache.spark.sql.DataFrameWriter.doRemoteSave(DataFrameWriter.scala:1014)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:304)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:294)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
    at py4j.Gateway.invoke(Gateway.java:295)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:251)
    at java.base/java.lang.Thread.run(Thread.java:829)
...
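For context, the client side here is PySpark: the py4j frames show the call entering through `DataFrameWriter.save`. A minimal sketch of the failing pattern, assuming a working databricks-connect setup (`databricks-connect configure` already run); the output path is a hypothetical placeholder:

```python
from pyspark.sql import SparkSession

# With databricks-connect installed, getOrCreate() returns a session that
# forwards commands to the remote cluster over HTTP RPC (the
# SparkServiceRPCClient seen in the warnings above).
spark = SparkSession.builder.getOrCreate()

# Any write whose server-side execution outlasts the proxy's 30-second idle
# window triggers the EOFException retries; .save()/.parquet() is the call
# that surfaces the failure (DataFrameWriter.save in the stack trace).
df = spark.range(0, 100_000_000)
df.write.mode("overwrite").parquet("dbfs:/tmp/long_running_write")  # hypothetical path
```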

Is there a Spark parameter that controls the number of retries in the SparkService client? If so, I could suppress the warnings and raise the retry count high enough to keep using db-connect behind this proxy. Otherwise, I see no way to prevent the connection from being cut.
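(For reference, client-side settings for databricks-connect are normally supplied through the `spark.databricks.service.*` conf namespace on the session builder. The sketch below only shows where such a retry setting would go; the key name `spark.databricks.service.client.maxRetries` is hypothetical, not a documented parameter — finding the real one is precisely the question.)

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Real databricks-connect keys use this namespace (e.g.
    # spark.databricks.service.address, spark.databricks.service.clusterId).
    # The retry key below is a HYPOTHETICAL placeholder, not a confirmed option.
    .config("spark.databricks.service.client.maxRetries", "50")
    .getOrCreate()
)
```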
