Hello,
I am reading data from Parquet files as DataFrames and writing them to Iceberg using Spark.
I have many small files, and several of them may write to the same table.
After some files have been written, the executor goes into a stale state and hangs there; the running query makes no further progress. The write call is:
df.WriteTo(fullTableName)
.Options(tableOptions)
.Append();
The stale but still running query looks like this:
append at <unknown>:0
org.apache.spark.sql.DataFrameWriterV2.append(DataFrameWriterV2.scala:153)
jdk.internal.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.base/java.lang.reflect.Method.invoke(Method.java:568)
org.apache.spark.api.dotnet.DotnetBackendHandler.handleMethodCall(DotnetBackendHandler.scala:167)
org.apache.spark.api.dotnet.DotnetBackendHandler.$anonfun$handleBackendRequest$2(DotnetBackendHandler.scala:105)
org.apache.spark.api.dotnet.ThreadPool$$anon$1.run(ThreadPool.scala:34)
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.base/java.lang.Thread.run(Thread.java:842)
NuGet version: 2.3.0
Spark version: 3.5.3
Hadoop version: 3.4.1
JDK version: 17