Issue with iceberg.metadata-cache: Not an Avro data file #25702

Open
@memignone

Description

After upgrading from Trino 453 to 454, queries against my Iceberg catalog fail with the following error.

Stacktrace:

org.apache.iceberg.exceptions.RuntimeIOException: Failed to open file: s3a://target-bucket/registry/metadata-target/3c65473c-48f0-478b-96d1-9fda7ae565cc-m0.avro
	at org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:104)
	at org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:77)
	at org.apache.iceberg.io.CloseableIterable$7$1.<init>(CloseableIterable.java:188)
	at org.apache.iceberg.io.CloseableIterable$7.iterator(CloseableIterable.java:187)
	at org.apache.iceberg.io.CloseableIterable.lambda$filter$0(CloseableIterable.java:109)
	at org.apache.iceberg.io.CloseableIterable$2.iterator(CloseableIterable.java:72)
	at org.apache.iceberg.io.CloseableIterable.lambda$filter$1(CloseableIterable.java:136)
	at org.apache.iceberg.io.CloseableIterable$2.iterator(CloseableIterable.java:72)
	at org.apache.iceberg.io.CloseableIterable$7$1.<init>(CloseableIterable.java:188)
	at org.apache.iceberg.io.CloseableIterable$7.iterator(CloseableIterable.java:187)
	at org.apache.iceberg.ManifestGroup$1.iterator(ManifestGroup.java:346)
	at org.apache.iceberg.io.CloseableIterable$ConcatCloseableIterable$ConcatCloseableIterator.hasNext(CloseableIterable.java:257)
	at io.trino.plugin.iceberg.IcebergSplitSource.getNextBatch(IcebergSplitSource.java:253)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorSplitSource.getNextBatch(ClassLoaderSafeConnectorSplitSource.java:43)
	at io.trino.split.ConnectorAwareSplitSource.getNextBatch(ConnectorAwareSplitSource.java:73)
	at io.trino.split.TracingSplitSource.getNextBatch(TracingSplitSource.java:64)
	at io.trino.split.BufferingSplitSource$GetNextBatch.fetchSplits(BufferingSplitSource.java:130)
	at io.trino.split.BufferingSplitSource$GetNextBatch.fetchNextBatchAsync(BufferingSplitSource.java:112)
	at io.trino.split.BufferingSplitSource.getNextBatch(BufferingSplitSource.java:61)
	at io.trino.split.TracingSplitSource.getNextBatch(TracingSplitSource.java:64)
	at io.trino.execution.scheduler.SourcePartitionedScheduler.schedule(SourcePartitionedScheduler.java:247)
	at io.trino.execution.scheduler.SourcePartitionedScheduler$1.schedule(SourcePartitionedScheduler.java:172)
	at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.schedule(PipelinedQueryScheduler.java:1279)
	at io.trino.$gen.Trino_454____20250429_124140_2.run(Unknown Source)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1570)
Caused by: org.apache.avro.InvalidAvroMagicException: Not an Avro data file
	at org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:79)
	at org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:102)
	... 30 more

Query: select * from iceberg_target.my_schema.registry limit 1;

Catalog definition:

connector.name=iceberg
hive.metastore.uri=thrift://hive-metastore:9083
fs.native-s3.enabled=true
s3.path-style-access=true
s3.endpoint=http://minio-target:9002
s3.region=us-east-1
s3.aws-access-key=###
s3.aws-secret-key=###
iceberg.file-format=PARQUET

Disabling Iceberg metadata caching in the catalog definition makes the query work again:

iceberg.metadata-cache.enabled=false

In newer releases (I tested only 462 and 475), the error is masked and surfaces as the error reported in this issue:

Failed to open file: io.trino.filesystem.cache.CacheInputFile@3849a554
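
To help narrow down whether the manifest in the bucket is itself damaged or whether the bad bytes come from the cached read path, here is a minimal sketch that checks a local copy of the manifest for the Avro container header. It assumes the manifest has already been downloaded from the bucket by some other means. The function names and the local file name are hypothetical and not part of Trino or Iceberg. Avro object container files start with the four bytes `O`, `b`, `j`, `0x01`, which is what the "Not an Avro data file" check looks for.

```python
# Sketch only: inspects a local copy of a manifest file, not the object in S3.

# Avro object container files begin with these four bytes: 'O', 'b', 'j', 0x01.
AVRO_MAGIC = b"Obj\x01"


def has_avro_magic(path):
    """Return True if the file at `path` starts with the Avro container magic."""
    with open(path, "rb") as f:
        return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC


def describe_header(path, length=16):
    """Return the first `length` bytes as space-separated hex.

    Useful for spotting what the file actually contains when the magic
    check fails (for example zero-filled, compressed, or text content).
    """
    with open(path, "rb") as f:
        return f.read(length).hex(" ")
```

For example, after copying the manifest named in the stack trace to a local file such as `manifest.avro` (a hypothetical name), `has_avro_magic("manifest.avro")` returning `True` would suggest the object in storage is intact and that the invalid bytes are introduced somewhere on the read path when `iceberg.metadata-cache.enabled` is on. If it returns `False`, `describe_header("manifest.avro")` shows what the file starts with instead.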

Labels: iceberg (Iceberg connector)
