I am following the official Flink tutorial to start a session cluster on native Kubernetes.

First I created a clean new cluster.

However, after running

./bin/kubernetes-session.sh -Dkubernetes.cluster-id=my-first-flink-cluster

I got the following error in the log of the newly created pod my-first-flink-cluster-xxx:

2021-08-14 18:33:02,519 WARN  io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager [] - Exec Failure: HTTP 403, Status: 403 - pods is forbidden: User "system:serviceaccount:default:default" cannot watch resource "pods" in API group "" in the namespace "default"
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.RealCall$AsyncCall.execute(RealCall.java:206) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_302]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_302]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_302]
2021-08-14 18:33:02,585 INFO  org.apache.flink.kubernetes.kubeclient.resources.KubernetesPodsWatcher [] - The watcher is closing.
2021-08-14 18:33:02,592 INFO  org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Closing the slot manager.
Exception in thread "OkHttp Dispatcher" java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@b328667 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@31982176[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
    at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
    at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
    at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632)
    at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:678)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.scheduleReconnect(WatchConnectionManager.java:305)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$800(WatchConnectionManager.java:50)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:218)
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198)
    at org.apache.flink.kubernetes.shaded.okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2021-08-14 18:33:02,624 ERROR org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - Fatal error occurred in ResourceManager.
org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Could not start the ResourceManager akka.tcp://[email protected]:6123/user/rpc/resourcemanager_0
    at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:239) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStart(RpcEndpoint.java:181) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState.start(AkkaRpcActor.java:605) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:180) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.Actor.aroundReceive(Actor.scala:517) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.Actor.aroundReceive$(Actor.scala:515) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.12-1.13.1.jar:1.13.1]
Caused by: org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Cannot initialize resource provider.
    at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:156) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:251) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:235) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    ... 22 more
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden: User "system:serviceaccount:default:default" cannot watch resource "pods" in API group "" in the namespace "default"
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:203) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.RealCall$AsyncCall.execute(RealCall.java:206) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_302]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_302]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_302]
    Suppressed: java.lang.Throwable: waiting here
        at io.fabric8.kubernetes.client.utils.Utils.waitUntilReady(Utils.java:144) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.waitUntilReady(WatchConnectionManager.java:341) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:755) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:739) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:70) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.watchPodsAndDoCallback(Fabric8FlinkKubeClient.java:227) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.kubernetes.KubernetesResourceManagerDriver.watchTaskManagerPods(KubernetesResourceManagerDriver.java:331) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.kubernetes.KubernetesResourceManagerDriver.initializeInternal(KubernetesResourceManagerDriver.java:103) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.resourcemanager.active.AbstractResourceManagerDriver.initialize(AbstractResourceManagerDriver.java:81) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:154) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:251) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:235) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStart(RpcEndpoint.java:181) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState.start(AkkaRpcActor.java:605) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:180) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.Actor.aroundReceive(Actor.scala:517) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.Actor.aroundReceive$(Actor.scala:515) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.12-1.13.1.jar:1.13.1]
2021-08-14 18:33:02,773 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Fatal error occurred in the cluster entrypoint.
org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Could not start the ResourceManager akka.tcp://[email protected]:6123/user/rpc/resourcemanager_0
    at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:239) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStart(RpcEndpoint.java:181) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState.start(AkkaRpcActor.java:605) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:180) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.Actor.aroundReceive(Actor.scala:517) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.Actor.aroundReceive$(Actor.scala:515) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.12-1.13.1.jar:1.13.1]
    at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.12-1.13.1.jar:1.13.1]
Caused by: org.apache.flink.runtime.resourcemanager.exceptions.ResourceManagerException: Cannot initialize resource provider.
    at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:156) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:251) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:235) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    ... 22 more
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: pods is forbidden: User "system:serviceaccount:default:default" cannot watch resource "pods" in API group "" in the namespace "default"
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:203) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.RealCall$AsyncCall.execute(RealCall.java:206) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at org.apache.flink.kubernetes.shaded.okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_302]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_302]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_302]
    Suppressed: java.lang.Throwable: waiting here
        at io.fabric8.kubernetes.client.utils.Utils.waitUntilReady(Utils.java:144) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.waitUntilReady(WatchConnectionManager.java:341) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:755) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:739) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at io.fabric8.kubernetes.client.dsl.base.BaseOperation.watch(BaseOperation.java:70) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.watchPodsAndDoCallback(Fabric8FlinkKubeClient.java:227) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.kubernetes.KubernetesResourceManagerDriver.watchTaskManagerPods(KubernetesResourceManagerDriver.java:331) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.kubernetes.KubernetesResourceManagerDriver.initializeInternal(KubernetesResourceManagerDriver.java:103) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.resourcemanager.active.AbstractResourceManagerDriver.initialize(AbstractResourceManagerDriver.java:81) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:154) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:251) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:235) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStart(RpcEndpoint.java:181) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StoppedState.start(AkkaRpcActor.java:605) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:180) ~[flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.Actor.aroundReceive(Actor.scala:517) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.Actor.aroundReceive$(Actor.scala:515) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.12-1.13.1.jar:1.13.1]
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.12-1.13.1.jar:1.13.1]
2021-08-14 18:33:02,838 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Shutting KubernetesSessionClusterEntrypoint down with application status UNKNOWN. Diagnostics Cluster entrypoint has been closed externally..
2021-08-14 18:33:02,876 INFO  org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint   [] - Shutting down rest endpoint.

The pod then keeps restarting.

1 Answer

After being stuck on this for a long time, I finally got it working. I hope this saves future readers some time.

The RBAC section of the tutorial says:

Every namespace has a default service account. However, the default service account may not have the permission to create or delete pods within the Kubernetes cluster. Users may need to update the permission of the default service account or specify another service account that has the right role bound.

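That matches the error above: the default service account (system:serviceaccount:default:default) is not allowed to watch pods. Here is how to create another service account and bind it to the edit cluster role: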

kubectl create serviceaccount flink-service-account
kubectl create clusterrolebinding flink-role-binding-flink --clusterrole=edit --serviceaccount=default:flink-service-account
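
To confirm that the binding took effect before starting the session, you can ask the API server directly with kubectl auth can-i. The following is only a minimal sketch: the helper names sa_user and can_sa are made up for illustration, and it assumes kubectl is installed and pointed at the same cluster, with both accounts in the default namespace as in the commands above.

```shell
#!/bin/sh
# Build the user name Kubernetes RBAC uses for a service account.
# (sa_user is a hypothetical helper, not part of kubectl or Flink.)
sa_user() {
    # $1 = namespace, $2 = service account name
    printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether a service account may perform a verb on pods.
# (can_sa is a hypothetical helper; it needs kubectl and a reachable cluster.)
can_sa() {
    # $1 = namespace, $2 = service account name, $3 = verb
    kubectl auth can-i "$3" pods --namespace "$1" --as "$(sa_user "$1" "$2")"
}

# Run the checks only when kubectl is installed; a denied check is not fatal.
if command -v kubectl >/dev/null 2>&1; then
    can_sa default default watch || true
    can_sa default flink-service-account watch || true
fi
```

On a cluster like the one in the question, the first check would be expected to report that the default account is denied, and the second should report that flink-service-account is allowed once the role binding exists.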

After creating the service account, pass one more option, kubernetes.jobmanager.service-account, to the command that starts the session:

./bin/kubernetes-session.sh \
    -Dkubernetes.cluster-id=my-first-flink-cluster \
    -Dkubernetes.jobmanager.service-account=flink-service-account

All configuration options are listed at https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#kubernetes

Now the session can be successfully started!

1 Comment

This works in minikube too, great!