9,174 questions
1 vote · 1 answer · 28 views
On modern GPUs: Do threads still execute in lockstep? Or what is independent thread scheduling all about?
My mental model of how a GPU works is still the one described in NVIDIA's article Life of a triangle - NVIDIA's logical pipeline. I.e., a warp scheduler selects one instruction, and all 32 ...
-1 votes · 0 answers · 27 views
Inconsistent GPU Usage in Colab Pro — Same Code, Different Performance [closed]
I have purchased Google Colab Pro and I’m using it to run a notebook based on code from Kaggle. I enabled the GPU runtime (Runtime > Change runtime type > A100 GPU) and confirmed that the GPU ...
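A quick way to pin down this kind of variation is to log which accelerator the runtime actually attached in each session. A minimal sketch, assuming PyTorch is available in the notebook:

```python
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Colab can hand out different cards between sessions (A100, V100, T4, ...)
    props = torch.cuda.get_device_properties(0)
    print("Device:", torch.cuda.get_device_name(0))
    print("Memory (GB):", props.total_memory / 1024**3)
```

Logging this at the top of the notebook makes it easy to correlate slow runs with the GPU (or lack of one) that was actually assigned.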
0 votes · 0 answers · 23 views
Error Running OCR with Qwen2.5-VL in Colab
I am trying to run the OCR functionality of Qwen2.5-VL by following the tutorial provided in this notebook: OCR Tutorial Notebook
However, I am encountering an error when attempting to execute the ...
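Since the excerpt does not show the actual error, the following is only a hedged, minimal loading sketch of the usual transformers-based setup for Qwen2.5-VL; the checkpoint id and dtype choice are illustrative, not taken from the notebook:

```python
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "Qwen/Qwen2.5-VL-7B-Instruct"  # illustrative checkpoint
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)
print(model.device)  # confirms whether the model actually landed on the GPU
```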
0 votes · 0 answers · 24 views
How can I make GStreamer WPE use the GPU inside a Docker container on AWS ECS? [closed]
I’m running a GStreamer pipeline inside a Docker container on AWS ECS. The pipeline overlays a web page onto a video stream using WPE. However, WPE rendering in software mode is extremely slow, so I ...
-4 votes · 0 answers · 33 views
Stuck on Ubuntu: whenever I try to run a rented GPU, I get a "GPU not found" error [closed]
Using Theano with GPU on Ubuntu 14.04 on AWS g2
I'm having trouble getting Theano to use the GPU on my machine. When I run: /usr/local/lib/python2.7/dist-packages/theano/misc$ THEANO_FLAGS=floatX=...
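For reference, the old Theano documentation shipped a small script for checking whether the GPU is actually picked up. A trimmed sketch (flag values such as device=gpu,floatX=float32 are assumed, since the excerpt is cut off):

```python
# Run with: THEANO_FLAGS=device=gpu,floatX=float32 python check_gpu.py
import numpy
from theano import function, config, shared, tensor

# Build a tiny graph and inspect which backend the compiled ops target.
x = shared(numpy.asarray(numpy.random.rand(1000000), config.floatX))
f = function([], tensor.exp(x))
f()
print("device:", config.device)
# If every op in the graph is a plain CPU Elemwise, the GPU is not being used.
print(f.maker.fgraph.toposort())
```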
-1 votes · 0 answers · 27 views
How can I make D3D9 recognize and use both GPUs?
Problem Description
I have two graphics cards installed (including an NVIDIA Quadro M4000), but IDirect3D9::GetAdapterCount() only returns 1. How can I make my Direct3D9 application recognize both ...
3 votes · 0 answers · 31 views
Can older spaCy models be ported to future spaCy versions?
The latest spaCy versions have better performance and compatibility for GPU acceleration on Apple devices, but I have an existing project that depends on spaCy 3.1.4 and some of the specific behavior ...
-3 votes · 0 answers · 59 views
Are there any command options for filtering the events in nsys?
I need to trace how the GPU accesses memory addresses at runtime.
With nsys, I collected data from command below:
nsys profile --export=json --cuda-um-gpu-page-faults=true ./run
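One option, if collection-time filtering does not do what you want, is to post-filter the exported JSON. The sketch below assumes the export is a single JSON array, and the "Type" field name is hypothetical, so inspect a few records first to see what they actually look like:

```python
import json

with open("report.json") as f:        # path to the nsys JSON export
    events = json.load(f)

# Keep only records that look like UM page-fault events (field name is a guess).
faults = [e for e in events if "page fault" in str(e.get("Type", "")).lower()]
print(len(faults), "candidate events")
```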
0 votes · 1 answer · 70 views
How to get the frame buffer of current Activity's window surface on GPU side without copy?
I need to get the image of the current activity's surface at 30-60 FPS and use it as the source texture in OpenGL ES for rendering.
I found the API PixelCopy can copy the window surface to bitmap, but I ...
1 vote · 1 answer · 151 views
Why does mlx.core.sqrt() crash on my MacBook Air M2 when applied to a complex argument?
mlx.core.sqrt() is crashing on my MacBook Air M2 when applied to a complex argument:
Python 3.11.11 (main, Dec 3 2024, 17:20:40) [Clang 16.0.0 (clang-1600.0.26.4)] on darwin
Type "help", &...
-1 votes · 0 answers · 124 views
Migrating Drawing Operations from CPU to GPU in MFC [closed]
In an MFC application, drawing is currently implemented for elements such as rectangles using the CDC object. This means the CPU processes the drawing. Is it possible to move the drawing operations to ...
0 votes · 0 answers · 25 views
The k8s pod can obtain more GPU memory than Volcano specifies
The spec of the Volcano queue:
apiVersion: scheduling.volcano.sh/v1beta1
kind: Queue
metadata:
  name: model-deploy
spec:
  weight: 1
  capability:
    cpu: 10
    memory: 20Gi
    volcano.sh/vgpu-...
0 votes · 1 answer · 29 views
Does pyopencl transfer arrays to host memory implicitly?
I have an AMD GPU. I'm using pyopencl. I have a context and a queue. Then I created an array:
import pyopencl
import pyopencl.array
ctx = pyopencl.create_some_context(interactive=False)
queue = pyopencl....
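To the implicit-transfer question: as I understand pyopencl, an Array's data stays in device memory until you pull it back, e.g. with .get(), though printing or converting to NumPy triggers that copy for you. A small sketch:

```python
import numpy as np
import pyopencl as cl
import pyopencl.array

ctx = cl.create_some_context(interactive=False)
queue = cl.CommandQueue(ctx)

host_a = np.arange(10, dtype=np.float32)
dev_a = cl.array.to_device(queue, host_a)   # explicit host -> device copy
dev_b = dev_a * 2                           # computed on the device, no transfer
host_b = dev_b.get()                        # explicit device -> host copy
print(host_b)
```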
0 votes · 0 answers · 37 views
Can olmocr Run on Two 12 GB Titan X GPUs?
I’m trying to run olmocr (https://github.com/allenai/olmocr) locally, which requires a GPU with 20 GB RAM. I have two Titan X GPUs (12 GB each). When I run it, I get:
ERROR:olmocr.check:Torch was not ...
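The check likely looks at a single device's memory, and two 12 GB cards do not pool into one 24 GB device. A quick sketch to see what each GPU reports, assuming PyTorch is installed:

```python
import torch

required_gb = 20
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_gb = props.total_memory / 1024**3
    ok = "meets" if total_gb >= required_gb else "does not meet"
    print(f"GPU {i}: {props.name}, {total_gb:.1f} GB -> {ok} the 20 GB requirement")
```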
1 vote · 1 answer · 64 views
Reproducibility of JAX calculations
I am using JAX in running Reinforcement Learning (RL) & Multi-Agent Reinforcement Learning (MARL) calculations. I have noticed the following behaviour:
In RL, my results are always fully ...
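Part of this usually comes down to how PRNG keys are handled: JAX random streams are reproducible when keys are seeded and split explicitly, while GPU reductions can still introduce run-to-run noise. A minimal sketch of the key-threading pattern:

```python
import jax

key = jax.random.PRNGKey(0)            # fixed seed
key, subkey = jax.random.split(key)     # derive a fresh key per use
sample = jax.random.normal(subkey, (3,))
print(sample)                           # identical across runs with the same seed
```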