1 vote
1 answer
28 views

On modern GPUs: Do threads still execute in lockstep? Or what is independent thread scheduling all about?

My mental model of how a GPU works is still that which is described by NVIDIA's article Life of a triangle - NVIDIA's logical pipeline. I.e., a warp scheduler sets one instruction, and all the 32 ...
j00hi's user avatar
  • 6,009
-1 votes
0 answers
27 views

Inconsistent GPU Usage in Colab Pro — Same Code, Different Performance [closed]

I have purchased Google Colab Pro and I’m using it to run a notebook based on code from Kaggle. I enabled the GPU runtime (Runtime > Change runtime type > A100 GPU) and confirmed that the GPU ...
Fatemeh Khamoushi's user avatar
0 votes
0 answers
23 views

Error Running OCR with Qwen2.5-VL in Colab

I am trying to run the OCR functionality of Qwen2.5-VL by following the tutorial provided in this notebook: OCR Tutorial Notebook. However, I am encountering an error when attempting to execute the ...
JS3's user avatar
  • 1,879
0 votes
0 answers
24 views

How can I make GStreamer WPE use the GPU inside a Docker container on AWS ECS? [closed]

I’m running a GStreamer pipeline inside a Docker container on AWS ECS. The pipeline overlays a web page onto a video stream using WPE. However, WPE rendering in software mode is extremely slow, so I ...
Vlad.Z's user avatar
  • 170
-4 votes
0 answers
33 views

I am stuck on Ubuntu: whenever I try to run on a rented GPU, I get a "GPU not found" error [closed]

Using Theano with GPU on Ubuntu 14.04 on AWS g2: I'm having trouble getting Theano to use the GPU on my machine. When I run: /usr/local/lib/python2.7/dist-packages/theano/misc$ THEANO_FLAGS=floatX=...
muaz's user avatar
  • 1
-1 votes
0 answers
27 views

How can I make D3D9 recognize and use both GPUs?

Problem Description: I have two graphics cards installed (including an NVIDIA Quadro M4000), but IDirect3D9::GetAdapterCount() only returns 1. How can I make my Direct3D9 application recognize both ...
Hui Mao's user avatar
  • 11
3 votes
0 answers
31 views

Can older spaCy models be ported to future spaCy versions?

The latest spaCy versions have better performance and compatibility for GPU acceleration on Apple devices, but I have an existing project that depends on spaCy 3.1.4 and some of the specific behavior ...
synchronizer's user avatar
  • 2,105
-3 votes
0 answers
59 views

Are there any command options for filtering events in nsys?

I need to trace how the GPU accesses memory addresses at runtime. With nsys, I collected data using the command below: nsys profile --export=json --cuda-um-gpu-page-faults=true ./run middle of the content ...
kdh's user avatar
  • 130
0 votes
1 answer
70 views

How to get the frame buffer of current Activity's window surface on GPU side without copy?

I need to get the image of the current activity surface at 30-60 FPS and use it as the source texture in OpenGL ES for rendering. I found that the PixelCopy API can copy the window surface to a bitmap, but I ...
dragonfly's user avatar
  • 1,219
1 vote
1 answer
151 views

Why does mlx.core.sqrt() crash on my MacBook Air M2 when applied to a complex argument?

mlx.core.sqrt() is crashing on my MacBook Air M2 when applied to a complex argument: Python 3.11.11 (main, Dec 3 2024, 17:20:40) [Clang 16.0.0 (clang-1600.0.26.4)] on darwin Type "help", ...
David Banas's user avatar
  • 1,958
-1 votes
0 answers
124 views

Migrating Drawing Operations from CPU to GPU in MFC [closed]

In an MFC application, drawing is currently implemented for elements such as rectangles using the CDC object. This means the CPU processes the drawing. Is it possible to move the drawing operations to ...
SDR's user avatar
  • 57
0 votes
0 answers
25 views

The k8s pod can obtain more gpu memory than volcano specifies

The Volcano queue specification: apiVersion: scheduling.volcano.sh/v1beta1 kind: Queue metadata: name: model-deploy spec: weight: 1 capability: cpu: 10 memory: 20Gi volcano.sh/vgpu-...
张纳心's user avatar
0 votes
1 answer
29 views

Does pyopencl transfer arrays to host memory implicitly?

I have an AMD GPU. I'm using pyopencl. I have a context and a queue. Then I created an array: import pyopencl import pyopencl.array ctx = pyopencl.create_some_context(interactive=False) queue = pyopencl....
haael's user avatar
  • 1,059
0 votes
0 answers
37 views

Can olmocr Run on Two 12 GB Titan X GPUs?

I’m trying to run olmocr (https://github.com/allenai/olmocr) locally, which requires a GPU with 20 GB RAM. I have two Titan X GPUs (12 GB each). When I run it, I get: ERROR:olmocr.check:Torch was not ...
Dandelion's user avatar
  • 756
1 vote
1 answer
64 views

Reproducibility of JAX calculations

I am using JAX to run Reinforcement Learning (RL) and Multi-Agent Reinforcement Learning (MARL) calculations. I have noticed the following behaviour: In RL, my results are always fully ...
amavrits's user avatar
