Newest 'computer-vision' Questions - Data Science Stack Exchange

0 votes

0 answers

18 views

Does object aspect ratio affect our resize policy?

Is object aspect ratio truly important for resize robustness, or is this suggestion based on a misunderstanding — e.g., treating a very wide object as if it is “bigger” or “pixel-richer” than a square ...

vinvin

1

asked Dec 9, 2025 at 4:53

0 votes

1 answer

305 views

YOLO knowledge distillation (11x to 11n) yields poorer performance than native training

I'm trying to distill a YOLO11x detection model into a YOLO11n for inference speed improvements without sacrificing too much detection performance. For this, I just overloaded some functions in the ...

Simon Hergott

1

asked Jul 11, 2025 at 11:30

3 votes

1 answer

62 views

Can I train what background is in a picture?

I have ~1,000 pictures like this. I really want the long, thin rock core in the middle, but they all differ (slightly) in angle, length, and the shape of their ends, and the rock colours vary. I tried ...

user24007

133

asked Jul 9, 2025 at 3:10

2 votes

0 answers

68 views

DenseNet169 model accuracy not increasing on medical classification dataset

I am training a DenseNet model on a medical dataset with gold-standard annotations. After training, I noticed the accuracy is just 60%. I later made the following changes, but still no luck. ...

NIrbhay Mathur

123

asked May 22, 2025 at 4:15

2 votes

0 answers

63 views

Human detection using a thermal imaging camera and machine learning on Raspberry Pi

I'm working on a Raspberry Pi 4–based project involving the MLX90640 thermal camera breakout. The camera outputs a thermal heat map (a low-resolution infrared image of 32x24 pixels). My goal is to ...

Zak A

21

asked May 18, 2025 at 10:52

5 votes

1 answer

135 views

How to normalize bounding box sizes in perspective transform for objects at different distances from the camera

I’m working on an object detection system and I'm new to this field. Here I'm talking with respect to the camera's point of view. When an object that is far from the camera is detected, it appears small and ...

Basavaraj Kittali

51

asked May 14, 2025 at 12:42

1 vote

1 answer

82 views

Need support to straighten and crop images properly for a computer vision requirement

My requirement: I need to extract license plates without duplicates and store the images in a folder, then apply OCR to extract text from the images. What I have achieved: I am able to detect license plates ...

Raj

11

asked May 6, 2025 at 12:45

0 votes

0 answers

43 views

How to properly implement and debug RPN anchors in ResNet-18 for multi-object detection?

I am working on my first object detection project and need to implement multi-object detection using ResNet-18 (I am restricted to using this architecture). My dataset follows the COCO format and ...

Daniel

11

asked Mar 17, 2025 at 10:50

0 votes

0 answers

39 views

Validation metrics plateau from the first few epochs at relatively good values and don't improve

I am working on 6D pose tracking, where the goal is to estimate how the 3D position and orientation of an object change from frame t-1 to t. Train/validation datasets are synthetic and come from a single ...

zak

81

asked Mar 11, 2025 at 12:48

0 votes

0 answers

52 views

How can I convert a one-line substation schema image into XML/JSON with all components and connections preserved?

I have an image of a one-line substation schema diagram that includes various components (like transformers, circuit breakers, etc.) and the connections between them. I’m looking for a way to convert ...

Necrosis

9

asked Mar 9, 2025 at 10:51

1 vote

0 answers

29 views

Persistent 6D Rotation Representation Collapse to near-zero magnitudes in sequential camera rotation estimation

I am using a 6D continuous rotation representation (e.g., two orthogonal vectors from a 3×3 rotation matrix) to predict camera rotations in panoramic video sequences. Since panoramic videos involve ...

yep123

111

asked Mar 6, 2025 at 1:48

1 vote

0 answers

45 views

CNN for gaze regression predicts near the mean

I am currently building my first CNN on my own for a regression task: the network must predict the coordinates I am looking at on my screen from an input image taken through my ...

bebel

175

asked Mar 1, 2025 at 13:24

0 votes

1 answer

41 views

Looking for an image dataset with multiple images per instance

I'm looking for an image dataset that has multiple images per instance. For example, a healthcare dataset where each person is classified with a diagnosis and has several images describing them.

J. Doe

101

asked Feb 6, 2025 at 19:24

1 vote

0 answers

38 views

How to make a correct prompt for "gpt-4o" vision API to find letters in an image?

I have an example of a generated image containing words, as well as several red arrows pointing to certain characters. I need to get these characters from GPT, but when I ask "what characters do ...

user175111

11

asked Dec 17, 2024 at 21:13

2 votes

0 answers

60 views

How to deal with a large amount of unlabeled data in an object detection dataset

I have a large multi-class object detection image dataset. The goal is to train a YOLO (v11) model on this dataset to solve the object detection task. My intuition says that the ...

Ramiro Hum-Sah

143

asked Nov 19, 2024 at 0:24
