I am working with the ODIR-5K (Ocular Disease Intelligent Recognition) dataset. The goal is multi-label classification of 8 ocular diseases (Normal, Diabetes, Glaucoma, Cataract, etc.).
The Data Structure Problem
The dataset provides two images per patient (Left Eye and Right Eye) but only one set of patient-level labels for the diseases. It does, however, include two additional columns, one per eye, containing that eye's diagnostic keywords.
Approach 1: Single-Input (Splitting the Data)
I restructure the dataframe to treat every image as an independent sample.
Concern: This introduces "Label Noise." If a patient has a Cataract only in the Left eye, splitting the data forces the model to treat the healthy Right eye as "Cataract Positive."
I trained a CNN on this split data, but it gave poor results.
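For reference, the restructuring in Approach 1 can be sketched with pandas. This is a minimal sketch using a toy dataframe; the column names (Left-Fundus, Right-Fundus, and one-hot disease columns) are assumptions modeled on the ODIR-5K annotation sheet and may need adjusting:

```python
import pandas as pd

# Toy frame standing in for the ODIR-5K sheet; column names are assumed.
labels = ["N", "D", "G", "C"]  # subset of the 8 disease columns
df = pd.DataFrame({
    "Left-Fundus":  ["1_left.jpg", "2_left.jpg"],
    "Right-Fundus": ["1_right.jpg", "2_right.jpg"],
    "N": [0, 1], "D": [1, 0], "G": [0, 0], "C": [1, 0],
})

# One row per image; the patient-level labels are copied to both eyes,
# which is exactly the "label noise" concern described above.
left = df[["Left-Fundus"] + labels].rename(columns={"Left-Fundus": "image"})
right = df[["Right-Fundus"] + labels].rename(columns={"Right-Fundus": "image"})
per_image = pd.concat([left, right], ignore_index=True)

print(len(per_image))  # twice the number of patients
```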
Approach 2: Dual-Input
I keep the patient grouped and feed both eyes simultaneously.
Inputs:
[Left_Input, Right_Input]
Architecture: Two CNN branches (sharing weights) → Concatenate → Dense Layers → Output.
Concern: Higher computational cost and a more complex data-generation pipeline.
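The dual-input architecture described above can be sketched in Keras. This is a minimal sketch, not my actual model; the backbone, input size, and layer sizes are placeholder assumptions. Weight sharing comes from calling the same backbone instance on both inputs:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

IMG = (224, 224, 3)  # assumed input size
N_CLASSES = 8

# One shared backbone; applying the same instance to both eyes shares weights.
backbone = tf.keras.Sequential([
    layers.Input(shape=IMG),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
])

left_in = layers.Input(shape=IMG, name="left_eye")
right_in = layers.Input(shape=IMG, name="right_eye")
merged = layers.Concatenate()([backbone(left_in), backbone(right_in)])
x = layers.Dense(128, activation="relu")(merged)
# Sigmoid (not softmax) + binary cross-entropy for multi-label output.
out = layers.Dense(N_CLASSES, activation="sigmoid")(x)

model = Model([left_in, right_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
```

The key design choice is sigmoid with binary cross-entropy, since a patient can have several diseases at once.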
Approach 3: Using the diagnostic keywords to derive per-image disease labels.
Concern: not every disease is covered by the keywords, and they are hard to parse reliably because of subtle wording differences between entries.
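Approach 3 amounts to a keyword-to-label mapping per eye. Below is a minimal sketch with a hypothetical (far from complete) mapping; the second example shows the exact fragility mentioned above, where a wording variant slips past a naive substring match:

```python
# Hypothetical keyword→label mapping; the real ODIR keyword strings are
# far more varied, which is the concern raised above.
KEYWORD_TO_LABEL = {
    "cataract": "C",
    "glaucoma": "G",
    "diabetic retinopathy": "D",
    "normal fundus": "N",
}

def labels_from_keywords(keyword_field: str) -> set:
    """Map one eye's diagnostic-keyword string to per-eye disease labels."""
    text = keyword_field.lower()
    return {label for phrase, label in KEYWORD_TO_LABEL.items()
            if phrase in text}

print(labels_from_keywords("Cataract"))  # {'C'}
# A wording variant misses the "diabetic retinopathy" phrase entirely:
print(labels_from_keywords("moderate non proliferative retinopathy"))  # set()
```

Any eye whose keywords map to nothing would need manual review, which is why this approach is labor-intensive on its own.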
My Questions:
What is the standard practice for handling "Patient-Level Labels" with "Organ-Level Images"?
Which approach should I use? Are there other options or tricks I should consider?
Also, if I train with Approach 2, how should I augment the data?
