I'm attempting to modify the TensorFlow Lite Audio Classification Android Demo to run YAMNet.tflite and speech.tflite simultaneously. My goal is for the app to react when YAMNet detects speech and the speech model detects an up, down, left, or right command. However, the two models have different input tensor shapes — (1, 16000) for YAMNet and (1, 44032) for speech — and in my current setup only the classifier declared first actually produces predictions.
Is there a way to modify this sample code so that both models, despite their different input tensors, run at the same time? Or is there a better approach to achieve this?
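For reference, here is roughly what I'm attempting, written against the Task Library `AudioClassifier` API that the demo uses. This is a simplified sketch, not my full code; the model file names, the "Speech" label, and the 0.5 threshold are just illustrative:

```kotlin
import android.content.Context
import org.tensorflow.lite.task.audio.classifier.AudioClassifier

fun classifyOnce(context: Context) {
    // Each model gets its own classifier and its own input TensorAudio,
    // so the differing input shapes (1x16000 vs 1x44032) never share a buffer.
    val yamnet = AudioClassifier.createFromFile(context, "YAMNet.tflite")
    val speech = AudioClassifier.createFromFile(context, "speech.tflite")

    val yamnetInput = yamnet.createInputTensorAudio()
    val speechInput = speech.createInputTensorAudio()

    // Separate AudioRecord per classifier, since their required
    // audio formats (sample count, etc.) differ.
    val yamnetRecord = yamnet.createAudioRecord().apply { startRecording() }
    val speechRecord = speech.createAudioRecord().apply { startRecording() }

    yamnetInput.load(yamnetRecord)
    speechInput.load(speechRecord)

    // Only look for a command if YAMNet thinks it heard speech.
    val speechDetected = yamnet.classify(yamnetInput)
        .flatMap { it.categories }
        .any { it.label == "Speech" && it.score > 0.5f }

    if (speechDetected) {
        val topCommand = speech.classify(speechInput)
            .flatMap { it.categories }
            .maxByOrNull { it.score }
        // React here when topCommand?.label is "up", "down", "left", or "right".
    }

    yamnetRecord.stop()
    speechRecord.stop()
}
```

In the actual demo the classification runs repeatedly on a background handler rather than once, but this shows the structure I'm aiming for: two independent classifiers, each with its own input tensor and recorder.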