Voice assistance

Hi,
I have the crazy idea of building a battery-powered device with offline voice assistance. Most of the time it should sit in idle mode, where a voice activity detector combined with a wake word detector waits for the wake word and then activates the general speech recognition. In idle mode, my device shouldn’t consume more than 1-2 W.
Your AICore SG2300x may have enough computational power. However, it has no GPU for a display, and the audio has to be recorded somewhere. Is it somehow possible to extend the SG2300x with a GPU? Another idea would be to use one of the smaller Rock Pi boards to transfer the audio data to the SG2300x and to power it up and down. While it is powered down, the language model(s) should be kept in RAM. What hardware composition do you recommend? Is there something in development on your side that I should wait for? What would the idle power consumption be for your suggested composition?

Kind regards,
Andreas

I think the better idea is to use the Airbox as your LLM server together with a battery-powered wireless MCU such as an ESP32. The ESP32 streams the voice audio to the Airbox, the Airbox sends back the response as text, and you then use TTS to output the audio on the ESP32.
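
To make the idle/active split concrete, here is a minimal sketch of the gating logic such a front end could run. It only forwards audio after a cheap energy check and a wake word check have both passed. Everything in it is an assumption for illustration: the names `rms`, `gate_frames`, and `is_wake_word`, the threshold values, and the frame format (lists of 16-bit PCM samples) are hypothetical, and a real ESP32 would run equivalent logic in C or MicroPython with an actual wake word model instead of the placeholder callback.

```python
import math
from enum import Enum
from typing import Callable, Iterable, List, Sequence


class State(Enum):
    IDLE = "idle"            # only the cheap detectors run
    STREAMING = "streaming"  # audio is forwarded to the speech server


def rms(frame: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def gate_frames(
    frames: Iterable[Sequence[int]],
    is_wake_word: Callable[[Sequence[int]], bool],  # hypothetical detector callback
    energy_threshold: float = 500.0,                # assumed value, tune per microphone
    hangover_frames: int = 3,                       # quiet frames before returning to idle
) -> List[Sequence[int]]:
    """Return the frames that would be streamed to the server."""
    state = State.IDLE
    quiet = 0
    forwarded: List[Sequence[int]] = []
    for frame in frames:
        loud = rms(frame) >= energy_threshold
        if state is State.IDLE:
            # Energy check first, so the wake word detector only runs on loud frames.
            if loud and is_wake_word(frame):
                state = State.STREAMING
                quiet = 0
        else:
            forwarded.append(frame)
            quiet = 0 if loud else quiet + 1
            if quiet >= hangover_frames:
                state = State.IDLE
    return forwarded
```

The frame that contains the wake word is not forwarded in this sketch. Only the audio that follows it is sent, up to and including the trailing quiet frames that end the utterance.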

Many thanks for your answer. You seem to prefer the two-computer solution. But in my single portable device, both computers would be battery powered. Do the AICore or the Airbox have a sleep mode, and can they be woken up via a wake-up pin? If so, what is the power consumption in that state?

Kind regards,
Andreas