In this tutorial you will learn how to build your own AI Language Tutor on a UNIHIKER M10 using OpenAI’s AI models. You will need an OpenAI account and an API key for that.
The UNIHIKER M10 is a small board with 512 MB RAM, 16 GB eMMC storage, and a 2.8-inch touchscreen. Apart from Wi-Fi and Bluetooth connectivity, the board includes built-in sensors such as a light sensor, accelerometer, gyroscope, and, most importantly, a microphone.
It runs Debian Linux, supports Python programming and comes with many Python libraries pre-installed. This makes it easy to implement AI solutions such as an AI Language Tutor. The following short video clip demonstrates the AI Language Tutor we are going to build.
You may need to increase the volume on your computer to hear my voice and the response of the AI.
Required Parts
For this tutorial you need a UNIHIKER M10 board. You can get it at DFRobot or at Amazon. If you have a USB speaker, you can use that and connect it to the UNIHIKER M10. But if you want to add your own small speakers, you will need a PAM8403 amplifier and one or two of the speakers listed below.

UNIHIKER M10

PAM8403 Amplifier

2 x Speaker 3 Watt 4 Ohm
Makerguides is a participant in affiliate advertising programs designed to provide a means for sites to earn advertising fees by linking to Amazon, AliExpress, Elecrow, and other sites. As an Affiliate we may earn from qualifying purchases.
Hardware of the UNIHIKER M10
The UNIHIKER M10 is a compact single-board computer for education, prototyping, and AIoT applications. It comes with a Linux-based system, sensors, and hardware interfaces on one board.
The device is based on a quad-core Arm Cortex-A35 processor running up to 1.2 GHz. It includes 512 MB RAM and 16 GB eMMC storage and runs the Debian Linux operating system.

The board integrates a 2.8-inch touchscreen with 240 × 320 pixel resolution, Wi-Fi, and Bluetooth connectivity. Built-in sensors include a light sensor, accelerometer, gyroscope, and microphone. Hardware expansion is available through USB ports, Gravity sensor connectors, and a micro:bit-compatible edge connector that exposes GPIO, I2C, SPI, and UART interfaces.
For more technical details see our Voice Assistant on UNIHIKER M10 with OpenAI tutorial. That tutorial also shows you how to program the UNIHIKER, how to install Python libraries, and how to configure the Wi-Fi. You will need all of that to run the AI Tutor we build in this tutorial.
Connecting Loudspeakers to UNIHIKER M10
Our AI Tutor will be able to generate speech, and for that we need speakers. You can connect speakers to the UNIHIKER M10 via USB or Bluetooth. This is the easiest solution, and if you opt for it, you can skip this section.
I wanted to use smaller speakers to make my AI Tutor portable. The UNIHIKER M10 has a lineout output but unfortunately no connector for it. You need to solder wires to specific pads on the back of the board, and you also need a small amplifier to drive the speakers. I got the wiring from the AI Assistant with OpenAI GPT, Azure Speech API and UNIHIKER post.
PAM8403 Amplifier
I used a PAM8403 module as the amplifier. The PAM8403 module is a small Class-D stereo audio amplifier based on the PAM8403 chip. It accepts a supply voltage from 2.5 V up to 5.5 V. At 5 V, driving 4 Ω speakers, each channel can produce up to about 3 W of output power. The following picture shows the pinout of the PAM8403 amplifier module:

For more information about the PAM8403 amplifier module see our Audio with PAM8403, PCM5102 and ESP32 tutorial.
Lineout Output of UNIHIKER M10
As mentioned, there is no connector for the lineout output on the UNIHIKER M10 but you can access it via (test) pads on the back of the board (schematics). The picture below shows the location of the Lineout and power supply pads that we need to connect the PAM8403 amplifier to:

I measured 4.7 V on the VCC pad, which is strange, since I expected 3.3 V or 5 V, but the PAM8403 works fine with 4.7 V. Also, my board had only one pad for VCC, while the photo above shows two pads. Everything worked fine nevertheless.
Connecting Amplifier and Loudspeakers to UNIHIKER M10
In this section I will show you how to connect the PAM8403 amplifier and the loudspeakers to the UNIHIKER M10. The small 3 W speakers often come with plugs you will need to cut off, since we solder the speaker wires directly to the PAM8403 module:

The following picture shows you how to wire the UNIHIKER M10 to the PAM8403 amplifier and the speakers:

Soldering the wires to the pads of the UNIHIKER M10 is easy, but make sure you connect to the correct pads and don't damage anything in the process. The photo below shows my completed wiring:

Note that you don’t need two speakers, since stereo sound is not really required. But two speakers will be louder than one speaker. If you want really loud sound, use a USB speaker with an external power supply.
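Once the speakers are wired up, it is worth verifying the audio path before running the full tutor. The following minimal test sketch is my own snippet, not part of the tutorial code; it assumes pygame and numpy are pre-installed and that the lineout is the default audio output. It plays a one-second 440 Hz tone:

import numpy as np
import pygame
import pygame.sndarray

SAMPLE_RATE = 44100

# Initialize the mixer for mono, signed 16-bit audio
pygame.mixer.init(frequency=SAMPLE_RATE, size=-16, channels=1)

# Generate one second of a 440 Hz sine wave at half volume
t = np.linspace(0, 1.0, SAMPLE_RATE, endpoint=False)
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

# Play the tone and wait until it has finished
sound = pygame.sndarray.make_sound(tone)
sound.play()
pygame.time.wait(1000)
pygame.mixer.quit()

If you hear the tone on both speakers, the wiring and the amplifier are working.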
User Manual for AI Language Tutor
In this section I will quickly explain how the user interface for the AI Language Tutor works. This will make it easier for you to understand the code in the next section and to use the Tutor software. The picture below shows the GUI with the functional elements annotated:

You press the A button and hold it while speaking. When you release the A button, the recorded audio is sent to OpenAI for transcription and translation. The transcription and translation are then displayed in the Answer field, and the translation is read out loud. You can replay the translation audio by pressing the B button.
You can select the target language for the translation by pressing the “Select Language” button at the top. It currently cycles through “Japanese”, “Italian”, and “German”. But you can easily extend this to other languages in the code; a small sketch for this is shown later, after the OpenAI Helper Functions section.
Note that while the user interface is in English, you can actually speak in any language supported by OpenAI’s speech-to-text (transcription) model (link). The system will understand the language (if it is supported) and translate it into the selected target language. You can even speak in the target language to check your pronunciation. See the following two video clips for a demo, where I speak German and Japanese:
When you press the “Explain” button at the bottom, an explanation of the grammar of the translation will be printed out. Similarly, if you press the “Example” button, example sentences with a similar grammar will be added. The following two video clips demonstrate the functionality:
Finally, there are buttons to increase and decrease the volume and font size at the top of the screen and scroll buttons at the bottom of the screen.
While you can use your fingers to control the UI, using the little pen that comes with the UNIHIKER M10 works better. Note that you can even touch the answer field and drag/scroll it instead of using the scroll buttons or the scroll bar.
Getting OpenAI API Key
Our AI Language Tutor is going to use AI models provided by OpenAI. You therefore will need an OpenAI account and API key. Go to https://platform.openai.com and sign up with an email address or an existing Google or Microsoft account.
After verifying your email and completing the initial setup, log in to the OpenAI dashboard at platform.openai.com/api-keys and find or create your API key (= secret key) as shown below:

The API key is a unique, long string starting with “sk-proj-” that is needed to authenticate your API requests. Later you will copy this entire string into the code for the AI Language Tutor.
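If you want to check that your key works before putting it into the tutor code, a minimal sketch like the following (my own test snippet, assuming the openai Python package is installed on the machine you run it from) simply lists a few of the available models; if the key is invalid, the call raises an authentication error:

from openai import OpenAI

# Paste your own API key here; never publish it or commit it to a repository
client = OpenAI(api_key="sk-proj-...")

# A successful call confirms the key authenticates; print the first few model names
for model in client.models.list().data[:5]:
    print(model.id)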
Code for AI Language Tutor
The following code implements our AI-powered language tutor application. It allows users to speak phrases in any language, which are then transcribed, translated into a selected target language, and played back using text-to-speech (TTS). The app also offers grammar explanations and example sentences to aid language learning.
# www.makerguides.com
# Python 3.7
# openai 1.39.0
# PyAudio 0.2.11
# pinpong 0.6.1
# numpy 1.21.6
import sys
import time
import os
import threading
import tempfile
import numpy as np
import pyaudio
import io
import wave
import pygame
from openai import OpenAI
from pinpong.extension.unihiker import button_a, button_b
from qtpy.QtWidgets import (
QApplication,
QWidget,
QVBoxLayout,
QHBoxLayout,
QLabel,
QTextEdit,
QScroller,
QPushButton,
)
from qtpy.QtCore import Qt, QObject, Signal, QThread, Slot
API_KEY = "sk-proj-my-api_key" # SET YOUR API KEY HERE!
DEVICE_INDEX = 2
SAMPLE_RATE = 16000
CHANNELS = 1
CHUNK = 1024
LANGUAGES = ["Japanese", "Italian", "German"]
DEFAULT_STATUS = "Hold A: new | B: replay"
VOLUME_STEP = 0.1 # increment per button press
INITIAL_VOLUME = 0.8 # volume level when not muted
_volume = INITIAL_VOLUME # float, 0.0 – 1.0
TTS_FILE = os.path.join(tempfile.gettempdir(), "tts_translation.mp3")
client = OpenAI(api_key=API_KEY)
pygame.mixer.init()
# STYLE ----------------------------------------------------------
ORA_BG = "#fcdb03" # bright yellow
ORA_DARK = "#d4a800" # darker yellow (pressed / language btn)
ORA_DIM = "#fef08a" # disabled button bg
ORA_TEXT = "#000000" # black text everywhere
ORA_DT = "#888800" # disabled button text
def _btn_style(bg=ORA_BG, fg=ORA_TEXT, en=ORA_BG,
pr=ORA_DARK, ds=ORA_DIM, dt=ORA_DT):
return (
f"QPushButton {{ background: {bg}; color: {fg};"
f" border: none; font-size: 13px; }}"
f"QPushButton:enabled {{ background: {en}; }}"
f"QPushButton:pressed {{ background: {pr}; }}"
f"QPushButton:disabled {{ background: {ds}; color: {dt}; }}"
)
# WORKER THREAD ----------------------------------------------------
class AssistantWorker(QObject):
status = Signal(str)
answer = Signal(str)
btn_explain_on = Signal(bool)
btn_examples_on = Signal(bool)
language_changed = Signal(str)
def __init__(self):
super().__init__()
self._last_question = ""
self._last_translation = ""
self._last_grammar = ""
self._last_examples = ""
self._tts_ready = False
self._lang_index = 0
self._language = LANGUAGES[0]
# Thread-safe flags
self._explain_requested = False
self._examples_requested = False
self._lang_requested = False
self._lock = threading.Lock()
self._pa = pyaudio.PyAudio()
self._stream = self._pa.open(
format=pyaudio.paInt16,
channels=CHANNELS,
rate=SAMPLE_RATE,
input=True,
input_device_index=DEVICE_INDEX,
frames_per_buffer=CHUNK,
)
@Slot()
def on_explain_clicked(self):
with self._lock:
self._explain_requested = True
@Slot()
def on_examples_clicked(self):
with self._lock:
self._examples_requested = True
@Slot()
def on_language_clicked(self):
with self._lock:
self._lang_requested = True
def run(self):
self.status.emit("Hold Button A to speak")
while True:
a_pressed = button_a.is_pressed()
b_pressed = button_b.is_pressed()
with self._lock:
explain_req = self._explain_requested
examples_req = self._examples_requested
lang_req = self._lang_requested
self._explain_requested = False
self._examples_requested = False
self._lang_requested = False
if lang_req and not a_pressed:
self._lang_index = (self._lang_index + 1) % len(LANGUAGES)
self._language = LANGUAGES[self._lang_index]
self.language_changed.emit(self._language)
elif a_pressed:
self._clear_state()
raw_pcm = record_while_held(self, self._stream)
if not raw_pcm:
self.status.emit("Hold Button A to speak")
continue
self.status.emit("Transcribing...")
normalized = normalize(raw_pcm)
wav_bytes = pcm_to_wav(normalized)
question = transcribe(wav_bytes)
self.status.emit(f"Translating to {self._language}...")
translation = translate_only(question, self._language)
self._last_question = question
self._last_translation = translation
self._refresh_display()
self.btn_explain_on.emit(True)
self.btn_examples_on.emit(True)
self.status.emit("Generating audio...")
tts_text = extract_first_line(translation)
generate_tts(tts_text, TTS_FILE)
self._tts_ready = True
self._set_default_status()
play_audio(TTS_FILE)
elif b_pressed:
while button_b.is_pressed():
time.sleep(0.01)
if self._tts_ready:
self.status.emit("Replaying...")
play_audio(TTS_FILE)
self._set_default_status()
elif explain_req and self._last_translation:
self.status.emit("Explaining grammar...")
self._last_grammar = explain_grammar(
self._last_question, self._last_translation, self._language
)
self._refresh_display()
self._set_default_status()
elif examples_req and self._last_translation:
self.status.emit("Generating examples...")
self._last_examples = add_examples(
self._last_question, self._last_translation, self._language
)
self._refresh_display()
self._set_default_status()
else:
time.sleep(0.02)
def _set_default_status(self):
self.status.emit(DEFAULT_STATUS)
def _clear_state(self):
self._last_question = ""
self._last_translation = ""
self._last_grammar = ""
self._last_examples = ""
self._tts_ready = False
if pygame.mixer.music.get_busy():
pygame.mixer.music.stop()
if os.path.exists(TTS_FILE):
try:
os.remove(TTS_FILE)
except OSError:
pass
self.answer.emit("")
self.btn_explain_on.emit(False)
self.btn_examples_on.emit(False)
def _refresh_display(self):
parts = []
if self._last_question:
parts.append(self._last_question)
if self._last_translation:
parts.append(self._last_translation)
if self._last_grammar:
parts.append(f"── Grammar ──\n{self._last_grammar}")
if self._last_examples:
parts.append(f"── Examples ──\n{self._last_examples}")
self.answer.emit("\n\n".join(parts))
# GUI ----------------------------------------------------
class AssistantUI(QWidget):
def __init__(self):
super().__init__()
self.setWindowTitle("Voice Chatbot")
self.setFixedSize(240, 320)
# Orange window background, black text; white text area
self.setStyleSheet(
f"QWidget {{ background-color: {ORA_BG}; color: {ORA_TEXT}; }}"
f"QTextEdit {{ background-color: #ffffff; color: {ORA_TEXT};"
f" border: none; }}"
)
layout = QVBoxLayout(self)
layout.setContentsMargins(4, 4, 4, 4)
layout.setSpacing(3)
lang_row = QHBoxLayout()
lang_row.setSpacing(4)
_dark_style = _btn_style(bg=ORA_DARK, en=ORA_DARK, pr="#8a3d00")
self.btn_font_minus = QPushButton("-")
self.btn_font_minus.setFixedSize(28, 26)
self.btn_font_minus.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_font_minus)
self.btn_vol_down = QPushButton("<")
self.btn_vol_down.setFixedSize(28, 26)
self.btn_vol_down.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_vol_down)
self.btn_language = QPushButton(LANGUAGES[0])
self.btn_language.setFixedHeight(26)
self.btn_language.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_language, stretch=1)
self.btn_vol_up = QPushButton(">")
self.btn_vol_up.setFixedSize(28, 26)
self.btn_vol_up.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_vol_up)
self.btn_font_plus = QPushButton("+")
self.btn_font_plus.setFixedSize(28, 26)
self.btn_font_plus.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_font_plus)
layout.addLayout(lang_row)
self._font_size = 10
self.btn_font_plus.clicked.connect(self._font_increase)
self.btn_font_minus.clicked.connect(self._font_decrease)
self.btn_vol_down.clicked.connect(self._vol_decrease)
self.btn_vol_up.clicked.connect(self._vol_increase)
self.status = QLabel("Starting...")
self.status.setAlignment(Qt.AlignCenter)
self.status.setStyleSheet(
"QLabel { background-color: #ffffff; color: #000000; padding: 1px; }"
)
layout.addWidget(self.status)
self.text = QTextEdit()
self.text.setReadOnly(True)
self.text.setTextInteractionFlags(Qt.NoTextInteraction)
layout.addWidget(self.text, stretch=1)
QScroller.grabGesture(
self.text.viewport(),
QScroller.LeftMouseButtonGesture
)
btn_row = QHBoxLayout()
btn_row.setSpacing(4)
_scroll_style = _btn_style(bg=ORA_DARK, en=ORA_DARK, pr="#8a3d00")
self.btn_scroll_up = QPushButton("▲")
self.btn_scroll_up.setFixedSize(40, 26)
self.btn_scroll_up.setStyleSheet(_scroll_style)
btn_row.addWidget(self.btn_scroll_up)
self.btn_explain = QPushButton("Explain")
self.btn_examples = QPushButton("Examples")
for btn in (self.btn_explain, self.btn_examples):
btn.setEnabled(False)
btn.setFixedHeight(26)
btn.setStyleSheet(_btn_style())
btn_row.addWidget(btn, stretch=1)
self.btn_scroll_down = QPushButton("▼")
self.btn_scroll_down.setFixedSize(40, 26)
self.btn_scroll_down.setStyleSheet(_scroll_style)
btn_row.addWidget(self.btn_scroll_down)
layout.addLayout(btn_row)
self.btn_scroll_down.clicked.connect(self._scroll_down)
self.btn_scroll_up.clicked.connect(self._scroll_up)
def update_status(self, txt):
self.status.setText(txt)
def update_answer(self, txt):
self.text.clear()
self.text.setPlainText(txt)
self.text.verticalScrollBar().setValue(0)
def set_explain_enabled(self, enabled: bool):
self.btn_explain.setEnabled(enabled)
def set_examples_enabled(self, enabled: bool):
self.btn_examples.setEnabled(enabled)
def update_language_label(self, lang: str):
self.btn_language.setText(lang)
def _scroll_down(self):
sb = self.text.verticalScrollBar()
sb.setValue(sb.value() + self.text.viewport().height())
def _scroll_up(self):
sb = self.text.verticalScrollBar()
sb.setValue(sb.value() - self.text.viewport().height())
def _font_increase(self):
self._font_size = min(self._font_size + 1, 28)
self._apply_font_size()
def _font_decrease(self):
self._font_size = max(self._font_size - 1, 6)
self._apply_font_size()
def _apply_font_size(self):
font = self.text.font()
font.setPointSize(self._font_size)
self.text.setFont(font)
def _vol_increase(self):
global _volume
_volume = min(round(_volume + VOLUME_STEP, 1), 1.0)
def _vol_decrease(self):
global _volume
_volume = max(round(_volume - VOLUME_STEP, 1), 0.0)
# AUDIO HELPERS ----------------------------------------------------
def normalize(pcm_bytes: bytes) -> bytes:
samples = np.frombuffer(pcm_bytes, dtype=np.int16).astype(np.float32)
peak = np.max(np.abs(samples))
if peak == 0:
return pcm_bytes
gain = (0.9 * 32767) / peak
return np.clip(samples * gain, -32768, 32767).astype(np.int16).tobytes()
def record_while_held(worker: AssistantWorker, stream) -> bytes:
try:
while stream.get_read_available() > 0:
stream.read(stream.get_read_available(), exception_on_overflow=False)
except Exception:
pass
worker.status.emit("Recording...")
frames = []
while button_a.is_pressed():
frames.append(stream.read(CHUNK, exception_on_overflow=False))
return b"".join(frames)
def pcm_to_wav(pcm_bytes: bytes) -> bytes:
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
wf.setnchannels(CHANNELS)
wf.setsampwidth(2)
wf.setframerate(SAMPLE_RATE)
wf.writeframes(pcm_bytes)
return buf.getvalue()
def generate_tts(text: str, filepath: str) -> None:
"""Call OpenAI TTS and save to filepath (mp3)."""
response = client.audio.speech.create(
model="tts-1",
voice="alloy",
input=text,
)
with open(filepath, "wb") as f:
f.write(response.content)
def play_audio(filepath: str) -> None:
"""Play audio at the current volume, then mute the output again."""
pygame.mixer.music.set_volume(_volume)
pygame.mixer.music.load(filepath)
pygame.mixer.music.play()
while pygame.mixer.music.get_busy():
time.sleep(0.05)
pygame.mixer.music.set_volume(0.0)
# OPENAI HELPERS ------------------------------------------------------
def transcribe(wav_bytes: bytes) -> str:
audio_file = io.BytesIO(wav_bytes)
audio_file.name = "recording.wav"
response = client.audio.transcriptions.create(
model="whisper-1",
file=audio_file,
language="en",
temperature=0,
)
return response.text
def _script_hint(language: str) -> str:
"""Return a script hint for the system prompt based on language."""
hints = {
"Japanese": "in Kanji/Kana on one line, then the romaji reading on the next line",
"Italian": "in Italian script",
"German": "in German script",
}
return hints.get(language, "in the target language's script")
def translate_only(question: str, language: str) -> str:
hint = _script_hint(language)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{
"role": "system",
"content": (
f"Translate the given English phrase to {language}. "
f"Provide the translation {hint}. "
"Do NOT include grammar explanations or example sentences."
),
},
{"role": "user", "content": question},
],
)
return response.choices[0].message.content
def extract_first_line(translation: str) -> str:
"""Return the first line of the translation for TTS (skips romaji line)."""
return translation.splitlines()[0].strip() if translation else translation
def explain_grammar(question: str, translation: str, language: str) -> str:
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{
"role": "system",
"content": (
f"You are a {language} language teacher. "
f"Explain the grammar of the {language} translation clearly and concisely. "
"Do NOT provide additional example sentences."
),
},
{
"role": "user",
"content": (
f"Original English: {question}\n"
f"{language} translation: {translation}\n\n"
"Please explain the grammar."
),
},
],
)
return response.choices[0].message.content
def add_examples(question: str, translation: str, language: str) -> str:
hint = _script_hint(language)
response = client.chat.completions.create(
model="gpt-4o-mini",
messages=[
{
"role": "system",
"content": (
f"You are a {language} language teacher. "
f"Provide 2-3 natural example sentences in {language} ({hint}) "
"that illustrate the same grammar pattern or vocabulary. "
"Include a short English translation for each."
),
},
{
"role": "user",
"content": (
f"Original English: {question}\n"
f"{language} translation: {translation}\n\n"
"Please provide example sentences."
),
},
],
)
return response.choices[0].message.content
# MAIN ------------------------------------------------------
def main():
app = QApplication(sys.argv)
ui = AssistantUI()
worker = AssistantWorker()
thread = QThread()
worker.moveToThread(thread)
thread.started.connect(worker.run)
worker.status.connect(ui.update_status)
worker.answer.connect(ui.update_answer)
worker.btn_explain_on.connect(ui.set_explain_enabled)
worker.btn_examples_on.connect(ui.set_examples_enabled)
worker.language_changed.connect(ui.update_language_label)
ui.btn_explain.clicked.connect(worker.on_explain_clicked, Qt.DirectConnection)
ui.btn_examples.clicked.connect(worker.on_examples_clicked, Qt.DirectConnection)
ui.btn_language.clicked.connect(worker.on_language_clicked, Qt.DirectConnection)
ui.show()
thread.start()
sys.exit(app.exec())
if __name__ == "__main__":
main()
Imports
The program begins by importing necessary Python modules and libraries. These include standard modules like sys, time, os, and threading for system interaction, timing, file handling, and concurrency. It uses numpy for numerical operations on audio data, pyaudio for audio recording, and pygame for audio playback.
The code also imports the OpenAI client library to access AI services, and specific hardware buttons button_a and button_b from the UNIHIKER extension. For the graphical user interface (GUI), it uses qtpy to create widgets and manage signals and slots.
Note that you will need to install the OpenAI library, since it is not pre-installed on the UNIHIKER M10. If you need help with this, see the Voice Assistant on UNIHIKER M10 with OpenAI tutorial.
Constants and Configuration
Several constants define the application’s behavior and appearance. These include the OpenAI API key, audio input device parameters such as sample rate and channels, and UI-related constants like supported languages and color codes for styling buttons and backgrounds.
Volume control parameters are also set, including the initial volume and the step size for volume adjustments.
API_KEY = "sk-proj-my-api_key" DEVICE_INDEX = 2 SAMPLE_RATE = 16000 CHANNELS = 1 CHUNK = 1024 LANGUAGES = ["Japanese", "Italian", "German"] DEFAULT_STATUS = "Hold A: new | B: replay" VOLUME_STEP = 0.1 INITIAL_VOLUME = 0.8 _volume = INITIAL_VOLUME
Remember that you have to replace the value of the API_KEY constant with your own API key!
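The DEVICE_INDEX constant selects the microphone; on my board it is 2, but this can differ depending on the audio setup. The following small sketch (my own helper, using PyAudio, which is pre-installed) lists all input-capable devices so you can pick the correct index:

import pyaudio

pa = pyaudio.PyAudio()
for i in range(pa.get_device_count()):
    info = pa.get_device_info_by_index(i)
    if info.get("maxInputChannels", 0) > 0:  # only devices that can record
        print(i, info["name"])
pa.terminate()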
Button Styling
The function _btn_style() returns a stylesheet string that defines the appearance of buttons in different states (normal, pressed, disabled). This function centralizes the styling to maintain a consistent look throughout the UI.
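For illustration, this is essentially how the helper is used in the UI code further down: the darker buttons in the top and bottom rows get custom colors, while the Explain and Examples buttons use the defaults.

# Darker yellow variant for the language, volume, font, and scroll buttons
_dark_style = _btn_style(bg=ORA_DARK, en=ORA_DARK, pr="#8a3d00")
self.btn_language.setStyleSheet(_dark_style)

# Default bright yellow style for the Explain and Examples buttons
self.btn_explain.setStyleSheet(_btn_style())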
AssistantWorker Class
This class encapsulates the core logic running in a separate thread to keep the UI responsive. It manages audio recording, interaction with OpenAI APIs, and state management.
The class defines signals to communicate status updates, answers, button states, and language changes back to the UI.
In the constructor, it initializes variables to store the last question, translation, grammar explanation, and example sentences. It also sets up the audio input stream using pyaudio with the specified device and parameters.
class AssistantWorker(QObject):
status = Signal(str)
answer = Signal(str)
btn_explain_on = Signal(bool)
btn_examples_on = Signal(bool)
language_changed = Signal(str)
def __init__(self):
super().__init__()
self._last_question = ""
self._last_translation = ""
self._last_grammar = ""
self._last_examples = ""
self._tts_ready = False
self._lang_index = 0
self._language = LANGUAGES[0]
self._explain_requested = False
self._examples_requested = False
self._lang_requested = False
self._lock = threading.Lock()
self._pa = pyaudio.PyAudio()
self._stream = self._pa.open(
format=pyaudio.paInt16,
channels=CHANNELS,
rate=SAMPLE_RATE,
input=True,
input_device_index=DEVICE_INDEX,
frames_per_buffer=CHUNK,
)
The class provides slot methods to handle button clicks for requesting grammar explanations, example sentences, and language changes. These methods set thread-safe flags that the main loop monitors.
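Here are the three slot methods from the listing above; each one just sets its flag under the lock:

@Slot()
def on_explain_clicked(self):
    with self._lock:
        self._explain_requested = True

@Slot()
def on_examples_clicked(self):
    with self._lock:
        self._examples_requested = True

@Slot()
def on_language_clicked(self):
    with self._lock:
        self._lang_requested = True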
AssistantWorker Run Loop
The run() method contains the main loop that continuously checks the state of hardware buttons and processes user input accordingly.
When Button A is held, it records audio from the microphone. When Button A is released, it normalizes the recorded audio, converts it to WAV format, and sends it to OpenAI’s Whisper model for transcription. The transcribed English text is then translated into the selected language using OpenAI’s GPT model.
Next the translated text is displayed, and a TTS audio file is generated and played back. Button B allows replaying the last generated audio. Pressing the language button cycles through the supported languages.
If the user requests grammar explanations or example sentences, the worker calls the appropriate OpenAI API endpoints to generate the content and updates the display.
def run(self):
self.status.emit("Hold Button A to speak")
while True:
a_pressed = button_a.is_pressed()
b_pressed = button_b.is_pressed()
with self._lock:
explain_req = self._explain_requested
examples_req = self._examples_requested
lang_req = self._lang_requested
self._explain_requested = False
self._examples_requested = False
self._lang_requested = False
if lang_req and not a_pressed:
self._lang_index = (self._lang_index + 1) % len(LANGUAGES)
self._language = LANGUAGES[self._lang_index]
self.language_changed.emit(self._language)
elif a_pressed:
self._clear_state()
raw_pcm = record_while_held(self, self._stream)
if not raw_pcm:
self.status.emit("Hold Button A to speak")
continue
self.status.emit("Transcribing...")
normalized = normalize(raw_pcm)
wav_bytes = pcm_to_wav(normalized)
question = transcribe(wav_bytes)
self.status.emit(f"Translating to {self._language}...")
translation = translate_only(question, self._language)
self._last_question = question
self._last_translation = translation
self._refresh_display()
self.btn_explain_on.emit(True)
self.btn_examples_on.emit(True)
self.status.emit("Generating audio...")
tts_text = extract_first_line(translation)
generate_tts(tts_text, TTS_FILE)
self._tts_ready = True
self._set_default_status()
play_audio(TTS_FILE)
elif b_pressed:
while button_b.is_pressed():
time.sleep(0.01)
if self._tts_ready:
self.status.emit("Replaying...")
play_audio(TTS_FILE)
self._set_default_status()
elif explain_req and self._last_translation:
self.status.emit("Explaining grammar...")
self._last_grammar = explain_grammar(
self._last_question, self._last_translation, self._language
)
self._refresh_display()
self._set_default_status()
elif examples_req and self._last_translation:
self.status.emit("Generating examples...")
self._last_examples = add_examples(
self._last_question, self._last_translation, self._language
)
self._refresh_display()
self._set_default_status()
else:
time.sleep(0.02)
State Management Methods
The worker class includes helper methods to reset the internal state, update the displayed text, and set the default status message. These methods ensure that the UI reflects the current state of the application accurately.
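These are the corresponding helper methods from the listing above:

def _set_default_status(self):
    self.status.emit(DEFAULT_STATUS)

def _clear_state(self):
    self._last_question = ""
    self._last_translation = ""
    self._last_grammar = ""
    self._last_examples = ""
    self._tts_ready = False
    if pygame.mixer.music.get_busy():
        pygame.mixer.music.stop()
    if os.path.exists(TTS_FILE):
        try:
            os.remove(TTS_FILE)
        except OSError:
            pass
    self.answer.emit("")
    self.btn_explain_on.emit(False)
    self.btn_examples_on.emit(False)

def _refresh_display(self):
    parts = []
    if self._last_question:
        parts.append(self._last_question)
    if self._last_translation:
        parts.append(self._last_translation)
    if self._last_grammar:
        parts.append(f"── Grammar ──\n{self._last_grammar}")
    if self._last_examples:
        parts.append(f"── Examples ──\n{self._last_examples}")
    self.answer.emit("\n\n".join(parts))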
AssistantUI Class
This class defines the graphical user interface using Qt widgets. It creates a fixed-size window with an orange background and black text, matching the color scheme defined earlier.
The UI consists of a top row with buttons for font size adjustment, volume control, and language selection. Below that, a status label displays messages to the user. The main text area shows the transcribed, translated, and explanatory text.
At the bottom, buttons allow scrolling through the text and requesting grammar explanations or example sentences. The buttons are styled consistently using the previously defined styles.
The class provides methods to update the status text, displayed answer, enable or disable buttons, and change the language label. It also handles user interactions for scrolling and adjusting font size and volume.
class AssistantUI(QWidget):
def __init__(self):
super().__init__()
self.setWindowTitle("Voice Chatbot")
self.setFixedSize(240, 320)
self.setStyleSheet(
f"QWidget {{ background-color: {ORA_BG}; color: {ORA_TEXT}; }}"
f"QTextEdit {{ background-color: #ffffff; color: {ORA_TEXT};"
f" border: none; }}"
)
layout = QVBoxLayout(self)
layout.setContentsMargins(4, 4, 4, 4)
layout.setSpacing(3)
lang_row = QHBoxLayout()
lang_row.setSpacing(4)
_dark_style = _btn_style(bg=ORA_DARK, en=ORA_DARK, pr="#8a3d00")
self.btn_font_minus = QPushButton("-")
self.btn_font_minus.setFixedSize(28, 26)
self.btn_font_minus.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_font_minus)
self.btn_vol_down = QPushButton("<")
self.btn_vol_down.setFixedSize(28, 26)
self.btn_vol_down.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_vol_down)
self.btn_language = QPushButton(LANGUAGES[0])
self.btn_language.setFixedHeight(26)
self.btn_language.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_language, stretch=1)
self.btn_vol_up = QPushButton(">")
self.btn_vol_up.setFixedSize(28, 26)
self.btn_vol_up.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_vol_up)
self.btn_font_plus = QPushButton("+")
self.btn_font_plus.setFixedSize(28, 26)
self.btn_font_plus.setStyleSheet(_dark_style)
lang_row.addWidget(self.btn_font_plus)
layout.addLayout(lang_row)
self._font_size = 10
self.btn_font_plus.clicked.connect(self._font_increase)
self.btn_font_minus.clicked.connect(self._font_decrease)
self.btn_vol_down.clicked.connect(self._vol_decrease)
self.btn_vol_up.clicked.connect(self._vol_increase)
self.status = QLabel("Starting...")
self.status.setAlignment(Qt.AlignCenter)
self.status.setStyleSheet(
"QLabel { background-color: #ffffff; color: #000000; padding: 1px; }"
)
layout.addWidget(self.status)
self.text = QTextEdit()
self.text.setReadOnly(True)
self.text.setTextInteractionFlags(Qt.NoTextInteraction)
layout.addWidget(self.text, stretch=1)
QScroller.grabGesture(
self.text.viewport(),
QScroller.LeftMouseButtonGesture
)
btn_row = QHBoxLayout()
btn_row.setSpacing(4)
_scroll_style = _btn_style(bg=ORA_DARK, en=ORA_DARK, pr="#8a3d00")
self.btn_scroll_up = QPushButton("▲")
self.btn_scroll_up.setFixedSize(40, 26)
self.btn_scroll_up.setStyleSheet(_scroll_style)
btn_row.addWidget(self.btn_scroll_up)
self.btn_explain = QPushButton("Explain")
self.btn_examples = QPushButton("Examples")
for btn in (self.btn_explain, self.btn_examples):
btn.setEnabled(False)
btn.setFixedHeight(26)
btn.setStyleSheet(_btn_style())
btn_row.addWidget(btn, stretch=1)
self.btn_scroll_down = QPushButton("▼")
self.btn_scroll_down.setFixedSize(40, 26)
self.btn_scroll_down.setStyleSheet(_scroll_style)
btn_row.addWidget(self.btn_scroll_down)
layout.addLayout(btn_row)
self.btn_scroll_down.clicked.connect(self._scroll_down)
self.btn_scroll_up.clicked.connect(self._scroll_up)
Audio Helper Functions
Several helper functions handle audio processing tasks. The normalize() function adjusts the recorded PCM audio to maximize volume without clipping. record_while_held() records audio from the microphone while Button A is pressed. The pcm_to_wav() function converts raw PCM bytes into WAV format, which is required for transcription.
Next we have the generate_tts() function that sends text to OpenAI’s TTS API and saves the resulting audio file. Finally, play_audio() plays the generated audio using pygame at the current volume.
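To see what normalize() and pcm_to_wav() do, the short sketch below (my own test snippet; it assumes the two functions and the constants from the listing above are in scope) feeds them a quiet synthetic tone. The peak is raised to about 90 % of the 16-bit range, and the WAV version gains the standard 44-byte header:

import numpy as np

# One second of a quiet 440 Hz tone as raw 16-bit PCM (peak at ~10% of full scale)
t = np.linspace(0, 1.0, SAMPLE_RATE, endpoint=False)
quiet_pcm = (0.1 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16).tobytes()

loud_pcm = normalize(quiet_pcm)
peak = np.abs(np.frombuffer(loud_pcm, dtype=np.int16)).max()
print(peak)                              # roughly 29490, i.e. 0.9 * 32767

wav_bytes = pcm_to_wav(loud_pcm)
print(len(wav_bytes) - len(loud_pcm))    # 44 extra bytes for the WAV header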
OpenAI Helper Functions
These functions interact with OpenAI’s API to perform transcription, translation, grammar explanation, and example sentence generation.
The transcribe() function sends the recorded WAV audio to the Whisper model to obtain English text. translate_only() translates the English question into the selected language without additional explanations.
The explain_grammar() and add_examples() functions request grammar explanations and example sentences respectively, using GPT with prompts tailored for language teaching.
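As mentioned in the user manual section, adding another target language only takes two small edits. The sketch below shows this for French (a hypothetical addition, not part of the listing above): extend the LANGUAGES list and add a matching entry to the hints dictionary in _script_hint().

# 1. Add the new language to the list the language button cycles through
LANGUAGES = ["Japanese", "Italian", "German", "French"]

# 2. Add a script hint so translate_only() and add_examples() format the output correctly
def _script_hint(language: str) -> str:
    """Return a script hint for the system prompt based on language."""
    hints = {
        "Japanese": "in Kanji/Kana on one line, then the romaji reading on the next line",
        "Italian": "in Italian script",
        "German": "in German script",
        "French": "in French script",  # new entry
    }
    return hints.get(language, "in the target language's script")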
Main Function
The main() function initializes the Qt application, creates instances of the UI and worker classes, and sets up a separate thread for the worker to run concurrently.
It connects signals and slots between the worker and UI to update the interface based on the worker’s progress and user interactions. Finally, it starts the application event loop.
def main():
app = QApplication(sys.argv)
ui = AssistantUI()
worker = AssistantWorker()
thread = QThread()
worker.moveToThread(thread)
thread.started.connect(worker.run)
worker.status.connect(ui.update_status)
worker.answer.connect(ui.update_answer)
worker.btn_explain_on.connect(ui.set_explain_enabled)
worker.btn_examples_on.connect(ui.set_examples_enabled)
worker.language_changed.connect(ui.update_language_label)
ui.btn_explain.clicked.connect(worker.on_explain_clicked, Qt.DirectConnection)
ui.btn_examples.clicked.connect(worker.on_examples_clicked, Qt.DirectConnection)
ui.btn_language.clicked.connect(worker.on_language_clicked, Qt.DirectConnection)
ui.show()
thread.start()
sys.exit(app.exec())
In summary, this code integrates hardware button input, audio processing, AI language services, and a responsive GUI to create an interactive voice-based language tutor. It leverages OpenAI’s models for transcription, translation, and language teaching.
Conclusions
In this tutorial you learned how to implement a simple AI Language Tutor on the UNIHIKER M10 using OpenAI services. I recommend that you also read the Voice Assistant on UNIHIKER M10 with OpenAI tutorial, which covers some basics that are not part of this tutorial.
While the AI Language Tutor in this tutorial is already useful for language learning, there are many possible extensions to make it even more useful. For instance, the program could store translations and explanations to revisit later. It could generate short stories around your sentences and read them out. It could extract verbs and nouns to be trained separately. And much more …
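As an example for the first idea, a minimal sketch (my own suggestion, not part of the tutorial code) could append every translation to a JSON-lines file for later review; HISTORY_FILE is a hypothetical path:

import json
import time

HISTORY_FILE = "/root/tutor_history.jsonl"  # hypothetical location on the UNIHIKER

def save_entry(question: str, translation: str, language: str) -> None:
    """Append one translation to a JSON-lines history file."""
    entry = {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "language": language,
        "question": question,
        "translation": translation,
    }
    with open(HISTORY_FILE, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")

# In AssistantWorker.run(), this could be called right after the translation is stored:
# save_entry(question, translation, self._language)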
Have fun extending the tutor for your purposes, and if you have any questions, feel free to leave them in the comment section.
Happy Tinkering 😉
Stefan is a professional software developer and researcher. He has worked in robotics, bioinformatics, image/audio processing and education at Siemens, IBM and Google. He specializes in AI and machine learning and has a keen interest in DIY projects involving Arduino and 3D printing.

