In this step-by-step tutorial, you’ll learn how to deploy and run a real-time face detection model on the XIAO ESP32-S3-Sense using the SenseCraft AI platform. The model runs entirely on the ESP32; no server is needed to perform the actual face detection.
If you’re looking for a simple and fast way to do face detection with an ESP32, this guide is for you. The SenseCraft AI platform allows you to quickly load and run pre-trained AI models directly on supported microcontrollers like the XIAO ESP32-S3-Sense, without needing to write any machine learning code.
The platform also allows you to configure different output options such as GPIO pins, UART, I2C, SPI or MQTT, which makes it easy to integrate the face detector into your own projects, e.g. a Home Automation system, a face counter or an intruder detection system.
Aside from the microcontroller itself, this tutorial uses only the free, no-sign-up-required features of the SenseCraft AI platform. You won’t need a cloud subscription or advanced developer tools.
Let’s get started!
Required Parts
You will need a XIAO ESP32-S3-Sense board by Seeed Studio. Note that the board can get very hot under high computational load, so I recommend attaching a small heatsink to the back of the board (see the parts list below).
If you want to try out the serial communication example, you will need a second microcontroller. I picked an older ESP32-lite, but any other ESP32 or an Arduino will work just as well.

- Seeed Studio XIAO ESP32-S3-Sense
- ESP32-lite
- USB C Cable
- Small Heatsink 9×9 mm
- Dupont Wire Set
- Breadboard
What is SenseCraft AI?
SenseCraft AI is an edge AI development platform created by Seeed Studio that allows you to easily deploy machine learning models onto embedded devices, without writing complex code or managing toolchains.
It’s designed for real-time AI inference on microcontrollers and edge computing boards, specifically for Seeed’s hardware like the XIAO ESP32S3 Sense, Grove Vision AI, and reComputer Jetson series.
With SenseCraft AI, you can:
- Quickly deploy prebuilt AI models (like face detection, object classification, keyword spotting)
- Train custom models with your own datasets using a web interface
- Flash firmware and models directly to supported devices
- Visualize real-time data and camera feeds through your browser
- Log, monitor, and analyze events using the SenseCraft Data cloud platform
The following architecture diagram shows the components of the SenseCraft AI platform. As you can see, it is a complex platform with many functions:

In this tutorial we will be using only the bits of SenseCraft AI needed to deploy, run and monitor a public Face detection model on a XIAO ESP32-S3-Sense board.
Select Face Detection Model
First you need to select the model you want to deploy. Go to the SenseCraft model selection page at https://sensecraft.seeed.cc/ai/model. In the “Pretrained Models” tab, select “Detection” under Task and “XIAO ESP32S3 Sense” under Supported Devices. Then enter “Face” in the search bar to filter for face detection models:

As of July 2025 there are only two models. Click on the Swift-YOLO model on the right (marked yellow). The other model did not work for me when I tried it.
This will open a new page with a description of the Face Detection model and a green button labeled “Deploy Model” on the right-hand side:

Connect your XIAO ESP32-S3-Sense Board via USB to your computer and then click on “Deploy Model”.
Deploy Face Detection Model
After clicking on “Deploy Model” a new page named “Deploy Face Detection” will open with a “Connect Device” button at the bottom:

Press it and you will see a confirmation dialog popping up, where you just need to press “Confirm”:

After that, the SenseCraft platform needs to know which COM port the XIAO ESP32-S3-Sense board is connected to. It will show you a list of used (Paired) ports, typically only one if you have connected only one board. Click on the COM port you want to use and then on the “Connect” button at the bottom:

This will start the download (deployment) of the Face Detection Model to the XIAO-ESP32-S3-Sense board.
Monitoring Face Detection Model
Once the model is deployed it starts automatically and sends images with face detection boxes to your web browser, where you can see a live stream in the Preview section:

In the upper right corner of the Preview section is a “Stop” button to stop the running model. You can also adjust the frame rate with a slider in the Device section:

Below the video in the Preview Section are two important Settings (Confidence, IoU) to adjust the model detections, which we will discuss in detail next.
Confidence Threshold
The Confidence Threshold filters detections based on how sure the model is that something is a face. Every detection has a confidence score between 0 and 100%. If a detection’s score is below the confidence threshold, it’s discarded.
A lower threshold detects more faces but also produces more detections of things that are not faces (false positives). A higher threshold produces fewer detections with fewer errors but also misses some faces (false negatives). The example below shows a confidence threshold that is too low, resulting in many spurious face detections:

There is always a trade-off between how sensitive you want the model to be and how many errors you are willing to accept. Which threshold is best depends on the application. For instance, in an intruder detection system you might be willing to accept more false detections to make sure no robber gets in.
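To make the effect of the threshold concrete, here is a minimal C++ sketch of the filtering step described above. It is not the code SenseCraft runs on the board; the `Detection` struct, its field names and the function `filterByConfidence` are assumptions made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical detection record; the field names are assumptions for this sketch.
struct Detection {
    uint16_t x, y, w, h;  // bounding box position and size in pixels
    uint8_t score;        // confidence in percent (0..100)
};

// Keep only detections whose confidence is at or above the threshold.
// Detections with a score below the threshold are discarded.
std::vector<Detection> filterByConfidence(const std::vector<Detection>& in,
                                          uint8_t thresholdPercent) {
    assert(thresholdPercent <= 100);
    std::vector<Detection> out;
    for (const Detection& d : in) {
        if (d.score >= thresholdPercent) {
            out.push_back(d);
        }
    }
    return out;
}
```

Calling `filterByConfidence` on the same detections with a lower threshold can only return the same or more boxes, which is exactly the trade-off between missed faces and spurious detections.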
IoU Threshold
Internally, the Face Detection model generates multiple face detection boxes, since an image can contain multiple faces. However, this can result in one face being detected by multiple overlapping boxes, as the example below shows:

The IoU (Intersection over Union) threshold decides which of the overlapping boxes to keep. The IoU of two boxes is the area of their overlap divided by the area of their union, so it ranges from 0 (no overlap) to 1 (identical boxes). If two boxes overlap more than the IoU threshold, the one with the lower confidence is removed.
If you find that faces are detected by multiple boxes, lower the IoU threshold. If you end up with boxes that are too small, increase the IoU threshold again.
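The IoU calculation and the removal of overlapping boxes can be sketched in a few lines of C++. This is the common greedy form of non-maximum suppression, shown here only as an illustration; the deployed model may implement the step differently. The `Box` struct, the top-left corner convention for `x` and `y`, and the function names are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical box type: (x, y) is assumed to be the top-left corner.
struct Box {
    float x, y, w, h;
    float score;  // confidence, higher is better
};

// Intersection over Union: overlap area divided by the area of the union.
float iou(const Box& a, const Box& b) {
    const float left = std::max(a.x, b.x);
    const float top = std::max(a.y, b.y);
    const float right = std::min(a.x + a.w, b.x + b.w);
    const float bottom = std::min(a.y + a.h, b.y + b.h);
    const float inter = std::max(0.0f, right - left) * std::max(0.0f, bottom - top);
    const float uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy non-maximum suppression: visit boxes from highest to lowest score
// and drop any box that overlaps an already kept box by more than the threshold.
std::vector<Box> nonMaxSuppression(std::vector<Box> boxes, float iouThreshold) {
    assert(iouThreshold >= 0.0f && iouThreshold <= 1.0f);
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.score > b.score; });
    std::vector<Box> kept;
    for (const Box& candidate : boxes) {
        bool suppressed = false;
        for (const Box& k : kept) {
            if (iou(candidate, k) > iouThreshold) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) {
            kept.push_back(candidate);
        }
    }
    return kept;
}
```

In this sketch a lower `iouThreshold` suppresses more boxes, which matches the advice to lower the threshold when one face is covered by several boxes.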
Output of Face Detection Model
Monitoring the face detection process live is important to get a “feel” for the accuracy and behavior of the model. But in most applications we want an output of the model that we can process further. For instance, a digital output pin that goes high when a face is detected, which in turn switches on an alarm.
SenseCraft AI models typically offer three different types of outputs:
- Digital output via GPIO
- Serial output via UART, I2C or SPI
- MQTT output over WiFi
You can configure these different output options by clicking on the corresponding item in the sidebar:

The digital output via GPIO is great for simple actions, like opening a door when a face has been detected. If you need more detailed information, such as how many faces have been detected, where, and with what confidence, then the Serial output is what you want. Finally, if you want to connect the face detector to a home automation system like Home Assistant, you probably want to use MQTT.
In the next sections we will have a closer look at the GPIO and Serial output options. The MQTT option is already described in the SenseCraft AI MQTT output for XIAO ESP32-S3 tutorial.
Digital GPIO Output
To configure the GPIO output click on the GPIO item in the sidebar:

You can then add a “Trigger action when event conditions are met”:

This opens the following dialog, where you can specify the details of the action. For instance, you may want to switch on the built-in LED if a face was detected with a confidence value greater than 50 %:

If you press the “Confirm” button this action is then added to the list of Trigger Actions and becomes active:

The yellow LED on the XIAO ESP32-S3-Sense board should now switch on if the camera detects a face with the specified confidence.
Instead of the LED, you can also select another GPIO pin and define its default and active state:

This would allow you to connect a device, e.g. a door opener or an alarm, to GPIO1 and activate it when the camera sees a face.
The trigger actions remain set even after you disconnect or unplug the ESP32. While the documentation indicates otherwise, for me the trigger actions worked even when the XIAO ESP32-S3-Sense was not connected to the Web interface.
Serial Output via I2C
The GPIO output is limited to “face detected” or “no face detected”. For more detailed information, such as the number of detected faces, their detection boxes and confidence value, you can use the Serial Output. However, you will need to connect a second microcontroller via UART, I2C or SPI to process the detection.
The code for the face detection is deployed via SenseCraft and runs on the XIAO ESP32-S3-Sense. The code for processing detections is flashed via the Arduino IDE and runs on the second microcontroller – in my case an ESP32-lite. The picture below shows the main parts of the system, using I2C for communication:

Connecting the ESP32-lite to the XIAO ESP32-S3-Sense via I2C is easy. The picture below shows the wiring diagram:

The I2C pins on the ESP32-lite are SDA=GPIO19 and SCL=GPIO23. The corresponding pins on the XIAO ESP32-S3-Sense are SDA=D4/GPIO5 and SCL=D5/GPIO6. Note that the wiring diagram above shows the back of the boards, where the digital outputs are labeled (D0…D10).
| Signal | ESP32-lite | XIAO ESP32-S3-Sense |
|---|---|---|
| SDA | GPIO19 | D4 / GPIO5 |
| SCL | GPIO23 | D5 / GPIO6 |
Below is the complete pinout of the XIAO ESP32-S3-Sense seen from the front. You can see the SDA and SCL pins on the left of the board.

In the next section we are going to write the code that runs on the ESP32-lite and processes the face detections generated by the XIAO ESP32-S3-Sense.
Code for Serial Communication
For the following code to work you will first need to install the Seeed_Arduino_SSCMA library. Open the LIBRARY MANAGER, search for “Seeed_Arduino_SSCMA” and click the INSTALL button:

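Next, connect the ESP32-lite to your computer and upload the following code to it:
#include <Wire.h>
#include <Seeed_Arduino_SSCMA.h>

SSCMA detector;  // client for the model running on the XIAO ESP32-S3-Sense

void setup() {
    Serial.begin(115200);   // output to the Serial Monitor
    Wire.begin();           // I2C on the default SDA/SCL pins
    detector.begin(&Wire);  // talk to the detector over I2C
    Serial.println("running...");
}

void loop() {
    // The code treats a zero return value of invoke() as success
    if (!detector.invoke(1, false, false)) {
        // Print confidence and bounding box of every detected face
        for (int i = 0; i < detector.boxes().size(); i++) {
            boxes_t &b = detector.boxes()[i];
            Serial.printf("Box[%d] conf=%d [%3d %3d %3d %3d]\n",
                          i, b.score, b.x, b.y, b.w, b.h);
        }
    }
}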
This code invokes the face detector running on the ESP32-S3-Sense and prints the confidence scores and detection boxes for any detected face to the Serial Monitor. Let’s have a closer look at the code.
Libraries
First we include the Wire and the Seeed_Arduino_SSCMA libraries. We need the Wire library for the I2C communication and the Seeed_Arduino_SSCMA library to communicate with the face detector.
#include <Wire.h>
#include <Seeed_Arduino_SSCMA.h>
Object
Next we create the detector object, which provides functions to start a face detection and to retrieve the results of a face detection:
SSCMA detector;
Setup
In the setup function we initialize the Serial communication with the Serial Monitor and the I2C communication with the detector:
void setup() {
    Serial.begin(115200);
    Wire.begin();
    detector.begin(&Wire);
    Serial.println("running...");
}
This assumes that you are using the default I2C pins. If not, you will need to specify them, for example by passing the SDA and SCL pin numbers to Wire.begin() on an ESP32 or by creating a separate Wire object.
Loop
Finally, in the loop function we invoke the detector and then print the confidence and the bounding box of each detected face to the Serial Monitor:
void loop() {
    if (!detector.invoke(1, false, false)) {
        for (int i = 0; i < detector.boxes().size(); i++) {
            boxes_t &b = detector.boxes()[i];
            Serial.printf("Box[%d] conf=%d [%3d %3d %3d %3d]\n",
                          i, b.score, b.x, b.y, b.w, b.h);
        }
    }
}
If everything is correctly wired up and working you should see the detection results printed to the Serial Monitor as follows:

You could now easily change the code, for instance, to count the number of detected faces, to perform face tracking or cause an action if a face comes close (large bounding box).
Finally, note that the confidence and IoU thresholds are set via the SenseCraft web interface and stored on the ESP32-S3-Sense. You could set them to zero and perform your own thresholding in the code.
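As an illustration of the ideas above, the following C++ sketch applies its own confidence threshold and reports whether any face is close to the camera, judged by how much of the frame its bounding box covers. It is written as plain C++ so the logic can be tried on a PC before moving it into the sketch for the second microcontroller. The `FaceBox` struct only mirrors the fields printed in the code above, and the function name, the frame size parameters and the area fraction are assumptions chosen for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a detection result, mirroring the fields
// score, x, y, w and h that are printed in the example above.
struct FaceBox {
    uint16_t x, y, w, h;
    uint8_t score;  // confidence in percent
};

// Returns true if at least one sufficiently confident face covers at least
// minAreaFraction of the frame. frameW and frameH are assumed values that
// you would set to the resolution the bounding boxes refer to.
bool anyFaceClose(const std::vector<FaceBox>& boxes,
                  uint32_t frameW, uint32_t frameH,
                  float minAreaFraction, uint8_t minScore) {
    assert(frameW > 0 && frameH > 0);
    const float frameArea = static_cast<float>(frameW) * static_cast<float>(frameH);
    for (const FaceBox& b : boxes) {
        if (b.score < minScore) {
            continue;  // own thresholding: ignore low-confidence detections
        }
        const float boxArea = static_cast<float>(b.w) * static_cast<float>(b.h);
        if (boxArea / frameArea >= minAreaFraction) {
            return true;  // large bounding box, so the face is close
        }
    }
    return false;
}
```

On the ESP32-lite you could copy the values from `detector.boxes()` into such a structure after each successful `invoke()` call and, for instance, switch a GPIO pin when `anyFaceClose` returns true.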
Conclusions and Comments
In this tutorial you learned how to deploy and run a real-time face detection model on the XIAO ESP32-S3-Sense using the SenseCraft AI platform.
The SenseCraft AI platform makes it very easy to integrate a pre-trained AI model in your TinyML projects. You don’t need to write any code to deploy or run the model.
The web interface allows you to configure different output options (GPIO, Serial, MQTT), again without any coding. But the Serial interface enables you to write your own code to process detection results further, if needed. However, you will need to connect a second microcontroller for that.
A disadvantage of the SenseCraft AI platform is that you essentially cannot extend the code for the deployed model. All further processing code needs to run on a second microcontroller. On the other hand, in most cases the computational load of the model is so high that you are better off with a separate microcontroller anyway.
Let’s say you want to build your own Face Tracking Gimbal. It would make sense to run the face detection on one microcontroller and the servo control on another.
If you want to learn more about the XIAO ESP32-S3-Sense have a look at our Getting started with XIAO-ESP32-S3-Sense tutorial. And Seeed Studio has a lot more information in its Getting Started Wiki.
Common applications of microcontroller boards with a camera are streaming and surveillance. If that is what you are after, the Stream Video with XIAO-ESP32-S3-Sense and the Surveillance Camera with ESP32-CAM tutorials may be useful.
Finally, if you want to detect people instead of faces, have a look at the Edge AI Room Occupancy Sensor with ESP32 and Person Detection tutorial.
If you have any questions feel free to leave them in the comment section.
Happy Tinkering 😉
Stefan is a professional software developer and researcher. He has worked in robotics, bioinformatics, image/audio processing and education at Siemens, IBM and Google. He specializes in AI and machine learning and has a keen interest in DIY projects involving Arduino and 3D printing.

