Edge AI Room Occupancy Sensor with ESP32 and Person Detection

In this tutorial we will use Edge AI to count the people in a room (occupancy) with a XIAO ESP32-S3-Sense and an ESP32-lite. If you haven’t used the XIAO ESP32-S3-Sense before, have a look at the Getting started with XIAO-ESP32-S3-Sense tutorial first.

We will deploy a person detection model on the XIAO ESP32-S3-Sense using the SenseCraft AI platform. The model will run on the ESP32 without the need to interact with a server to perform the person detection or the person counting.

A room occupancy sensor has a wide range of practical applications in both DIY and professional settings. In smart homes, it can adjust lighting, heating, and security systems based on presence. In offices, schools, and commercial spaces, it helps monitor room usage, manage energy efficiency, and improve space utilization. Retail stores and public areas can use it for people counting, queue management, and safety compliance.

Let’s get started!

Required Parts

You will need a XIAO ESP32-S3-Sense board and a second ESP32. Note that the XIAO ESP32-S3-Sense can get very hot under high computational load. I recommend attaching a small heatsink to the back of the board (see the listed part below).

For the second ESP32, I picked an older ESP32 lite but any other ESP32 or an Arduino will work just fine as well.

Seeed Studio XIAO ESP32-S3-Sense

ESP32-lite (Lolin32)

OLED display

USB-C cable

Small heatsink 9×9 mm

Dupont wire set

Breadboard

Architecture of the Occupancy Sensor

Our occupancy sensor system consists of the following parts: a XIAO ESP32-S3-Sense with a camera that runs a person detection model, an ESP32-lite that analyzes the detections to count people, and an OLED that displays the person count.

Architecture of Room Occupancy Sensor System

The two ESP32s and the OLED communicate via I2C. We will program the ESP32-lite using the Arduino IDE, while the person detection model is deployed to the XIAO ESP32-S3-Sense from the SenseCraft AI platform. If you want to learn more about the XIAO ESP32-S3-Sense, see the Getting started with XIAO-ESP32-S3-Sense tutorial.

We need the second ESP32 because we essentially cannot (easily) run other code on the XIAO ESP32-S3-Sense apart from the person detection model. The advantage is that you don’t need to write any code for the person detection and don’t have to worry about additional compute on the XIAO ESP32-S3-Sense. The disadvantage is that you don’t have full control over the person detection model (and you need the second ESP32).

The following photo shows the system built on a breadboard with a powerbank as power supply:

Room Occupancy Sensor System on Breadboard

As you can see, I attached the breadboard to a little holder so that the camera is facing forward. In the next section we deploy the Person Detection model to the XIAO ESP32S3 Sense.

Deploy Person Detection Model

Go to the SenseCraft model selection page at https://sensecraft.seeed.cc/ai/model. In the “Pretrained Models” tab, use the sidebar to select “Detection” under Task and “XIAO ESP32S3 Sense” under Supported Devices. Then enter “Person” in the search bar to filter for person detection models:

Model Selection

As of July 2025 there is only one match. Click on the Person Detection–Swift YOLO model (marked yellow) on the right.

This will open a new page with a description of the Detection model and a green button labeled “Deploy Model” on the right-hand side:

Description of selected Person Detection Model

Connect your XIAO ESP32-S3-Sense Board via USB to your computer and then click on “Deploy Model”. Follow the steps. If you need more help see the Face Detection with XIAO ESP32-S3-Sense and SenseCraft AI tutorial, which describes the deployment process in more detail.

Try the Person Detection Model

Once the model is deployed it starts automatically and sends images with bounding boxes around detected persons to your web browser, where you can monitor a live stream in the Preview section. In the upper right corner of the Preview section is a “Stop” button to stop the running model. The picture below shows the Preview window with person detections:

Preview of detected persons

To test the model, I simply placed the following picture with persons in front of the camera.

Picture of people in a room (source)

You can see that the model detects most of the persons in the picture but it is not perfect. There are two Settings (Confidence and IoU threshold) under the Preview window, which you can adjust to improve the detection accuracy.

If the system detects too few people, lower the confidence threshold. If it detects other things as persons, increase the threshold.

If the same person is covered by multiple detection boxes, lower the IoU threshold. If, on the other hand, people standing close together are merged into a single detection, increase the IoU threshold. For more details see the Face Detection with XIAO ESP32-S3-Sense and SenseCraft AI tutorial.
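To make the IoU setting a bit more concrete, here is a minimal sketch of how Intersection over Union is typically computed for two boxes. The Box struct with a top-left corner is a hypothetical type for illustration only; it is not taken from SenseCraft, and I am assuming the platform uses IoU in the usual way, where two boxes that overlap by more than the threshold are treated as the same detection.

```cpp
#include <algorithm>

// Hypothetical box type for illustration: top-left corner plus size in pixels.
struct Box {
  int x, y, w, h;
};

// Intersection over Union: overlap area divided by combined area (0.0 to 1.0).
double iou(const Box &a, const Box &b) {
  // Corners of the overlapping rectangle (if there is one)
  int left   = std::max(a.x, b.x);
  int top    = std::max(a.y, b.y);
  int right  = std::min(a.x + a.w, b.x + b.w);
  int bottom = std::min(a.y + a.h, b.y + b.h);

  // Clamp to zero when the boxes do not overlap
  int interW = std::max(0, right - left);
  int interH = std::max(0, bottom - top);

  double inter = static_cast<double>(interW) * interH;
  double uni = static_cast<double>(a.w) * a.h +
               static_cast<double>(b.w) * b.h - inter;
  return uni > 0.0 ? inter / uni : 0.0;
}
```

Two 10×10 boxes shifted by half their width share a third of their combined area, so their IoU is about 0.33. Under the assumption above, they would be merged with a threshold of 0.3 and kept as two separate detections with a threshold of 0.5.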

In the next section we will write the code to count the people in the room. This code will run on the ESP32-lite.

Connecting ESP32-S3-Sense to ESP32-lite

As mentioned above, the person detection model runs on the XIAO ESP32-S3-Sense board. We can communicate with the XIAO ESP32-S3-Sense via I2C. The following wiring diagram shows you how to connect the XIAO ESP32-S3-Sense to the ESP32-lite:

Connecting XIAO ESP32-S3-Sense to ESP32-lite via I2C

The I2C pins on the ESP32-lite are SDA=GPIO19 and SCL=GPIO23. The corresponding pins on the XIAO ESP32-S3-Sense are SDA=D4/GPIO5 and SCL=D5/GPIO6. Note that the wiring diagram above shows the back of the boards, where the digital pins are labeled (D0…D10).

Signal   ESP32-lite   XIAO ESP32-S3-Sense
SDA      GPIO19       D4 / GPIO5
SCL      GPIO23       D5 / GPIO6

If you use a different ESP32 or an Arduino, the hardware I2C pins will differ. Make sure you are connecting to the correct pins, otherwise the communication will fail.

In the next section we are going to test this connection by retrieving detections from the XIAO ESP32-S3-Sense and printing them on the ESP32-lite.

Code for Serial Communication

For the following code to work you will first need to install the Seeed_Arduino_SSCMA library. Open the LIBRARY MANAGER, search for “Seeed_Arduino_SSCMA” and click the INSTALL button:

Installing Seeed_Arduino_SSCMA library

Next connect the ESP32-lite to your computer via USB and flash the following code to it:

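#include <Wire.h>
#include <Seeed_Arduino_SSCMA.h>

SSCMA detector;  // client for the model running on the XIAO

void setup() {
  Serial.begin(115200);
  Wire.begin();           // join the I2C bus on the default pins
  detector.begin(&Wire);  // talk to the XIAO via I2C
  Serial.println("running...");
}

void loop() {
  // invoke() returns 0 on success, hence the negation
  if (!detector.invoke(1, false, false)) {
    // print the score and bounding box of every detection
    for (int i = 0; i < detector.boxes().size(); i++) {
      boxes_t &b = detector.boxes()[i];
      Serial.printf("Box[%d] conf=%d [%3d %3d %3d %3d]\n",
                    i, b.score, b.x, b.y, b.w, b.h);
    }
  }
}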

This code invokes the person detector running on the XIAO ESP32-S3-Sense and prints the confidence scores and detection boxes for any detected person to the Serial Monitor. See the Face Detection with XIAO ESP32-S3-Sense and SenseCraft AI tutorial if you have any questions about this code.

If everything is correctly wired up and working you should see the detection results printed to the Serial Monitor as follows:

Person detection results on Serial Monitor

In the next section we write the code to count the number of people.

Code for Room Occupancy

Once we have bounding boxes for detected persons, calculating the room occupancy is simple. We just need to count the number of detection boxes.

But it would be nice to see this number on a display to make the system independent of a USB connection to a PC. We are therefore adding a small OLED to the circuit to display the number of people in the room. The picture below shows the wiring diagram:

Connecting OLED to ESP32-lite

Connect the power pins of the ESP32-lite (3.3V and G) to the OLED, and then connect SCL and SDA in parallel to the existing I2C connection.

The following code runs on the ESP32-lite. It communicates with the person detector on the ESP32-S3-Sense via I2C. It invokes the detector, retrieves the number n of the detection boxes and displays that number on the OLED:

#include <Wire.h>
#include <Seeed_Arduino_SSCMA.h>
#include <Adafruit_SSD1306.h>

SSCMA detector;                             // person detector on the XIAO
Adafruit_SSD1306 oled(128, 64, &Wire, -1);  // 128x64 OLED, no reset pin

void setup() {
  Wire.begin();
  detector.begin(&Wire);

  oled.begin(SSD1306_SWITCHCAPVCC, 0x3C);  // 0x3C = I2C address of the OLED
  oled.setRotation(3);                     // rotate the display by 270 degrees
  oled.setTextSize(6);
  oled.setTextColor(WHITE);
}

void loop() {
  // invoke() returns 0 on success
  if (!detector.invoke(1, false, false)) {
    int n = detector.boxes().size();  // one box per detected person
    oled.clearDisplay();
    oled.setCursor(10, 20);
    oled.printf("%d", n);
    oled.display();
  }
}

Note that the code uses the Adafruit_SSD1306 library, which you can install via the Library Manager as usual. Let’s have a closer look at the code.

Libraries

First we include the Wire library for I2C communication, the Seeed_Arduino_SSCMA library to communicate with the person detection model, and the Adafruit_SSD1306 library to control the OLED.

#include <Wire.h>
#include <Seeed_Arduino_SSCMA.h>
#include <Adafruit_SSD1306.h>

Objects

Next we create the detector object and the oled object:

SSCMA detector;
Adafruit_SSD1306 oled(128, 64, &Wire, -1);

Setup

In the setup function we initialize the detector and the OLED:

void setup() {
  Wire.begin();
  detector.begin(&Wire);

  oled.begin(SSD1306_SWITCHCAPVCC, 0x3C);
  oled.setRotation(3);
  oled.setTextSize(6);
  oled.setTextColor(WHITE);
}

Note that the I2C address of the OLED I am using is 0x3C. If you have an OLED with a different address you will have to change the code here.
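If you are not sure which address your OLED uses, you can scan the I2C bus. The following is a minimal sketch of such a scan, not part of the project code. The function scanI2C and its probe parameter are names I made up for illustration; the probe is passed in so that the scan logic does not depend on any specific hardware.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Tries every 7-bit address in the usual scan range (0x08 to 0x77) and
// returns those for which probe(addr) reports a responding device.
//
// On the ESP32-lite the probe could be, for example:
//   [](uint8_t a) { Wire.beginTransmission(a); return Wire.endTransmission() == 0; }
std::vector<uint8_t> scanI2C(const std::function<bool(uint8_t)> &probe) {
  std::vector<uint8_t> found;
  for (uint8_t addr = 0x08; addr <= 0x77; addr++) {
    if (probe(addr)) {
      found.push_back(addr);
    }
  }
  return found;
}
```

With everything wired up, the returned list should contain one address for the OLED and one for the XIAO ESP32-S3-Sense. If the list is empty, check the SDA and SCL wiring first.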

Loop

Finally, we have the loop function, where we invoke the detection model, retrieve the number of bounding boxes and display that number as a person count on the OLED:

void loop() {
  if (!detector.invoke(1, false, false)) {
    int n = detector.boxes().size();
    oled.clearDisplay();  
    oled.setCursor(10, 20);
    oled.printf("%d", n);  
    oled.display();
  }
}
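As an optional refinement, you could count only detections above a minimum score instead of all boxes. The following is a minimal sketch of that idea. The Detection struct and the countPersons function are hypothetical; the field names mirror those printed in the Serial Monitor example (score, x, y, w, h), and I am assuming the score is reported on a 0 to 100 scale.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one detection box; the field names mirror
// those used in the Serial Monitor example (score, x, y, w, h).
struct Detection {
  uint16_t x, y, w, h;
  uint8_t score;  // assumed to be on a 0 to 100 scale
};

// Counts only the detections whose score reaches minScore.
int countPersons(const std::vector<Detection> &boxes, uint8_t minScore) {
  int n = 0;
  for (const Detection &b : boxes) {
    if (b.score >= minScore) {
      n++;
    }
  }
  return n;
}
```

In the real sketch you would loop over detector.boxes() in the same way and compare each box's score against a threshold of your choice.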

If you watch the person detector, you will notice that the detections are not stable. Bounding boxes appear and disappear, even for a static image. In an actual, live scenario, where people move within a room, occlusions and changes in lighting will add further variability. The following three examples demonstrate the variability in detection even for a fixed image:

Unstable detection of persons

Stabilizing Person Count

To make the person count a bit more stable you can compute a rolling average. The following function averages three consecutive person counts n, and therefore stabilizes the count a bit:

int average3(int n) {
  static int prev1 = 0, prev2 = 0;  // the two previous counts

  int avg = (n + prev1 + prev2 + 1) / 3;  // +1 rounds to the nearest integer
  prev2 = prev1;
  prev1 = n;

  return avg;
}

In the loop function you call it as follows:

void loop() {
  if (!detector.invoke(1, false, false)) {
    int n = detector.boxes().size();
    n = average3(n);
    ...
    oled.printf("%d", n);  
    oled.display();
  }
}

The tradeoff is, of course, that the person counter takes longer to react to changes in the number of people in the room. But that is fine if you want to adjust the room heating, ventilation, lighting or music volume depending on the number of people. Rapid changes to these conditions are usually not needed.
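If you want to experiment with this tradeoff, you could generalize the averaging to a configurable window size. The following is a minimal sketch of that idea; the RollingAverage class is a hypothetical helper, not part of any library used in this tutorial.

```cpp
#include <cstddef>

// Rolling average over the last N person counts.
// N is the window size: larger values are more stable but react more slowly.
template <std::size_t N>
class RollingAverage {
 public:
  // Adds a new count and returns the rounded average of the last N counts.
  int update(int n) {
    sum_ += n - history_[next_];  // swap the oldest value for the new one
    history_[next_] = n;
    next_ = (next_ + 1) % N;
    // Adding N/2 before dividing rounds to the nearest integer
    return (sum_ + static_cast<int>(N) / 2) / static_cast<int>(N);
  }

 private:
  int history_[N] = {};  // starts at zero, like prev1 and prev2 in average3
  std::size_t next_ = 0;
  int sum_ = 0;
};
```

With a window of 3, a jump from 0 to 3 people shows up as 1, 2, 3 over three frames. With a window of 5, a jump from 0 to 5 people takes five frames (1, 2, 3, 4, 5) to be fully reflected. You would create the object once as a global, for example RollingAverage<5> avg; and call n = avg.update(n); in the loop function.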

And that’s it!

Conclusions and Comments

In this tutorial we built a Room Occupancy Sensor. Our sensor operates as an Edge AI device and does not need to be connected to a server. The person detection and the computation of person counts occur directly on the device.

The SenseCraft AI platform makes it very simple to deploy a model on the XIAO ESP32-S3-Sense, but the disadvantage is that you need a second microcontroller for any further processing. Also, your control over the model is limited, and I have not found a way to retrieve both the person detections and the video stream.

However, if you stream the video to a server anyway, you could run the person detector there as well. Have a look at our Object Detection with ESP32-CAM and YOLO and the Stream Video with XIAO-ESP32-S3-Sense tutorials.

Due to limited computational power, you can only run comparatively small AI models on a microcontroller. The accuracy of the person detection, and consequently of the room occupancy measurements, is not as good as that of a larger model running on a server with a GPU, for instance.

On the other hand, if privacy is a concern, Wi-Fi communication is shaky, or you want to run on battery power, an Edge AI solution is better. And while the accuracy of the model is not great, it is good enough to regulate a room’s heating or ventilation.

If you have any questions feel free to leave them in the comment section.

Happy Tinkering 😉