In this tutorial you will learn how to build a motion-activated surveillance camera system with the ESP32-CAM. The system records video whenever motion is detected and saves the video stream to a file on your computer. This project is suitable for surveillance, security, or wildlife monitoring applications.
Required Parts
Below you will find the components required to build the project. You will need an ESP32-CAM and the USB-TTL Shield or the FTDI USB-TTL Adapter for programming the board.
I listed two different types of motion sensors. If you want a small size and a shorter activation interval, go with the AM312. But if you want to activate the camera only at night, go with the HC-SR501, since it can easily be equipped with a light sensor. For more details, see the Motion Activated ESP32-CAM tutorial.

ESP32-CAM with USB-TTL Shield

FTDI USB-TTL Adapter

Dupont Wire Set

AM312 PIR Motion Sensor

HC-SR501 PIR Motion Sensor

USB Data Cable
System Architecture
In this project we are going to implement a video surveillance system that uses the ESP32-CAM, a motion sensor and a PC to receive and store surveillance videos.
The ESP32-CAM will run a video streaming server that transmits video frames via Wi-Fi whenever a PIR sensor connected to the ESP32-CAM detects motion. The video frames will be received by a Python application running on a PC that writes the video data to a file. The picture below illustrates the setup:

Motion Detection Circuit
We start by building the motion detection circuit. This involves connecting the Passive Infrared Sensor (PIR) to the ESP32-CAM. In this example, I will use the AM312 PIR Sensor but connecting the HC-SR501 PIR sensor would be pretty much the same. See the Motion Activated ESP32-CAM tutorial.
Connecting the PIR Sensor to ESP32-CAM
Connecting the AM312 PIR Sensor Module to the ESP32-CAM is easy. Connect a power supply (battery) with 5V (up to 12V) to the 5V and GND pins of the ESP32-CAM as shown below (red and blue wires).

You can power the board with anything from 5V up to 12V. Don’t worry about the higher voltage: a voltage regulator connected to the 5V pin reduces the input voltage to the level required by the board.
Similarly, connect the AM312 PIR Sensor Module to the power supply (red and blue wires). It also can handle up to 12V but watch out for correct polarity, since the AM312 has no polarity protection. Then connect the output S or OUT of the AM312 to the GPIO13 pin of the ESP32-CAM (green wire).
If you use the HC-SR501 PIR Sensor Module instead of the AM312, connect it the same way (GND to GND, VCC to 5V-12V).
Circuit on Breadboard
If you program the ESP32-CAM via the USB-TTL Shield and want to try the circuit out on a breadboard then you can power the ESP32-CAM and the AM312 sensor as shown below:

You will have to make a small gap between the ESP32-CAM and the Programming Shield and then connect the wires to the exposed pins in the gap:

The connections are as before. The GND pin is connected to the ‘-’ pin, the 5V pin is connected to the ‘+’ pin, and GPIO13 is connected to the ‘S’ pin of the AM312.
Code for Testing the PIR Sensor
Before you implement the complete code for the video streaming server, let’s test the PIR sensor and the circuit. Upload the following code to your ESP32-CAM:
void setup() {
  Serial.begin(115200);
  pinMode(GPIO_NUM_13, INPUT);
}

void loop() {
  bool isMotion = digitalRead(GPIO_NUM_13);
  Serial.println(isMotion ? "ON" : "OFF");
  delay(1000);
}
If you open the Serial Monitor, you should see the text “ON” printed when the PIR sensor detects motion and “OFF” otherwise. Should this not be the case, check the circuit. You may also find useful tips in the Motion Activated ESP32-CAM tutorial.
Video Streaming
In this section we implement the Video Streaming Server on the ESP32-CAM module that transmits video frames if the PIR sensor detects motion. Have a quick look at the complete code below, and then we will discuss its details.
#include "WebServer.h"
#include "WiFi.h"
#include "esp32cam.h"

const char* WIFI_SSID = "SSID";
const char* WIFI_PASS = "PASSWORD";
const char* URL = "/video";
const auto RESOLUTION = esp32cam::Resolution::find(800, 600);
const int FRAMERATE = 10;
const byte pinPIR = GPIO_NUM_13;
const byte pinFlash = GPIO_NUM_4;

WebServer server(80);

bool isMotion() {
  return digitalRead(pinPIR);
}

void handleStream() {
  static char head[128];
  WiFiClient client = server.client();
  server.sendContent("HTTP/1.1 200 OK\r\n"
                     "Content-Type: multipart/x-mixed-replace; "
                     "boundary=frame\r\n\r\n");
  while (client.connected()) {
    if (isMotion()) {
      analogWrite(pinFlash, 20);
      auto frame = esp32cam::capture();
      if (frame) {
        // %u expects an unsigned int, so the frame size is cast accordingly
        sprintf(head,
                "--frame\r\n"
                "Content-Type: image/jpeg\r\n"
                "Content-Length: %u\r\n\r\n",
                (unsigned)frame->size());
        client.write(head, strlen(head));
        frame->writeTo(client);
        client.write("\r\n");
        delay(1000 / FRAMERATE);
      }
    } else {
      analogWrite(pinFlash, 0);
    }
  }
  analogWrite(pinFlash, 0);
}

void initCamera() {
  using namespace esp32cam;
  Config cfg;
  cfg.setPins(pins::AiThinker);
  cfg.setResolution(RESOLUTION);
  cfg.setBufferCount(2);
  cfg.setJpeg(80);
  Camera.begin(cfg);
}

void initWifi() {
  WiFi.persistent(false);
  WiFi.mode(WIFI_STA);
  WiFi.begin(WIFI_SSID, WIFI_PASS);
  while (WiFi.status() != WL_CONNECTED) {
    delay(100);
  }
  Serial.printf("Stream at: http://%s%s\n",
                WiFi.localIP().toString().c_str(), URL);
}

void initServer() {
  server.on(URL, handleStream);
  server.begin();
}

void setup() {
  Serial.begin(115200);
  pinMode(pinPIR, INPUT);
  pinMode(pinFlash, OUTPUT);
  initWifi();
  initCamera();
  initServer();
}

void loop() {
  server.handleClient();
}
Libraries
We begin by including the required libraries:
#include "WebServer.h"
#include "WiFi.h"
#include "esp32cam.h"
The WebServer.h library is used to handle HTTP server functionalities, and the WiFi.h library enables the ESP32 to connect to a wireless network. The esp32cam library provides an easy interface for configuring and using the camera module on the ESP32-CAM board. You will have to install it via the Library Manager:

Constants
Next, we define several constants for configuration:
const char* WIFI_SSID = "SSID";
const char* WIFI_PASS = "PASSWORD";
const char* URL = "/video";
const auto RESOLUTION = esp32cam::Resolution::find(800, 600);
const int FRAMERATE = 10;
const byte pinPIR = GPIO_NUM_13;
const byte pinFlash = GPIO_NUM_4;
The WIFI_SSID and WIFI_PASS variables store the credentials for the Wi-Fi network. You will have to replace the dummy values SSID and PASSWORD with the credentials for your Wi-Fi network.
The URL variable specifies the path on the server that clients will access to view the video stream. You can change it to a different name, e.g. “/video-frontdoor”, but make sure that the name in the server (ESP32-CAM) and the client (Python app) match.
The RESOLUTION variable sets the camera resolution to 800 by 600 pixels. Again, that is something you can change, but the resolution in the server and client must be identical.
You can also set the FRAMERATE, which defines the number of frames per second the server will attempt to stream. As before server and client framerates should match.
Finally, the pinPIR and pinFlash variables refer to the GPIO pins connected to the PIR motion sensor and the onboard flash LED, respectively.
Objects
We next create a web server that listens on port 80:
WebServer server(80);
isMotion function
The function isMotion() checks for motion detection:
bool isMotion() {
return digitalRead(pinPIR);
}
This function reads the digital signal from the PIR sensor. If the sensor outputs a high signal, motion is present, and the function returns true. Keep in mind that the PIR sensor holds its output high for a certain activation time after it detects motion; on the HC-SR501 this time can be adjusted.
handleStream function
The handleStream() function is responsible for handling the video stream when a client accesses the /video URL.
We begin by declaring a buffer named head that will hold the HTTP headers for each frame. Next, we retrieve the client object associated with the current HTTP request:
void handleStream() {
static char head[128];
WiFiClient client = server.client();
The server responds to the client with an HTTP header indicating that it will send a series of JPEG images separated by boundaries marked as --frame. This technique is commonly known as MJPEG (Motion JPEG) streaming.
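To make the wire format concrete, here is a minimal Python sketch (not part of the project code) of what a single part of such a multipart/x-mixed-replace response looks like. The names BOUNDARY and build_part are hypothetical and chosen only for illustration; the header lines mirror the ones the ESP32-CAM sketch sends for each frame.

```python
BOUNDARY = b"frame"  # must equal the boundary announced in the response header


def build_part(jpeg: bytes) -> bytes:
    """Return one multipart part: boundary line, part headers, empty line, JPEG data, CRLF."""
    lines = [
        b"--" + BOUNDARY,
        b"Content-Type: image/jpeg",
        b"Content-Length: " + str(len(jpeg)).encode("ascii"),
        b"",  # the empty line ends the part headers
    ]
    return b"\r\n".join(lines) + b"\r\n" + jpeg + b"\r\n"


# A tiny fake "JPEG" consisting only of start and end markers around a payload
print(build_part(b"\xff\xd8abc\xff\xd9"))
```

A client can then split the stream either at the boundary lines or, as the Python app in this tutorial does, by looking for the JPEG start and end markers.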
server.sendContent("HTTP/1.1 200 OK\r\n"
"Content-Type: multipart/x-mixed-replace; "
"boundary=frame\r\n\r\n");
As long as the client remains connected, the loop continues. Inside the loop, we call the isMotion() function to determine if motion is detected:
while (client.connected()) {
if (isMotion()) {
analogWrite(pinFlash, 20);
If it is, we switch the flash LED on at low brightness using PWM by writing a duty cycle of 20 to pinFlash. Be careful with the brightness of the flash. If you set the flash to full brightness (255) for a longer period of time the LED might burn out!
We then try to capture a frame from the camera using esp32cam::capture():
auto frame = esp32cam::capture();
If a frame is successfully captured, we proceed to prepare the HTTP headers for the frame:
if (frame) {
  sprintf(head,
          "--frame\r\n"
          "Content-Type: image/jpeg\r\n"
          "Content-Length: %u\r\n\r\n",
          (unsigned)frame->size());
  client.write(head, strlen(head));
  frame->writeTo(client);
  client.write("\r\n");
  delay(1000 / FRAMERATE);
}
The sprintf() function formats a string containing the boundary marker, content type, and content length of the image. This header is written to the client using client.write(). The actual JPEG image is sent using frame->writeTo(client), followed by a carriage return and newline. A short delay is introduced based on the desired frame rate to control the speed of streaming.
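Note that the delay is added on top of the time needed to capture and transmit a frame, so the real frame rate will typically be somewhat lower than FRAMERATE. The following Python sketch illustrates the arithmetic; the function name effective_fps and the 50 ms of per-frame work are assumptions for illustration, not measured values.

```python
def effective_fps(target_fps: int, work_ms: float) -> float:
    """Frame rate of a loop that spends work_ms on a frame and then waits 1000 // target_fps ms."""
    delay_ms = 1000 // target_fps  # integer division, like 1000 / FRAMERATE in the sketch
    return 1000.0 / (delay_ms + work_ms)


print(effective_fps(10, 0))   # 10.0 if capturing and sending took no time at all
print(effective_fps(10, 50))  # about 6.67 with an assumed 50 ms of work per frame
```

If the recorded video plays back too fast, lowering the frame rate on the client side is one way to compensate.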
} else {
analogWrite(pinFlash, 0);
}
}
analogWrite(pinFlash, 0);
}
If no motion is detected, we turn the flash LED off. After the loop exits, which would happen if the client disconnects, we turn the flash off again to ensure it is not left on.
initCamera function
The initCamera() function initializes the ESP32-CAM hardware:
void initCamera() {
using namespace esp32cam;
Config cfg;
cfg.setPins(pins::AiThinker);
cfg.setResolution(RESOLUTION);
cfg.setBufferCount(2);
cfg.setJpeg(80);
Camera.begin(cfg);
}
Inside this function, a Config object named cfg is created. The setPins() method is called with pins::AiThinker to configure the GPIO pins for the AI Thinker model of the ESP32-CAM. You can use the code for other camera boards as well by picking a different pin configuration. See the Stream Video with XIAO-ESP32-S3-Sense and the Stream Video with ESP32-WROVER CAM tutorials, for instance.
The resolution is set using the value defined earlier. The buffer count is set to 2, allowing for double buffering. JPEG compression quality is set to 80%. Finally, the camera is started with Camera.begin(cfg).
initWifi function
The Wi-Fi initialization occurs in the initWifi() function:
void initWifi() {
WiFi.persistent(false);
WiFi.mode(WIFI_STA);
WiFi.begin(WIFI_SSID, WIFI_PASS);
while (WiFi.status() != WL_CONNECTED) {
delay(100);
}
Serial.printf("Stream at: http://%s%s\n",
WiFi.localIP().toString().c_str(), URL);
}
Wi-Fi persistence is disabled to avoid storing credentials in flash. The ESP32 is set to station mode and begins connecting to the specified Wi-Fi network. The loop waits until a connection is established. Once connected, the device prints the stream URL to the serial monitor:
Stream at: http://192.168.2.42/video
You will need to use this URL in the client application (Python app) that records the video stream to a file.
Note that you can test the video streaming by pasting this URL into the address bar of your web browser. But make sure that you use either the client app or the web browser, not both, since only one client can connect to the stream at a time.
initServer function
The HTTP server is set up in the initServer() function:
void initServer() {
server.on(URL, handleStream);
server.begin();
}
This function registers the /video endpoint with the handleStream() function and starts the server.
setup function
The setup() function prepares the ESP32-CAM for operation:
void setup() {
Serial.begin(115200);
pinMode(pinPIR, INPUT);
pinMode(pinFlash, OUTPUT);
initWifi();
initCamera();
initServer();
}
Serial communication is initialized at a baud rate of 115200. The PIR sensor pin is configured as an input, and the flash LED pin is set as an output. The Wi-Fi connection, camera configuration, and web server are all initialized in order.
loop function
Finally, the loop() function contains the main runtime logic:
void loop() {
server.handleClient();
}
This function continuously calls server.handleClient() to handle any incoming HTTP requests from clients.
And that is the server side of things. In the next section we implement the client that receives the video stream and saves it to a file.
Video Recording
The client is implemented as a Python script that connects to the MJPEG stream provided by an ESP32-CAM and writes the incoming video to a local .avi file. The recording is automatically segmented into daily files—one per calendar day. The script also optionally displays the live stream in a window.
Below is the complete code. Have a quick look first and then we dive into the details:
import cv2
import requests
import numpy as np
import datetime
import os

# Replace with your ESP32-CAM stream URL
stream_url = 'http://192.168.2.42/video'

# match your ESP32-CAM setting
frame_width = 800
frame_height = 600
fps = 10
fourcc = cv2.VideoWriter_fourcc(*'XVID')

def get_output_filename():
    date_str = datetime.datetime.now().strftime("%Y-%m-%d")
    return f'recording_{date_str}.avi'

# Initialize variables
current_date = datetime.date.today()
output_file = get_output_filename()
out = cv2.VideoWriter(output_file, fourcc, fps, (frame_width, frame_height))

# Connect to MJPEG stream
print(f"Connecting to {stream_url}")
stream = requests.get(stream_url, stream=True)
if stream.status_code != 200:
    print(f"Failed to connect to ESP32-CAM. Status code: {stream.status_code}")
    exit()

bytes_buffer = b''
try:
    for chunk in stream.iter_content(chunk_size=1024):
        bytes_buffer += chunk
        a = bytes_buffer.find(b'\xff\xd8')  # JPEG start
        b = bytes_buffer.find(b'\xff\xd9')  # JPEG end
        if a != -1 and b != -1 and b > a:
            jpg = bytes_buffer[a:b+2]
            bytes_buffer = bytes_buffer[b+2:]
            # Decode the JPEG image to OpenCV format
            img_array = np.frombuffer(jpg, dtype=np.uint8)
            frame = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
            if frame is not None:
                new_date = datetime.date.today()
                if new_date != current_date:
                    out.release()
                    current_date = new_date
                    output_file = get_output_filename()
                    out = cv2.VideoWriter(output_file, fourcc, fps, (frame_width, frame_height))
                    print(f"Started new file: {output_file}")
                out.write(frame)
                # Optional: Show live video
                cv2.imshow('ESP32-CAM Stream', frame)
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
except KeyboardInterrupt:
    pass
finally:
    out.release()
    cv2.destroyAllWindows()
    print(f"Video saved to {output_file}")
Libraries
The script starts by importing necessary libraries:
import cv2
import requests
import numpy as np
import datetime
import os
The cv2 module (OpenCV) is used for handling image and video operations. The requests library allows the script to establish a connection to the ESP32-CAM’s HTTP MJPEG stream. The numpy library is used to convert raw byte data into an image array. The datetime module manages time and date, particularly for organizing daily video files. Finally, os is imported for potential file path manipulations, although it is not used in this particular script.
You will have to install the cv2, requests and numpy libraries on your computer, preferably in a virtual environment. See the Object Detection with ESP32-CAM and YOLO for an example with more details.
Constants
The IP address of the ESP32-CAM is defined in the stream_url variable:
stream_url = 'http://192.168.2.42/video'
This URL corresponds to the endpoint defined in the Arduino sketch (specifically const char* URL = "/video"). The ESP32-CAM must be accessible at this IP address for the script to work.
The following constants define the camera resolution and recording settings:
frame_width = 800
frame_height = 600
fps = 10
fourcc = cv2.VideoWriter_fourcc(*'XVID')
The frame_width and frame_height values must match the resolution set in the ESP32-CAM (esp32cam::Resolution::find(800, 600)). The fps variable specifies the frame rate, which should match the ESP32-CAM’s FRAMERATE setting. The fourcc variable defines the video codec format; here, XVID is used, which is a common codec for .avi files.
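Why the frame rates should match becomes clearer with a bit of arithmetic: the video file simply plays the stored frames back at the configured fps, regardless of how fast they actually arrived. The sketch below is only an illustration; the function names and the example numbers are assumptions.

```python
def playback_seconds(frame_count: int, writer_fps: float) -> float:
    """Length of the recorded video when it is played back at writer_fps."""
    return frame_count / writer_fps


def speed_factor(writer_fps: float, arrival_fps: float) -> float:
    """How much faster (>1) or slower (<1) than real time the recording plays."""
    return writer_fps / arrival_fps


print(playback_seconds(300, 10))  # 30.0 seconds of video for 300 frames
print(speed_factor(10, 5))        # 2.0, i.e. twice real-time speed
```

Keep in mind that frames only arrive while motion is detected, so the recording is a concatenation of the motion periods rather than a continuous timeline.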
get_output_filename function
Next we define a helper function to generate filenames based on the current date:
def get_output_filename():
    date_str = datetime.datetime.now().strftime("%Y-%m-%d")
    return f'recording_{date_str}.avi'
This function uses the datetime module to create a string in the format YYYY-MM-DD, which is then used to name the output file, e.g. recording_2025-06-01.avi. Each recording is therefore saved to a file named after the date it was recorded, which also limits the file size. Otherwise, you may end up with a massive video file if the camera is activated frequently.
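If you want to try out the naming scheme without waiting for a particular day, a variant that takes the date as a parameter is handy. This is a small sketch and not part of the original script; the function name output_filename and the prefix parameter are hypothetical.

```python
import datetime


def output_filename(day: datetime.date, prefix: str = "recording") -> str:
    """Build the name of the video file for the given calendar day."""
    return f"{prefix}_{day.strftime('%Y-%m-%d')}.avi"


print(output_filename(datetime.date(2025, 6, 1)))  # recording_2025-06-01.avi
print(output_filename(datetime.date.today()))      # file name for today
```

Passing the date in also means that the file name and any date comparison can be based on the very same date object, which avoids a mismatch right around midnight.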
Variables
The script then initializes variables to manage daily video segmentation:
current_date = datetime.date.today()
output_file = get_output_filename()
out = cv2.VideoWriter(output_file, fourcc, fps, (frame_width, frame_height))
The current_date variable stores today’s date. The output_file variable holds the filename generated by get_output_filename(). The out object is an OpenCV VideoWriter, which is used to write individual frames to the .avi file.
Connect to stream
Next, the script connects to the video stream from the ESP32-CAM:
print(f"Connecting to {stream_url}")
stream = requests.get(stream_url, stream=True)
A GET request is sent to the ESP32-CAM. The stream=True parameter ensures the response is treated as a streaming response, allowing data to be received in chunks.
Check status
Immediately after the connection attempt, we check the response status:
if stream.status_code != 200:
    print(f"Failed to connect to ESP32-CAM. Status code: {stream.status_code}")
    exit()
If the HTTP status code is not 200 (OK), the script prints an error message and exits. In this case, check that the ESP32-CAM is running and connected to the Wi-Fi, and that the URL and IP address used by the server and the client match, e.g. http://192.168.2.42/video
Buffer
The script then initializes an empty byte buffer to hold incoming data from the stream:
bytes_buffer = b''
This buffer will be used to accumulate stream data and extract individual JPEG images from it.
Main loop
The main loop begins in a try block:
try:
    for chunk in stream.iter_content(chunk_size=1024):
        bytes_buffer += chunk
The script reads the stream in chunks of 1024 bytes and appends each chunk to bytes_buffer. This approach handles the stream as a continuous sequence of bytes.
Within the loop, the script searches for the start and end markers of a JPEG image:
        a = bytes_buffer.find(b'\xff\xd8')  # JPEG start
        b = bytes_buffer.find(b'\xff\xd9')  # JPEG end
The JPEG format starts with the byte sequence 0xFF 0xD8 and ends with 0xFF 0xD9. These markers allow the script to isolate complete JPEG images from the byte stream.
If both markers are found and correctly ordered, the script extracts the JPEG image:
        if a != -1 and b != -1 and b > a:
            jpg = bytes_buffer[a:b+2]
            bytes_buffer = bytes_buffer[b+2:]
The slice jpg contains the raw JPEG image. The processed part is removed from the buffer so the script can handle the next image.
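The marker-based extraction can also be tried out in isolation, without a camera. The following is a minimal sketch of the same idea as a standalone function; extract_frames, SOI and EOI are hypothetical names, and unlike the script it returns all complete frames found in the buffer and only searches for the end marker after the start marker.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def extract_frames(buffer: bytes):
    """Split off every complete JPEG in buffer and return (frames, remaining_bytes)."""
    frames = []
    while True:
        start = buffer.find(SOI)
        if start == -1:
            # No start marker yet: keep a trailing 0xFF, it may be half of one
            tail = buffer[-1:] if buffer.endswith(b"\xff") else b""
            return frames, tail
        end = buffer.find(EOI, start + 2)
        if end == -1:
            # Incomplete frame: keep it and wait for more data
            return frames, buffer[start:]
        frames.append(buffer[start:end + 2])
        buffer = buffer[end + 2:]


# Fake stream data: one complete "JPEG" followed by the beginning of a second one
data = b"--frame\r\n\xff\xd8AAA\xff\xd9\r\n--frame\r\n\xff\xd8BB"
frames, rest = extract_frames(data)
print(len(frames), rest)
```

The returned remainder is meant to be prepended to the next chunk, just as bytes_buffer carries leftover bytes from one iteration to the next in the script.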
The JPEG data is then converted into an OpenCV image:
            img_array = np.frombuffer(jpg, dtype=np.uint8)
            frame = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
The raw bytes are turned into a NumPy array of unsigned 8-bit integers. This array is decoded using cv2.imdecode() into an actual image frame suitable for display or saving.
If a valid frame is obtained, the script checks if the date has changed:
            if frame is not None:
                new_date = datetime.date.today()
                if new_date != current_date:
                    out.release()
                    current_date = new_date
                    output_file = get_output_filename()
                    out = cv2.VideoWriter(output_file, fourcc, fps, (frame_width, frame_height))
                    print(f"Started new file: {output_file}")
It compares the current date with the previously stored date. If a new day has begun, the current video file is closed using out.release(). A new filename is generated, and a new VideoWriter is initialized to start recording into the new file.
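The rollover check itself is independent of OpenCV and can be sketched on its own. The class below is a hypothetical helper, not part of the original script; it only tracks the date and reports when a new file should be started.

```python
import datetime


class DailyRollover:
    """Remembers the current day and reports when the calendar date has changed."""

    def __init__(self, today: datetime.date):
        self.current = today

    def update(self, today: datetime.date) -> bool:
        """Return True if the date differs from the stored one, and store the new date."""
        if today != self.current:
            self.current = today
            return True
        return False


roll = DailyRollover(datetime.date(2025, 6, 1))
print(roll.update(datetime.date(2025, 6, 1)))  # False: same day, keep writing
print(roll.update(datetime.date(2025, 6, 2)))  # True: new day, start a new file
```

In the script, a True result corresponds to the branch that releases the old VideoWriter and opens a new one.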
The current frame is then written to the output video:
                out.write(frame)
Additionally, the script shows the video in a real-time preview window:
                cv2.imshow('ESP32-CAM Stream', frame)
                if cv2.waitKey(1) & 0xFF == ord('q'):
                    break
The frame is displayed using OpenCV’s imshow() function. If the user presses the ‘q’ key, the loop exits, stopping the stream.
Exception handling
If the client is stopped by a keyboard interrupt, e.g. via Ctrl+C, the exception is caught and ignored. In either case, the finally block ensures that the output file and the OpenCV window are closed:
except KeyboardInterrupt:
    pass
finally:
    out.release()
    cv2.destroyAllWindows()
    print(f"Video saved to {output_file}")
And that is the client! If you now run the ESP32-CAM (server) and then start the client (Python app), you have a surveillance system that captures video while motion is detected and writes the video stream to a date-stamped file for later inspection.
Conclusions
In this tutorial you learned how to build a motion-activated surveillance camera system with the ESP32-CAM, a PIR motion sensor, and a PC.
The ESP32-CAM operates as a motion-activated MJPEG video streaming server. When a client visits the /video URL, the server begins streaming video frames. However, frames are only captured and sent when motion is detected by the PIR sensor. For more information on motion detection see the Motion Activated ESP32-CAM tutorial.
In addition to starting the video stream the flash LED is activated to assist with illumination. If you have issues with controlling the flash LED have a look at the Control ESP32-CAM Flash LED tutorial.
On the client side a Python script receives the motion-triggered MJPEG video stream from the ESP32-CAM and writes each frame to a local .avi file. It creates a new file every calendar day, making it easy to organize and archive footage. The script also optionally shows the live video feed and can be interrupted cleanly by the user.
If you want to get notified whenever motion is detected, have a look at our ESP32 send Telegram Message tutorial, which teaches you how to send messages to the Telegram app on your phone.
If you want to detect objects within the video stream, have a look at the Object Detection with ESP32-CAM and YOLO tutorial. And if you need more GPIO pins for other inputs or outputs, the More GPIO pins for ESP32-CAM tutorial may have some useful tips.
Feel free to leave your questions in the comment section.
Happy Tinkering ; )
Stefan is a professional software developer and researcher. He has worked in robotics, bioinformatics, image/audio processing and education at Siemens, IBM and Google. He specializes in AI and machine learning and has a keen interest in DIY projects involving Arduino and 3D printing.

