Gravity Text-to-Speech Module Tutorial

The Gravity Text-to-Speech Module V2 is a compact speech-synthesis board from DFRobot that converts text into audible speech in English or Chinese. It has a built-in speaker and supports both I2C and UART communication, which makes it a plug-and-play solution for projects that need spoken output.

In this tutorial, we’ll walk through everything you need to get started, from wiring and library installation to example code.

Required Parts

For this tutorial you will need the Gravity Text-to-Speech Module. You can get it at DFRobot, for instance. However, make sure you get version V2.x.

Furthermore, you will need a microcontroller. I am using an Arduino UNO but other Arduino or ESP32 boards will work fine as well. The only requirement is support for an I2C (or UART) interface.

Gravity Text-to-Speech Module

Arduino Uno

USB Cable for Arduino UNO

Dupont Wire Set

Breadboard

Hardware of the Gravity Text-to-Speech Module

The Gravity Text-to-Speech Module V2 is designed to deliver speech synthesis with minimal external hardware requirements. It supports both English and Chinese and can mix both languages in a single output. The board includes a built-in speaker, so no external audio amplifier or speaker is needed for basic voice output.

Gravity Text-to-Speech Module V2.0 (source)

Communication with a host controller (such as an Arduino, ESP32, Raspberry Pi, or other popular microcontroller platforms) can be done via either I2C or UART. The module’s I2C interface has a fixed address (0x40).

The module accepts a supply voltage in the 3.3 V to 5 V range, making it compatible with most common microcontroller boards. Typical operating current is under 50 mA, which keeps power consumption manageable for battery-powered projects.

On the software side, the module supports “text control identifiers” embedded inside the text string sent for synthesis. These identifiers allow dynamic configuration of voice parameters such as volume, speed, tone (intonation), and voice-type (speaker voice), giving developers some control over speech characteristics. For example, one can send something like “[v3]Hello [v8]world” to adjust volume mid-sentence.
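
To make the idea of embedded identifiers more concrete, here is a minimal sketch of a helper that builds such a string. The function name prefixVolume and the buffer handling are my own illustration and not part of the DFRobot library. It only assumes the "[v<level>]" format shown in the example above.

```cpp
#include <stdio.h>
#include <stddef.h>

// Hypothetical helper, not part of the DFRobot library.
// Writes "[v<level>]<text>" into out and returns the number of characters
// written, or -1 if the result did not fit into the buffer.
int prefixVolume(char *out, size_t outSize, int level, const char *text) {
  int n = snprintf(out, outSize, "[v%d]%s", level, text);
  if (n < 0 || (size_t)n >= outSize) return -1;
  return n;
}
```

For instance, calling prefixVolume(msg, sizeof(msg), 3, "Hello") with a 32-byte buffer msg fills it with "[v3]Hello". The resulting text can then be sent to the module like any other string.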

Technical Specification

The following table summarizes the technical specification of the Gravity Text-to-Speech Module V2.0:

Parameter | Specification
Product Name | Gravity: Text to Speech Module V2.0 (DFR0760)
Supported Languages | English and Chinese (supports mixed-language synthesis)
Communication Interfaces | I2C (address 0x40), UART
Supply Voltage | 3.3 V to 5.0 V
Typical Operating Current | < 50 mA
Operating Temperature | –40 °C to +85 °C
Audio Output | Integrated onboard speaker
Voice Configuration | Adjustable speed, volume, tone, and voice type via embedded control identifiers
Dimensions | Approximately 42 mm × 32 mm
Interface Type | Gravity 4-pin connector (VCC, GND, D/T = SDA/TX, C/R = SCL/RX)
Module Version Improvements | Enhanced speech clarity and reduced pronunciation errors compared to V1

Install DFRobot_SpeechSynthesis_V2 Library

Before you can use the Gravity Text-to-Speech Module, you first need to install the DFRobot_SpeechSynthesis_V2 library by DFRobot. Go to the GitHub repo, click on the green “<> Code” button and then on “Download ZIP”:

This will download the library as a zip file with the name “DFRobot_SpeechSynthesis_V2-main.zip” to your computer.

Next open your Arduino IDE and go to “Sketch -> Include Library -> Add .ZIP Library …” as shown below:

Finally, select the “DFRobot_SpeechSynthesis_V2-main.zip” file when asked and install the library.

Connecting Gravity Text-to-Speech Module to Arduino

Connecting the Gravity Text-to-Speech Module to an Arduino or an ESP32 is easy. Start by connecting the ground of the Arduino to the GND pin of the Speech Module. Next, connect the VCC pin to the 3.3V or 5V output of the Arduino. Finally, connect SCL (A5 on the Arduino UNO) to C/R and SDA (A4 on the Arduino UNO) to D/T as shown below:

Connecting Gravity Text-to-Speech Module to Arduino UNO

Note that you can also connect the Speech Module via the UART interface. However, I2C communication is faster and keeps the serial interface free.

Code Example

The following code example makes the module say the text “This is a test”. It also sets parameters such as the volume, speed and tone of the speech output. Have a quick look at the complete code first, and then we will dive into the details.

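#include "Wire.h"
#include "DFRobot_SpeechSynthesis_V2.h"

// Speech module connected via I2C
DFRobot_SpeechSynthesis_I2C ss;

void setup() {
  ss.begin();                   // initialize the module
  ss.setVolume(1);              // low volume
  ss.setSpeed(4);               // speech rate
  ss.setTone(6);                // pitch of the voice
  ss.setEnglishPron(ss.eWord);  // pronounce English text as words
}

void loop() {
  ss.speak("This is a test");
  delay(10000);                 // wait ten seconds before repeating
}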

Imports

At the top of the program, two header files are included to provide access to the necessary libraries. The Wire library enables I2C communication on Arduino-compatible boards, while the DFRobot_SpeechSynthesis_V2 library contains all functions required to control the Gravity Text-to-Speech Module over I2C.

#include "Wire.h"
#include "DFRobot_SpeechSynthesis_V2.h"

Including these libraries makes it possible for the sketch to initialize the module, send commands, and configure speech parameters.

Objects

The code creates a speech synthesis object named ss, which represents the Text-to-Speech module connected via I2C. This object provides methods such as begin, setVolume, setSpeed, and speak.

DFRobot_SpeechSynthesis_I2C ss;

Instantiating this object allows the sketch to interact with the hardware through a simple, high-level API.

Setup

The setup function runs once when the board powers on or resets. It configures the speech module and prepares it for spoken output. The initialization sequence begins by calling begin, which establishes communication and wakes the module.

ss.begin();

After initialization, the program configures several speech properties. The setVolume function adjusts the loudness of the synthesized voice, where lower values produce quieter output and higher values produce louder speech.

ss.setVolume(1);

Next, the setSpeed function modifies how quickly the module speaks. A higher speed value results in faster speech, while a lower value slows it down.

ss.setSpeed(4);

The code then sets the tone, which affects the pitch and intonation of the spoken voice. Adjusting tone allows tuning of how high or low the voice sounds. Lower levels also sound more robotic.

ss.setTone(6);

Finally, the program configures the English pronunciation mode. The library provides different pronunciation strategies, and selecting ss.eWord instructs the module to use word-level pronunciation rules.

ss.setEnglishPron(ss.eWord);

After these settings are applied, the module is fully configured and ready to synthesize speech.
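
If the levels come from a potentiometer or other input rather than from constants, it can help to limit them before handing them to setVolume, setSpeed or setTone. The following is only a sketch: clampLevel is a hypothetical helper and not part of the library, and the range of 0 to 9 is an assumption that you should check against the comments in the library header.

```cpp
#include <stdint.h>

// Hypothetical helper, not part of the DFRobot library.
// Assumption: valid levels run from 0 to 9; verify this in the library header.
const int kMinLevel = 0;
const int kMaxLevel = 9;

// Returns level limited to the range kMinLevel..kMaxLevel.
uint8_t clampLevel(int level) {
  if (level < kMinLevel) return (uint8_t)kMinLevel;
  if (level > kMaxLevel) return (uint8_t)kMaxLevel;
  return (uint8_t)level;
}
```

In setup you could then write ss.setVolume(clampLevel(requestedVolume)), where requestedVolume stands for whatever value your own code computes.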

Loop

The loop function runs continuously and triggers speech output at regular intervals. Each time through the loop, the speak function sends a text string to the module, which then converts it into audible speech.

ss.speak("This is a test");

Because speech playback should not be repeated continuously, the program includes a delay to pause execution for ten seconds. This ensures that the message is spoken only once per cycle.

delay(10000);

The loop will repeat indefinitely, causing the module to speak the same phrase every ten seconds until the board is reset or powered off.
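
The delay call blocks the whole sketch for ten seconds. If the board has to do other work in the meantime, a common Arduino alternative is to compare timestamps from millis() instead. The helper below is a minimal sketch of that check. The name intervalElapsed is my own and is not part of any library.

```cpp
#include <stdint.h>

// Hypothetical helper for non-blocking timing.
// Returns true once at least intervalMs milliseconds have passed since lastMs.
// Unsigned subtraction keeps the result correct when the millisecond
// counter wraps around to zero.
bool intervalElapsed(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs) {
  return (uint32_t)(nowMs - lastMs) >= intervalMs;
}
```

In loop you would keep a variable with the time of the last announcement, call intervalElapsed(millis(), lastSpoken, 10000), and only when it returns true update lastSpoken and call ss.speak. The rest of loop can then continue to run between announcements.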

Conclusions

In this tutorial you learned how to connect the Gravity Text-to-Speech Module with an Arduino UNO. Unlike cloud-based Text-to-Speech services such as ElevenLabs, it performs all synthesis locally on the device, which means it does not require an internet connection, avoids API costs, and provides immediate, low-latency responses.

The module supports both English and Chinese, can mix languages seamlessly, and allows real-time adjustment of volume, speed, tone, and pronunciation through simple control codes. Its I2C and UART interfaces make it easy to integrate with Arduino, ESP32, and similar microcontrollers.

On the other hand, the speech and sound quality are not as good as those of ElevenLabs, for instance. Furthermore, the number of voices is limited, and the English voice has a distinct Chinese accent. However, if you need an offline solution and do not require natural-sounding speech, the Gravity Text-to-Speech Module is an option.

For speech-to-text and speech recognition solutions, have a look at our tutorials “Getting started with Gravity Voice Recognition Module”, “Using the Voice Recognition Module V3 with Arduino” and “Voice control with XIAO-ESP32-S3-Sense and Edge Impulse”.

Happy Tinkering ; )