Transform written content into speech using Google AI (Gemini) for text generation and internet-based information retrieval.
This project is based on an example in test/app.ts. It performs the following steps:
This project has been tested on Linux (Ubuntu 24.04 LTS x86_64). Windows users can install SoX via SourceForge. macOS-specific information is currently unavailable.
| Task | Priority | Status |
|---|---|---|
| Implement Gemini Chat | High | ✅ Completed |
| Develop Voice Recognition | High | ✅ Completed |
| Implement Audio Language Detection | High | ✅ Completed |
| Implement Text Language Detection | Medium | ✅ Completed |
| Implement an Audio Player | Low | ✅ Completed |
| Define Enums | Low | ✅ Completed |
| Integrate Debugging | Low | ✅ Completed |
Before using this repository, ensure the following dependencies are installed on your system:
sudo apt-get install sox
sudo apt-get install libsox-fmt-all
sudo apt install ffmpeg
choco install ffmpeg
(using Chocolatey) or download from the official website.
macOS-specific installation instructions are not available at this time.
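To confirm that the system dependencies are reachable before running the examples, a small startup check can help. The sketch below is not part of this package: the helper names `isCommandAvailable` and `findMissingTools` are hypothetical, and it assumes a Node.js-compatible runtime where `sox` and `ffmpeg` are expected on the `PATH`.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: returns true when `command` could be spawned at all.
// `error` is set when the process cannot be started (for example ENOENT).
function isCommandAvailable(command: string, args: string[]): boolean {
  const result = spawnSync(command, args, { stdio: "ignore" });
  return result.error === undefined;
}

// Version flags: SoX accepts `--version`, FFmpeg accepts `-version`.
const requiredTools: Array<[string, string[]]> = [
  ["sox", ["--version"]],
  ["ffmpeg", ["-version"]],
];

// Hypothetical helper: returns the names of the tools that could not be started.
function findMissingTools(tools: Array<[string, string[]]>): string[] {
  return tools
    .filter(([command, args]) => !isCommandAvailable(command, args))
    .map(([command]) => command);
}

const missingTools = findMissingTools(requiredTools);
if (missingTools.length > 0) {
  console.warn(`Missing system dependencies: ${missingTools.join(", ")}`);
}
```

The check only verifies that each binary can be launched. It does not verify that optional format support such as `libsox-fmt-all` is installed.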
To install the package, use one of the following commands based on your preferred package manager:
# npm
$ npm install git+https://github.com/Stawa/GTTS.git --legacy-peer-deps
# Bun
$ bun install git+https://github.com/Stawa/GTTS.git --trust
Before diving into the examples, make sure you have API keys or credentials for each of the following components you plan to use:
lib.GoogleGemini
lib.TextToSpeech
lib.VoiceRecognition.fetchTranscriptGoogle
lib.VoiceRecognition.fetchTranscriptDeepgram
lib.SummarizeText
Store these API keys securely and never commit them to version control. Consider using environment variables or a secure key management system.
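One way to follow the environment-variable advice is to validate the required keys once at startup, so a missing key fails early with a clear message. This is a minimal sketch, not part of this package: `requireEnv` is a hypothetical helper, and `GEMINI_API_KEY` is the only variable name taken from this README.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: returns the requested variables, or throws an error
// listing every name that is unset or blank.
function requireEnv(names: readonly string[], env: Env): Record<string, string> {
  const values: Record<string, string> = {};
  const missing: string[] = [];
  for (const name of names) {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      missing.push(name);
    } else {
      values[name] = value;
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return values;
}

// Usage (after dotenv.config() has loaded a local .env file):
// const { GEMINI_API_KEY } = requireEnv(["GEMINI_API_KEY"], process.env);
```

Passing the environment in as an argument keeps the helper easy to test. If you use a `.env` file, keep it listed in `.gitignore` so it is not committed.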
Here's a concise example demonstrating how to generate a response using the Google Gemini API:
import { GoogleGemini } from "@stawa/gtts";
import dotenv from "dotenv";

// Load variables from a local .env file into process.env.
dotenv.config();

const gemini = new GoogleGemini({
  apiKey: process.env.GEMINI_API_KEY,
  model: "gemini-1.5-flash",
  enableLogging: true,
});

async function main() {
  try {
    const question = "When was Facebook launched?";
    console.log(`Question: ${question}`);

    // Send the question to Gemini and wait for the reply.
    const response = await gemini.chat(question);
    console.log(`Gemini's response: ${response}`);
  } catch (error) {
    console.error("An error occurred:", error);
  }
}

main();
We appreciate the contributions of all our collaborators; each person's effort helps make this project better. Special thanks to everyone who has helped shape it!