WhatsApp Auto-Reply with Voice Notes
How to set up WhatsApp auto-replies with voice notes: more personal, more engaging. A text-to-speech bot tutorial!
Voice notes add a different kind of personal touch!
Sometimes text is not enough. A voice note adds a personal touch that makes customers feel more valued.
When Should You Use Voice Notes?
✅ GOOD FOR:
- Welcome message VIP customer
- Personal thank you
- Complex explanation
- Apology for a complaint
- Exclusive announcements
- Intimate brand voice
❌ NOT IDEAL FOR:
- Quick FAQ answers
- Order confirmations
- High volume situations
- When the customer needs to screenshot the info
How It Works
┌────────────────────────────────────┐
│  TEXT MESSAGE/TEMPLATE             │
│  "Hi Budi, thank you for..."       │
└──────────────────┬─────────────────┘
                   │
                   ▼
┌────────────────────────────────────┐
│  TEXT-TO-SPEECH (TTS)              │
│  Convert text → Audio file         │
└──────────────────┬─────────────────┘
                   │
                   ▼
┌────────────────────────────────────┐
│  SEND AS VOICE NOTE                │
│  WhatsApp sends .ogg/.mp3 audio    │
└────────────────────────────────────┘
Implementation
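As a minimal sketch of the flow in the diagram, the three stages can be wired together in one function that receives the TTS step and the send step as parameters. The names `runVoicePipeline`, `synthesize`, and `send` are hypothetical and not part of any library; any of the provider-specific generators in this section could be passed in as `synthesize`.

```javascript
// Sketch of the pipeline: text -> audio file -> voice note.
// `synthesize(text)` must resolve to an audio file path;
// `send(audioPath)` must deliver that file as a voice note.
// Both callbacks are assumptions supplied by the caller.
async function runVoicePipeline(text, { synthesize, send }) {
  const cleaned = (text || '').trim();
  if (!cleaned) {
    throw new Error('Cannot synthesize an empty message');
  }
  const audioPath = await synthesize(cleaned); // stage 2: text -> audio file
  await send(audioPath);                       // stage 3: send as voice note
  return audioPath;
}
```

Keeping the TTS provider behind a callback means the bot logic does not change if you later switch between Google, ElevenLabs, or gTTS.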
Using Google Text-to-Speech:
javascript
const textToSpeech = require('@google-cloud/text-to-speech');
const { MessageMedia } = require('whatsapp-web.js');
const fs = require('fs');

const ttsClient = new textToSpeech.TextToSpeechClient();

// Synthesize `text` into an OGG/Opus file at `outputPath`
async function generateVoiceNote(text, outputPath) {
  const request = {
    input: { text },
    voice: {
      languageCode: 'id-ID',
      name: 'id-ID-Wavenet-A', // Indonesian female voice
      ssmlGender: 'FEMALE'
    },
    audioConfig: {
      audioEncoding: 'OGG_OPUS', // Best for WhatsApp
      speakingRate: 1.0,
      pitch: 0
    }
  };
  const [response] = await ttsClient.synthesizeSpeech(request);
  await fs.promises.writeFile(outputPath, response.audioContent, 'binary');
  return outputPath;
}

// Usage in bot
async function sendVoiceReply(msg, text) {
  const audioPath = `/tmp/voice_${Date.now()}.ogg`;
  await generateVoiceNote(text, audioPath);
  try {
    const media = MessageMedia.fromFilePath(audioPath);
    await msg.reply(media, undefined, { sendAudioAsVoice: true });
  } finally {
    // Cleanup, even if sending fails
    fs.unlinkSync(audioPath);
  }
}
Using ElevenLabs (More Natural):
javascript
const axios = require('axios');
const fs = require('fs');

// Assumption: the voice ID comes from an environment variable
// (the variable name is chosen here for illustration)
const VOICE_ID = process.env.ELEVENLABS_VOICE_ID;

async function generateElevenLabsVoice(text, outputPath) {
  const response = await axios({
    method: 'post',
    url: `https://api.elevenlabs.io/v1/text-to-speech/${VOICE_ID}`,
    headers: {
      'xi-api-key': process.env.ELEVENLABS_API_KEY,
      'Content-Type': 'application/json'
    },
    data: {
      text,
      model_id: 'eleven_multilingual_v2',
      voice_settings: {
        stability: 0.5,
        similarity_boost: 0.75
      }
    },
    responseType: 'arraybuffer' // Raw audio bytes
  });
  fs.writeFileSync(outputPath, response.data);
  return outputPath;
}
Using Free Alternative (gTTS):
javascript
const gtts = require('gtts');

async function generateGTTSVoice(text, outputPath) {
  return new Promise((resolve, reject) => {
    const speech = new gtts(text, 'id'); // Indonesian
    speech.save(outputPath, (err) => {
      if (err) reject(err);
      else resolve(outputPath);
    });
  });
}
Pre-recorded Voice Notes
javascript
// For consistent brand voice, use pre-recorded messages
const preRecordedVoices = {
  'welcome': './voices/welcome.ogg',
  'thank_you': './voices/thank_you.ogg',
  'promo': './voices/promo_announcement.ogg',
  'apology': './voices/apology.ogg',
  'birthday': './voices/birthday.ogg'
};

async function sendPreRecordedVoice(msg, voiceType) {
  const voicePath = preRecordedVoices[voiceType];
  if (!voicePath || !fs.existsSync(voicePath)) {
    console.error('Voice not found:', voiceType);
    return false;
  }
  const media = MessageMedia.fromFilePath(voicePath);
  await msg.reply(media, undefined, { sendAudioAsVoice: true });
  return true;
}
Hybrid Approach (Text + Voice)
javascript
async function sendHybridMessage(msg, text, includeVoice = false) {
  // Always send text first (for reference/screenshot)
  await msg.reply(text);
  // Optionally send voice version
  if (includeVoice) {
    // generateVoiceNote needs an explicit output path
    const audioPath = await generateVoiceNote(text, `/tmp/voice_${Date.now()}.ogg`);
    const media = MessageMedia.fromFilePath(audioPath);
    await msg.reply(media, undefined, { sendAudioAsVoice: true });
    fs.unlinkSync(audioPath);
  }
}

// Usage based on customer preference or context
client.on('message', async msg => {
  const customer = await getCustomer(msg.from);
  // VIP customers get voice notes
  const includeVoice = customer?.tier === 'vip' || customer?.prefersVoice;
  if (msg.body.toLowerCase() === 'promo') {
    await sendHybridMessage(msg, getPromoText(), includeVoice);
  }
});
Template Voice Messages
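The templates in this section use bracketed placeholders for the customer name and the brand. Before a template goes to TTS, those placeholders need real values. The helper below is one hedged way to do that; `fillTemplate` is a hypothetical name, and the uppercase-in-square-brackets pattern is an assumption based on how the templates here are written.

```javascript
// Replace bracketed placeholders such as [NAME] with entries from `values`.
// Throws when a placeholder has no value, so a literal "[NAME]" is never
// read aloud by the TTS voice.
function fillTemplate(template, values) {
  return template.replace(/\[([A-Z_]+)\]/g, (match, key) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error(`Missing value for placeholder ${match}`);
    }
    return String(values[key]);
  });
}

const text = fillTemplate('Hi [NAME], welcome back to [BRAND]!', {
  NAME: 'Budi',
  BRAND: 'Toko Kita',
});
console.log(text); // Hi Budi, welcome back to Toko Kita!
```

Failing loudly on a missing value is a deliberate choice for voice: a stray placeholder is easy to spot in text but awkward once spoken.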
Welcome VIP:
"Hi [NAME]! Welcome back to [BRAND].
We're so happy to be serving you today.
If there's anything we can help with, just send us a chat.
Our team is ready for you 24 hours a day!"
Thank You Order:
"[NAME], thank you so much for ordering from [BRAND]!
We've received your order and it's being processed.
We'll send you the tracking number as soon as possible.
Thank you for trusting us with your shopping!"
Apology:
"[NAME], I personally apologize for the inconvenience
you experienced. This truly is not our standard of service.
Our team is already handling the issue and will provide
the best solution for you. Once again, we're very sorry."
Birthday:
"Hi [NAME]! Happy birthday!
May all your hopes and dreams come true in this new year of your life.
Stay healthy and happy always. We have a special gift for you,
check your chat! Happy birthday!"
Handling Voice Input
javascript
const speech = require('@google-cloud/speech');
const speechClient = new speech.SpeechClient();

client.on('message', async msg => {
  // Check if message is a voice note
  if (msg.type === 'ptt') { // ptt = push-to-talk (voice note)
    // Download voice note (media.data is a base64 string)
    const media = await msg.downloadMedia();
    // Option 1: Transcribe using Speech-to-Text
    const transcription = await transcribeAudio(media.data);
    // Process transcription as text
    const response = await processMessage(transcription);
    // Reply (can be text or voice)
    await msg.reply(`I heard you say: "${transcription}"\n\n${response}`);
  }
});

// Using Google Speech-to-Text
async function transcribeAudio(audioBase64) {
  const [response] = await speechClient.recognize({
    config: {
      encoding: 'OGG_OPUS',
      // OGG_OPUS needs an explicit sample rate; 16000 is an assumption
      // about WhatsApp voice notes, so verify it against your audio
      sampleRateHertz: 16000,
      languageCode: 'id-ID'
    },
    audio: {
      content: audioBase64
    }
  });
  return response.results
    .map(r => r.alternatives[0].transcript)
    .join(' ');
}
Voice Note Best Practices
DO ✅
- Keep it short (< 30 seconds)
- Clear pronunciation
- Warm, friendly tone
- Background noise-free
- Follow up with text summary
- Test on different devices
DON'T ❌
- Long rambling messages
- Poor audio quality
- Monotone delivery
- Missing text alternative
- Every message as voice (too much)
- Sensitive info only in voice
Cost Comparison
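To compare the per-character prices listed in this section for your own volume, a rough estimate is characters per month divided by one million, times the listed rate. The sketch below copies only the per-million rates from the list; `estimateMonthlyCost` and the service keys are hypothetical names, and free tiers and subscription plans are deliberately ignored.

```javascript
// USD per 1M characters, taken from the pricing list in this section.
const PRICE_PER_MILLION_CHARS = {
  googleStandard: 4,
  googleWaveNet: 16,
  pollyStandard: 4,
  pollyNeural: 16,
};

// Rough monthly spend: total characters / 1M * rate.
function estimateMonthlyCost(service, messagesPerMonth, avgCharsPerMessage) {
  const price = PRICE_PER_MILLION_CHARS[service];
  if (price === undefined) {
    throw new Error(`Unknown service: ${service}`);
  }
  const totalChars = messagesPerMonth * avgCharsPerMessage;
  return (totalChars / 1e6) * price;
}

// 5,000 voice notes of about 200 characters each = 1,000,000 characters
console.log(estimateMonthlyCost('googleWaveNet', 5000, 200)); // 16
```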
📊 TTS SERVICE PRICING:
Google Cloud TTS:
- Standard: $4 / 1M characters
- WaveNet: $16 / 1M characters
ElevenLabs:
- Free: 10k chars/month
- Starter: $5/mo (30k chars)
- Creator: $22/mo (100k chars)
Amazon Polly:
- Standard: $4 / 1M characters
- Neural: $16 / 1M characters
gTTS (Google Translate TTS):
- FREE (but less natural)
FAQ
Is a voice note more effective than text?
It depends on the context. For a personal touch (VIP, apology, celebration), yes. For information the customer needs to screenshot, text is better.
What is the ideal length for a voice note?
15-30 seconds at most. Any longer and customers may skip it.
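One way to enforce that limit before calling a TTS service is to estimate the spoken duration from the word count. This is only a sketch: the 150 words-per-minute pace is an assumption, not a measured value for any particular voice, and `estimateDurationSeconds` and `fitsVoiceNote` are hypothetical helper names.

```javascript
// Assumed average speaking pace; tune it after timing real output
// from the TTS voice you actually use.
const WORDS_PER_MINUTE = 150;

// Estimate how long `text` takes to speak, in seconds.
function estimateDurationSeconds(text, wordsPerMinute = WORDS_PER_MINUTE) {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return (words * 60) / wordsPerMinute;
}

// True when the estimated duration stays within the limit (30 s by default).
function fitsVoiceNote(text, maxSeconds = 30) {
  return estimateDurationSeconds(text) <= maxSeconds;
}
```

A message that fails the check could be sent as text only, or shortened before it is synthesized.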
How do I handle customers who send voice notes?
Transcribe them with Speech-to-Text, then process the result like a regular text message.
Conclusion
Voice notes = Personal touch at scale!
| Text Only | Text + Voice |
|---|---|
| Standard | Premium feel |
| Scan-friendly | Engaging |
| Quick | Personal |
Add voice for special moments!