WhatsApp Auto-Reply with Voice Notes

How to set up WhatsApp auto-replies with voice notes: more personal, more engaging. A text-to-speech bot tutorial!

Voice notes add a distinctive personal touch!

Sometimes text is not enough. A voice note adds a personal touch that makes customers feel more valued.


When Should You Use Voice Notes?

✅ GOOD FOR:
- Welcome messages for VIP customers
- Personal thank-yous
- Complex explanations
- Apologies after a complaint
- Exclusive announcements
- An intimate brand voice

❌ NOT IDEAL FOR:
- Quick FAQ answers
- Order confirmations
- High-volume situations
- Information the customer needs to screenshot
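
The two lists above can be encoded as a small lookup so the bot decides per message. This is a minimal sketch: the category names and the `highVolume` / `needsScreenshot` flags are hypothetical labels chosen for illustration, not part of any library.

```javascript
// Hypothetical message categories, derived from the "good for" list above.
const VOICE_FRIENDLY = new Set([
    'vip_welcome',
    'thank_you',
    'complex_explanation',
    'apology',
    'exclusive_announcement'
]);

// Returns true when a voice note is a reasonable choice for this message.
// Anything not explicitly listed (FAQ answers, order confirmations, ...) stays text.
function shouldUseVoice(category, { highVolume = false, needsScreenshot = false } = {}) {
    // Text scales better under load and can be screenshotted
    if (highVolume || needsScreenshot) return false;
    return VOICE_FRIENDLY.has(category);
}
```

For example, `shouldUseVoice('apology')` is true, while `shouldUseVoice('apology', { highVolume: true })` falls back to text.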

How It Works

┌────────────────────────────────────┐
│      TEXT MESSAGE/TEMPLATE         │
│  "Hi Budi, thank you for..."       │
└──────────────────┬─────────────────┘
                   │
                   ▼
┌────────────────────────────────────┐
│     TEXT-TO-SPEECH (TTS)           │
│  Convert text → Audio file         │
└──────────────────┬─────────────────┘
                   │
                   ▼
┌────────────────────────────────────┐
│        SEND AS VOICE NOTE          │
│  WhatsApp sends .ogg/.mp3 audio    │
└────────────────────────────────────┘

Implementation

Using Google Text-to-Speech:

javascript

const textToSpeech = require('@google-cloud/text-to-speech');
// Assumption: the bot uses whatsapp-web.js, whose API the calls below match
const { MessageMedia } = require('whatsapp-web.js');
const fs = require('fs');

const ttsClient = new textToSpeech.TextToSpeechClient();

// Converts text to an OGG/Opus file at outputPath and returns that path
async function generateVoiceNote(text, outputPath) {
    const request = {
        input: { text },
        voice: {
            languageCode: 'id-ID',
            name: 'id-ID-Wavenet-A', // Indonesian female voice
            ssmlGender: 'FEMALE'
        },
        audioConfig: {
            audioEncoding: 'OGG_OPUS', // Best for WhatsApp
            speakingRate: 1.0,
            pitch: 0
        }
    };

    const [response] = await ttsClient.synthesizeSpeech(request);

    // audioContent is binary audio data, so no text encoding is passed
    await fs.promises.writeFile(outputPath, response.audioContent);

    return outputPath;
}

// Usage in bot
async function sendVoiceReply(msg, text) {
    const audioPath = `/tmp/voice_${Date.now()}.ogg`;

    try {
        await generateVoiceNote(text, audioPath);

        const media = MessageMedia.fromFilePath(audioPath);
        await msg.reply(media, undefined, { sendAudioAsVoice: true });
    } finally {
        // Cleanup, even when synthesis or sending fails
        fs.rmSync(audioPath, { force: true });
    }
}

Using ElevenLabs (More Natural):

javascript

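const axios = require('axios');
const fs = require('fs');

// Assumption: the voice ID comes from an environment variable
// (the variable name ELEVENLABS_VOICE_ID is chosen here for illustration)
const VOICE_ID = process.env.ELEVENLABS_VOICE_ID;

async function generateElevenLabsVoice(text, outputPath) {
    const response = await axios({
        method: 'post',
        url: `https://api.elevenlabs.io/v1/text-to-speech/${VOICE_ID}`,
        headers: {
            'xi-api-key': process.env.ELEVENLABS_API_KEY,
            'Content-Type': 'application/json'
        },
        data: {
            text,
            model_id: 'eleven_multilingual_v2',
            voice_settings: {
                stability: 0.5,
                similarity_boost: 0.75
            }
        },
        responseType: 'arraybuffer' // raw audio bytes, not JSON
    });

    // The API returns MP3 audio by default, so use an .mp3 outputPath
    await fs.promises.writeFile(outputPath, response.data);
    return outputPath;
}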

Using Free Alternative (gTTS):

javascript

const gtts = require('gtts');

// Wraps the callback-based gtts API in a Promise.
// gtts saves MP3 audio, so use an .mp3 outputPath.
function generateGTTSVoice(text, outputPath) {
    return new Promise((resolve, reject) => {
        const speech = new gtts(text, 'id'); // Indonesian

        speech.save(outputPath, (err) => {
            if (err) reject(err);
            else resolve(outputPath);
        });
    });
}
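
With three interchangeable generators, one possible way to combine them is a fallback chain that tries the preferred service first and moves on when it fails. The sketch below is an illustration rather than part of the original tutorial; `generateWithFallback` is a hypothetical helper, and it assumes every generator has the same `(text, outputPath)` signature as the functions above.

```javascript
// Tries each generator in order and returns the first result that succeeds.
// Each generator is an async function (text, outputPath) => outputPath,
// e.g. generateElevenLabsVoice, generateVoiceNote, generateGTTSVoice.
async function generateWithFallback(text, outputPath, generators) {
    const errors = [];
    for (const generate of generators) {
        try {
            return await generate(text, outputPath);
        } catch (err) {
            errors.push(err); // remember the failure and try the next provider
        }
    }
    // Every provider failed: surface all collected errors at once
    throw new AggregateError(errors, 'All TTS providers failed');
}
```

Note that the services produce different formats (OGG/Opus versus MP3), so in practice each generator may need its own file extension rather than one shared `outputPath`.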

Pre-recorded Voice Notes

javascript

const fs = require('fs');
// Assumption: the bot uses whatsapp-web.js, whose API the calls below match
const { MessageMedia } = require('whatsapp-web.js');

// For consistent brand voice, use pre-recorded messages
const preRecordedVoices = {
    'welcome': './voices/welcome.ogg',
    'thank_you': './voices/thank_you.ogg',
    'promo': './voices/promo_announcement.ogg',
    'apology': './voices/apology.ogg',
    'birthday': './voices/birthday.ogg'
};

// Sends the recording for voiceType; returns false if it is unknown or missing on disk
async function sendPreRecordedVoice(msg, voiceType) {
    const voicePath = preRecordedVoices[voiceType];

    if (!voicePath || !fs.existsSync(voicePath)) {
        console.error('Voice not found:', voiceType);
        return false;
    }

    const media = MessageMedia.fromFilePath(voicePath);
    await msg.reply(media, undefined, { sendAudioAsVoice: true });

    return true;
}

Hybrid Approach (Text + Voice)

javascript

async function sendHybridMessage(msg, text, includeVoice = false) {
    // Always send text first (for reference/screenshot)
    await msg.reply(text);

    // Optionally send voice version
    if (includeVoice) {
        // generateVoiceNote needs an explicit output path
        const audioPath = `/tmp/voice_${Date.now()}.ogg`;
        try {
            await generateVoiceNote(text, audioPath);
            const media = MessageMedia.fromFilePath(audioPath);
            await msg.reply(media, undefined, { sendAudioAsVoice: true });
        } finally {
            // Cleanup, even when synthesis or sending fails
            fs.rmSync(audioPath, { force: true });
        }
    }
}

// Usage based on customer preference or context
// (getCustomer and getPromoText are your own application helpers)
client.on('message', async msg => {
    const customer = await getCustomer(msg.from);

    // VIP customers get voice notes
    const includeVoice = customer?.tier === 'vip' || customer?.prefersVoice;

    if (msg.body.toLowerCase() === 'promo') {
        await sendHybridMessage(msg, getPromoText(), includeVoice);
    }
});
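
To avoid sending a voice note with every single reply, the `includeVoice` decision could also pass through a per-customer cooldown. The following is a minimal in-memory sketch; `canSendVoice`, the 24-hour default, and the `Map` storage are all assumptions for illustration (a real bot would likely persist this in its customer database).

```javascript
// Timestamp (ms) of the last voice note sent to each customer.
const lastVoiceAt = new Map();

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Returns true and records the send time if the customer has not received
// a voice note within the cooldown window; returns false otherwise.
function canSendVoice(customerId, now = Date.now(), cooldownMs = ONE_DAY_MS) {
    const last = lastVoiceAt.get(customerId);
    if (last !== undefined && now - last < cooldownMs) {
        return false; // still cooling down, fall back to text only
    }
    lastVoiceAt.set(customerId, now);
    return true;
}
```

In the handler above, this would mean combining the tier check with something like `canSendVoice(msg.from)` before calling `sendHybridMessage`.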

Template Voice Messages

Welcome VIP:

"Hi [NAME]! Welcome back to [BRAND]. 
We are so glad to serve you today. 
If there is anything we can help with, just send us a chat. 
Our team is available 24 hours a day for you!"

Thank You Order:

"[NAME], thank you so much for ordering from [BRAND]! 
We have received your order and it is being processed. 
We will send the tracking number as soon as possible. 
Thank you for trusting us with your purchase!"

Apology:

"[NAME], I personally apologize for the inconvenience 
you experienced. This is truly not our standard of service. 
Our team is already handling the issue and will provide 
the best solution for you. Once again, we are very sorry."

Birthday:

"Hi [NAME]! Happy birthday! 
May all your hopes and dreams come true in the year ahead. 
Stay healthy and happy always. We have a special gift for you, 
check your chat! Happy birthday!"
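
Before a template goes to the TTS engine, its bracketed placeholders need to be replaced with real values. A minimal sketch follows; `fillTemplate` and `isFullyFilled` are hypothetical helper names, and the placeholder pattern (uppercase letters and underscores inside square brackets) is inferred from the templates above.

```javascript
// Replaces [PLACEHOLDER] tokens with values. Unknown tokens are left
// untouched so a missing value is easy to spot before synthesis.
function fillTemplate(template, values) {
    return template.replace(/\[([A-Z_]+)\]/g, (match, key) =>
        Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match
    );
}

// True when no bracketed placeholder remains in the text.
function isFullyFilled(text) {
    return !/\[[A-Z_]+\]/.test(text);
}

// Example usage with a sample string
const script = fillTemplate('Hi [NAME]! Welcome back to [BRAND].', {
    NAME: 'Budi',
    BRAND: 'Toko Kita'
});
console.log(script, isFullyFilled(script));
```

Checking `isFullyFilled` before calling the TTS service avoids paying for audio in which the voice literally reads out a leftover placeholder.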

Handling Voice Input

javascript

const speech = require('@google-cloud/speech');

// Named speechClient so it does not shadow the WhatsApp client
const speechClient = new speech.SpeechClient();

client.on('message', async msg => {
    // Only handle voice notes; ptt = push-to-talk
    if (msg.type !== 'ptt') return;

    // Download voice note (media.data is a base64 string)
    const media = await msg.downloadMedia();
    if (!media) return; // the download can fail

    // Transcribe using Speech-to-Text
    const transcription = await transcribeAudio(media.data);
    if (!transcription) {
        await msg.reply('Sorry, I could not understand the voice note. Could you type your message?');
        return;
    }

    // Process transcription as text (processMessage is your own handler)
    const response = await processMessage(transcription);

    // Reply (can be text or voice)
    await msg.reply(`I heard you say: "${transcription}"\n\n${response}`);
});

// Using Google Speech-to-Text
async function transcribeAudio(audioBase64) {
    const [response] = await speechClient.recognize({
        config: {
            encoding: 'OGG_OPUS',
            // OGG_OPUS needs an explicit sample rate. 16000 is an assumption
            // for WhatsApp voice notes; verify it against your own recordings.
            sampleRateHertz: 16000,
            languageCode: 'id-ID'
        },
        audio: {
            content: audioBase64
        }
    });

    return response.results
        .map(r => r.alternatives[0].transcript)
        .join(' ');
}
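
Transcripts often come back with capitalization and punctuation ("Promo."), which will not match an exact keyword check such as `=== 'promo'`. One way to handle this is to normalize the transcript before routing it. The sketch below uses a hypothetical helper name, `normalizeTranscript`, and is an illustration rather than part of the original bot.

```javascript
// Lowercases, strips punctuation, and collapses whitespace so that a
// transcript like "Promo." matches the same keyword as a typed "promo".
function normalizeTranscript(transcript) {
    return transcript
        .toLowerCase()
        .replace(/[^\p{L}\p{N}\s]/gu, '') // keep letters, digits, whitespace
        .replace(/\s+/g, ' ')
        .trim();
}
```

The normalized string can then be passed to the same keyword logic used for typed messages.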

Voice Note Best Practices

DO ✅

- Keep it short (< 30 seconds)
- Clear pronunciation
- Warm, friendly tone
- Background noise-free
- Follow up with text summary
- Test on different devices

DON'T ❌

- Long rambling messages
- Poor audio quality
- Monotone delivery
- Missing text alternative
- Every message as voice (too much)
- Sensitive info only in voice
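
The 30-second guideline can be checked before any audio is generated by estimating duration from word count. This is a rough sketch only: the default of 150 words per minute is an assumption, not a measured value for any particular TTS voice, and both function names are hypothetical.

```javascript
// Rough duration estimate for a TTS script, in seconds.
// ASSUMPTION: about 150 words per minute at speakingRate 1.0;
// measure your chosen voice and adjust.
function estimateDurationSeconds(text, wordsPerMinute = 150, speakingRate = 1.0) {
    const words = text.trim().split(/\s+/).filter(Boolean).length;
    return (words / (wordsPerMinute * speakingRate)) * 60;
}

// True when the script is expected to fit within maxSeconds.
function isShortEnough(text, maxSeconds = 30) {
    return estimateDurationSeconds(text) <= maxSeconds;
}
```

Under that assumption, a 30-second voice note holds roughly 75 words, so longer scripts are better split or sent as text.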

Cost Comparison

📊 TTS SERVICE PRICING:

Google Cloud TTS:
- Standard: $4 / 1M characters
- WaveNet: $16 / 1M characters

ElevenLabs:
- Free: 10k chars/month
- Starter: $5/mo (30k chars)
- Creator: $22/mo (100k chars)

Amazon Polly:
- Standard: $4 / 1M characters
- Neural: $16 / 1M characters

gTTS (Google Translate TTS):
- FREE (but less natural)
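
For the pay-per-character services, the monthly bill follows directly from message volume and average script length. The sketch below uses the per-million-character prices listed above; the service keys and the `estimateMonthlyCost` name are hypothetical, and free tiers or subscription plans are not modeled.

```javascript
// Prices in USD per 1M characters, copied from the list above.
const PRICE_PER_MILLION = {
    googleStandard: 4,
    googleWaveNet: 16,
    pollyStandard: 4,
    pollyNeural: 16
};

// Estimated monthly cost in USD for a pay-per-character service.
function estimateMonthlyCost(service, messagesPerMonth, avgCharsPerMessage) {
    const price = PRICE_PER_MILLION[service];
    if (price === undefined) {
        throw new Error(`Unknown service: ${service}`);
    }
    const totalChars = messagesPerMonth * avgCharsPerMessage;
    return (totalChars / 1_000_000) * price;
}
```

For instance, 1,000 voice notes of about 250 characters each come to 250,000 characters, which is about $4 with WaveNet voices and $1 with Standard voices at the listed rates.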

FAQ

Are voice notes more effective than text?

It depends on the context. For a personal touch (VIP, apology, celebration), yes. For information the customer needs to screenshot, text is better.

What is the ideal length for a voice note?

15-30 seconds at most. Beyond that, customers may skip it.

How do I handle customers who send voice notes?

Transcribe them with Speech-to-Text, then process the result like a regular text message.


Conclusion

Voice notes = Personal touch at scale!

Text Only      | Text + Voice
---------------|--------------
Standard       | Premium feel
Scan-friendly  | Engaging
Quick          | Personal

Add voice for special moments!
