In this document, I describe how to implement an AI speaker on the ODROID-HC1 using the Google Assistant SDK.

Hardware Requirements


Insert the Bluetooth dongle into a USB port of the ODROID-HC1, then power on the ODROID-HC1 and the Bluetooth speaker. You are ready to start!

Sound Settings

To access the ODROID-HC1 console, you need the IP address of the board. Please refer to the "Boot the ODROID and find IP address" section of the Headless setup wiki page.

This guide is based on the latest Ubuntu 16.04 minimal image (to get the OS image, see Software Release for Linux/Ubuntu Kernel 4.9 on XU4/XU3).
Before starting on the system settings, we add a new user account, odroid, as a sudo user, because the Ubuntu minimal image does not ship with any user account.

# adduser odroid
# usermod -aG sudo odroid
# su - odroid

Install the sound-related packages: ALSA and PulseAudio.

$ sudo apt update
$ sudo apt install libasound2 libasound2-plugins alsa-utils alsa-oss
$ sudo apt install pulseaudio pulseaudio-utils pulseaudio-module-bluetooth

Add the PulseAudio permissions to the user account, then add the 'load-module module-switch-on-connect' line to the PulseAudio configuration file.
This setting switches the audio output to the Bluetooth speaker automatically when it connects.

$ sudo usermod -aG pulse,pulse-access odroid
$ sudo nano /etc/pulse/default.pa
  • /etc/pulse/default.pa
.ifexists module-bluetooth-discover.so
load-module module-bluetooth-discover
load-module module-switch-on-connect # this is new!
.endif

Start pulseaudio.

$ pulseaudio --start

Bluetooth Settings

Install the Bluetooth-related package. I used the bluez package for Bluetooth.

$ sudo apt install bluez
$ bluetoothctl

If the bluetoothctl command does not work from the user account, modify the D-Bus configuration file by adding the configuration below. After editing, you may need to restart the system or the bluetooth service for the new policy to take effect.

$ sudo nano /etc/dbus-1/system.d/bluetooth.conf
  • /etc/dbus-1/system.d/bluetooth.conf
  <policy user="odroid">
    <allow send_destination="org.bluez"/>
    <allow send_interface="org.bluez.Agent1"/>
    <allow send_interface="org.bluez.GattCharacteristic1"/>
    <allow send_interface="org.bluez.GattDescriptor1"/>
    <allow send_interface="org.freedesktop.DBus.ObjectManager"/>
    <allow send_interface="org.freedesktop.DBus.Properties"/>
  </policy>

Enter the commands below at the bluetoothctl console. The MAC address of my Bluetooth speaker is 00:11:67:AE:25:C6; this differs for each Bluetooth device, so use your own device's MAC address.

[bluetooth]# agent on
[bluetooth]# default-agent 
[bluetooth]# scan on
[bluetooth]# pair 00:11:67:AE:25:C6
[bluetooth]# trust 00:11:67:AE:25:C6
[bluetooth]# connect 00:11:67:AE:25:C6
[bluetooth]# quit

The Bluetooth speaker uses A2DP (Advanced Audio Distribution Profile) by default. Change the profile to HSP (Headset Profile), because A2DP cannot use the microphone.

$ pacmd ls
(Check the card index of the Bluetooth speaker; here I assume the index is 1.)
$ pacmd set-card-profile 1 headset_head_unit
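
If you want to script this step instead of reading the index by hand, the card index can be parsed out of the pacmd output. The following is a minimal sketch; the file name and helper are mine, and it assumes PulseAudio names the Bluetooth card 'bluez_card.<MAC>':

  • set_hsp_profile.py (hypothetical helper)
import re
import subprocess

def bluez_card_index():
    # 'pacmd list-cards' prints an 'index: N' line followed shortly
    # by a 'name: <...>' line for each card; Bluetooth cards are
    # typically named 'bluez_card.XX_XX_XX_XX_XX_XX'.
    out = subprocess.check_output(['pacmd', 'list-cards'],
                                  universal_newlines=True)
    index = None
    for line in out.splitlines():
        m = re.search(r'index:\s+(\d+)', line)
        if m:
            index = m.group(1)  # remember the most recent card index
        if 'bluez_card' in line and index is not None:
            return index
    return None

idx = bluez_card_index()
if idx is not None:
    subprocess.check_call(['pacmd', 'set-card-profile', idx,
                           'headset_head_unit'])
else:
    print('No Bluetooth card found; is the speaker connected?')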

Verify sound and Bluetooth setup.

(Play a test sound)
$ speaker-test -t wav
(Record and play back some audio using ALSA command-line tools)
$ arecord --format=S16_LE --duration=5 --rate=16k --file-type=raw out.raw
$ aplay --format=S16_LE --rate=16k --file-type=raw out.raw

To make the Bluetooth speaker easier to use, a few extra configurations are helpful: enable Bluetooth automatically at boot, and reconnect the speaker on each login.

  • /etc/bluetooth/main.conf
[Policy]
AutoEnable=true
  • ($HOME)/.bashrc
pulseaudio --start
echo "connect 00:11:67:AE:25:C6" | bluetoothctl

Enable Google Assistant API

This section contains the same content as the Google Assistant SDK Guides page.

“I am new to GCP (Google Cloud Platform).”
Use your Google account to sign in. If you don’t have one, you will need to create one. The Google Assistant API is free for personal use.

Configure a Google Developer Project

A Google Developer Project gives your device access to the Google Assistant API. The project tracks quota usage and gives you valuable metrics for the requests made from your device.

To enable access to the Google Assistant API, do the following:

1. In the Cloud Platform Console, go to the Projects page and select an existing project or create a new one.

2. Enable the Google Assistant API on the project you selected by clicking Enable.

3. Create an OAuth client ID with the following steps:
a. Create the client ID.
b. You may need to set a product name for the consent screen. On the OAuth consent screen tab, give the product a name and click Save.
c. Click Other and give the client ID a name.
d. Click Create. A dialog box appears showing the client ID and secret. (There is no need to remember or save these; just close the dialog.)
e. Click the download icon at the far right of the screen next to the client ID to download the client secret JSON file (client_secret_<client-id>.json).

4. The client_secret_<client-id>.json file must be located on the device to authorize the Google Assistant SDK sample to make Google Assistant queries. Do not rename this file.

5. Copy client_secret_<client-id>.json to the ODROID-HC1.

$ scp ~/Downloads/client_secret_client-id.json odroid@<ODROID-HC1 ip address>:~/

Set activity controls for your Google account

In order to use the Google Assistant, you must share certain activity data with Google. The Google Assistant needs this data to function properly; this is not specific to the SDK.

Open the Activity Controls page for the Google account that you want to use with the Assistant. You can use any Google account; it does not need to be your developer account.

Ensure the following toggle switches are enabled (blue):

  • Web & App Activity
  • Device Information
  • Voice & Audio Activity

Download and Run the Google Assistant API Sample

Use a Python virtual environment to isolate the SDK and its dependencies from the system Python packages.

$ sudo apt update
$ sudo apt install python-dev python-virtualenv git portaudio19-dev libffi-dev libssl-dev
$ virtualenv env --no-site-packages

If you run into a locale problem like the one below, set the LC_ALL environment variable.

  Complete output from command /home/odroid/env/bin/python2 - setuptools pkg_resources pip wheel:
  Traceback (most recent call last):
  File "<stdin>", line 24, in <module>
  File "/usr/share/python-wheels/pip-8.1.1-py2.py3-none-any.whl/pip/__init__.py", line 215, in main
  File "/home/odroid/env/lib/python2.7/locale.py", line 581, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting
$ export LC_ALL=C
$ virtualenv env --no-site-packages

Upgrade the base packages inside the virtual environment, then activate it.

$ env/bin/python -m pip install --upgrade pip setuptools
$ source env/bin/activate

After activating the Python virtual environment, the '(env)' prefix is added to the prompt.
Next, authorize the Google Assistant SDK sample to make Google Assistant queries for your Google account, referencing the JSON file you copied to the device in a previous step.
Install the authorization tool.

(env) $ python -m pip install --upgrade google-auth-oauthlib[tool]

Run the tool. Remove the --headless flag if you are running this from a terminal on the device (not an SSH session).

(env) $ google-oauthlib-tool --client-secrets /path/to/client_secret_client-id.json --scope https://www.googleapis.com/auth/assistant-sdk-prototype --save --headless

You should see a URL displayed in the terminal.

Please go to this URL: https://...

Copy the URL and paste it into a browser (this can be done on your development machine, or any other machine). After you approve, a code will appear in your browser, such as “4/XXXX”. Copy and paste this code into the terminal.

Enter the authorization code:

If authorization was successful, you will see OAuth credentials initialized in the terminal. If you see InvalidGrantError instead, an invalid code was entered; try again, taking care to copy and paste the entire code.
Once you enter the correct authorization code, the credentials.json file is generated:

credentials saved: /home/odroid/.config/google-oauthlib-tool/credentials.json
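
For reference, these saved credentials can be loaded and refreshed from Python, following the same pattern the SDK sample uses. This is a sketch, assuming the google-auth packages pulled in by the authorization tool:

  • load_credentials.py (sketch)
import json

import google.auth.transport.requests
import google.oauth2.credentials

path = '/home/odroid/.config/google-oauthlib-tool/credentials.json'
with open(path, 'r') as f:
    # The JSON holds the refresh token, client ID/secret and token URI
    credentials = google.oauth2.credentials.Credentials(token=None,
                                                        **json.load(f))

# Refresh once to verify that the credentials actually work
http_request = google.auth.transport.requests.Request()
credentials.refresh(http_request)
print('OAuth credentials OK; access token expires at %s' % credentials.expiry)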

Get the sample code from the GitHub repository.

(env) $ git clone https://github.com/googlesamples/assistant-sdk-python
(env) $ cd assistant-sdk-python

Install the Python package requirements for the sample program. We use the pushtotalk sample.

(env) $ cd google-assistant-sdk
(env) $ python setup.py install
(env) $ cd googlesamples/assistant/grpc
(env) $ pip install --upgrade -r requirements.txt 
(env) $ nano pushtotalk.py

To run the sample, we have to modify the sample code: change the exception type from SystemError to ValueError in the sample code (line 35).

  • pushtotalk.py
except ValueError:
    import assistant_helpers
    import audio_helpers

Run and test the pushtotalk sample. If the sample program works well, the main part of the work is done.

(env) $ python pushtotalk.py
INFO:root:Connecting to embeddedassistant.googleapis.com
Press Enter to send a new request...

Copy the sample to your working directory, then deactivate the Python virtual environment. More work is needed to make the AI speaker truly useful; we will work in the $(HOME)/ai_speaker directory.

(env) $ cd ..
(env) $ cp -r grpc ~/ai_speaker
(env) $ cd ~/ai_speaker
(env) $ cp pushtotalk.py ai_speaker.py
(env) $ deactivate
$ cd

Wake-Up-Word

The pushtotalk sample lets us interact with the AI assistant, but we have to press the Enter key before saying anything to it. In this section, I describe adding a Wake-Up-Word such as “Okay, Google”, “Alexa”, or “Jarvis”.
To detect the Wake-Up-Word, I used CMUSphinx, an open source local speech recognition toolkit.
Build and install SphinxBase, which provides common functionality across all CMUSphinx projects.

$ sudo apt install libtool bison swig python-dev autoconf automake
$ git clone --depth 1 https://github.com/cmusphinx/sphinxbase.git
$ cd sphinxbase
$ ./autogen.sh
$ make -j8
$ sudo make install
$ cd

SphinxBase will be installed in the '/usr/local/' directory by default. Not all systems load libraries from this folder automatically. In order to load them, you need to configure the path used to look for shared libraries. This can be done either in the '/etc/ld.so.conf' file or by exporting the environment variables:

export LD_LIBRARY_PATH=/usr/local/lib
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig

Build and install PocketSphinx. PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop.

$ git clone --depth 1 https://github.com/cmusphinx/pocketsphinx.git
$ cd pocketsphinx
$ ./autogen.sh
$ make -j8
$ sudo make install
$ cd

To test the installation, run pocketsphinx_continuous and check that it recognizes words you speak into your microphone.

$ pocketsphinx_continuous -inmic yes

For more information about building PocketSphinx, please refer to the Building an application with PocketSphinx page.
Run the pocketsphinx_continuous program as a subprocess in the AI speaker program; it is a good fit for hotword detection because it recognizes speech asynchronously. The subprocess module is part of the Python standard library, so nothing extra needs to be installed.
Also remove the wait_for_user_trigger related lines; our trigger is now the hotword.

$ source env/bin/activate
  • $(HOME)/ai_speaker/ai_speaker.py
"""Sample that implements gRPC client for Google Assistant API."""

# Add subprocess module
import subprocess
import json
import logging
import os.path

(......)

# Add the routines below inside the 'while True:' loop
        while True:
            # Spawn the keyword spotter and block until the hotword appears
            p = subprocess.Popen(args=['pocketsphinx_continuous',
                                       '-inmic', 'yes',
                                       '-kws_threshold', '1e-16',
                                       '-keyphrase', 'hey dude'],
                                 stdin=subprocess.PIPE,
                                 stdout=subprocess.PIPE,
                                 universal_newlines=True)
            while p.poll() is None:
                data = p.stdout.readline()
                if data.find("hey dude") != -1:
                    print("Detected Hotwords")
                    p.stdout.flush()
                    break
            p.terminate()

# Delete the 'wait_for_user_trigger' related lines; we no longer use them.

Our Wake-Up-Word is “hey dude”. Run the program, say “Hey dude”, and then say whatever you want to the AI assistant.

(env) $ cd ai_speaker
(env) $ python ai_speaker.py

Detection Sound

We have a problem after adding Wake-Up-Words. We cannot realize whether the AI speaker detects hotwords or not. We need to know the timing to command to AI assistant by voice.
In this section, we add the detection sound to the program.

Copy the detect.wav file (any short notification sound will do) to the ODROID-HC1.

$ scp ~/Downloads/detect.wav odroid@<ODROID-HC1 ip address>:~/

We use the pyaudio and wave modules to play the .wav file from Python code. The wave module is part of the Python standard library, so only PyAudio needs to be installed:

(env) $ pip install --upgrade pyaudio
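
Before wiring the sound into the assistant loop, it is worth checking that PyAudio can play the file at all. A standalone sketch, assuming detect.wav was copied to the odroid home directory:

  • play_detect.py (standalone playback test)
import wave

import pyaudio

CHUNK = 1024

f = wave.open('/home/odroid/detect.wav', 'rb')
pa = pyaudio.PyAudio()
stream = pa.open(format=pa.get_format_from_width(f.getsampwidth()),
                 channels=f.getnchannels(),
                 rate=f.getframerate(),
                 output=True)

# Stream the file to the speaker chunk by chunk
data = f.readframes(CHUNK)
while data:
    stream.write(data)
    data = f.readframes(CHUNK)

stream.stop_stream()
stream.close()
pa.terminate()
f.close()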

Add the detection sound play routine to the program. Full differences including Wake-Up-Words routines are as below.

(env) $ nano ai_speaker.py
  • diff file between original sample code pushtotalk.py and modified program ai_speaker.py
--- pushtotalk.py	2017-10-19 15:42:12.164741800 +0000
+++ ai_speaker.py	2017-10-19 15:41:49.644811151 +0000
@@ -14,6 +14,9 @@
 
 """Sample that implements gRPC client for Google Assistant API."""
 
+import pyaudio
+import wave
+import subprocess
 import json
 import logging
 import os.path
@@ -310,14 +313,38 @@
         # keep recording voice requests using the microphone
         # and playing back assistant response using the speaker.
         # When the once flag is set, don't wait for a trigger. Otherwise, wait.
-        wait_for_user_trigger = not once
+        chunk = 1024
+        pa = pyaudio.PyAudio()
+
         while True:
-            if wait_for_user_trigger:
-                click.pause(info='Press Enter to send a new request...')
+            # Spawn the keyword spotter and block until the hotword appears
+            p = subprocess.Popen(args=['pocketsphinx_continuous',
+                                       '-inmic', 'yes',
+                                       '-kws_threshold', '1e-16',
+                                       '-keyphrase', 'hey dude'],
+                                 stdin=subprocess.PIPE,
+                                 stdout=subprocess.PIPE,
+                                 universal_newlines=True)
+            while p.poll() is None:
+                data = p.stdout.readline()
+                if data.find("hey dude") != -1:
+                    print("Detected Hotwords")
+                    p.stdout.flush()
+                    break
+            p.terminate()
+
+            # Play the detection sound
+            f = wave.open(r"/home/odroid/detect.wav", "rb")
+            stream = pa.open(format=pa.get_format_from_width(f.getsampwidth()),
+                             channels=f.getnchannels(),
+                             rate=f.getframerate(),
+                             output=True)
+            wav_data = f.readframes(chunk)
+
+            while wav_data:
+                stream.write(wav_data)
+                wav_data = f.readframes(chunk)
+            stream.stop_stream()
+            stream.close()
+            f.close()
+
             continue_conversation = assistant.converse()
-            # wait for user trigger if there is no follow-up turn in
-            # the conversation.
-            wait_for_user_trigger = not continue_conversation
 
             # If we only want one conversation, break.
             if once and (not continue_conversation):

Run the AI speaker program.

(env) $ python ai_speaker.py

Next Step

Honestly, the detection rate of the Wake-Up-Word is poor. Whether we keep using PocketSphinx or find another solution, the Wake-Up-Word routine needs improvement.
Adding custom commands is an interesting part of this project. For example, it is easy to control IoT devices by voice if we use the Google Assistant SDK; search Google for the 'Actions on Google' keyword.
But here is a simpler and easier way to add custom commands: just add them to the ai_speaker.py program. In the pushtotalk sample, we can find the request text, which is the already-recognized user speech.

--- pushtotalk.py	2017-10-19 16:07:46.753689882 +0000
+++ pushtotalk_new.py	2017-10-19 16:09:58.165799271 +0000
@@ -119,6 +119,15 @@
                 logging.info('Transcript of user request: "%s".',
                              resp.result.spoken_request_text)
                 logging.info('Playing assistant response.')
+                # Add your custom voice commands here
+                # Ex>
+                # import os
+                # r_text = resp.result.spoken_request_text
+                # if r_text.find("play music") != -1:
+                #     os.system("mplayer ~/Music/* &")
+                # if r_text.find("turn on light") != -1:
+                #     os.system("echo 1 > /sys/class/gpio/gpio1/value")
+
             if len(resp.audio_out.audio_data) > 0:
                 self.conversation_stream.write(resp.audio_out.audio_data)
             if resp.result.spoken_response_text:

Try controlling your home electronic devices using an IoT controller and your voice.
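
If you accumulate more than a couple of commands, a small dispatch table keeps this readable. Here is a sketch of the same idea; the phrases, handlers, and GPIO path below are illustrative, not part of the sample:

  • custom_commands.py (illustrative dispatch table)
import os

def play_music():
    os.system('mplayer ~/Music/* &')

def light_on():
    # Hypothetical GPIO path; depends on how your IoT controller is wired
    os.system('echo 1 > /sys/class/gpio/gpio1/value')

COMMANDS = {
    'play music': play_music,
    'turn on light': light_on,
}

def handle_request(r_text):
    # Call this with resp.result.spoken_request_text from the sample
    for phrase, action in COMMANDS.items():
        if phrase in r_text:
            action()
            return True
    return False

Calling handle_request(resp.result.spoken_request_text) at the point marked in the diff above dispatches each recognized request before the assistant's audio reply is played.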