A Node.js + React web app that uses Deepgram to transcribe voice to text, sends the text to the OpenAI API, and streams the response back to be spoken one sentence at a time, optimizing for ultra-fast response times.
This software represents the foundation and starting point of my project to seamlessly integrate AI into daily life. It begins with voice-to-voice communication as a form more natural to human thought, and uses streaming to optimize for ultra-fast response time.
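The "one sentence at a time" idea can be illustrated with a minimal sketch. This is not the project's actual code: `createSentenceBuffer` and its callback are hypothetical names, and the rule that a sentence ends at `.`, `!`, or `?` followed by whitespace is an assumption made for the example. The point is that speech can start as soon as the first sentence is complete, without waiting for the full response.

```javascript
// Hypothetical helper (not from the repo): buffers streamed text chunks
// and emits each complete sentence as soon as it is available.
function createSentenceBuffer(onSentence) {
  let buffer = "";
  // Assumption: a sentence ends at ., !, or ? followed by whitespace.
  const boundary = /[.!?]+\s+/;

  return {
    // Feed one streamed chunk, for example a token delta from the model.
    push(chunk) {
      buffer += chunk;
      let match;
      while ((match = boundary.exec(buffer)) !== null) {
        const end = match.index + match[0].length;
        onSentence(buffer.slice(0, end).trim());
        buffer = buffer.slice(end);
      }
    },
    // Call once when the stream ends to emit any trailing text.
    flush() {
      const rest = buffer.trim();
      if (rest) onSentence(rest);
      buffer = "";
    },
  };
}

// Usage: chunks arrive in arbitrary pieces; sentences come out whole.
const spoken = [];
const sentenceBuffer = createSentenceBuffer((s) => spoken.push(s));
for (const chunk of ["Hel", "lo there. How ", "are you? I am", " fine"]) {
  sentenceBuffer.push(chunk);
}
sentenceBuffer.flush();
console.log(spoken); // [ 'Hello there.', 'How are you?', 'I am fine' ]
```

A boundary rule this simple will split on abbreviations such as "e.g. ", so a real pipeline would likely need something more careful.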
https://github.com/wiesenthal/AI-Voice-Chat/assets/26258920/1e75dda3-5bf1-4207-aa4c-f37a7dbc244a
Two servers, named "orchestrator" and "brain", primarily compose the software. These servers divide the logic between
The servers can be hosted locally or on AWS. I've currently turned off AWS to limit cost, but the architecture follows:
The orchestrator controls the customer journey:
On the backend:
The brain is set up as a separate server so that it can scale independently of the orchestrator server. This leaves room for the brain to open several threads per user in the future for more advanced processing.
To set up the program locally, perform the following steps:
CREATE TABLE users ( user_id VARCHAR(255) PRIMARY KEY, google_sub VARCHAR(255), name VARCHAR(255), email VARCHAR(255), payment_expiration_date BIGINT );
CREATE TABLE messages ( id INT AUTO_INCREMENT PRIMARY KEY, user_id VARCHAR(255), role VARCHAR(255), content TEXT, command_id VARCHAR(255), timestamp BIGINT, FOREIGN KEY (user_id) REFERENCES users(user_id) );
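As a rough illustration of how a row might be written to the messages table, here is a small sketch. It is not taken from the project: `buildMessageInsert` is a hypothetical helper, the `?` placeholders follow the style used by the `mysql` and `mysql2` Node.js packages (the source does not say which driver is used), and storing the timestamp as epoch milliseconds is an assumption.

```javascript
// Hypothetical helper (not from the repo): builds a parameterized INSERT
// for the messages table. Values are kept separate from the SQL text so
// user-provided content is never concatenated into the query string.
function buildMessageInsert({ userId, role, content, commandId, timestamp = Date.now() }) {
  return {
    sql:
      "INSERT INTO messages (user_id, role, content, command_id, `timestamp`) " +
      "VALUES (?, ?, ?, ?, ?)",
    // Assumption: timestamp is stored as epoch milliseconds.
    values: [userId, role, content, commandId, timestamp],
  };
}

// Usage with made-up example values:
const insert = buildMessageInsert({
  userId: "user-123",
  role: "user",
  content: "What's the weather like?",
  commandId: "cmd-1",
  timestamp: 1700000000000,
});
console.log(insert.sql);
console.log(insert.values);
```

The returned `sql` and `values` pair can then be handed to whatever query function the database client provides.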
npm install
inside both folders "brain" and "orchestrator", and the frontend folder within "brain".
inside brain:
OPENAI_API_KEY=<your secret key>
SESSION_SECRET_KEY=<an arbitrary string you can generate>
MODEL=<either gpt-3.5-turbo or gpt-4>
DB_NAME=<name of your database>
DB_USERNAME=<username for your database>
DB_ENDPOINT=<localhost, if hosting locally or the endpoint for your database>
DB_PASSWORD=<password for your database>
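A missing variable is easy to overlook, so a startup check can help. The sketch below is only an illustration: `findMissingEnv` and `BRAIN_REQUIRED` are hypothetical names, not part of the project, and the list simply mirrors the brain variables listed above.

```javascript
// Variables the brain server is expected to have, per the list above.
const BRAIN_REQUIRED = [
  "OPENAI_API_KEY",
  "SESSION_SECRET_KEY",
  "MODEL",
  "DB_NAME",
  "DB_USERNAME",
  "DB_ENDPOINT",
  "DB_PASSWORD",
];

// Hypothetical helper (not from the repo): returns the names of required
// variables that are unset or blank in the given environment object.
function findMissingEnv(env, required) {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || String(value).trim() === "";
  });
}

// Usage: warn at startup instead of failing later with a confusing error.
const missing = findMissingEnv(process.env, BRAIN_REQUIRED);
if (missing.length > 0) {
  console.warn("Missing environment variables: " + missing.join(", "));
}
```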
inside orchestrator:
DEEPGRAM_API_KEY=<secret key for https://deepgram.com/>
GOOGLE_CLIENT_ID=<google client ID for a google application to use google OAuth2>
SESSION_SECRET_KEY=<another arbitrary string you can generate>
BRAIN_RUNNING_LOCALLY=true
BRAIN_HOSTNAME=<unnecessary if hosting locally, but if cloud-hosting set the above to false and put the hostname of the brain server here>
DB_NAME=<name of your database>
DB_USERNAME=<username for your database>
DB_ENDPOINT=<localhost, if hosting locally or the endpoint for your database>
DB_PASSWORD=<password for your database>
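The relationship between BRAIN_RUNNING_LOCALLY and BRAIN_HOSTNAME can be sketched as follows. This is an assumption about how the two settings interact, based only on the descriptions above; `resolveBrainHost` is a hypothetical helper, not the orchestrator's actual code.

```javascript
// Hypothetical helper (not from the repo): picks the brain's hostname
// from the orchestrator's environment settings.
function resolveBrainHost(env, localHost = "localhost") {
  // Environment values are strings, so compare against "true" explicitly.
  const runningLocally =
    String(env.BRAIN_RUNNING_LOCALLY).toLowerCase() === "true";
  if (runningLocally) {
    return localHost; // BRAIN_HOSTNAME is unnecessary in this case
  }
  if (!env.BRAIN_HOSTNAME) {
    throw new Error(
      "BRAIN_HOSTNAME must be set when BRAIN_RUNNING_LOCALLY is not true"
    );
  }
  return env.BRAIN_HOSTNAME;
}

// Usage with example values (the cloud hostname is made up):
console.log(resolveBrainHost({ BRAIN_RUNNING_LOCALLY: "true" }));
console.log(
  resolveBrainHost({
    BRAIN_RUNNING_LOCALLY: "false",
    BRAIN_HOSTNAME: "brain.example.com",
  })
);
```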
npm start
cd frontend && npm run build && cd .. && npm start
Alternatively, use ./run to do this as a macro.