TrainMyAI - Install Demo

These instructions are for installing TrainMyAI on your own server.

If you prefer a hassle-free trial of the product, fill out this form for a free cloud-based demo.

Local or Remote Language Models

Before installing TrainMyAI, it's important to decide whether to use a local language model or not, since this affects the server requirements. For more information, see choosing a language model.

Requirements

Modern 64-bit Linux server:
- For ChatGPT only (remote language models): Ubuntu 18/20/22, Debian 10/11/12 or CentOS Stream 8/9.
- For Llama 3 (local language model): Ubuntu 22 required.
Minimum memory: 4 GB (ChatGPT only) or 16 GB (Llama 3).
Disk: SSD with 20 GB free (ChatGPT only) or 40 GB (Llama 3).
For ChatGPT (remote language models):
- Account for OpenAI Platform (paid) or Azure OpenAI Service.
For Llama 3 (local language model):
- Best performance: NVIDIA GPU with 12+ GB of VRAM and CUDA compute capability 7.5+ (see table), e.g. NVIDIA RTX 3060 12GB.
  Cloud providers include Amazon EC2 (G4dn), Azure (NCasT4_v3), Google Cloud (N1+T4, G2+L4).
- Reasonable performance: High-end Intel/AMD systems with at least 16 CPU cores.
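
As a rough pre-flight check against the minimums listed above, a shell sketch along these lines could be used. The function names (mem_total_gb, disk_free_gb, check_min) are illustrative and are not part of TrainMyAI; the thresholds in the usage comments are the ones stated above.

```shell
# Pre-flight sketch: compare this server with the stated minimums.
# Function names are illustrative; they are not part of TrainMyAI.

# Total RAM in GB, rounded to the nearest integer (Linux /proc/meminfo).
# Rounding is used because a "16 GB" machine typically reports slightly less.
mem_total_gb() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1048576 + 0.5 }' /proc/meminfo
}

# Free disk space in whole GB on the filesystem holding the given path.
disk_free_gb() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}

# check_min LABEL ACTUAL_GB MINIMUM_GB: report, and return non-zero if too low.
check_min() {
    if [ "$2" -ge "$3" ]; then
        echo "ok: $1 $2 GB (minimum $3 GB)"
    else
        echo "LOW: $1 $2 GB (minimum $3 GB)"
        return 1
    fi
}

# Example for a ChatGPT-only install (use 16 and 40 for Llama 3):
#   check_min memory "$(mem_total_gb)" 4
#   check_min disk "$(disk_free_gb /home)" 20
# With an NVIDIA driver installed, GPU model and memory can be listed with:
#   nvidia-smi --query-gpu=name,memory.total --format=csv
```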

Easy Installation for Ubuntu and Debian

A shell script is available to automatically install TrainMyAI on a clean Ubuntu or Debian server:

Log in to the Ubuntu or Debian server command line.
Download the installation script from our site:
wget -O install-trainmyai.sh https://trainmy.ai/install-demo/install-trainmyai.php
If you only wish to use ChatGPT (remote language models), run the script with no parameters:
sudo bash install-trainmyai.sh
During installation, the script will ask some questions – in most cases, you can accept the defaults presented.
To use Llama 3 with an NVIDIA GPU, install the GPU drivers, reboot, then run the install script:
sudo bash install-trainmyai.sh nvidia
sudo reboot
sudo bash install-trainmyai.sh local
To use Llama 3 without a GPU, just run the install script with the local parameter:
sudo bash install-trainmyai.sh local
Open the server's address in your web browser and sign up as the first TrainMyAI admin user.
To use ChatGPT, connect TrainMyAI to your OpenAI Platform or Azure OpenAI Service account:
- OpenAI Platform: Enter your OpenAI API key in the 'Configuration' page (via the 'Manage' menu). We recommend adding a payment method to your OpenAI account, to increase the API rate limit and test TrainMyAI properly. Each TrainMyAI chat response costs well under $0.01 (excluding GPT-4).
- Azure OpenAI Service: Open the config.ini file in the trainmyai directory in a text editor. If you did not change the default location, run nano -w /home/trainmyai/config.ini. In the [openai] section, set provider = azure, set api_key to your Azure OpenAI API key, and enter the other Azure credentials.
Now create the first knowledge base and start adding content to it.
Once the content is analyzed, click 'Chat' in the menu to start asking questions.
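
The three ways of running the install script described above can be summarized in a small helper. This is only a sketch: the scenario names (remote, local-gpu, local-cpu) are made up for illustration, and the function prints the commands rather than running them.

```shell
# Print the install commands for a given scenario, as described above.
# Scenario names are hypothetical labels, not options of the install script.
install_commands() {
    case $1 in
        remote)     # ChatGPT only (remote language models)
            echo "sudo bash install-trainmyai.sh" ;;
        local-gpu)  # Llama 3 with an NVIDIA GPU: drivers, reboot, then install
            printf '%s\n' \
                "sudo bash install-trainmyai.sh nvidia" \
                "sudo reboot" \
                "sudo bash install-trainmyai.sh local" ;;
        local-cpu)  # Llama 3 without a GPU
            echo "sudo bash install-trainmyai.sh local" ;;
        *)
            echo "unknown scenario: $1" >&2
            return 1 ;;
    esac
}

# Example: install_commands local-gpu
```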

Getting help

If you encounter any problems with the installation process, please contact us with the following information:

A description of the problem, including any error messages shown.
Linux distribution, e.g. Ubuntu 22.
MySQL/MariaDB version.
PHP version.
Any relevant lines of Apache's error_log file.

You can run the following set of commands to collect much of this information together:

grep PRETTY_NAME /etc/os-release ; mysql --version ; php -v
sudo grep -s PHP /var/log/httpd/error_log /var/log/apache2/error.log | tail -n 10

Manual Installation

If you are not running Ubuntu or Debian, or prefer to follow the installation process step-by-step, you can also install TrainMyAI manually. This process will take around half an hour for experienced system administrators.

1. Install prerequisites

First, TrainMyAI needs a few common packages to be installed on your Linux server.

Ensure Apache, PHP and MySQL are installed and active. The process will depend on the version of Linux you are using – there are tons of guides available online.
To enable extracting text from PDF, DOC and DOCX files, install Poppler, catdoc and docx2txt:

Ubuntu/Debian sudo apt install poppler-utils catdoc docx2txt

CentOS sudo yum install poppler-utils

(Note that DOC and DOCX files are not yet supported on CentOS.)

Install any other required packages that might be missing:

Ubuntu/Debian	`sudo apt install curl php-curl php-json php-mbstring wget tar nano`
CentOS	`sudo yum install curl php-curl php-json php-mbstring wget tar nano`

To use Llama 3 (local language model), perform these additional steps to install other required packages:

sudo apt install nvidia-driver-535
sudo apt install nvidia-cuda-toolkit
sudo apt install python3-venv
sudo reboot
Then wait for the system to reboot and continue.
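
Before moving on, it may help to confirm that the tools installed in this step are actually on the PATH. The helper below is a sketch (missing_commands is an illustrative name); the command names in the usage comment assume the usual binaries shipped by the packages above, e.g. pdftotext from poppler-utils.

```shell
# Print each named command that cannot be found on the PATH.
# missing_commands is an illustrative helper, not part of TrainMyAI.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example (no output means everything was found):
#   missing_commands apachectl php mysql pdftotext catdoc docx2txt curl wget tar
# The loaded PHP extensions can be listed with:
#   php -m
```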

2. Install files and directories

Now we're going to download and install TrainMyAI, and create a directory for it to store its files.

Use cd to navigate to a directory in which to create the TrainMyAI directory. For security reasons, this should not be within the web content directory (we'll create the necessary link later on). You can install TrainMyAI in any location, so long as its files are readable by the web server.
Download and install TrainMyAI:
wget https://trainmy.ai/download/trainmyai-1.5.tar.gz
tar -xzf trainmyai-1.5.tar.gz
cd trainmyai
Decide where TrainMyAI should store its data directory, e.g. /home/trainmyai_data (example used below) and create this directory:
sudo mkdir /home/trainmyai_data
Ensure this data directory is owned by the same user which runs the Apache web server, as follows:

Ubuntu/Debian sudo chown www-data /home/trainmyai_data

CentOS sudo chown apache /home/trainmyai_data
To use Llama 3 (local language model), perform these additional steps to download the models and install libraries:

wget -P models/models--llama-3-trainmyai-001 http://models.trainmy.net/llama-3-trainmyai-001.gguf
wget http://models.trainmy.net/ms-embeds-trainmyai-001.tar
mkdir -p models/models--ms-embeds-trainmyai-001
tar -xf ms-embeds-trainmyai-001.tar -C models/models--ms-embeds-trainmyai-001
rm ms-embeds-trainmyai-001.tar

python3 -m venv include/external/pytorch_env
source include/external/pytorch_env/bin/activate
pip3 install torch
pip3 install transformers
deactivate
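
To double-check the ownership step above, the owner of the data directory can be read back. This is a minimal sketch assuming GNU stat (standard on the Linux distributions listed earlier); dir_owner is an illustrative name.

```shell
# Sketch: confirm the data directory is owned by the web server user.
# dir_owner is an illustrative helper, not part of TrainMyAI.
dir_owner() {
    stat -c %U "$1"   # GNU stat: print the owning user's name
}

# Example, using the directory from the steps above (apache on CentOS):
#   [ "$(dir_owner /home/trainmyai_data)" = www-data ] && echo "owner ok"
# A write test as the web server user is another option:
#   sudo -u www-data test -w /home/trainmyai_data && echo "writable"
```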

3. Create the database

TrainMyAI stores some of its information in a MySQL or MariaDB database, which we'll now set up.

Choose a database name (e.g. trainmyai_db), user name (e.g. trainmyai_user) and password (denoted by <password> below) for the TrainMyAI database and note them down. We will use these examples below and assume the database is running on the same server as the website.
Log in to MySQL or MariaDB as the root user, entering the MySQL root password:
mysql -u root -p
Create the TrainMyAI database and user, then grant the user all privileges on that database:
CREATE USER 'trainmyai_user'@'localhost' IDENTIFIED BY '<password>';
CREATE DATABASE trainmyai_db;
GRANT ALL PRIVILEGES ON trainmyai_db.* TO 'trainmyai_user'@'localhost';
Type exit to leave the MySQL command line.
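
If you chose different names from the examples, the same three statements can be generated from a small shell function and piped into the MySQL client. This is a sketch only: trainmyai_db_sql is an illustrative name, and the values are inserted as-is, so it assumes they contain no quote characters.

```shell
# Print the SQL statements from the step above for the given names.
# Usage: trainmyai_db_sql DATABASE USER PASSWORD
# Illustrative helper; values are not escaped, so avoid quotes in them.
trainmyai_db_sql() (
    db=$1 user=$2 pass=$3
    printf "CREATE USER '%s'@'localhost' IDENTIFIED BY '%s';\n" "$user" "$pass"
    printf "CREATE DATABASE %s;\n" "$db"
    printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'localhost';\n" "$db" "$user"
)

# Example (mysql still prompts for the root password):
#   trainmyai_db_sql trainmyai_db trainmyai_user '<password>' | mysql -u root -p
```

Note that a password passed on the command line may end up in your shell history, so typing the statements interactively, as described above, remains the more cautious route.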

4. Configure TrainMyAI

Now it's time to set up TrainMyAI to use the chosen data directory, database and your OpenAI account.

Make a copy of the TrainMyAI example configuration file:
cp config-example.ini config.ini
Use your favorite text editor to start editing config.ini, e.g. nano config.ini.
Decide whether TrainMyAI should be at the root of your web site or in a subdirectory. If it will be at the root, leave url_base = / as is. Otherwise, set it accordingly, e.g. url_base = /trainmyai/
Set data_directory to the full path of the directory created for TrainMyAI to store its files, e.g. /home/trainmyai_data
Enter the MySQL or MariaDB database credentials chosen earlier in the [database] section.
To use ChatGPT (remote language models), connect to your OpenAI Platform or Azure OpenAI Service account:
- OpenAI Platform: Enter your OpenAI API key in the api_key setting of the [openai] section. We recommend adding a payment method to your OpenAI account, to increase the API rate limit and test TrainMyAI properly. Each TrainMyAI chat response costs well under $0.01 (excluding GPT-4).
- Azure OpenAI Service: In the [openai] section, set provider = azure, set api_key to your Azure OpenAI API key, and enter the other Azure credentials.
To use Llama 3 (local language model), perform the following steps:
- In the [ref_embeddings] section, set openai = off.
- In the [ref_embeddings] section, set local_api_1 = on.
- In the [openai] section, set enabled = off.
- In the [local_llm] section, set enabled = on.
- In the [local_api] section, set enabled = on.
Save the changes to disk and exit the text editor.
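
For reference, after the Llama 3 steps above, the relevant parts of config.ini would look roughly like this. Only the keys named in those steps are shown; any other keys that config-example.ini places in these sections are omitted here and should be left as they are.

```ini
[ref_embeddings]
openai = off
local_api_1 = on

[openai]
enabled = off

[local_llm]
enabled = on

[local_api]
enabled = on
```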

5. Set up the web site

We'll now set up your web server to work with TrainMyAI.

Create a symbolic link in your web server's content directory to the html directory of TrainMyAI. To serve TrainMyAI at the root of your website:
sudo mv /var/www/html /var/www/html-old
sudo ln -s $(pwd)/html /var/www/html
If you prefer to serve it in a subdirectory such as trainmyai of your website, create a symbolic link there instead:
sudo ln -s $(pwd)/html /var/www/html/trainmyai
Note that servers running SELinux may prevent these symbolic links being followed. You can use the getenforce command to check (if the command is not found, SELinux is not installed). If SELinux is installed, you may need to configure it to allow Apache to access this directory.
Ensure that Apache is configured to follow symbolic links and to honor .htaccess files. To do this, use your favorite text editor to modify Apache's configuration file, e.g. using nano:

Ubuntu/Debian sudo nano /etc/apache2/apache2.conf

CentOS sudo nano /etc/httpd/conf/httpd.conf
Add the following block at the end of the configuration file:
<Directory /var/www/html>
Options FollowSymLinks
AllowOverride All
</Directory>
Save the changes to disk and exit the text editor.
Activate Apache's rewrite module (where required) and restart the web server:

Ubuntu/Debian sudo a2enmod rewrite && sudo systemctl restart apache2

CentOS sudo systemctl restart httpd
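
To confirm that the symbolic link created above points where you expect, a check like the following could be run from the trainmyai directory. It is a sketch assuming GNU readlink; symlink_ok is an illustrative name.

```shell
# Succeed if LINK is a symbolic link that resolves to the same place as TARGET.
# symlink_ok is an illustrative helper, not part of TrainMyAI.
symlink_ok() {
    [ -L "$1" ] && [ "$(readlink -f "$1")" = "$(readlink -f "$2")" ]
}

# Example for a site served at the root:
#   symlink_ok /var/www/html "$(pwd)/html" && echo "link ok"
# Apache's configuration syntax can also be checked before restarting:
#   sudo apachectl configtest
```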

6. Create cron job

Some of TrainMyAI's tasks are asynchronous, meaning that they aren't activated directly by a web request. To keep these moving, we need to set a cron job to make regular requests to a special web page.

Start editing the crontab file for your user:
crontab -e
If the interface is unfamiliar, you're probably using Vi/Vim, so press i to enter insert mode.
Add in the following line, substituting the appropriate URL for your TrainMyAI site. For example, if TrainMyAI is at https://my-site.com/trainmyai/ you would add:
* * * * * wget -O - https://my-site.com/trainmyai/trainmyai_async >>/dev/null 2>&1
Save changes to disk and exit the editor. (To do this in Vi/Vim, press escape then :wq.)
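
The crontab line above depends only on your site's base URL, so it can be produced by a small helper and pasted into the editor. This is a sketch; cron_line is an illustrative name, and it accepts the base URL with or without a trailing slash.

```shell
# Print the crontab entry from the step above for a given TrainMyAI base URL.
# cron_line is an illustrative helper, not part of TrainMyAI.
cron_line() {
    # ${1%/} drops one trailing slash so the URL is not doubled up.
    printf '* * * * * wget -O - %s/trainmyai_async >>/dev/null 2>&1\n' "${1%/}"
}

# Example: cron_line https://my-site.com/trainmyai/
# After saving, the installed entries can be listed with:
#   crontab -l
```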

7. Install loader and verify

We're nearly there. The last step is to install the SourceGuardian loader and verify the installation.

Open the installation verification page verify-install.php in your web browser. For example, https://my-site.com/trainmyai/verify-install.php
If you see a 'Forbidden' error, the most likely cause is that the web server does not have permission to pass through the directory containing the trainmyai directory. This can be fixed.

You need to add global 'execute' permission to this enclosing directory. On Linux, the 'execute' permission for a directory controls which users can access the files and subdirectories inside it (listing their names is governed by 'read'), and is not related to actually executing code. Ensuring you are still in the trainmyai directory, do the following:

sudo chmod a+x ..

Then refresh the page in your web browser. If that still doesn't help, you can also try adding the same execute permissions to the directories further up the hierarchy:

sudo chmod a+x ../..
sudo chmod a+x ../../..

Alternatively, instead of changing permissions, you could move the trainmyai directory to a better location such as /home, and then update the symbolic link at (or in) /var/www/html.
Follow the instructions on the page to install the SourceGuardian loader. To download the loader directly on the server, copy the link for the loader given in the page and run the following:
wget --content-disposition -U agent '<url>'
Follow the SourceGuardian instructions to move the downloaded ixed... file to the correct directory.
Follow the SourceGuardian instructions to add the appropriate extension=... directive to PHP's configuration file.
Restart the web server:

Ubuntu/Debian sudo systemctl restart apache2

CentOS sudo systemctl restart httpd
Refresh the installation verification page. If the SourceGuardian message is still appearing, try running sudo systemctl reload php-fpm and refresh again. (On Ubuntu and Debian, the service name usually includes the PHP version, e.g. php8.1-fpm.)
You should be told the database was created and whether anything is missing. If you see a message about the cron script, wait a minute then refresh the page again.
Click the link to go to TrainMyAI, and sign up as the first admin user.
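
For the 'Forbidden' error described in the first step above, rather than adding permissions one level at a time, you could first list which enclosing directories lack the global 'execute' permission. The function below is a sketch assuming GNU stat; dirs_missing_world_execute is an illustrative name.

```shell
# Print every directory from PATH up to / that lacks world-execute permission.
# dirs_missing_world_execute is an illustrative helper, not part of TrainMyAI.
dirs_missing_world_execute() {
    dir=$(cd "$1" && pwd -P) || return 1
    while :; do
        perms=$(stat -c %A "$dir")      # e.g. drwxr-x---
        case $perms in
            *[xt]) ;;                   # last character x or t: traversable
            *) printf '%s\n' "$dir" ;;  # otherwise report the directory
        esac
        [ "$dir" = "/" ] && break
        dir=$(dirname "$dir")
    done
}

# Example, run from the trainmyai directory:
#   dirs_missing_world_execute "$(pwd)/html"
# util-linux offers a similar per-component view:
#   namei -l "$(pwd)/html"
```

Each directory printed is a candidate for sudo chmod a+x as described above.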

That's everything! Now you can create the first knowledge base and start adding content.