These instructions are for installing TrainMyAI on your own server.
If you prefer a hassle-free trial of the product, fill out this form for a free cloud-based demo.
Local or Remote Language Models
Before installing TrainMyAI, it's important to decide whether to use a local language model or not, since this affects the server requirements. For more information, see choosing a language model.
Requirements
- Modern 64-bit Linux server:
- For ChatGPT only (remote language models): Ubuntu 18/20/22, Debian 10/11/12 or CentOS Stream 8/9.
- For Llama 3 (local language model): Ubuntu 22 required.
- Minimum memory: 4 GB (ChatGPT only) or 16 GB (Llama 3).
- Disk: SSD with 20 GB free (ChatGPT only) or 40 GB (Llama 3).
- For ChatGPT (remote language models):
- Account for OpenAI Platform (paid) or Azure OpenAI Service.
- For Llama 3 (local language model):
- Best performance: NVIDIA GPU with 12+ GB of RAM and CUDA compute capability 7.5+ (see table), e.g. NVIDIA RTX 3060 12GB. Cloud providers include Amazon EC2 (G4dn), Azure (NCasT4_v3), Google Cloud (N1+T4, G2+L4).
- Reasonable performance: high-end Intel/AMD systems with at least 16 CPU cores.
Easy Installation for Ubuntu and Debian
A shell script is available to automatically install TrainMyAI on a clean Ubuntu or Debian server:
- Log in to the Ubuntu or Debian server command line.
- Download the installation script from our site:
wget -O install-trainmyai.sh https://trainmy.ai/install-demo/install-trainmyai.php
- If you only wish to use ChatGPT (remote language models), run the script with no parameters:
sudo bash install-trainmyai.sh
During installation, the script will ask some questions – in most cases, you can accept the defaults presented.
- To use Llama 3 with an NVIDIA GPU, install the GPU drivers, reboot (sudo reboot), then run the install script with the nvidia parameter:
sudo bash install-trainmyai.sh nvidia
- To use Llama 3 without a GPU, just run the install script with the local parameter:
sudo bash install-trainmyai.sh local
- Open the server's address in your web browser and sign up as the first TrainMyAI admin user.
- To use ChatGPT, connect TrainMyAI to your OpenAI Platform or Azure OpenAI Service account:
- OpenAI Platform: Enter your OpenAI API key in the 'Configuration' page (via the 'Manage' menu). We recommend adding a payment method to your OpenAI account, to increase the API rate limit and test TrainMyAI properly. Each TrainMyAI chat response costs well under $0.01 (excluding GPT-4).
- Azure OpenAI Service: Open the config.ini file in the trainmyai directory in a text editor. If you did not change the default location, run nano -w /home/trainmyai/config.ini. In the [openai] section, set provider = azure, set api_key to your Azure OpenAI API key, and enter the other Azure credentials.
- Now create the first knowledge base and start adding content to it.
- Once the content is analyzed, click 'Chat' in the menu to start asking questions.
Getting help
If you encounter any problems with the installation process, please contact us with the following information:
- A description of the problem, including any error messages shown.
- Linux distribution, e.g. Ubuntu 22.
- MySQL/MariaDB version.
- PHP version.
- Any relevant lines of Apache's error_log file.
You can run the following set of commands to collect much of this information together:
grep PRETTY_NAME /etc/os-release ; mysql --version ; php -v
sudo grep -s PHP /var/log/httpd/error_log /var/log/apache2/error.log | tail -n 10
Manual Installation
If you are not running Ubuntu or Debian, or prefer to follow the installation process step-by-step, you can also install TrainMyAI manually. This process will take around half an hour for experienced system administrators.
1. Install prerequisites
First, TrainMyAI needs a few common packages to be installed on your Linux server.
- Ensure Apache, PHP and MySQL are installed and active. The process will depend on the version of Linux you are using – there are many guides available online.
- To enable extracting text from PDF, DOC and DOCX files, install Poppler, catdoc and docx2txt:
Ubuntu/Debian sudo apt install poppler-utils catdoc docx2txt
CentOS sudo yum install poppler-utils
- Install any other required packages that might be missing:
Ubuntu/Debian sudo apt install curl php-curl php-json php-mbstring wget tar nano
CentOS sudo yum install curl php-curl php-json php-mbstring wget tar nano
- To use Llama 3 (local language model), perform these additional steps to install other required packages:
sudo apt install nvidia-driver-535
sudo reboot
Then wait for the system to reboot and continue:
sudo apt install nvidia-cuda-toolkit
sudo apt install python3-venv
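After the driver installation and reboot above, a quick sanity check can confirm the NVIDIA tools are visible before continuing (a sketch; assumes nvidia-smi was installed alongside the driver package):

```shell
# Post-reboot sanity check (sketch): is the NVIDIA driver tool on PATH?
if command -v nvidia-smi >/dev/null 2>&1; then
  DRIVER_STATUS="ok"
else
  DRIVER_STATUS="missing"
fi
echo "NVIDIA driver: $DRIVER_STATUS"
```

If the status is "missing", re-run the driver install step and reboot again before proceeding.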
2. Install files and directories
Now we're going to download and install TrainMyAI, and create a directory for it to store its files.
- Use cd to navigate to a directory in which to create the TrainMyAI directory. For security reasons, this should not be within the web content directory (we'll create the necessary link later on). You can install TrainMyAI in any location, so long as its files are readable by the web server.
- Download and install TrainMyAI:
wget https://trainmy.ai/download/trainmyai-1.6.tar.gz
tar -xzf trainmyai-1.6.tar.gz
cd trainmyai
- Decide where TrainMyAI should store its data directory, e.g. /home/trainmyai_data (example used below), and create this directory:
sudo mkdir /home/trainmyai_data
- Ensure this data directory is owned by the same user which runs the Apache web server, as follows:
Ubuntu/Debian sudo chown www-data /home/trainmyai_data
CentOS sudo chown apache /home/trainmyai_data
- To use Llama 3 (local language model), perform these additional steps to download the models and install libraries:
wget -P models/models--llama-3-trainmyai-001 http://models.trainmy.net/llama-3-trainmyai-001.gguf
wget http://models.trainmy.net/ms-embeds-trainmyai-001.tar
tar -xf ms-embeds-trainmyai-001.tar -C models/models--ms-embeds-trainmyai-001
rm ms-embeds-trainmyai-001.tar
python3 -m venv include/external/pytorch_env
source include/external/pytorch_env/bin/activate
pip3 install torch
pip3 install transformers
deactivate
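Permissions problems with the data directory are a common source of installation trouble, so it's worth checking it before moving on. This is a small sketch for confirming that a directory exists and is writable (the www-data path in the comment assumes the Ubuntu/Debian defaults used above):

```shell
# Sketch: check that a directory exists and is writable by the current user.
# On the real server, also test as the Apache user, e.g. on Ubuntu/Debian:
#   sudo -u www-data test -w /home/trainmyai_data && echo writable
check_dir() { [ -d "$1" ] && [ -w "$1" ] && echo "writable" || echo "not writable"; }
check_dir /home/trainmyai_data
```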
3. Create the database
TrainMyAI stores some of its information in a MySQL or MariaDB database, which we'll now set up.
- Choose a database name (e.g. trainmyai_db), user name (e.g. trainmyai_user) and password (denoted by <password> below) for the TrainMyAI database, and note them down. We will use these examples below and assume the database is running on the same server as the website.
- Log in to MySQL or MariaDB as the root user, entering the MySQL root password:
mysql -u root -p
- Create the TrainMyAI database and user, and grant all privileges:
CREATE USER 'trainmyai_user'@'localhost' IDENTIFIED BY '<password>';
CREATE DATABASE trainmyai_db;
GRANT ALL PRIVILEGES ON trainmyai_db.* TO 'trainmyai_user'@'localhost';
- Type exit to leave the MySQL command line.
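If you still need to pick the <password> value, one hypothetical way to generate a strong random one directly on the server:

```shell
# Sketch: generate a random 24-character alphanumeric password for the
# TrainMyAI database user (substitute it for <password> above)
DB_PASSWORD=$(head -c 512 /dev/urandom | base64 | tr -dc 'A-Za-z0-9' | cut -c1-24)
echo "Generated database password: $DB_PASSWORD"
```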
4. Configure TrainMyAI
Now it's time to set up TrainMyAI to use the chosen data directory, database and your OpenAI account.
- Make a copy of the TrainMyAI example configuration file:
cp config-example.ini config.ini
- Use your favorite text editor to start editing config.ini, e.g. nano config.ini.
- Decide whether TrainMyAI should be at the root of your web site or in a subdirectory. If it will be at the root, leave url_base = / as is. Otherwise, set it accordingly, e.g. url_base = /trainmyai/
- Set data_directory to the full path of the directory created for TrainMyAI to store its files, e.g. /home/trainmyai_data
- Enter the MySQL or MariaDB database credentials chosen earlier in the [database] section.
- To use ChatGPT (remote language models), connect to your OpenAI Platform or Azure OpenAI Service account:
- OpenAI Platform: Enter your OpenAI API key in the api_key setting of the [openai] section. We recommend adding a payment method to your OpenAI account, to increase the API rate limit and test TrainMyAI properly. Each TrainMyAI chat response costs well under $0.01 (excluding GPT-4).
- Azure OpenAI Service: In the [openai] section, set provider = azure, set api_key to your Azure OpenAI API key, and enter the other Azure credentials.
- To use Llama 3 (local language model), perform the following steps:
- In the [ref_embeddings] section, set openai = off.
- In the [ref_embeddings] section, set local_api_1 = on.
- In the [openai] section, set enabled = off.
- In the [local_llm] section, set enabled = on.
- In the [local_api] section, set enabled = on.
- Save the changes to disk and exit the text editor.
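Taken together, the Llama 3 changes above leave the affected sections of config.ini looking like this (a sketch showing only the keys named above; any other settings in each section stay as they were):

```ini
[ref_embeddings]
openai = off
local_api_1 = on

[openai]
enabled = off

[local_llm]
enabled = on

[local_api]
enabled = on
```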
5. Set up the web site
We'll now set up your web server to work with TrainMyAI.
- Create a symbolic link in your web server's content directory to the html directory of TrainMyAI. To serve TrainMyAI at the root of your website:
sudo mv /var/www/html /var/www/html-old
sudo ln -s $(pwd)/html /var/www/html
If you prefer to serve it in a subdirectory such as trainmyai of your website, create a symbolic link instead:
sudo ln -s $(pwd)/html /var/www/html/trainmyai
Note that servers running SELinux may prevent these symbolic links being followed. You can use the getenforce command to check (if the command is not found, SELinux is not installed). If SELinux is installed, you may need to configure it to allow Apache to access this directory.
- Ensure that Apache is configured to follow symbolic links and .htaccess files. To do this, use your favorite text editor to modify Apache's configuration file, e.g. using nano:
Ubuntu/Debian sudo nano /etc/apache2/apache2.conf
CentOS sudo nano /etc/httpd/conf/httpd.conf
- Add the following block at the end of the configuration file:
<Directory /var/www/html>
Options FollowSymLinks
AllowOverride All
</Directory>
- Save the changes to disk and exit the text editor.
- Activate Apache's rewrite module (where required) and restart the web server:
Ubuntu/Debian sudo a2enmod rewrite
sudo systemctl restart apache2
CentOS sudo systemctl restart httpd
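Mistakes with the symbolic link are easy to make here. This self-contained sketch demonstrates checking where a link resolves using readlink -f, in a throwaway directory (on the real server, you would instead run readlink -f /var/www/html):

```shell
# Sketch: demonstrate checking a symlink's target with readlink -f.
# Uses a temporary directory so it is safe to run anywhere.
TMP=$(mktemp -d)
mkdir "$TMP/trainmyai_html"          # stand-in for TrainMyAI's html directory
ln -s "$TMP/trainmyai_html" "$TMP/link"
RESOLVED=$(readlink -f "$TMP/link")  # canonical path the link points to
echo "link resolves to: $RESOLVED"
```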
6. Create cron job
Some of TrainMyAI's tasks are asynchronous, meaning that they aren't activated directly by a web request. To keep these moving, we need to set up a cron job to make regular requests to a special web page.
- Start editing the crontab file for your user:
crontab -e
- If the interface is unfamiliar, you're probably using Vi/Vim, so press i to enter insert mode.
- Add in the following line, substituting the appropriate URL for your TrainMyAI site. For example, if TrainMyAI is at https://my-site.com/trainmyai/ you would add:
* * * * * wget -O - https://my-site.com/trainmyai/trainmyai_async >>/dev/null 2>&1
- Save changes to disk and exit the editor. (To do this in Vi/Vim, press escape then :wq.)
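To confirm the crontab entry was saved, you can list the crontab back and search for it (a sketch; prints the line if it is present):

```shell
# Sketch: show the saved TrainMyAI cron entry, or a notice if it is absent
crontab -l 2>/dev/null | grep trainmyai_async || echo "entry not found"
```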
7. Install loader and verify
We're nearly there. The last step is to install the SourceGuardian loader and verify the installation.
- Open the installation verification page verify-install.php in your web browser. For example, https://my-site.com/trainmyai/verify-install.php
- If you see a 'Forbidden' error, this is because the web server does not have permission to look within the directory containing the trainmyai directory; granting the web server execute permission on that containing directory fixes this.
- Follow the instructions on the page to install the SourceGuardian loader. To download the loader directly on the server, copy the link for the loader given in the page and run the following:
wget --content-disposition -U agent '<url>'
- Follow the SourceGuardian instructions to move the downloaded ixed... file to the correct directory.
- Follow the SourceGuardian instructions to add the appropriate extension=... directive to PHP's configuration file.
- Restart the web server:
Ubuntu/Debian sudo systemctl restart apache2
CentOS sudo systemctl restart httpd
- Refresh the installation verification page. If the SourceGuardian message is still appearing, try running sudo systemctl reload php-fpm and refresh again.
- You should be told the database was created and whether anything is missing. If you see a message about the cron script, wait a minute then refresh the page again.
- Click the link to go to TrainMyAI, and sign up as the first admin user.
That's everything! Now you can create the first knowledge base and start adding content.