Model Overview: The model has 1.5 billion parameters and uses a decoder-only transformer architecture, which is well suited to text-generation tasks. The embedding dimension is 1536, and multi-head self-attention lets the model capture context across the input sequence.
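The multi-head self-attention mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration of the mechanism with a causal mask (so each token only attends to earlier positions, as in a decoder-only model), not the actual YSHOP implementation; the learned Q/K/V projection matrices are omitted for brevity, and the toy dimensions are made up.

```python
import numpy as np

def causal_self_attention(x, n_heads):
    """Scaled dot-product self-attention with a causal mask,
    so each token can only attend to itself and earlier tokens."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    out = np.zeros_like(x)
    # Large negative scores on future positions zero them out after softmax.
    mask = np.triu(np.ones((seq_len, seq_len)), k=1) * -1e9
    for h in range(n_heads):
        # Real models apply learned projections; here Q = K = V = the head's slice.
        q = k = v = x[:, h * d_head:(h + 1) * d_head]
        scores = q @ k.T / np.sqrt(d_head) + mask
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
        out[:, h * d_head:(h + 1) * d_head] = weights @ v
    return out

x = np.random.randn(4, 8)               # 4 tokens, toy embedding dim 8
y = causal_self_attention(x, n_heads=2)
print(y.shape)                          # (4, 8)
```

Because of the causal mask, the first token's output depends only on the first token itself; later tokens mix in information from everything before them.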

Training Details:

What it can do: The model can understand and generate text in multiple languages. In YSHOP, it will handle user queries and assist with database operations: it takes text input, processes it through the transformer layers, and outputs a relevant response.
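The text-in, text-out flow described above is autoregressive: the model repeatedly predicts the next token and feeds it back in. A minimal sketch of that generation loop, with a hypothetical stand-in for the model's forward pass (the function names and the dummy 5-token vocabulary are assumptions for illustration, not part of YSHOP):

```python
import numpy as np

def generate(next_token_logits_fn, prompt_ids, max_new_tokens, eos_id=0):
    """Greedy autoregressive decoding: run the model on the sequence so far,
    append the most likely next token, and repeat until EOS or the limit."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits_fn(ids)   # model forward pass (stubbed below)
        next_id = int(np.argmax(logits))     # greedy choice of next token
        ids.append(next_id)
        if next_id == eos_id:                # stop at end-of-sequence token
            break
    return ids

# Dummy "model": always predicts (last token + 1) mod 5.
def dummy_model(ids):
    logits = np.zeros(5)
    logits[(ids[-1] + 1) % 5] = 1.0
    return logits

print(generate(dummy_model, [1], max_new_tokens=4))  # [1, 2, 3, 4, 0]
```

A real model would replace `dummy_model` with the transformer's forward pass, and sampling (temperature, top-k) would typically replace the pure greedy `argmax`.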

Technical Details:

This graph is from an earlier training run: the network was trained for 900 iterations with 1,600 parameters. That version could generate text and even attempted punctuation and human-like phrasing, but it hallucinated heavily (you can see a sample of its output in the image below) and was not yet capable enough.

I got this output after training the neural network for 900 iterations with 1,600 parameters.

I then increased the parameter count to 5,000 and trained for 50,000 iterations; the graph below shows the training progress (validation loss over iterations).
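Tracking validation loss over iterations, as in the graph above, follows a standard pattern: periodically evaluate the model on held-out data during the training loop. A minimal sketch with a toy NumPy model (the dataset, learning rate, and logging interval here are illustrative assumptions, not the values used for YSHOP):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy dataset: noisy linear targets, split into train and validation sets.
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=200)
X_tr, y_tr = X[:150], y[:150]
X_va, y_va = X[150:], y[150:]

w = np.zeros(5)        # toy "model": a single weight vector
lr = 0.05
val_losses = []
for it in range(200):
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)  # MSE gradient
    w -= lr * grad
    if it % 20 == 0:   # periodically record validation loss for the curve
        val_losses.append(np.mean((X_va @ w - y_va) ** 2))

print(val_losses[0] > val_losses[-1])  # True: validation loss decreased
```

A steadily falling validation-loss curve like this is the signal that the model is learning from the data rather than just memorizing the training set.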

This graph confirms that the model learned from the dataset I gave it.