The distance between the output of the two neural networks is calculated with the idea being that the difference is 0 when the answer is correct and 1 if it is not. The weights are updated to adjust the network depending on whether the answer was right or wrong and by how much. Essentially, by training the network in this manner, we can calculate the distance between a question and an answer, which in turn acts as a distance function. This stage of the project was the hardest theoretical part of the project. straightforward, due to the very simple, modular API provided by Keras.
Step Results
After implementing and training the model on south korea rcs data our dataset, we performed some testing on it, to see how well it actually performed in different scenarios. The first test used the complete training set, to see how well it “remembered” questions, with our dataset correctly identifying 79% of questions. It is important to note that one does not want 100% at this stage, as it is a common sign that the model will have likely just memorised the initial dataset, and has not generalised the relationships between questions and answers. When we tested it on unseen questions, our model did not perform particularly well, however, we suspect that this is due to some answers only having one relevant question, meaning that it cannot generalise well.