GoldenGem

GoldenGem is a neural network computer program. Its default configuration is the standard one, a three-level perceptron, which two articles published independently in the same journal in 1989 showed to be a 'universal nonlinear function approximator'.

Freeware but not open source status
The program was originally supported by user registration. It was converted to freeware in 2006 and 2007, first at download.com and then at the main website. Some software websites continue to post information from an earlier PAD file that mentions a charge for using the program, but all links, if followed, eventually lead to the main site or to download.com. The software is not open source because of concern about whether those editing the code would have adequate technical expertise.

Operational modes
The program can access the end-of-day stock, index, and bond information provided by Yahoo, MSN and Google. A serious user, however, would manage text files of their own data in one of two other formats: either a folder of .csv files or a single .txt file. Such data can be downloaded from other sources, such as forexrate.com or the Bank of England, or can consist of particular niche commodity prices. The article on Technical Analysis correctly describes the manner in which neural network software can act as a bridge between technical analysis and the more highly regarded fundamental analysis.

Sources of confusion
Many users fail to understand the multi-variable nature of the software. A user must provide a set of share prices, indices, interest rates, or exchange rates that they already expect to be useful in predicting the share price of interest.
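
The multi-variable arrangement described above can be illustrated with a minimal sketch. It pairs the values of all supplied variables on one day with the value of the variable of interest on a later day. The function name `make_training_pairs`, the `horizon` parameter and the sample figures are hypothetical and are not taken from GoldenGem itself.

```python
def make_training_pairs(series, target, horizon=1):
    """Pair each day's values of all variables with a later value of the target.

    series  -- dict mapping a variable name to its list of daily values
    target  -- name of the variable to be predicted
    horizon -- how many days ahead to predict (an assumption of this sketch)
    """
    names = sorted(series)
    length = len(series[target])
    if any(len(series[name]) != length for name in names):
        raise ValueError("all variables must cover the same days")

    pairs = []
    for day in range(length - horizon):
        inputs = [series[name][day] for name in names]  # every variable, one day
        future = series[target][day + horizon]          # the target, later on
        pairs.append((inputs, future))
    return names, pairs


# Made-up data: one share price plus two variables thought to be relevant.
data = {
    "share": [10.0, 10.5, 10.2, 10.8],
    "index": [100.0, 101.0, 100.5, 102.0],
    "rate": [5.0, 5.0, 4.75, 4.75],
}
names, pairs = make_training_pairs(data, target="share")
```

Each pair holds the values of every variable on one day together with the share price one day later, so a network trained on such pairs can only be as good as the user's choice of input variables.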

Training and validation
Training is accomplished by the use of a logarithmic sensitivity adjustment. Validation relies first on a pair of indicator lights. The first indicator light becomes yellow if both the correlation coefficient and the adjusted correlation coefficient of predicted versus actual change are larger than 0.2, and green if both are larger than 0.5. The second indicator light goes from red to yellow to green as the training input is removed by the user's control of the sensitivity adjustment. Secondly, visual comparison of the graphs is needed to ensure that the correlation coefficient is not large merely because of isolated coincidental similarities. Thirdly, the user should run a validation data set on an interval in the past.
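
The thresholds for the first indicator light can be written out as a short sketch. The function name `first_light_colour` is hypothetical, and the colour shown when neither threshold is met is not stated above; red is assumed here.

```python
def first_light_colour(corr, adjusted_corr):
    """Colour of the first indicator light for a pair of coefficients."""
    weakest = min(corr, adjusted_corr)  # both coefficients must pass a threshold
    if weakest > 0.5:
        return "green"
    if weakest > 0.2:
        return "yellow"
    return "red"  # assumption: the text does not name this colour
```

A high ordinary correlation coefficient alone is therefore not enough: `first_light_colour(0.6, 0.1)` falls through both thresholds because the adjusted coefficient is small.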

The adjusted correlation coefficient
The adjusted correlation coefficient is needed because backtesting can yield a falsely favourable correlation coefficient through a strategy of returning to the known mean value of past data. A neural network will choose this strategy if it fails to find a relationship. Since, during backtesting, the known mean value of past data includes time in the relative future, the unadjusted correlation coefficient would incorrectly reward such a strategy, which would not necessarily be profitable to continue into the future. The adjusted correlation coefficient is the ordinary correlation coefficient multiplied by the variance of the actual changes and divided by the variance of the predicted changes. It becomes small if the neural network becomes unstable or if it converges upon a strategy of continual sudden return to the past mean value. The second indicator light shows the minimum of the adjusted and unadjusted correlation coefficients.
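
The definition above can be sketched as follows. The function names are hypothetical, and the sketch assumes both series have non-zero variance; it is an illustration of the stated formula rather than the program's own code.

```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    # Population variance. The choice of population or sample variance does
    # not matter here, because only a ratio of two variances is used.
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def correlation(xs, ys):
    """Ordinary (Pearson) correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return covariance / math.sqrt(variance(xs) * variance(ys))


def adjusted_correlation(actual, predicted):
    """Ordinary correlation times var(actual), divided by var(predicted)."""
    return correlation(actual, predicted) * variance(actual) / variance(predicted)


# Made-up daily changes: the predictions have the right shape but swing
# twice as widely as the actual changes.
actual_changes = [1.0, -1.0, 2.0, -2.0]
predicted_changes = [2.0, -2.0, 4.0, -4.0]
plain = correlation(actual_changes, predicted_changes)
adjusted = adjusted_correlation(actual_changes, predicted_changes)
```

In this example the ordinary coefficient is 1 because the two series are proportional, while the adjusted coefficient is 0.25, since the predicted changes have four times the variance of the actual ones.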

The transition function
The transition function is arctan rather than the sometimes used hyperbolic tangent function. The reason is that the arctan function is suitable for an analog neural network.
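
As an illustration only, a forward pass through a small three-level perceptron with an arctan transition function might look like the sketch below. The function name `forward`, the layer sizes, the weights, the omission of bias terms and the use of arctan on the output unit are all assumptions of the sketch, not details of GoldenGem.

```python
import math


def forward(inputs, hidden_weights, output_weights):
    """Input level -> hidden level -> single output, with arctan transitions."""
    hidden = [
        math.atan(sum(w * x for w, x in zip(row, inputs)))
        for row in hidden_weights
    ]
    # Assumption: the output unit also passes through arctan.
    return math.atan(sum(w * h for w, h in zip(output_weights, hidden)))


# Hypothetical weights for two inputs and two hidden units.
example_output = forward(
    inputs=[1.0, -1.0],
    hidden_weights=[[1.0, 0.0], [0.0, 1.0]],
    output_weights=[1.0, 1.0],
)
```

Unlike the hyperbolic tangent, whose values lie between -1 and 1, arctan takes values between -π/2 and π/2, so weights and target scaling chosen for one function do not carry over unchanged to the other.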

Limitations
The configuration of the program is limited to analyzing the values of a set of variables that change over time, with the aim of predicting the future value of one of those variables based only on the current value of all the variables. It would therefore not be useful for analyzing credit risk at a single point in time, for example. The displays and user interface also assume that the data is daily and that no data is provided on weekends. Although this assumption is not essential to the method, it would make the program too confusing to use with intraday data. With hourly data, for example, the tick marks at the bottom of the screen labelled 'weeks' would refer to five-hour intervals, and setting the slider labelled 'days' to 21, which is meant to allow for excluded weekends, would set an interval of fifteen hours.
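
The arithmetic behind the hourly-data example can be sketched as follows, assuming that the interface converts calendar days to data samples at five samples per seven-day week. The function name `calendar_days_to_samples` and the rounding rule are hypothetical.

```python
def calendar_days_to_samples(calendar_days, samples_per_week=5, days_per_week=7):
    """Number of data samples spanned by a setting given in calendar days."""
    # Integer division is an assumption; the program's rounding is not documented here.
    return calendar_days * samples_per_week // days_per_week


samples_for_slider = calendar_days_to_samples(21)  # slider set to 21 'days'
samples_per_tick = calendar_days_to_samples(7)     # one 'weeks' tick mark
```

With daily data, 21 calendar days correspond to fifteen trading days and one tick mark to five trading days. With hourly data the same counts become fifteen hours and five hours, which is the source of the confusion described above.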

The training takes into account the overall relation among the variables throughout the entire backtesting interval, but no particular weight is attached to, for example, yesterday's values or values from last week. A large coincidental change in the values of the variables on the final day of data would therefore affect the prediction. A user can compensate by including what are known as 'stochastics' as input variables. The justification for not doing so is that the use of 'stochastics' would degrade performance in cases where a change in today's values is genuinely meaningful, and that variability in performance due to the dependence on data from only one day should not affect the overall average performance of the prediction.

The training algorithm is the simplest and most widely used one. Improved algorithms, such as conjugate gradient, may be superior.