Speed Up Training \ Optimization Task

Posted By: Giorm

Speed Up Training \ Optimization Task - 10/18/15 18:33

In order to speed up the training\optimization tasks, I think it would be very useful to implement the following functionalities:
1. Genetic optimization (a method similar to that of Multicharts - see the other thread)
2. Test\Train using multiple CPU cores (not only for WFO) and GPU cores (CUDA \ OpenCL)
3. Distributed optimization (distributing the training task across multiple computer nodes running Zorro)

Let me know what you think about them.
thanks
Posted By: jcl

Re: Speed Up Training \ Optimization Task - 10/19/15 09:22

Genetic optimization a la Multicharts is planned, although not with the highest priority, as it is much slower than the current method. Multiple cores and nodes can be used only when the process can be parallelized, as in WFO. CUDA is for special tasks and cannot be used for general optimization.
Posted By: Giorm

Re: Speed Up Training \ Optimization Task - 10/19/15 10:24

Thanks, jcl.

Genetic optimization: when you say that it's slower than the current method, what do you mean? If the optimization search domain is big enough, I would think that a genetic algorithm is faster than the current method, isn't it?

Multiple cores and nodes: I understand that this can only be used for specific parallelizable processes. But as with WFO (which is an easy case in terms of isolation), parallelization could also be used to distribute the training process among different assets or algos (obviously supposing that the code inside an asset\algo "while" loop is written in a way that is truly independent of the other loops).
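
For illustration only, here is a minimal sketch of such per-asset parallel training in plain C with OpenMP (not Zorro's lite-C; trainAsset() and its contents are hypothetical). It only works because no iteration reads or writes another asset's state:

```c
#include <omp.h>
#include <stdio.h>

#define NUM_ASSETS 4

/* Hypothetical per-asset training routine: it may only touch its
   own state, never variables shared with other assets or algos. */
double trainAsset(int assetIndex)
{
    /* ... run the optimizer for this asset only ... */
    return (double)assetIndex; /* placeholder result */
}

int main()
{
    double results[NUM_ASSETS];
    int i;

    /* Each asset trains on its own core; safe only because
       trainAsset() shares no state between iterations. */
    #pragma omp parallel for
    for (i = 0; i < NUM_ASSETS; i++)
        results[i] = trainAsset(i);

    for (i = 0; i < NUM_ASSETS; i++)
        printf("Asset %d: %.2f\n", i, results[i]);
    return 0;
}
```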

CUDA: I'm not an expert and I could be wrong, but I see that in Python it is possible to use it for Monte Carlo simulation, matrix manipulation, and so on (https://www.quantstart.com/articles).
Posted By: jcl

Re: Speed Up Training \ Optimization Task - 10/19/15 11:42

No, the genetic algorithm is an order of magnitude slower, especially the Multicharts variant, simply because it needs more test cycles than the current method. The advantage is not speed; it is finding more local maxima.
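
For a rough sense of the cycle counts, here is a deliberately simplified, mutation-only genetic loop in plain C (a hedged sketch, not Zorro's or Multicharts' actual optimizer; backtest() is a toy stand-in for one test cycle). A population of 50 over 100 generations already needs 5000 backtests, while a per-parameter ascent over 3 parameters with 20 steps each needs only about 60:

```c
#include <stdio.h>
#include <stdlib.h>

#define POP  50   /* population size       */
#define GENS 100  /* number of generations */
#define NPAR 3    /* parameters per member */

/* Toy fitness standing in for one backtest per call; maximum at
   p[i] == 1. The GA runs POP*GENS = 5000 of these "test cycles". */
double backtest(const double p[NPAR])
{
    double s = 0;
    int i;
    for (i = 0; i < NPAR; i++)
        s -= (p[i] - 1) * (p[i] - 1);
    return s;
}

double frand(void) { return (double)rand() / RAND_MAX; }

int main()
{
    double pop[POP][NPAR], fit[POP];
    int i, j, g, best = 0;

    for (i = 0; i < POP; i++)           /* random start in [0,2] */
        for (j = 0; j < NPAR; j++)
            pop[i][j] = 2 * frand();

    for (g = 0; g < GENS; g++) {
        best = 0;
        for (i = 0; i < POP; i++) {     /* one backtest per member */
            fit[i] = backtest(pop[i]);
            if (fit[i] > fit[best]) best = i;
        }
        for (i = 0; i < POP; i++) {     /* mutated copies of the elite */
            if (i == best) continue;
            for (j = 0; j < NPAR; j++)
                pop[i][j] = pop[best][j] + 0.2 * (frand() - 0.5);
        }
    }
    printf("best parameters: %.3f %.3f %.3f\n",
        pop[best][0], pop[best][1], pop[best][2]);
    return 0;
}
```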

Distributing the training process over several cores would require that no algo or asset uses variables from other algos and assets. That would not work for many systems, for instance not for Z12. And it has no real advantage, as you normally have fewer CPU cores than WFO cycles.

CUDA is a C dialect for the Open64 compiler and has nothing to do with Python. You cannot "run something in CUDA"; you must specifically program it in CUDA to take advantage of the GPU.
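
To make that concrete, here is a minimal CUDA sketch (hypothetical and unrelated to Zorro): the GPU part must be written as a kernel and launched explicitly, and data must be copied to and from the device by hand.

```c
#include <stdio.h>

/* __global__ marks GPU code: it must be written as a kernel;
   ordinary CPU code never runs on the GPU by itself. */
__global__ void square(float* v, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] = v[i] * v[i];
}

int main()
{
    const int n = 1024;
    float host[1024], *dev;
    int i;
    for (i = 0; i < n; i++) host[i] = (float)i;

    cudaMalloc(&dev, n * sizeof(float));              /* GPU memory */
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
    square<<<(n + 255) / 256, 256>>>(dev, n);         /* explicit kernel launch */
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    printf("%.1f\n", host[10]); /* prints 100.0 */
    return 0;
}
```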