What is Particle Swarm Optimization?


In general, traders like ecological analogies for the markets. Traders live in a world where they  eat what they kill . Hence, strategies  live ,  die  and  evolve  in an almost biological sense. Successful strategies  breed like animals  as information leaks from desk to desk. Market crashes are described as  cleansing fires, clearing out the  deadwood  and making way for upstarts to take their place.

Thus it should come as no surprise that one of our favorite optimization routines was inspired by watching a flock of birds.

The basic idea is iterative and simple. For example, suppose you have a strategy with two parameters.

You want to offload the process of setting these values to an algorithm  because, as a human, your spirit is willing but your brain is spongy and bruised from years of trading the markets. Ideally, you want your algorithm to pick the best strategy. 

How can we do this?

The first step, of course, is to explicitly define what “best” means. To be simple,  let’s define the best strategy as the one who makes the most money  over our training dataset. One way to accomplish this is simply to try every possible combination of the two parameters until you find the strategy with the best PnL.  This is known as brute force search , and is only feasible if your parameter space is small and your computational power is cheap. In most practical examples, brute force is out of the question, either:

1) the parameter space is too large or

2) the objective function takes too long to execute. 

In our example, calculating the PnL of an individual strategy takes a nontrivial amount of time to complete. Thus we must search for more clever optimization techniques

This is where the birds come into the picture . (Way) Back in 1995 James Kennedy and Russell Eberhart published their original paper on the subject, appropriately titled:

Particle Swarm Optimization  [CiteSeerX LINK]

"A concept for the optimization of nonlinear functions using particle swarm methodology is introduced. The evolution of several paradigms is outlined, and an implementation of one of the paradigms is discussed. Benchmark testing of the paradigm is described, and applications, including nonlinear function optimization and neural network training, are proposed. The relationships between particle swarm optimization and both artificial life and genetic algorithms are described."

Intuitively, imagine our search space as a 2-dimensional grid, where the x axis represents the first parameter and the y the second parameter. The Particle Swarm Optimization or PSO algorithm will begin by initializing a random field of points (known as particles) within this search space. It will then evaluate the objective function for each particle and update the position of the particle in the search space according to system of transition equations:

Put simply, each iteration we:

Evaluate the objective function for each particle Find the best particle for the iteration and compare against (x best, y best), updating if needed For each particle, compare the output of the objective function against (x local,y local), updating if needed Move each particle in a random way towards a blend of its global and local best positions.

Using an iterative algorithm such as PSO can lead to effective model parameter. Hence, reducing parameter risk. In fact PSO could be extremely effective in the training of many machine learning methods such as artificial neural network weights.


Imagine this nightmare scenario: you are a statistical arbitrage trader at a proprietary trading desk or hedge fund. As such, you routinely hold an inventory of ETF exposure that you must hedge.

The previous night, you instructed your overnight traders to calculate the hedge ratios for a matrix of ETF’s.

The next morning before the market opens, your junior quant traders eagerly present their results for your inspection. You load the hedge ratios into your trading platform and wait for the open.

When the market first opens for trading, you re-balance your hedges according to the new ratios. Afterwards, you watch in horror as your hedges do not perform as expected. What went wrong?

Every good trader knows they have to adapt when conditions in the market change, so why do we demand otherwise from our trading models? The traders in our example relied on static hedge ratios to power their trading logic. As a result, they opened themselves up to what is known as parameter risk.

Updating your parameters as new information becomes available is one way to protect yourself from this under-appreciated trading risk. A great model for accomplishing this in a trading scenario is the Kalman Filter. 


Demystifying Kalman Filter

Kalman Filter is an online (real-time) algorithm that estimates a system state as it receives new information. The algorithm is named after the developer of its theory, Rudolf E. Kalman. One of the first application of the Kalman Filter was on the Apollo project. The Kalman Filter is used to this very day in guidance and navigation systems, particularly aircraft, missiles, and spacecraft. 



Kalman Filters are optimal estimation algorithms. The two main benefits are:

  1. The variables of interest (i.e. submarines deep under water space rocket engine) can only be measured indirectly.

  2. Measurements are available from various sensors but are subject to noise (i.e. radar, accelerometers, financial markets)

The second benefit is why the Kalman Filter is an excellent algorithm to incorporate within your pairs trading models. As stated above, when conditions in the market change it is critical that your trading models adjust to various market conditions that expose your PnL to parameter risk (i.e. stale hedge ratios).

Hence, Kalman Filters are useful when dealing with a linear model such as pairs trading, which in its simplest form reduces down to trading the residual of a linear regression:

Yt = ßt * Xt + et

Where Yt is the current price of the first stock, Xt is the current price of the second ßt stock, is our current hedge ratio and et is the current spread price we are trading. We could also estimate the hedge ratio using the log changes in X and Y, instead of their levels. This would be more likely to be the case in a High Frequency Trading scenario, where all we care about are price changes. The Kalman Filter allows us to vary the hedge ratio over time. For example, suppose we assume the hedge ratio follows a random walk, i.e.

ßt = ßt-1 + wt

Where ßt