首页 > > 详细

MXB361 Module 1: Complex Systems and Networks

 MXB361

Module 1: Complex Systems and Networks
Assessment Part A: Complex Systems and Computation
This assignment counts for 60% of the MXB361 final score and focusses on the 
material covered in weeks 1-6 of Module 1. It is due at midnight on the Tuesday of the week 
following the last week of teaching of Semester 1. This is to be completed individually. The 
answers will be run through plagiarism checking software (please don’t even think about it!).
Your submissions should be in the form of a report answering the questions below. You must 
provide a code sample for each question where code writing is involved; and images as 
requested. Give all answers to at least 2 decimal points.
Part I (Complex Systems Quantitative Analysis – 40 marks in total)
You have just landed a dream job as the head of strategy at an evil but old-fashioned, 
non-quantitative foreign exchange trading hedge fund, having out-muscled, out-witted and out￾computed your rivals. You have been given a remit by the boss of the firm, Mr Fatman Rich, 
to revolutionise their inept investment strategies. This you are determined to accomplish, in 
order to get a massive bonus so you can buy yourself a huge jacuzzi and fill it with chocolate 
and gold, or whatever other material goods you desire.
The fund invests only in the top 22 most traded “tickers”, which includes 19 national 
currencies, as well as bitcoin (BTC), gold (XAU) and silver (XAG). Because you took 
MXB361, your first step is to download the firm’s stock market data with a view to devising a 
new, computationally sound investment strategy. The data set (‘data’ matrix) has the past 7110 
days (about 30 years) of ticker prices for the 22 tickers – these are the prices daily of each 
currency against the US dollar. The ticker prices, corresponding to the codes for each currency 
are in the ‘tickers’ matrix and the dates are in the ‘times’ matrix.
1. Load the data into MATLAB. Look only at the last 504 days of trading prices.
Divide each ticker’s price by its initial opening price 504 days ago so that they all 
start at 1. Plot these corrected prices against time (for the past two years) for all 
tickers. Plot, in a 22 x 22 plot array, all the corrected prices against one another.
Provide the code you used and the respective images in an image format of your 
choice. Briefly describe 5 things you see. (5 marks).
2. Using your corrected prices over the past two years of data, compute the corrected 
daily returns (i.e. the % daily change in the value) of each ticker. Using this, 
compute the correlation matrix of daily returns. Check in MATLAB whether the 
matrix is symmetric and provide evidence whether it is or not. Visualise the 
correlations matrix appropriately. What is the determinant of this matrix? In 10 
words or less, give a geometric interpretation of the determinant of a matrix (5
marks).
3. The Marcenko-Pastur distribution describes asymptotic behaviour of the 
eigenvalues of random matrices. Plot this distribution as a family of curves for 
different parameter values, N and T, respectively, where N is the number of series 
and T is their length. Give expressions for the two points along the x-axis at which 
the MP distribution curves departs from the horizontal axis (i.e. where the MP 
distribution is > 0), as functions of N and T. What is the significance of any of the 
eigenvalues of a given non-random matrix derived from time series data falling 
between these two values? What about those falling outside the range? (5 marks).
4. Study the network derived from the correlation matrix you just computed. Is it a 
directed or undirected matrix? Plot a histogram of its eigenvalues and comment on 
it (100 words). Plot the distribution of the node degrees and comment on it (100 
words). What is the mean correlation between any two randomly chosen ticker price 
returns in the past year? What is the distribution of the correlations between any 
two randomly chosen ticker price returns in that time? What is the standard 
deviation? Comment on the above in 250 words or less (5 marks).
5. Your evil employer’s investment strategy focusses around choosing currencies that 
its analysts believe are geographic regions (Asia, Europe, etc) that are “on the rise” 
and avoiding those that are “falling” relative to the USD. Naturally, having taken 
MXB361, you are sceptical of such an approach absent evidence in its favour. You 
decide to use community detection methods to see if there are any distinct 
communities in the ticker returns over the last year and if so, if they overlap with 
the GICS sectors. Your first step is to write your own community detection code. 
Choose an algorithm you’ve come across (such as Potts, Louvain, etc.) or devise 
your own and write a piece of code in MATLAB that takes as input an adjacency 
matrix (for example, a correlation matrix) and maximises the modularity, outputting 
the communities within the network and a value for the estimated maximum 
modularity. Provide evidence that the method works on a small network you have 
devised (5 marks).
6. In order to feed a noise-cleaned network into your community detection code, you 
need to eliminate some eigenvectors from the correlations matrix you computed. 
Based on you answer to Q5 above, set the eigenvalues that you think are 
indistinguishable from noise to 0 and reconstitute the matrix using only the non￾noise eigenvalues/vectors. Study the degree distribution of this new
matrix/network. How, if at all, is it different from your answer in Q5? What 
proportion of the eigenvalues were eliminated in this way? What does this 
proportion tell you about the level of noise in the initial correlation matrix? How 
would this all change if the values of T and N changes? In particular, for what 
combination(s) of T and N is the correlation matrix indistinguishable from a random 
matrix, and justify the reason for this claim using the expressions you reported in 
your answer from Q3 (5 marks).
7. Apply your community detection code to the new, noise-cleaned matrix in order to 
find modules in your data. Are there any and if so, how many? What is the 
modularity maximum your code found? Plot on a world map bubbles/dots 
corresponding to each currency placed over the geographic coordinates of each 
country’s capital city (for bitcoin, gold and silver, just place the dots somewhere 
“empty” to the side, e.g. over the Pacific). Provide the image and code. Does the 
membership of these modules overlap with the geographic regions? If not, how big 
is the difference? Do their “locations” make any sense, with respect to their 
economic activity (you may need to do a bit of research on that latter aspect)? Where 
do bitcoin, gold and silver localise to (obviously, they don’t belong to any country), 
module membership-wise? Any comments on these findings? Finally, what is the 
implication of these findings for your firm’s current investment strategy in Q6? (5 
marks).
8. Based on the above, devise a new, computationally driven investment strategy for 
your firm. Explain the outlines of your strategy to your CEO in a one-page 
document that includes one image, along with an estimate of the % return your 
strategy would make yearly. You may wish to discuss the effect of over-fitting 
and/or keep some data for model fitting and some for validation. (5 marks).
Part II (Computational Complexity and OtherAreas of Module I – 20 marks in total)
9. “We are living in a period in human history in which new data is generated at such 
a rate and mis-analysed so badly that most of us can’t make sense of the world 
around us. This is similar to the 1500s when, following the invention of the printing 
press but before literacy was widespread, there followed perhaps the bloodiest 
period in human history”. Do you agree or disagree? Why or why not? Discuss in 
500 words or so (10 marks).
10. The permanent of a matrix is defined exactly like the determinant but instead of 
alternating signs between the terms, it’s all +’s, like this:
Write code to compute the permanent for a given n x n real-valued matrix. What 
is the computational complexity (in time) of your algorithm? Can you find a way 
to do any better? It’s thought that the permanent is computationally intractable – i.e. exponential time (in n) is required for its computation, a problem related to 
“P?=NP” in ways that we don’t yet understand. Why would the determinant be 
polynomial time computable (it is, otherwise we couldn’t solve systems of linear 
equations or do pretty much anything useful with a computer), but the permanent, 
so similar in its definition, be intractable to compute? Discuss in 500 words or so 
(10 marks).
联系我们
  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp
热点标签

联系我们 - QQ: 99515681 微信:codinghelp
程序辅导网!