Pair Trading - Exploring The Low Risk Statistical Arbitrage Trading Concepts

ncube

Well-Known Member
Hi,
I think you could post a link to the Jupyter notebook or upload it to GitHub so that it is easy for everyone to use.

I believe that for the above code to work, I should run it only once a day so that the entries aren't duplicated.
Yes, you are correct, that would be an issue. Let's modify the code a bit to write the updated master file to a different folder. This way the script can be run any number of times. At end of day, however, you need to put both the master file and eod.txt in the same folder first, and then copy the updated master file back to the original folder.

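1. Add the following function into the cell with the other functions:
def update_eod(masterfile, eodfile):
    # Master file: one row per day, one column per symbol
    master = pd.read_csv(masterfile, index_col=[0])
    # EOD file has no header; keep column 0 (symbol) as the index and column 5 as the value
    eod = pd.read_csv(eodfile, header=None, index_col=[0], usecols=[0, 5])
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
    # dropna(axis=1) discards any symbol that is missing from either file.
    df = pd.concat([master, eod.T]).dropna(axis=1).reset_index(drop=True)
    df.to_csv('C://master/stockdata.csv')
    return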

2. Add a new cell after the cell that reads the master file, and add the following Python statement:
update_eod('C://master/stockdata.csv','C://master/eod.txt')

1533139290554.png


How it works:
1. Every day, download the EOD NSE bhavcopy text file, rename it to eod.txt, and place it in a folder named "master" along with the old master file.
2. Then run the cell that calls update_eod to update the master file, and afterwards comment that statement out with # so that it does not run next time. Remove the # only when you need to update the master file.
3. Once the master file is updated, copy it to the folder used earlier.
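
For anyone who wants to try the update step without touching their real files, here is a minimal sketch on made-up data. The symbols, prices and temp folder are invented for illustration, and the extra outfile argument is an assumption added here so that nothing is written to C://master; the function in the post writes to a fixed path.

```python
import os
import tempfile

import pandas as pd


def update_eod(masterfile, eodfile, outfile):
    """Append one day's values from eodfile to masterfile and write outfile."""
    master = pd.read_csv(masterfile, index_col=[0])
    # Column 0 holds the symbol, column 5 the value kept in the master file.
    eod = pd.read_csv(eodfile, header=None, index_col=[0], usecols=[0, 5])
    # eod.T is a single row whose columns are symbols; symbols missing
    # from either file become NaN and are dropped by dropna(axis=1).
    df = pd.concat([master, eod.T]).dropna(axis=1).reset_index(drop=True)
    df.to_csv(outfile)
    return df


# Made-up sample data in a temporary folder (hypothetical symbols and prices)
workdir = tempfile.mkdtemp()
master_path = os.path.join(workdir, "stockdata.csv")
eod_path = os.path.join(workdir, "eod.txt")
out_path = os.path.join(workdir, "stockdata_updated.csv")

pd.DataFrame({"AAA": [10.0, 11.0], "BBB": [20.0, 21.0]}).to_csv(master_path)
with open(eod_path, "w") as fh:
    fh.write("AAA,EQ,1,2,3,12.0\n")
    fh.write("BBB,EQ,1,2,3,22.0\n")

updated = update_eod(master_path, eod_path, out_path)
print(updated)
```

Because the result goes to a separate file, running it twice against the same inputs gives the same output instead of appending the day twice.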

1533139851281.png


I have updated the script in my Google Drive:
https://drive.google.com/drive/folders/1a40Ih__otr99E1TAQe7WEjsTEsiBDKQQ?usp=sharing
 

UberMachine

Well-Known Member
I suggest downloading the files daily and regenerating the master file each day by iterating through all the files in the directory.
I know this is a time-consuming and inefficient approach, but it makes the master file depend entirely on the folder, so the user can regenerate it at any time (for example, if something looks wrong).
What's your take on it?
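
A rough sketch of what regenerating from the folder could look like. It assumes (this is not from the thread) that each daily file is named after its date, e.g. 2018-07-26.txt, and has the same layout as eod.txt with the symbol in column 0 and the value in column 5. The function name rebuild_master and the sample data are made up.

```python
import glob
import os
import tempfile

import pandas as pd


def rebuild_master(folder, pattern="*.txt"):
    """Rebuild the master table from every daily file found in folder."""
    rows = {}
    for path in sorted(glob.glob(os.path.join(folder, pattern))):
        # Assumption: the file name (without extension) is the date label
        day = os.path.splitext(os.path.basename(path))[0]
        eod = pd.read_csv(path, header=None, index_col=0, usecols=[0, 5])
        rows[day] = eod.iloc[:, 0]  # Series mapping symbol -> value
    # One row per file, one column per symbol; keep only symbols present every day
    return pd.DataFrame(rows).T.dropna(axis=1)


# Made-up daily files in a temporary folder
folder = tempfile.mkdtemp()
with open(os.path.join(folder, "2018-07-26.txt"), "w") as fh:
    fh.write("AAA,EQ,1,2,3,12.0\nBBB,EQ,1,2,3,22.0\n")
with open(os.path.join(folder, "2018-07-27.txt"), "w") as fh:
    fh.write("AAA,EQ,1,2,3,13.0\nBBB,EQ,1,2,3,23.0\nCCC,EQ,1,2,3,5.0\n")

master = rebuild_master(folder)
print(master)
```

Since the table is rebuilt from scratch each time, running it repeatedly cannot duplicate a day, at the cost of re-reading every file.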
 

travi

Well-Known Member
@UberMachine Just saw your code but was just testing ncube's earlier post code just now. Thanks

It works right off the bat. I was just wondering, since the EoD file had around 148 scrips and the original has about 500.

In my case, the final output has only 140 scrips.

My EoD is a stripped-down version with the top 150 stocks.
The actual EoD file has around 1650 stocks.

Is it that 10 stocks aren't in your top 500?

EDIT: When I use 1650, probably only 316 appear in master.csv
 

UberMachine

Well-Known Member
I never went through the file.
It just concatenates everything. So if you put 150 stocks in one file and 1650 in another, it would output 1800 (it would remove duplicates where the dates are the same).
Do you need only those 150 stocks?

Then I suggest the following:
  1. Create a separate CSV file with symbols only
  2. Regenerate the master file for all the symbols
  3. Use pandas to filter the data for your symbols
This way you can work with multiple lists.

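Python:
# Sketch only: assumes master.csv has one row per symbol per day with a 'SYMBOL' column
# (adjust the column name to your file) and that my_symbols.csv has no header row
import pandas as pd
df = pd.read_csv('master.csv')
my_list = pd.read_csv('my_symbols.csv', header=None).values.ravel()
# Keep only the rows whose symbol is in the list
new_df = df[df['SYMBOL'].isin(my_list)]

# In case you need an output
new_df.to_csv('my_filtered_list.csv')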
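
If the master file is in the wide layout written by update_eod (one row per day, one column per symbol), the same filtering can be done on the columns instead of the rows. A small sketch on made-up data; the symbols and the list are hypothetical.

```python
import pandas as pd

# Made-up wide master table: one row per day, one column per symbol
master = pd.DataFrame({"AAA": [10.0, 11.0], "BBB": [20.0, 21.0], "CCC": [30.0, 31.0]})
my_symbols = ["AAA", "CCC", "ZZZ"]  # ZZZ is not in the master table

# intersection() silently skips requested symbols the master does not have
wanted = master.columns.intersection(my_symbols)
filtered = master[wanted]

# Report what was asked for but not found, so nothing disappears unnoticed
missing = sorted(set(my_symbols) - set(master.columns))
print(filtered)
print("Not in master:", missing)
```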
 

ncube

Well-Known Member
That would not be required, as multi-day EOD NSE bhavcopy files can be downloaded with tools that are already available. Once the files are downloaded, one can run a Windows copy command (e.g. copy /b *.txt all_eod.txt) to merge them and then append the result to the master file in one shot.
 

ncube

Well-Known Member
@UberMachine, if you can include the function code from my script in your code, that would be great, as people could then run your code as a standalone program without any changes to the Jupyter notebook. The code snippet in my function takes care of auto-merging the files based on the stock list in the master file one is using.
 

travi

Well-Known Member
Thanks guys, will continue tomorrow.
So far I have just tested the code from post #128. It's working, but what it does needs to be reviewed in the long term for certain changes.

Whatever stocks were present in the original test file stockdata_20180726.csv have been trimmed to only the stocks in the sample .txt file I uploaded.
Symbols that weren't common to both data sets were discarded from the master file.
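
The trimming comes from the dropna(axis=1) call in update_eod: after the new row is appended, any column with a missing value is removed, so only symbols present in both files survive. A small sketch on made-up data that lists which symbols would be dropped before the file is written (symbols and prices are invented):

```python
import pandas as pd

# Made-up master (3 symbols) and one EOD row (CCC missing, DDD new)
master = pd.DataFrame({"AAA": [10.0, 11.0], "BBB": [20.0, 21.0], "CCC": [30.0, 31.0]})
eod_row = pd.DataFrame({"AAA": [12.0], "BBB": [22.0], "DDD": [42.0]})

combined = pd.concat([master, eod_row]).reset_index(drop=True)
kept = combined.dropna(axis=1)  # same step as in update_eod

# Symbols lost because they are missing from one side or the other
dropped = sorted(set(combined.columns) - set(kept.columns))
print(dropped)  # ['CCC', 'DDD']
```

Checking this list before overwriting the master file shows whether a trimmed EoD file is about to shrink the stock universe.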

will test the other code snippets later.
 

UberMachine

Well-Known Member
Definitely.
I haven't got the hang of pair trading yet. (Watching lectures on Quantopian on the same subject.)
Once I get it, I will definitely add a lot more options.

@ncube Regarding auto-merging, it would be better if you included the dates as well. (The first file I downloaded only had serial numbers.)
I am also curious to know how auto-merging would work. (This is a topic of its own.)
My suggestion would be to dump the details from a directory into a database or HDF file, scan the directory for changes, mark each file, append it, and so on.
Do you have any free tools for it? (I am developing one on my own since I can't find any comprehensive tool.)
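
To make the database idea concrete, here is one possible sketch using SQLite from the Python standard library. It is not an existing tool. The table names (prices, loaded_files), the function name load_new_files, the date-named files and the column positions (symbol in column 0, value in column 5) are all assumptions for illustration. It only detects new file names, not edits to files that were already loaded.

```python
import glob
import os
import sqlite3
import tempfile

import pandas as pd


def load_new_files(folder, conn):
    """Append every not-yet-seen daily file in folder to the prices table."""
    conn.execute("CREATE TABLE IF NOT EXISTS loaded_files (name TEXT PRIMARY KEY)")
    seen = {row[0] for row in conn.execute("SELECT name FROM loaded_files")}
    added = []
    for path in sorted(glob.glob(os.path.join(folder, "*.txt"))):
        name = os.path.basename(path)
        if name in seen:
            continue  # this file is already in the database
        eod = pd.read_csv(path, header=None, usecols=[0, 5])
        eod.columns = ["symbol", "price"]
        eod["date"] = os.path.splitext(name)[0]  # assumption: file name is the date
        eod.to_sql("prices", conn, if_exists="append", index=False)
        # Mark the file as processed so a rerun skips it
        conn.execute("INSERT INTO loaded_files (name) VALUES (?)", (name,))
        added.append(name)
    conn.commit()
    return added


# Made-up daily file in a temporary folder, loaded into an in-memory database
folder = tempfile.mkdtemp()
with open(os.path.join(folder, "2018-07-26.txt"), "w") as fh:
    fh.write("AAA,EQ,1,2,3,12.0\nBBB,EQ,1,2,3,22.0\n")

conn = sqlite3.connect(":memory:")
first = load_new_files(folder, conn)
second = load_new_files(folder, conn)  # nothing new on the second scan
row_count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
print(first, second, row_count)
```

Keeping the dates in the table also addresses the serial-number issue, since each row carries the day it came from.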

That's it for the day
Good night