Python Course With Building a Fintech Investment AI – Lesson 2: Pandas and Getting Financial Data

Intro

Welcome to the second lesson of the awesome Python course for AI!

In the last lesson, we built the first database migrations and made the project setup.

Here is the URL:

Python course with building a fintech investment AI – Lesson 1: Start the project

Today, we will build super exciting stuff and will focus on the financial database that we will use to train our AI later.

We will work with pandas, some financial plugins, and will do a bit of refactoring.

As well, I will teach you how to create CLI commands in Python.

I cannot wait to teach you all of those powerful skills, so let’s start ASAP!

And if you prefer video, here is the youtube version:

1. Change types in migrations

In the last episode, we created DB migrations and database models with some types.

Today we will change them a bit to have the possibility of using prices with many numbers after the decimal.

To do that, we need just change types in variable “date” to “db.Date()”.

Next, we need to change all Integers to “db.Numeric()”.

Take a look at the example below.

id = db.Column(db.Integer, primary_key=True)

company = db.Column(db.String())

date = db.Column(db.Date())

openPrice = db.Column(db.Numeric())

highPrice = db.Column(db.Numeric())

lowPrice = db.Column(db.Numeric())

closePrice = db.Column(db.Numeric())

volume = db.Column(db.Numeric())

2. Delete migrations folder and migrate again

Now, we should save our changes and update the database with the new changes.

I would suggest initing DB from scratch.

As the first step, you need to delete the “migrations” directory (if you created it in the previous lesson).

Next, open terminal and type:

flask db init

Next:

flask db migrate

Next:

flask db migrate

3. Create module prices

Great!

We can focus on the next features now, and go into the prices module.

Let’s start by creating the directory named “prices”.

Create that directory in the root of the folder.

Next, go into the created directory and create the file named “prices.py”.

4.Install necessary dependencies

Before we go into the proper development, we should remember about installing all of the necessary dependencies that we will use for building AI.

First, we need to create yfinance, that will help us to connect with the financial API, and download all necessary data about the prices.

Open the terminal and type:

pip install yfinance

Next, we need to install pandas:

pip install yfinance

The last one that we need to install is pandas datareader that we will use to handling our financial data:

pip install pandas_datareader

5.Import all necessary dependencies

So, we have prepared all the necessary stuff that we needed for development.

Now, we can go into the prices.py file, and import all of these dependencies.

Let’s take a look at the example below:

from datetime import date, timedelta

import yfinance as yf

import pandas as pd

from pandas_datareader import data as pdr

from app import app, db

from models.prices import PriceModel

6. Override yfinance with pandas

Next, we need to override yfinance with pandas.

That line of code should be below imports inside the prices.py file.

yf.pdr_override()

7. Setup CLI

We could do that by typing/commenting on the functions as we had done in golang.

But it’d be a shame if we weren’t using the user-friendly features of the flask.

We can define CLI commands very, very easily.

It’s enough to use the “app” decorator and define the command that should fire the logic that is below the command definition.

Let’s take a look:

@app.cli.command("get-prices")

8. Create function getPrices

Below the CLI command, we need to specify a function that we will want to fire.

In this case, we need a function named “getPrices”.

We will use that to start getting prices. 

@app.cli.command("get-prices")

def getPrices():

9. Create a list of stocks that we will need to monitor

In the next step, we should specify company symbols that we would like to get pricing for.

We should create a list and put there all the NASDAQ symbols as strings.

I’ve used the heaviest 50 of the NASDAQ, but you can specify whatever companies that you would like to get data for.

stockList = ['MSFT', 'AAPL', 'AMZN', 'FB', 'GOOGL', 'GOOG', 'GOOGL', 'INTC', 'NVDA', 'CSCO', 'ADBE', 'NFLX', 'PEP', 'PYP', 'CMCSA', 'TSLA', 'COST', 'GOOGL', 'AMGN', 'TMUS', 'AVGO', 'CHTR', 'TXN', 'GILD', 'QCOM', 'SBUX', 'INTU', 'MDLZ', 'VRTX', 'FISV', 'BKNG', 'ISRG', 'REGN', 'ADP', 'AMD', 'ATVI', 'CSX', 'BIIB', 'ILMN', 'MU', 'AMAT', 'JD', 'ADSK', 'MELI', 'ADI', 'LRCX', 'MNST', 'WBA', 'EXC', 'KHC', 'LULU', 'EA']

10. Iterate through the list of stocks and call function getPrice

In this step, we should take all of the NASDAQ symbols and put them inside the for loop that will iterate through them.

For every iteration of the stock, we should call the function named “getPrice” and pass “stock” as an argument.

for stock in stockList:

 getPrice(stock)

11. Create function getPrice

To have the possibility of using it, we need to create the function “getPrice”.

Setup “stock” as a param name in the function.

def getPrice(stock):

12. Setup date

Now, we can start focusing on the core functionality that we are interested in today, fetching data.

As a first step, we need to specify the date range that we are interested in.

I will set up today as the last day, and day that was 31 days ago, that will be my time-range for the stock prices.

today = date.today()

monthAgo = today - timedelta(days=31)

13. Download stock prices

In the next step, we can start getting data!

Look at the example below. We use pandas_datareader.

Next, I’ve set up stock, that I want to get, the start date, and the end date that is today.

data = pdr.get_data_yahoo(stock, start=monthAgo, end=today)

14. Reformat data by pandas

To have a better life and easier manipulation of data, we need to reformat that a bit.

So we will be able to iterate through that data much more comfortable.

df = pd.DataFrame(data)

15. Iterate through all rows and pass data into the function addPrice

Now is the moment where we can iterate through all of the prices for the xx company.

We use “df.iterrows()” method here because all of the rows are from pandas.

Next, we call “addPrice”, and pass necessary data as arguments.

We pass the whole “row” that we will split later, next we pass index (it’s data here), but formatted.

And we pass “stock” to know what stock is that price for.

for index, row in df.iterrows():

  addPrice(row, index.strftime("%Y-%m-%d"), stock)

16. Create function addPrice

To make using the “addPrice” possible, we need to define that function.

Use “price”, “date”, and “stock”, as params.

def addPrice(price, date, stock):

17. Setup the data for the PriceModel

We are close to the final.

In this step, we should set up the data that we’ve got from financial API.

We need to fit that to our PriceModel, that we’ve created in the Python course Lesson1.

new_price = PriceModel(date=date, company=stock, openPrice=price['Open'], highPrice=price['High'], lowPrice=price['Low'], closePrice=price['Close'], volume=price[‚Volume'])

18. Add price into the database

If our data is ready, we have nothing else to do than just create a new record inside the database.

Don’t forget about „db.session.commit()”, that will update our database.

db.session.add(new_price)

db.session.commit()

19. Import getPrices in app.py

Congratulations!

The prices module is ready now. You can import it inside the app.py.

Add this line of code below the “PriceModel” import.

from prices.prices import getPrices

20. Run get-prices

You can test your application now.

Before you start it, make sure the flask app variable is configured properly (we did it in Lesson 1).

If yes, just open a terminal in the project and type:

flask get-prices

Conclusion

Congratulations!

Now your project has hundreds of financial records.

Those are ready to start working with Artificial Intelligence, which we will build.

What is more important, you’ve learned where to look for the stock prices, how to select them, and how to save them into the database. 

Code repository for the Python course Lesson 2 is here:

https://github.com/Duomly/python-ai-investment-fintech/tree/Python-AI-course-Lesson-2

In the next lesson, we will focus on building the first AI logic for our algorithm.

Keep learning, and I’m super excited I have a chance to teach you how to build powerful investment AI that can predict stock prices!

Thanks for reading,

Radek from Duomly